Presenting at Digital Humanities (DH) 2015, July 1, 2015, Sydney, Australia.


Organizational Practices in Digital Humanities Centers

This paper addresses organizational practices and potential future developments of digital humanities centers (DHCs). The paper draws on findings of an ethnographic study carried out between 2011 and 2013 at eleven DHCs in the US and Europe (to leave room for the empirical findings, contextualization in relevant literature was intentionally omitted from this abstract). The analyzed centers were founded between the early 1990s and the present. All centers are affiliated with research-intensive universities, have a dedicated space, and employ between five and fifteen people.

A significant difference is the bottom-up character of the early DHCs in contrast to the top-down nature of newer centers. This difference was also evident among the contemporary DHCs. In the top-down cases, university administrators often wanted to establish a hub of scholarly innovation without a clear sense of what that innovation entailed, what the center’s function would be, what range of services it should provide, and who would make up the primary user base. At one center, housed in the university library, new glass walls represented a visually transparent boundary between the print and the electronic materials. Flexible workspaces with standing stations equipped with high-end technologies were designed to adjust to user needs. But while design, ergonomic, and technical features were meticulously considered, the goals for the center and thoughts about the user community were given less attention:

That’s really a big question for us, figuring out what kind of services we want to offer. We’re still trying to figure out what our outreach will have to be to scholars to get them to use this. We’re not sure what’s going to happen on the first day when we open the doors. Are people going to know what to do? Are they going to come in, sit down and check email?

Another campus of the same university system simultaneously launched its DHC using a different organizational strategy. The idea to establish the center came from humanities scholars rather than from the administration. This group of scholars began the planning phase with a university-wide dialogue. They included their colleagues from the humanities division, information technologists, computer scientists, librarians, and other potential stakeholders. They prioritized talking to humanities students and faculty about their visions, wishes, and potential points of resistance related to the center. These early conversations and comprehensive planning helped identify which organizational strategies and design possibilities were consistent with their goals for the center and resulted in well-defined goals and clear mission statements for the new unit.

In a number of the analyzed centers, the main goal was to support humanities scholars in their current research and to gradually introduce them to digital tools and methods. These centers target the widest spectrum of humanists rather than relying on scholars who are already well versed in, or inclined to use, digital tools. These centers bridge the critical gap between technologically advanced and less advanced scholars, but with such a diverse user community they sometimes struggle to profile their activities. Some centers thus adopt a more precisely defined approach when establishing goals and characterizing potential user communities.

The parallel functioning of disconnected DHCs within the same institution is a significant problem. DH units are often differentiated according to the areas of work that they support and according to their funding sources. This organizational system can help DHCs formulate clear goals, but it can also lead to user confusion: “They don’t know which center is best for them and it may just turn them off from going anywhere.”

Given the resistance toward digital scholarship sometimes held by traditionally trained humanists, DHCs develop an outreach strategy that involves two preparatory steps: 1) making humanists aware that they already rely on digital technologies; 2) explaining that DHCs are offering digital tools and methods rather than imposing them. In their outreach activities, DHCs frequently run up against users’ lack of time:

These are very busy people, and their concern is primarily what is their current research about. And very often they are ultimately making choices between do they go and pay attention to our offer of a demonstration about digital tools, or do they spend that hour and focus on their own research questions. We’re competing with natural priorities in a scholar’s life.

DHCs also rely on word of mouth to reach potential users; exchange of experiences among peers is one of their most successful outreach strategies:

If you can reach the right people out in the faculty or even grad students, that’s a much more compelling argument than anything that we can say. Even though we may have PhD’s and may use technology for our own research, we’re not in the same role as they are. So, hearing from someone who lives a life just like they do in many ways is very compelling.

User support is an important but not always clearly anticipated part of DHC activities. Growing effective DHCs involves keeping regular office hours, organizing talks and workshops, and answering user inquiries, but also balancing disciplines within the user community so that no single discipline dominates the center’s activities and conversations. Maintaining user communities also involves shaping user expectations and clarifying the nature of the partnership with users. Users often view DHCs as serving humanists’ requests rather than collaborating with them:

They see us as the programming shop that will do whatever they tell us to, and that’s not really how we’re trying to approach this. It’s more of collaboration, and we actually get something back out of it also.

DHCs often employ hybrids—traditionally trained humanists who also have good computer skills. They are seen as a necessary link between “the two cultures” who can adeptly translate epistemological and methodological concepts and approaches. “Smart organizations will have more of me,” remarked the Head of the R&D team at one of the centers who also happens to be a historian with good programming skills. The advantage of speaking both “programese” and “scholarese,” as this interviewee put it, is the capacity to help humanists grasp digital tools and methods while simultaneously helping programmers understand humanities work. DHCs have found an efficient way of employing hybrids through engaging humanities graduate students, who are usually early adopters of technology. Graduate students who participated in the study liked working in DHCs. It allowed them to advance their research skills and to build expertise through participation in important research and decision-making activities.

The reversal of instructional roles, in which students were teaching teachers, facilitated students’ understanding of some of the didactic principles, motivating them to develop their own pedagogic strategies.

Hybrids as a type of scholarly workforce are linked to the concept of alternative academic careers. The interviewees described alternative academic paths as intentional career choices scholars make. Although this choice allows scholars to continue working in their preferred field, the transition is nonetheless perceived as difficult and consequential:

“You can’t just yank somebody out of the faculty and out of years or decades of training without some accounting for how they conceive of themselves as a scholar.”

The respondents argue that from an administrative point of view, thinking about people’s time and labor might be the most important issue DHCs will need to engage with, because the internal inherited systems of classifying employees are not well suited to DH practices.

Traditional organizational systems for classifying scholars are not only inefficient for addressing contemporary issues of academic labor and knowledge production; they are also potentially detrimental to the future of scholarly work:

If we can’t get this generation of graduate students comfortable with alternate modes of work, not feeling like they are failures if they don’t get a tenure-track position, and seeing good career paths for themselves within the DH, we’re going to lose that generation of scholars.

Among the respondents, two related but distinct theories of the future of digital humanities and DHCs emerged. One line of thought suggests that digital tools and methods will progressively become a standard part of humanities research. DH should thus be marked as a transitional moment in the humanities disciplines rather than as a distinct field. The distinction between digital and mainstream humanities will diminish over time, even though certain methodological differences might remain. Another group held that DH already ranks as a distinct field. The field will retain its autonomy because the need for innovative work and thinking with technology in the humanities will never cease. Although the use of digital tools and methods will become increasingly mainstream, there will always be a need for research groups on the frontier of innovation.

The far-reaching, presumably rhetorical question of whether digital scholarship in the humanities will be designated as DH or “just” humanities will have important implications for the future. Both scholars and administrators are musing about whether DHCs will be needed in the future, or whether digital scholarship will blend into the existing disciplinary and departmental structures. Overall, the respondents agreed that we will see a wave of interest in DHCs, some of which will persist, while others will peak quickly only to fade away.

Presenting at HASTAC conference, June 26, 2015, East Lansing, Michigan


Tracing the Workflow of a Digital Scholar

This paper presents findings of an Andrew W. Mellon Foundation-funded project conducted at Penn State University from April 2012 to June 2013. It also outlines preliminary results of Phase II of the same project, currently underway at Penn State and George Mason University.

Phase I explored the scholarly workflow of Penn State faculty across the sciences, humanities, and social sciences, focusing on the integration of digital technologies at all stages of the research life cycle, from collecting and analyzing data, through managing and storing it, to writing up and sharing research findings. This paper draws on the comparative, multidisciplinary perspective of our study to explore the specificities of humanists’ digital workflow, enabling the development of a service architecture that supports those practices.

Phase I consisted of a web-based survey (n=196) and a set of ethnographic interviews (n=23). The results showed that across disciplines digital tools were most actively used for finding, storing, and archiving research materials, although disciplinary differences could be traced. For instance, while the respondents overwhelmingly (92%) store research materials, humanists reported the highest percentage of lost and inaccessible files, predominantly because of a failure to migrate materials from obsolete to contemporary formats.

Concerning data collection and analysis, the use of digital technologies differs significantly across disciplines. Respondents in the sciences commonly noted that their work would be impossible without digital technologies, and scholars in the social sciences indicated that digital tools and methods were becoming ‘a new normal’ in their practice. In contrast, humanists indicated a lack of digital technology use in those segments of their research process. They nonetheless indicated awareness of digital tools and methods that could facilitate their analytical practice, citing the lack of training and time as key impediments to developing the needed skills.

Disciplinary differences were also evident in data sharing activities. Nearly two thirds (63%) of scholars in the sciences indicated that they actively share their data; a similar percentage of the humanists (69%) indicated the opposite practice. Academic standing also influenced data sharing practices, with tenure-track faculty being more protective of their data than tenured scholars.

Annotating and reflecting emerged as research activities where the use of digital technologies is most idiosyncratic, based on scholars’ personal preferences rather than the level of technical skills or availability of digital tools. The use of citation management programs was higher in the sciences (55% vs. 30%), but the overall level of use was lower than in other segments of the workflow.

Phase II of our study is devoted to developing a digital research tool for humanities scholarship using Zotero as a test platform. Based on the results of Phase I, we focus on unifying several segments of the workflow and on facilitating elements such as better integration of archiving into the scholar’s online path. Since the loss of information among the humanists is significant, there is a need to build into the research workflow easy strategies for users to self-archive their work. Optimizations that connect the institutional repository with Zotero and expose references and metadata within uploaded PDFs will be explored.

Presenting at the Personal Digital Archiving conference, April 26, 2015, New York, NY


Integrating Self-Archiving and Discovery into the User’s Workflow: The Zotero / ScholarSphere Project

This poster / demo centers on an Andrew W. Mellon Foundation grant of $440,000 to the Penn State University Libraries.  We will detail the planned trajectory of the 2014-2016 project, designed in collaboration with George Mason University and Zotero, which builds upon prior Penn State Mellon-funded research studying how faculty managed and archived their scholarly information collections. During 2014-16, the research team will direct software development centered on Zotero, an open source citation manager. The new software will enhance Zotero’s archiving capabilities by linking to ScholarSphere, Penn State’s Hydra-based institutional repository service. This will allow Penn State users to claim and deposit self-authored works securely into ScholarSphere via Zotero. The software developed in this project will allow other colleges and universities with similar Hydra-based institutional repositories to implement a Zotero deposit connection. The poster / demo will showcase the current enhancements to Zotero, and will also provide information for conference participants interested in making these optimizations available to their local Zotero / Hydra-based institutional repository users.


ITHAKA 2012 Faculty Survey–Research Dissemination Findings


The ITHAKA S+R Faculty Survey 2012 is out, and with it come some interesting findings relevant to our study of scholarly workflow and personal archiving.

In the survey, faculty from institutions across the US were asked about their opinions and practices with regard to the following areas: materials used for research and teaching; discovery; provisioning materials to faculty members (formats and sources); research topics and practices; undergraduate education; research dissemination; the role of the library; and the role of the scholarly society. Of particular relevance to our study were some of the findings on research dissemination and on research topics and practices.

Participating faculty were asked, “In the course of your research, you may build collections of scientific, qualitative, quantitative, or primary source research data. If these collections of research data are preserved following the conclusion of the projects, what methods are used to preserve them?” Faculty overwhelmingly indicated (80% in the Sciences; and nearly 80% in the Humanities and Social Sciences) that they preserve the materials themselves, using commercially or freely available software or services. Less frequent were the use of an institutional or other type of repository (highest in the Humanities, at just over 20%), not preserving the materials at all (less than 20% in general), and preservation by a publisher or university (less than 10% in general).

This articulated prevalence of self-preservation, while working well in the short-term, is worrisome as a long-term preservation strategy.  A majority of faculty note that their data is worth preserving, yet the data itself is only preserved in a manner entirely dependent on the individual researcher to care for and migrate the data as software, services, or technologies retire.  This highly distributed practice of individualized preservation will likely result in significant data loss in the long term.  The ITHAKA report notes of these findings, “If long-term data preservation is to become an important priority for the scholarly community, new solutions–or greater uptake of existing solutions– will be required to ensure that materials are preserved responsibly.” (p. 63)

Related to this issue, faculty were asked, “How valuable do you find support from your college or university library, scholarly society, university press, or another service provider for the following aspects of the publication process (or how valuable would you find it if this support was offered to you)?” The following options were offered:

  1. Managing a public webpage for me that lists links to my recent scholarly outputs, provides information on my areas of research and teaching, and provides contact information for me. (Highest positive response in the Humanities—just over 40%, with the Social Sciences and Sciences in the 30-35% positive range).
  2. Helping me to assess the impact of my work following its publication. (Highest positive response in the Social Sciences (nearly 40%); less so in the Humanities (just over 30%) or the Sciences (just over 20%).)
  3. Helping me determine where to publish a given work to maximize its impact. (Favorable response of 30% or less across the disciplines; less than 20% in the Sciences.)
  4. Making a version of my research outputs freely available online in addition to the formally published version. (Around 30% favorable across the disciplines)
  5. Helping me understand and negotiate favorable publishing contracts. (Highest positive response in the Humanities (just over 30%).)

If you put these two questions together (who is preserving your research, and how can we provide research support?), there is a path illuminated for libraries in these responses. While the overall responses for types of research support were generally lukewarm, the most popular was the automatic creation of a faculty web page for research and contact information. This reminds me of the UR (Rochester) Research IR, but perhaps something that did this sort of automatic population and archiving, while also connected to the scholar’s existing workflow, would be more heavily used. I am thinking of a service that ties together (and auto-creates) Google’s My Citations page with a service like Mendeley or Zotero (and the whole package is also linked into the IR).

In my next post (because this one is long enough already!) I’ll share some of the findings relevant to workflow and learning new digital research activities and methodologies.

Presentation at eHumanities Group–Royal Netherlands Academy of Arts and Sciences


Smiljana and Ellysa had the honor of presenting to some of Smiljana’s former colleagues at the Royal Netherlands Academy of Arts and Sciences eHumanities group while in Amsterdam for the International Digital Curation Conference!  This was a terrific opportunity to have an interactive discussion with faculty and technologists about their own digital workflow, and strategies for unifying practices throughout the workflow.  We are especially grateful to Dr. Sally Wyatt, Programme Leader for the eHumanities group and our gracious host.  Slides from our presentation are below:

Some Thoughts on Acquiring Personal Archives in the Digital Age

What follows is the text of the talk I gave at the DLF Forum on November 4, 2012, as part of the presentation headed by Ellysa and Smiljana.

Most research libraries typically acquire personal archives from creators at the end of careers, or lives, or even after their death, which effectively means that material can be acquired decades after its creation. Because of technological obsolescence and how fluid our personal uses of technology are, it’s logical to conclude that creators need to become more active in curation activities, and libraries need to engage them much earlier in the creation process, else we risk losing valuable cultural material down the digital black hole.

But I think there are some challenges to compressing this distance between creation and acquisition. For starters, do creators want to be bothered with our curatorial recommendations? In a report on born-digital literary manuscripts, Matthew Kirschenbaum and others concluded, based on interviews with working authors, that suggesting certain technological behaviors to them might impinge on their creative process, and hence be unwelcome advice. In our grant’s initial survey, fewer than half of respondents answered affirmatively to the question of whether librarians should offer them personal archiving and digital preservation assistance. In a free-text comment field, several respondents even explicitly stated their opposition to being told how to preserve their own material.

Of course, we shouldn’t just assume that creators are absent from the preservation process to begin with. For example, 86% of respondents to our initial survey indicated they regularly back up important material. I know redundancy does not constitute preservation, but it was encouraging to see that most of the respondents back up with a frequency falling between “continually” and “monthly.”

My favorite answer, and the funniest one, to this question about backup frequency was a single word: “episodically”. I actually think this answer could be instructive to archivists and librarians: for purposes of preservation we may be inclined to isolate material into neat little piles based on documentary types or by file type, as discrete items on a file system, or even as atomized chunks of data, but creators themselves may instead view them as events in a continuum of research – they progress from one to the other and together form a whole, a whole that is not entirely separable from other aspects of their appointments, including teaching, dissertation advising, etc.

And it’s not entirely separable from the physical spaces in which they work either. In Penn State Special Collections we have a physical reconstruction of the author John O’Hara’s study – his writing desk, typewriter, bookshelves, even his fireplace and mantel. Now… projects like the Rushdie one out of Emory are starting to make us think of the author’s study as a virtual place, his reconstructed computer desktop screen replacing the physical reconstruction of a creative space. But in our first in-person interview with a Penn State faculty member in the Communications department, we got an intimate view into how these spaces are really wrapped up with one another in complex ways we do not yet fully understand. This researcher commented on the ways in which his forty-plus years in the same office and his pending move have had an impact on the way he works. At a much higher level, being with one institution for so long influenced his relationship to technology over the years. In the early 80s he learned of a mainframe on campus and successfully petitioned to have terminal access to it, which he used to draft a book in the SCRIPT markup language. As we proceed with the interviews, we’ll be documenting both the physical and virtual desktop environments graphically, and I think this is an area of the grant work that can be especially enlightening and fruitful.

Another Mellon-funded digital archives grant, the AIMS project, conducted by Stanford, Yale, UVa, and Hull, concluded that institutions need to spend more pre-acquisition time collaborating with potential donors and getting to know the details of their digital ecologies in order to make informed decisions about ingest, management, and preservation. The AIMS project produced a comprehensive model donor survey that can and should be used by archivists, and one that helped guide some of my recommendations to Ellysa on this grant. It contains questions about general computing habits, tools, use of mobile devices, social networking and even privacy. I think if we could always get this level of detail from potential donors, then a grant like ours wouldn’t be necessary. But donors of personal papers are busy, distracted, outgoing people who lead complicated and active lives just like the rest of us. In my own very limited experience with donors, I have not always had great luck teasing the most basic responses out of them, and I think it will be hard to have the kind of in-depth discussions about their digital desktops that we as archivists desire.

With this in mind, I think what we need are more case studies about different classes of creators to draw on for insight. There are two common methods for deciding what to acquire in our profession – one suggests that we appraise a collection based on the value of the content, and another (called macro-appraisal) suggests an emphasis on the role played by the records creator and how that person fits within an established and documented institutional collecting area. The latter has become increasingly common, but it presents significant challenges when the material to be acquired is digital, distributed, duplicated (in both digital formats and analog), and dependent on specific hardware or software.

In the absence of a collaborative relationship with potential donors, a condition which will likely persist, I think archives and libraries need a community-produced set of data about different types of creators to fall back on. This kind of appraisal might fall somewhere between micro and macro. Let’s call it meso-appraisal: the idea being that we start to document, quantitatively, the tools and habits used and exhibited by creators working in different fields. We’ve seen this kind of work already coalescing around literary archives. And in a way, I think this approach might succeed in aligning our appraisal efforts with our preservation efforts, which focus on developing preservation strategies at an intermediate level, usually for different formats. Not individual items, and not whole collections of disparate items.

For instance, heading into this grant I was curious to see how many unusual or non-standard file formats we might encounter. With the exception of data sets, the survey results have shown a surprising uniformity in the common types of digital material people have. There is persistent use of documentary types familiar to us all: Word docs, spreadsheets, email, image files, and already some indication that certain formats, like PDF, are rather ubiquitous. And only 5% of respondents checked the “other” option on the question related to formats.

I think we can also produce some useful information about how people are using certain tools. Email is already turning out to be an interesting case study. People use it for everything, and not just communication. They use it for sharing documents, obviously, but also rudimentary backup, and even version control.

I think for digital library professionals, some of these findings can help inform our approach to the development of tools, especially related to repository services, which is something I’ve been thinking about a lot since Penn State just released its Hydra-based repository, ScholarSphere.

If there’s anything our initial findings are demonstrating, it’s that researchers on campus are not shy about going outside the academy to get the tools they need. The communications professor we interviewed uses Dropbox because it’s easier, and provides ample space, even though it adds a financial burden. He uses Gmail because he feels it’s more user friendly than the email system his department offers. In general, researchers appear to use such tools interchangeably for a variety of purposes, including sharing, versioning, and redundancy. But they don’t know where to go for things like data recovery, and have simply accepted certain levels of data loss as part of doing business. I think the data is already trending in a way that demands that our repository efforts reduce as many barriers to adoption as possible: that our repository be as easy to use as Dropbox or email, or even be interoperable with these tools. Most important is that any repository services we develop fall comfortably within their existing workflows. And if we can at the same time sell the benefits that they cannot get from other tools (greater accessibility, data integrity, etc.), then I think we’ll be able to spur much greater adoption.

— Ben Goldman, Digital Records Archivist