A Digital Humanities status report: Where are we now?
Does digitization help or hurt intellectual contributions?
Centuries ago, it would have been impossible to analyze deeply the entire corpus of philosopher and theologian Thomas Aquinas, but over the past three decades rapid advances in technology and its infrastructure have made it possible.
But is that a good or a bad thing? Johanna Drucker, the inaugural Breslauer Professor of Bibliographical Studies in the Department of Information Studies at UCLA, is internationally known for her work in the history of graphic design, typography, experimental poetry, fine art and digital humanities. She addressed the status of digital humanities in a talk to librarians and academics on Sept. 18.
The automation of texts for the purpose of analysis began in the 1960s with the introduction of MAchine-Readable Cataloging (MARC) records, and now has grown to include images, sound and more. “Here we are, 20 years in, imaginatively 30 years, and then 40 and 50 years, and the question that overhangs the field is: ‘What intellectual contribution has it made?’” Drucker said.
Drucker posed two questions: “How can we make responsible decisions about tools and platforms from a critical and socially responsible point of view as well as a realistic bottom line perspective?” and “Why invest in the field if there is no intellectual contribution?” The question of institutional infrastructure and how we build it, she noted, could be the subject of an entire lecture.
Instead, she highlighted four critical moments in the field of digital humanities, asking, “What was epistemologically at stake at critical moments? What did we get out of it? Why was it worth it?”
The first moment was one with no internet, when Roberto Busa – a pioneer in the use of computers for linguistic and literary analysis – was interested in what the word “presence” meant in the works of Thomas Aquinas. “On three-by-five cards, he wrote down each instance of the word,” Drucker said. “But you can’t take that word by itself because you can’t take the meaning words away from the function words.”
Then Busa met Thomas Watson and Watson said, “We can automate this process.” And they did, importing the entire corpus of Aquinas’s works, turning them into ASCII files and automating the process. “It was the first kind of computational work,” said Drucker. “They made the words machine readable without ambiguity. The punch cards made it unambiguous.
“You knew you wanted standards that could be read by machines across a corpus,” Drucker said, “but what kind of intellectual model is being made?”
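The index-card procedure Drucker describes, recording every instance of a word together with the words around it, is what is now called a keyword-in-context concordance. The following is a minimal sketch in Python, not a reconstruction of Busa's actual method: the function name `concordance`, the `width` parameter and the sample sentences are all invented for illustration, and the sample text is not from Aquinas.

```python
import re


def concordance(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit.

    `width` is the number of words kept on each side of the keyword.
    This is an illustrative helper, not part of any library.
    """
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", text.lower())
    target = keyword.lower()
    hits = []
    for i, word in enumerate(words):
        if word == target:
            left = words[max(0, i - width):i]
            right = words[i + 1:i + 1 + width]
            hits.append((" ".join(left), word, " ".join(right)))
    return hits


# Made-up sample sentences, used only to show the output shape.
sample = (
    "The presence of the teacher matters. "
    "In the presence of doubt, reason must be present."
)

for left, word, right in concordance(sample, "presence"):
    print(f"{left:>20} [{word}] {right}")
```

Keeping the neighboring words alongside each hit reflects Drucker's point that the word cannot be taken by itself: the function words on either side are part of what gets recorded.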
The second critical moment for Drucker was the emergence of Text Encoding Initiative (TEI) structural markup, a standardized markup scheme, for analyzing texts in a way that could be integrated at large scale.
“There are benefits at scale, but what’s the intellectual model and what does it do for the humanities?” she asked. “It’s a text coding initiative and they’re so much fun, like a cross between crossword puzzles and charades, and they become a way to standardize the analysis of the value of the semantics of text.
“So HTML became the language of the web for browsers and the relationship between intellectual work and browsers became vital. Browsers need to be able to read. Standardization requires you to do certain things but enables the conversation across different browsers and projects.”
Scholars began to think about putting an entire corpus online and making it available in a searchable way, she explained, allowing every search on a phrase or word to take a user to all of the standard references for the corpus, as was done for the Perseus Digital Library.
Though it’s implemented technically, “it’s intellectual work,” she said. “Every bit has to be transcribed by hand, checked and approved. When you start to think about the sheer labor involved, it’s a great act of faith.”
Drucker said you have to realize in digital form a model of critical editing that can’t be done in print. “It’s not just integrating all of the materials, but thinking about how they show their relationship to each other. It’s an intellectual contribution,” she said. “Critical editing is much disparaged in our era and it shouldn’t be. At the time it was being done, there were no other models, examples of how to make it do something we can’t do in print.”
The William Blake Archive, sponsored by the Library of Congress in the mid-1990s, also enacted something difficult by figuring out how to put images online, Drucker said. “The implementation was, ‘Let’s take every example of every Blake plate and create an electronic environment where they can be studied together.’ These are things that would geographically never be in the same room together. It was a new model of scholarship.”
Large-scale projects, however, were silos and didn’t talk to each other, Drucker said. “How were we going to use these projects started to bother funding agencies,” she said. “How could they be used?”
Which brought Drucker to her third moment – the point at which data mining became attractive.
“This is where we started to have intersections with materials sciences and other disciplines,” she said. “Projects are hard to make, time-consuming and always on life support on obsolete platforms, browsers, etc. They’re really resource intensive, so let’s see what we can do with all of this text.”
One such project, by visual designer Ben Fry, looked at all six editions of Darwin’s On the Origin of Species, and it’s a brilliant piece of work, Drucker said. “It shows what happened to Darwin, what price he paid and you can figure it out by reading all six editions, but you can’t hold it in your head.”
But data mining went crazy and most of it was visually ugly, she added. “Everyone had forgotten the aesthetics of the last several centuries. Humanists were naïve as they started to use tools from other fields and apply them without understanding statistics. They produced magnificent diagrams, but they were easy to make fun of unless you understood the tool you were using.”
The new problem became how to provide tools for reading graphical expressions, because practitioners tended to forget the lessons of text and print work, reducing relationships to a single-node quality with no inflection or nuance.
“There are things you can do at scale, but to look at word use over Shakespeare’s corpus, the problems are enormous,” Drucker said. “And text analysis and image analysis doubles the problems.”
Arriving at the fourth and final moment as yet in the evolution of digital humanities, Drucker asked: “Where are we now?”
“We’re at the intersection with materials science with imaging technologies that take information and process it computationally to take it apart and create a new image that makes it available to us as part of the cultural record and renders it legible,” she said. “And it’s not technical work. It’s intellectual because you need to know what you’re aiming at.”
For example, a researcher looking at vellum in the medieval era discovered that it was made from newborn, not unborn, calves, upending traditional thought. “It made us rethink medieval agriculture. It opened new ways of thinking.”
But computational tools used to produce artifacts can conceal as much as they reveal, Drucker said. “Some intellectual modeling is based on ‘this much’ evidence and it’s been extrapolated and generalized so it produces a falsehood that I consider problematic.
“There is a critical shift to use the skills of interpretation of ideological and political insight to look at artifacts,” she said.
“I’m a great advocate for infrastructure,” Drucker noted, as she talked about the off-the-shelf tools available today, which are easy to learn and use. “All of these contain within them various assumptions and epistemological models that can make knowledge explicit and attractive and able to make it numeric in some way.”
Intellectual labor should be thought of from the get-go within the institution where it will live, she said. “Infrastructure at this stage, in 2017, requires first looking back then figuring out what to do with it. It has nontrivial intellectual contributions within it,” she said. “It’s really, really important to figure out how we’re going to describe things.
“So, has the digital humanities had a big impact?” she asked. “I think so. We all do our business digitally and if we don’t understand how it’s shaped, we’re missing an opportunity.”