Introduction: Correcting Method

How do you think something through with technology?

Descartes in his 1637 Discourse on Method describes a moment of solitude that allowed him to talk to himself about his thoughts and to develop a method for thinking correctly. Here is how he describes the solitude he needed:

… as I was returning to the army from the coronation of the emperor, I was halted by the onset of winter in quarters where, having no diverting company and fortunately also no cares or emotional turmoil to trouble me, I spent the whole day shut up in a small room heated by a stove, in which I could converse with my own thoughts at leisure. Among the first of these was the realization that things made up of different elements and produced by the hands of several master craftsmen are often less perfect than those on which only one person has worked. [1]

The Discourse is important to the practices of the humanities because it introduced an accessible method for anyone to do philosophy without needing to be widely read or part of an intellectual community. His story of method, and its accompanying provisional moral code for behaviour without certainty, is one of the fables that founded modern philosophy and in particular the solitary, doubting and reflective practice that still dominates how we think we should think things through.

But practices are changing, and older forms of communal inquiry are being remixed into modern research. We have come to recognize how intellectual work is participatory even when it includes moments of solitary meditation. Internet conferencing tools allow us to remediate dialogical practices, collaborative communities like Wikipedia depend on contributions by a large group of editors, and the communal research cultures of the arts collective or engineering lab are infecting the humanities. Accessible computing, the amount of data available, and the opportunities of new media have provoked textual disciplines to think again about practices and methods as we try to build digital libraries, process millions of digital books, and imagine research cyberinfrastructure that can support the next generation of scholars.

What is new is that we are imagining projects in the humanities that are big and need a variety of skills for implementation, skills rarely found in one solitary scholar/programmer let alone a Cartesian humanist. Thus we find ourselves working in teams, reflecting on how to best organize the teams, and then reflecting on what it means to reason through with others. This reflection in project teams inevitably turns to method correcting for the new media as we try to balance our traditional Cartesian values with the opportunities of open and communal work.

Hermeneuti.ca, like Descartes’ “histoire”, is a story about the turn to method, this time methods of interpretation. Hermeneuti.ca both tells a story of return to dialogical practices that predate Descartes and introduces computer-assisted methods that are just becoming hermeneutically interesting with the digitization of the human record. Specifically, Hermeneuti.ca returns to method in four ways:

  1. First, Hermeneuti.ca is a hybrid project, both printed book and online reflection. The online Hermeneuti.ca mirrors the convenience of a book by weaving text with interactive components, so as to show one of our conclusions about the opportunities for online interpretation.[2]
  2. Hermeneuti.ca is both a text about computer-assisted methods and an instantiation of tools called Voyeur Tools that implement our interpretation of method. The code is an interpretation of method presented in a particular way online so that you can try it with its companion text, manual and documentation.[3]
  3. Hermeneuti.ca presents three case studies - each with an example essay (Essay) that shows computer-assisted text analysis in application. The example is paired with a reflective chapter (Reflection) on analytics that uses the Essay as an example. The examples, like “Now Analyze That”, are essays that interpret texts using the hermeneutical tool things or "hermeneutica" (the plural of hermeneuticon or interpretative thing). [4] Accompanying these Essays are demonstration Recipes from our Methods Commons that show you how to use computing methods with Voyeur (or similar tools). The Recipes are tutorials on how to do interpretative things with common tools.[5]
  4. Just as Hermeneuti.ca is both book/site, texts/tools, so you will find that our Essays are both text and code: both narrative text and embedded interactive panels. The panels are part of the text quoting results, but they are also interactive so you can recapitulate and experiment with our results. They let you return to a computer-assisted method in the context of an essay. Here, for example, is an interactive panel. It shows the Voyeur Collocate Clusters of this Introduction. (Collocate Clusters allows users to visualize words that are interconnected by proximity and high frequency.) Such panels are a difference made possible when publishing online. Try it!

Collocate Cluster of the Introduction[6]
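For readers of the print version, the idea behind a collocate view can be sketched in a few lines of Python. This is a minimal illustration only, not Voyeur's implementation: the function names `tokenize` and `collocates`, the `window` size, and the `stopwords` parameter are all assumptions made for the example. It counts the words that fall within a fixed distance of a keyword, which is one simple reading of "interconnected by proximity and high frequency".

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def collocates(text, keyword, window=5, stopwords=frozenset()):
    """Count words appearing within `window` tokens of each keyword occurrence.

    Hypothetical helper for illustration; real tools usually add
    statistical weighting on top of raw co-occurrence counts.
    """
    tokens = tokenize(text)
    counts = Counter()
    for i, token in enumerate(tokens):
        if token != keyword:
            continue
        start = max(0, i - window)
        end = min(len(tokens), i + window + 1)
        for j in range(start, end):
            # Skip the keyword occurrence itself and any stopwords.
            if j != i and tokens[j] not in stopwords:
                counts[tokens[j]] += 1
    return counts


sample = ("Method and dialogue: we think through method with tools, "
          "and method with texts.")
neighbours = collocates(sample, "method", window=1, stopwords={"and"})
print(neighbours.most_common())
```

A cluster visualization would then link each keyword to its most frequent neighbours; the counting step above is only the first half of that process.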

In short, Hermeneuti.ca is a weaving together of hermeneutical things whether print and electronic, text and code, essays and reflections, or narrative and interaction; all of which are a thinking through of interpretative method through computing. Hermeneuti.ca tries to correct for the Cartesian solitudes of text and method by showing how code is an instantiation of interpretative method and that it can be woven closely into other hermeneutical things like text.

The Story of Method

To confront the privilege of solitary reflection in academic practice we would do well to pay attention to how Descartes introduces his story of method. Descartes calls his Discourse a personal history or fable (“histoire”) which “you can imitate” or not. It illustrates his practices for readers to imitate, which in turn leads into his method and provisional moral code. While the formal method gets attention, the story of the provisional practices that he adopted to get there typically does not. That story, with all its baggage, gets passed over on the road to method, but it is in those provisional practices that those of us without certainty are stuck. It is a story about doubting oneself and others in order to rid oneself of possible influence. It is a story of isolating oneself from traditions of scholastic disputation in order to think alone, think afresh, think thoroughly, free oneself of error (especially from the errors of others) by assuming nothing, and think about thinking. This story of thinking thoroughly has four aspects that interest us:

  1. First, Descartes doubts all authority in order to think afresh about thinking, rejecting all opportunities for thinking along with others. He turned to solitary work apart from others as the best way to think about thinking, a practice still common in the humanities. We still tend to think that to really get research work done we need time away. Hermeneuti.ca proposes an alternative and more participatory practice of dialogical criticism. You don’t have to do it alone, especially if you haven’t all the skills.
  2. Second, and paradoxically, Descartes writes his thinking through as an interior dialogue. Rejecting dialogue with others gave him “the leisure to talk to myself about my thoughts”. Avoiding conversation with others freed him to discover himself as a conversational partner, which is his way of reflecting on thinking, or at least reflecting on his own thinking, and this reflection generates personal certainty. [7] The Discourse traces the trajectory of the personal; Descartes tells the story of his education in Part 1, his development of a personal method in Part 2, and even why he decided to publish the Discourse in Part 6. The autobiographical narrative made method accessible to others as a personal path, which accounts for the popularity and influence of the work. Descartes let readers voyeuristically listen in on him doing philosophy, which helped them imagine how they could be philosophers if they took the time to think alone about their thoughts. In Hermeneuti.ca we likewise make our practices part of this story, but they are practices of working together, a different type of dialogue and a different type of practice. This story is more about what is needed when one wants to tightly couple interpretation with the development of software tools of interpretation. Ours is a story of collaborative reflection and development across the divide of writing code and text. Readers will find reflections on pair work later in the concluding dialogue on Agile Interpretation.
  3. Third, Descartes thinks through by reflecting on thinking. His method is a thinking that takes itself, thought, as its first companion and subject for interpretation. What is being interpreted is first of all the thinking of reflection. Likewise Hermeneuti.ca is about thinking, but thinking through interpretation rather than reflection, and a thinking through with things of interpretation whether text or tool. One way we do this is by being open and documenting our experiments in the very writing of these documents. Hermeneuti.ca is an open and self-documented work: you can recapitulate our analysis, examine our code, recover earlier versions and recover blog entries.
  4. Fourth, Descartes provides case studies that show the results of employing his method. In Descartes’s Discourse these are provided as appendixes. We too have provided case studies, but in Hermeneuti.ca the case studies are not pushed to the end of the story. The Essays are hermeneutical things that interpret using analytical methods and they are interpreted in the Reflections.

It is not surprising that the Cartesian train of doubting, solitary and reflective thought leads to the Cogito – “I think, therefore I am” – from which, with personal certainty, Descartes methodically rebuilds his ideas. The irony of the Discourse is that if his practice appeals to you, then you should suspect its results as the authority of another, namely Descartes, and start all over by interrogating your thinking practices. The rhetorical power of the Cogito is that Descartes bets you will end up right where he did, all the more convinced since you followed (even if only remotely as a reader) his "correct conduct" of reason, not his conclusions. [8] But that's the point - the Discourse is first presented as a guide towards method which you can imitate and reuse, not an authority to consider true - and that's going to be our point: you really shouldn't imitate our practice without thinking it through too, which is why we have provided you the tools to try it yourself. The image of the solitary Cartesian philosopher has influenced how we think about intellectual work; perhaps it is time to correct our methods and reanimate other images of practice. In Hermeneuti.ca we have stitched together hermeneutical things so that you can try tools-as-methods as you read about them. Our story is one of thinking through together, as we hope yours will be.

Agile Interpretation

Knowing Geoffrey was leaving for Alberta, we decided to try some experiments in text analysis together while we still had access to quarters, in this case a lab away from our offices. Rather than be diverted by our personal projects we thought we would direct our conversations to the intersection of methods, tools and interpretation by taking a small project through from conception to writing in one day - a day set aside away from the distractions of other work. We spent the day closed up together in a lab overheated by all the computers, where we had the leisure to talk while thinking through tools and experimenting with texts. Among the many reflections of the day we noticed how few questions you can ask alone using one tool and how much more richness there was to interpretation in dialogue that weaves evidence together from several tools by different masters, as needed by the questions at hand.

Computer-assisted research in the humanities, by contrast to the Cartesian story and traditional humanities practices, has almost always been collaborative and not solitary. This is due to the variety of skills needed to implement digital humanities projects, and because of the relationship between the practices of interpretation and the development of the tools of interpretation, be they text analysis tools or digital editions. This difference, while acknowledged in various ways, has been a professional hindrance, as anyone who submits a CV for promotion with nothing but co-authored papers knows.[9] More importantly collaboration is not always good. Collaboration separates the interpreter/scholar from the implementer of the scholarly methods (programmer). Willard McCarty notes that the introduction of "software separated the conception of the problems (domain of the scholar) from the computational means of working them out (bailiwick of the programmer) and so came at a significant cost."[10] As computing is introduced into research it separates conception, implementation, and interpretation in ways that can only be overcome through dialogue and collaboration across very different fields. Typically humanities scholars know little about programming and software engineering, and programmers know little about humanities scholarship; going it alone is an option only for the few with the time to master both.[11]

There are obviously all sorts of ways people can collaborate, but for the purpose of correcting method we propose that collaboration is the normal practice of humanities computing and should therefore be imagined as part of any discussion of method. Solitary time, while much desired in the bustle of academic life, is here conceived of as a withdrawal from a background of working together in various structured and unstructured ways. Even Descartes starts with the collaboration of authority as the norm from which he has to retreat to correct his thinking. The very desire for solitary time for reflection proves our point; what is normal is collaborating with students in teaching or meeting with colleagues in committees. Thinking alone is the dream of the humanities, not the ground from which to develop our method.

Collaboration, however, can take many forms. Working on Hermeneuti.ca we modelled our collaborative practice loosely on a programming methodology called Extreme Programming (XP) which includes practices of Pair Programming and which belongs in the wider category of Agile Programming, for which reason we call our practice Agile Interpretation (AI).[12] What is “extreme” about such methods is how extremely different they are from what we expect of best practices in coding. Traditional programming wisdom emphasized the need for careful analysis and specification before coding, while XP recommends rapid iterations of coding and reflection to achieve immediate goals without worrying much about the long term or big picture. You don't analyze the situation and then fully specify the final product before coding. You scratch an itch in a short iteration, look at it, and then start adding stuff as needed (not as anticipated). Often that means throwing out your code and starting all over when adding functionality leads to redesigning basic structures. The traditional wisdom held that rewriting your code was a sign of failure; XP makes it part of the process. Likewise in AI we started small, trying to take an interpretation through from conceiving the problem to writing a short essay in one day. (We failed to finish in a day!) This grew to larger iterations around each one of the case study Essays. Each iteration forced us to rewrite the code and to rethink Hermeneuti.ca.

XP also recommends working in pairs right down to the typing of code. You don't meet and then go off to code alone, you code in pairs alternating typing and guiding - one person in the pair coding at any moment to force discussion with the other. Likewise, where traditional practices in the humanities are solitary or forced compromised collaborations, AI is purposefully collaborative – at its heart is pair-work where one person performs the work of interpretation (or using the text analysis tools) while the other looks ahead or reflects on what is needed (actually, both members engage in both activities, but each member has a dominant role to play). The idea is to maximize the dialogue between the scholar function and development function to the point where they are woven into an organic whole.

Where the humanities aim to be theoretically grounded - you are supposed to have it all theorized beforehand (just as traditional programmers should have complete specifications) - AI is pragmatic, starting with small experiments and generating hermeneutical theories as the things of interpretation, like texts and tools. Where the humanities avoid formal methods in favour of loose and largely unexamined practices, AI makes methods (and the instantiation of methods in tools) an issue to be discussed throughout the experiment. It’s hard to avoid talking about what you are doing when only one person has the keyboard and everything has to be negotiated. Try it; it isn’t the waste of time you think it is. Above all, where the Cartesian practices involve reflection and talking with yourself, AI is about talking with another with complementary skills and summarizing those conversations in different ways.

The particular practice we followed involved redeveloping tools as we wanted to pose new questions and then continually testing the tools in the context of concrete experiments. We were fortunate that we actually could hack our tools as we needed. There was no divide between literary scholar and programmer; we were both of us capable of both. It undoubtedly helped our project that we both have some familiarity with literary criticism and programming, but we don’t consider this a pre-condition of Agile Interpretation; AI can happen between two literary scholars or even two programmers.

In summary, Hermeneuti.ca is the record, outcome and essay of our three AI experiments:

1. The first Experiment starts with an Essay, “Now Analyze That”, comparing the important pre-election speeches on race by Barack Obama and his spiritual father Jeremiah A. Wright. This essay uses text analysis for the comparison of shorter cultural texts available online and it is accompanied by a reflective chapter entitled “There's a Toy in my Essay!” which looks at how the results of text analysis have been woven into essays. A Recipe on “Exploring Themes Across a Text” shows you how you can try text analysis with our tools. [13]

2. The second Experiment is about studying a large collection of texts over time. The essay, “Humanist: The Sparrow Flies Swiftly Through”, looks at diachronic patterns in the archives of the Humanist discussion group which have documented the digital humanities community. “The Epidemiology of Ideas” is a reflective chapter that looks at how text analysis can be applied to such archives to track ideas through time in a community. A Recipe shows you how to Explore a Diachronic Collection.

3. The third Experiment on a community of text starts with an essay, “What's In A Day of Digital Humanities?” which analyzes the combined blogs of a social research experiment, The Day in the Life of Digital Humanities, where about 90 people blogged what they do. [14] It is accompanied by a reflective chapter “Animating the Knowledge Radio” that looks at real-time text analysis and animated visualization. A final Recipe helps you Visualize a Collection of Blog Entries.

These three sections, each with their Essay, Reflection and Recipe, are framed by historical and theoretical chapters on computer-assisted text analysis. “From the Concordance to Ubiquitous Analytics” surveys the development of analytical tools in the textual disciplines in order to provide a context for Voyeur Tools. “Theorizing Analytics” returns to issues about interpretation and the place of computer-assisted text analysis. We conclude with a Dialogue on Agile Interpretation, the method followed in Hermeneuti.ca.
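For readers new to the concordance named in that chapter title, its most common computational form, the Key Word in Context (KWIC) display, can be sketched briefly. The sketch below is an illustration under stated assumptions, not code from Voyeur Tools: the function names `kwic` and `format_kwic` and the `width` parameter are hypothetical. It lists every occurrence of a sought word with a few words of context on either side and aligns the keyword in a column.

```python
import re


def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence.

    `width` is the number of words kept on each side (an assumed default).
    """
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, token, right))
    return rows


def format_kwic(rows):
    """Right-align the left contexts so the keyword forms a column."""
    pad = max((len(left) for left, _, _ in rows), default=0)
    return "\n".join(f"{left:>{pad}}  {kw}  {right}" for left, kw, right in rows)


passage = "I could converse with my own thoughts at leisure about my thoughts"
print(format_kwic(kwic(passage, "thoughts", width=2)))
```

Print concordances were compiled by hand in much the same shape; the computer changes the speed and the scale, not the basic form of the list.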

Computing in Humanities Research

Hermeneuti.ca is a work about and for the application of computing to humanities research, specifically to textual studies and interpretation. We have both been part of the field that used to be called Humanities Computing and is now commonly referred to as the Digital Humanities. This field is one of the communities of practice that has been negotiating this application of computing into the humanities. [15] To some extent this book is the result of decades of development and reflection in this field and we will on occasion engage in dialogue with others in the field, especially around tools and collaboration, though this is not a survey of the field. One of the characteristics of the field is that it has focused on and supported the development of applied technology rather than being strongly theoretical. Humanities computing has – through training, conferences and projects – bridged the gap of scholarly practice and technology development rather than a theory/practice gap. Humanities computing was often based in units that supported computing for humanists in universities and therefore brought together faculty, staff, programmers and students to run labs, run servers, and develop tools. In short, computing humanists tended to build digital things, often for research uses by others, rather than theorize.

In Canada there has been a long tradition of building concording tools, starting with the PRORA concording tools, whose manual was published in 1966, to TACT in 1989, and the TAPoR project which led to and supports Hermeneuti.ca.[16] Hermeneuti.ca is, by virtue of being a hybrid of text and tool, another contribution in this tradition, and one that reflects back on what these code things are, the subject of the second chapter, “From the Concordance to Ubiquitous Analytics”. Willard McCarty, author of a book titled Humanities Computing (which is one of the first attempts to theorize the field), writes, in an essay he gave as a plenary lecture at the 2006 Canadian Symposium on Text Analysis,

But the more important lesson I’ve learned is that although better tools are possible, the humanist’s perspective on tools problematizes them. That is ultimately the point of tool-development in humanities computing, just as problematizing our methods and objects of study is ultimately the point of applying the tools we do have.[17]

Hermeneuti.ca is in this tradition of problematizing methods through developing tools, but we have tried to more tightly couple the development (writing of code) and interpretation (using the code). Hermeneuti.ca tries to argue, through its hybrid structure, that the lines between tool and text are blurred – and that blurring is good. Voyeur Tools isn't a better tool, or the one everyone has been waiting for; it is another contribution in a tradition of developmental and interpretative research.

Further, as mentioned above, we believe that collaborative practices of research and development are at the heart of humanities computing, and therefore Hermeneuti.ca also presents itself in a tradition of reflecting on collaborative practice. Here, however, is where we diverge from the service tradition in Humanities Computing that sees the field as a "methods commons" for research that happens elsewhere. [18] For us Humanities Computing is not just the collaborative development of tools for others (that would be closer to software engineering) or the application of tools by others to humanities problems (that would be digital humanities); it can be a disciplined set of practices that problematizes methodology, tools and interpretation at the same time. There is now a tradition of research development and discussion within the community independent of instrumental concerns. Hermeneuti.ca is a contribution to that tradition, from and for the field and involved in development as a form of research. Our research is simultaneously about "how we might think" while "thinking through" prototyping, coding, documenting and testing with real questions.[19] It is a particular type of research craft where one of the important outcomes is a re-imagination of how research tools should be designed to fit in the cycle of research.

Voyeur Tools, the tool intervention of Hermeneuti.ca, is a new text analysis environment meant to support Agile Interpretation in the following ways:

  • Voyeur is a web-accessible application available through the browser that doesn't need installation. Being online it is part of Hermeneuti.ca and woven together with the narrative text, essays, and documentation. You can weave it into your projects, share it, and work collaboratively. As such it is in a tradition of web accessible tools like TACTweb, HyperPo and TAPoRware. [20]
  • Voyeur is also designed to work "just-in-time" on any online or uploaded text. If you can see the text on the web (without a special subscription or access mechanism), Voyeur can interpret it, thus giving you the agility to experiment. The difference between Voyeur and TAPoRware or HyperPo is that Voyeur was designed to handle much larger corpora. It tries to combine the accessibility of web-based tools that work on texts retrieved on the fly with the capabilities of PC-based tools like TACT that pre-index their texts.
  • Voyeur is designed to fit differently into the research cycle than most text analysis tools which are stand-alone environments. Voyeur has ways of exporting results, but more importantly it allows one to easily export a panel of evidence that can be embedded into an online essay the way you can embed a YouTube video into a blog entry.

To support research Voyeur is designed to be rewritten and to support different interfaces. There has been much handwringing about how we are constantly reinventing our tools, something we think is actually a good thing ... called interpretation. Extreme Programming is built around turning change, refactoring, and iteration into a virtue; likewise we try to make reimplementation a virtue of Voyeur. We have come to the conclusion that if reinterpreting tools (and therefore rebuilding tools) for the humanities is an inescapable part of problematizing method, then we should welcome it as a research practice and design an environment for reimplementation instead of being tempted by the teleology of “getting it right once and for all”. Another way to put this is: beware of Voyeur. It is a research project, not a production tool you can buy shrinkwrapped and stable.

Thinking Through Text Technology

Like Descartes' Discourse, this work is also about thinking, but about a very different type of thinking through that both returns to an earlier model of how to do work and looks forward to how to do it with the hermeneutical tools at hand. This book takes a different path back than Descartes’s Discourse; our story does not reject authority or talking with others and, in fact, it is the story of writing and software development embedded in traditions. Our story does not present solitary thinking as a dialogue; instead we present our communal thinking as possibilities for dialogue, or, as you will see, as interactive essays where the dialogue is a possibility for you to follow through with hermeneutical things. Most importantly, our story doesn’t begin with reflecting about thinking but with interpretation assisted by tools.

What we have in common is that this work is about thinking through, but we will play with another sense of thinking-through than the Cartesian sense of thinking about or thinking thoroughly through method. This book is about thinking-through as thinking with or by means of extensions of the mind that instantiate methods. It is about thinking through with others and with technology where Descartes shunned both.

Thinking through is rooted in one of the paradigmatic styles of doing philosophy, dialogue (as opposed to solitary meditation). The Greek “dia” in the word “dialogos” meaning "conversation" does not, as many assume, mean “two”, but instead can be translated as “through”, “between” or “exchange”. Thus a playful etymology of “dialogue” would explain it as “thinking through” or that which comes "through conversation" whether it is the Cartesian inner dialogue or a conversation with another. [21]

Socrates, in one of Xenophon’s dialogues, played with the connection between dialogos (conversation) and dialego (to classify). Xenophon, writing about Socrates, says “The very word ‘discussion,’ according to him, owes its name to the practice of meeting together for common deliberation, sorting, discussing things after their kind: and therefore one should be ready and prepared for this and be zealous for it…”.[22] In the Greek the joke is obvious because there is only one word, dialegontas, a form of dialego, for both sorting and discussing. Dialogue for the Greeks was clearly connected with thinking through, by way of sorting and classifying, a collaborative practice illustrated in many of the Platonic dialogues as they sort through different definitions of the virtues.

Why text technology now?

In this book, however, we are going to concentrate on ways of thinking collaboratively through technology, specifically text technologies which we believe are of epochal importance. But why text technology? Why is information technology and in particular text technology so important now?

  1. First of all, because we are surrounded by electronic texts that we read mediated through technology. The e-texts we read on our laptops, smartphones, e-readers, off the web, and on screens are all read through technology. Text analysis tools can be considered as simply more powerful versions of the search utilities in the browsing and editing tools available for e-texts from your favourite word-processor to your PDF reader (text analysis tools can also be more nuanced and speculative).
  2. Second, we’re interested in text technology because the market for electronic reading is changing dramatically as we write. With the Kindle and the iPad, we seem to have viable electronic book and media readers that are actually doing well in the marketplace. Both have succeeded in connecting ease of acquisition to ease of reading, so that many are reading electronic representations despite the convenience of paper. Hermeneuti.ca looks at how we can go beyond reading in the sense of flipping virtual pages now that we have texts that can be processed by computer. It does so by asking not about electronic reading, but about interpretation.
  3. Third, and for Hermeneuti.ca most important, is the change in scale of available electronic texts. Thanks to Google Books, researchers can read millions of books in digital form. Should the intellectual property issues around Google Books ever be solved we could have access not just to page images, but to the raw text files.[23] The question is what can we do with so much data? Our tools for analyzing texts have grown out of concording tools designed to handle one book or a small collection, not millions of books. The types of questions we ask also tend to be about individual works, small collections of a single author, or comparative collections. What sorts of tools, methods and questions can handle millions of texts?[24]
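To make the idea of a concording tool concrete, here is a minimal sketch of a keyword-in-context (KWIC) listing in Python. It is an illustration only, not the code of Voyeur Tools or of any tool named in this book: the function name kwic, the width parameter and the short sample passage are assumptions made for the example.

```python
import re


def kwic(text, keyword, width=30):
    """Return one keyword-in-context line per occurrence of keyword.

    Hypothetical helper for illustration: matches whole words,
    ignoring case, and shows `width` characters on either side.
    """
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    lines = []
    for match in pattern.finditer(text):
        left = text[max(0, match.start() - width):match.start()]
        right = text[match.end():match.end() + width]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}} [{match.group(0)}] {right}")
    return lines


# A few lines from A Midsummer Night's Dream, used as a tiny sample text.
sample = (
    "Four nights will quickly dream away the time; "
    "Swift as a shadow, short as any dream; "
    "I have had a dream, past the wit of man to say what dream it was."
)

for line in kwic(sample, "dream"):
    print(line)
```

A sketch like this handles one short text held in memory, which is exactly the scale concording tools were designed for. Running the same loop over millions of books would call for different designs, such as prebuilt indexes or sampling.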

But it is not just researchers who need access to text technologies. According to a 2003 study, “How Much Information?”, by Peter Lyman, Hal R. Varian and colleagues at Berkeley, about 5 exabytes of new print, film, magnetic and optical information were produced in 2002, and the amount is growing by about 30% a year. Of this only a small amount, a mere 1,634 terabytes, is print, but consider that 2 petabytes is sufficient to represent all the U.S. academic research libraries.[25] Most of this print information is office documents: North Americans in 2003 were consuming 11,916 sheets of paper per person, and the study estimates that half of that is used in printers and copiers for office documents.

A more recent, and more alarming, study, “The Expanding Digital Universe”, prepared by IDC, a “global provider of market intelligence”, and commissioned by EMC² (a storage solutions company), estimates that,

In 2006, the amount of digital information created, captured, and replicated was 1,288 × 10^18 bits. In computer parlance, that’s 161 exabytes or 161 billion gigabytes. This is about 3 million times the information in all the books ever written. [26]

They go on to say that “between 2006 and 2010, the information added annually to this digital universe will increase more than six fold from 161 exabytes to 988 exabytes”.
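The arithmetic behind these figures can be checked in a few lines of Python. The sketch below assumes decimal (SI) units, so that an exabyte is 10^18 bytes, and assumes steady compound growth between 2006 and 2010. The variable names are ours, and the implied annual rate is a back-of-envelope derivation, not a figure from the IDC report.

```python
BITS_PER_BYTE = 8
GIGABYTE = 10**9   # bytes, decimal (SI) convention
EXABYTE = 10**18   # bytes, decimal (SI) convention

# IDC figure for 2006, as quoted: 1,288 x 10^18 bits.
bits_2006 = 1288 * 10**18
bytes_2006 = bits_2006 // BITS_PER_BYTE

exabytes_2006 = bytes_2006 // EXABYTE
gigabytes_2006 = bytes_2006 // GIGABYTE
print(exabytes_2006)    # 161 exabytes
print(gigabytes_2006)   # 161000000000, i.e. 161 billion gigabytes

# Projected growth from 161 to 988 exabytes between 2006 and 2010.
growth_factor = 988 / 161
years = 2010 - 2006

# Assumption: growth compounds at a constant annual rate over the period.
annual_rate = growth_factor ** (1 / years) - 1
print(f"{growth_factor:.2f}-fold overall, about {annual_rate:.0%} a year")
```

On these assumptions the "more than six fold" increase corresponds to annual growth of a little under 60%, roughly double the 30% a year reported in “How Much Information?”.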

It should not be surprising that, according to “How Much Information,” the Internet is the fastest growing medium, accounting for 532,897 terabytes between the web, e-mail, and instant messaging. Most of this is text like email. Text is even on the multimedia web, where HTML and PDF account for 17.8% and 9.2% respectively while images and movies account for 23.2% and 4.3% respectively. If you think about how people search and find information on the web through search engines like Google you can see the importance of text. Even if a growing amount of the information on the web is time-based media like video, it is text that we use to search for that information, it is text that is indexed, and it is text that makes up the metadata.
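A toy inverted index shows why text remains central even for images and video: what gets indexed and searched is the textual metadata attached to each item. The Python sketch below is a simplified illustration, not a description of how Google or any real search engine works. The item identifiers, the metadata strings and the function names build_index and search are all invented for the example.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lower-case a string and split it into simple alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each token to the set of item ids whose metadata contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def search(index, query):
    """Return the ids of items whose metadata contains every query term."""
    terms = tokenize(query)
    if not terms:
        return set()
    results = set(index.get(terms[0], set()))
    for term in terms[1:]:
        results &= index.get(term, set())
    return results


# Hypothetical metadata for non-textual items.
metadata = {
    "video-1": "Lecture on Descartes and the Discourse on Method",
    "video-2": "Interview about text analysis tools",
    "image-1": "Portrait of Descartes",
}

index = build_index(metadata)
print(sorted(search(index, "descartes")))         # ['image-1', 'video-1']
print(sorted(search(index, "descartes method")))  # ['video-1']
```

Neither the video nor the image is ever examined here. Both are found only through the words that describe them.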

This explosion of information raises ethical and privacy issues connected to hermeneutical issues. One major issue is who controls this universe of text and who is able to mine it.

IDC predicts that by 2010, while nearly 70% of the digital universe will be created by individuals, organizations (businesses of all sizes, agencies, governments, associations, etc.) will be responsible for the security, privacy, reliability, and compliance of at least 85% of that same digital universe.[27]

Where organizations have access to our words they can use analytical tools to mine them and draw inferences that individuals wouldn’t want drawn. We are already seeing the fall-out from the tension between individually created information and corporate management of it. A cover story in the CAUT Bulletin, “Email Outsourcing Threatens Privacy & Academic Freedom”, reports on the Lakehead University Faculty Association grievance against the university for outsourcing e-mail to Google's Gmail, whose terms of use allow it to store and process information in the United States. This opens the possibility that the email may be mined by the American government if Google is ordered to hand it over. [28] As more and more of even our “private” textual correspondence becomes available for large-scale analysis and interpretation, we need to learn more about these methods. Hermeneuti.ca introduces you to how analytics can be used so that individuals might have some control over the tools of analysis.

In short, we are practicing thinking in the humanities in an epoch of change in the way people read, the tools of reading, and the amount, privacy and organization of the information that we care about. And this matters.

Dangers

There are, however, some dangers ahead in Hermeneuti.ca with its doubled practices. The first is the disappearance of the author. To paraphrase what Shaftesbury said about dialogue, “the author is annihilated, and the reader, being in no way addressed, stands for nobody”. [29] This is a danger Heidegger and other philosophers of technology talk about: the danger that tools, when ready-to-hand, are transparent and the creator’s authorial responsibility for the tool is hidden. When using a hammer you don't wonder about the author of the hammer. Yet the tool is an extension of its maker's interpretation of what you might need, and you should be careful of that. To avoid this danger we have to ask how one might interpret tools. If tool development is research then it should be open to scrutiny as other types of research are, but open source is not openness in the way that a philosophical paper is open. It is hard to interpret things designed to be thought with rather than thought about because they are designed to withdraw, much as it has always been hard to interpret philosophical dialogues, at least as the position of a disappearing author. This is not to give undue value to Romantic ideas about the importance of the author; it is simply to point out that the interpretation of technologies is hard to do, especially while using them. Matt Kirschenbaum’s book Mechanisms shows us one way forward in that he adapts bibliographic practices to classics of electronic literature, from Michael Joyce’s Afternoon to early adventure-style computer games. He reads these as literature. We are taking a different approach and reading tools as hermeneutical things.

The second, and more prosaic, danger is that entanglement can lead to commoditization which can then corrupt research. David Noble in “Digital Diploma Mills: The Automation of Higher Education” warns about “the commoditization of the research function of the university, transforming scientific and engineering knowledge into commercially viable proprietary products that could be owned and bought and sold in the market”.[30] While we doubt there is any risk of being corrupted by commercialization in this project, especially since we are providing free access to the text and releasing the code for Voyeur Tools under an open license, we do worry about the entanglement in the administration of technology. Could we walk away from our tools if we were convinced they were inadequate or inappropriate? Are we willing to build tools that are conceptually interesting and innovative but that have little chance of being used? Does the weaving of development and critique mix practices that should be kept at arm's length for the sake of perspective? An easy answer is to argue that we have always been entangled, but that may just be sophistry as the entanglements in the humanities are usually trivial. No one really wants to buy our souls the way they want to buy pharmaceutical research results. That said, the sort of humanities computing work that develops tools faces a very real difference and danger: you find you need to get grants, and to get grants you have to get matching funds from industry, and so on. Such engaged work is not by definition corrupt, but it is corruptible and that’s why we consider it a danger.

A third and final danger is a bundle of commitments that we can call the modernist commitments to progress through technique. To think through the development of possible technologies is to agree, at least provisionally, that there could be better designs. Bundled with the practices of design comes a hope of improvement and a belief in progress. While this hope can be moderated by care for unanticipated outcomes, and by a skepticism regarding the hyper-ventilated claims of computing, you don’t do it without any hope and you do such work in the knowledge that you can’t anticipate how it will be used ultimately. We regard this danger as unavoidable – it is the danger of any action – any involvement in the world that is not cynical. We all, in some form or another, try things out in the face of dangers, and that is our hope. For that matter, it is the danger of any other type of intellectual work – you could be misinterpreted. Without the confidence of an intellectual ground or clear ends we are all local-modernists – trying to make a way forward in the local, but in ignorance of the outer grounds or end.

Don’t Imitate, Contribute

How can I read a hybrid book/site like Hermeneuti.ca?

Descartes’ Discourse is important to the practices of the humanities because it marks a shift to the explicit discussion of method. When Descartes introduces his method as a personal history which you can imitate or not, he is suggesting how the reader should engage the work. That suggestion has influenced how research is done and has dominated the methodological imagination of the humanities since. This book is about an alternative dialogical method where interpretation is done in conversation, specifically three types of conversation which suggest three ways you could read and engage us:

  • If you are interested in the discussion of interpretation and tools then you can stick to the theoretical parts of Hermeneuti.ca. We recommend to you the printed book as it is easier to read if you don't care to try things right away. If you want to disagree with us or otherwise engage things we've written about then contribute a comment and read our How to Contribute Reviewed Comments online. [31]
  • If the conversation between building the tools of interpretation and using them interests you then we recommend you start with Voyeur Tools and try it on your texts. You can try some of the "Recipes" using them to cook up your own hermeneutical things. From there you could then work back to documentation about Voyeur Tools. If you want to contribute a Recipe back for a different use or if you want to extend the code, then read our How to Contribute Recipes or Code online.
  • If you are interested in ways of writing with interactive analytics then you should look at how we do it with our essays and read the documentation on embedding panels in your own online writings. You could start by trying a small blog essay analyzing a work off the web, or, for that matter, chapters from this work here. If you have an essay that you think is exemplary and want to contribute it back to hermeneuti.ca or have us point to it, then please read our How to Contribute Essays.

Voyeur Tools was developed in part as a result of our first experiment, “Now Analyze That”, where we found it difficult to move swiftly from the analytical tool environment where textual evidence is explored to the environment of the essay where a new interpretation is crafted. If you will, we had trouble flying through from interpretation (the environment and activity of interpreting evidence) to interpretation (the writing environment of the new essay). Voyeur Tools was designed, as mentioned above, to allow us to cycle back and forth from interpretation to interpretation. Voyeur Tools is designed to work with the types of Web 2.0 online writing environments that have emerged, from blogs, to wikis, to works like hermeneuti.ca, which uses the open-source content management system Drupal. As in any experiment in interactive interface, the goal was to get to a point where the practices of moving between tool and text were swift enough to be experienced as another thread of dialogue rather than the long wait of years for the tool to come along that lets you ask the next question. We hope we have enabled your iterative experimentation with text technologies. We hope with hermeneuti.ca you can move swiftly through from interpretation to interpretation and not be held back as we were. If the interaction works, you will soon find the limits of Voyeur, and when you want one more feature you will have caught the bug that infected us. And if you then want to help write code you should see our How to Contribute Code online at hermeneuti.ca.

Works Cited

Auer, K., and R. Miller. Extreme Programming Applied: Playing to Win. Boston: Addison-Wesley, 2002.

Beck, K. Extreme Programming Explained: Embrace Change. Boston: Addison-Wesley, 2000.

Crane, G. “What Do You Do with a Million Books?” D-Lib Magazine. Vol. 12:3. 2006. Online at http://www.dlib.org/dlib/march06/crane/03crane.html.

Descartes, R. A Discourse on the Method of Correctly Conducting One's Reason and Seeking Truth in the Sciences. Trans. I. Maclean. Oxford: Oxford University Press, 2006.

Galey, A. and S. Ruecker. “Design as a Hermeneutic Process: Thinking Through Making from Book History to Critical Design.” Paper presented at the Digital Humanities 2009 conference, University of Maryland. June 22-25, 2009.

Glickman, R., and G. Staalman. Manual for the Printing of Literary Texts and Concordances by Computer. Toronto: University of Toronto Press, 1966.

IDC, “The Expanding Digital Universe.” Project Director J. F. Gantz. White paper available online. 2007. <http://www.emc.com/leadership/digital-universe/expanding-digital-univers....

Kirschenbaum, M. G. Mechanisms: New Media and the Forensic Imagination. Cambridge, MA: MIT Press, 2008.

Lyman, P., and H. R. Varian. How Much Information. 2003. Report online at <http://www.sims.berkeley.edu/how-much-info-2003>.

Noble, D. F. “Digital Diploma Mills: The Automation of Higher Education.” First Monday. Vol. 3, No. 1. January 1998. <http://firstmonday.org/htbin/cgiwrap/bin/ojs/index.php/fm/article/view/5...

Miles, M. “Descartes’s Method.” A Companion to Descartes. Blackwell Reference Online. Eds. J. Broughton and J. Carriero. Oxford: Blackwell, 2008.

McCarty, W. Humanities Computing. New York: Palgrave Macmillan, 2005.

McCarty, W. “Beyond the word: modeling literary context.” Text Technology. Forthcoming.

Rockwell, G. "Multimedia, Is it a Discipline? The Liberal and Servile Arts in Humanities Computing", Jahrbuch für Computerphilologie. Online and in print. Vol. 4, 2002. See <http://computerphilologie.uni-muenchen.de/jg02/rockwell.html>.

Rockwell, G. Defining Dialogue From Socrates to the Internet. Amherst, New York: Prometheus Books, 2003.

Shaftesbury, A. Earl of. Characteristics of Men, Manners, Opinions, Times, etc. 2 vols. Gloucester, Massachusetts: Peter Smith, 1963.

Xenophon. Xenophon in Seven Volumes. Trans. O. J. Todd and E. C. Marchant. 7 vols. Cambridge, Massachusetts: Harvard University Press, 1968.

Footnotes

[1] Descartes, Discourse on Method, Part 2, p. 12.

[2] To see the online version see <http://hermeneuti.ca>.

[3] To try Voyeur Tools see <http://voyeurtools.org>.

[4] “Now Analyze That” is included in print in this book. To see the interactive online version see <http://hermeneuti.ca/node/15>.

[5] To see all the Recipes of the Methods Commons see <http://methodi.ca>.

[6] Sinclair, S. and G. Rockwell. Collocate Cluster. Voyeur Tools. To try the tool with your text see <http://voyeurtools.org/tool/Links>.

[7] See Miles, "Descartes's Method" about his method of analytical reflexion.

[8] The full title of the Discourse is A Discourse on the Method of Correctly Conducting One's Reason and Seeking Truth in the Sciences. It should be noted, as mentioned above, that his "method" is different from, though closely related to, his story of how he went about correcting his thinking until led to the method. It is only in the three appendixes to the Discourse, like that on Geometry, that you get results from the method. They are the case studies showing how his method could be employed scientifically.

[9] Collaborative research is a new phenomenon in the humanities that is often dealt with by assigning percentages to the final outcomes as if writing a paper collaboratively was simply a division of labour as in, “I wrote that part worth 30%, and she wrote the remaining 70%.” For more see “The Evaluation of Digital Work” wiki maintained by the Modern Language Association, <http://wiki.mla.org/index.php/Evaluation_Wiki>.

[10] McCarty, Humanities Computing, page 81.

[11] There is something comedic about the strange pairs of computer science student programmers and senior scholars that do many humanities computing projects. The mismatch of backgrounds, age, and interests is what humanities computing attempts to bridge.

[12] For an introduction to Extreme Programming see Beck, Extreme Programming Explained or Auer and Miller, Extreme Programming Applied.

[13] The Recipe “Explore Themes Across a Text” can be followed at <http://hermeneuti.ca/node/123>.

[14] For the Day of Digital Humanities project see <http://tapor.ualberta.ca/taporwiki/index.php/Day_in_the_Life_of_the_Digi....

[15] It is by no means the only site where applications of computing to the humanities have taken place. Computational linguistics, quantitative history, cyberculture studies, media studies and recently game studies are other fields with communities of research using computing methods.

[16] For information about PRORA see Glickman and Staalman, Manual for the Printing of Literary Texts and Concordances by Computer. For TACT (Text Analysis Computing Tools) see <http://projects.chass.utoronto.ca/tact/> and for TAPoR see <http://portal.tapor.ca>.

[17] McCarty, “Beyond the word: modelling literary context”, page 1.

[18] Willard McCarty introduced the idea of a Methodological Commons with a chart on page 119 of Humanities Computing. The chart depicts the Commons and its relationships to disciplines. His position in Humanities Computing is too nuanced for a quick summary, but in general he has argued for an interdisciplinary field and against the independence of disciplinarity. We, on the other hand, have argued for our own research agenda and programmes in articles like “Multimedia, Is it a Discipline?”

[19] Development as a form of research is common in the design field as Alan Galey and Stan Ruecker argued in “Design as a Hermeneutic Process: Thinking Through Making from Book History to Critical Design.”

[20] For TACTweb see <http://tactweb.mcmaster.ca/tactweb/doc/tact.htm>. For HyperPo see <http://tapor.mcmaster.ca/~hyperpo> and for TAPoRware see <http://taporware.ualberta.ca>.

[21] For a more in-depth consideration of dialogue see, Rockwell, Defining Dialogue From Socrates to the Internet.

[22] Xenophon, Memorabilia, IV. vi. 1.

[23] For more on this see the American Library Association's site, Google Book Settlement: An Informational Site for the Library Community at <http://wo.ala.org/gbs/> or Google's site on the Google Book Settlement at <http://www.googlebooksettlement.com/>.

[24] One computing humanist who anticipated this issue of scale is Greg Crane in “What Do You Do with a Million Books?”

[25] A petabyte is 1,000,000,000,000,000 bytes (10^15 bytes).

[26] IDC, “The Expanding Digital Universe”, page 1.

[27] Ibid. page 1.

[28] See < http://cautbulletin.ca/default.asp?SectionID=0&SectionName=&VolID=34&Vol...

[29] See Shaftesbury’s Characteristics of Men, Manners, Opinions, Times, Etc., p. 132. The original quote is, “For here (in dialogue) the author is annihilated, and the reader, being no way applied to, stands for nobody. The self-interesting parties both vanish at once."

[30] Noble, “Digital Diploma Mills.”

[31] Instructions on How to Contribute are at <http://hermeneuti.ca/node/80>.