Having sent out the prospectus (and keeping our fingers crossed), we are now turning back to the second experiment, tentatively called "The Swallow Flies Swiftly Through." The idea is to try to understand humanities computing / digital analysis using text analysis. The experiment uses Humanist as its corpus, which gives us a large body of text to work with. It also forced us to think about how to do diachronic analysis. By creating a separate corpus for each year and naming them consistently, we can get distribution graphs over time. We also, with support from the Digging Into Data challenge, developed a correspondence analysis tool that works well.
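To make the per-year idea concrete, here is a minimal Python sketch of how a distribution over time could be computed once there is one corpus per year. It is not the distribution tool we actually use; the function names, the tokenizer, and the toy texts are all assumptions made for illustration.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def yearly_distribution(corpora_by_year, word, per=10_000):
    """Return {year: occurrences of `word` per `per` tokens}, in year order."""
    word = word.lower()
    distribution = {}
    for year in sorted(corpora_by_year):
        tokens = tokenize(corpora_by_year[year])
        count = Counter(tokens)[word]
        # Normalize by corpus size so that long and short years are comparable.
        distribution[year] = per * count / len(tokens) if tokens else 0.0
    return distribution


# Toy stand-ins for the per-year corpora (hypothetical, not real Humanist text).
corpora = {
    "1988": "the computing centre discussed text and text encoding",
    "1987": "humanities computing and the text",
}
dist = yearly_distribution(corpora, "text")
```

Keying each corpus by its year means a plain sort puts the points in chronological order, which is all a graph over time needs.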
One way we are studying the corpus is by looking at the words (across the corpus) that skew one way or the other. The idea is to look for words trending up or down. Here is a simple skin I developed for going through the words with the distribution tool and a KWIC (keyword in context) concordance for checking.
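One simple way to put a number on "trending up or down" is the least-squares slope of a word's per-year relative frequencies. This is only a sketch of one possible skew measure, not necessarily what the tool computes; the function names and the frequencies below are invented for illustration.

```python
def trend_slope(values):
    """Least-squares slope of the values against their position (0, 1, 2, ...)."""
    n = len(values)
    if n < 2:
        return 0.0
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in enumerate(values))
    denominator = sum((x - mean_x) ** 2 for x in range(n))
    return numerator / denominator


def rank_by_skew(freqs_by_word):
    """Sort words from most strongly rising to most strongly falling."""
    return sorted(
        freqs_by_word,
        key=lambda word: trend_slope(freqs_by_word[word]),
        reverse=True,
    )


# Invented per-year relative frequencies, oldest year first.
freqs = {
    "digital": [1.0, 2.0, 3.0, 4.0],
    "mainframe": [4.0, 3.0, 2.0, 1.0],
    "text": [2.0, 2.0, 2.0, 2.0],
}
ranked = rank_by_skew(freqs)
```

A positive slope marks a word that is gaining ground over the years, a negative slope one that is fading, and a slope near zero one that holds steady.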
By playing with the settings (using the TAPoRware stopword list and a z-score of 2 or more) I can narrow the list of words down and then go through them manually.
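The narrowing step can be sketched as follows. I am assuming here that each word already has some numeric score (a frequency or a skew measure) and that the z-score is taken against the mean and standard deviation of all non-stopword scores; the tool may well compute it differently. The stopword set is a tiny placeholder, not the TAPoRware list, and the scores are invented.

```python
from statistics import mean, pstdev

# Placeholder stopwords for illustration; NOT the actual TAPoRware list.
STOPWORDS = {"the", "and", "of", "to", "a"}


def filter_by_z_score(scores, stopwords=STOPWORDS, threshold=2.0):
    """Drop stopwords, then keep words whose z-score is at least `threshold`."""
    kept = {w: s for w, s in scores.items() if w not in stopwords}
    if len(kept) < 2:
        return {}
    values = list(kept.values())
    mu = mean(values)
    sigma = pstdev(values)  # population standard deviation
    if sigma == 0:
        return {}  # every word has the same score, so nothing stands out
    z_scores = {w: (s - mu) / sigma for w, s in kept.items()}
    return {w: z for w, z in z_scores.items() if z >= threshold}


# Invented scores: one stopword, one outlier, and nine unremarkable words.
scores = {"the": 500.0, "web": 11.0}
scores.update({f"word{i}": 1.0 for i in range(9)})
candidates = filter_by_z_score(scores)
```

Removing the stopwords before computing the mean matters: otherwise a handful of very frequent function words would inflate the standard deviation and hide the words worth reading by hand.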
Some things I need: