Tools

  • Bubblelines (word distribution visualization)
  • Bubbles (text reading visualization)
  • Cirrus (word cloud visualization)
  • Corpus Grid (table of texts in a corpus)
  • Corpus Summary (corpus overview)
  • Corpus Term Frequencies (table of term frequencies by corpus)
  • Collocate Term Frequencies (table of term frequencies in proximity to keyword)
  • Document Term Frequencies (table of term frequencies by document)
  • Document KWICs (concordance, or table of keywords in context)

    A concordance (or keyword in context) is a gathering of passages that "concord" or agree. Usually it is a gathering of passages that contain a sought-for word.

    Concordances are a form of reading tool that goes back to the Middle Ages. They are typically lists of words with their appearances. A concordance for the Bible, for example, would have entries for all the content words of the Bible in alphabetical order. Each entry would include information about where the word appears and some context. Searching for words on a computer now typically returns a concordance called a Key Word in Context (KWIC), with the sought word down the center and a few words of context on either side. Google returns a type of concordance when you search for a word: each page it recommends comes with an example of the word in context.

    See the Wikipedia entry on Concordance (Publishing).
  • Entities Browser (named entities visualization)
  • Knots (term occurrence visualization)
  • Lava (keyword in context visualization)
  • Links (term frequencies in proximity to keyword visualization)
  • Mandala (term browsing visualization)
  • Reader (large-scale document reader)
  • ScatterPlot (term distribution visualization)
  • Term Frequencies Chart (term distribution visualization)
  • Term Fountain (term frequencies visualization)
Bubblelines

Bubblelines is a visualization tool that helps reveal patterns of word repetition in one or more documents. Each document is represented as a horizontal line and each search term as a bubble. The text is divided into segments of equal length, and each bubble represents the frequency of the term in the corresponding segment. The larger the bubble, the more frequent the term.

Glossary note on "line": a line is the string of text limited by the width of a page. Lines are often used in tokenization, and may contain parts of one or more sentences. For example,

"The quick brown fox jumps over the lazy dog."

is a complete sentence and occurs on one line. By contrast,

"Hard by a great forest dwelt a poor wood-cutter with his wife and his
two children. The boy was called Hansel and the girl Gretel. He had little
to bite and to break, and once when great dearth fell on the land, he
could no longer procure even daily bread."

spans three sentences and four lines.
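
The segment-based counting described for Bubblelines can be sketched in a few lines of Python. This is only an illustration, not the tool's own code: the function name `segment_counts`, the simple `\w+` tokenizer, and the choice to measure segment length in tokens rather than characters are all assumptions.

```python
import re

def segment_counts(text, term, segments=10):
    """Count occurrences of `term` in each of `segments` roughly equal slices of the text.

    Assumption: segments are measured in tokens, and a token is any run of
    word characters, lowercased.
    """
    tokens = re.findall(r"\w+", text.lower())
    counts = [0] * segments
    for i, tok in enumerate(tokens):
        if tok == term.lower():
            # Map the token position onto a segment index in 0..segments-1.
            counts[i * segments // len(tokens)] += 1
    return counts

text = "the cat sat on the mat and the dog sat on the log"
print(segment_counts(text, "the", segments=4))  # → [1, 1, 1, 1]
```

Each number in the result would correspond to one bubble along the document's line, with larger counts drawn as larger bubbles.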

Bubbles

Bubbles reads the words in a document (or corpus) and displays the highest frequency words within proportionately large bubbles.

Cirrus

Cirrus is a visualization tool that displays a word cloud based on the frequency of words appearing in one or more documents. A word cloud is a visual presentation of keywords drawn from a text, visually differentiated based on their position and frequency of use in that text. The larger the word, the more frequent the term. You can click on any word in the cloud to obtain more detailed information about it.

Corpus Grid

Corpus Grid shows an overview of the corpus, including each document's title, number of word tokens (total words), number of word types (unique words), and lexical density (the ratio of tokens to types).
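
As a rough illustration of the columns listed above, the following Python sketch computes tokens, types, and their ratio for one document. The function name `grid_row`, the dictionary keys, and the `\w+` tokenizer are hypothetical choices made for this example; the ratio follows the wording above (tokens divided by types), whereas the widely used type-token ratio is the inverse (types divided by tokens).

```python
import re

def grid_row(title, text):
    """Return token count, type count, and tokens-per-type for one document.

    Assumption: a token is any run of word characters, lowercased.
    """
    tokens = re.findall(r"\w+", text.lower())
    types = set(tokens)  # unique words
    ratio = len(tokens) / len(types) if types else 0.0
    return {
        "title": title,
        "tokens": len(tokens),
        "types": len(types),
        "tokens_per_type": ratio,
    }

row = grid_row("Example", "the cat and the hat")
print(row)
```

Here the sample text has five tokens and four types, because "the" occurs twice.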

Corpus Summary

Corpus Summary is a tool that provides a simple, textual overview of the current corpus. It reports the number of words, number of unique words, longest documents, highest vocabulary density, most frequent words, notable peaks in frequency, and distinctive words. Users can click on any of these items for more detailed information about the analysis.

Corpus Term Frequencies

Corpus Term Frequencies shows overall word frequencies for the entire corpus as well as information about how word frequencies are spread out over documents within the corpus. Hover over column headers and buttons for more information.
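
The two kinds of information mentioned above (an overall count per word, and how that count is spread over documents) can be sketched as follows. This is a minimal illustration under assumed names (`corpus_term_frequencies`) and an assumed `\w+` tokenizer, not a description of how the tool computes its table.

```python
import re
from collections import Counter

def corpus_term_frequencies(documents):
    """Map each term to (total count in corpus, list of counts per document)."""
    per_doc = [Counter(re.findall(r"\w+", doc.lower())) for doc in documents]
    total = Counter()
    for counts in per_doc:
        total.update(counts)
    # A Counter returns 0 for terms that a document does not contain.
    return {term: (n, [counts[term] for counts in per_doc])
            for term, n in total.items()}

table = corpus_term_frequencies(["a b a", "b c"])
for term, (total, spread) in sorted(table.items()):
    print(term, total, spread)
```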


 
Document Term Frequencies

Document Term Frequencies shows word frequencies for each document in the corpus. You can see the selected word at the top of the window highlighted in yellow. Its relevance to the documents is shown in the table below. Hover over the column headers or toolbar buttons for more information.

Document KWICs

Document KWICs shows a table of keywords in their context. In text analysis, context refers to the text surrounding a string of characters, which may be as short as a word or as long as a paragraph. Context is particularly important when generating a concordance for a string. In other words, the tool provides a list of selected keywords and their occurrences within a corpus or document.
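
A keyword-in-context table of the kind described here can be sketched in Python as below. The function name `kwic`, the `width` parameter (number of context words on each side), and the `\w+` tokenizer are assumptions made for illustration only.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence of `keyword`."""
    tokens = re.findall(r"\w+", text)
    rows = []
    for i, tok in enumerate(tokens):
        if tok.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows

rows = kwic("The quick brown fox jumps over the lazy dog", "the", width=2)
for left, kw, right in rows:
    # Right-align the left context so the keyword lines up down the center.
    print(f"{left:>12} [{kw}] {right}")
```

Matching is case-insensitive, but the keyword is shown as it appears in the text.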

Entities Browser

Entities Browser provides a named entities visualization.

Knots

Knots is a visualization tool that helps reveal patterns of word relevance in one or more documents. Each term is represented as a twisted line. Where the lines overlap, the terms are related or linked.

Lava

Lava allows you to view multiple levels of a corpus in a three-dimensional environment. Clicking on certain documents within the corpus expands the Lava visualization in a ring to explore further. By clicking on certain parts of the visualization, you are able to explore terms within their context.

Links

Links finds collocates for words (terms that occur in proximity to a keyword) and displays the links between them as a force-directed graph. The result is a web of terms based on term frequencies in proximity to the keyword.
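
Counting collocates, the step that would precede drawing such a graph, can be sketched as follows. The function name `collocates`, the `window` parameter (how many words on each side count as "in proximity"), and the `\w+` tokenizer are assumptions for this example; the graph layout itself is not shown.

```python
import re
from collections import Counter

def collocates(text, keyword, window=2):
    """Count the words that occur within `window` tokens of each occurrence of `keyword`."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for i, tok in enumerate(tokens):
        if tok == keyword.lower():
            # Neighbours on both sides, excluding the keyword occurrence itself.
            neighbours = tokens[max(0, i - window):i] + tokens[i + 1:i + 1 + window]
            counts.update(neighbours)
    return counts

result = collocates("the cat sat on the mat", "sat", window=2)
print(result.most_common())
```

Each (keyword, collocate, count) triple could then serve as a weighted edge in a graph of terms.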

Collocate Term Frequencies

Collocate Term Frequencies shows a table of term frequencies in proximity to a keyword.

Mandala

Mandala is a visualization tool that imports “textual” files and analyzes the frequency and linkage of words. For example, you may import a play and examine the linkage between a word and its speaker, along with how frequently that word occurs.

Reader

Reader lets you read all the documents within a specified corpus. It does not perform text analysis; it is a way of viewing the contents of a corpus.

ScatterPlot

ScatterPlot creates a scatter plot graph of terms, spaced by their variation from one another.

Term Frequencies Chart

Term Frequencies Chart shows how terms are distributed across document(s) in a corpus (documents are shown in the order in which they were added).

  
Term Fountain

Term Fountain visualizes word frequencies as a fountain.