Voyeur Tools: See Through Your Texts

Introducing Voyeur

Voyeur is a web-based text analysis environment. It is designed to be user-friendly, flexible and powerful. Voyeur is part of Hermeneuti.ca, a collaborative project to develop and theorize text analysis tools and text analysis rhetoric. This section of the Hermeneuti.ca web site provides information and documentation for users and developers of Voyeur.

What you can do with Voyeur:

  • use texts in a variety of formats including plain text, HTML, XML, PDF, RTF and MS Word
  • use texts from different locations, including URLs and uploaded files
  • perform lexical analysis, including the study of frequency and distribution data
  • export data into other tools (as XML, tab-separated values, etc.)
  • embed live tools into remote web sites that can accompany or complement your own content
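
To make the frequency, distribution and export items above concrete, here is a minimal Python sketch of the same kinds of operations performed outside Voyeur. It is an illustration only, not Voyeur's implementation: the function names, the simple tokenization rule and the five-segment default are all assumptions made for this example.

```python
import csv
import io
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def term_frequencies(text):
    """Count how often each word occurs in the text."""
    return Counter(tokenize(text))


def term_distribution(text, term, segments=5):
    """Count occurrences of `term` in each of `segments` equal slices of the text."""
    tokens = tokenize(text)
    counts = [0] * segments
    for position, token in enumerate(tokens):
        if token == term:
            # Map the token position onto a segment index in 0..segments-1.
            counts[position * segments // len(tokens)] += 1
    return counts


def frequencies_to_tsv(frequencies):
    """Serialize (term, count) rows as tab-separated values, most frequent first."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter="\t", lineterminator="\n")
    writer.writerow(["term", "count"])
    ordered = sorted(frequencies.items(), key=lambda item: (-item[1], item[0]))
    for term, count in ordered:
        writer.writerow([term, count])
    return buffer.getvalue()


if __name__ == "__main__":
    sample = "the cat sat on the mat and the dog sat too"
    print(frequencies_to_tsv(term_frequencies(sample)))
    print(term_distribution(sample, "the", segments=2))
```

Frequency answers "how often does a term occur?", while distribution answers "where in the text does it occur?". The tab-separated output can be opened in a spreadsheet or passed to another tool.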

Voyeur is a work in progress and is currently in beta. Some things don't work properly, and some planned features aren't available yet. In particular, here are some weaknesses that we recognize:

  • lack of more advanced linguistic processing (lemmatization, parts of speech, semantic awareness)
  • lack of XML-aware analytic features (though XML is a valid input format)
  • the current default skin (configuration of tools) is not well-suited to reading texts
  • some of the user documentation is a bit bare
  • other missing functionality:
    • proximity searching of terms
    • multi-word (n-gram) views (though you can search for specific phrases)
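
For readers unfamiliar with the last two terms, the following Python sketch shows what n-gram counting and proximity searching generally mean. It is a generic illustration under simple assumptions (hypothetical function names, naive tokenization, a word-based window) and says nothing about how Voyeur does or will implement these features.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens (a deliberately simple rule)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngram_counts(text, n=2):
    """Count every run of n consecutive words in the text."""
    tokens = tokenize(text)
    # zip over n shifted copies of the token list yields each n-word window.
    grams = zip(*(tokens[i:] for i in range(n)))
    return Counter(" ".join(gram) for gram in grams)


def proximity_matches(text, first, second, window=3):
    """Return (first_position, second_position) pairs at most `window` words apart."""
    tokens = tokenize(text)
    first_positions = [i for i, token in enumerate(tokens) if token == first]
    second_positions = [i for i, token in enumerate(tokens) if token == second]
    return [
        (a, b)
        for a in first_positions
        for b in second_positions
        if a != b and abs(a - b) <= window
    ]


if __name__ == "__main__":
    line = "to be or not to be"
    print(ngram_counts(line, n=2).most_common(3))
    print(proximity_matches(line, "or", "be", window=3))
```

An n-gram view surfaces recurring phrases without the user naming them in advance, whereas a proximity search finds two chosen terms that occur near each other in either order.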

To get started, try viewing one of the screencasts to the right, or continue to Workshops -> Voyeur Tools for Users.