THATcamp Kansas

I (Geoffrey Rockwell) gave a workshop on Voyant at the Kansas 2012 THATcamp. This time we had a number of backup servers set up and they all worked well. Some participants were working with Arabic, which worked to a degree. Stéfan set up a system that resolves to different servers:

http://bit.ly/VoyantCirrusFrankenstein

resolves to

http://resolve.voyant-tools.org/tool/Cirrus/?corpus=frankenstein&stopLis...

which then redirects to

http://temp.voyant-tools.org/tool/Cirrus/?corpus=frankenstein&stopList=s...

That's in "workshop" mode (where the temp instance is favoured). If you remove the incontext part of the URL

http://resolve.voyant-tools.org/tool/Cirrus/?corpus=frankenstein&stopLis...

it resolves to the main server.

http://voyant-tools.org/tool/Cirrus/?corpus=frankenstein&stopList=stop.e...
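The redirect step can be sketched roughly as follows. This is only an illustration of the idea, not Stéfan's actual resolver: the function name `resolve` and the `workshop_mode` flag are hypothetical, and how the real system decides which mode it is in is not shown here. Only the two host names come from the URLs above.

```python
from urllib.parse import urlsplit, urlunsplit

# Host names taken from the URLs in this post.
MAIN_HOST = "voyant-tools.org"
TEMP_HOST = "temp.voyant-tools.org"


def resolve(url: str, workshop_mode: bool) -> str:
    """Return the URL a resolver might redirect to.

    Hypothetical sketch: in "workshop" mode the temp instance is
    favoured, otherwise the request goes to the main server. The
    path and query string are passed through unchanged.
    """
    parts = urlsplit(url)
    host = TEMP_HOST if workshop_mode else MAIN_HOST
    return urlunsplit((parts.scheme, host, parts.path, parts.query, parts.fragment))


if __name__ == "__main__":
    # Illustrative request with a shortened query string.
    request = "http://resolve.voyant-tools.org/tool/Cirrus/?corpus=frankenstein"
    print(resolve(request, workshop_mode=True))
    print(resolve(request, workshop_mode=False))
```

The point of keeping the path and query intact is that a single short link can be handed out in a workshop and pointed at whichever server is under the least load, without participants needing a different URL.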

Some of the issues/questions that came up:

  • How should projects like ours deal with server load? I think the resolving system that we tried, for the first time in the workshop, actually worked well.
  • The documentation should be consistent. Different links go to different help places. They should probably all go to docs.voyant-tools.org.
  • We need to provide more documentation on the Correspondence Analysis tool.
  • A number of people want to do linguistic work and would like the ability to lemmatize, search by lemmas, and use texts with POS (Part of Speech) information. This will be difficult to do. Can we add some wild card searching for word lists? Could we imagine a special skin for linguistic work with special tools?
  • A number of people have XML-encoded (TEI) texts. We need to document better what we can do with XML and figure out a way to give users control over their XML. One idea is to be able to subset (create a corpus of smaller documents from a text) based on XPath, where all passages that fit the criteria get aggregated into "documents".
  • We need to test and document the stop word list editing feature. Is there a way to save and reuse a custom list? Can one work with Cyrillic words?
  • We should create a list of known and sharable corpora.
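
To make the XPath subsetting idea above a bit more concrete, here is a minimal sketch using Python's standard library. It is not how Voyant does or would do it: the function name `subset_by_xpath` and the sample markup are made up for illustration, the sample has no TEI namespace, and the sketch treats each matching element as its own "document" (one could just as well aggregate all matches into one). Note that `xml.etree.ElementTree` supports only a limited subset of XPath.

```python
import xml.etree.ElementTree as ET


def subset_by_xpath(xml_text: str, xpath: str) -> list[str]:
    """Return the text of every element matching xpath, one "document" each.

    Hypothetical helper: whitespace inside each match is collapsed so
    that nested inline markup does not leave stray gaps.
    """
    root = ET.fromstring(xml_text)
    documents = []
    for element in root.findall(xpath):
        text = "".join(element.itertext())
        documents.append(" ".join(text.split()))
    return documents


# Made-up sample, loosely TEI-like but without namespaces.
SAMPLE = """<text>
  <div type="letter"><p>First letter.</p></div>
  <div type="chapter"><p>Chapter <hi>one</hi>.</p></div>
  <div type="letter"><p>Second letter.</p></div>
</text>"""

if __name__ == "__main__":
    for doc in subset_by_xpath(SAMPLE, ".//div[@type='letter']"):
        print(doc)
```

Here the expression selects only the letters, so a user could build a corpus of letters out of a single encoded text and leave the chapters out.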