The Glasgow Stop Words list is a popular stop word list developed by the Information Retrieval Group at the University of Glasgow.
Stop word lists are generally composed of commonly used words, such as 'an', 'the', 'is', 'at', 'to' or 'we'. These lists are used to filter out such words when processing natural language data found in a novel, article, e-mail archive or other corpus of text, allowing the tool to identify significant words unique to that corpus.
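As a rough illustration of the filtering described above, the following Python sketch counts the words in a short text after dropping stop words. The six-word set and the function name `significant_words` are assumptions made for this example; the set is far smaller than the Glasgow list, and this is not how TAPoR or Voyant implement filtering.

```python
import re
from collections import Counter

# Tiny illustrative stop word set (an assumption; NOT the Glasgow list).
STOP_WORDS = {"an", "the", "is", "at", "to", "we"}


def significant_words(text, stop_words=STOP_WORDS):
    """Count the words in `text` that are not in `stop_words`."""
    # Lowercase the text and split it into runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    return Counter(t for t in tokens if t not in stop_words)


counts = significant_words(
    "We went to the harbour, and the harbour is at the edge of an island."
)
print(counts.most_common(3))
```

Words such as 'and' and 'of' survive here only because the illustrative set is so short; a fuller list would remove them as well.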
TAPoR and Voyant use a modified version of the Glasgow Stop Words list in their respective text analysis toolsets. The modifications add numeric characters, punctuation, other text symbols and individual letters to the list, and remove words such as 'top', 'sincere' and 'beyond'.
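The kind of modification described above could be sketched in Python as follows. The `original` set is a hypothetical stand-in for the full Glasgow list, and `modify_stop_words` is a name invented for this example; the actual modified list may differ in which symbols and letters it contains.

```python
import string

# Hypothetical stand-in for the original list; the real list is much longer.
original = {"an", "the", "is", "at", "to", "we", "top", "sincere", "beyond"}


def modify_stop_words(base):
    """Return a copy of `base` with symbols added and some words removed."""
    modified = set(base)
    modified |= set(string.digits)           # numeric characters
    modified |= set(string.punctuation)      # punctuation and other ASCII symbols
    modified |= set(string.ascii_lowercase)  # individual letters
    modified -= {"top", "sincere", "beyond"}  # words the modified list leaves out
    return modified


modified = modify_stop_words(original)
print(sorted(original - modified))
```

Working on a copy leaves the original set untouched, so both versions remain available for comparison.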
This list may be applied or ignored according to the needs of the user: for example, a user searching for common phrases may wish to retain the stop words in the results, while a user looking for the top words may wish to filter them out.
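The two cases above can be sketched in Python with a switch that turns the stop word list on or off. The function names, the `apply_stop_words` parameter and the short stop word set are all assumptions made for this illustration, not options taken from TAPoR or Voyant.

```python
import re
from collections import Counter

# Tiny illustrative stop word set (an assumption; NOT the Glasgow list).
STOP_WORDS = {"an", "the", "is", "at", "to", "we"}


def tokenize(text):
    """Lowercase `text` and split it into word tokens."""
    return re.findall(r"[a-z']+", text.lower())


def top_words(text, n, apply_stop_words=True):
    """Return the `n` most frequent words, optionally dropping stop words."""
    tokens = tokenize(text)
    if apply_stop_words:
        tokens = [t for t in tokens if t not in STOP_WORDS]
    return Counter(tokens).most_common(n)


def common_phrases(text, n, size=2):
    """Return the `n` most frequent phrases of `size` words, stop words kept."""
    tokens = tokenize(text)
    # Zip the token list against shifted copies of itself to form phrases.
    phrases = zip(*(tokens[i:] for i in range(size)))
    return Counter(" ".join(p) for p in phrases).most_common(n)


text = "We sail to the island. We sail to the harbour at the island."
print(top_words(text, 3))                          # stop words filtered out
print(top_words(text, 3, apply_stop_words=False))  # stop words retained
print(common_phrases(text, 3))                     # phrases keep stop words
```

Filtering before building phrases would join words that were never adjacent in the text, which is one reason a phrase search might keep the stop words.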
The original Glasgow Stop Words list is available via the Information Retrieval Group.
The modified Glasgow Stop Words list is available here.
For further information on stop word lists, please see the entries at Wikipedia and SearchSOA.