TokenX is a web-based environment for visualizing, analyzing and playing with texts. Options include word clouds, highlighting words, keywords in context, replacing words with blocks, highlighting punctuation and non-words, counting words both in context and out of context, and substituting words. A number of sample files are provided, or users can point TokenX to any XML file online.

Last updated: 19 Apr 2016

Voyeur is a web-based text analysis environment where users can apply a wide variety of tools to any text they import.

Last updated: 3 Nov 2015

WordSmith allows users to develop concordances, find keywords, and develop word lists from plain text files.
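
For readers unfamiliar with the terms, a word list and a concordance can be sketched in a few lines of Python. This is only an illustration of the general techniques, not WordSmith's implementation. The function names (`tokenize`, `word_list`, `concordance`), the tokenization rule and the sample sentence are all assumptions made for this sketch.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into runs of letters, digits and apostrophes.

    This tokenization rule is an assumption; real tools offer configurable rules.
    """
    return re.findall(r"[a-z0-9']+", text.lower())


def word_list(text):
    """Return (word, count) pairs sorted by descending count, then alphabetically."""
    counts = Counter(tokenize(text))
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


def concordance(text, keyword, width=2):
    """Return keyword-in-context lines with up to `width` tokens on each side."""
    tokens = tokenize(text)
    lines = []
    for i, token in enumerate(tokens):
        if token == keyword.lower():
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append(f"{left} [{token}] {right}".strip())
    return lines


# Hypothetical sample text, used only for illustration.
sample = "The cat sat on the mat. The dog saw the cat."

for word, count in word_list(sample)[:3]:
    print(word, count)

for line in concordance(sample, "cat"):
    print(line)
```

The `width` parameter controls how many tokens of context appear on each side of the keyword, which is the usual adjustable setting in a keyword-in-context display.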

Last updated: 22 May 2015

AntWordProfiler is free software for analyzing word frequency.

Last updated: 9 May 2015

"TextSTAT is a simple programme for the analysis of texts. It reads plain text files (in different encodings) and HTML files (directly from the internet) and it produces word frequency lists and concordances from these files. This version includes a web-spider which reads as many pages as you want from a particular website and puts them in a TextSTAT-corpus. The new news-reader, too, puts news messages in a TextSTAT-readable corpus file.
TextSTAT reads MS Word and OpenOffice files. No conversion needed, just add the files to your corpus..."

Last updated: 24 Mar 2015

HyperPo is a user-friendly text exploration and analysis program that allows users to import texts or use texts available online (in English or French), and provides frequency lists of characters, words and series of words, color-coding to indicate repetition, KWIC, co-occurrence and distribution lists, and the ability to simultaneously compare data from multiple texts.

Last updated: 29 Dec 2014

Wmatrix is web-based software for corpus analysis and comparison. It provides a web interface to the USAS and CLAWS corpus annotation tools, and standard corpus linguistic methodologies such as frequency lists and concordances. It also extends the keywords method to key grammatical categories and key semantic domains.
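
As a rough illustration of the keywords method mentioned above, the sketch below ranks items that are overused in a study corpus relative to a reference corpus using the log-likelihood statistic, a common keyness measure in corpus linguistics. That Wmatrix computes keyness in exactly this way is an assumption, and the function names and toy data are hypothetical.

```python
import math
from collections import Counter


def log_likelihood(a, b, c, d):
    """Log-likelihood keyness score.

    a, b: frequency of the item in the study and reference corpus.
    c, d: total number of tokens in the study and reference corpus.
    """
    expected_study = c * (a + b) / (c + d)
    expected_ref = d * (a + b) / (c + d)
    score = 0.0
    if a > 0:
        score += a * math.log(a / expected_study)
    if b > 0:
        score += b * math.log(b / expected_ref)
    return 2 * score


def keywords(study_tokens, reference_tokens, top=5):
    """Return the `top` items overused in the study corpus, highest score first."""
    study, ref = Counter(study_tokens), Counter(reference_tokens)
    c, d = len(study_tokens), len(reference_tokens)
    scored = []
    for item in study:
        a, b = study[item], ref[item]
        # Keep only items whose relative frequency is higher in the study corpus.
        if a / c > b / d:
            scored.append((item, log_likelihood(a, b, c, d)))
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))[:top]


# Hypothetical toy corpora, used only for illustration.
study_corpus = ["whale"] * 5 + ["the"] * 5
reference_corpus = ["the"] * 5 + ["ship"] * 5

for item, score in keywords(study_corpus, reference_corpus):
    print(item, round(score, 2))
```

Because the functions only count items, the same comparison applies to key grammatical categories or key semantic domains if the input lists hold tags instead of words.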

Last updated: 29 Dec 2014

"In the WordHoard environment, texts are annotated or tagged by morphological, lexical, prosodic, and narratological criteria. They are mediated through a 'digital page' or user interface that lets scholarly but non-technical users explore the greatly increased query potential of textual data kept in such a form."

Code license: GNU GPL, Open source
Last updated: 29 Dec 2014

Bookworm enables you to graphically explore lexical trends in repositories of digitized texts.

Code license: Open source
Last updated: 29 Dec 2014

Word and Phrase utilizes the Corpus of Contemporary American English (COCA) to analyze texts for word frequencies, collocations, and concordance lines. Users copy and paste texts into a web interface.

Last updated: 29 Dec 2014
