Stylistic Analysis

Textable is an open-source program for text analysis. It offers a set of basic text-analytic components (for example, importing text from files, segmenting it into words, and measuring segment diversity), which the user combines in a visual interface to build custom analytic workflows.

Code license: GNU GPL v3
Last updated: 20 Aug 2017

DocuScope is a text analysis environment with a suite of interactive visualization tools for corpus-based rhetorical analysis. Designed as a tool for rhetorical analysis, it's also extremely effective for developing the dictionary in a systematic fashion. The tool uses a collection of lists called "Language Action Types," or LATs, each of which contains words believed to belong to a given conceptual category. Words that designate "Positive Emotion" constitute one LAT; words that are associated with orality make up another LAT, and so on.

Last updated: 14 Jul 2016
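
The category-list idea described above can be illustrated with a minimal sketch. It assumes a tiny, invented dictionary and single-word matching only; the category names, word lists, and the `count_lat_hits` helper are hypothetical and are not taken from DocuScope itself.

```python
import re
from collections import Counter

# Hypothetical miniature dictionary: category name -> set of words.
# These word lists are invented for illustration; they are not DocuScope's LATs.
LATS = {
    "PositiveEmotion": {"happy", "joy", "love", "delight"},
    "Orality": {"well", "oh", "yeah", "hey"},
}

def count_lat_hits(text, lats=LATS):
    """Count how many tokens in `text` belong to each category."""
    # Lowercase the text and split it into simple word tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    counts = Counter()
    for token in tokens:
        for name, words in lats.items():
            if token in words:
                counts[name] += 1
    return counts

sample = "Oh, well, I love this. Such joy!"
print(count_lat_hits(sample))
```

Comparing such per-category counts across documents is one simple way to turn word lists into a profile of a text; a real dictionary would be far larger than this toy one.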

TXM

TXM is free, open-source, cross-platform text analysis software based on Unicode, XML and TEI, and it runs on Windows, Mac OS X and Linux. It is also available as J2EE-compliant portal software (GWT-based) for online use with built-in access control (see a demo portal: http://portal.textometrie.org/demo).

Code license: Open source, GNU GPL v3
Last updated: 29 Jun 2016

Philomine is an extension to the Philologic text retrieval engine that supports a variety of machine learning, text mining, and document clustering tasks.

Code license: Open source, GNU GPL
Last updated: 22 Feb 2016

A free iOS app for text analysis. Textal allows you to analyze documents, tweet streams, and webpages, and to create clickable text clouds based on the source data that you choose. It comes pre-loaded with a large number of public domain texts. Text clouds are easily shareable via Twitter and email.

Last updated: 18 Dec 2015

corpkit is a tool for doing corpus linguistics.

It does a lot of the usual things, like parsing, concordancing and keywording, but also extends their potential significantly: you can concordance by searching for combinations of lexical and grammatical features, and can do keywording of lemmas, of subcorpora compared to corpora, or of words in certain positions within clauses.

Corpus interrogations can be quickly edited and visualised in complex ways, or saved and loaded within projects, or exported to formats that can be handled by other tools.

Code license: MIT License
Last updated: 30 Oct 2015

The ‘Stylo’ package provides easy-to-use implementations of various established analyses in the field of computational stylistics, including non-traditional authorship attribution, genre recognition, style development (“stylochronometry”), etc. The package includes a number of explanatory methods provided by the function stylo() (multidimensional scaling, principal component analysis, cluster analysis, bootstrap consensus trees).

Last updated: 16 Jun 2015

Whatizit can ingest up to 500,000 terms pasted into the input box and execute any of the pre-defined text analysis pipelines.

Last updated: 23 May 2015

WordSmith allows users to develop concordances, find keywords, and develop word lists from plain text files.

Last updated: 22 May 2015

AntWordProfiler is free software for analyzing word frequency.

Last updated: 9 May 2015

Linguistic Inquiry and Word Count is a text analysis software program that calculates the degree to which people use different categories of words across a wide array of texts.

Last updated: 2 May 2015

CollateX is Java software for collating textual sources, for example to produce a critical apparatus. As of January 2012 the project was at an early stage of development and lacked thorough documentation.

Code license: GNU GPL v3
Last updated: 25 Mar 2015

JGAAP is software designed for textual analysis, text categorization, and authorship attribution.

Last updated: 25 Mar 2015

"TextSTAT is a simple programme for the analysis of texts. It reads plain text files (in different encodings) and HTML files (directly from the internet) and it produces word frequency lists and concordances from these files. This version includes a web-spider which reads as many pages as you want from a particular website and puts them in a TextSTAT-corpus. The new news-reader, too, puts news messages in a TextSTAT-readable corpus file.
TextSTAT reads MS Word and OpenOffice files. No conversion needed, just add the files to your corpus..."

Last updated: 24 Mar 2015

CorpusSearch 2 allows users to construct and search syntactically annotated corpora, including finding and counting lexical and syntactic patterns, correcting systemic errors, and coding linguistic features.

The software is released under the Mozilla Public License 1.1 (MPL 1.1).

Code license: Open source
Last updated: 11 Feb 2015

A software tool for performing concordance (the analysis of a set of words within its immediate context) on a body of text. The tool performs full concordance, reading and analysing every word in a text. It was initially written for the analysis of English texts, but has since been extended to cater for other Western languages. Limited support is also provided for text in East Asian scripts, such as Chinese and Korean.

Code license: Closed source
Last updated: 11 Feb 2015
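
As a rough illustration of what a concordance involves, the sketch below lists each occurrence of a keyword together with a few words of context on either side (a "keyword in context" view). It is a minimal example under simple assumptions (regex word tokens, case-insensitive matching); the `kwic` function and its `width` parameter are hypothetical names and do not reflect how any tool listed here is implemented.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left, match, right) tuples: each occurrence of `keyword`
    with up to `width` tokens of context on either side."""
    tokens = re.findall(r"\w+", text)
    target = keyword.lower()
    lines = []
    for i, token in enumerate(tokens):
        if token.lower() == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            lines.append((left, token, right))
    return lines

sample = "The cat sat on the mat. The dog chased the cat around the yard."
for left, word, right in kwic(sample, "cat", width=2):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>15} | {word} | {right}")
```

A full concordance in the sense described above would repeat this for every distinct word in the text rather than for a single keyword.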

AntConc is free concordance software. It is multi-platform and easy to deploy and use.

AntConc is part of a suite of related tools for text processing and analysis, including applications for parallel corpus analysis, word profiling, PDF-to-text conversion, text structure analysis, detecting and converting character encodings, Japanese and Chinese segmentation and tokenization, word-class tagging, and spelling variant analysis. The developer is currently drafting a more explicit licence for the use of the software.

Last updated: 11 Feb 2015

MONK is a digital environment designed to help humanities scholars discover and analyze patterns in the texts they study.

Last updated: 29 Dec 2014