Structural Analysis

This is a Windows program for generating and searching a KWIC concordance of a document ("KWIC" = "Keywords in Context"). A KWIC concordance is a list of the different words occurring in the document, with each instance of each word shown in context (that is, within a phrase). Word frequency is shown. Context size is user-definable, anything from 3 to 19 words long. The software acts on text files and on MS Word docx files, skipping over "stop" words. The concordance can be displayed alphabetically or by frequency, and can be written to a file.

Code license: Closed source
Last updated: 3 Feb 2017
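
The KWIC idea described in this entry can be sketched in a few lines of Python. This is only an illustration of the general technique, not the listed program's implementation: the function names, the tokenising rule, and the tiny stop-word list are all assumptions made for the example. Only the 3 to 19 word context range comes from the description above.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list, for illustration only.
STOP_WORDS = {"a", "an", "and", "in", "of", "on", "the", "to"}


def kwic(text, context_size=5, stop_words=STOP_WORDS):
    """Map each non-stop word to its occurrences, each shown in context."""
    if not 3 <= context_size <= 19:
        raise ValueError("context_size must be between 3 and 19 words")
    # Assumed tokenising rule: lower-cased runs of letters and apostrophes.
    tokens = re.findall(r"[A-Za-z']+", text.lower())
    left = (context_size - 1) // 2      # words shown before the keyword
    right = context_size - 1 - left     # words shown after the keyword
    concordance = defaultdict(list)
    for i, word in enumerate(tokens):
        if word in stop_words:
            continue  # stop words are not indexed, but may appear as context
        window = tokens[max(0, i - left):i + right + 1]
        concordance[word].append(" ".join(window))
    return concordance


def by_frequency(concordance):
    """Entries sorted by descending frequency, then alphabetically."""
    return sorted(concordance.items(), key=lambda kv: (-len(kv[1]), kv[0]))


if __name__ == "__main__":
    result = kwic("The cat sat on the mat. The cat ran.", context_size=3)
    for word, contexts in by_frequency(result):
        print(word, len(contexts), contexts)
```

Sorting `concordance.items()` directly gives the alphabetical view; `by_frequency` gives the frequency view. Windows are shorter at the start and end of the text, where fewer neighbouring words exist.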

TXM

TXM is free, open-source, cross-platform text analysis software based on Unicode, XML and TEI. It runs on Windows, Mac OS X and Linux. It is also available as J2EE-compliant, GWT-based portal software for online access, with built-in access control (see the demo portal: http://portal.textometrie.org/demo).

Code license: Open source, GNU GPL v3
Last updated: 29 Jun 2016

IBM AeroText is an information extraction system for developing knowledge-based content analysis applications.

Last updated: 15 Jun 2016

Philomine is an extension to the Philologic text retrieval engine that supports a variety of machine learning, text mining, and document clustering tasks.

Code license: Open source, GNU GPL
Last updated: 22 Feb 2016

yWorks is a powerful set of tools for creating diagrams using any number of frameworks. There are tools for working with HTML, FLEX, AJAX, Silverlight, Java and .NET.

yEd is also available from the yWorks site. This free graph editor can be used to create diagrams manually, or to import data for analysis.

Code license: Closed source
Last updated: 1 Dec 2015

Superfastmatch is designed to find exact duplicates of text strings between documents.

Code license: Open source, GNU GPL
Last updated: 1 Dec 2015

Sigma is a JavaScript library for rendering a graph file in the browser. It makes it easy to publish networks on Web pages, and allows developers to integrate network exploration into rich Web applications.
It is highly interactive: a researcher can take work from a dedicated graph analysis package such as Gephi and share it on the web to communicate research outputs, while viewers can explore the raw graph network and make their own findings.

Code license: MIT License
Last updated: 14 Nov 2015

CulturalAnalytics is an R package containing functions for statistical analysis and plotting of image properties, including statistics such as the standard deviation and mean in the RGB and HSV color spaces, image entropy and histograms in greyscale (intensity) and color, and for plotting color clouds and image scatter charts.

Code license: Open source, GNU GPL
Last updated: 12 Nov 2015
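
To make the statistics named in this entry concrete, here is a minimal Python sketch of three of them: mean, standard deviation, and greyscale image entropy. It does not use or reproduce the CulturalAnalytics R package. The function names are hypothetical, greyscale intensity is assumed to be the plain average of the R, G and B channels, and the standard deviation is the population form; the package may define any of these differently.

```python
import math
from collections import Counter


def greyscale(pixel):
    """Assumed intensity: the rounded average of the R, G and B channels."""
    r, g, b = pixel
    return round((r + g + b) / 3)


def mean_and_std(values):
    """Mean and population standard deviation of a list of numbers."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, math.sqrt(variance)


def entropy(intensities):
    """Shannon entropy, in bits, of the histogram of intensity values."""
    counts = Counter(intensities)  # the histogram: intensity -> pixel count
    total = len(intensities)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


if __name__ == "__main__":
    # A hypothetical 2x2 image given as a flat list of RGB tuples.
    pixels = [(0, 0, 0), (0, 0, 0), (255, 255, 255), (255, 255, 255)]
    grey = [greyscale(p) for p in pixels]
    print(mean_and_std(grey))
    print(entropy(grey))
```

An image with a single flat intensity has zero entropy, while one split evenly between two intensities has an entropy of one bit.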

corpkit is a tool for doing corpus linguistics.

It does a lot of the usual things, like parsing, concordancing and keywording, but also extends their potential significantly: you can concordance by searching for combinations of lexical and grammatical features, and can do keywording of lemmas, of subcorpora compared to corpora, or of words in certain positions within clauses.

Corpus interrogations can be quickly edited and visualised in complex ways, or saved and loaded within projects, or exported to formats that can be handled by other tools.

Code license: MIT License
Last updated: 30 Oct 2015

corpkit is a tool for doing corpus linguistics.

It does a lot of the usual things, like parsing, concordancing and keywording, but also extends their potential significantly: you can concordance by searching for combinations of lexical and grammatical features, and can do keywording of lemmas, of subcorpora compared to corpora, or of words in certain positions within clauses.

Corpus interrogations can be quickly edited and visualised in complex ways, or saved and loaded within projects, or exported to formats that can be handled by other tools.

Code license: MIT License
Last updated: 5 Oct 2015

SentimentBuilder is an online tool that performs text analytics on emails, reviews, feedback, chat data or any unstructured texts via Natural Language Processing. It's the only tool where you can upload a file for processing and then visually view the results in a Sankey Flow Report to quickly identify trends, issues and strengths and then customize each view, save and share! Export any result for your own offline analysis! Try the Always Free version today and upload your own data or try one of our sample files.

Code license: Closed source
Last updated: 4 Sep 2015

SylvaDB is a graph database management system. It allows users with no knowledge of graph theory to model, collect, query, and analyze data in a network structure. SylvaDB provides tools for easy schema creation and modelling, automatic form creation for data input, collaborative features, a visual query editor, global and local search, report and chart generation, network metrics, and visualization tools.

Code license: GNU Affero GPL v.3
Last updated: 9 Jun 2015

DiscoverText allows users to import data from a variety of sources (including Facebook & Twitter feeds, plain text, Word, Excel, public YouTube comments, blogs/wikis, PDF, etc.), code them, and generate tag clouds and reports.

Last updated: 24 May 2015

Whatizit can ingest up to 500,000 terms pasted into the input box and execute any of the pre-defined text analysis pipelines.

Last updated: 23 May 2015

Diction analyzes texts for language indicating certainty, activity, optimism, realism, and commonality.

Last updated: 19 May 2015

A website that explains statistical concepts and provides a web-based environment for performing the related calculations. Tools include a graph maker, distribution generators, t-tests and procedures, and correlation and regression tests. All tools are written in JavaScript and run within the browser.

Code license: Closed source
Last updated: 14 May 2015

Lynks provides an easy-to-use, in-browser tool that helps you create your own networks. Lynks is an initiative of the Centre for Innovation, part of Leiden University (Campus The Hague). The software was developed in 2014 in co-creation with Dr. Eelke Heemskerk of the University of Amsterdam, who contributed his expertise. Its development was supported by financial contributions from the European Union Fund for Regional Development (EFRO) and the Municipality of The Hague.

Code license: Closed source
Last updated: 12 May 2015

Linguistic Inquiry and Word Count is a text analysis software program that calculates the degree to which people use different categories of words across a wide array of texts.

Last updated: 2 May 2015

VennMaker provides an interactive platform for compiling, generating, visualising and analysing relationship data.

Code license: Open source
Last updated: 22 Apr 2015

CollateX is Java software for collating textual sources, for example to produce a critical apparatus. As of January 2012 the project was at an early stage of development and lacked thorough documentation.

Code license: GNU GPL v3
Last updated: 25 Mar 2015

Praat is software for the phonetic analysis of speech, including support for articulatory and speech synthesis.

Code license: GNU GPL v2
Last updated: 19 Feb 2015

CATMA (Computer Aided Textual Markup & Analysis) is a free, open source markup and analysis tool from the University of Hamburg's Department of Languages, Literature and Media. It incorporates three interactive modules: (1) The tagger enables flexible and individual textual markup and markup editing. (2) The analyzer incorporates a query language and predefined functions. It also includes a query builder that allows users to construct queries from combinations of pre-defined questions while allowing for manual modification for more specific questions.

Code license: GNU GPL v3
Last updated: 29 Dec 2014

MONK is a digital environment designed to help humanities scholars discover and analyze patterns in the texts they study.

Last updated: 29 Dec 2014

IBM InfoSphere is intended for enterprise-scale data warehouses, delivering access to structured and unstructured information and operational and transactional data.

Last updated: 29 Dec 2014

Korbo is a powerful aggregation platform for gathering Linked Data objects relevant to your area of research into single workspaces or “baskets”.

Korbo is targeted primarily at developers who want to build applications on top of its API and make full use of the linked cultural data from sources such as Europeana, FreeBase and DBPedia.

Korbo is currently in the early stages of development, but you can already try out a demo version of the platform.

Code license: Open source, GNU GPL
Last updated: 29 Dec 2014

Ptolemaic is a computer application for music visualization and analysis written in the Java programming language. The software is designed to aid in the analysis of all types of Western music using established analytical techniques, including tonal functional analysis (Harrison 1994), pitch-class set analysis (Forte 1973), hierarchical linear analysis (Schenker 1935, Jones 2002), tonal pitch-space analysis on the Tonnetz (Riemann 1915), and transformation analysis (Lewin 1987).

Code license: Open source, GNU GPL
Last updated: 29 Dec 2014