DHCS

KORA is a digital repository that allows institutions to ingest, manage, and deliver digital objects and metadata.

Code license: Open source
Last updated: 5 Aug 2015

The Text Creation Partnership has double-keyed roughly 55,000 titles from ProQuest’s EEBO image product into fully searchable, TEI-compliant SGML/XML texts. These texts contain rich metadata fields that indicate when a text includes features such as alchemical language, bibliographic citations, and epistolary forms.

Last updated: 17 May 2015

CorpusSearch 2 allows users to construct and search syntactically annotated corpora, including finding and counting lexical and syntactic patterns, correcting systemic errors, and coding linguistic features.

The software is released under the Mozilla Public License 1.1 (MPL 1.1).

Code license: Open source
Last updated: 11 Feb 2015

Abbot is a tool for undertaking large-scale conversion of XML document collections in order to make them interoperable with one another. In particular, Abbot can make one or more collections conform to a designated schema (including a schema used to define one of the collections).

By default, Abbot converts documents into TEI Analytics, a TEI subset designed for text analysis applications.

Last updated: 29 Dec 2014