Philomine is an extension to the Philologic text retrieval engine that supports a variety of machine learning, text mining, and document clustering tasks.
PhiloLine is an add-on for the Philologic text retrieval engine. It implements a sequence alignment algorithm for humanities text analysis that identifies "similar passages" in large collections of texts.
Philologic is a full-text search, retrieval and analysis tool with support for TEI-Lite XML/SGML, Unicode encoding, plaintext, Dublin Core/HTML, and DocBook.
PAIR is a sequence alignment algorithm for humanities text analysis, designed to identify "similar passages" in large collections of texts. In addition to being offered as a Philologic add-on, PAIR is available as Text::Pair, a generalized Perl module that supports one-against-many comparisons: a corpus is indexed, and each incoming text is then compared against the entire corpus to detect text reuse.
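The one-against-many workflow described above (index a corpus, then compare each incoming text against it) can be illustrated with a minimal Python sketch. This is not the PAIR algorithm or the Text::Pair API: it swaps in a simple word n-gram ("shingle") overlap count in place of sequence alignment, and the names shingles, build_index, find_reuse, and the sample texts are all hypothetical.

```python
from collections import defaultdict


def shingles(text, n=3):
    """Return the set of lowercase word n-grams (shingles) in text."""
    words = text.lower().split()
    return {tuple(words[i:i + n]) for i in range(len(words) - n + 1)}


def build_index(corpus, n=3):
    """Map each shingle to the set of document ids that contain it."""
    index = defaultdict(set)
    for doc_id, text in corpus.items():
        for sh in shingles(text, n):
            index[sh].add(doc_id)
    return index


def find_reuse(index, incoming, n=3, min_shared=2):
    """Count shingles shared by the incoming text and each indexed document.

    Only documents sharing at least min_shared shingles are returned.
    """
    counts = defaultdict(int)
    for sh in shingles(incoming, n):
        # .get avoids creating empty entries in the defaultdict index
        for doc_id in index.get(sh, ()):
            counts[doc_id] += 1
    return {d: c for d, c in counts.items() if c >= min_shared}


# Hypothetical two-document corpus, indexed before any comparison
corpus = {
    "doc_a": "the spirit of the laws depends on the nature of government",
    "doc_b": "a dictionary of the arts and sciences arranged by subject",
}
index = build_index(corpus)

# One incoming text compared against the entire indexed corpus
matches = find_reuse(index, "he wrote that the spirit of the laws depends on climate")
print(matches)
```

Because the corpus is indexed up front, each incoming text only needs to look up its own shingles rather than be compared pairwise with every document. A real alignment tool would additionally merge neighboring matches into passages and tolerate small differences in wording.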
The ARTFL Encyclopédie Project has digitized the Encyclopédie ou Dictionnaire raisonné des sciences, des arts et des métiers, par une Société de Gens de lettres, published under the direction of Diderot and d'Alembert between 1751 and 1772. The work contains 74,000 articles written by more than 130 contributors. The project has made it available online for scholars to use with the Philologic text retrieval engine and the Philomine text mining tools.