stop words

Phrase Frequency Counter Advanced scans MS Word DOCX files and plain-text or text-like files (including HTML and XML files encoded in ANSI or UTF-8) and counts how many times each phrase occurs. It also works as a multiple-file phrase-search program. You can specify exactly what counts as a word (e.g., whether words may contain hyphens or numerals). The phrases found can be listed alphabetically or by frequency, with rank and frequency displayed for each, and the resulting list can itself be searched.

Code license: Closed source
Last updated: 5 Jun 2018
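
The tool itself is closed source, so the following is only a rough sketch of the general technique described in this entry: counting phrases (runs of n consecutive words) under a configurable definition of "word", then listing them by rank and frequency. The function names, the two regular expressions, and the sample text are all assumptions made for illustration; none of them come from the product.

```python
import re
from collections import Counter

# Two hypothetical definitions of "word" (assumptions, not the tool's settings):
# one allows internal hyphens and digits, the other accepts letters only.
WORD_WITH_HYPHENS_AND_DIGITS = r"[A-Za-z0-9]+(?:-[A-Za-z0-9]+)*"
WORD_LETTERS_ONLY = r"[A-Za-z]+"


def count_phrases(text, n=2, word_pattern=WORD_LETTERS_ONLY):
    """Count every run of n consecutive words in text, ignoring case."""
    words = [w.lower() for w in re.findall(word_pattern, text)]
    phrases = (" ".join(words[i:i + n]) for i in range(len(words) - n + 1))
    return Counter(phrases)


def ranked(counts):
    """Return (rank, phrase, frequency) rows, most frequent first, ties alphabetical."""
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(rank, phrase, freq)
            for rank, (phrase, freq) in enumerate(ordered, start=1)]


sample = "The well-known tool counts phrases. The well-known tool lists phrases."
counts = count_phrases(sample, n=2, word_pattern=WORD_WITH_HYPHENS_AND_DIGITS)

by_frequency = ranked(counts)   # listing by frequency, with rank
alphabetical = sorted(counts)   # listing phrases alphabetically
```

Switching `word_pattern` to `WORD_LETTERS_ONLY` splits "well-known" into two words, which changes the phrases that are counted. That is the kind of effect the word-definition setting has on the results.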

cue.language is a Java library for tokenizing text (into words, sentences, and n-grams), counting strings, guessing languages, and detecting stop words.

Code license: Apache License, Open source
Last updated: 29 Dec 2014
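
To illustrate what stop word detection and stop-word-based language guessing involve, here is a minimal sketch in Python. It does not use or reproduce the cue.language API, and it is not claimed to be the method the library uses. The tiny stop word lists and all function names are assumptions chosen for the example; real lists are far longer.

```python
import re

# Tiny illustrative stop word lists (assumed for this sketch only).
STOP_WORDS = {
    "english": {"the", "and", "of", "is", "a", "in", "to"},
    "french": {"le", "la", "et", "de", "est", "un", "les"},
}


def tokenize(text):
    """Split text into lowercase runs of letters (Unicode-aware)."""
    return re.findall(r"[^\W\d_]+", text.lower())


def remove_stop_words(text, language):
    """Return the tokens of text that are not stop words in the given language."""
    stops = STOP_WORDS[language]
    return [token for token in tokenize(text) if token not in stops]


def guess_language(text):
    """Guess the language whose stop word list matches the most tokens.

    On a tie, the language listed first in STOP_WORDS wins.
    """
    tokens = tokenize(text)
    scores = {
        language: sum(token in stops for token in tokens)
        for language, stops in STOP_WORDS.items()
    }
    return max(scores, key=scores.get)


content_words = remove_stop_words("The cat is in the garden", "english")
guessed = guess_language("Le chat est sur la table et le chien dort")
```

Counting stop word hits is a crude heuristic: it works on short samples only when they happen to contain common function words, so a production tool would need larger lists or a different signal.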