web spider


"TextSTAT is a simple programme for the analysis of texts. It reads plain text files (in different encodings) and HTML files (directly from the internet), and it produces word frequency lists and concordances from these files. This version includes a web spider, which reads as many pages as you want from a particular website and puts them in a TextSTAT corpus. The new news reader likewise puts news messages in a TextSTAT-readable corpus file.
TextSTAT reads MS Word and OpenOffice files. No conversion needed, just add the files to your corpus..."
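
The workflow described above (spider a site, collect the pages into a corpus, then build a word frequency list and a concordance) can be illustrated with a minimal Python sketch. This is not TextSTAT's own code or file format; the names `crawl`, `word_frequencies`, `concordance`, and the `fetch` callable are hypothetical, and the tokenizer is a deliberately crude assumption.

```python
import re
from collections import Counter, deque
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class PageParser(HTMLParser):
    """Collects text nodes and link targets from one HTML page.

    Simplification: text inside <script> and <style> is collected too.
    """

    def __init__(self):
        super().__init__()
        self.text_parts = []
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_data(self, data):
        self.text_parts.append(data)


def crawl(start_url, fetch, max_pages=10):
    """Breadth-first crawl restricted to the host of start_url.

    `fetch` is a hypothetical callable that maps a URL to an HTML string.
    Returns a dict of URL -> extracted text (the "corpus").
    """
    host = urlparse(start_url).netloc
    queue, seen, corpus = deque([start_url]), {start_url}, {}
    while queue and len(corpus) < max_pages:
        url = queue.popleft()
        parser = PageParser()
        parser.feed(fetch(url))
        corpus[url] = " ".join(parser.text_parts)
        for href in parser.links:
            target = urljoin(url, href).split("#")[0]  # drop fragments
            if urlparse(target).netloc == host and target not in seen:
                seen.add(target)
                queue.append(target)
    return corpus


def tokenize(text):
    """Lowercase the text and split it into word tokens (assumed scheme)."""
    return re.findall(r"\w+", text.lower())


def word_frequencies(corpus):
    """Count every token across all pages in the corpus."""
    counts = Counter()
    for text in corpus.values():
        counts.update(tokenize(text))
    return counts


def concordance(corpus, keyword, width=3):
    """Return (left context, keyword, right context) for each occurrence."""
    keyword = keyword.lower()
    hits = []
    for text in corpus.values():
        tokens = tokenize(text)
        for i, tok in enumerate(tokens):
            if tok == keyword:
                left = " ".join(tokens[max(0, i - width):i])
                right = " ".join(tokens[i + 1:i + 1 + width])
                hits.append((left, tok, right))
    return hits
```

Passing the fetcher in as an argument keeps the sketch testable without network access. For real use, `fetch` could wrap `urllib.request.urlopen` and decode the response body, and `max_pages` plays the role of "as many pages as you want".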

Last updated: 24 Mar 2015

ScrapBook is a Firefox extension that helps you save Web pages and easily manage collections. Major features are:
* Save Web page
* Save snippet of Web page
* Save Web site
* Organize the collection in the same way as Bookmarks
* Full text search and quick filtering search of the collection
* Editing of the collected Web page
* Text/HTML edit feature resembling Opera's Notes

Last updated: 29 Dec 2014