

F4 eases the transcription of audio or video recordings and can save about 30% of your time. You can adjust the playback speed to match your personal transcription speed, and a foot pedal can be used to control playback. Time marks, speaker changes, and text modules can be inserted automatically.

Code license: Closed source
Last updated: 2 Jan 2019

A tool to convert Normal/Scanned PDF and Image to Word, Excel, PPT, Keynote, Pages, Text, etc. on Mac.

  • Convert PDF to Word (.doc), Excel (.xlsx), and More Common Office Format Files
  • Convert PDF to Pages and Keynote
  • Convert PDF to Graphics Files
  • Convert Scanned PDF with Accurate OCR
  • Convert Multilingual PDF Files
  • Support Password-Restricted PDF Files
Code license: Closed source
Last updated: 29 May 2016

An optical character recognition (OCR) engine for creating editable and searchable electronic files from scanned paper documents, PDFs and digital photographs.

  • Recognition of Digital Camera and Mobile Phone Camera Images
  • Comprehensive Language Support
  • Complete Integration with Popular Office Applications
  • PDF conversion, archiving and security
Code license: Closed source
Last updated: 17 May 2016

Part-of-Speech (POS) tagging software for English. POS tagging, also known as wordclass tagging, is the classification of words into one or more categories based upon their definition, relationship with other words, or other context. CLAWS (Constituent Likelihood Automatic Word-tagging System) uses several methods to identify parts of speech, most notably Hidden Markov models (HMMs), which involve counting examples of co-occurrence of words and wordclasses in training data and making a table of the probabilities of certain sequences of words.
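As a rough illustration of the counting-and-probability-table idea, the following minimal sketch trains a toy HMM tagger and decodes with the Viterbi algorithm. It is not CLAWS's implementation: the training sentences, the tag names, the smoothing constant, and all function names are assumptions made for this example.

```python
from collections import Counter, defaultdict

# Tiny hand-made training corpus (hypothetical data, for illustration only).
TRAINING = [
    [("the", "DET"), ("dog", "NOUN"), ("barks", "VERB")],
    [("the", "DET"), ("cat", "NOUN"), ("sleeps", "VERB")],
    [("a", "DET"), ("dog", "NOUN"), ("sleeps", "VERB")],
]


def train(sentences):
    """Count tag-to-tag transitions and tag-to-word emissions."""
    transitions = defaultdict(Counter)
    emissions = defaultdict(Counter)
    for sentence in sentences:
        previous = "<s>"  # sentence-start marker
        for word, tag in sentence:
            transitions[previous][tag] += 1
            emissions[tag][word] += 1
            previous = tag
    return transitions, emissions


def prob(counter, key, n_outcomes, smoothing=0.1):
    """Additively smoothed relative frequency of `key` in `counter`."""
    total = sum(counter.values())
    return (counter[key] + smoothing) / (total + smoothing * n_outcomes)


def viterbi(words, transitions, emissions):
    """Return the most probable tag sequence for `words`."""
    tags = sorted(emissions)
    vocabulary = {w for counts in emissions.values() for w in counts}
    n_tags = len(tags)
    n_words = len(vocabulary) + 1  # one extra slot for unknown words
    # best maps the last tag to (probability, tag path so far).
    best = {"<s>": (1.0, [])}
    for word in words:
        new_best = {}
        for tag in tags:
            emit = prob(emissions[tag], word, n_words)
            candidates = [
                (p * prob(transitions[prev], tag, n_tags) * emit, path + [tag])
                for prev, (p, path) in best.items()
            ]
            new_best[tag] = max(candidates, key=lambda c: c[0])
        best = new_best
    return max(best.values(), key=lambda c: c[0])[1]


transitions, emissions = train(TRAINING)
print(viterbi(["the", "dog", "sleeps"], transitions, emissions))
```

A real tagger would train on a large annotated corpus and work with log probabilities to avoid numeric underflow on long sentences.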


Code license: Closed source
Last updated: 3 May 2016

DM is an environment for the study and annotation of images and texts. It is a suite of tools, enabling scholars to gather and organize the evidence necessary to support arguments based in digitized resources. DM enables users to mark fragments of interest in manuscripts, print materials, photographs, etc. and provide commentary on these resources and the relationships among them.

Last updated: 1 May 2016

Transana is open source software for the transcription and qualitative analysis of audio and video.

Last updated: 23 Feb 2016

Combined with the Leptonica Image Processing Library, Tesseract can read a wide variety of image formats and convert them to text in over 40 languages.

This code is a raw OCR engine. It has no output formatting and no UI. It can detect fixed pitch vs proportional text. Nevertheless in 1995 this engine was in the top 3 in terms of character accuracy, and it compiles and runs on both Linux and Windows. Training code is included in the open source release.
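Because the engine has no UI, it is typically driven from the command line or a script. The sketch below wraps the `tesseract` command-line program from Python. It assumes the executable is installed and on the PATH and that the `eng` language data is available; the helper names `build_tesseract_command` and `ocr_image` are hypothetical.

```python
import shutil
import subprocess


def build_tesseract_command(image_path, language="eng"):
    """Build the argument list: input image, output to stdout, language."""
    return ["tesseract", image_path, "stdout", "-l", language]


def ocr_image(image_path, language="eng"):
    """Run Tesseract on one image and return the recognised plain text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract executable not found on PATH")
    result = subprocess.run(
        build_tesseract_command(image_path, language),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract fails
    )
    return result.stdout


if __name__ == "__main__":
    # "page.png" is a placeholder file name.
    print(build_tesseract_command("page.png"))
```

The output is unformatted text, so any layout or markup has to be added by the calling program.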

The core developer on the project is Ray Smith (theraysmith).

Code license: Open source, Apache License
Last updated: 27 Jan 2016

Transana is a computer program that allows researchers to transcribe and analyze large collections of video, audio, and image data.

Last updated: 10 Aug 2015

VoxcribeCC has the most accurate speaker-independent and topic-independent desktop speech recognition technology. It is used for media (audio/video) transcription and video captioning.

Watch the VoxcribeCC usage video to learn how to use VoxcribeCC in just 2 minutes.

Code license: Closed source
Last updated: 16 Jun 2015

Scripto is an engine for crowdsourcing the transcription of content that can be integrated with a custom transcription GUI and existing CMS.

Last updated: 21 May 2015

FromThePage is free software that allows volunteers to transcribe handwritten documents online. It's easy to index and annotate subjects within a text using a simple, wiki-like mark-up. Users can discuss difficult writing or obscure words within a page to refine their transcription. The resulting text is hosted on the web, making documents easy to read and search.

Code license: Open source, GNU Affero GPL
Last updated: 2 May 2015

CollateX is Java software for collating textual sources, for example, to produce a critical apparatus. As of January 2012 the project was at an early stage of development and lacked thorough documentation.

Code license: GNU GPL v3
Last updated: 25 Mar 2015

Extensive set of tools to allow collaborative transcription of manuscript pages in TEI-compliant XML.

Features of T-PEN through version 1.2 [from project blog]

Zoom Tool in Transcription User Interface: Holding CTRL+SHIFT will result in a magnified image of the current line being transcribed.

Last updated: 17 Mar 2015

EXMARaLDA (Extensible Markup Language for Discourse Annotation) is a system of concepts, data formats and tools for the computer-assisted transcription and annotation of spoken language, and for the construction and analysis of spoken language corpora.

Last updated: 29 Dec 2014

VoiceWalker has fine-grained controls for playing back audio and video to facilitate transcription.

Last updated: 29 Dec 2014

Xalan is an XSLT processor for transforming XML documents into HTML, text, or other XML document types. It implements XSL Transformations (XSLT) Version 1.0 and XML Path Language (XPath) Version 1.0.


  • Conversion between structured markup formats
  • Stylesheet validation
Code license: Apache License, Open source
Last updated: 29 Dec 2014

A software application for the playback of audio recordings. SoundScriber offers specific functionality for researchers who wish to transcribe a recording. It was originally developed for use in the Michigan Corpus of Academic Spoken English (MICASE) project and released for use by academics performing similar work.


  • Audio playback via installed audio codecs (e.g. Wav, MP3)
  • Variable speed playback
Code license: GNU GPL, Open source
Last updated: 29 Jan 2015

XSugar is a proof-of-concept tool for mapping textual content between a flat-file format and XML. It performs static analysis to establish whether transformations between the two formats are bidirectional, so that content converted into XML can be re-exported to the original flat-file structure, or vice versa. To validate the conversion, a schema must exist for both the source and destination formats, e.g. a bespoke XFlat-encoded XML document that defines the structure of a class of flat files, and an XML schema.
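To make the idea of a bidirectional mapping concrete, here is a minimal sketch of a round trip between a flat "key: value" file and XML. It is not XSugar's method: XSugar establishes reversibility by analysing the mapping itself, whereas this example only checks one sample at run time. The flat format, the element names `record` and `field`, and the function names are all assumptions made for illustration.

```python
import xml.etree.ElementTree as ET


def flat_to_xml(text):
    """Convert 'key: value' lines into a <record> of <field> elements."""
    root = ET.Element("record")
    for line in text.splitlines():
        key, value = line.split(": ", 1)
        ET.SubElement(root, "field", name=key).text = value
    return root


def xml_to_flat(root):
    """Convert a <record> element back into 'key: value' lines."""
    return "\n".join(
        f"{field.get('name')}: {field.text or ''}"
        for field in root.findall("field")
    )


original = "title: Hamlet\nauthor: Shakespeare"
as_xml = flat_to_xml(original)
print(ET.tostring(as_xml, encoding="unicode"))
# The mapping is reversible for this sample if the round trip is lossless.
print(xml_to_flat(as_xml) == original)
```

A run-time check like this only covers the inputs that are tried, which is why analysing the mapping ahead of time is the stronger guarantee.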


Code license: GNU GPL, Open source
Last updated: 29 Dec 2014

Digilib is a web-based client/server technology for images. The image content is processed on the fly by a Java Servlet on the server side so that only the visible portion of the image is sent to the web browser on the client side. It supports a wide range of image formats and viewing options on the server side while only requiring an internet browser with JavaScript and a low-bandwidth internet connection on the client side.
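The arithmetic behind sending only the visible portion can be sketched as follows. This is not Digilib's code or API: the function names, the use of relative coordinates between 0 and 1, and the example dimensions are assumptions chosen for illustration.

```python
def visible_region(image_w, image_h, rel_x, rel_y, rel_w, rel_h):
    """Map a relative viewport (fractions of the image) to a pixel box."""
    left = round(rel_x * image_w)
    top = round(rel_y * image_h)
    right = min(image_w, round((rel_x + rel_w) * image_w))
    bottom = min(image_h, round((rel_y + rel_h) * image_h))
    return left, top, right, bottom


def scaled_size(region, max_w, max_h):
    """Size of the region after shrinking it to fit the client's display."""
    left, top, right, bottom = region
    width, height = right - left, bottom - top
    scale = min(max_w / width, max_h / height, 1.0)  # never upscale
    return max(1, round(width * scale)), max(1, round(height * scale))


# A 4000x3000 scan, viewed through a window covering part of the page.
region = visible_region(4000, 3000, 0.25, 0.5, 0.5, 0.25)
print(region)
print(scaled_size(region, 800, 600))
```

Cropping and scaling on the server means the number of pixels transferred depends on the client's display size rather than on the size of the full scan.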

Code license: Open source, GNU GPL
Last updated: 29 Dec 2014

Dragon Dictation is a voice recognition application that lets you speak and instantly see your words as text, for anything from email messages to blog posts, on your iPad, iPhone, or iPod Touch.

Code license: Closed source
Last updated: 21 Feb 2017

Express Scribe is professional audio player software for PC or Mac that assists in the transcription of audio recordings.

Code license: Closed source
Last updated: 29 Dec 2014

FOLKER is an editor for the transcription of audio recorded multiparty dialogue.

Last updated: 29 Dec 2014

InqScribe is software for transcription and subtitling. You can view and transcribe audio or video side by side, insert blocks of text and time codes, and convert your transcript into a subtitled movie.

Last updated: 29 Dec 2014

MacSpeech Scribe is transcription software that allows you to create your own personal transcriptions and supports a wide variety of audio file formats.

Last updated: 29 Dec 2014

Proofread Page is an extension for MediaWiki that allows you to edit transcriptions side by side with the page images. It is used on Wikisource for manuscript and early print transcription projects. Proofread Page supports workflow, but no markup.

Last updated: 29 Dec 2014

Transcript is a desktop-based manuscript transcription tool that supports word-processor style formatting.

Last updated: 29 Dec 2014

Transcription Assistant is a tool that assists in transcription, and incorporates metadata about each image and transcription that may be used to search through an electronic library of transcriptions.

Last updated: 29 Dec 2014

ELAN is a tool for creating annotations on multiple layers on audio and video resources. The textual content of annotations is in Unicode and the transcription is stored in XML.

Last updated: 29 Dec 2014

Transcriva is transcription software for the Mac. Whether it's your meeting minutes, interviews, lectures, home movies, dictation, speeches, or your favorite TV show, Transcriva can help you transcribe them all.

Last updated: 29 Dec 2014

TEI Boilerplate is a lightweight solution for publishing styled TEI (Text Encoding Initiative) P5 content directly in modern browsers. With TEI Boilerplate, TEI XML files can be served directly to the web without server-side processing or translation to HTML.

Last updated: 29 Dec 2014

TypeWright is a tool for correcting the text-version of a document made up of page images. These text-versions are crucially necessary: they are what enables full-text searching, datamining, preserving, and curating texts of historical importance. Right now, the text running behind the page images of these texts has been mechanically typed, leaving behind errors that need to be corrected by human eyes and hands.

Last updated: 29 Dec 2014

PyBossa is a free, open-source platform for creating and running crowd-sourcing applications that utilise online assistance in performing tasks that require human cognition, knowledge or intelligence, such as image classification, transcription, geocoding and more.

Code license: GNU Affero GPL, Open source
Last updated: 29 Dec 2014

The cross-platform Advene application allows users to easily create comments and analyses of video documents through the definition of time-aligned annotations and their mobilisation into automatically generated or user-written comment views (HTML documents). Annotations can also be used to modify the rendition of the audiovisual document, thus providing virtual montage, captioning, navigation, and similar capabilities. Users can exchange their comments and analyses in the form of Advene packages, independently of the video itself.

Code license: Open source, GNU GPL v2
Last updated: 1 Dec 2016

With ediarum researchers can comfortably transcribe, encode and edit manuscripts in TEI-XML, as well as publish their results in an online or print edition. The solution, developed by TELOTA, is based on three software components: exist-db, Oxygen XML Author, and ConTeXt. These are combined, supplemented with additional functions, and tailored to fit a project's needs.

Code license: Open source, GNU GPL, GPL, GNU LGPL
Last updated: 29 Dec 2014