Transcription

A tool for converting scanned and image-based PDFs to Word, Excel, PPT, Keynote, Pages, text, etc. on Mac.
Features:
Convert PDF to Word (.doc), Excel (.xlsx), and Office
Convert PDF to Pages and Keynote
Convert PDF to image files
Convert scanned PDFs with OCR
Convert multilingual PDF files
Supports password-restricted PDFs
Extracts portions of text, images, or tables when reconstructing the document

Code license: Closed source
Last updated: 29 May 2016

OCR (optical character recognition) engine for creating editable, searchable electronic files from scanned paper documents, PDF files, and digital photographs.
Features:
Recognition of images from digital cameras and mobile phone cameras
Language recognition
Full integration with Office applications
PDF conversion, archiving, and security

Code license: Closed source
Last updated: 17 May 2016

CLAW is software for English that performs part-of-speech (POS) tagging: the classification of words into one or more categories based on their definition, their relationship to other words, and other context, also known as "word-class tagging".

Code license: Closed source
Last updated: 3 May 2016

DM is an environment for the study and annotation of images and texts. It is a suite of tools, enabling scholars to gather and organize the evidence necessary to support arguments based in digitized resources. DM enables users to mark fragments of interest in manuscripts, print materials, photographs, etc. and provide commentary on these resources and the relationships among them.

Last updated: 1 May 2016

Transana is open source software for the transcription and qualitative analysis of audio and video.

Last updated: 23 Feb 2016

Combined with Leptonica, the image processing library, Tesseract can read a wide variety of image formats and convert them to text in more than 40 languages.

This code is a bare OCR engine. It has no output formatting and no user interface. It can detect fixed-pitch and proportional text. Even so, in 1995 this engine was among the top 3 in terms of character accuracy, and it runs on both Linux and Windows. The source code is included in the open-source release.

Code license: Open source, Apache License
Last updated: 27 Jan 2016

Transana is a computer program that lets researchers transcribe and analyze large amounts of video, audio, and image data.

Last updated: 10 Aug 2015

VoxcribeCC offers the most accurate desktop speech and topic recognition technology. It is used for transcribing audiovisual media and for video subtitling.
See the following link for more information:
VoxcribeCC Usage Video (http://voxcribe.com/Video%20Speech%20Recognition%20Captioning%20Subtitli...), which shows how to use VoxcribeCC in just 2 minutes.

Code license: Closed source
Last updated: 16 Jun 2015

Scripto is an engine for crowdsourcing transcriptions of content. It can be integrated with a GUI system for customizing transcriptions and with existing content management systems.

Last updated: 21 May 2015

FromThePage is free software for transcribing handwritten documents online. It supports indexing and annotating content within the text using wiki-like markup. Users can discuss difficult handwriting or obscure words within a page to refine their transcription. The resulting texts are hosted on the web, which makes them easy to search and read.

Code license: Open source, GNU Affero GPL
Last updated: 2 May 2015

CollateX is a Java program for collating textual sources, for example to produce a critical text. As of January 2012, the project was at an early stage of development and the documentation was incomplete.

Code license: GNU GPL v3
Last updated: 25 Mar 2015

T-PEN is an extensive set of tools for the collaborative transcription of manuscript pages in TEI-compliant XML.

Features of T-PEN through version 1.2 [from project blog]

Zoom Tool in Transcription User Interface: Holding CTRL+SHIFT will result in a magnified image of the current line being transcribed.

Last updated: 17 Mar 2015

EXMARaLDA (Extensible Markup Language for Discourse Annotation) is a system of concepts, data formats and tools for the computer assisted transcription and annotation of spoken language, and for the construction and analysis of spoken language corpora.

Last updated: 29 Dec 2014

VoiceWalker has fine-grained controls for playing back audio and video to facilitate transcription.

Last updated: 29 Dec 2014

Xalan is an XSLT processor for transforming XML documents into HTML, text, or other XML document types. It implements XSL Transformations (XSLT) Version 1.0 and XML Path Language (XPath) Version 1.0.

Features:

  • Conversion between structured markup formats
  • Stylesheet validation
Code license: Apache License, Open source
Last updated: 29 Dec 2014

A software application for the playback of audio recordings. SoundScriber offers specific functionality for researchers that wish to transcribe a recording. It was originally developed for use in the Michigan Corpus of Academic Spoken English (MICASE) project and released for use by academics performing similar work.

Features:

  • Audio playback via installed audio codecs (e.g. Wav, MP3)
  • Variable speed playback
Code license: GNU GPL, Open source
Last updated: 29 Jan 2015

XSugar is a proof-of-concept tool for mapping textual content between a flat file schema and XML format. It performs static analysis to establish whether transformations between the two formats are bidirectional, enabling content that has been converted into an XML format to be re-exported to the original flat file structure, or vice versa. To validate the conversion, a schema must exist for both the source and destination formats, e.g. a bespoke XFlat-encoded XML document that defines the structure of a class of flat files, and an XML schema.
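
The idea of a reversible mapping between a flat file and XML can be illustrated with a minimal Python sketch. This is not XSugar's syntax or its analysis: XSugar establishes reversibility ahead of time from the format definitions, whereas the sketch below only checks one sample at runtime. The semicolon-separated layout, the `FIELDS` columns, and the element names `records` and `record` are all hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical flat-file layout: one record per line, fields separated by ";".
FIELDS = ("name", "email")


def flat_to_xml(flat: str) -> ET.Element:
    """Convert flat-file text into an XML tree with one <record> per line."""
    root = ET.Element("records")
    for line in flat.splitlines():
        record = ET.SubElement(root, "record")
        for field, value in zip(FIELDS, line.split(";")):
            ET.SubElement(record, field).text = value
    return root


def xml_to_flat(root: ET.Element) -> str:
    """Convert the XML tree back into the flat-file text."""
    lines = []
    for record in root.findall("record"):
        values = [record.findtext(field, default="") for field in FIELDS]
        lines.append(";".join(values))
    return "\n".join(lines)


if __name__ == "__main__":
    sample = "Ada;ada@example.org\nAlan;alan@example.org"
    # Serialize and re-parse so the check covers the XML text, not just the tree.
    xml_text = ET.tostring(flat_to_xml(sample), encoding="unicode")
    restored = xml_to_flat(ET.fromstring(xml_text))
    print(xml_text)
    print("round trip preserved the input:", restored == sample)
```

A runtime check like this only shows that one input survives the round trip; it says nothing about inputs that were not tried, which is the gap an ahead-of-time analysis is meant to close.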

Code license: GNU GPL, Open source
Last updated: 29 Dec 2014

Digilib is a web based client/server technology for images. The image content is processed on-the-fly by a Java Servlet on the server side so that only the visible portion of the image is sent to the web browser on the client side. It supports a wide range of image formats and viewing options on the server side while only requiring an internet browser with javascript and a low bandwidth internet connection on the client side.

Code license: Open source, GNU GPL
Last updated: 29 Dec 2014

Dragon Dictation is a voice recognition application that allows you to speak and instantly see your text content from email messages to blog posts on your iPad, iPhone, or iPod Touch.

Code license: Closed source
Last updated: 21 Feb 2017

Express Scribe is professional audio player software for PC or Mac that assists in the transcription of audio recordings.

Code license: Closed source
Last updated: 29 Dec 2014

FOLKER is an editor for the transcription of audio recorded multiparty dialogue.

Last updated: 29 Dec 2014

InqScribe is software for transcription and subtitling. You can view audio or video and transcribe it side by side. You can insert blocks of text and time codes, and convert your transcript into a subtitled movie.
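
As a rough illustration of how a time-coded transcript can become subtitles, the following Python sketch renders transcript segments in the widely used SubRip (SRT) text layout. It does not reflect InqScribe's own file format or export feature; the `Cue` class and both function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One transcript segment with start and end times in seconds."""
    start: float
    end: float
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rest = divmod(total_ms, 3_600_000)
    minutes, rest = divmod(rest, 60_000)
    secs, ms = divmod(rest, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Render cues as numbered SRT blocks separated by blank lines."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        blocks.append(
            f"{index}\n"
            f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}\n"
            f"{cue.text}"
        )
    return "\n\n".join(blocks) + "\n"


if __name__ == "__main__":
    transcript = [
        Cue(0.0, 2.5, "Welcome to the interview."),
        Cue(2.5, 6.0, "Thank you for having me."),
    ]
    print(to_srt(transcript))
```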

Last updated: 29 Dec 2014

MacSpeech Scribe is a transcription software that allows you to create your own personal transcriptions, and supports a wide variety of audio file formats.

Last updated: 29 Dec 2014

Proofread Page is an extension for MediaWiki which allows you to edit transcriptions side by side with the page images. It is used on WikiSource for manuscript and early print transcription projects. Proofread Page supports workflow, but no markup.

Last updated: 29 Dec 2014

Transcript is a desktop-based manuscript transcription tool that supports word-processor style formatting.

Last updated: 29 Dec 2014

Transcription Assistant is a tool that assists in transcription, and incorporates metadata about each image and transcription that may be used to search through an electronic library of transcriptions.

Last updated: 29 Dec 2014

ELAN is a tool for creating annotations on multiple layers on audio and video resources. The textual content of annotations is in Unicode and the transcription is stored in XML.
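
To make the idea of layered, time-aligned annotations stored in XML concrete, here is a minimal Python sketch. It is not ELAN's actual file schema: the element names `annotation_document`, `tier`, and `annotation`, the millisecond attributes, and both function names are hypothetical.

```python
import xml.etree.ElementTree as ET


def build_document(tiers: dict[str, list[tuple[int, int, str]]]) -> ET.Element:
    """Build an XML tree with one <tier> per layer of (start_ms, end_ms, text)."""
    root = ET.Element("annotation_document")
    for tier_id, annotations in tiers.items():
        tier = ET.SubElement(root, "tier", id=tier_id)
        for start_ms, end_ms, text in annotations:
            node = ET.SubElement(
                tier, "annotation", start=str(start_ms), end=str(end_ms)
            )
            node.text = text  # Unicode text is stored as-is
    return root


def annotations_at(root: ET.Element, time_ms: int) -> list[tuple[str, str]]:
    """Return (tier id, text) for every annotation covering the given time."""
    hits = []
    for tier in root.findall("tier"):
        for node in tier.findall("annotation"):
            if int(node.get("start")) <= time_ms < int(node.get("end")):
                hits.append((tier.get("id"), node.text))
    return hits


if __name__ == "__main__":
    doc = build_document({
        "speech": [(0, 1500, "Grüß Gott"), (1500, 3000, "Wie geht's?")],
        "gesture": [(1000, 2000, "nods")],
    })
    print(ET.tostring(doc, encoding="unicode"))
    print(annotations_at(doc, 1200))
```

Keeping each layer in its own tier lets overlapping annotations, such as speech and gesture at the same moment, coexist without interfering with one another.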

Last updated: 29 Dec 2014

Transcriva is transcription software for the Mac. Whether it's your meeting minutes, interviews, lectures, home movies, dictation, speeches, or your favorite TV show, Transcriva can help you transcribe them all.

Last updated: 29 Dec 2014

TEI Boilerplate is a lightweight solution for publishing styled TEI (Text Encoding Initiative) P5 content directly in modern browsers. With TEI Boilerplate, TEI XML files can be served directly to the web without server-side processing or translation to HTML.

Last updated: 29 Dec 2014

TypeWright is a tool for correcting the text-version of a document made up of page images. These text-versions are crucially necessary: they are what enables full-text searching, datamining, preserving, and curating texts of historical importance. Right now, the text running behind the page images of these texts has been mechanically typed, leaving behind errors that need to be corrected by human eyes and hands.

Last updated: 29 Dec 2014

PyBossa is a free, open-source platform for creating and running crowd-sourcing applications that utilise online assistance in performing tasks that require human cognition, knowledge or intelligence, such as image classification, transcription, geocoding and more.

Code license: GNU Affero GPL, Open source
Last updated: 29 Dec 2014

F4

F4 eases the transcription of audio or video recordings and can save you about 30% of your time. You can adjust the playback speed to match your transcription speed, and a foot pedal can be used to control playback. You can also automatically insert time marks, speaker changes, or text modules.

Last updated: 29 Dec 2014

The cross-platform Advene application allows users to easily create comments and analyses of video documents, through the definition of time-aligned annotations and their mobilisation into automatically-generated or user-written comment views (HTML documents). Annotations can also be used to modify the rendition of the audiovisual document, thus providing virtual montage, captioning, and navigation capabilities. Users can exchange their comments/analyses in the form of Advene packages, independently from the video itself.

Code license: Open source, GNU GPL v2
Last updated: 1 Dec 2016

With ediarum, researchers can comfortably transcribe, encode and edit manuscripts in TEI-XML, as well as publish their results in an online or print edition. The solution, developed by TELOTA, is based on three software components: exist-db, Oxygen XML Author, and ConTeXt. These are combined, supplemented with additional functions, and tailored to fit a project's needs.

Code license: Open source, GNU GPL, GPL, GNU LGPL
Last updated: 29 Dec 2014