“CorpusLem” una herramienta para la conversión de corpus textuales en datos

Author: Gotzon Aurrekoetxea

Affiliation: Universidad del País Vasco/Euskal Herriko Unibertsitatea, Lejona, Spain (ROR https://ror.org/000xsnr85)

Book:
Las tecnologías de la información y las comunicaciones: presente y futuro en el análisis de corpus: Actas del III Congreso Internacional de Lingüística de Corpus
  1. María Luisa Carrió Pastor (ed. lit.)
  2. Miguel Ángel Candel Mora (ed. lit.)

Publisher: Universidad Politécnica de Valencia = Universitat Politècnica de València

ISBN: 978-84-694-6225-6

Year of publication: 2011

Pages: 611-618

Congress: Congreso Internacional de Lingüística de Corpus (3. 2011. Valencia)

Type: Conference paper

Abstract

“CorpusLem” is a web tool that converts textual information into data organised in a database. Its interface is available in five languages (English, French, Spanish, Basque and Catalan). The tool converts text documents (.doc, .odt and .txt) into MySQL format and provides an alphabetical index of all the words they contain. In addition, “CorpusLem” suggests a lemma for each variant and displays the context of each word. Users can correct the index either within the tool or on their own computer, by downloading the relevant information and later uploading the corrected index. The tool is designed to host multiple projects, each with more than one user. It can be used with documents written in standard or non-standard varieties, whether in standardised spelling or in the original spelling of the texts.
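
The indexing step described in the abstract (an alphabetical index of word forms, each shown in context and paired with a suggested lemma) can be illustrated with a minimal Python sketch. This is not CorpusLem's implementation, which the abstract does not describe: the names `build_index` and `suggest_lemma`, the context window of three words, and the regex tokenizer are all assumptions made for illustration. The lemma suggestion in particular is a placeholder that returns the form unchanged, standing in for whatever method the tool actually uses.

```python
import re
from collections import defaultdict

# Assumption: a "word" is any run of Unicode word characters.
WORD_RE = re.compile(r"\w+")


def build_index(text, window=3):
    """Map each lowercased word form to its contexts, keys sorted alphabetically.

    Each context is the occurrence in square brackets with up to
    `window` words on either side (a simple keyword-in-context line).
    """
    tokens = WORD_RE.findall(text)
    contexts = defaultdict(list)
    for i, token in enumerate(tokens):
        left = tokens[max(0, i - window):i]
        right = tokens[i + 1:i + 1 + window]
        line = " ".join(left + [f"[{token}]"] + right)
        contexts[token.lower()].append(line)
    # Plain code-point ordering; see the note below on collation.
    return dict(sorted(contexts.items()))


def suggest_lemma(form):
    """Hypothetical placeholder: propose the form itself as its lemma.

    A real tool would consult a lexicon or morphological rules, and the
    user would then correct the suggestion by hand.
    """
    return form


index = build_index("The dogs bark. The dog barks at the other dogs.")
for form, lines in index.items():
    print(form, suggest_lemma(form), lines, sep="\t")
```

The sort here uses Python's default code-point ordering. For corpora in Spanish, Basque or Catalan, a locale-aware sort key would probably be needed so that accented letters land in their expected alphabetical position.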