Venetan to English machine translation: issues and possible solutions

JABER, Suhel; DELMONTE, Rodolfo
2011-01-01

Abstract

This paper describes a prototype Venetan-to-English translation system developed under the STILVEN project, financed by the Regional Authorities of the Veneto Region in Italy. The general approach is statistical, with preprocessing applied at both training and translation time: orthographic normalization and POS tagging, the latter to make use of factored models. These steps are needed chiefly to overcome two problems: the scarcity of Venetan resources (our Venetan-English corpus comprises only 13,000 sentences, amounting to 128,000 Venetan tokens excluding punctuation) and the diasystemic nature of Venetan, which is an ensemble of varieties rather than a single dialect. We present in detail the problems specific to Venetan, our proposed solutions, their implementation, and the results obtained so far.
2011
Human-Machine Interaction in Translation: Proceedings of the 8th International NLPCS Workshop

Documents in ARCA are protected by copyright and all rights are reserved, unless otherwise indicated.

Use this identifier to cite or link to this document: https://hdl.handle.net/10278/28156