Period
2018 – 2020
collective, international
Innovation and Networks Executive Agency, Connecting Europe Facility
Dan Tufis, Verginica Barbu Mititelu, Maria Carp, Elena Irimia, Radu Ion, Vasile Pais, Eric Curea
The overall objective of the MARCELL project is to build a sustainable infrastructure for the retrieval and semantic processing of documents from the national legislation (laws, decrees, regulations, etc.) of Bulgaria, Poland, Romania, Slovakia, Slovenia, Hungary and Croatia, in order to support the training of machine translation systems. The RACAI/ICIA team created a corpus of more than 140,000 documents, richly annotated (POS tags, lemmas, dependency parses, named entities, IATE and EuroVoc mark-up) and accompanied by appropriate metadata. The results will be provided to train the Automated Translation Platform of the Connecting Europe Facility (CEF.AT). Machine translation quality depends on training the translation systems on a large number of (translated) documents in a given thematic area, and the importance of machine translation is growing as the economic, political and cultural links between European countries deepen.
For a description of the Romanian sub-corpus and how it was processed, see:
@inproceedings{04.11.2019_Tufis_01,
  title     = {Automatic Identification and Classification of Legal Terms in {Romanian} Law Texts},
  author    = {Coman, Andrei and Mitrofan, Maria and Tufiș, Dan},
  booktitle = {International Conference on Linguistic Resources and Tools for Natural Language Processing},
  address   = {Iași},
  year      = {2019}
}
@inproceedings{04.11.2019_Tufis_02,
  title     = {Integration of {Romanian} {NLP} tools into the {RELATE} platform},
  author    = {Paiș, Vasile and Tufiș, Dan and Ion, Radu},
  booktitle = {International Conference on Linguistic Resources and Tools for Natural Language Processing},
  address   = {Iași},
  year      = {2019}
}