The HIT-SCIR System for End-to-End Parsing of Universal Dependencies

Authors: Wanxiang Che, Jiang Guo, Yuxuan Wang, Bo Zheng, Huaipeng Zhao, Yang Liu, Dechuan Teng, Ting Liu
Published in: Proceedings of the CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies: August 3-4, 2017, Vancouver, Canada / ed. by Jan Hajic, 2017, ISBN 978-1-945626-70-8, pp. 52-62
Language: English
Abstract
- This paper describes our system (HIT-SCIR) for the CoNLL 2017 shared task: Multilingual Parsing from Raw Text to Universal Dependencies. Our system includes three pipelined components: tokenization, Part-of-Speech (POS) tagging and dependency parsing. We use character-based bidirectional long short-term memory (LSTM) networks for both tokenization and POS tagging. Afterwards, we employ a list-based transition-based algorithm for general non-projective parsing and present an improved Stack-LSTM-based architecture for representing each transition state and making predictions. Furthermore, to parse low/zero-resource languages and cross-domain data, we use a model transfer approach to make effective use of existing resources. We demonstrate substantial gains against the UDPipe baseline, with an average improvement of 3.76% in LAS across all languages. Finally, our system ranks 4th on the official test sets.
