Dependency parsing as head selection
Conventional graph-based dependency parsers guarantee a tree structure during both training and inference. Instead, we formalize dependency parsing as the problem of independently selecting the head of each word in a sentence. Our model, which we call \textsc{DeNSe} (shorthand for {\bf De}pendency {\bf N}eural {\bf Se}lection), produces a distribution over possible heads for each word using features obtained from a bidirectional recurrent neural network. Although no structural constraints are enforced during training, \textsc{DeNSe} generates trees at inference time for the overwhelming majority of sentences, and non-tree outputs can be adjusted with a maximum spanning tree algorithm. We evaluate \textsc{DeNSe} on four languages (English, Chinese, Czech, and German) with varying degrees of non-projectivity. Despite the simplicity of the approach, our parsers are on par with the state of the art.
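The head-selection idea can be illustrated with a minimal sketch. It is not the authors' implementation: the score matrix is assumed to be given (in the model it would come from the bidirectional recurrent network), and the function names, the exclusion of self-attachment, and the optional single-root check are illustrative assumptions. The maximum spanning tree repair step is not shown.

```python
import math


def head_distributions(scores):
    """Turn raw scores into one distribution over heads per word.

    scores[i][j] is a hypothetical unnormalized score that token j is the
    head of word i + 1, where j = 0 stands for an artificial ROOT token.
    """
    dists = []
    for row in scores:
        m = max(row)  # subtract the max for numerical stability
        exps = [math.exp(s - m) for s in row]
        z = sum(exps)
        dists.append([e / z for e in exps])
    return dists


def select_heads(scores):
    """Pick the most probable head for each word independently."""
    heads = []
    for i, row in enumerate(head_distributions(scores)):
        word = i + 1
        # Assumption: a word is never allowed to be its own head.
        candidates = [j for j in range(len(row)) if j != word]
        heads.append(max(candidates, key=lambda j: row[j]))
    return heads


def is_tree(heads, single_root=True):
    """Check that following heads from every word reaches ROOT without a cycle.

    heads[i] is the head of word i + 1 (0 = ROOT). The single_root
    requirement is an assumption made for this sketch.
    """
    if single_root and heads.count(0) != 1:
        return False
    for word in range(1, len(heads) + 1):
        seen = set()
        node = word
        while node != 0:
            if node in seen:  # revisiting a node means a cycle
                return False
            seen.add(node)
            node = heads[node - 1]
    return True


# Toy scores for a three-word sentence such as "She reads books".
toy_scores = [
    [0.1, 0.0, 2.0, 0.3],  # word 1: highest score for head 2
    [3.0, 0.2, 0.0, 0.1],  # word 2: highest score for ROOT
    [0.0, 0.1, 2.5, 0.0],  # word 3: highest score for head 2
]
toy_heads = select_heads(toy_scores)
toy_is_tree = is_tree(toy_heads)
```

Because each head is chosen independently, nothing in `select_heads` prevents cycles or a missing root; a check like `is_tree` identifies the outputs that would need to be passed to a maximum spanning tree algorithm.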
arxiv.org