TY - CONF
T1 - Enhancing unlexicalized parsing performance using a wide coverage lexicon, fuzzy tag-set mapping, and EM-HMM-based lexical probabilities
AU - Goldberg, Yoav
AU - Tsarfaty, Reut
AU - Adler, Meni
AU - Elhadad, Michael
PY - 2009
Y1 - 2009
N2 - We present a framework for interfacing a PCFG parser with lexical information from an external resource following a different tagging scheme than the treebank. This is achieved by defining a stochastic mapping layer between the two resources. Lexical probabilities for rare events are estimated in a semi-supervised manner from a lexicon and large unannotated corpora. We show that this solution greatly enhances the performance of an unlexicalized Hebrew PCFG parser, resulting in state-of-the-art Hebrew parsing results both when a segmentation oracle is assumed, and in a real-world parsing scenario of parsing unsegmented tokens.
AB - We present a framework for interfacing a PCFG parser with lexical information from an external resource following a different tagging scheme than the treebank. This is achieved by defining a stochastic mapping layer between the two resources. Lexical probabilities for rare events are estimated in a semi-supervised manner from a lexicon and large unannotated corpora. We show that this solution greatly enhances the performance of an unlexicalized Hebrew PCFG parser, resulting in state-of-the-art Hebrew parsing results both when a segmentation oracle is assumed, and in a real-world parsing scenario of parsing unsegmented tokens.
UR - http://www.scopus.com/inward/record.url?scp=84874625722&partnerID=8YFLogxK
U2 - 10.3115/1609067.1609103
DO - 10.3115/1609067.1609103
M3 - Conference contribution
AN - SCOPUS:84874625722
SN - 9781932432169
T3 - EACL 2009 - 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings
SP - 327
EP - 335
BT - EACL 2009 - 12th Conference of the European Chapter of the Association for Computational Linguistics, Proceedings
PB - Association for Computational Linguistics (ACL)
T2 - 12th Conference of the European Chapter of the Association for Computational Linguistics, EACL 2009
Y2 - 30 March 2009 through 3 April 2009
ER -