VinAI at ChEMU 2020: An accurate system for named entity recognition in chemical reactions from patents

Abstract

This paper describes our VinAI system for ChEMU Task 1 on named entity recognition (NER) in chemical reactions. Our system employs a BiLSTM-CNN-CRF architecture with additional contextualized word embeddings. It achieves very high performance, officially ranking second with regard to both exact- and relaxed-match F1 scores, at 94.33% and 96.84%, respectively. In a post-evaluation phase, fixing a bug in the mapping that converts the column-based format into the brat standoff format yields higher results: an exact-match F1 score of 95.21% and a relaxed-match F1 score of 97.26%, the highest relaxed-match F1 among all participating systems. We believe our system can serve as a strong baseline for future research and downstream applications of chemical NER over chemical reactions from patents.
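
Since the abstract attributes part of the score gain to the mapping from the column-based format to the brat standoff format, a minimal sketch of such a mapping may help illustrate what is involved. This is not the authors' converter: the function name `bio_to_brat`, the token tuple layout, the BIO tagging scheme, and the entity labels in the example are all assumptions made for illustration.

```python
from typing import List, Tuple

# Assumed token layout: (surface form, start offset, end offset, BIO tag).
Token = Tuple[str, int, int, str]


def bio_to_brat(text: str, tokens: List[Token]) -> List[str]:
    """Merge BIO-tagged tokens into entity spans and emit brat 'T' lines."""
    spans = []      # each span is [label, start, end]
    current = None  # the span currently being extended, if any
    for _, start, end, tag in tokens:
        if tag.startswith("B-"):
            current = [tag[2:], start, end]
            spans.append(current)
        elif tag.startswith("I-") and current is not None and current[0] == tag[2:]:
            current[2] = end  # extend the open span to cover this token
        else:
            current = None    # "O" or an inconsistent I- tag closes the span
    # brat text-bound annotation: ID <TAB> LABEL START END <TAB> covered text
    return [
        f"T{i}\t{label} {start} {end}\t{text[start:end]}"
        for i, (label, start, end) in enumerate(spans, 1)
    ]


# Illustrative input; the labels are placeholders, not taken from the paper.
text = "sodium hydroxide in methanol"
tokens = [
    ("sodium", 0, 6, "B-REAGENT_CATALYST"),
    ("hydroxide", 7, 16, "I-REAGENT_CATALYST"),
    ("in", 17, 19, "O"),
    ("methanol", 20, 28, "B-SOLVENT"),
]
brat_lines = bio_to_brat(text, tokens)
```

Taking the covered text from the original string via character offsets, rather than rejoining tokens, is one way to keep the offsets and the surface text consistent; an offset mismatch in this step would plausibly affect exact-match scores more than relaxed-match ones.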

Publication
In The Working Notes of Conference and Labs of the Evaluation Forum 2020
Mai Hoang Dao
NLP Research Intern

My research interests include spoken language understanding in low-resource languages and multilingual NLP.
