Host: The Japanese Society for Artificial Intelligence
Name: The 37th Annual Conference of the Japanese Society for Artificial Intelligence
Number: 37
Location: [in Japanese]
Date: June 06, 2023 - June 09, 2023
Recognizing financial terms in text is essential for key information retrieval and content understanding. Financial terms typically consist of several tokens rather than a single one. Moreover, a proper name within a term can take diverse forms, such as abbreviations and morphological inflections, which lowers recall. In this paper, we propose a mechanism that uses a transformer-based language model, XLM-RoBERTa, to train a neural classifier to distinguish terms from plain text by learning from sequential tags on the target tokens. Initially, the target tokens are drawn from a list of terms. To cover the diverse expressions, we generate different morphological variants of the terms and use them to extend the set of target tokens. The experimental results show that this mechanism yields a convincing improvement in identifying financial terms in plain text.
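As a rough illustration of the idea described in the abstract, the following sketch expands a term list with surface variants and then assigns sequential tags to the tokens of a sentence. The abstract does not specify the variant rules or the tag scheme, so everything here is an assumption made for illustration: the naive plural and initial-letter acronym rules, the BIO tag names, the whitespace tokenization, and the function names `generate_variants` and `bio_tag`. Training the XLM-RoBERTa token classifier on the resulting tags is not shown.

```python
from typing import List, Set


def generate_variants(term: str) -> Set[str]:
    """Return the term plus a few hypothetical surface variants."""
    words = term.split()
    if not words:
        return set()
    variants = {term}
    # Naive plural of the final word (an illustrative inflection rule only).
    last = words[-1]
    plural = last[:-1] + "ies" if last.endswith("y") else last + "s"
    variants.add(" ".join(words[:-1] + [plural]))
    # Initial-letter acronym, only for multi-word terms.
    if len(words) > 1:
        variants.add("".join(w[0].upper() for w in words))
    return variants


def bio_tag(tokens: List[str], terms: Set[str]) -> List[str]:
    """Tag tokens with B-TERM / I-TERM / O by case-insensitive matching."""
    tags = ["O"] * len(tokens)
    lowered = [t.lower() for t in tokens]
    # Match longer terms first so that the longest span wins.
    patterns = sorted((t.lower().split() for t in terms), key=len, reverse=True)
    for pat in patterns:
        n = len(pat)
        for i in range(len(tokens) - n + 1):
            span_free = all(tag == "O" for tag in tags[i:i + n])
            if lowered[i:i + n] == pat and span_free:
                tags[i] = "B-TERM"
                for j in range(i + 1, i + n):
                    tags[j] = "I-TERM"
    return tags


# Assumed example term list and sentence (not taken from the paper).
base_terms = {"exchange traded fund"}
extended_terms = set()
for term in base_terms:
    extended_terms |= generate_variants(term)

tokens = "The ETF tracks exchange traded funds".split()
tags_base = bio_tag(tokens, base_terms)
tags_extended = bio_tag(tokens, extended_terms)
```

In this toy case the base list matches nothing in the sentence, while the extended list tags both the acronym and the plural form, which is the kind of recall gain the variant generation is meant to provide. Tags of this form could then serve as token-level labels for a sequence classifier.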