11/25/2022, West Palm Beach – Nelson Correa, Founder & CEO, and Antonio Correa, VP Business Development at Andinum, presented the article “Neural Text Classification for Digital Transformation in the Financial Regulatory Domain” at the IEEE ANDESCON 2022 conference, November 15-19, 2022, in Barranquilla, Colombia.

Text classification is a core use case in artificial intelligence and natural language processing (NLP). The article presents traditional, deep learning and transformer models for classification of consumer complaints in the U.S. Consumer Financial Protection Bureau (CFPB) Consumer Complaints Database, a corpus of over 2,600,000 consumer financial complaints.
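To illustrate the kind of task described above, here is a minimal sketch of a traditional text classifier: a multinomial naive Bayes model with add-one smoothing, written with the Python standard library only. The choice of naive Bayes is an assumption for illustration, since this announcement does not name the specific traditional models evaluated in the article. The example complaints, labels and function names are hypothetical and are not taken from the CFPB database or from the article's repository.

```python
import math
from collections import Counter, defaultdict

# Hypothetical toy data: invented examples, NOT taken from the CFPB database.
TRAIN = [
    ("my credit card was charged a late fee twice", "Credit card"),
    ("the card issuer raised my credit card interest rate", "Credit card"),
    ("my mortgage servicer lost my escrow payment", "Mortgage"),
    ("the lender delayed my mortgage loan modification", "Mortgage"),
    ("a debt collector keeps calling about a debt i do not owe", "Debt collection"),
    ("the collection agency refused to validate the debt", "Debt collection"),
]
TEST = [
    ("the debt collector called me again", "Debt collection"),
    ("my mortgage payment was lost", "Mortgage"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def train_naive_bayes(examples):
    """Count documents per label and word occurrences per label."""
    label_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for text, label in examples:
        label_counts[label] += 1
        for tok in tokenize(text):
            word_counts[label][tok] += 1
            vocab.add(tok)
    return label_counts, word_counts, vocab


def predict(model, text):
    """Return the label with the highest log posterior score."""
    label_counts, word_counts, vocab = model
    total_docs = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label, n_docs in label_counts.items():
        score = math.log(n_docs / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for tok in tokenize(text):
            if tok in vocab:  # words never seen in training are ignored
                score += math.log((word_counts[label][tok] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


def accuracy(model, examples):
    """Fraction of examples whose predicted label matches the true label."""
    correct = sum(predict(model, text) == label for text, label in examples)
    return correct / len(examples)


model = train_naive_bayes(TRAIN)
print(predict(model, "the debt collector called me again"))
print(accuracy(model, TEST))
```

The `accuracy` function computes the same kind of metric as the accuracy figure reported for the article's models, although a toy model like this one says nothing about performance on the real dataset.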

Models based on a large language model (LLM) transformer architecture reach a top accuracy of 88% when classifying complaints by “financial product” on the CFPB dataset. We are currently evaluating this result against human levels of performance on the task.

“Automatic document classification is a core Andinum technology that enables the efficient, transparent and reliable handling of billions of documents, for applications such as document routing, regulatory reporting and business intelligence,” said Nelson Correa.

“With this result, using the transformer encoder for input text analysis, we are excited to continue with the development of novel generative applications with large language models, using the decoder portion of the transformer (e.g., open source equivalents of GPT-3 by OpenAI),” he added.

Conference site, slides, paper and a repository are available on GitHub.


Update 12/05/2022: Article and GitHub repository