During my talk, I shared the journey of building a system that maps unstructured company descriptions to official industry codes. The task is challenging because it is a multiclass classification problem with 1800 possible categories.

In the presentation, I described how our solutions have evolved over the years:
– From a random forest classifier
– To zero-shot classification plus a classification-dedicated Large Language Model (LLM)
– And finally, to LLMs combined with Retrieval Augmented Generation (RAG)
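
To make the last step more concrete, here is a minimal sketch of the retrieval idea behind an LLM + RAG classifier: first narrow the full code list down to a few likely candidates, then ask the LLM to choose among them. This is not the system from the talk. The codes, the descriptions, and the helper names (`retrieve_candidates`, `build_prompt`) are made up for illustration, a simple bag-of-words cosine similarity stands in for whatever embedding search a real system would use, and the LLM call itself is left out.

```python
import math
import re
from collections import Counter

# Toy catalogue: these codes and descriptions are invented for illustration,
# not taken from any official classification.
INDUSTRY_CODES = {
    "A01": "growing of cereals and other crops",
    "C10": "manufacture of food products and bakery goods",
    "J62": "computer programming and software consultancy",
    "H49": "freight transport by road and logistics",
}


def tokenize(text):
    """Lowercase the text and split it into alphabetic tokens."""
    return re.findall(r"[a-z]+", text.lower())


def cosine(a, b):
    """Cosine similarity between two bag-of-words Counters."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve_candidates(description, k=2):
    """Return the k codes whose descriptions are most similar to the input."""
    query = Counter(tokenize(description))
    scored = [
        (cosine(query, Counter(tokenize(text))), code)
        for code, text in INDUSTRY_CODES.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [code for _, code in scored[:k]]


def build_prompt(description, candidates):
    """Assemble a prompt that asks an LLM to pick one of the retrieved codes."""
    options = "\n".join(f"{code}: {INDUSTRY_CODES[code]}" for code in candidates)
    return (
        f"Company description: {description}\n"
        f"Candidate codes:\n{options}\n"
        "Answer with the single best code."
    )


description = "We build custom software and offer programming consultancy"
candidates = retrieve_candidates(description, k=2)
prompt = build_prompt(description, candidates)
```

The point of the retrieval step is that the LLM only has to compare a handful of candidates instead of all 1800 categories, which keeps the prompt short and the choice tractable.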