The document describes a method for classifying companies into industries by embedding their business descriptions. It explores several embedding models, such as BERT, Sentence Transformer, and XLM, for representing each description as a vector. It then applies UMAP for dimensionality reduction and hierarchical clustering to group the companies into 12 level 1 and 38 level 2 industries, and identifies representative keywords for each cluster. The results could help address limitations of existing industry classification frameworks.
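
The overall flow summarized above (embed each description, cluster hierarchically, cut the hierarchy at two levels, extract keywords per cluster) can be illustrated with a minimal, standard-library-only sketch. It is not the document's implementation: plain bag-of-words count vectors stand in for the BERT, Sentence Transformer, or XLM embeddings, the UMAP step is skipped entirely, and a hand-rolled average-linkage clustering with cosine distance is assumed because the linkage used is not stated. The company names, descriptions, stopword list, helper functions, and the cluster counts of 2 and 4 are all hypothetical and chosen only to fit a toy dataset.

```python
import math
from collections import Counter

# Hypothetical toy data; the real business descriptions are not shown in the source.
DESCRIPTIONS = {
    "AlphaBank": "retail bank offering loans deposits and credit cards",
    "BetaInsure": "insurer offering life and property insurance policies",
    "GammaSoft": "software firm building cloud software for developers",
    "DeltaChips": "chip designer building processors for cloud servers",
    "EpsilonPharma": "pharmaceutical firm developing drugs and vaccines",
    "ZetaBio": "biotech firm developing gene therapy drugs",
}
STOPWORDS = {"and", "for", "a", "the", "of"}  # assumed, deliberately tiny


def tokenize(text):
    return [w for w in text.lower().split() if w not in STOPWORDS]


def embed(tokens, vocab):
    # Stand-in for a transformer embedding: a bag-of-words count vector.
    counts = Counter(tokens)
    return [counts[w] for w in vocab]


def cosine_distance(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm if norm else 1.0


def agglomerate(vectors):
    """Average-linkage clustering; returns {k: partition with k clusters}."""
    clusters = [[i] for i in range(len(vectors))]
    history = {len(clusters): [list(c) for c in clusters]}
    while len(clusters) > 1:
        best = None
        for i in range(len(clusters)):
            for j in range(i + 1, len(clusters)):
                pairs = [(p, q) for p in clusters[i] for q in clusters[j]]
                dist = sum(
                    cosine_distance(vectors[p], vectors[q]) for p, q in pairs
                ) / len(pairs)
                if best is None or dist < best[0]:
                    best = (dist, i, j)
        _, i, j = best
        clusters[i] = clusters[i] + clusters[j]  # merge the closest pair
        del clusters[j]  # j > i, so index i is unaffected
        history[len(clusters)] = [list(c) for c in clusters]
    return history


def labels_from(partition, n):
    labels = [0] * n
    for cluster_id, members in enumerate(partition):
        for m in members:
            labels[m] = cluster_id
    return labels


def top_keywords(members, tokens, top=3):
    # Most frequent tokens across the cluster's descriptions.
    counts = Counter(w for m in members for w in tokens[m])
    return [w for w, _ in counts.most_common(top)]


names = list(DESCRIPTIONS)
tokens = [tokenize(DESCRIPTIONS[n]) for n in names]
vocab = sorted({w for t in tokens for w in t})
vectors = [embed(t, vocab) for t in tokens]

history = agglomerate(vectors)
N_LEVEL1, N_LEVEL2 = 2, 4  # toy values; the source reports 12 and 38
level1 = labels_from(history[N_LEVEL1], len(names))
level2 = labels_from(history[N_LEVEL2], len(names))

for cluster_id, members in enumerate(history[N_LEVEL2]):
    print(cluster_id, [names[m] for m in members], top_keywords(members, tokens))
```

Because both levels are read off the same merge history, every level 2 cluster falls inside exactly one level 1 cluster, which is the property that makes a two-level industry taxonomy coherent. With real data, the count vectors would be replaced by model embeddings and a dimensionality reduction step would precede the clustering.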