



This document discusses methods for handling categorical data in machine learning models. It describes categorical variables as variables whose values fall into a fixed set of categories, such as phone brands. It recommends one of three approaches: dropping categorical variables that carry no useful information, label encoding, which assigns an integer to each category, or one-hot encoding, which replaces the variable with one binary column per category. Label encoding works well for decision trees, while one-hot encoding keeps the categories distinct without implying an order among them.
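The three approaches can be illustrated with a minimal plain-Python sketch. The toy rows, the `brand` and `price` field names, and all variable names are assumptions made for illustration and do not come from the presentation. In practice a library such as pandas or scikit-learn would normally be used instead.

```python
# Hypothetical toy data: field names and values are illustrative only.
rows = [
    {"brand": "Apple", "price": 900},
    {"brand": "Samsung", "price": 700},
    {"brand": "Nokia", "price": 300},
    {"brand": "Apple", "price": 1000},
]

# Approach 1: drop the categorical variable entirely.
dropped = [{k: v for k, v in row.items() if k != "brand"} for row in rows]

# Approach 2: label encoding, mapping each distinct category to an integer.
# Sorting makes the mapping deterministic; the integer order itself is arbitrary.
categories = sorted({row["brand"] for row in rows})
label_map = {cat: idx for idx, cat in enumerate(categories)}
label_encoded = [label_map[row["brand"]] for row in rows]

# Approach 3: one-hot encoding, producing one binary column per category.
one_hot = [
    [1 if row["brand"] == cat else 0 for cat in categories]
    for row in rows
]

print(dropped)
print(label_map)
print(label_encoded)
print(one_hot)
```

In this sketch the label-encoded integers suggest an ordering (for example, that one brand is "greater" than another) that does not exist in the data, whereas each one-hot row marks only which category is present.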



