Label and Features:
Instance / Sample: A single data point or observation.
Label: The target or output variable you want to predict.
Features: The input variables or attributes that describe each instance and that the model uses to make predictions. For text data, feature representation aims to represent unstructured documents numerically so that they become mathematically computable.
Types: Frequency representation and semantic representation.
Frequency representation: Treats words or phrases as symbols and records how often each one occurs in a document as a vector.
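
A minimal sketch of a frequency representation, using only the Python standard library. The two toy documents, their labels, and the helper names `build_vocabulary` and `count_vector` are made up for illustration, and tokenization is assumed to be plain lowercasing plus whitespace splitting. Each document is an instance, each count vector holds its features, and `labels` holds the targets.

```python
from collections import Counter


def build_vocabulary(documents):
    """Map each distinct token to a column index (sorted so the order is deterministic)."""
    tokens = {tok for doc in documents for tok in doc.lower().split()}
    return {tok: i for i, tok in enumerate(sorted(tokens))}


def count_vector(document, vocabulary):
    """Return how often each vocabulary token occurs in one document."""
    counts = Counter(document.lower().split())
    # dicts preserve insertion order, so the columns follow the vocabulary indices
    return [counts.get(tok, 0) for tok in vocabulary]


# Hypothetical toy data: two instances and their labels
documents = ["the cat sat", "the dog sat on the mat"]
labels = ["cat", "dog"]

vocabulary = build_vocabulary(documents)
features = [count_vector(doc, vocabulary) for doc in documents]

print(list(vocabulary))  # ['cat', 'dog', 'mat', 'on', 'sat', 'the']
print(features)          # [[1, 0, 0, 0, 1, 1], [0, 1, 1, 1, 1, 2]]
```

Every document maps to a vector of the same length (one entry per vocabulary token), which is what makes the text mathematically computable. Word order is discarded in this sketch; only occurrence counts are kept.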