Text mining refers to extracting knowledge from unstructured text data. It is needed because most biological knowledge exists in unstructured research papers, making it difficult for scientists to manually analyze large amounts of papers. Text mining can help address this by automatically analyzing papers and identifying relevant information. However, text mining also faces challenges like dealing with unstructured text, word ambiguities, and noisy data. The basic steps of text mining involve preprocessing text through tasks like tokenization, feature selection to identify important terms, and parsing to separate words and punctuation.