This project aims to build a binary classifier that labels unlabeled DNA sequences as either positive (p) or negative (n), trained on a set of labeled sequences. The team will take two approaches: 1) A k-mer approach that enumerates all DNA subsequences of length k and uses their frequencies in each sequence as attributes for classification models. 2) A PWM approach that uses motif-finding tools to generate position weight matrices (PWMs) and scores each sequence against them, using the scores as attributes. The two approaches will be evaluated individually and in combination to find the best-performing model. The key challenge is deriving meaningful attributes from the sequence data alone. Parameters such as the k-mer length, the number of motifs, and the motif lengths will be tuned to optimize model performance.
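
The two attribute types described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the project's implementation: the function names (`kmer_features`, `pwm_from_sites`, `best_pwm_score`, `featurize`) are hypothetical, the PWM is built from a handful of aligned example sites rather than produced by a motif-finding tool, a uniform background of 0.25 per base is assumed, and the maximum window score is just one plausible way to reduce a PWM scan to a single attribute.

```python
from itertools import product
from math import log2

ALPHABET = "ACGT"


def kmer_features(seq, k):
    """Count every possible k-mer in seq; returns 4**k counts in a fixed order."""
    kmers = ["".join(p) for p in product(ALPHABET, repeat=k)]
    counts = dict.fromkeys(kmers, 0)
    for i in range(len(seq) - k + 1):
        window = seq[i:i + k]
        if window in counts:  # windows with ambiguous bases (e.g. N) are skipped
            counts[window] += 1
    return [counts[m] for m in kmers]


def pwm_from_sites(sites, pseudocount=1.0, background=0.25):
    """Build a log-odds PWM from aligned, equal-length motif instances.

    Assumption: in practice the matrix would come from a motif-finding tool;
    this helper only stands in for that step.
    """
    total = len(sites) + pseudocount * len(ALPHABET)
    pwm = []
    for pos in range(len(sites[0])):
        column = [site[pos] for site in sites]
        pwm.append({
            base: log2(((column.count(base) + pseudocount) / total) / background)
            for base in ALPHABET
        })
    return pwm


def best_pwm_score(seq, pwm):
    """Highest log-odds score of the PWM over all windows of seq.

    Assumption: unknown bases contribute 0.0, and a sequence shorter than
    the motif gets a score of -inf.
    """
    width = len(pwm)
    scores = [
        sum(pwm[j].get(seq[i + j], 0.0) for j in range(width))
        for i in range(len(seq) - width + 1)
    ]
    return max(scores, default=float("-inf"))


def featurize(seq, k, pwms):
    """Combined attribute vector: k-mer counts followed by one score per PWM."""
    return kmer_features(seq, k) + [best_pwm_score(seq, pwm) for pwm in pwms]
```

Each labeled sequence would be passed through `featurize` to produce one row of a feature matrix, which can then be given to any standard classifier. Here `k` and the number and width of the PWMs correspond to the parameters to be tuned.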