This document discusses using the WEKA machine learning toolkit to build a model for classifying movie reviews as positive or negative sentiment. It describes preprocessing a dataset of 2000 movie reviews from IMDb by converting the text to vector representations and experimenting with different preprocessing options. Various classifiers are tested on the vectorized datasets, with Naive Bayes achieving the best results, correctly classifying around 80-85% of reviews. Attribute selection is also explored to identify the most important words for sentiment classification. Finally, clustering is briefly applied to group similar reviews.