This thesis proposes a family of machine translation evaluation metrics, called LEPOR, designed to address weaknesses of existing automatic metrics. LEPOR combines several evaluation factors, including an enhanced sentence-length penalty and an n-gram position difference penalty, with precision and recall, to provide more accurate quality assessments. It also explores part-of-speech (POS) tags as linguistic features. Experimental results on multiple language pairs show that LEPOR correlates strongly with human judgments of translation quality. The metrics were also submitted to the ACL-WMT shared translation evaluation task, where they proved robust across different languages.
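As an illustrative sketch of how these factors combine, the basic LEPOR score can be written as the product of a length penalty, a position difference penalty, and a weighted harmonic mean of precision and recall. The formulation below follows the published LEPOR description and should be read as a sketch rather than the thesis's final tuned variant; here $c$ and $r$ denote the candidate and reference sentence lengths, $\mathrm{PD}_i$ is the normalized position difference of the $i$-th aligned word, and $\alpha$, $\beta$ are tunable weights:
\[
\mathrm{LEPOR} = \mathrm{LP} \times \mathrm{NPosPenal} \times \mathrm{Harmonic}(\alpha R, \beta P),
\]
where
\[
\mathrm{LP} =
\begin{cases}
e^{1 - r/c} & \text{if } c < r,\\[2pt]
1 & \text{if } c = r,\\[2pt]
e^{1 - c/r} & \text{if } c > r,
\end{cases}
\qquad
\mathrm{NPosPenal} = e^{-\mathrm{NPD}},
\]
\[
\mathrm{NPD} = \frac{1}{\mathrm{Length}_{\mathrm{output}}} \sum_{i=1}^{\mathrm{Length}_{\mathrm{output}}} \left| \mathrm{PD}_i \right|,
\qquad
\mathrm{Harmonic}(\alpha R, \beta P) = \frac{\alpha + \beta}{\dfrac{\alpha}{R} + \dfrac{\beta}{P}}.
\]
The exponential forms ensure that each penalty lies in $(0, 1]$, so a candidate is penalized smoothly, rather than abruptly, as its length or word order diverges from the reference.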