This paper presents a grammar-based pre-processing method for the Prediction by Partial Matching (PPM) compression algorithm that improves compression for a range of natural languages. The method generates a grammar from the most common character sequences (n-graphs) in the text being compressed and, in a pre-processing phase before PPM is applied, substitutes those sequences with symbols defined by the grammar. This yields compression improvements of up to 35% for Chinese text and 29% for Welsh. The approach outperforms existing compression methods and shows the value of using n-graphs specific to the text being compressed.
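
The pre-processing idea can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names, the `n` and `top_k` parameters, the greedy left-to-right substitution, and the use of Unicode Private Use Area code points as non-terminal symbols are all assumptions made for illustration, and the PPM stage that would follow is omitted.

```python
from collections import Counter

# Assumption: non-terminals are taken from the Unicode Private Use Area,
# and the input text itself contains no characters from that range.
NONTERMINAL_BASE = 0xE000


def build_grammar(text, n=2, top_k=100):
    """Map the top_k most frequent n-graphs of `text` to fresh non-terminals."""
    counts = Counter(text[i:i + n] for i in range(len(text) - n + 1))
    return {
        gram: chr(NONTERMINAL_BASE + idx)
        for idx, (gram, _) in enumerate(counts.most_common(top_k))
    }


def substitute(text, grammar, n=2):
    """Replace n-graphs found in the grammar, scanning greedily left to right."""
    out = []
    i = 0
    while i < len(text):
        gram = text[i:i + n]
        if gram in grammar:
            out.append(grammar[gram])  # emit the non-terminal symbol
            i += n
        else:
            out.append(text[i])        # copy the character unchanged
            i += 1
    return "".join(out)


def expand(encoded, grammar):
    """Invert `substitute` by expanding each non-terminal to its n-graph."""
    inverse = {symbol: gram for gram, symbol in grammar.items()}
    return "".join(inverse.get(ch, ch) for ch in encoded)
```

In this sketch the grammar is built from the same text that is then encoded, which mirrors the abstract's point about n-graphs specific to the text being compressed. The output of `substitute` would be passed to a PPM compressor, and a decoder would need the grammar in order to call `expand`.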