The paper introduces improved count suffix trees (CSTs) for the substring estimation problem, which arises when optimizing queries with natural language predicates. It proposes new filtering and pruning techniques that significantly reduce CST size and improve selectivity estimates. Key innovations include an optimistic syllabification method, a more aggressive affix and prefix stripping procedure, and effective non-word filtering. In the evaluation, these techniques reduce CST size by up to 80% and improve the accuracy of selectivity estimates by up to 70%.
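
For readers unfamiliar with the data structure, the following is a minimal sketch of a count suffix structure and of simple frequency-based pruning. It is not the paper's method: the class name `CountSuffixTrie`, the uncompressed trie layout, the `min_count` threshold, and the use of raw occurrence counts are all assumptions made for illustration, and none of the syllabification, stripping, or non-word filtering steps are modeled.

```python
class CountSuffixTrie:
    """Hypothetical, uncompressed count suffix trie.

    Each node stores how often the substring spelled by the path from
    the root occurs in the inserted words.
    """

    def __init__(self):
        self.count = 0
        self.children = {}

    def insert_word(self, word):
        # Insert every suffix of the word. Each node visited is incremented
        # once per suffix, so a node's count equals the number of
        # occurrences of its substring across all inserted words.
        for start in range(len(word)):
            node = self
            for ch in word[start:]:
                node = node.children.setdefault(ch, CountSuffixTrie())
                node.count += 1

    def occurrences(self, pattern):
        # Walk the pattern from the root; a missing edge means the pattern
        # was never seen or was pruned away. This sketch returns 0 in both
        # cases rather than estimating from shorter substrings.
        node = self
        for ch in pattern:
            node = node.children.get(ch)
            if node is None:
                return 0
        return node.count

    def prune(self, min_count):
        # Assumed pruning rule: drop every subtree whose root count is
        # below min_count, trading accuracy on rare substrings for size.
        self.children = {
            ch: child
            for ch, child in self.children.items()
            if child.count >= min_count
        }
        for child in self.children.values():
            child.prune(min_count)

    def size(self):
        # Number of nodes, including the root.
        return 1 + sum(child.size() for child in self.children.values())


def build(words):
    trie = CountSuffixTrie()
    for word in words:
        trie.insert_word(word)
    return trie
```

With the words `banana`, `bandana`, and `cabana`, `build(...).occurrences("ana")` counts four occurrences. After `prune(2)` the trie has fewer nodes and still answers that query, but a rare substring such as `band` is no longer found. This is the size versus accuracy trade-off that pruning techniques have to manage.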