Successfully reported this slideshow.
We use your LinkedIn profile and activity data to personalize ads and to show you more relevant ads. You can change your ad preferences anytime.

ast-rspamd

528 views

Published on

  • Be the first to comment

  • Be the first to like this

ast-rspamd

  1. 1. Expressions speed up … when you are too lazy to calculate everything
  2. 2. What is optimised • General logic expression: • Complex logic: • Arithmetic logic: • All above: A & B & C & D (A & !B) | C & D A + B + C + D > 2 F & (A + B + C + D > 2)
  3. 3. SA approach • Evaluate all rules (terribly slow) • Use only some of them • Use regexps for everything
  4. 4. Old rspamd • Reverse Polish Notation A & B & C -> A B C & & • Evaluated rules, applying basic optimisations: A & B & C & D 0 & 1 & 1 & 0 A & B & C & D If A equal to 0 there is no need to evaluate other components
  5. 5. Can we do better? • We want to organise evaluations to execute faster ops before expensive ops • We want to have a generic evaluation of the arguments to decide when to stop and return
  6. 6. Solution: AST • Abstract Syntax Tree - a tree of expressions • Optimize branches in the tree by execution time and frequency • Apply greedy algorithm to minimise calculations • Be as lazy as possible (laziness is good!)
  7. 7. AST building & | C ! B A
  8. 8. AST eval (naive) & | C ! B A A = 0, B = 1, C = 0 0 1 1 0 1 0
  9. 9. AST branches cut & | C ! B A A = 0, B = 1, C = 0 0 1 1 0 1 0 Eval order • 1 branch skipped
  10. 10. Can we do better? • In the previous slide we cut merely a single branch • Not good, still have to evaluate too many unnecessary stuff
  11. 11. AST branches reorder & |C ! B A A = 0, B = 1, C = 0 0 1 10 1 0 Eval order • 4 branches skipped
  12. 12. AST branches reorder • Prioritise branches with fewer operations in the underneath levels • Skip unnecessary evaluations • Reduce the total running time of the expression
  13. 13. N-ary operations > + 2 ! B A Eval order C D E
  14. 14. N-ary optimizations > + 2 ! B A Eval order C D E What do we compare? Here is our limit
  15. 15. N-ary optimizations > + 2 ! B A Eval order C D E What do we compare? Here is our limit Stop here
  16. 16. Results • Rspamd with RPN: 200ms on a normal message, 1.6 seconds on stupid large text message (10 Mb of text) • Rspamd with AST: 40ms on a normal message, 400ms on stupid large text message • SA: ??? (timeout?)
  17. 17. Further steps • Greedy algorithm to optimize execution time: • calculate frequency and average time of a component • minimize expression by applying greedy formula: min(freq / avg_time) for each component
  18. 18. Learn dynamically • We need to re-evaluate order of AST in the real time • Solution: periodically evaluate atoms weights and resort tree using the same greedy algorithm • Average time and cost is already evaluated
  19. 19. Laziness is the source of the progress

×