ULLINK - Analysis and proposal for FIX HFT - December 2010

This document is a collection of thoughts and analysis to optimize FIX for High Frequency Trading activity. Lowering latency and transit-time.

  1. High Frequency FIX
     Analysis and Proposals for HFT FIX
     December 2010
  2. ABSTRACT
     This document collects thoughts and analysis on optimizing FIX for high-frequency trading (HFT) activity, with the aim of lowering latency and transit time.
  3. NUMBERS TO KEEP IN MIND
     • L1 cache access: 0.5 ns
     • L2 cache access: 7 ns
     • Main memory access: 100 ns
     • Compress 1K bytes with a cheap algorithm: 3,000 ns
     • Send 2K bytes over a 1 Gbps network: 20,000 ns
     • Read 1 MB sequentially from memory: 250,000 ns
     Figures come from a talk given by Google's Jeff Dean at Stanford in 2010.
  9. WHAT WE NEED TO OPTIMIZE
     [Diagram: Trading System A encodes an Event and sends it over the Network; Trading System B decodes it back into an Event.]
     Goal: reduce the transit time.
  10. WHAT WE NEED TO OPTIMIZE
      • Encoding: Event* → stream of bytes
      • Network transit: the smaller the stream of bytes, the faster the transit.
      • Decoding: stream of bytes → Event*
      *The Event is at the application level. It is not an object of the FIX engine.
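To put a rough number on "smaller is faster", the sketch below estimates only the serialization delay, that is, the time needed to clock a message onto the link. It ignores propagation, switching, and protocol overhead, so it is a lower bound rather than a measurement. The class and method names are illustrative; the 200-byte and 98-byte sizes are the benchmark and proposal totals quoted later in this deck.

```java
final class WireTime {
    /** Serialization delay only: nanoseconds needed to put the bytes on the wire. */
    static double serializationNanos(int messageBytes, double linkBitsPerSecond) {
        return messageBytes * 8.0 * 1e9 / linkBitsPerSecond;
    }

    public static void main(String[] args) {
        double gigabit = 1e9; // 1 Gbps link, as in the latency table
        System.out.println("200-byte message: " + serializationNanos(200, gigabit) + " ns");
        System.out.println(" 98-byte message: " + serializationNanos(98, gigabit) + " ns");
    }
}
```

On a 1 Gbps link this works out to about 1,600 ns for 200 bytes and about 784 ns for 98 bytes, which is the same order of magnitude as the encoding costs discussed in the following chapters.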
  11. TARGET
      The perfect balance between:
      • Encoding
      • Decoding
      • Message size
      In other words: an ultra-tiny message is useless if encoding or decoding it is very slow.
  14. Chapter 1: Integration within the protocol
  15. INTEGRATION WITHIN FIX 5.0: TODAY
      [Layer diagram: FIX application messages sit directly on top of FIX Transport (FIXT 1.1, JMS, …).]
  16. INTEGRATION WITHIN FIX 5.0: PROPOSAL
      [Layer diagram, top to bottom:]
      • FIX application messages
      • FIX Codec: FIXC 1.0, …, FIXHFT 1.0 ("We are here"). The existing codec is Tag=Value<SOH>.
      • FIX Transport: FIXT 1.1, JMS, …
  17. INTEGRATION WITHIN FIX 5.0
      We can create and evolve the codec implementation (v1.0, v1.1, etc.).
      This opens the door to different codec implementations, so more innovation is captured over time.
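One way to picture a separate codec layer is as an interface that sits between application messages and the transport. The sketch below is only an illustration: `FixCodec` and `TagValueCodec` are hypothetical names, not part of any FIX specification or engine, and the field map is a simplification that ignores repeating groups. It implements the existing Tag=Value<SOH> form; a binary codec would be a second implementation of the same interface.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical codec layer: turns application-level fields into bytes and back. */
interface FixCodec {
    byte[] encode(Map<Integer, String> fields);
    Map<Integer, String> decode(byte[] wire);
}

/** The existing codec: tag=value pairs separated by SOH (0x01). */
final class TagValueCodec implements FixCodec {
    private static final char SOH = '\u0001';

    @Override
    public byte[] encode(Map<Integer, String> fields) {
        StringBuilder sb = new StringBuilder();
        for (Map.Entry<Integer, String> e : fields.entrySet()) {
            sb.append(e.getKey()).append('=').append(e.getValue()).append(SOH);
        }
        return sb.toString().getBytes(StandardCharsets.US_ASCII);
    }

    @Override
    public Map<Integer, String> decode(byte[] wire) {
        Map<Integer, String> fields = new LinkedHashMap<>(); // keeps wire order
        String text = new String(wire, StandardCharsets.US_ASCII);
        for (String field : text.split(String.valueOf(SOH))) {
            if (field.isEmpty()) continue;
            int eq = field.indexOf('=');
            fields.put(Integer.parseInt(field.substring(0, eq)), field.substring(eq + 1));
        }
        return fields;
    }
}
```

With this shape, the transport only ever sees byte arrays, so swapping codecs would not touch the application messages or the session layer.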
  18. Chapter 2: The Header and Trailer
  19. BENCHMARK MESSAGE
      All testing was done with the following FIX message, a tiny limit order in FIX 4.2:
      8=FIX.4.2|9=177|35=D|49=FIXSENDER|56=ULTEST|34=4209|52=20101029-10:11:51.890|1=1374390|55=DE0005937007|48=DE0005937007|22=4|54=2|114=N|38=1|59=0|40=2|21=1|44=200|15=EUR|100=DE|11=1288346559187|10=025|
      200 bytes
  20. HEADER SIZE
      Tag 8 (BeginString)
      Not very time-consuming, but it represents 10 bytes (8=FIX.4.2<SOH>), which is 5% of my benchmark message!
      SenderCompID and TargetCompID
      Repeated on every message. Can they be replaced or removed completely?
      • 23 bytes in my benchmark: « 49=FIXSENDER|56=ULTEST| »
      • 10 bytes is the minimum possible: « 49=F|56=U| »
      (Again 5-10% of the total size.)
  22. HEADER SIZE
      Tag 52 (SendingTime)
      The sending time is useful to prevent delayed, and therefore dangerous, orders from reaching the market.
      But 25 bytes is too much (more than 10% of the message size):
      « 52=20101029-10:11:51.890| »
  23. Encoding Tag 9 (BodyLength) and Tag 10 (CheckSum)
      Tag 9 needs the body to be encoded first, and Tag 10 needs Tag 9. All these steps are expensive and require several passes over memory and extra transfers.
      Tag 9 + Tag 10 = ~13 bytes (again more than 5% of the benchmark message).
      [Diagram: Event → Buffer → Socket, in four steps:]
      1) Encode body
      2) Encode length
      3) Encode checksum
      4) Send to socket
  24. Encoding Tag 9 (BodyLength) and Tag 10 (CheckSum)
      We should be able to stream the encoding as much as possible.
      Encoding the checksum is way too slow because it scans the full message.
      At this stage a memory scan makes a big difference (cf. the NUMBERS TO KEEP IN MIND slide).
      [Diagram: Event → stream encoding → Socket]
  25. Decoding Tag 9 (BodyLength) and Tag 10 (CheckSum)
      On the decoding end, Tag 9 and Tag 10 are only used to check integrity, because messages are delimited between Tag 8 and Tag 10.
      Again the buffer is scanned for checksum validation, which costs too much.
      I guess we can get rid of these integrity checks and rely on the transport directly.
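The dependency chain described on these slides can be sketched in a few lines. The FIX rules used here are the standard ones: BodyLength counts the bytes after the BodyLength field up to and including the SOH before the CheckSum field, and CheckSum is the sum of all preceding bytes modulo 256, written as three digits. The class and method names are illustrative, and string concatenation is used for readability, which a real low-latency engine would avoid.

```java
import java.nio.charset.StandardCharsets;

final class FixTrailer {
    static final char SOH = '\u0001';

    /** Sums every byte modulo 256: the whole buffer has to be scanned. */
    static int checksum(byte[] data, int length) {
        int sum = 0;
        for (int i = 0; i < length; i++) {
            sum += data[i] & 0xFF;
        }
        return sum % 256;
    }

    /** Frames an already encoded body (starting at tag 35, ending with SOH). */
    static String frame(String beginString, String body) {
        // Step 2: the length is only known once the body has been encoded.
        int bodyLength = body.getBytes(StandardCharsets.US_ASCII).length;
        String head = "8=" + beginString + SOH + "9=" + bodyLength + SOH + body;
        // Step 3: a second pass over everything written so far.
        byte[] bytes = head.getBytes(StandardCharsets.US_ASCII);
        int sum = checksum(bytes, bytes.length);
        // Step 4: only now can the message go to the socket.
        return head + "10=" + String.format("%03d", sum) + SOH;
    }
}
```

Nothing can be sent until the last body byte is known, which is why these two fields prevent streaming the encoder output straight to the socket.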
  26. SUMMARY
      If you combine all this:
      Header + Trailer = 38% of the message size
      Theoretically, Header + Trailer = 38% of the encoding time. Because of the checksum cost, I guess it is more than 40%.
      • Body: 116 bytes (62%)
      • Header + Trailer: 77 bytes (38%)
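The split can be checked by counting the benchmark message field by field, assuming each `|` stands for the one-byte SOH delimiter. In the sketch below (class and field names are illustrative) the header fields add up to 77 bytes, the body to 116 bytes, and the `10=025|` trailer to 7 bytes, which together give the 200-byte total. The 77-byte figure on the slide therefore matches the header fields counted without the checksum field.

```java
final class BenchmarkBreakdown {
    static final String[] HEADER = {
        "8=FIX.4.2|", "9=177|", "35=D|", "49=FIXSENDER|", "56=ULTEST|",
        "34=4209|", "52=20101029-10:11:51.890|"
    };
    static final String[] BODY = {
        "1=1374390|", "55=DE0005937007|", "48=DE0005937007|", "22=4|", "54=2|",
        "114=N|", "38=1|", "59=0|", "40=2|", "21=1|", "44=200|", "15=EUR|",
        "100=DE|", "11=1288346559187|"
    };
    static final String[] TRAILER = { "10=025|" };

    /** Total size in bytes; every character here is single-byte ASCII. */
    static int size(String[] fields) {
        int total = 0;
        for (String field : fields) {
            total += field.length();
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println("header  = " + size(HEADER));
        System.out.println("body    = " + size(BODY));
        System.out.println("trailer = " + size(TRAILER));
    }
}
```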
  27. PROPOSAL: All mandatory tags in structured binary
      Binary is faster to encode and decode.
  28. PROPOSAL
      Example:
      8=FIX.4.2|9=177|35=D|49=FIXSENDER|56=ULTEST|34=4209|52=20101029-10:11:51.890|1=1374390|55=DE0005937007|48=DE0005937007|22=4|54=2|114=N|38=1|59=0|40=2|21=1|44=200|15=EUR|100=DE|11=1288346559187|10=025|
      Becomes:
      <NEWHEADER>1=1374390|55=DE0005937007|48=DE0005937007|22=4|54=2|114=N|38=1|59=0|40=2|21=1|44=200|15=EUR|100=DE|11=1288346559187|
      200 bytes -> about 130 bytes, about 35% smaller.
      Faster to encode and decode because it is binary based.
  29. Chapter 3: Binary Encoding
  30. BINARY ENCODING
      There are a lot of numbers in the FIX Protocol (obviously, given its financial nature).
      For numbers, binary encoding is a must-have because it is:
      • Faster to encode
      • Faster to decode
      • Smaller on the wire
  32. BINARY ENCODING: why is it faster?
      « 100 » can be stored directly in 1 byte, with no code at all.
      As a literal, « 100 » is stored in 3 bytes, and this is the code that runs underneath to encode it (the digit loop of the JDK's integer-to-string conversion for small values; i, q, r, buf, charPos, digits and sign are declared by the surrounding method):
          for (;;) {
              q = (i * 52429) >>> (16 + 3);      // q = i / 10, done with a multiply and a shift
              r = i - ((q << 3) + (q << 1));     // r = i - q * 10, the last decimal digit
              buf[--charPos] = digits[r];        // write the digit character, right to left
              i = q;
              if (i == 0) break;
          }
          if (sign != 0) {
              buf[--charPos] = sign;             // prepend '-' for negative numbers
          }
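The difference between the two forms can be shown side by side. This is a minimal sketch with illustrative names; the binary form here handles only values that fit in one unsigned byte, which is enough for the « 100 » example.

```java
import java.nio.charset.StandardCharsets;

final class IntEncodingDemo {
    /** Tag=value style: one ASCII byte per decimal digit. */
    static byte[] asciiEncode(int value) {
        return Integer.toString(value).getBytes(StandardCharsets.US_ASCII);
    }

    /** ASCII decoding has to loop over digits and multiply. */
    static int asciiDecode(byte[] digits) {
        int value = 0;
        for (byte b : digits) {
            value = value * 10 + (b - '0');
        }
        return value;
    }

    /** Binary style: a value in 0..255 is stored as-is in a single byte. */
    static byte[] binaryEncode(int value) {
        if (value < 0 || value > 255) {
            throw new IllegalArgumentException("value does not fit in one byte");
        }
        return new byte[] { (byte) value };
    }

    /** Binary decoding is a single mask (Java bytes are signed). */
    static int binaryDecode(byte[] data) {
        return data[0] & 0xFF;
    }
}
```

For 100, `asciiEncode` produces 3 bytes and `binaryEncode` produces 1, and the binary path involves no loop in either direction.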
  33. PROPOSAL: Encode fields in a compact binary form
      Field Header = 1 byte = 8 bits
  34. PROPOSAL: Encode fields in a compact binary form
      Field Header / Tag Value Encoded Format (7 bits)
  35. PROPOSAL: Encode fields in a compact binary form
      Field Header / Tag Value Encoded Format (7 bits)
  36. PROPOSAL: Encode fields in a compact binary form
      Field Header / Tag Value Encoded Format (7 bits)
      What about Boolean and Char?
      They are heavily used in the FIX Protocol, so we need something efficient!
      With this, char encoding doesn't need a tag value: it's all there.
  37. PROPOSAL: Encode fields in a compact binary form
      Field Header / Tag Value Encoded Format (7 bits)
      There are values still available for more ideas:
      You to fill in.
  38. PROPOSAL: Encode fields in a compact binary form
      Benchmark message body:
      1=1374390| -> 10 bytes (String) or 6 bytes (Integer, 4 bytes)
      55=DE0005937007| -> 15 bytes (String)
      48=DE0005937007| -> 15 bytes (String)
      22=4| -> 2 bytes (Char)
      54=2| -> 2 bytes (Char)
      114=N| -> 2 bytes (Char/Bool)
      38=1| -> 2 bytes (Char)
      59=0| -> 2 bytes (Char)
      40=2| -> 2 bytes (Char)
      21=1| -> 2 bytes (Char)
      44=200| -> 3 bytes (Integer, 1 byte)
      15=EUR| -> 6 bytes (String)
      100=DE| -> 5 bytes (String)
      11=1288346559187| -> 16 bytes (String)
      Body size goes from 116 bytes to 80 or 84 bytes (-30%).
      (NOTE: Tags 55 and 48 account for 30 bytes of repetition.)
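The per-field figures above are consistent with a simple size rule, which the sketch below applies to the whole body. The rule is inferred from the numbers on the slide, since the field header tables themselves are not reproduced in this transcript: a Char or Bool field takes 2 bytes, an Integer takes 2 bytes plus its value bytes, and a String takes 3 bytes plus its characters. The class and method names are illustrative.

```java
final class CompactFieldSizes {
    // Assumed size rules, inferred from the per-field sizes listed on the slide.
    static int charField() { return 2; }                        // value folded into the header
    static int intField(int valueBytes) { return 2 + valueBytes; }
    static int stringField(String value) { return 3 + value.length(); }

    /** Size of the benchmark body, with tag 1 (Account) as Integer or String. */
    static int bodySize(boolean accountAsInteger) {
        int size = accountAsInteger ? intField(4) : stringField("1374390"); // tag 1
        size += stringField("DE0005937007");   // tag 55
        size += stringField("DE0005937007");   // tag 48
        size += 7 * charField();               // tags 22, 54, 114, 38, 59, 40, 21
        size += intField(1);                   // tag 44 = 200 fits in one byte
        size += stringField("EUR");            // tag 15
        size += stringField("DE");             // tag 100
        size += stringField("1288346559187");  // tag 11
        return size;
    }
}
```

Under these assumptions `bodySize(true)` gives 80 and `bodySize(false)` gives 84, matching the two totals quoted on the slide.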
  39. PROPOSAL: BIG PICTURE
      Benchmark FIX message: total = 200 bytes
      • Body: 116 bytes (62%)
      • Header + Trailer: 77 bytes (38%)
      Proposal: total = 98 bytes (51% shaved)
      • New body: ~84 bytes (86%)
      • New header: ~14 bytes (14%)
  40. Chapter 4: How does it compare to compression?
  41. COMPRESSION with LZF
      LZF = fast and lightweight LZ algorithm
      LZ = Lempel-Ziv = compression of repetitions
      Without a predefined dictionary: message size = 189 bytes
      With a predefined dictionary: not done yet
  42. COMPRESSION with Huffman
      Huffman = the most used characters use fewer bits
      With a Huffman tree optimized for the benchmark message: message size = 103 bytes (-49%)
      But the Huffman tree needs to be transmitted! That's another 26 to 52 bytes, although a new Huffman tree may not be needed for each message.
      Huffman tree: {.=1111111, -=11011000, X=1101101, U=1101110, T=1111101, S=1111010, R=1111000, N=1111011, |=011, L=11011001, I=1111001, F=1111100, E=11010, D=00110, ==010, :=1111110, 9=0010, 8=00111, 7=11100, 6=1101111, 5=1100, 4=1010, 3=11101, 2=1011, 1=000, 0=100}
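The 103-byte figure can be reproduced from the code table on the slide by adding up the code length of every character in the benchmark message. The sketch below does only that size calculation (it does not build a tree or emit a bit stream); the class and method names are illustrative, and `|` is treated as the delimiter character exactly as written on the slide.

```java
import java.util.HashMap;
import java.util.Map;

final class HuffmanSize {
    // Code table copied from the slide, as "symbol=code" entries.
    static final String[] TABLE = {
        ".=1111111", "-=11011000", "X=1101101", "U=1101110", "T=1111101",
        "S=1111010", "R=1111000", "N=1111011", "|=011", "L=11011001",
        "I=1111001", "F=1111100", "E=11010", "D=00110", "==010", ":=1111110",
        "9=0010", "8=00111", "7=11100", "6=1101111", "5=1100", "4=1010",
        "3=11101", "2=1011", "1=000", "0=100"
    };

    static final String BENCHMARK =
        "8=FIX.4.2|9=177|35=D|49=FIXSENDER|56=ULTEST|34=4209|52=20101029-10:11:51.890|"
        + "1=1374390|55=DE0005937007|48=DE0005937007|22=4|54=2|114=N|38=1|59=0|40=2|21=1|"
        + "44=200|15=EUR|100=DE|11=1288346559187|10=025|";

    /** Number of bits needed to encode the message with the table above. */
    static int encodedBits(String message) {
        Map<Character, String> codes = new HashMap<>();
        for (String entry : TABLE) {
            // First char is the symbol, second is '=', the rest is the bit string.
            codes.put(entry.charAt(0), entry.substring(2));
        }
        int bits = 0;
        for (char c : message.toCharArray()) {
            String code = codes.get(c);
            if (code == null) {
                throw new IllegalArgumentException("no code for '" + c + "'");
            }
            bits += code.length();
        }
        return bits;
    }

    /** Bits rounded up to whole bytes. */
    static int encodedBytes(String message) {
        return (encodedBits(message) + 7) / 8;
    }
}
```

`encodedBytes(BENCHMARK)` comes out at 103 bytes, in line with the slide, before counting the bytes needed to transmit the table itself.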
  43. COMPRESSION with ZLIB
      ZLIB = LZ + Huffman
      Single message: message size = 132 bytes (-34%)
      (Too much overhead given the size of the message.)
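A single-message ZLIB measurement of this kind can be repeated with the JDK's `java.util.zip` classes. The sketch below is an assumption about how such a test could be set up, not the author's actual harness; the exact compressed size depends on the compression level and on whether the real SOH byte or `|` is used as the delimiter, so no expected size is stated here.

```java
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

final class ZlibSize {
    /** Compresses one message on its own, with no shared dictionary. */
    static byte[] compress(byte[] input) {
        Deflater deflater = new Deflater(Deflater.BEST_COMPRESSION);
        deflater.setInput(input);
        deflater.finish();
        byte[] buffer = new byte[input.length * 2 + 64]; // generous for tiny inputs
        int written = deflater.deflate(buffer);
        deflater.end();
        return Arrays.copyOf(buffer, written);
    }

    /** Restores the original bytes; the caller supplies the original length. */
    static byte[] decompress(byte[] compressed, int originalLength) {
        Inflater inflater = new Inflater();
        inflater.setInput(compressed);
        byte[] out = new byte[originalLength];
        try {
            int read = inflater.inflate(out);
            return Arrays.copyOf(out, read);
        } catch (DataFormatException e) {
            throw new IllegalStateException("corrupt zlib stream", e);
        } finally {
            inflater.end();
        }
    }
}
```

Calling `compress` on the benchmark message bytes and comparing the result length with 200 gives the kind of figure reported on the slide, and `decompress` confirms the round trip.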
  44. COMPRESSION
      Compression is an additional cost on top of encoding and decoding.
      If the FIX tag=value form is slow, then compression is not going to improve it at all.
      The message sizes are not impressive: they don't match the encoded format proposed in this document.
      [Diagram: Trading System A encodes the Event, then compresses it, and sends it over the Network; Trading System B uncompresses it, then decodes it back into an Event.]
  45. COMPRESSION
      We may need to investigate compressing individual strings with a predefined Huffman tree. A lot of FIX strings use only digits and upper-case letters.
      Good for IDs such as CLORDID, ORDERID, ORIGCLORDID, ACCOUNT, SYMBOL, ISIN codes, RIC codes, etc.
      But this will impact encoding and decoding times…
  46. Chapter 5: New Message Types
  47. NEW MESSAGE TYPE
      On top of all the encoding discussion, nothing prevents us from creating new message types in the FIX Protocol definition.
      These message types could be smaller, simpler, and more powerful for HFT.
      E.g. messages for HFT:
      • Move up/down (price)
      • Increase/decrease (qty)
      • Cancel all my open orders
      These need to be designed by HFT firms based on their needs.
  50. Chapter 6: Templates
  51. TEMPLATES
      Templates could be used on top of the encoding form.
      I'm not a big fan of template files being exchanged between counterparties. IMHO, brokers with tons of clients don't want yet another thing to manage for client connectivity.
      I prefer the idea of Prepared Messages (see next chapter).
  52. Chapter 7: Prepared Messages
  53. PREPARED MESSAGES
      Think of Prepared Messages as the equivalent of Prepared Statements in the SQL world.
      They are pre-filled, parameterized messages that the destination is ready to receive and process.
      This also means that some optimization and pre-processing can be done up front on the receiving end.
  54. PREPARED MESSAGES
      Prepared message:
      <MESSAGE HEADER> -> Unchanged
      1=1374390| -> 10 bytes (String) or 6 bytes (Integer, 4 bytes)
      55=?| -> 2 bytes (Parameter 1)
      48=?| -> 2 bytes (Parameter 2)
      22=4| -> 2 bytes (Char)
      54=?| -> 2 bytes (Parameter 3)
      114=N| -> 2 bytes (Char/Bool)
      38=?| -> 2 bytes (Parameter 4)
      59=0| -> 2 bytes (Char)
      40=2| -> 2 bytes (Char)
      21=1| -> 2 bytes (Char)
      44=?| -> 2 bytes (Parameter 5)
      15=EUR| -> 6 bytes (String)
      100=DE| -> 5 bytes (String)
      11=?| -> 2 bytes (Parameter 6)
      65535=123 -> Prepared message ID
      Tag 65535 (it could be any tag) carries the prepared message ID that is used at execute time.
  55. PREPARED MESSAGES
      Once prepared, the message to execute is:
      PreparedMsg ID 123 -> 2 bytes
      Parameter 1: Symbol DE0005937007 -> 14 bytes (String)
      Parameter 2: SecurityID DE0005937007 -> 14 bytes (String)
      Parameter 3: Side 2 -> 1 byte (Char)
      Parameter 4: Quantity 1 -> 2 bytes (Integer)
      Parameter 5: Price 200 -> 2 bytes (Integer)
      Parameter 6: ClOrdID 1288346559187 -> 15 bytes (String)
      Executed message = 64 bytes
      • Execute message: ~50 bytes
      • New header: ~14 bytes
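The prepare/execute flow can be sketched on the receiving side as a small registry: the template is parsed once at prepare time, and each execute message only supplies the positional parameters. Everything here is hypothetical (the class name, the use of `?` as the placeholder, and the text template form); it illustrates the idea and is not a wire format.

```java
import java.util.HashMap;
import java.util.Map;

final class PreparedMessages {
    private final Map<Integer, String[]> templates = new HashMap<>();

    /** Registers a template once; fields ending in "=?" are parameters. */
    void prepare(int id, String template) {
        templates.put(id, template.split("\\|")); // parsed up front, not per message
    }

    /** Rebuilds the full tag=value body from the ID and positional parameters. */
    String execute(int id, String... params) {
        String[] fields = templates.get(id);
        if (fields == null) {
            throw new IllegalArgumentException("unknown prepared message " + id);
        }
        StringBuilder out = new StringBuilder();
        int next = 0;
        for (String field : fields) {
            if (field.endsWith("=?")) {
                if (next >= params.length) {
                    throw new IllegalArgumentException("missing parameter " + (next + 1));
                }
                out.append(field, 0, field.length() - 1).append(params[next++]);
            } else {
                out.append(field);
            }
            out.append('|');
        }
        if (next != params.length) {
            throw new IllegalArgumentException("too many parameters");
        }
        return out.toString();
    }
}
```

Preparing ID 123 with the template `1=1374390|55=?|48=?|22=4|54=?|114=N|38=?|59=0|40=2|21=1|44=?|15=EUR|100=DE|11=?|` and executing it with the six parameters from the slide yields the original benchmark body, while only the ID and the parameters travel on the wire.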
  56. PREPARED MESSAGES
      A black box doing pairs trading can also prepare the stock and the side:
      PreparedMsg ID 123 -> 2 bytes
      Parameter 1: Quantity 1 -> 2 bytes (Integer)
      Parameter 2: Price 200 -> 2 bytes (Integer)
      Parameter 3: ClOrdID 1288346559187 -> 15 bytes (String)
      Executed message = 35 bytes (!)
      • Execute message: ~21 bytes
      • New header: ~14 bytes
  57. PREPARED MESSAGES
      Prepared Messages give very good results without removing anything from the FIX protocol.
      Prepared Messages fit HFT very well, since instruments and strategies are repetitive and the order topology is known up front.
      Note: the message size is reduced so much that we may need to work on a more compact form for tag 11 (ClOrdID) and tag 41 (OrigClOrdID). Indeed, in my last example, 15 of the 35 bytes are related to tag 11.
  58. PREPARED MESSAGES
      Another good effect of prepared messages is that the receiving system can prepare objects and data to optimize the processing of the execute message.
      This can optimize the processing of the message within the application itself, even beyond the decoding end (exactly as prepared statements in SQL are faster within the database).
      [Diagram: Trading System A sends a Prepared Message and then Execute Messages to Trading System B; events are encoded on A and decoded on B.]
  59. Conclusion
  60. In this proposal we currently have:
      • Message size up to 85% smaller (good for network processing)
      • Faster encoding and decoding (a lot of binary form, no checksum)
      • It's simple: nobody needs a PhD
      • The FIX Protocol has not been denatured.
      Prepared Messages apart, no changes are required to any application already using a FIX engine. A single « HFT=yes » option could turn this on in FIX engines. This could dramatically reduce the investment required from the industry and boost adoption.
  64. TO BE CONTINUED…
      Contact: Georges Gomes, georges.gomes@ullink.com
      Merry Christmas and Happy New Year 2011
