This is the presentation given at the Stevens third annual HFT meeting.

- 1. Designing Customized Hash Function For High-Frequency Trading Systems
  Jim Wang, PhD, CFA
  Program in Financial Engineering, Stevens Institute of Technology
  Hoboken, NJ 07030, USA
- 2. Introduction
  [diagram: GUI, Trading Engine, and Stock Exchange linked by TCP and UDP connections]
  ● TCP messages between U and Exchanges
  ● UDP messages from all market participants
- 3. Source of Latency
  ● Propagation latency: light in optical fiber takes about 5 µs/km; Mahwah to Weehawken is about 40 km, or roughly 200 µs one way.
  ● Transmission latency: high-speed communication links have a throughput of 1-10 Gbps; about 1 µs per 1 kb to serialize and transport at 1 Gbps.
  ● Processing latency: reduced with a dedicated CPU for critical threads, kernel bypass, and hardware acceleration.
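The propagation and transmission figures on this slide can be checked with a few lines of arithmetic. This is only a sketch: the function names are hypothetical, 5 µs/km is the slide's figure, and "1 kb" is read here as 1,000 bits.

```c
/* Back-of-the-envelope latency estimates.
   Function names are illustrative and do not come from the slides. */

/* One-way propagation delay in microseconds, at the slide's 5 us/km. */
double propagation_us(double distance_km) {
    return 5.0 * distance_km;
}

/* Microseconds needed to serialize `bits` onto a link running at `gbps`.
   1 Gbps moves 1000 bits per microsecond. */
double serialization_us(double bits, double gbps) {
    return bits / (gbps * 1000.0);
}
```

Under these assumptions, `propagation_us(40.0)` gives 200 µs for the 40 km link, and `serialization_us(1000.0, 1.0)` gives 1 µs, which matches the slide's 1 µs per 1 kb at the low end of the 1-10 Gbps range.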
- 4. Processing Latency
  ● Parallel problem: with 10k symbols and 6 major exchanges that are relatively independent, tasks can be streamlined.
  ● Through software optimization: flexibility, and the ability to take advantage of general-purpose CPU improvements over time.
  ● Through hardware acceleration: specialized hardware improves consistency by reducing jitter.
- 5. Software Optimization
  ● Separation of high speed vs. high complexity: latency-sensitive tasks stay in the critical path; computation-intensive tasks are offloaded to a separate thread or process.
  ● Memory caching: the critical decision thread is pinned to a dedicated CPU.
  ● Inline vs. function calls: C, C++, Java.
- 6. Market Data Processing
  ● Data is stored using the ticker symbol as key
  ● A string of characters is translated into an integer memory location
  ● MS Windows and Linux systems provide a standard hash table (associative array, or memory map)
  ● These are generic implementations without knowledge of the input
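As a rough illustration of translating a string of characters into an integer, the sketch below packs the first 8 bytes of a ticker into a 64-bit value. The function name `ticker_to_key` is hypothetical, and the sketch assumes an 8-byte `long long` and null padding for short tickers; the resulting numeric value depends on the host byte order.

```c
#include <string.h>

/* Pack up to the first 8 bytes of a ticker into a 64-bit integer key.
   Shorter tickers are null-padded. Assumes sizeof(long long) == 8.
   The name ticker_to_key is illustrative, not from the slides. */
long long ticker_to_key(const char *ticker) {
    char buf[8] = {0};               /* zero-filled so padding is deterministic */
    size_t n = strlen(ticker);
    if (n > sizeof buf) n = sizeof buf;
    memcpy(buf, ticker, n);

    long long key;
    memcpy(&key, buf, sizeof key);   /* copy bytes instead of casting pointers */
    return key;
}
```

Using `memcpy` avoids alignment and aliasing concerns that a pointer cast can raise. Note that only the first 8 characters contribute, so two tickers sharing an 8-character prefix would map to the same key.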
- 7. Data Specificity
  ● Tickers do not have equal tick event probability
  ● BAC: ADV of 180M shares
  ● BAC.PRE: ADV of 18K shares
  ● Difference of 10K times or more
  ● A search miss (or collision) on ticker BAC is 10K times more costly than one on BAC.PRE
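One way to make this asymmetry concrete is to weight the number of probes per lookup by each symbol's activity. The sketch below assumes that tick frequency is proportional to ADV, which the slides suggest but do not state as a formula; `expected_probes` is a hypothetical helper.

```c
/* Activity-weighted average number of probes per lookup.
   adv[i]    : average daily volume of symbol i (stand-in for tick frequency)
   probes[i] : key comparisons needed to find symbol i
   The metric and the name are illustrative assumptions. */
double expected_probes(const double *adv, const int *probes, int n) {
    double weighted = 0.0, total = 0.0;
    for (int i = 0; i < n; i++) {
        weighted += adv[i] * probes[i];
        total += adv[i];
    }
    return total > 0.0 ? weighted / total : 0.0;
}
```

With BAC (180M) found in 1 probe and BAC.PRE (18K) in 2, the weighted average is about 1.0001 probes per lookup; with the order reversed it is about 1.9999. Placing the active symbol first nearly halves the average cost in this two-symbol case.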
- 8. Data Specificity
- 9. Implementation: Data Structure

      #define NUM_SYMBOL 10000            // total symbol universe
      struct tSym {                       // data structure for tick events
          char m_pszTicker[12];           // stock symbol
          long long key;                  // key to symbol
          short nextIndex;                // next search location if there is a collision
      };
      tSym gSym[NUM_SYMBOL];              // allocate memory for symbol universe ticks
      #define HASH_TABLE_SIZE 28091       // optimal size by empirical calibration
      short keyToIndex[HASH_TABLE_SIZE];  // allocate memory for hash table
      inline short symbolToIndex(long long key) {       // search function
          short i = keyToIndex[key % HASH_TABLE_SIZE];  // find the key
          while ((i > -1) && (gSym[i].key != key))
              i = gSym[i].nextIndex;                    // next if collision
          return i;  // index of the matching key, or a negative value if the symbol is unknown
      }
- 10. Implementation: Initialization
  Assuming you have loaded g_nSym known symbols into gSym[j].m_pszTicker in descending order of expected tick activity (j=0 is the most active stock):

      void buildHashTable()   // initialize the hash table
      {
          int i, j, k;
          short key;
          memset(keyToIndex, -1, sizeof(short)*HASH_TABLE_SIZE);  // init cells to -1
          for (j=0; j<g_nSym; j++) {                              // first pass
              gSym[j].nextIndex = -3;                             // initialize to resolution unknown
              gSym[j].key = *(long long *)gSym[j].m_pszTicker;    // assign key
              key = gSym[j].key % HASH_TABLE_SIZE;                // collision possible
              if (keyToIndex[key] == -1) {
                  keyToIndex[key] = j;
                  gSym[j].nextIndex = -1;                         // terminating
              }                                                   // collisions are not resolved in this pass
          }
          for (j=0; j<g_nSym; j++)
              if (gSym[j].nextIndex == -3) {                      // second pass
                  i = keyToIndex[gSym[j].key % HASH_TABLE_SIZE];
                  k = -2;                                         // find an empty slot
                  while (gSym[i].nextIndex > -1) {
                      i = gSym[i].nextIndex;
                      k--;
                  }
                  gSym[i].nextIndex = j;
                  gSym[j].nextIndex = k;                          // k encodes the number of collisions
              }
      }
- 11. Implementation: Expansion
  Intraday, a symbol not in the known universe may appear (IPO, or symbol change).

      int addSymbol(char *sym)
      {
          int j = g_nSym++;
          strcpy(gSym[j].m_pszTicker, sym);
          gSym[j].key = *(long long *)gSym[j].m_pszTicker;   // key of the new symbol
          short key = gSym[j].key % HASH_TABLE_SIZE;
          int i = keyToIndex[key];
          if (i == -1) {
              keyToIndex[key] = j;
              gSym[j].nextIndex = -1;
          } else {
              int k = -2;                                    // find an empty slot
              while (gSym[i].nextIndex > -1) {
                  i = gSym[i].nextIndex;
                  k--;
              }
              gSym[i].nextIndex = j;
              gSym[j].nextIndex = k;                         // k encodes the number of steps
          }
          return j;
      }
- 12. Implementation: Example
  ● HUN (active stock): convert to long int, mod 28091, hash table location 3462, return symbol location 433, key matches, done in 1 unit of time.
  ● RYN (medium activity): convert to long int, mod 28091, hash table location 3462 (collision), return symbol location 433, no match, next location 1811, key matches, done in 2 units of time.
  ● AHL.PR (low activity): convert to long int, mod 28091, hash table location 3462 (collision), return symbol location 433, no match, next location 1811, no match, next location 5363, key matches, done in 3 units of time.
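The three lookups above can be reproduced with a small, self-contained chained table. This is a sketch under stated assumptions: the integer keys are hypothetical values chosen only so that all three fall into bucket 3462, not the real encodings of HUN, RYN, and AHL.PR, and the symbol indices come out as 0, 1, 2 rather than 433, 1811, 5363. The names `table_init`, `table_add`, and `table_find` are illustrative.

```c
#define TABLE_SIZE 28091
#define MAX_SYMS 8

struct sym { long long key; short next; };

static struct sym syms[MAX_SYMS];
static short bucket[TABLE_SIZE];
static int n_syms;

/* Mark every bucket empty. */
static void table_init(void) {
    for (int i = 0; i < TABLE_SIZE; i++) bucket[i] = -1;
    n_syms = 0;
}

/* Append a key; callers add symbols in descending order of activity,
   so the busiest symbol in a bucket sits at the head of its chain.
   No bounds check on MAX_SYMS: this is a sketch. */
static int table_add(long long key) {
    int j = n_syms++;
    syms[j].key = key;
    syms[j].next = -1;
    short *slot = &bucket[key % TABLE_SIZE];
    if (*slot < 0) { *slot = (short)j; return j; }
    int i = *slot;
    while (syms[i].next >= 0) i = syms[i].next;   /* walk to the chain tail */
    syms[i].next = (short)j;
    return j;
}

/* Return the symbol index or -1; *probes receives the number of key comparisons. */
static int table_find(long long key, int *probes) {
    int i = bucket[key % TABLE_SIZE];
    *probes = 0;
    while (i >= 0) {
        (*probes)++;
        if (syms[i].key == key) return i;
        i = syms[i].next;
    }
    return -1;
}
```

Adding the keys 3462, 3462 + 28091, and 3462 + 2 × 28091 in that order and then looking each one up takes 1, 2, and 3 comparisons respectively, mirroring the 1, 2, and 3 units of time in the example.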
- 13. Optimal HASH_TABLE_SIZE [chart: Max Collision, Costs]
- 14. Optimal HASH_TABLE_SIZE: Cost < 500, Total Collision < 800, Max Collision <= 4, minimize size
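A calibration of this kind can be sketched as a scan over candidate sizes that keeps the smallest one meeting the collision limits. The slides do not define how "Cost" is computed, so this sketch covers only total collisions and the largest number of keys sharing a bucket; the names `bucket_stats` and `smallest_size` and both definitions are assumptions.

```c
#include <stdlib.h>

/* Collision statistics for one candidate table size. */
struct stats { int total_collisions; int max_chain; };

/* Count how keys[0..n-1] fall into `size` buckets under key % size.
   total_collisions: keys that land in an already occupied bucket.
   max_chain:        largest number of keys sharing one bucket.
   Returns 0 on success, -1 on bad input or allocation failure. */
int bucket_stats(const long long *keys, int n, int size, struct stats *out) {
    if (size <= 0) return -1;
    int *count = (int *)calloc((size_t)size, sizeof *count);
    if (!count) return -1;
    out->total_collisions = 0;
    out->max_chain = 0;
    for (int i = 0; i < n; i++) {
        int b = (int)(keys[i] % size);
        if (b < 0) b += size;                        /* guard against negative keys */
        if (count[b] > 0) out->total_collisions++;
        count[b]++;
        if (count[b] > out->max_chain) out->max_chain = count[b];
    }
    free(count);
    return 0;
}

/* Smallest size in [lo, hi] meeting both limits, or -1 if none does. */
int smallest_size(const long long *keys, int n, int lo, int hi,
                  int max_total, int max_chain) {
    for (int size = lo; size <= hi; size++) {
        struct stats s;
        if (bucket_stats(keys, n, size, &s) != 0) return -1;
        if (s.total_collisions <= max_total && s.max_chain <= max_chain)
            return size;
    }
    return -1;
}
```

For the toy key set {1, 2, 3, 11}, sizes 2 through 5 all produce at least one collision, and 6 is the smallest collision-free size. A production calibration would run the same scan over the real symbol keys and add an activity-weighted cost term.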
- 15. Worst HASH_TABLE_SIZE
  ● 24 active symbols start with "ST"
  ● When converted into long long and taken mod 24576, they result in the same key, 5203 (24 collisions)
  ● Sizes divisible by 256 are the worst
  ● Byte order encoding
    ○ Big-endian
    ○ Little-endian
  ● Padding convention
    ○ Null padding (ARCA)
    ○ Space padding (NASDAQ)
  ● Need to calibrate your own system for best performance
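The value 5203 can be reproduced by hand, assuming little-endian byte order and null padding (the slide lists both as variables but does not say which combination produced the number). Since 24576 = 3 × 8192, the remainder mod 24576 is fixed by the key's low 13 bits plus its remainder mod 3, and the low 13 bits of a little-endian key come entirely from the first character and the low 5 bits of the second. For "ST" that is 0x53 + (0x54 & 0x1F) × 256 = 83 + 5120 = 5203. The helper `le_key` below is hypothetical.

```c
#include <stddef.h>

/* Build the key a little-endian host would read from a null-padded
   8-byte ticker buffer: byte i lands in bits 8*i .. 8*i+7.
   Written with shifts so the result is the same on any host.
   The name le_key is illustrative, not from the slides. */
unsigned long long le_key(const char *ticker) {
    unsigned long long key = 0;
    for (size_t i = 0; i < 8 && ticker[i] != '\0'; i++)
        key |= (unsigned long long)(unsigned char)ticker[i] << (8 * i);
    return key;
}
```

Under these assumptions every ticker beginning with "ST" satisfies `le_key(t) % 8192 == 5203`, so `le_key(t) % 24576` can take only three values (5203, 13395, 21587), however many such tickers exist. A table size with no large power-of-two factor lets the later characters influence the bucket as well.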
