AquaQ Analytics Kx Event - Data Direct Networks Presentation


  • Don’t create data copies in local flash/NVRAM.
    Higher cost: capacity, admin, power, space, software licensing.
    No sharing, higher data risk, and long times to consolidate and checkpoint.

    Instead, consolidate to a single system that delivers:
    linear scaling performance,
    a single point of admin,
    and higher density.
    1. Getting the most out of multi-year and multi-source trading history. Glenn Wright, EMEA Systems Architect, DDN, June 2014
    2. © 2013 DataDirect Networks, Inc. Agenda: Uh? Who is DDN? The evolution of data handling in market systems. The Big Analytics Crunch. What’s hot, what’s not. It’s parallel performance, stupid!
    3. DDN | The “Big” in Big Data. 800%: PayPal accelerates stream processing and fraud analytics by 8x with DDN, saving $100Ms. 1 TB/s: the world’s fastest file system, powering the US’s fastest supercomputer, is built on DDN. Tier 1: a Tier 1 CDN accelerates the world’s video traffic using DDN technology to exceed customer SLAs.
    4. DDN | The Technology Behind the World’s Leading Data-Driven Organizations: HPC & big data analysis, cloud & web infrastructure, professional media, security.
    5. Big Data & Cloud Infrastructure: DDN’s award-winning product portfolio.
       Analytics reference architectures.
       EXAScaler™: petascale Lustre® storage; 10Ks of Linux HPC clients; 1 TB/s+; HSM; NFS & CIFS [2014].
       GRIDScaler™: enterprise scale-out file storage; ~10K Linux/Windows HPC clients; 1 TB/s+; HSM; NFS & CIFS.
       SFA™12KX: 48 GB/s, 1.7M IOPS; 1,680 drives in 2 racks; optional embedded computing.
       SFA7700: 12.5 GB/s, 450K IOPS; 60 drives in 4U, 228 drives in 12U.
       Storage Fusion Architecture™ core storage platforms: flexible drive configuration (SATA, SAS, SSD); SFX™ automated flash caching.
       WOS® 3.0: geo-replicated cloud storage; 32 trillion unique objects; 256 million objects/second; self-healing; parallel Boolean search.
       WOS7000: 60 drives in 4U; self-contained servers.
       DirectMon™: big data platform management; cloud tiering.
       Infinite Memory Engine™ [tech preview]: distributed file system buffer cache; adaptive transparent flash cache; SFX API gives users control [pre-staging, alignment, by-pass].
    6. Evolution of market systems. (Chart: exchange trading volumes, 1990–2011; series: TOTAL, Americas, Asia-Pacific. Source: World Federation of Exchanges 2011 Annual Report and Statistics.) Storage evolved alongside: DASD, then scale-out NAS, then parallel file systems.
    7. Underlying issue: gaping performance bottlenecks.
       • Moore’s Law has outstripped improvements in disk drive technology by two orders of magnitude over the last decade.
       • Analytics moved to HPC clusters.
       • Today’s servers are hopelessly unbalanced between the CPU’s need for data and the HDD’s ability to keep up.
       (Chart: HDD vs. CPU relative performance improvement, 2005–2014; the gap approaches 20,000x.)
    8. Welcome to the Big Analytics Crunch.
       • 500 TB to >2 PB of historical data for one TZ.
       • Distributed cache: the online model reads data at 100s of GB/s of IO (tick DB application such as kdb+).
       • 3D “cube” of in-memory distributed data, online, in real time.
       • 100s of services/servers working together in memory: low-latency analytics with the simplicity of persistent file system semantics.
       • Burst-buffer low-latency operation is going mainstream in FSI:
         ► real-time back testing
         ► real-time intra-day risk positioning
    9. Why DDN, and why parallel?
       In production: many systems deployed worldwide at global investment banks and hedge funds.
       Performance and consolidation: a back test that completes in a few seconds is much closer to the trade event; mix online history with real-time trade analytics; consolidate in-memory databases against one copy of data.
       At scale: flash does not scale at capacity; a single namespace holds history and real time.
    10. Limitless scale-up and scale-out with kdb+. (Diagram: a compute fabric of kdb+ instances (1)–(16) connected to a Lustre service: primary and replica MDS with an MDT, plus OSS1–OSS4, backed by two DDN SFA7700 arrays.)
    11. What we changed:

        export SLAVECOUNT=160    # number of kdb+ client tasks
        export CLIENTCOUNT=10    # number of processes per kdb+ server

        q script query:

        \l beforeeach.q
        R1S:rrdextras flip`k`v!(" S*";",")0:`:rrd.csv    / year-hibid output
        t:"YRHIBID";
        fn:{[f;s;d] flip`date`sym`a!flip raze(f each s)peach d};
        NRS:.tasks.rxsg[H;`$t;1;(fn[hb];apickAs[R1S;`Symbol];reverse ALLDATES2011)];
        \l aftereach.q

        Symbols:

        glenn$ head rrd.csv
        1,Symbol,LKQQ
        1,Symbol,LHDE
        1,Symbol,LNJO
        1,Symbol,LLTR
        1,Symbol,LRFC
        1,Symbol,LQGA
        1,Symbol,LTNQ
        1,Symbol,LSAG
        1,Symbol,LQIA
        1,Symbol,LKSJ
        …

        (850 symbols vs. 84)
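For readers less familiar with q, the shape of the query above is a per-date function fanned out to slave tasks with `peach`. A minimal Python sketch of that fan-out pattern follows; `per_date` and `ALL_DATES` are hypothetical stand-ins for the slide’s `fn` and `ALLDATES2011`, not DDN’s actual code.

```python
# Illustrative sketch only (not kdb+): the slide's query applies a
# per-date function across ALLDATES2011 with `peach`, which farms each
# date out to a slave task. A worker pool gives the same shape.
from concurrent.futures import ThreadPoolExecutor

ALL_DATES = [f"2011.01.{d:02d}" for d in range(1, 11)]  # 10 sample dates

def per_date(date):
    # Stand-in for the real per-date aggregation (e.g. year-high bid):
    # a slave would scan that date's partition and return a reduced table.
    return {"date": date, "rows": 0}

# peach-style fan-out: one task per date, results gathered in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(per_date, ALL_DATES))
```

In kdb+ the same shape is expressed declaratively: `(f each s)peach d` maps the symbol loop inside each date, then parallelizes over dates.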
    12. What we changed (2):

        glenn$ more hostport.txt

        # replace: $QEXEC initdb.k -g 1 -p $((baseport+i)) </dev/null &>log$((baseport+i)).log &
        # start a q slave on every server/port pair
        for i in `seq 20000 20009`
        do
          for j in `seq 0 15`
          do
            echo ssh server-$j "cd $HOME; QHOME=/home/glenn/q $HOME/l64/q initdb.k -p $i -g 1 </dev/null &> $i-$j.log &"
            ssh gp-2-$j "cd $HOME; QHOME=/home/mpiuser/q $HOME/l64/q initdb.k -p $i -g 1 </dev/null &> $i-$j.log &"
            while ! nc -z "gp-2-$j" $i; do sleep 0.1; done
          done
        done
        # get ready
        echo `date -u` $SLAVECOUNT slave tasks started

        # then start the servers aimed at the slaves
        baseport=5000
        for ((i=0; i<$CLIENTCOUNT; i++)); do
          $QEXEC initdb.k -g 1 -s -$SLAVECOUNT -p $((baseport+i)) </dev/null &>log$((baseport+i)).log &
          while ! nc -z localhost $((baseport+i)); do sleep 0.1; done
        done

        # check that everything can start up:
        $QEXEC startdb.q -s -$SLAVECOUNT -q
    13. What we changed (3): startdb.q

        / check all servers are there
        /{hopen(x;500)}each("I"$getenv`BASEPORT)+til"I"$getenv`SLAVECOUNT;
        {hopen(x;2500)}each hsym`$read0`:slavehostport.txt;
        \l initdb.k
        {hopen(x;500)}each 5000+til"I"$getenv`CLIENTCOUNT;

        glenn$ cat slavehostport.txt
        …  (160 entries)
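The `hopen` calls in startdb.q double as readiness checks: the script blocks until every slave host:port answers, just as the shell script’s `nc -z` loop does. A hypothetical Python equivalent of that check (the function names are mine, not from the deck):

```python
# Hypothetical analogue of startdb.q's hopen-based readiness check:
# try a TCP connection to each slave's host:port, as the shell
# script's `nc -z` loop and the q script's hopen calls do.
import socket

def port_open(host, port, timeout=0.5):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

def pending_slaves(pairs, timeout=0.5):
    """Return the subset of (host, port) pairs that are not yet up."""
    return [(h, p) for h, p in pairs if not port_open(h, p, timeout)]
```

A launcher would loop on `pending_slaves` (with a short sleep) until it returns an empty list before starting queries.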
    14. (Diagram: q client nodes, slaves (1) through (n), each running 10 slave processes, mount a single Lustre/DDN file system at /mnt/onefilesystem, delivering up to 1 TB/s. Data is placed by “n”-way server striping or by date/sym.)
    15. Results of scaling the service. (Chart: latency in seconds per query, single thread vs. Lustre, axis 0–250; lower is better.) The parallel FS solution shows a near-linear scalability model for one instance running over many nodes, as measured from kdb+. Latency is the time to complete the kdb+ query over 245 GB of data. To put this in context, these nodes were equipped with only 64 GB of memory.
    16. Some of the many benefits of kdb+ on a parallel FS:
        1. A significant decrease in operational latency per kdb+ query, especially for queries that search through large amounts of historical market data. Achieved by balancing content across multiple file system servers.
        2. Parallelization of kdb+ query “threads” in a single shared namespace, letting a user treat any data workload independently of other data workloads: the “query from hell” on a production system is now OK.
        3. Simultaneous read/write operations on a single namespace, for the entire database and for any number of kdb+ clients (e.g. end-of-day data consolidations into an hdb instance).
        4. Sharing of data among different independent hdb/rdb instances. Many instances of kdb+ can view the same data, so strategies for data sharing and private data segments may be consolidated onto the same space. This avoids the need for kdb+ admins to physically copy data around the network or disks.
        5. kdb+ content can be “striped” across all FS servers, or allocated in a round-robin fashion to each server. Striping gives some files the opportunity to attain maximal I/O rates for a single kdb+ “object”.
    17. Next Steps?
    18. Thanks! Glenn Wright, EMEA Systems Architect, DDN, June 2014