Stock Market Order Flow Reconstruction Using HBase on AWS
Aaron Carreras, HBaseCon – San Francisco, May 2015
About Presenter
•  Director of Enterprise Data Platforms at FINRA
•  Data Ingestion, Processing and Management
WHAT DO WE DO?
• Collect and Create
•  33B events/day
•  18 national exchanges
•  Equities, Options and Fixed Income
•  Reconstruct the market from trillions of events spanning years
• Detect & Investigate
•  Identify market manipulations, insider trading, fraud and compliance violations
• Enforce & Discipline
•  Ensure rule compliance 
•  Fine and bar broker dealers
•  Refer matters to the SEC and other authorities
[Diagram: order flow participants – Firm, Exchange, TRF, Dark Pool]
What a stock trade looks like to the investor
Example of what is actually happening
Ingest/Access Patterns
Configurations/Approaches in Common
Logical Architecture
CDH 4.5; HBase 0.94.6; EC2 hs1.8xlarge – 16 vCPU, 117 GiB RAM, 24 drives x 2,000 GB
Row Key Design & Pre-splitting
•  Salt our row keys (see the sketch below)
o  Our “natural” keys are monotonically increasing
o  Row Key = salt(PK) + PK
•  Pre-split
o  Better control of the distribution of data across regions
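A minimal sketch of this scheme, assuming a one-byte hash-based salt and 16 buckets; the bucket count, table name, and column family are illustrative, not the actual production values:

```java
import java.util.Arrays;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.HColumnDescriptor;
import org.apache.hadoop.hbase.HTableDescriptor;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class SaltedKeyExample {
  private static final int BUCKETS = 16;  // illustrative bucket count

  // Row Key = salt(PK) + PK: a deterministic salt spreads monotonically
  // increasing natural keys across all regions instead of hot-spotting the last one.
  static byte[] saltedRowKey(byte[] naturalKey) {
    byte salt = (byte) ((Arrays.hashCode(naturalKey) & 0x7fffffff) % BUCKETS);
    byte[] rowKey = new byte[naturalKey.length + 1];
    rowKey[0] = salt;
    System.arraycopy(naturalKey, 0, rowKey, 1, naturalKey.length);
    return rowKey;
  }

  // Pre-split points: one region per salt bucket.
  static byte[][] splitKeys() {
    byte[][] splits = new byte[BUCKETS - 1][];
    for (int i = 1; i < BUCKETS; i++) {
      splits[i - 1] = new byte[] { (byte) i };
    }
    return splits;
  }

  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);                  // 0.94-era client API
    HTableDescriptor desc = new HTableDescriptor("events");   // hypothetical table name
    desc.addFamily(new HColumnDescriptor("d"));
    admin.createTable(desc, splitKeys());                     // pre-split at bucket boundaries
    admin.close();
  }
}
```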
Compactions & Splitting Configurations

Parameter | Default | Override
hbase.hregion.majorcompaction | 7 days | 0 (disabled)
hbase.hstore.compactionThreshold | 3 | 10
hbase.hstore.compaction.max | 10 | 15
hbase.hregion.max.filesize | 10 GB | 200 GB
RegionSplitPolicy | IncreasingToUpperBoundRegionSplitPolicy | ConstantSizeRegionSplitPolicy
hbase.hstore.useExploringCompation | false | true
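As a rough illustration only (these settings normally live in hbase-site.xml on the region servers, not in client code), the overrides above map to configuration keys roughly like this; the split-policy property name is my assumption of the standard HBase key:

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;

public class CompactionSplitOverrides {
  public static Configuration overrides() {
    Configuration conf = HBaseConfiguration.create();
    // Disable time-based major compactions; trigger them explicitly instead.
    conf.setLong("hbase.hregion.majorcompaction", 0);
    // Tolerate more store files before minor compactions kick in, and
    // compact more files per pass when they do.
    conf.setInt("hbase.hstore.compactionThreshold", 10);
    conf.setInt("hbase.hstore.compaction.max", 15);
    // A very large max region size plus the constant-size split policy
    // effectively stops automatic splits once the table is pre-split.
    conf.setLong("hbase.hregion.max.filesize", 200L * 1024 * 1024 * 1024);
    conf.set("hbase.regionserver.region.split.policy",
        "org.apache.hadoop.hbase.regionserver.ConstantSizeRegionSplitPolicy");
    // Key name as shown on the slide (0.94 backport of the exploring compaction policy).
    conf.setBoolean("hbase.hstore.useExploringCompation", true);
    return conf;
  }
}
```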
OS Configuration Considerations
§  Some of these may not be relevant to you depending on your OS/version, but they are worth confirming

Parameter | Setting
redhat_transparent_hugepage/defrag | never
nofile/nproc ulimit | 32768
tcp_low_latency | 1 (enabled)
vm.swappiness | 0 (disabled)
selinux | disabled
IPv6 | no (disabled)
iptables | off/stop
Other Hadoop Configuration Considerations

Where | Parameter | Setting
core-site.xml | ipc.client.tcpnodelay | true
core-site.xml | ipc.server.tcpnodelay | true
hdfs-site.xml | dfs.client.read.shortcircuit | true
hdfs-site.xml | fs.s3a.buffer.dir | [machine specific]
hbase-site.xml | hbase.snapshot.master.timeoutMillis | 1800000
hbase-site.xml | hbase.snapshot.master.timeout.millis | 1800000
hbase-site.xml | hbase.master.cleaner.interval | 600000 (ms)
Use Case ‘A’: Patterns
Use Case ‘A’: Background
•  Create graphs for historical market event data (trillion records)
•  Basically a batch process
o  Each batch had ~4 billion events
o  Related events may span batches (e.g., the root could arrive later, children may be corrected, etc.)
•  Back process the prior 18 months (540 batches)
•  Complete the project given the … and …
Use Case ‘A’: Utilize Bulk Loads
•  Back processing and the ongoing update process are 100% bulk HFile load (see the sketch below)
•  Our column families and processing aligned with this approach by splitting the linkage and content into separate column families
•  Eliminates Puts completely, along with the WAL writes, memstore flushes, and additional compactions that often accompany them
[Diagram: HFile bulk load flow]
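A hedged sketch of that flow against the 0.94-era bulk-load API (table name, job name, and paths are made up): a MapReduce job writes region-partitioned, sorted HFiles, and LoadIncrementalHFiles then hands them to the region servers atomically, bypassing the write path entirely.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.hbase.KeyValue;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.io.ImmutableBytesWritable;
import org.apache.hadoop.hbase.mapreduce.HFileOutputFormat;
import org.apache.hadoop.hbase.mapreduce.LoadIncrementalHFiles;
import org.apache.hadoop.mapreduce.Job;
import org.apache.hadoop.mapreduce.lib.output.FileOutputFormat;

public class BulkLoadSketch {

  // Phase 1: configure a MapReduce job (mapper omitted here) whose output is a
  // set of HFiles under hfileDir, sorted and partitioned to match the table's regions.
  static Job configureHFileJob(Configuration conf, HTable table, Path hfileDir) throws Exception {
    Job job = Job.getInstance(conf, "event-hfile-generation");  // hypothetical job name
    job.setMapOutputKeyClass(ImmutableBytesWritable.class);
    job.setMapOutputValueClass(KeyValue.class);
    HFileOutputFormat.configureIncrementalLoad(job, table);     // sets reducer, partitioner, output format
    FileOutputFormat.setOutputPath(job, hfileDir);
    return job;
  }

  // Phase 2: move the finished HFiles into the table. No Puts, so no WAL
  // writes, no memstore flushes, and none of the compactions they trigger.
  static void bulkLoad(Configuration conf, HTable table, Path hfileDir) throws Exception {
    new LoadIncrementalHFiles(conf).doBulkLoad(hfileDir, table);
  }
}
```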
Use Case ‘A’: Optimize Gets
•  Used sorted / partitioned batched Gets (see the sketch below)
o  Minimize required RPC calls
o  Sorting lets us make better use of the block cache
•  Allocate more on-heap memory for reads
Parameter | Default | Override
hfile.block.cache.size | .4 | .65
hbase.regionserver.global.memstore.upperLimit | .4 | .15
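A minimal sketch of the batched-Get pattern (partitioning by region is simplified here to fixed-size batches, and the batch size is an illustrative choice):

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedGetSketch {

  // Fetch many rows with few RPCs: sort keys so adjacent Gets hit the same
  // regions and cached blocks, then issue them in bounded multi-get batches.
  static List<Result> fetch(HTable table, List<byte[]> rowKeys, int batchSize) throws Exception {
    Collections.sort(rowKeys, Bytes.BYTES_COMPARATOR);  // HBase byte order
    List<Result> results = new ArrayList<Result>();
    for (int i = 0; i < rowKeys.size(); i += batchSize) {
      List<Get> batch = new ArrayList<Get>();
      for (byte[] key : rowKeys.subList(i, Math.min(i + batchSize, rowKeys.size()))) {
        batch.add(new Get(key));
      }
      Collections.addAll(results, table.get(batch));    // one multi-get round trip per batch
    }
    return results;
  }
}
```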
Use Case ‘B’: Patterns
Use Case ‘B’: Background
•  Not a once-a-day batch process; it must process the data as it arrives
o  200+ business rules covering data validation, creating/breaking linkages, and identifying compliance issues within SLA
o  Progressively build the tree
•  The different processes required different access paths, sometimes requiring multiple copies of some portions of the data
Use Case ‘B’: Put Strategy
•  HFiles for the incremental processing didn’t fit as well here
•  Partitioned batch Puts (see the sketch below)
•  memstore vs. block cache split 50/50
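A sketch of the partitioned batch-Put approach, assuming the salted keys from earlier; the table name, column family, buffer size, and batch contents are all illustrative:

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.util.Bytes;

public class BatchedPutSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "daily_events");  // hypothetical daily table
    table.setAutoFlush(false);                         // buffer writes client-side
    table.setWriteBufferSize(8L * 1024 * 1024);        // illustrative 8 MB write buffer

    // One partition's worth of arriving events, written as a single batch of Puts.
    List<Put> batch = new ArrayList<Put>();
    for (int i = 0; i < 10000; i++) {
      byte[] rowKey = Bytes.toBytes(String.format("%02d-%010d", i % 16, i)); // salt + natural key
      Put put = new Put(rowKey);
      put.add(Bytes.toBytes("d"), Bytes.toBytes("payload"), Bytes.toBytes("event-" + i));
      batch.add(put);
    }
    table.put(batch);
    table.flushCommits();  // push any remaining buffered mutations
    table.close();
  }
}
```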
Use Case ‘B’: Scan
•  Scan
o  Distinct Daily tables along with a single Historical table to more naturally support the processing
o  Scan Daily tables only
o  Switched from Get to Scan for rows with millions of columns (see the sketch below)
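A sketch of why Scan helps for very wide rows: Scan.setBatch caps the number of columns returned per Result, so a row with millions of columns streams back in chunks instead of being materialized by a single Get. Table name, key range, and sizes are illustrative.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HTable;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.util.Bytes;

public class WideRowScanSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HTable table = new HTable(conf, "daily_20150501");  // hypothetical daily table

    Scan scan = new Scan(Bytes.toBytes(args[0]), Bytes.toBytes(args[1]));  // start/stop row
    scan.setCaching(100);  // rows (or row chunks) fetched per RPC
    scan.setBatch(1000);   // at most 1000 columns per Result, so wide rows come back in pieces

    ResultScanner scanner = table.getScanner(scan);
    try {
      for (Result chunk : scanner) {
        // process up to 1000 cells of the current row per iteration
      }
    } finally {
      scanner.close();
      table.close();
    }
  }
}
```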
Backup and DR
HBase Backup to S3
•  HBase ExportSnapshot to S3 didn’t really support our use case
•  Significant updates to ExportSnapshot for S3 (the overall backup flow is sketched below)
o  Support for S3A (HADOOP-10400)
o  Remove the expensive rename operation on S3 (HBASE-11119)
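Roughly, the backup path looks like this (a sketch; snapshot, table, and bucket names are made up). The snapshot itself is taken through the admin API and only references existing HFiles; the patched ExportSnapshot tool then copies those files to S3 over s3a as a MapReduce job.

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.HBaseAdmin;

public class SnapshotBackupSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = HBaseConfiguration.create();
    HBaseAdmin admin = new HBaseAdmin(conf);

    // Point-in-time snapshot of the table; no data is copied at this step.
    admin.snapshot("events_20150501", "events");
    admin.close();

    // The copy to S3 is then done with the (patched) ExportSnapshot tool,
    // typically invoked as a MapReduce job, e.g.:
    //   hbase org.apache.hadoop.hbase.snapshot.ExportSnapshot \
    //     -snapshot events_20150501 -copy-to s3a://backup-bucket/hbase
  }
}
```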
Disaster Recovery
o  AWS provides multiple Availability Zones (AZs) in different geographic regions
o  HBase snapshots are backed up to S3 and to a separate cluster in a different AZ
o  S3 buckets are backed up from one region to another for cross-region redundancy
Running Hadoop on AWS
Lessons Learned
Running Hadoop on AWS
•  S3
o  For now at least, s3a is probably the file system implementation you want to use (if you are not using EMR)
o  Rename is not a logical (metadata-only) operation on S3 and is therefore expensive
o  Eventual consistency should be accounted for
o  Consider turning on S3 versioning
•  Instance Types / Topology
o  The number of virtual instances on a single physical host impacts fault tolerance
o  Tradeoff between network performance and availability/capacity
•  Region – Availability Zone – Placement Group
o  Be aware that Availability Zone identifiers are intentionally inconsistent across accounts
Questions?
