Jason Timmes led the migration of the primary data warehouse for Nasdaq's Transaction Services U.S. business unit (which operates Nasdaq's U.S. equity and options exchanges) from a traditional on-premises MPP database to Amazon Redshift. The project significantly reduced operational expenses. Jason, who is an Associate Vice President of Software Development at Nasdaq, describes how his team migrated a warehouse that loads approximately 7 billion rows a day into the cloud, satisfied several security and regulatory audits, optimized read and write performance, ensured high availability, and orchestrated other back-office activities that depend on the warehouse's daily loads completing. Along with sharing several technical lessons learned, Jason will discuss Nasdaq's roadmap for integrating Redshift with more AWS services, as well as with more Nasdaq products, to offer even greater benefit to clients (internal and external) in the months ahead.

(FIN401) Seismic Shift: Nasdaq's Migration to Amazon Redshift | AWS re:Invent 2014

  1. We make the world’s capital markets move faster, more efficient, more transparent. Public company in S&P 500. Develop and run markets globally in all asset classes. We provide technology, trading, intelligence and listing services. Intense operational focus on efficiency and competitiveness. We provide the infrastructure, tools and strategic insight to help our customers navigate the complexity of global capital markets and realize their capital ambitions. Get to know us: we have uniquely transformed our business from predominately a U.S. equities exchange to a global provider of corporate, trading, technology and information solutions.
  2. Leading index provider with 41,000+ indexes across asset classes and geographies. Over 10,000 corporate clients in 60 countries. Our technology powers over 70 marketplaces, regulators, CSDs and clearinghouses in over 50 countries. 100+ data product offerings supporting 2.5+ million investment professionals and users in 98 countries. 26 markets, 3 clearing houses, 5 central securities depositories. Lists more than 3,500 companies in 35 countries, representing more than $8.8 trillion in total market value.
  3. Our warehouse is used to analyze market share and client activity, support surveillance, power our billing, and more…
  4. • Idempotence: a quality of an action such that repetitions of the action have no further effect on the outcome
       – In other words, f(x) = f(f(x)) = f(f(f(x))), etc.
     • The ingest process is designed as a workflow engine, with each step in each workflow being idempotent.
     • Failures are easily recovered by repeating the failed step after resolving the root cause of the failure.
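The retry behavior described on this slide can be sketched in a few lines of Python. This is a minimal illustration, not Nasdaq's workflow engine: the `run_workflow` function, the step names, and the `completed` dictionary are all hypothetical. Each step is skipped if it is already recorded as complete, so rerunning the whole workflow after a failure repeats only the work that did not finish.

```python
from typing import Callable, Dict, List, Tuple

# A step is a (name, action) pair; both names are illustrative only.
Step = Tuple[str, Callable[[], None]]


def run_workflow(steps: List[Step], completed: Dict[str, bool]) -> Dict[str, bool]:
    """Run each step at most once, recording progress in `completed`.

    If an action raises, the exception propagates and the step stays
    unmarked, so a later call resumes from the failed step.
    """
    for name, action in steps:
        if completed.get(name):
            continue  # already done: repeating the workflow has no further effect
        action()
        completed[name] = True
    return completed
```

Calling `run_workflow` again with the same `completed` record after fixing the root cause of a failure leaves finished steps untouched, which is the f(x) = f(f(x)) property in practice.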
  5. • Use a manifest file inside a transaction with a table lock, and keep a record of completed ingests
     • If the S3 COPY (insert) fails, roll back the transaction
     • If the insert succeeds, write a record of the completed ingest, and commit the transaction
     • Idempotence: start transaction, lock destination table, check for prior successful ingest, and only start insert if data hasn’t already been loaded today
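The transaction pattern on this slide can be sketched as follows. This is a runnable stand-in, not the production code: it uses Python's built-in `sqlite3` in place of Redshift, `BEGIN IMMEDIATE` in place of an explicit table lock, and a caller-supplied `copy_fn` in place of the S3 COPY. The `ingest_once` function and the `table_ingest_status` schema are assumptions made for illustration.

```python
import sqlite3
from typing import Callable


def ingest_once(conn: sqlite3.Connection, table: str, load_date: str,
                copy_fn: Callable[[sqlite3.Cursor], None]) -> bool:
    """Load data for (table, load_date) at most once.

    Returns True if the load ran, False if it had already been recorded.
    `conn` must be opened with isolation_level=None so that the explicit
    BEGIN/COMMIT/ROLLBACK statements below control the transaction.
    """
    cur = conn.cursor()
    # Stand-in for "start transaction, lock destination table".
    cur.execute("BEGIN IMMEDIATE")
    try:
        cur.execute(
            "SELECT 1 FROM table_ingest_status WHERE table_name = ? AND load_date = ?",
            (table, load_date),
        )
        if cur.fetchone() is not None:
            cur.execute("ROLLBACK")  # nothing to do: already loaded
            return False
        copy_fn(cur)  # stand-in for the S3 COPY using a manifest
        cur.execute(
            "INSERT INTO table_ingest_status (table_name, load_date) VALUES (?, ?)",
            (table, load_date),
        )
        cur.execute("COMMIT")
        return True
    except Exception:
        cur.execute("ROLLBACK")  # failed load leaves no rows and no status record
        raise
```

Because the status record and the loaded rows commit together, a retry after a failure either sees no record and reloads, or sees the record and does nothing.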
  6. • Pay close attention to the mandatory flag in the manifest!
     • Redshift UNLOAD always sets this to false!!!
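A COPY manifest is a JSON document with an `entries` list, where each entry has a `url` and an optional `mandatory` flag; when `mandatory` is true, COPY fails if that file is missing. The sketch below shows one way to build a manifest with the flag set, and to rewrite a manifest (for example, one produced by UNLOAD) so every entry is mandatory. The helper names `build_manifest` and `force_mandatory` are hypothetical.

```python
import json
from typing import Iterable


def build_manifest(urls: Iterable[str], mandatory: bool = True) -> str:
    """Return manifest JSON listing each S3 URL with the given mandatory flag."""
    entries = [{"url": url, "mandatory": mandatory} for url in urls]
    return json.dumps({"entries": entries}, indent=2)


def force_mandatory(manifest_text: str) -> str:
    """Return a copy of an existing manifest with mandatory=true on every entry."""
    manifest = json.loads(manifest_text)
    for entry in manifest["entries"]:
        entry["mandatory"] = True
    return json.dumps(manifest, indent=2)
```

With every entry marked mandatory, a missing load file causes the COPY to fail rather than silently loading a partial data set.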
  7. • TableIngestStatus
       – We originally put this table in Redshift itself
       – Turns out Redshift is not efficient on really small data sets
       – Significantly impacted performance, and increased concurrency contention
     • Solution: Moved TableIngestStatus to a separate transactional RDBMS (MySQL)
       – We were already using a MySQL instance to persist workflow states
  8. • Multiple layers of security
       – Direct Connect (private lines)
       – VPC
       – HTTPS/SSL/TLS (Encryption in flight)
       – AES-256 (Encryption at rest in S3)
       – Redshift encryption (Encryption at rest in Redshift)
       – HSM integration (Redshift master key managed on premise)
       – CloudTrail/STL_CONNECTION_LOG to monitor for unauthorized DB connections
  9. • Direct Connect
       – No company data travels over internet circuits
     • VPC
       – Isolate our Redshift servers from other tenants/internet connectivity
       – Security Groups restrict inbound/outbound connectivity
  10. • All AWS API calls are made over HTTPS
      • All Redshift JDBC connections must use SSL/TLS
        – Parameter Group: require_ssl = true
        – Use Redshift cluster SSL certificate to verify cluster identity
      • See support.html for details
  11. • All Redshift load files staged in S3 are AES-256 encrypted (client side, not S3 SSE)
        – Key is provided to Redshift in the S3 COPY command:
          copy nbbo from 's3://my_ingest/2014-09-17/nbbo.manifest' credentials 'aws_access_key_id=<access-key-id>;aws_secret_access_key=<secret-access-key>;master_symmetric_key=<master_key>' manifest encrypted gzip;
      • Enable cluster encryption on Redshift
        – Only specified during cluster creation, cannot be changed
        – Applies to backups/snapshots as well
        – Performance penalty, but not optional for Nasdaq
  12. • Redshift will store the cluster key in a single customer-premise HSM (or CloudHSM)
        – SafeNet Luna SA HSM, firmware version should match CloudHSM
        – Requires certificate exchange between cluster and HSM
        – Requires cluster have an EIP
          • On our side, required static 1-to-1 NAT of HSM private IP
          • VPC Security Groups still apply; can still isolate cluster from others
        – Encrypted database key decrypted in HSM, passed over encrypted channel to cluster on startup, stored in memory to decrypt data encryption (block) keys
        – If running an HSM HA group, must synchronize keys after creation
  13. • HSM integration was critical to Nasdaq adoption
      • Monitor cluster access, react to any unauthorized connections
        – STL_CONNECTION_LOG
          • Query system table on a timed basis, alert to any unexpected access
        – CloudTrail to Splunk, Redshift connection & user logs
          • Captures all API calls, not activity inside Redshift
        – STL_DDLTEXT
          • Audits all schema changes in the cluster
      • In response to an alert, Redshift/HSM connectivity is severed, and cluster is immediately shut down
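The timed check against STL_CONNECTION_LOG could look roughly like the sketch below. It is an illustration under stated assumptions, not the monitoring Nasdaq runs: the query text, the allowlists, and the `find_unexpected` helper are hypothetical, and the sample rows in any usage would come from a real database cursor. The columns used (`event`, `recordtime`, `remotehost`, `username`) are columns of STL_CONNECTION_LOG.

```python
from typing import Iterable, List, Set, Tuple

# Illustrative query; a scheduler would run it with the time of the last check.
CONNECTION_LOG_QUERY = (
    "SELECT event, recordtime, remotehost, username "
    "FROM stl_connection_log WHERE recordtime > %s ORDER BY recordtime"
)

Row = Tuple[str, str, str, str]  # (event, recordtime, remotehost, username)


def find_unexpected(rows: Iterable[Row], allowed_hosts: Set[str],
                    allowed_users: Set[str]) -> List[Row]:
    """Return the log rows whose host or user is not on the allowlists."""
    alerts = []
    for event, recordtime, remotehost, username in rows:
        # Character columns in system tables can carry trailing padding.
        host, user = remotehost.strip(), username.strip()
        if host not in allowed_hosts or user not in allowed_users:
            alerts.append((event.strip(), recordtime, host, user))
    return alerts
```

Any non-empty result would feed the alerting path described on the slide, where connectivity to the HSM is severed and the cluster is shut down.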
  14. • With validation, data integrity, and security requirements met, the challenge remains to optimize ingest
      • Why?
        – Concurrency is a huge performance factor; can’t afford to be loading yesterday’s data when clients are running queries
  15. [Chart: S3 (over HTTPS) multithreaded throughput. X-axis: concurrent threads (1, 2, 4, 6, 8, 10, 12, 14, 16, 18); y-axis: throughput (MB/sec), 0 to 140.]
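The chart plots S3 throughput over HTTPS against the number of concurrent threads. A minimal sketch of uploading load files with a thread pool follows, assuming the files are independent and the upload call is thread-safe. The `upload_all` function and the injected `upload_fn` are hypothetical; in practice `upload_fn` would wrap a real S3 client call.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, Dict, Tuple


def upload_all(files: Dict[str, bytes], upload_fn: Callable[[str, bytes], None],
               threads: int) -> Tuple[int, float]:
    """Upload every file using `threads` workers.

    Returns (total bytes uploaded, elapsed seconds), from which MB/sec
    can be computed for a given thread count.
    """
    start = time.perf_counter()
    with ThreadPoolExecutor(max_workers=threads) as pool:
        # list() drains the iterator so any upload exception is raised here.
        list(pool.map(lambda item: upload_fn(item[0], item[1]), files.items()))
    elapsed = time.perf_counter() - start
    total_bytes = sum(len(body) for body in files.values())
    return total_bytes, elapsed
```

Running the same file set with different `threads` values and comparing bytes per second is one way to reproduce a throughput curve like the one charted.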
  16. [Architecture diagram. Labels: On premise; AWS Regional (Multi-AZ) Scope; AWS (US-East, primary AZ/VPC); S3; SNS; Redshift Database Cluster; HSM Key Appliance Cluster; MySQL; Redshift Load files/Manifests; Redshift Snapshots/Backups; Data Loaded Topic; RMS; Input Sources (multiple systems); Data Ingest Process]
