Large-scale enterprise migration can be a complex undertaking, especially for organizations that re-architect solutions to leverage the benefits of the Cloud. FINRA, which regulates US equities and options markets, recently completed a 2.5-year migration and re-architecture of its Big Data platform. Their platform consumes billions of market events every day. FINRA has developed scalable platforms and services on AWS that enable migrating enterprise applications and business functions to the Cloud quickly. Their data management platform takes advantage of AWS storage and compute products. In this session, IT influencers and decision makers will learn lessons from FINRA’s migration, including how to create an enterprise-class Cloud architecture and which technology skills are required for transitioning to the Cloud. We also share examples of the business value FINRA has realized.
2. What to Expect from the Session
• FINRA’s Enterprise Class cloud architecture
• Business Value FINRA has realized from cloud migration
• Technology skillsets required
• Tools (data management) and processes required
• Other (unexpected) benefits from cloud migration
• View from AWS: partnership and platform evolution
4. Data Is Central to Our Mission
Reconstruct the market from trillions of events
• Data from broker-dealers and exchanges
• Equities, Options, Fixed Income
• Build a graph of market order events
Analyze the data looking for financial fraud
• Insider trading, layering, cross-product manipulation, front running & many more
• Looking for a needle in a haystack
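The "graph of market order events" idea can be illustrated with a minimal sketch. The event records, field layout, and event types below are hypothetical and are not taken from FINRA's data model; the sketch only shows how child events might be linked to the order that spawned them and then walked as one lifecycle.

```python
from collections import defaultdict

# Hypothetical event records: (event_id, parent_event_id, event_type).
# Real market data would carry many more fields (firm, symbol, timestamp, ...).
EVENTS = [
    ("e1", None, "new_order"),
    ("e2", "e1", "route"),
    ("e3", "e2", "execution"),
    ("e4", "e1", "cancel"),
    ("e5", None, "new_order"),
]


def build_graph(events):
    """Return (children, roots): adjacency lists keyed by parent id, plus root ids."""
    children = defaultdict(list)
    roots = []
    for event_id, parent_id, _event_type in events:
        if parent_id is None:
            roots.append(event_id)
        else:
            children[parent_id].append(event_id)
    return children, roots


def lifecycle(children, root):
    """Depth-first walk returning every event reachable from one root order."""
    stack, seen = [root], []
    while stack:
        node = stack.pop()
        seen.append(node)
        # Reverse so that children are visited in their original order.
        stack.extend(reversed(children.get(node, [])))
    return seen


children, roots = build_graph(EVENTS)
for root in roots:
    print(root, "->", lifecycle(children, root))
```

Each root order expands into the full chain of routes, executions, and cancels derived from it, which is the unit a surveillance pattern would then be evaluated against.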
5. Volume Challenges
Market volumes are volatile and steadily increasing
Exchanges are dynamically evolving
Regulatory landscape is changing
Market manipulators innovate
7. Pain Points
Does not scale well as volumes and workloads increase
Duplication of effort in data management (data lifecycle, retention, versioning, etc.)
Data sync issues – manual effort to keep data in sync
Challenges to run analytics across fragmented data
Costly system maintenance and upgrades
8. Summary of Cloud Drivers: The Problems
• Fast-growing data volumes YoY
• High cost of pre-building for peak
• Escalating costs of in-house technology infrastructure
• Appliance platforms were facing obsolescence and end-of-life as a result of new Big Data technologies
Keep spending on infrastructure or redirect dollars to core business (financial regulation)?
12. FINRA’s AWS Architecture
[Architecture diagram; labels recovered from the slide]
On-premises data center: NAS, FTP, Incoming Files
Platform functions shown: Validation, Normalization, Linkage, Data Management, Data Analytics
AWS services shown (VPC): Amazon EC2, Amazon S3, Amazon Glacier, Amazon Redshift, Amazon EMR, Amazon RDS, Machine Learning, AWS KMS
13. FINRA Usage Statistics on AWS
30k+ EC2 nodes per day
93%+ of EC2 usage is EMR-based (mostly Spot)
20 PB+ storage (Amazon S3, Amazon Glacier)
60% PROD, 25% QC/UAT, 15% DEV
Node lifecycle:
o 50%: under 2h
o 35%: 2h to 5h
o 15%: over 5h
[Chart: Node Distribution for June 19-25 (~32k/day). Y-axis: 0 to 40,000 nodes. Series: Redshift; Web, App & RDS; Hadoop/Spark. Data labels: 31,044; 35,444; 32,919; 36,916; 29,330; 25,935; 20,523]
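As a rough illustration of how volatile daily capacity is, the seven data labels from the chart can be summarized with a few lines of arithmetic. This is a sketch under an assumption: it treats the labels as daily node totals for June 19-25 in order, which the extracted slide text does not confirm, so the computed mean need not match the "~32k/day" figure in the chart title.

```python
from statistics import mean

# Data labels recovered from the chart, ASSUMED to be daily node totals
# for June 19-25 in chronological order.
DAILY_NODES = [31044, 35444, 32919, 36916, 29330, 25935, 20523]

average = mean(DAILY_NODES)
peak = max(DAILY_NODES)
trough = min(DAILY_NODES)
peak_to_trough = peak / trough

print(f"mean nodes/day:   {average:,.0f}")
print(f"peak / trough:    {peak:,} / {trough:,}")
print(f"peak-to-trough:   {peak_to_trough:.2f}x")
```

A swing of this size within a single week is the kind of variability that makes pre-building for peak expensive on fixed infrastructure.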
15. FINRA’s Use of VPC is Highly Secure and Auditable
• Network security even more tightly controlled than traditional data centers (i.e., “micro-segmentation”)
• Encrypt non-public data both in-motion and at-rest
• AWS IAM function with fine-grained entitlements and SoD integrated with FINRA’s existing IAM processes
• Comprehensive audit trail – AWS CloudTrail & Amazon CloudWatch
• Custom AWS compliance reporting system to ensure “identity perimeter”
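One common way to encrypt data at rest in Amazon S3 is server-side encryption with an AWS KMS key. The sketch below is not FINRA's implementation; it only builds the request parameters for the S3 `PutObject` call, and the bucket, object key, and KMS key identifier are hypothetical placeholders.

```python
def encrypted_put_kwargs(bucket, key, body, kms_key_id):
    """Build S3 PutObject parameters that request SSE-KMS encryption at rest."""
    return {
        "Bucket": bucket,
        "Key": key,
        "Body": body,
        # Ask S3 to encrypt the object with a KMS-managed key.
        "ServerSideEncryption": "aws:kms",
        "SSEKMSKeyId": kms_key_id,
    }


# Hypothetical names, for illustration only.
kwargs = encrypted_put_kwargs(
    bucket="example-market-data",
    key="orders/2016-06-19/part-0000",
    body=b"example payload",
    kms_key_id="alias/example-data-key",
)
print(sorted(kwargs))
```

With boto3 installed and credentials configured, these parameters would be passed as `boto3.client("s3").put_object(**kwargs)`; the HTTPS endpoint used by the client covers the in-motion half of the requirement.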
16. AWS Compliance & Certifications
AWS Foundation Services: Compute, Storage, Database, Networking
AWS Global Infrastructure: Regions, Availability Zones, Edge Locations
Compliance programs shown: GxP, ISO 13485, AS9100, ISO/TS 16949
Source: Amazon Web Services
17. Benefits
Improved performance (from minutes to seconds)
Ability to expand and contract (up to 40K EC2 instances are provisioned daily)
No more tech refreshes, patching, etc.
Lower cost of DR & Reg SCI testing
Superior data protection compared to in-house solution
Redirect focus and dollars to core business
18. Other (Unexpected) Benefits
Easier Data Access – no silos
• All data in one place
• Faster data discovery
• New forms of data exploration
Innovation & Engaged Staff
• Transformation from infrastructure ops to DevOps
• New technologies, new skills, challenging yet very clear goals
• Easier to try new things and innovate
20. FINRA’s Future Plans
• Migrate the remaining applications to the Cloud by 2018
• Hundreds of relational databases
• Hundreds of applications
• High degree of inter-application connectivity (messaging, workflow, data replication)
• Shut down data center operation by end of 2018
21. Key Takeaways
• Develop a compelling business case: sell it to your stakeholders and to your team
• Make sure to get security right
• Focus on your data strategy
• Pay attention to variable infrastructure cost
• Partner with Cloud/Big Data vendors for staffing needs
• Innovate and transform as part of Cloud journeys
22. Summary
• The original promise of the Cloud for FINRA (cost & performance) has been realized
• Other unplanned benefits
• superior data protection
• democratization of data
• catalyst for innovation
• Migrating the remainder of portfolio by end of 2018
24. Enterprise Account Engagement Model
• AWS Account Team Role:
o Assist FINRA in architecting AWS services for Cloud
o Support Proof of Concepts (POCs) to accelerate migration
o Help FINRA understand and influence product roadmaps
• AWS Teams Engaged:
o Account Management
o Solutions Architecture
o Support and Technical Account Management (TAMs)
o Technical Delivery Management (TDM)
o Professional Services
o AWS Service Teams / Engineers
25. AWS Services That FINRA Has Requested
• Broad impact across multiple services
• Identity and access management
o Long-lived federation tokens
• Cross-region data replication (CRR) for S3:
o Copy important data to another region for catastrophic DR
o FINRA requested Data Encryption, other enhancements
• Database Migration Service (DMS):
o Input on DMS roadmap / features
o Early adopter for Oracle-Postgres migration (session DAT302)
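The cross-region replication item above, with replication of encrypted objects, can be sketched as an S3 replication configuration. This is an illustrative configuration, not FINRA's; the role ARN, bucket ARN, KMS key ARN, and rule ID are hypothetical placeholders, and versioning is assumed to be enabled on both the source and destination buckets.

```python
import json


def replication_config(role_arn, dest_bucket_arn, replica_kms_key_arn):
    """Build an S3 ReplicationConfiguration that also replicates SSE-KMS objects."""
    return {
        "Role": role_arn,  # IAM role that S3 assumes to replicate objects
        "Rules": [
            {
                "ID": "dr-replication",  # hypothetical rule name
                "Priority": 1,
                "Status": "Enabled",
                "Filter": {"Prefix": ""},  # empty prefix: apply to all objects
                "DeleteMarkerReplication": {"Status": "Disabled"},
                # Opt in to replicating objects encrypted with KMS keys.
                "SourceSelectionCriteria": {
                    "SseKmsEncryptedObjects": {"Status": "Enabled"}
                },
                "Destination": {
                    "Bucket": dest_bucket_arn,
                    # Key in the destination region used to re-encrypt replicas.
                    "EncryptionConfiguration": {
                        "ReplicaKmsKeyID": replica_kms_key_arn
                    },
                },
            }
        ],
    }


config = replication_config(
    role_arn="arn:aws:iam::111122223333:role/example-replication-role",
    dest_bucket_arn="arn:aws:s3:::example-dr-bucket",
    replica_kms_key_arn="arn:aws:kms:us-west-2:111122223333:key/example-key-id",
)
print(json.dumps(config, indent=2))
```

With boto3, a dictionary of this shape would be supplied as the `ReplicationConfiguration` argument of `put_bucket_replication` on the source bucket.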
26. Biggest Impact: EMR Enhancements
• Enhanced Hive / EMRFS support
• Presto performance improvements within EMR
• HBase on S3 (STG308 session):
o Separate storage & compute – data in S3 vs. persistent HS1 cluster
o Improved resiliency (RTO for cluster restart, S3 backup/replication)
o Improved cost performance (run less expensive nodes, no longer storage constrained)
o Scale cluster up and down with demand
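Separating storage from compute for HBase on EMR is typically expressed through cluster configuration classifications that point the HBase root directory at S3. The sketch below builds such a configuration list; the S3 path is a hypothetical placeholder, and the settings actually used by FINRA are not given in the slides.

```python
def hbase_on_s3_configurations(root_dir):
    """Build EMR configuration classifications that store HBase data in S3."""
    if not root_dir.startswith("s3://"):
        raise ValueError("root_dir must be an s3:// URI")
    return [
        # Point the HBase root directory at S3 instead of cluster-local HDFS.
        {"Classification": "hbase-site", "Properties": {"hbase.rootdir": root_dir}},
        # Tell EMR that HBase runs in S3 storage mode.
        {"Classification": "hbase", "Properties": {"hbase.emr.storageMode": "s3"}},
    ]


# Hypothetical bucket and prefix, for illustration only.
configurations = hbase_on_s3_configurations("s3://example-bucket/hbase-root")
for item in configurations:
    print(item["Classification"], item["Properties"])
```

With boto3, this list would be passed as the `Configurations` argument of the EMR `run_job_flow` call. Because the data lives in S3, a cluster can be terminated and relaunched against the same root directory, which is what allows cheaper nodes and resizing with demand.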
29. Related Sessions
FINRA Sessions:
• BDM203 – Building a Secure Data Science Platform
• DAT302 – Best Practices for Migrating to RDS / Aurora
• CMP316 – Aligning Billions of Time Ordered Events with Spark
• SVR202 – What’s new with AWS Lambda
• STG308 – FINRA’s Scalable Big Data Architecture on S3