Supporting Financial Services
With a More Flexible Approach
to Big Data
October 21, 2014
WWW.WANDISCO.COM
REALIZING THE POSSIBILITIES OF BIG DATA
Our Presenters
Justin Sears is a Product Marketing Manager at Hortonworks, where he writes stories about how enterprise customers use Apache Hadoop to solve big data business challenges. He also manages product launch marketing and campaign content for Hortonworks. For seventeen years, Justin has led teams in Silicon Valley to create and position enterprise software, risk-controlled consumer banking products, desktop and mobile web properties, and services for Latino customers in the US and Latin America. He lives with his family in his native San Francisco Bay Area.
Brett Rudenstein has an extensive background in Application Lifecycle Management, High Performance Computing and Open Source Software Analysis. He has held senior sales engineering and management positions at Rational Software, PureAtria, IBM, Appistry and Palamida. Throughout his career, he has enabled organizations to accelerate technology adoption by understanding their needs and providing just-in-time business solutions. As WANdisco Director of Product Management for Big Data, Brett works with partners, prospects and customers to help them understand and evolve the requirements for enterprise-ready Hadoop.
Page 3 © Hortonworks Inc. 2011 – 2014. All Rights Reserved
Hortonworks
We Do Hadoop
Our Mission:
Power your Modern Data Architecture
with HDP and Enterprise Apache Hadoop
Who we are
June 2011: Original 24 architects, developers, operators of Hadoop from Yahoo!
June 2014: An enterprise software company with 420+ Employees
Key Partners
Our model
Innovate and deliver Apache Hadoop as a complete enterprise data platform
completely in the open, backed by a world class support organization
Fastest growing Fortune 1000 customer base
Customer Momentum
•  300+ customers in seven quarters, growing at 75+/quarter
•  Two thirds of customers come from F1000
•  100% renewal rate
Experience at Scale
•  80,000 nodes under contract
•  Largest Cluster in North America: 32,000 Nodes
•  Largest Cluster in Europe: 1,000 Nodes
•  Largest Known Cluster in APAC: 400 Nodes
•  30+ customers migrated from other distributions
Some notable migrations include many of the early adopters of Hadoop:
Hortonworks: A Leader In Hadoop
The Forrester Wave™: Big Data Hadoop Solutions, Q1 2014
“Hortonworks loves and lives
open source innovation”
Vision & Execution for Enterprise Hadoop.
Hortonworks leads with a strong strategy and roadmap for open source innovation
with Hadoop and a strong delivery of that innovation in Hortonworks Data Platform.
World Class Support and Services.
Hortonworks' Customer Support received a maximum score
and was significantly higher than both Cloudera and MapR.
Key Strategic Partnerships.
Hortonworks’ unique strategic partnerships with Microsoft, SAP, Teradata and others
are a key strength as part of its overall strategy of ecosystem partnership to
accelerate Hadoop adoption in the enterprise.
HDP IS Apache Hadoop
There is ONE Enterprise Hadoop: everything else is a vendor derivation
HDP
•  Reliable
•  Consistent
•  Current
Enabling a Modern Data Architecture
with HDP and Apache Hadoop
Hortonworks. We do Hadoop.
APPLICATIONS
DATA SYSTEM
Business
Analytics
Custom
Applications
Packaged
Applications
Traditional systems under pressure
•  Silos of Data
•  Costly to Scale
•  Constrained Schemas
Clickstream
Geolocation
Sentiment, Web Data
Sensor, Machine Data
Unstructured docs, emails
Server logs
SOURCES
Existing Sources
(CRM, ERP,…)
RDBMS EDW MPP
New Data Types
…and difficult to
manage new data
Traditional Hadoop, challenges & limitations
[Diagram: cluster nodes 1 through N]
HDFS
(Hadoop Distributed File System)
MapReduce
Largely Batch Processing
SOURCES
EXISTING Systems
Clickstream
Web & Social
Geolocation
Sensor & Machine
Server Logs
Unstructured
Architectural Limitations
•  Single-purpose clusters, specific data sets
•  Primarily a batch system using MapReduce
Enterprise Challenges
•  Limited enterprise capabilities: Operations, Security & Governance
•  Created additional silos
Interoperability Challenges
•  Difficult to natively integrate existing applications
Commercial add-ons opportunistically emerged in the early days to address these shortcomings
APPLICATIONS
DATA SYSTEM
Business
Analytics
Custom
Applications
Packaged
Applications
RDBMS EDW MPP
2006   2009
[Diagram: cluster nodes 1 through N]
HDFS (Hadoop Distributed File System)
MapReduce
Largely Batch Processing
Hadoop w/ MapReduce
YARN: Data Operating System
[Diagram: cluster nodes 1 through N]
HDFS (Hadoop Distributed File System)
Hadoop2 & YARN based Architecture
Siloed clusters
Largely batch system
Difficult to integrate
MR-279: YARN
Hadoop 2 & YARN
Batch, Interactive, Real-Time
Architected & led development of YARN to enable the Modern Data Architecture
October 23, 2013
HDP2 and YARN enable the Modern Data Architecture
Hortonworks architected and led development of YARN
Common data set, multiple applications
•  Optionally land all data in a single cluster
•  Batch, interactive & real-time use cases
•  Support multi-tenant access, processing
& segmentation of data
YARN: Architectural center of Hadoop
•  Consistent security, governance & operations
•  Ecosystem applications certified by Hortonworks to run natively in Hadoop
SOURCES
EXISTING Systems
Clickstream
Web & Social
Geolocation
Sensor & Machine
Server Logs
Unstructured
APPLICATIONS
DATA SYSTEM
Business
Analytics
Custom
Applications
Packaged
Applications
RDBMS EDW MPP
YARN: Data Operating System
[Diagram: cluster nodes 1 through N]
HDFS
(Hadoop Distributed File System)
Batch, Interactive, Real-Time
A Blueprint for Enterprise Hadoop
GOVERNANCE & INTEGRATION: Load data and manage according to policy
OPERATIONS: Deploy and effectively manage the platform
DATA MANAGEMENT: Store and process all of your Corporate Data Assets
DATA ACCESS: Access your data simultaneously in multiple ways (batch, interactive, real-time)
SECURITY: Provide layered approach to security through Authentication, Authorization, Accounting, and Data Protection
Enable both existing and new applications to
provide value to the organization
PRESENTATION & APPLICATION
Empower existing operations and
security tools to manage Hadoop
ENTERPRISE MGMT & SECURITY
Provide deployment choice across physical, virtual, cloud
DEPLOYMENT OPTIONS
YARN Data Operating System
Hortonworks Data Platform 2.2
HDP Delivers Enterprise Hadoop
YARN: Data Operating System
(Cluster Resource Management)
Script
Pig
SQL
Hive
Tez
Tez
Java
Scala
Cascading
Tez
HDFS
(Hadoop Distributed File System)
Stream
Storm
Search
Solr
NoSQL
HBase
Accumulo
Slider
 Slider
SECURITY
GOVERNANCE
OPERATIONS
BATCH, INTERACTIVE & REAL-TIME DATA ACCESS
In-Memory
Spark
Provision,
Manage &
Monitor
Ambari
Zookeeper
Scheduling
Oozie
Data Workflow,
Lifecycle &
Governance
Falcon
Sqoop
Flume
Kafka
NFS
WebHDFS
Authentication
Authorization
Accounting
Data Protection
Storage: HDFS
Resources: YARN
Access: Hive, …
Pipeline: Falcon
Cluster: Knox
Deployment Choice: Linux, Windows, On-Premises, Cloud
YARN is the architectural
center of HDP
•  Common data set across all
applications
•  Batch, interactive & real-time
workloads
•  Multi-tenant access & processing
Provides comprehensive
enterprise capabilities
•  Governance
•  Security
•  Operations
Enables broad
ecosystem adoption
•  ISVs can plug directly into Hadoop
The widest range of deployment options
•  Linux & Windows
•  On-premises & cloud
Others
ISV
Engines
The Modern Data Architecture w/ HDP
Clickstream
Capture and analyze
website visitors’ data
trails and optimize
your website
Sensors
Discover patterns in
data streaming
automatically from
remote sensors and
machines
Server Logs
Research logs to
diagnose process
failures and prevent
security breaches
Hadoop Value: New Types of Data
Sentiment
Understand how
your customers feel
about your brand
and products –
right now
Geographic
Analyze location-
based data to
manage operations
where they occur
Unstructured
Understand patterns
in files across millions
of web pages, emails,
and documents
New analytic applications for new types of data
Financial Services
•  New Account Risk Screens
•  Fraud Prevention
•  Trading Risk
•  Maximize Deposit Spread
•  Insurance Underwriting
•  Accelerate Loan Processing
Retail
•  360° View of the Customer
•  Analyze Brand Sentiment
•  Localized, Personalized Promotions
•  Website Optimization
•  Optimal Store Layout
Telecom
•  Call Detail Records (CDRs)
•  Infrastructure Investment
•  Next Product to Buy (NPTB)
•  Real-time Bandwidth Allocation
•  New Product Development
Manufacturing
•  Supplier Consolidation
•  Supply Chain and Logistics
•  Assembly Line Quality Assurance
•  Proactive Maintenance
•  Crowdsourced Quality Assurance
Healthcare
•  Genomic data for medical trials
•  Monitor patient vitals
•  Reduce re-admittance rates
•  Store medical research data
•  Recruit cohorts for pharmaceutical trials
Utilities, Oil & Gas
•  Smart meter stream analysis
•  Slow oil well decline curves
•  Optimize lease bidding
•  Compliance reporting
•  Proactive equipment repair
•  Seismic image processing
Public Sector
•  Analyze public sentiment
•  Protect critical networks
•  Prevent fraud and waste
•  Crowdsource reporting for repairs to infrastructure
•  Fulfill open records requests
..to shift from reactive to proactive interactions
HDP and Hadoop allow
organizations to shift
interactions from…
Reactive
Post Transaction
Proactive
Pre Decision
A shift in Advertising: From mass branding …to 1x1 Targeting
A shift in Financial Services: From Educated Investing …to Automated Algorithms
A shift in Healthcare: From mass treatment …to Designer Medicine
A shift in Retail: From static branding …to Real-time Personalization
A shift in Telco: From break then fix …to repair before break
Data Lake: An architectural shift
SCALE
SCOPE
Unlocking the Data Lake
RDBMS
MPP
EDW
Data Lake
Enabled by YARN
•  Single data repository,
shared infrastructure
•  Multiple biz apps
accessing all the data
•  Enable a shift from
reactive to proactive
interactions
•  Gain new insight across
the entire enterprise
New Analytic Apps
or IT Optimization
HDP 2.1
Governance & Integration
Security
Operations
Data Access
Data Management
YARN
HDP is deeply integrated in the data center
OPERATIONAL TOOLS
DEV & DATA TOOLS
INFRASTRUCTURE
SOURCES
EXISTING Systems
Clickstream
Web & Social
Geolocation
Sensor & Machine
Server Logs
Unstructured
DATA SYSTEM
RDBMS EDW MPP
HANA
APPLICATIONS
BusinessObjects BI
Deep Partnerships
Hortonworks engages
in deep engineered relationships
with the leaders in the data center,
such as Microsoft, Teradata, Red Hat,
HP, SAS & SAP
Broad Partnerships
Over 600 partners work with us to
certify their applications to work with
Hadoop so they can extend big data
to their users
HDP 2.1
Governance & Integration
Security
Operations
Data Access
Data Management
YARN
HDP Use Cases in Financial Services
Hortonworks. We do Hadoop.
Monetize Anonymous & Aggregate Banking Data
Problem
Valuable banking data needed to be anonymous & unified
•  Bank possesses data that indicates larger macro-economic trends, which can be
monetized in secondary markets
•  Regulations and company policies protect customer privacy
•  Data sets are isolated in legacy silos controlled by LOBs
•  IT challenged by joining data while guaranteeing anonymity
Solution
Cross-bank data lake for aggregate data with secure access
•  Multiple data sets abstracted from source platforms
•  Single point of security & privacy for de-identification, masking, encryption,
authentication and access control
•  Mortgage bankers, consumer bankers, credit card group and treasury bankers have
access to the same cross-sell data
•  Interoperability with partners SAS, R, Red Hat & Splunk
•  Economies of scale for compression & archiving data
•  Significant reduction in storage costs from prior platforms
Creating Opportunity
Data: Structured,
Clickstream, Social &
Unstructured
Banking
One of the largest US banks
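The de-identification and masking steps named in the solution above can be illustrated with a minimal sketch. It assumes a keyed hash (HMAC-SHA256) for pseudonymizing customer identifiers and simple character masking for account numbers; the key, field names and sample records are hypothetical and are not taken from the bank's implementation.

```python
import hashlib
import hmac

# Hypothetical secret for illustration; a real deployment would fetch it
# from a key management system rather than hard-coding it.
SECRET_KEY = b"example-only-key"


def pseudonymize(customer_id: str) -> str:
    """Replace an identifier with a keyed hash so records can still be joined."""
    digest = hmac.new(SECRET_KEY, customer_id.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:16]


def mask_account(account_number: str, visible: int = 4) -> str:
    """Hide all but the last `visible` characters of an account number."""
    hidden = max(len(account_number) - visible, 0)
    return "*" * hidden + account_number[hidden:]


def de_identify(record: dict) -> dict:
    """Return a copy of the record without the raw customer identifier."""
    return {
        "customer_key": pseudonymize(record["customer_id"]),
        "account": mask_account(record["account_number"]),
        "product": record["product"],
        "balance": record["balance"],
    }


# Hypothetical records from two lines of business for the same customer.
mortgage = {"customer_id": "C-1001", "account_number": "9876543210",
            "product": "mortgage", "balance": 250000}
card = {"customer_id": "C-1001", "account_number": "4111222233334444",
        "product": "credit card", "balance": 1200}

safe_mortgage = de_identify(mortgage)
safe_card = de_identify(card)
```

Because both records map to the same `customer_key`, the two data sets can still be joined for aggregate analysis without exposing the original identifier.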
Insurance Data Lake to Manage Risk
Problem
Challenges merging new & old data hamper analysis
•  Traditional and newer types of data were both growing quickly but were difficult to
combine in the EDW
•  “Schema on load” requirements of EDW platform limited ingest of some data with
significant predictive power
•  Company missed data-driven ways to serve customers
•  Process of separating legitimate from fraudulent claims created a “needle-in-a-haystack” problem
Solution
Common platform for all types of data improves up-sell and reduces fraud
•  “Schema on read” Hadoop architecture means that more data sources can be
easily ingested to enrich predictive analytics
•  Agents use big data insights to determine the best action for valued customers and
recommend those in real-time
•  Claims analysts and underwriters process streaming data to quickly flag fraud risks
and fast-track legitimate claims
Creating Opportunity
Data: Structured,
Clickstream, Server Log
Health Insurance
Large US medical insurer
>$30B in revenue
>20M members
~35K employees
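The contrast between "schema on load" and "schema on read" in this use case can be sketched in a few lines. This is a toy illustration in plain Python, not Hive or any HDP component: the sample events, the fixed column set and the function names are all assumptions made for the example.

```python
import json

# Hypothetical raw events with differing shapes, as they might arrive
# from claims systems, mobile apps and clickstream feeds.
RAW_EVENTS = [
    '{"member_id": "m1", "claim_amount": 120.5, "channel": "web"}',
    '{"member_id": "m2", "claim_amount": 980.0, "device": "mobile", "geo": "CA"}',
    '{"member_id": "m3", "visit": "clickstream-only"}',
]

# Assumed fixed warehouse schema for the "schema on load" case.
FIXED_COLUMNS = {"member_id", "claim_amount", "channel"}


def schema_on_load(lines):
    """Accept only records whose fields match the fixed column set exactly."""
    accepted = []
    for line in lines:
        record = json.loads(line)
        if set(record) == FIXED_COLUMNS:
            accepted.append(record)
    return accepted


def schema_on_read(lines, fields):
    """Keep every raw line; project the requested fields only at query time."""
    for line in lines:
        record = json.loads(line)
        yield {name: record.get(name) for name in fields}


loaded = schema_on_load(RAW_EVENTS)
claims_view = list(schema_on_read(RAW_EVENTS, ["member_id", "claim_amount"]))
geo_view = list(schema_on_read(RAW_EVENTS, ["member_id", "geo"]))
```

In the sketch, the load-time check keeps only one of the three events, while the read-time projection can answer two different questions from all three, with missing fields returned as `None`.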
Maintaining SLAs for Equity Trading Information
Problem
Meeting 12 millisecond SLAs for “ticker plant”
•  Daily ingest: 50GB server log data from 10,000 feeds
•  Four times daily, this data is pushed into DB2
•  Applications query this data 35K times per second
•  70% of queries are for data <1 year old, 30% for >1 year old
•  Current architecture can only hold 10 years of trading data
•  Growing volume puts performance at risk of missing SLAs
Solution
Meeting SLAs with confidence
•  HBase provides super-fast queries within SLA targets
•  ETL offloading to Hadoop allows longer data retention, without jeopardizing fast
response times
Improving Efficiency
Data: Server Log & ETL
Investment
Services
Highly trafficked website
providing business and
financial information
~15K employees
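The figures quoted in the problem statement above can be turned into a rough sizing calculation. The sketch below only restates the slide's numbers; the use of decimal units, a 365-day year, and the assumption that the 50 GB daily figure would hold for the whole retention period are simplifications introduced here, and compression and HDFS replication are ignored.

```python
# Figures taken from the slide.
QUERIES_PER_SEC = 35_000
RECENT_PERCENT = 70        # share of queries for data less than one year old
DAILY_INGEST_GB = 50
FEEDS = 10_000
RETENTION_YEARS = 10

# Query mix: how much load falls on recent versus older data.
recent_qps = QUERIES_PER_SEC * RECENT_PERCENT // 100
older_qps = QUERIES_PER_SEC - recent_qps

# Average daily volume per feed (assumes 1 GB = 1000 MB).
avg_mb_per_feed_per_day = DAILY_INGEST_GB * 1000 / FEEDS

# Raw volume if the daily ingest rate held for the full retention period
# (assumes 365 ingest days per year and 1 TB = 1000 GB).
raw_tb_at_retention = DAILY_INGEST_GB * 365 * RETENTION_YEARS / 1000

print(f"recent-data queries/sec: {recent_qps}")
print(f"older-data queries/sec:  {older_qps}")
print(f"average MB per feed/day: {avg_mb_per_feed_per_day}")
print(f"raw TB over {RETENTION_YEARS} years: {raw_tb_at_retention}")
```

Under these assumptions, roughly 24,500 queries per second target the most recent year of data and 10,500 target older data, which is the split that motivates keeping recent data on the fast path while offloading the rest.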
Hadoop is a Platform Decision
Open Leadership
Drive innovation in the open via
the Apache community-driven
open source process
Enterprise Rigor
Engineer, test and certify
Apache Hadoop with the
enterprise in mind
Ecosystem Endorsement
Focus on deep integration with
existing data center technologies
and skills
Fastest Growing Customer and Partner Base
Largest and most experienced Hadoop adopters have standardized on Hortonworks
The data center leaders have standardized on Hortonworks
WANdisco Background
•  WANdisco: Wide Area Network Distributed Computing
–  Enterprise-ready, high availability software solutions that enable globally distributed
organizations to meet today’s data challenges of secure storage, scalability and availability
•  Leader in tools for software engineers – Subversion
–  Apache Software Foundation sponsor
•  Highly successful IPO, London Stock Exchange, June 2012 (LSE:WAND)
•  US patent for active-active replication technology granted, November 2012
•  Global locations
–  San Ramon (CA)
–  Chengdu (China)
–  Tokyo (Japan)
–  Boston (MA)
–  Sheffield (UK)
–  Belfast (UK)
Customers
Non-Stop Hadoop
Non-Intrusive Plugin
to Hortonworks HDP
Provides Continuous Availability
In the LAN / Across the WAN
Active/Active
3 Problems For Sharing Data Across Clusters
LAN / WAN
•  Require Continuous Availability
–  SLAs, regulatory compliance
•  Require HDFS to be Deployed Globally
–  Share data between data centers
–  Data is consistent, not eventually consistent
•  Ease Administrative Burden
–  Reduce operational complexity
–  Simplify disaster recovery
–  Lower RTO/RPO
•  Allow Maximum Utilization of
Resources
–  Within the data center
–  Across data centers
Enterprise-Ready Hadoop
Characteristics of Mission-critical Financial Applications
Single Standby
•  Inefficient utilization of resource
–  Journal Nodes
–  ZooKeeper Nodes
–  Standby Node
•  Performance Bottleneck
•  Still tied to the beeper
•  Limited to LAN scope
Active / Active
•  All resources utilized
–  Only NameNode configuration
–  Scale as the cluster grows
–  All NameNodes active
•  Load balancing
•  Set resiliency (# of active NN)
•  Global Consistency
Breaking Away from Active/Passive
What’s in a NameNode
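One general way to keep several active nodes consistent is to have every node apply the same agreed, ordered log of operations. The toy sketch below illustrates only that idea. It is not WANdisco's implementation: the class and function names are hypothetical, and the agreement step (which a real system would handle with a distributed coordination protocol) is stubbed with a single in-process list.

```python
from dataclasses import dataclass, field


@dataclass
class ToyNameNode:
    """A stand-in for a NameNode that tracks a set of file paths."""
    name: str
    namespace: set = field(default_factory=set)
    applied: int = 0  # number of log entries already applied

    def catch_up(self, log):
        """Apply, in order, every log entry this node has not yet seen."""
        for _proposer, op, path in log[self.applied:]:
            if op == "create":
                self.namespace.add(path)
            elif op == "delete":
                self.namespace.discard(path)
        self.applied = len(log)


# Stub for the agreed global order of operations (assumption: a real
# system would reach this order through consensus between nodes).
agreed_log = []


def propose(proposer, op, path):
    """Any active node may propose a change; it takes the next log slot."""
    agreed_log.append((proposer.name, op, path))


nodes = [ToyNameNode("nn-a"), ToyNameNode("nn-b"), ToyNameNode("nn-c")]

# Different nodes accept client writes.
propose(nodes[0], "create", "/trades/2014-10-21.log")
propose(nodes[2], "create", "/tmp/scratch")
propose(nodes[1], "delete", "/tmp/scratch")

for node in nodes:
    node.catch_up(agreed_log)
```

Since every node replays the same sequence, all three end with an identical namespace regardless of which node received each write.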
Standby Data Center
•  Idle Resource
–  Single Data Center Ingest
–  Disaster Recovery Only
•  One way synchronization
–  DistCp
•  Error Prone
–  Clusters can diverge over time
•  Difficult to scale > 2 Data Centers
–  Complexity of sharing data
increases
Active / Active
•  DR Resource Available
–  Ingest at all Data Centers
–  Run Jobs in both Data Centers
•  Replication is Multi-Directional
–  active/active
•  Absolute Consistency
–  Single HDFS spans locations
•  ‘N’ Data Center support
–  Global HDFS allows appropriate
data to be shared
Breaking Away from Active/Passive
What’s in a Data Center
Use Cases
•  Data is as current as possible (no
periodic synchs)
•  Doesn’t require monitoring and
consistency checking
•  Virtually zero downtime to recover
from regional data center failure
•  Regulatory compliance
Use Case: Disaster Recovery
•  Ingest and analyze anywhere
•  Analyze everywhere
–  Fraud detection
–  Equity trading information
–  New business
–  Etc…
•  Backup data center(s) can be used
for work
–  No idle resources
Use Case: Multi-Data Center
Ingest and multi-tenant workloads
•  Mixed Hardware Profiles
–  Memory, disk, CPU
–  Isolate memory-hungry
processing (Storm/Spark) from
regular jobs
•  Share data, not processing
–  Isolate lower priority (dev/
test) work
Use Case: Heterogeneous Hardware
In-memory analytics
The difficulty realizing the data lake…
…is that data spans the entire world
Data Ocean
Feeder Site
Accounting Mart
Banking Mart
•  Data Marts
–  Restrict access to relevant
data
–  Create quick clusters
•  Feeder Sites (Data
Tributaries)
–  Ingest only
Data Reservoir
Use Cases
•  Basel III
–  Consistency of data
•  Data Privacy Directive
–  Data sovereignty
•  Data doesn’t leave country of
origin
Compliance
Regulation
Guidelines
Regulatory Compliance
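The data sovereignty requirement above ("data doesn't leave country of origin") amounts to a rule about which data centers may hold a copy of a given path. The sketch below shows one hypothetical way to express such a rule. The rule table, path layout and data center names are invented for illustration and do not describe any product's configuration format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCenter:
    name: str
    country: str


# Hypothetical rule table: path prefix -> country the data must stay in.
# None means the data may be shared with every data center.
RESIDENCY_RULES = {
    "/data/de/customers": "DE",
    "/data/us/customers": "US",
    "/data/reference": None,
}


def residency_for(path):
    """Return the country restriction of the longest matching prefix.

    Paths that match no rule are treated as unrestricted in this toy;
    a real policy might prefer to deny replication by default.
    """
    best = None
    for prefix in RESIDENCY_RULES:
        if path == prefix or path.startswith(prefix + "/"):
            if best is None or len(prefix) > len(best):
                best = prefix
    return RESIDENCY_RULES[best] if best is not None else None


def allowed_targets(path, data_centers):
    """List the data centers that may hold a replica of `path`."""
    country = residency_for(path)
    return [dc.name for dc in data_centers
            if country is None or dc.country == country]


DATA_CENTERS = [
    DataCenter("dc-frankfurt", "DE"),
    DataCenter("dc-newyork", "US"),
    DataCenter("dc-munich", "DE"),
]

german_targets = allowed_targets("/data/de/customers/2014/accounts.csv", DATA_CENTERS)
shared_targets = allowed_targets("/data/reference/currencies.csv", DATA_CENTERS)
```

Here the German customer file is limited to the two data centers in Germany, while the reference data is shared with all three.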
Technical Comparison
Hadoop Powered by WANdisco
Periodic Synchronization
DistCp
Parallel Data Ingest
Load Balancer, Streaming
Multi-Data Center Hadoop Today
What's wrong with the status quo
Periodic Synchronization
DistCp
Multi-Data Center Hadoop Today
Hacks currently in use
•  Runs as MapReduce
•  DR data center is read-only
•  Over time, Hadoop clusters
become inconsistent
•  Manual and labor-intensive
process to reconcile differences
•  Inefficient use of the network
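The reconciliation work described above starts with finding out where two clusters differ. The following sketch compares two file listings and reports the differences. It does not call DistCp or HDFS: it assumes each listing has already been exported as a mapping from path to checksum, and the sample paths and checksum strings are made up.

```python
def diff_listings(source, target):
    """Compare two {path: checksum} maps and report how the target differs."""
    missing = sorted(p for p in source if p not in target)
    extra = sorted(p for p in target if p not in source)
    changed = sorted(p for p in source
                     if p in target and source[p] != target[p])
    return {"missing": missing, "extra": extra, "changed": changed}


# Hypothetical listings from a primary cluster and a DR cluster
# taken some time after the last periodic copy.
primary = {
    "/logs/2014-10-19": "c1",
    "/logs/2014-10-20": "c2",
    "/logs/2014-10-21": "c3",
}
dr_site = {
    "/logs/2014-10-19": "c1",
    "/logs/2014-10-20": "stale",
    "/tmp/job-output": "c9",
}

report = diff_listings(primary, dr_site)
for kind, paths in report.items():
    print(kind, paths)
```

Each of the three categories (missing, extra, changed) needs a different manual follow-up, which is one reason periodic synchronization becomes labor-intensive as clusters grow.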
Parallel Data Ingest
Load Balancer, Flume
Multi-Data Center Hadoop Today
Hacks currently in use
•  Hiccups in either of the Hadoop
clusters cause the two file
systems to diverge
•  Potential to run out of buffer when
WAN is down
•  Requires constant attention and
sys-admin hours to keep running
•  Data created on the cluster is not
replicated
•  Streaming technologies (like Flume)
used for data redirection handle
only streaming data
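The "run out of buffer when WAN is down" risk above can be made concrete with a small simulation. This is a toy model with made-up numbers, not a model of Flume or of any load balancer: it assumes a fixed-capacity buffer, a constant event rate, and that nothing drains while the link is down.

```python
from collections import deque


def simulate_outage(capacity, events_per_sec, outage_secs):
    """Count events buffered and dropped while the remote link is down."""
    buffer = deque()
    dropped = 0
    for second in range(outage_secs):
        for i in range(events_per_sec):
            if len(buffer) < capacity:
                buffer.append((second, i))  # held until the WAN returns
            else:
                dropped += 1                # never reaches the second cluster
    return len(buffer), dropped


# Hypothetical numbers: room for 1000 events, 50 events/sec, 30 s outage.
buffered, dropped = simulate_outage(capacity=1000, events_per_sec=50, outage_secs=30)
print(f"buffered={buffered} dropped={dropped}")
```

With these assumed numbers the buffer fills after 20 seconds, so the last 10 seconds of events (500 in total) are lost to the remote cluster and the two file systems diverge.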
Architecture of a Non-Stop Hadoop
Q&A
Question and Answer
Submit your questions using the “ASK A QUESTION” button
Thank you

More Related Content

What's hot

Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop Search
Hortonworks
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Hortonworks
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks
 
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Predictive Analytics and Machine Learning…with SAS and Apache HadoopPredictive Analytics and Machine Learning…with SAS and Apache Hadoop
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Hortonworks
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Hortonworks
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.final
Hortonworks
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014
Hortonworks
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar Slides
Hortonworks
 
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache HiveDiscover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Discover HDP 2.1: Interactive SQL Query in Hadoop with Apache Hive
Hortonworks
 
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Optimizing your Modern Data Architecture - with Attunity, RCG Global Services...
Hortonworks
 
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise HadoopHDP Advanced Security: Comprehensive Security for Enterprise Hadoop
HDP Advanced Security: Comprehensive Security for Enterprise Hadoop
Hortonworks
 
Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]Discover.hdp2.2.h base.final[2]
Discover.hdp2.2.h base.final[2]
Hortonworks
 
Introduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready ProgramIntroduction to the Hortonworks YARN Ready Program
Introduction to the Hortonworks YARN Ready Program
Hortonworks
 
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Webinar - Accelerating Hadoop Success with Rapid Data Integration for the Mod...
Hortonworks
 
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Discover hdp 2.2: Data storage innovations in Hadoop Distributed Filesystem (...
Hortonworks
 
YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez YARN Ready: Integrating to YARN with Tez
YARN Ready: Integrating to YARN with Tez
Hortonworks
 
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Discover Red Hat and Apache Hadoop for the Modern Data Architecture - Part 3
Hortonworks
 
Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014Hortonworks Yarn Code Walk Through January 2014
Hortonworks Yarn Code Walk Through January 2014
Hortonworks
 
Bigger Data For Your Budget
Bigger Data For Your BudgetBigger Data For Your Budget
Bigger Data For Your Budget
Hortonworks
 
Apache Hadoop on the Open Cloud
Apache Hadoop on the Open CloudApache Hadoop on the Open Cloud
Apache Hadoop on the Open Cloud
Hortonworks
 

What's hot (20)

Discover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop SearchDiscover HDP 2.1: Apache Solr for Hadoop Search
Discover HDP 2.1: Apache Solr for Hadoop Search
 
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in HadoopDiscover HDP 2.1: Apache Falcon for Data Governance in Hadoop
Discover HDP 2.1: Apache Falcon for Data Governance in Hadoop
 
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
Hortonworks Protegrity Webinar: Leverage Security in Hadoop Without Sacrifici...
 
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
Predictive Analytics and Machine Learning…with SAS and Apache HadoopPredictive Analytics and Machine Learning…with SAS and Apache Hadoop
Predictive Analytics and Machine Learning …with SAS and Apache Hadoop
 
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache HadoopRescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
Rescue your Big Data from Downtime with HP Operations Bridge and Apache Hadoop
 
Discover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.finalDiscover.hdp2.2.storm and kafka.final
Discover.hdp2.2.storm and kafka.final
 
Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014Splunk-hortonworks-risk-management-oct-2014
Splunk-hortonworks-risk-management-oct-2014
 
Hp Converged Systems and Hortonworks - Webinar Slides
Hp Converged Systems and Hortonworks - Webinar SlidesHp Converged Systems and Hortonworks - Webinar Slides
Recently uploaded

GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
GDG Cloud Southlake #33: Boule & Rebala: Effective AppSec in SDLC using Deplo...
James Anderson
 
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdfFIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance Osaka Seminar: The WebAuthn API and Discoverable Credentials.pdf
FIDO Alliance
 
"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi"Impact of front-end architecture on development cost", Viktor Turskyi
"Impact of front-end architecture on development cost", Viktor Turskyi
Supporting Financial Services with a More Flexible Approach to Big Data

  • 1. Supporting Financial Services With a More Flexible Approach to Big Data October 21, 2014
  • 2. WWW.WANDISCO.COM REALIZING THE POSSIBILITIES OF BIG DATA Our Presenters: Justin Sears is a Product Marketing Manager at Hortonworks, where he writes stories about how enterprise customers use Apache Hadoop to solve big data business challenges. He also manages product launch marketing and campaign content for Hortonworks. For seventeen years, Justin has led teams in Silicon Valley to create and position enterprise software, risk-controlled consumer banking products, desktop and mobile web properties, and services for Latino customers in the US and Latin America. He lives with his family in his native San Francisco Bay Area. Brett Rudenstein has an extensive background in Application Lifecycle Management, High Performance Computing and Open Source Software Analysis. He has held senior sales engineering and management positions at Rational Software, PureAtria, IBM, Appistry and Palamida. Throughout his career, he has enabled organizations to accelerate technology adoption by understanding their needs and providing just-in-time business solutions. As WANdisco Director of Product Management for Big Data, Brett works with partners, prospects and customers to help them understand and evolve the requirements for enterprise-ready Hadoop.
  • 3. Page 3 © Hortonworks Inc. 2011 – 2014. All Rights Reserved Hortonworks We Do Hadoop
  • 4. Our Mission: Power your Modern Data Architecture with HDP and Enterprise Apache Hadoop. Who we are: June 2011: Original 24 architects, developers, operators of Hadoop from Yahoo! June 2014: An enterprise software company with 420+ employees. Key Partners. Our model: Innovate and deliver Apache Hadoop as a complete enterprise data platform completely in the open, backed by a world-class support organization.
  • 5. Fastest growing Fortune 1000 customer base. Customer Momentum • 300+ customers in seven quarters, growing at 75+/quarter • Two thirds of customers come from F1000 • 100% renewal rate. Experience at Scale • 80,000 nodes under contract • Largest Cluster in North America: 32,000 Nodes • Largest Cluster in Europe: 1,000 Nodes • Largest Known Cluster in APAC: 400 Nodes. 30+ customers migrated from other distributions. Some notable migrations include many of the early adopters of Hadoop.
  • 6. Hortonworks: A Leader In Hadoop. The Forrester Wave™: Big Data Hadoop Solutions, Q1 2014: “Hortonworks loves and lives open source innovation”. Vision & Execution for Enterprise Hadoop: Hortonworks leads with a strong strategy and roadmap for open source innovation with Hadoop and a strong delivery of that innovation in Hortonworks Data Platform. World Class Support and Services: Hortonworks' Customer Support received a maximum score and was significantly higher than both Cloudera and MapR. Key Strategic Partnerships: Hortonworks’ unique strategic partnerships with Microsoft, SAP, Teradata and others are a key strength as part of its overall strategy of ecosystem partnership to accelerate Hadoop adoption in the enterprise.
  • 7. HDP IS Apache Hadoop. There is ONE Enterprise Hadoop: everything else is a vendor derivation. HDP • Reliable • Consistent • Current
  • 8. Enabling a Modern Data Architecture with HDP and Apache Hadoop. Hortonworks. We do Hadoop.
  • 9. Traditional systems under pressure • Silos of Data • Costly to Scale • Constrained Schemas …and difficult to manage new data. Diagram: APPLICATIONS: Business Analytics; Custom Applications; Packaged Applications. DATA SYSTEM: RDBMS; EDW; MPP. SOURCES: Existing Sources (CRM, ERP,…). New Data Types: Clickstream; Geolocation; Sentiment, Web Data; Sensor, Machine Data; Unstructured docs, emails; Server logs.
  • 10. Traditional Hadoop, challenges & limitations. Architectural Limitations • Single-purpose clusters, specific data sets • Primarily a batch system using MapReduce. Enterprise Challenges • Limited enterprise capabilities: Operations, Security & Governance • Created additional Silos. Interoperability Challenges • Difficult to natively integrate existing applications. Commercial add-ons opportunistically emerged in the early days to address these shortcomings. Diagram: HDFS (Hadoop Distributed File System) across nodes 1 to N; MapReduce, Largely Batch Processing. SOURCES: EXISTING Systems; Clickstream; Web & Social; Geolocation; Sensor & Machine; Server Logs; Unstructured. APPLICATIONS: Business Analytics; Custom Applications; Packaged Applications. DATA SYSTEM: RDBMS; EDW; MPP.
  • 11. Architected & led development of YARN to enable the Modern Data Architecture. Timeline markers: 2006; 2009; October 23, 2013. Hadoop w/ MapReduce: HDFS (Hadoop Distributed File System) across nodes 1 to N; MapReduce, Largely Batch Processing • Siloed clusters • Largely batch system • Difficult to integrate. MR-279: YARN. Hadoop 2 & YARN: Hadoop2 & YARN based Architecture; YARN: Data Operating System over HDFS (Hadoop Distributed File System) across nodes 1 to N; Batch, Interactive, Real-Time.
  • 12. HDP2 and YARN enable the Modern Data Architecture. Hortonworks architected and led development of YARN. Common data set, multiple applications • Optionally land all data in a single cluster • Batch, interactive & real-time use cases • Support multi-tenant access, processing & segmentation of data. YARN: Architectural center of Hadoop • Consistent security, governance & operations • Ecosystem applications certified by Hortonworks to run natively in Hadoop. Diagram: SOURCES: EXISTING Systems; Clickstream; Web & Social; Geolocation; Sensor & Machine; Server Logs; Unstructured. APPLICATIONS: Business Analytics; Custom Applications; Packaged Applications. DATA SYSTEM: RDBMS; EDW; MPP. YARN: Data Operating System over HDFS (Hadoop Distributed File System) across nodes 1 to N; Batch, Interactive, Real-Time.
  • 13. A Blueprint for Enterprise Hadoop. GOVERNANCE & INTEGRATION: Load data and manage according to policy. OPERATIONS: Deploy and effectively manage the platform. DATA MANAGEMENT: Store and process all of your Corporate Data Assets. DATA ACCESS: Access your data simultaneously in multiple ways (batch, interactive, real-time). SECURITY: Provide layered approach to security through Authentication, Authorization, Accounting, and Data Protection. PRESENTATION & APPLICATION: Enable both existing and new applications to provide value to the organization. ENTERPRISE MGMT & SECURITY: Empower existing operations and security tools to manage Hadoop. DEPLOYMENT OPTIONS: Provide deployment choice across physical, virtual, cloud. YARN: Data Operating System.
  • 14. Hortonworks Data Platform 2.2: HDP Delivers Enterprise Hadoop. YARN is the architectural center of HDP • Common data set across all applications • Batch, interactive & real-time workloads • Multi-tenant access & processing. Provides comprehensive enterprise capabilities • Governance • Security • Operations. Enables broad ecosystem adoption • ISVs can plug directly into Hadoop. The widest range of deployment options • Linux & Windows • On-premises & cloud. Diagram: BATCH, INTERACTIVE & REAL-TIME DATA ACCESS: Script: Pig; SQL: Hive; Java, Scala: Cascading; Stream: Storm; Search: Solr; NoSQL: HBase, Accumulo; In-Memory: Spark; Others: ISV Engines; with Tez and Slider shown beneath the engines. YARN: Data Operating System (Cluster Resource Management). HDFS (Hadoop Distributed File System). GOVERNANCE: Data Workflow, Lifecycle & Governance: Falcon, Sqoop, Flume, Kafka, NFS, WebHDFS. SECURITY: Authentication, Authorization, Accounting, Data Protection; Storage: HDFS; Resources: YARN; Access: Hive, …; Pipeline: Falcon; Cluster: Knox. OPERATIONS: Provision, Manage & Monitor: Ambari, Zookeeper; Scheduling: Oozie. Deployment Choice: Linux, Windows, On-Premises, Cloud.
  • 15. The Modern Data Architecture w/ HDP
  • 16. Hadoop Value: New Types of Data. Clickstream: Capture and analyze website visitors’ data trails and optimize your website. Sensors: Discover patterns in data streaming automatically from remote sensors and machines. Server Logs: Research logs to diagnose process failures and prevent security breaches. Sentiment: Understand how your customers feel about your brand and products – right now. Geographic: Analyze location-based data to manage operations where they occur. Unstructured: Understand patterns in files across millions of web pages, emails, and documents.
  • 17. New analytic applications for new types of data. Financial Services • New Account Risk Screens • Fraud Prevention • Trading Risk • Maximize Deposit Spread • Insurance Underwriting • Accelerate Loan Processing. Retail • 360° View of the Customer • Analyze Brand Sentiment • Localized, Personalized Promotions • Website Optimization • Optimal Store Layout. Telecom • Call Detail Records (CDRs) • Infrastructure Investment • Next Product to Buy (NPTB) • Real-time Bandwidth Allocation • New Product Development. Manufacturing • Supplier Consolidation • Supply Chain and Logistics • Assembly Line Quality Assurance • Proactive Maintenance • Crowdsourced Quality Assurance. Healthcare • Genomic data for medical trials • Monitor patient vitals • Reduce re-admittance rates • Store medical research data • Recruit cohorts for pharmaceutical trials. Utilities, Oil & Gas • Smart meter stream analysis • Slow oil well decline curves • Optimize lease bidding • Compliance reporting • Proactive equipment repair • Seismic image processing. Public Sector • Analyze public sentiment • Protect critical networks • Prevent fraud and waste • Crowdsource reporting for repairs to infrastructure • Fulfill open records requests
  • 18. ..to shift from reactive to proactive interactions. HDP and Hadoop allow organizations to shift interactions from Reactive (Post Transaction) to Proactive (Pre Decision). A shift in Advertising: From mass branding …to 1x1 Targeting. A shift in Financial Services: From Educated Investing …to Automated Algorithms. A shift in Healthcare: From mass treatment …to Designer Medicine. A shift in Retail: From static branding …to Real-time Personalization. A shift in Telco: From break then fix …to repair before break.
  • 19. Data Lake: An architectural shift. Unlocking the Data Lake. Data Lake Enabled by YARN • Single data repository, shared infrastructure • Multiple biz apps accessing all the data • Enable a shift from reactive to proactive interactions • Gain new insight across the entire enterprise. Diagram: axes SCALE and SCOPE; RDBMS, MPP, EDW; New Analytic Apps or IT Optimization. HDP 2.1: Governance & Integration; Security; Operations; Data Access; Data Management; YARN.
  • 20. HDP is deeply integrated in the data center. Deep Partnerships: Hortonworks engages in deep engineered relationships with the leaders in the data center, such as Microsoft, Teradata, Redhat, HP, SAS & SAP. Broad Partnerships: Over 600 partners work with us to certify their applications to work with Hadoop so they can extend big data to their users. Diagram: OPERATIONAL TOOLS; DEV & DATA TOOLS; INFRASTRUCTURE. SOURCES: EXISTING Systems; Clickstream; Web & Social; Geolocation; Sensor & Machine; Server Logs; Unstructured. DATA SYSTEM: RDBMS; EDW; MPP; HANA. APPLICATIONS: BusinessObjects BI. HDP 2.1: Governance & Integration; Security; Operations; Data Access; Data Management; YARN.
  • 21. HDP Use Cases in Financial Services. Hortonworks. We do Hadoop.
  • 22. Monetize Anonymous & Aggregate Banking Data. Creating Opportunity. Banking: One of the largest US banks. Data: Structured, Clickstream, Social & Unstructured. Problem: Valuable banking data needed to be anonymous & unified • Bank possesses data that indicates larger macro-economic trends, which can be monetized in secondary markets • Regulations and company policies protect customer privacy • Data sets are isolated in legacy silos controlled by LOBs • IT challenged by joining data while guaranteeing anonymity. Solution: Cross-bank data lake for aggregate data with secure access • Multiple data sets abstracted from source platforms • Single point of security & privacy for de-identification, masking, encryption, authentication and access control • Mortgage bankers, consumer bankers, credit card group and treasury bankers have access to the same cross-sell data • Interoperability with partners SAS, R, RedHat & Splunk • Economies of scale for compression & archiving data • Significant reduction in storage costs from prior platforms
  • 23. Insurance Data Lake to Manage Risk. Creating Opportunity. Health Insurance: Large US medical insurer, >$30B in revenue, >20M members, ~35K employees. Data: Structured, Clickstream, Server Log. Problem: Challenges merging new & old data hamper analysis • Traditional and newer types of data were both growing quickly but were difficult to combine in the EDW • “Schema on load” requirements of EDW platform limited ingest of some data with significant predictive power • Company missed data-driven ways to serve customers • Process of separating legitimate from fraudulent claims created “needle-in-a-haystack” problem. Solution: Common platform for all types of data improves up-sell and reduces fraud • “Schema on read” Hadoop architecture means that more data sources can be easily ingested to enrich predictive analytics • Agents use big data insights to determine the best action for valued customers and recommend those in real-time • Claims analysts and underwriters process streaming data to quickly flag fraud risks and fast-track legitimate claims
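The contrast slide 23 draws between “schema on load” and “schema on read” can be illustrated with a minimal Python sketch. It does not use Hadoop or Hive; the records, field names, and the `read_with_schema` helper are all hypothetical and exist only to show the idea that raw data is stored as it arrives and each consumer applies its own schema when reading.

```python
import json

# Raw records are kept exactly as they arrive; nothing is rejected on write.
# (Hypothetical sample data, not taken from the presentation.)
RAW_CLAIMS = [
    '{"claim_id": "c1", "amount": "1200.50", "member": "m-17"}',
    '{"claim_id": "c2", "amount": "80", "notes": "walk-in visit"}',
    '{"claim_id": "c3", "member": "m-04", "device": "mobile"}',
]


def read_with_schema(lines, schema):
    """Apply a schema at read time: select and cast only the requested fields.

    `schema` maps a field name to a cast function. Fields missing from a
    record come back as None instead of causing the record to be dropped.
    """
    rows = []
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, cast in schema.items():
            value = record.get(field)
            row[field] = cast(value) if value is not None else None
        rows.append(row)
    return rows


# Two consumers read the same raw data through different schemas.
billing_view = read_with_schema(RAW_CLAIMS, {"claim_id": str, "amount": float})
fraud_view = read_with_schema(
    RAW_CLAIMS, {"claim_id": str, "member": str, "device": str}
)
```

Under schema on load, the third record (no `amount`) might be rejected at ingest; here it stays available to the fraud view, which is the kind of extra predictive data the slide refers to.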
  • 24. Maintaining SLAs for Equity Trading Information. Improving Efficiency. Investment Services: Highly trafficked website providing business and financial information, ~15K employees. Data: Server Log & ETL. Problem: Meeting 12 millisecond SLAs for “ticker plant” • Daily ingest: 50GB server log data from 10,000 feeds • Four times daily, this data is pushed into DB2 • Applications query this data 35K times per second • 70% of queries are for data <1 year old, 30% for >1 year old • Current architecture can only hold 10 years of trading data • Growing volume puts performance at risk of missing SLAs. Solution: Meeting SLAs with confidence • HBase provides super-fast queries within SLA targets • ETL offloading to Hadoop allows longer data retention, without jeopardizing fast response times
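To make the figures on slide 24 concrete, here is a rough back-of-the-envelope calculation in Python. It assumes a flat 50 GB/day ingest with no growth, 365-day years, and 1 TB = 1,000 GB; the slide does not state these, and the volume actually held in DB2 may differ from the raw log volume.

```python
# Figures quoted on the slide.
QUERIES_PER_SECOND = 35_000
RECENT_SHARE_PCT = 70        # queries for data less than 1 year old
DAILY_INGEST_GB = 50
RETENTION_YEARS = 10

# Assumption for illustration only: ignore leap days and ingest growth.
DAYS_PER_YEAR = 365

# Split the query load by data age.
recent_qps = QUERIES_PER_SECOND * RECENT_SHARE_PCT // 100
older_qps = QUERIES_PER_SECOND - recent_qps

# Raw log volume accumulated over the retention window.
retained_gb = DAILY_INGEST_GB * DAYS_PER_YEAR * RETENTION_YEARS
retained_tb = retained_gb / 1000

print(f"recent-data queries/s: {recent_qps}")
print(f"older-data queries/s:  {older_qps}")
print(f"raw logs over {RETENTION_YEARS} years: {retained_tb} TB")
```

Even at a flat ingest rate, roughly 10,500 queries per second would touch data older than a year, which suggests why the slide pairs fast HBase lookups with offloading older data to Hadoop.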
  • 25. Hadoop is a Platform Decision. Open Leadership: Drive innovation in the open via the Apache community-driven open source process. Enterprise Rigor: Engineer, test and certify Apache Hadoop with the enterprise in mind. Ecosystem Endorsement: Focus on deep integration with existing data center technologies and skills. Fastest Growing Customer and Partner Base: Largest and most experienced Hadoop adopters have standardized on Hortonworks. The data center leaders have standardized on Hortonworks.
  • 26.
  • 27. WANdisco Background • WANdisco: Wide Area Network Distributed Computing – Enterprise-ready, high availability software solutions that enable globally distributed organizations to meet today’s data challenges of secure storage, scalability and availability • Leader in tools for software engineers – Subversion – Apache Software Foundation sponsor • Highly successful IPO, London Stock Exchange, June 2012 (LSE:WAND) • US patent for active-active replication technology granted, November 2012 • Global locations – San Ramon (CA) – Chengdu (China) – Tokyo (Japan) – Boston (MA) – Sheffield (UK) – Belfast (UK)
  • 28. Customers
  • 29. Non-Stop Hadoop: Non-Intrusive Plugin to Hortonworks HDP. Provides Continuous Availability In the LAN / Across the WAN. Active/Active.
  • 30. 3 Problems For Sharing Data Across Clusters LAN / WAN
  • 31. Enterprise-Ready Hadoop: Characteristics of Mission-critical Financial Applications • Require Continuous Availability – SLAs, regulatory compliance • Require HDFS to be Deployed Globally – Share data between data centers – Data is consistent, not eventually consistent • Ease Administrative Burden – Reduce operational complexity – Simplify disaster recovery – Lower RTO/RPO • Allow Maximum Utilization of Resources – Within the data center – Across data centers
  • 32. Breaking Away from Active/Passive: What’s in a NameNode. Single Standby • Inefficient utilization of resource – Journal Nodes – ZooKeeper Nodes – Standby Node • Performance Bottleneck • Still tied to the beeper • Limited to LAN scope
  • 33. Breaking Away from Active/Passive: What’s in a NameNode. Single Standby • Inefficient utilization of resource – Journal Nodes – ZooKeeper Nodes – Standby Node • Performance Bottleneck • Still tied to the beeper • Limited to LAN scope. Active / Active • All resources utilized – Only NameNode configuration – Scale as the cluster grows – All NameNodes active • Load balancing • Set resiliency (# of active NN) • Global Consistency
  • 34. Breaking Away from Active/Passive: What’s in a Data Center. Standby Data Center • Idle Resource – Single Data Center Ingest – Disaster Recovery Only • One way synchronization – DistCp • Error Prone – Clusters can diverge over time • Difficult to scale > 2 Data Centers – Complexity of sharing data increases
  • 35. Breaking Away from Active/Passive: What’s in a Data Center. Standby Data Center • Idle Resource – Single Data Center Ingest – Disaster Recovery Only • One way synchronization – DistCp • Error Prone – Clusters can diverge over time • Difficult to scale > 2 Data Centers – Complexity of sharing data increases. Active / Active • DR Resource Available – Ingest at all Data Centers – Run Jobs in both Data Centers • Replication is Multi-Directional – active/active • Absolute Consistency – Single HDFS spans locations • ‘N’ Data Center support – Global HDFS allows appropriate data to be shared
  • 37. Use Case: Disaster Recovery • Data is as current as possible (no periodic syncs) • Doesn’t require monitoring and consistency checking • Virtually zero downtime to recover from regional data center failure • Regulatory compliance
  • 38. Use Case: Multi-Data Center Ingest and multi-tenant workloads • Ingest and analyze anywhere • Analyze everywhere – Fraud detection – Equity trading information – New business – Etc… • Backup data center(s) can be used for work – No idle resources
  • 39. Use Case: Heterogeneous Hardware, In-memory analytics • Mixed Hardware Profiles – Memory, disk, CPU – Isolate memory-hungry processing (Storm/Spark) from regular jobs • Share data, not processing – Isolate lower priority (dev/test) work
  • 40. The difficulty realizing the data lake…
  • 41. …is that data spans the entire world
  • 42. Data Reservoir Use Cases • Data Marts – Restrict access to relevant data – Create quick clusters • Feeder Sites (Data Tributaries) – Ingest only. Diagram labels: Data Ocean; Feeder Site; Accounting Mart; Banking Mart.
  • 43. Regulatory Compliance • Basel III – Consistency of data • Data Privacy Directive – Data sovereignty • Data doesn’t leave country of origin. Diagram labels: Compliance; Regulation; Guidelines.
  • 45. Multi-Data Center Hadoop Today: What's wrong with the status quo. Periodic Synchronization: DistCp. Parallel Data Ingest: Load Balancer, Streaming.
  • 46. Multi-Data Center Hadoop Today: Hacks currently in use. Periodic Synchronization: DistCp • Runs as MapReduce • DR data center is read-only • Over time, Hadoop clusters become inconsistent • Manual and labor-intensive process to reconcile differences • Inefficient use of the network
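The staleness that periodic one-way synchronization introduces, as described on slide 46, can be sketched with a toy Python model. This is not DistCp and does not touch Hadoop; the function name, the steady write rate, and the sync interval are assumptions chosen only to show how far a DR copy can lag between batch copies.

```python
def simulate_periodic_sync(writes_per_hour, sync_interval_hours, total_hours):
    """Return the DR site's backlog (files not yet copied) after each hour.

    Toy model: the primary receives a steady stream of new files and a
    one-way batch copy brings the DR site fully up to date every
    `sync_interval_hours`. Failures and partial copies are not modeled.
    """
    primary_files = 0
    dr_files = 0
    backlog = []
    for hour in range(1, total_hours + 1):
        primary_files += writes_per_hour
        if hour % sync_interval_hours == 0:
            dr_files = primary_files  # one-way batch copy, primary -> DR
        backlog.append(primary_files - dr_files)
    return backlog


backlog = simulate_periodic_sync(
    writes_per_hour=100, sync_interval_hours=6, total_hours=12
)
print(backlog)
print("worst-case files missing at the DR site:", max(backlog))
```

In this model, a failure just before a sync would leave the DR site 500 files behind. The real gap depends on ingest rate and copy schedule, and the model leaves out the divergence and reconciliation problems the slide also lists.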
  • 47. Multi-Data Center Hadoop Today: Hacks currently in use. Parallel Data Ingest: Load Balancer, Flume • Hiccups in either of the Hadoop clusters cause the two file systems to diverge • Potential to run out of buffer when WAN is down • Requires constant attention and sys-admin hours to keep running • Data created on the cluster is not replicated • Use of streaming technologies (like Flume) for data redirection is only for streaming
  • 48. Architecture of a Non-Stop Hadoop
  • 49. Q&A: Question and Answer. Submit your questions using the “ASK A QUESTION” button
  • 50. Thank you