© 2006 Cisco Systems, Inc. All rights reserved. Cisco ConfidentialPresentation_ID 0
Leveraging Big Data to Create Value
June 5th, 2014
Agenda
12-12:30pm Registration and Lunch
12:30-12:40pm Welcome and Introductions -- Art Hansen
12:40-1:45pm Keynote Presentation -- Chris Ward, Brian Vaughan, James Bigger
1:45-2:20pm Hadoop in the Real World by MapR -- David Feldman
2:20-2:30pm Break
2:30-2:45pm Cisco Unified Computing System Rack Mount Servers for Big Data – Wade Ison
2:45-3:30pm Big Data Brainstorm Breakouts
3:30-4:30pm Refreshments, Q&A Session, and Conclusion
4:30pm Raffle Drawing for iPad
Big Data as a Competitive Strategy
Harvard’s Michael Porter:
1. Cost Leadership Strategy (Wal-Mart)
2. Differentiation Strategy (Southwest)
3. Innovation Strategy (Apple)
4. Operational Effectiveness Strategy (UPS)
5. Technology-based Competitive Strategy
What do we have that makes us different?
• Custom Apps
• Process (Workflow)
• Big Data
• People
• Culture
Big Data’s Financial Benefits
Gartner predicts that “Big Data will deliver transformational benefits to enterprises
within 2 to 5 years, and by 2015 will enable enterprises adopting this technology to
outperform competitors by 20% in every available financial metric.”
Goals for Today:
• Understand Big Data Major Building Blocks
• Learn the major patterns
• Understand how to introduce Big Data into the enterprise in practical ways
• Identify a solid use case for Big Data
Tips for Winning:
• High ROI in less than a year
• Must be applied to things that are important to the business
• Use of multiple patterns encouraged
• New ways of correlating data that was formerly not correlated
• Remember Big Data patterns usually require scale
WWT Big Data Leadership Team
20 years of management
consulting and
entrepreneurial experience.
Expertise in financial services,
insurance and telecom. Prior
consulting experience with
Opera Solutions and A. T.
Kearney.
Ph.D. in Physics from Oxford
University.
James Bigger
Principal
Consultant
15 years in management
consulting, analytics and
software experience.
Expertise in healthcare and
insurance. Prior experience
with Opera Solutions, Mitchell
Madison Group and
Broadlane.
Ph.D. in Physics from Stanford
University.
Brian Vaughan
Principal
Consultant
20 years in management
consulting and executive
leadership. Expertise in retail,
marketing, hospitality &
financial services. Prior
consulting experience with
Opera Solutions and The
Boston Consulting Group.
BA from Princeton University,
MBA from the University of
Virginia Darden School of
Business.
Chris Ward
Principal
Consultant
Over 20 years of experience
in a range of IT and security
disciplines. Responsible for
deploying large, secure,
Hadoop-based platforms for
the U.S. Government. 10 years
of international experience
implementing networking and
virtual data center
environments
Undergraduate degree from
AIU.
Matt DuBell
Principal Systems
Engineer
Over 7 Years of experience in
management and analytics
consulting. Led engagements
in telecom at Opera Solutions.
Previous experience
performing predictive analytics
for NASA and USAF at The
Aerospace Corporation.
Ph.D. in Mechanical
Engineering from Pennsylvania
State University.
Yoni Malchi
Engagement
Manager
18 years of analytics and
software development
experience. Expertise in
financial services, healthcare,
insurance, retail and marketing
science. Prior analytics
development experience at
Opera Solutions, FICO and J.D.
Power and Associates.
Ph.D. in Physics from Stanford
University.
Jason Lu
Chief Scientist
Over 7 Years of management
consulting and entrepreneurial
experience. Expertise in
financial services, travel, and
retail sectors across US and
Europe. Led Big Data strategy
and analytical engagements at
Opera Solutions.
MSci in Astrophysics from the
University of Cambridge.
Jamie Milne
Engagement
Manager
Over 8 years of experience in
analytics consulting and
delivery management. Ran
engagements in wealth
management, corporate
security, marketing, education
and transportation at Opera
Solutions and IBM Global
Business Services.
BS in Mathematics from
Georgetown University.
Chris Infanti
Engagement
Manager
Over 20 years of experience
in enterprise datacenter,
building innovative solutions
in Big Data, storage, HPC,
virtualization, data migration
and enterprise applications.
Formerly lead architect for
NetApp's Big Data solutions,
and led the development
of the FlexPod select
solutions.
B.S. in Electrical Engineering.
Prem Jain
Principal
Architect
Volume, Variety and Velocity of Data are Exploding
The production of data is expanding at an astonishing rate. Drivers include the switch from analog to
digital technologies and the creation of structured and unstructured data by individuals and companies
via social media and the Web
• Every 60 Seconds:
- 98,000+ tweets
- 695,000 status updates
- 11 million instant messages
- 698,445 Google searches
- 168 million+ emails sent
- 1,820TB of data created
- 217 new mobile web users
• The need to process more data
faster to respond to dynamic
business trends has brought new
requirements for database
architectures
• We believe the industry stands at
the cusp of the most significant
revolution in database and,
therefore, application architectures
in the past 20 years.
Velocity, Variety, Volume
[Chart: Enterprise Managed Data vs. Enterprise Created Data, in ZB, 2010 to 2020]
[Chart: Unstructured vs. structured data storage, in EB, 2009 to 2014]
Source: IDC, Gartner, EMC, Worldwide File-Based Storage 2010-2014 Forecast
Vendor Landscape Is Crowded and Growing
Data Sources
& Capture
IT
Infrastructure
Data Management
& Integration
Analytics Platforms
and Solutions
Analytics Services and
Support
Data Vendors Infrastructure Vendors
Open Data Platforms
Proprietary Data Platforms
Extended infrastructure +
data platforms
Systems
Integrators
Specialized End-to-End Solutions
Analytics Service Providers
Vertical Analytics Solutions
Key Big Data Technologies
Hadoop: Distributed File System and Processing Language
Characteristics
• Parallel storage/processing
• Flexible programming model
• Horizontal scaling
• Batch processing
Enablement / Uses
• Pre-processing of data for analytics
• ETL for transforming unstructured data to structured
• Data summarization
NoSQL: Non-relational Key-Value Database
Characteristics
• Fast read/write
• Real time query
• Horizontal scaling
• Simple programming model
• Dynamic schema
Enablement / Uses
• Real-time ingest
• Rapid retrieval
• Input to MapReduce
Columnar: Column-Oriented Analytics Database
Characteristics
• Relational
• Efficient compression
• Optimized for fast read of many/all records
Enablement / Uses
• On-Line Analytics Processing (OLAP)
• Data storage and retrieval for advanced analytics
In-Memory: In-Memory Database and Processing
Characteristics
• Relational
• Random Access
• Extremely Fast
Enablement / Uses
• Complex Event Processing
• Real Time Analytics
• Potential to use a common database for transactions and analytics
Legend: Foundational, Emerging
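The batch, parallel programming model listed for Hadoop can be illustrated with a minimal, single-process sketch of the map, shuffle and reduce phases. This is plain Python rather than the Hadoop API; the function names and the sample log records are hypothetical and only serve to show the shape of a MapReduce job.

```python
from collections import defaultdict


def map_phase(record):
    """Emit a (word, 1) pair for every word in one input record."""
    return [(word.lower(), 1) for word in record.split()]


def shuffle(pairs):
    """Group values by key, as a framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Combine all values seen for one key into a single count."""
    return key, sum(values)


def word_count(records):
    """Run map, shuffle and reduce in sequence on a list of text records."""
    pairs = [pair for record in records for pair in map_phase(record)]
    return dict(reduce_phase(key, values) for key, values in shuffle(pairs).items())


# Hypothetical server-log style records, for illustration only.
logs = ["error disk full", "warning disk slow", "error network down"]
counts = word_count(logs)
```

In a real cluster the map and reduce functions would run on many nodes against blocks of a distributed file system; the sketch keeps only the data flow.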
The Big Data Software Stack
The big data ecosystem includes open source and proprietary distributions that span the stack from
ingest through analytics
JobFlow
USER/MACHINE WORKFLOW
Enterprise Structured Enterprise Unstructured 3rd Party Web/ Unstructured
Flexible interfaces:
TRANSFORM
ANALYTICS
DATABASE
ANALYTICS
ACCESS/
QUERIES
INGEST
FILE SYSTEM/
DATABASE
MANAGEMENT
Columnar
In Memory
Parallel RDBMS
EMC/PIVOTAL HD /
GREENPLUM
HP/VERTICA/CLOUDERA
ORACLE BIG DATA
EXADATA/EXALYTICS
IBM INFOSPHERE
BIGINSIGHTS
SAP HANA
TERRACOTTA BIGMEMORY
ZOOKEEPER
CLOUDERA
HORTONWORKS
MAPR
PIVOTALHD
HADOOP
CASSANDRA
HBASE
MONGODB
TERADATA
NETEZZA
GREENPLUM
VERTICA
OLAP
Natural Language
Custom Analytics
Custom APIs
SQL
OPEN SOURCE
COMMERCIAL
OPEN SOURCE
Fast,
Scalable
Provisioning
Maintenance
Flexible,
Compressed,
Fast Read
Optimized
for high vol
reads
Interfaces to
accept data
Real Time
& Batch
HDFS
NoSQL
- Document
- Key-Value
- Wide Column
SQL
PIG
HIVE
R
PYTHON
SAS
SPSS
Batch
Streaming
SQOOP
FLUME
SPLUNK
TALEND
LAYER PROPERTIES OPTIONS EXAMPLES OF PRODUCTS INTEGRATED OFFERINGS
MapReduce HADOOP
Parallel,
Distributed
ODS
Data
Warehouse
Call
Center
Server
Logs
Financial Demographic
OOZIE
DATA
ACQUIRE
ORGANIZE
ANALYZE
DECIDE
SOLUTIONS
MICROSTRATEGY
BUSINESS OBJECTS
COGNOS
ORACLE OBIEE PLUS
Technology: Expanding the Traditional Stack
Big Data requires a technology stack that leverages existing infrastructure and introduces new
technology for distributed parallel processing
Queries (SQL)
Relational Databases
Monolithic Hardware
(few CPUs and network
computers)
“Shared Disk/Memory”
Architecture
(centralized processing)
Direct Record Access or Queries
Monolithic Hardware
(few CPUs and network
computers)
“Shared Disk/Memory”
Architecture
(centralized processing)
NoSQL
Database
Parallel
Relational
Database
Distributed
File
System
High-Performance
Traditional
Relational
Database
MapReduce Programs
Distributed Hardware
(multicore CPUs, multiple computers
connected via high-performance network)
“Shared Nothing” Architecture
(distributed parallel processing)
INTERFACE
DATABASE/
DISTRIBUTED
PROCESSING
FRAMEWORK
HARDWARE
TRADITIONAL RELATIONAL
DATABASE STACK
STACK FOR THE NEW DATA
FOUNDATION
Source: IDC, CSC, Gartner
Business
Need
Class of
Analytics
Analytics: Translating Business Needs to Math
Regardless of industry, many use cases translate into a limited class of “math problems” that big-data
platforms (unlike transactional platforms) are optimized to solve at scale
Method
Analytics
Ready Stack
Hardware & Software
• Parallel
• Distributed
• Shared Nothing
• Columnar
• NoSQL
• In-Memory
• ARMA
• Decision Trees
• Genetic Algorithms
• Graph Theory
• Kalman Filter
• KNN
• Linear Regression
• Logistic Regression
• Matrix Factorization
• Monte Carlo
• Neural Networks
• Sorting
• Survival Time Analysis
• Visualization
• Regression
• Classification
• Clustering
• Forecasting
• Optimization
• Simulation
• Sparse Data Inference
• Anomaly Detection
• Natural Language
Processing
• Intelligent Data
Design
• Recommendation
• Risk Scoring
• Pricing
• Capacity Planning
• Cost Reduction
• Matching
• Retrieval
Defining The Business Opportunity Is The Starting Point
The power of “Big Data” lies in bringing together data in a timely fashion from sources within and external
to the enterprise - structured and unstructured - to create a complete view of critical business issues,
therefore enabling advanced analytics to unlock key insights that drive significant business value
Outcome
Analytics
Data
Technology
Clearly defined use cases with the potential to deliver
significant value by distilling vast data into new, previously
unknowable intelligence
Advanced machine learning techniques to analyze
data and mine for insights to drive critical business
decisions
Structured or unstructured, internal or
external, requiring new methods of
storage/integration
Emerging/new technology stacks
using scalable, distributed
architectures
Telematics is Transforming Auto Insurance
Big Data Use Case
Combine driving behavior data with actuarial
data to create individualized risk models that
more accurately predict claims losses, enabling
risk-adjusted pricing to gain market
share and increase margins
Business Imperative
To gain profitable market share, insurance
companies need to offer the lowest “risk
adjusted” pricing possible to consumers
Methods
• KNN
• Linear Regression
• SVD
Class of Analytics
• Regression
• Clustering
• Anomaly Detection
• Sensors to capture
routes, miles driven,
time of day, braking
patterns, driving speed
• Geospatial maps
tied to database
layers
Science & Data
HDFS
MapReduce
NoSQL
Data W/H
In database
Analytics
Data Marts
Technology
Data
Case Study: Insurance
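KNN is one of the methods listed for the telematics use case. A minimal sketch of how it could turn driving behavior into an individualized loss estimate follows, assuming three hypothetical features per driver (weekly miles in hundreds, hard-braking events per 100 miles, share of night driving) and invented claims figures. None of these values come from the case study.

```python
import math


def knn_predict(train, query, k=3):
    """Average the target of the k nearest training points (Euclidean distance)."""
    by_distance = sorted(train, key=lambda row: math.dist(row[0], query))
    nearest = by_distance[:k]
    return sum(target for _, target in nearest) / len(nearest)


# Hypothetical drivers: (features, annual claims loss in dollars).
drivers = [
    ((1.0, 0.5, 0.05), 200.0),
    ((1.2, 0.8, 0.10), 250.0),
    ((3.0, 4.0, 0.40), 1200.0),
    ((2.8, 3.5, 0.35), 1100.0),
    ((0.8, 0.4, 0.02), 180.0),
    ((3.2, 4.2, 0.45), 1300.0),
]

# Expected loss for a cautious and an aggressive driving profile.
cautious = knn_predict(drivers, (1.0, 0.6, 0.05))
aggressive = knn_predict(drivers, (3.0, 3.9, 0.40))
```

A production model would scale the features and blend these estimates with actuarial data; the sketch only shows the nearest-neighbor step.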
Predictive Maintenance
FTP over
MESH
Data Logger
Data Logger
• One per truck
• (Logs, Sensors, OEM
Alarms, VIMS Service
Port)
Equipment
Maintenance
Dispatch &
Operator
Fuel, Oil
Analysis, etc.
Hours
Stratifying Alarms
1 Urgent Component Problem
2 Critical Sensor Problem
3 Important/Not Urgent Component/Sensor Problem
4 Not Important Component or Sensor Problem
5 Noise - Ignore
Data Logger
Data Driven Preventative Maintenance
Data/Analytics driven timing for preventative maintenance (e.g., oil changes) on individual trucks
Major Component Failure Model(s)
1 Urgent Component Problems, e.g., Engine, Transmission, Differentials, Torque Converters, Final Drives
Project Scope
• 252 Trucks – 200
sensors per truck
• 7 Mine sites
• 10,000
readings/second
Data Integration
• Integrating 15+ siloed data sources
in multiple file formats
• 10 Terabytes of data
• 3 year historical data ecosystem
Business Impact: Higher equipment up-time; reduced critical component failure; better
preventative maintenance and increased labor productivity
Case Study: Mining
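The five alarm tiers in the mining case can be sketched as a simple rule-based classifier. The slide gives only the tier names, so the `Alarm` fields and the decision rules below are assumptions made for illustration, not the models used in the project.

```python
from dataclasses import dataclass


@dataclass
class Alarm:
    source: str            # "component" or "sensor" (hypothetical field)
    important: bool        # does the alarm matter for equipment health?
    urgent: bool           # does it need attention now?
    is_noise: bool = False


def stratify(alarm):
    """Map an alarm to tiers 1-5 as named on the slide (rules are assumed)."""
    if alarm.is_noise:
        return 5  # Noise - Ignore
    if not alarm.important:
        return 4  # Not Important Component or Sensor Problem
    if not alarm.urgent:
        return 3  # Important/Not Urgent Component/Sensor Problem
    # Urgent and important: split by where the problem sits.
    return 1 if alarm.source == "component" else 2


tiers = [
    stratify(Alarm("component", important=True, urgent=True)),
    stratify(Alarm("sensor", important=True, urgent=True)),
    stratify(Alarm("sensor", important=True, urgent=False)),
    stratify(Alarm("component", important=False, urgent=False)),
    stratify(Alarm("sensor", important=False, urgent=False, is_noise=True)),
]
```

In practice the `important` and `urgent` flags would themselves come from failure models trained on the logged sensor data.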
Data Warehouse Augmentation: Value Proposition
Augmenting the Data Warehouse with a less expensive Hadoop system will allow companies to free up valuable
space on their DW systems to run faster queries and analysis, whilst storing large volumes of their data universe
WWT Hadoop Appliance
Traditional Data
Warehouse
Full Data Universe
CRM Social
Media
Billing
Web logs
Payments
Scheduling
Cold Data Warm Data
Hot
Data
1. A significant amount of data is thrown out during the ETL process that may be valuable in the future
2. About 50% of data that is brought into a typical Data Warehouse system is rarely accessed: Cold Data
3. About 80% of the queries and reporting performed on Hot Data do not need to be at DW speeds
Traditional Data Warehouse
Full Data Universe
CRM Social
Media
Billing
Web logs
Payments
Scheduling
Cold Data
Warm
Data
1. Utilize additional Hadoop-based storage to store full data universe
− Files can be stored in natural format
2. Store Cold Data in Hadoop, taking advantage of lower cost per TB
− Teradata: $17K
− Hadoop: $2K
3. Continue to take advantage of DW agility and speed in real-time analysis and querying
Warm
Data
Hot
Data
Potential jumping-off point for Big Data Business Impact project
CURRENT PROPOSED
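The storage economics of the augmentation can be worked through with the per-TB figures and the roughly 50% cold-data share quoted on the slide. The 100 TB warehouse size is a hypothetical input, and the slide does not say whether the per-TB costs are one-time or annual, so the result is only a rough comparison.

```python
TERADATA_COST_PER_TB = 17_000  # dollars per TB, from the slide
HADOOP_COST_PER_TB = 2_000     # dollars per TB, from the slide
COLD_DATA_SHARE = 0.50         # "about 50%" of warehouse data, from the slide


def storage_cost(total_tb, cold_share=COLD_DATA_SHARE, offload=True):
    """Cost of storing total_tb, optionally moving the cold share to Hadoop."""
    cold_tb = total_tb * cold_share if offload else 0.0
    warm_hot_tb = total_tb - cold_tb
    return warm_hot_tb * TERADATA_COST_PER_TB + cold_tb * HADOOP_COST_PER_TB


warehouse_tb = 100  # hypothetical warehouse size
current = storage_cost(warehouse_tb, offload=False)
proposed = storage_cost(warehouse_tb)
savings = current - proposed
```

With these inputs the offload cuts storage cost from $1.7M to $0.95M; the saving scales linearly with warehouse size and cold-data share.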
Integrating Many Data Sources To Provide Lift
Purchase
History
Service
History
Web
Data
Campaign
Metadata
Destination
Word clouds
Partner
Hotels
Profiled 100+m
transactions for
millions of customers
Linked data for
millions of customer
interactions and
service records
Analyzed billions of
page-views for
behavioral indicators
Extracted meaning
from tens of
thousands of email
campaigns
Mapped destinations
to key “feature tags”
which explain
selection
Geotagged tens of
thousands of partner
hotels by
understanding free
text description
Case Study: Global Airline
[Figure: Customer Travel Profile (ID = xxxx), a timeline from Nov 2010 to Sept 2011 covering Flight, Hotel, Car Rental, Holiday and Experience activity]
[Figure: Lift chart of Uptake % against % Offered, both axes 0% to 100%]
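One common way to read a chart of uptake against percent offered is as a cumulative gains curve: customers are ranked by model score, and the curve shows what share of all uptake is captured when only the top-scoring fraction receives the offer. The sketch below assumes that reading; the scores and responses are invented.

```python
def gains_curve(scores, took_offer):
    """Return (share offered, share of total uptake captured), best scores first."""
    ranked = sorted(zip(scores, took_offer), key=lambda pair: pair[0], reverse=True)
    total_uptake = sum(took_offer)
    points, captured = [], 0
    for i, (_, took) in enumerate(ranked, start=1):
        captured += took
        points.append((i / len(ranked), captured / total_uptake))
    return points


# Hypothetical model scores and observed responses (1 = customer took the offer).
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
took_offer = [1, 1, 0, 1, 0, 0, 0, 1, 0, 0]

curve = gains_curve(scores, took_offer)
offered, uptake = curve[3]      # offering to the top 4 of 10 customers
lift_at_40 = uptake / offered   # uptake captured relative to random targeting
```

A lift above 1 at a given point means the model targets better than offering to the same number of customers at random.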
Typically social media tools focus on monitoring past/present activity. Predictive analytics allows users to identify
important threads and intervene early, shifting the focus to future activity
• Details on particular themes or attributes
• Forecasts trends and provides a mechanism to intervene
in attributes that are going viral
• Word cloud shows ongoing buzz and sentiment
• Tabular view shows emerging themes and
sentiment, virality score and recommended
time-window for action
Social Media Analytics
Case Study: Consumer Electronics
Curriculum
Management
Engine
Curriculum
Management Engine
We designed a recommendation engine that generates a dynamic set of recommendations on a daily basis
(over 1MM/day, from sales force handhelds, website, call centers) that learns and adapts to increase its
ability to change behaviors over time through a Curriculum Management Engine
Plan for Smith Household:
Total Wallet = $600
Aspiration: Achieve 60% share
of wallet up from 40%
How:
• Habituate Pizza and Ice
Cream and Increase
Frequency
• Move Into Dinner Entrees &
Sides
• Move Into Higher Margin
Breakfast Entrees
• Increase Frequency of
Purchases
VISIT #1:
1. Haven’t Bought In A While:
2. Others On My Route Like:
3. Would You Like Another?:
4. Just for You -- $1.00 Off
Household
Response
VISIT #2
1. Would You Like Another?
2. Others On My Route Like:
3. No pizza; not yet consumed
4. Just For You
Nature of
Recommendations
• Individuated Offers –
Especially for You
• Cross-Sell/ Up-sell –
Based on latent needs
• Reminders – Haven’t
bought in a while
• Trials – Never tried but
similar people like it
• Promotions – Being a
loyal customer
Recommendations for Grocery Retailer’s
Customers Delivered $100 million p.a. in EBIT
Case Study: Food Grocer
Using Internal and External Data with Advanced
Analytics for Site Selection
• Comprehensive performance data
– Front store / pharmacy sales
– Customer and patient demographics
– Local area demographic
• Web Scraping and Text Analytics
– Neighborhood business profile
– Competitor performance
– Healthcare alternatives (ER, Urgent Care, PCPs)
• Non-linear, multivariate predictive models
– Linear/Logistic Regression
– Decision Trees (CART)
– Random Forest
– Gradient Boosting Machine
– Neural Networks
• Incorporation of all data, including variables
usually viewed as “qualitative”
Model Performance: Predicted Patient Volume vs. Actual Patient Volume (R = 0.75)
Impact: Potential Volume
Original Expansion Plan: 0.71
Model Recommendation: 0.83 (+17%)
Case Study: Retail Pharmacy
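The +17% impact figure follows from the two potential-volume values on the slide (0.71 and 0.83). The sketch below reproduces that arithmetic and shows how a correlation between predicted and actual volume could be computed. Treating the slide's R as a Pearson correlation is an assumption, and the predicted and actual series are invented.

```python
import math


def relative_uplift(baseline, improved):
    """Fractional improvement of improved over baseline."""
    return (improved - baseline) / baseline


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Figures from the slide: original expansion plan vs. model recommendation.
uplift = relative_uplift(0.71, 0.83)

# Hypothetical predicted vs. actual patient volumes for five sites.
predicted = [0.4, 0.8, 1.0, 1.2, 1.6]
actual = [0.5, 0.7, 1.1, 1.0, 1.7]
r = pearson_r(predicted, actual)
```

Rounded to a whole percent, the uplift matches the 17% shown on the slide.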
Designing Appropriate Reference Architectures
A reference architecture is a specific set of software and hardware components that together comprise
an Analytics-Ready Infrastructure
USER/MACHINE WORKFLOW
Visualization Forecasts Pricing Reports Alerts Scores Offers
NETWORK
LAYER DESCRIPTION EXAMPLES OF PRODUCTS
DATA
FILE SYSTEM/
DATABASES
Enterprise Structured Enterprise Unstructured 3rd Party Web/ Unstructured
ODS
Data
Warehouse
Call
Center
Server
Logs
Financial Demographic
CUSTOM ANALYTICS
ANALYTICS TOOLS
ANALYTICS DATABASES
• Flexible, Compressed, Fast Read
• Columnar, In Memory, Parallel
RDBMS
• High-level programming languages
with packaged analytical modules
• Can be either general purpose or
industry/function specific
• Services
• Advanced models
• Parallel, Distributed
• HDFS or NoSQL
• Interfaces to accept fast and
varied data
“Analytics-Ready
Infrastructure”
COMPUTE
STORAGE
INGEST
• 10Ge, low latency
• Commodity, rack mount
• Purpose built servers
• Internal JBOD, Direct Attached,
Network
SAS R PYTHON SPSS
VERTICA GREENPLUM TERADATA NETEZZA
EXADATA SAP HANA
CLOUDERA MAPR HORTONWORKS PIVOTALHD
MARKLOGIC DATATACTICS ORACLE NOSQL
FLUME SQOOP TALEND VELOCIDATA
UCS-C240 UCS-C460 HP 380P HP SL4540
UCS 6200 NEXUS 2200 HP 5800 DELL FORCE10
JBOD SATA JBOD SSD E-SERIES ISILON
Deploying new technologies
and combining with existing
architecture
• How do we create an effective
integrated Big Data stack?
• What new technologies do we
need and how do they fit
together?
Organizing for success
• Where does Big Data fit?
• What belongs in the BUs vs.
centralized?
• Who is responsible for data
integrity?
• Where do we find the critical
resources needed to deliver
Big Data solutions?
Navigating a crowded and
evolving vendor landscape
• How do we separate
marketing hype from reality?
• Who should we use? Who can
we trust?
Defining the business value
proposition
• What problem/opportunity
are we pursuing?
• What is the value that can be
created?
Four Major Big Data Challenges Facing Most Companies
In our meetings with customers, four issues are consistently brought up as major challenges related to
creating a big data capability that can effectively support the business units
Key
Big Data
Challenges
Dual Approach to Delivering Big Data Solutions
WWT offers customers both strategic and tactical approaches to derive value from the application of Big Data
analytics and technology
• Strategic Roadmap
− Big Data Strategy
− Use Case Design
• Use Case PoC
− Analytics Development
− Workflow Integration
• Data Warehouse Augmentation
− ETL Offload
− Data Lake Creation
• SAP HANA Implementation
• Big Data Stack Build / Optimization
• Production Support & Sustainment
BIG DATA BUSINESS
IMPACT
Extract value from data to drive
multiple Use Cases
BIG DATA TECHNOLOGY
OPTIMIZATION
Accomplish data tasks, faster, cheaper,
better
EXAMPLE SCALE OUT HARDWARE
• Multiple Nexus 6000/ 7000
Series switches
• 5 – 50 Big Data racks
• Cisco SAP HANA scale-out
(e.g. 8-16 UCS-B200)
• Software scale-out
EXAMPLE STARTER KIT:
Cisco SAP HANA Medium Appliance (2 UCS-C460)
• Big Data Solution Stack:
o 2 UCS 6296PP
o Each Big Data rack:
▪ 2 Nexus 2232PP
▪ 8-16 HP DL380 or SL4540, UCS-C240, etc.
o Initially: 1 – 2 racks
o Software: MapR, E.
Service and Solution Offerings
• Develop a roadmap for
implementing Big Data
- Use case exploration
- Data Governance,
Infrastructure and
Analytics ownership
• Define high impact use
cases
• Design and test
appropriate reference
architectures
Plan Design Pilot Scale
WWT
Offerings
Indicative
Infra-
structure
• Create detailed
description of selected
pilot use cases
- Analytics
- Workflow
integration
• Test various reference
architectures
• “Stand-up” reference
architecture
• Design the pilot
- Success criteria
- Timeline
- Scope
• Identify and prepare
data
• Build analytical models
• Design workflow
• Implement, manage and
monitor
Analytics-Ready Infrastructure Solution Development
• Implement design
changes from pilot
learnings
• Invest in software
development as
necessary to improve UI
• Prepare ETL process for
scale
• Build out infrastructure
as required to support
rollout
1. Strategic Roadmap
• Use case definition
• Organizational alignment
• Big Data Architecture high
level design
2. Big Data Stack Build
• Detailed design Big Data
architecture and BOM
• Procure, configure and
deploy Big Data stack
3. Proof of Concept
• POC design
• Analytical models
• Customer data loaded,
processed and analyzed
4. Production Support
• Operationalizing POC
• Infrastructure Sustainment
• Training
• Ongoing support
Advanced Technology Center (ATC)
COLLABORATION ENTERPRISE NETWORKS SECURITY DATA CENTER
A highly collaborative ecosystem to design,
build, educate, demo & deploy advanced
technology solutions for our customers &
partners
Hands-on Access to over $50M in Equipment
• Point Product Demos
• Tech. Training Sessions
• EBCs / ATC Tours
• Tech Days Demos
• Customer Proof of Concepts
• Reference Arch. Dev.
• Product Training / PS
• Version Upgrade Testing
• Strategic Ref. Arch. Demo
(RAD)
• Product Comparison –Func.
• Product Comparison – Perf.
• Customer Access to Lab
• Customer Environment
• Workshop Demos
• Early Field Trials / Beta Code
• Certification
• Next Generation
Networking
• Nexus (7K, 5K, 3K & 2K)
• Virtual Networking
(Nexus 1000v)
• OTV, LISP, Fabric Path
• Layer 2 Extension
• DR/BC Networking
• BYOD (Bring Your Own
Device) & Secure
Mobility
• Jukebox
• ISE & RSA
• ASA 1000v
• VSG (Virtual Security
Gateway)
• Cyber Security Solutions
• Unified
Communications
• Tandberg Video
• VXI (View &
XenDesktop)
• WebEx, Call Center &
Collaboration Solutions
• Phones, Backpacks &
Soft, Phone Clients
• Telepresence &
Business Video
• Vblock, FlexPod &
CloudSystem Matrix
• EMC & NetApp Storage
• vSphere / XenServer
• vCloud Director
• VDI (View /
XenDesktop)
• Cisco CIAC & BMC CLM
• EMC’s UIM & Cloupia
• FAST MDC (Mobile Data
Center) Solutions
ATC Big Data Functions: Overview
Three functions of the ATC have been identified, which will support Sales (and other) processes
Function Description Usage
Proof of
Concept
• Test customer solutions prior to full onsite
implementation, e.g.
− Run Use Case analytical models and
architectures on Big Data machines
− Create Big Data hardware/software stack,
potentially with client data
• Mid-term project basis, to
provide an environment
for customer, based on a
running engagement
Technology
Comparison
• Compare Big Data solutions to provide insight
into strengths and weaknesses of each
• Run “bake-offs” to gauge how well a full
solution can be solved using certain
components
• To test generic POCs, may
be customer-driven
• Inform Big Data Team on
best solutions
Field Demo • Showcase Big Data capabilities by hosting
demos of WWT PoCs and analysis
− Run Use Case analytical models and
architectures on Big Data machines
• Tool for sales calls and
EBCs
Big Data Environment Set-up: ATC Reference Architectures
Four analytics-ready infrastructure stacks have been developed in the ATC to showcase Big Data technologies
DATA
Enterprise Structured Enterprise Unstructured 3rd Party Web/ Unstructured
ODS
Data
Warehouse
Call
Center
Server
Logs
Financial Demographic
STORAGE
REFERENCE
ARCHITECTURE 1
NETWORK
FILE SYSTEM/
DATABASES
ANALYTICS TOOLS
ANALYTICS
DATABASES
COMPUTE
INGEST
REFERENCE
ARCHITECTURE 2
HP Internal Local
Storage
UCS – NetApp Direct
Attached Storage
UCS 6296UP NEXUS 2232PP
UCS-C220M3
REFERENCE
ARCHITECTURE 3
UCS – Isilon Network
Storage
UCS 6296 NEXUS 2200
HAWQ HBASE
PIVOTALHD
UCS-C240
MICROSTRATEGY MICROSTRATEGY
REFERENCE
ARCHITECTURE 4
SAP HANA
HITACHI
UCS B BLADES
JBOD SATA
HORTON
IMPALA
NEXUS 2200
HP DL 380
HBASE
R PYTHON R PYTHON R PYTHON
HITACHI NETAPP E5460 ISILON
VELOCIDATA VELOCIDATA VELOCIDATA
MAPR
CLOUDERA CLOUDERA
GEMFIRE
IMPALA HBASE
JAVA JAVA JAVA
In Process Current In Process
SPLUNK SPLUNK SPLUNK
HORTON MAPR HORTON MAPR
CLOUDERA
SAP HANA
VELOCIDATA SPLUNK
First Step: Big Data Workshop
29

More Related Content

What's hot

Big Data Overview 2013-2014
Big Data Overview 2013-2014Big Data Overview 2013-2014
Big Data Overview 2013-2014KMS Technology
 
Big Data & Hadoop Introduction
Big Data & Hadoop IntroductionBig Data & Hadoop Introduction
Big Data & Hadoop IntroductionJayant Mukherjee
 
Introduction to Cloud computing and Big Data-Hadoop
Introduction to Cloud computing and  Big Data-HadoopIntroduction to Cloud computing and  Big Data-Hadoop
Introduction to Cloud computing and Big Data-HadoopNagarjuna D.N
 
Big Data
Big DataBig Data
Big DataNGDATA
 
Big Data Final Presentation
Big Data Final PresentationBig Data Final Presentation
Big Data Final Presentation17aroumougamh
 
Big Data & the Cloud
Big Data & the CloudBig Data & the Cloud
Big Data & the CloudDATAVERSITY
 
Big Data - An Overview
Big Data -  An OverviewBig Data -  An Overview
Big Data - An OverviewArvind Kalyan
 
Big Data: An Overview
Big Data: An OverviewBig Data: An Overview
Big Data: An OverviewC. Scyphers
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big DataHaluan Irsad
 
Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research reportJULIO GONZALEZ SANZ
 
Big data analytics, survey r.nabati
Big data analytics, survey r.nabatiBig data analytics, survey r.nabati
Big data analytics, survey r.nabatinabati
 
Big data Analytics
Big data AnalyticsBig data Analytics
Big data AnalyticsTUSHAR GARG
 
introduction to big data frameworks
introduction to big data frameworksintroduction to big data frameworks
introduction to big data frameworksAmal Targhi
 
Core concepts and Key technologies - Big Data Analytics
Core concepts and Key technologies - Big Data AnalyticsCore concepts and Key technologies - Big Data Analytics
Core concepts and Key technologies - Big Data AnalyticsKaniska Mandal
 
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...i_scienceEU
 
Big data analysis using map/reduce
Big data analysis using map/reduceBig data analysis using map/reduce
Big data analysis using map/reduceRenuSuren
 

What's hot (20)

Big Data Overview 2013-2014
Big Data Overview 2013-2014Big Data Overview 2013-2014
Big Data Overview 2013-2014
 
Big Data & Hadoop Introduction
Big Data & Hadoop IntroductionBig Data & Hadoop Introduction
Big Data & Hadoop Introduction
 
Introduction to Cloud computing and Big Data-Hadoop
Introduction to Cloud computing and  Big Data-HadoopIntroduction to Cloud computing and  Big Data-Hadoop
Introduction to Cloud computing and Big Data-Hadoop
 
Big Data
Big DataBig Data
Big Data
 
Big Data Final Presentation
Big Data Final PresentationBig Data Final Presentation
Big Data Final Presentation
 
Big Data & the Cloud
Big Data & the CloudBig Data & the Cloud
Big Data & the Cloud
 
Big Data - An Overview
Big Data -  An OverviewBig Data -  An Overview
Big Data - An Overview
 
Big Data: An Overview
Big Data: An OverviewBig Data: An Overview
Big Data: An Overview
 
Introduction to Big Data
Introduction to Big DataIntroduction to Big Data
Introduction to Big Data
 
Motivation for big data
Motivation for big dataMotivation for big data
Motivation for big data
 
Big data analytics, research report
Big data analytics, research reportBig data analytics, research report
Big data analytics, research report
 
Big data abstract
Big data abstractBig data abstract
Big data abstract
 
Big data analytics, survey r.nabati
Big data analytics, survey r.nabatiBig data analytics, survey r.nabati
Big data analytics, survey r.nabati
 
Big data Analytics
Big data AnalyticsBig data Analytics
Big data Analytics
 
introduction to big data frameworks
introduction to big data frameworksintroduction to big data frameworks
introduction to big data frameworks
 
Core concepts and Key technologies - Big Data Analytics
Core concepts and Key technologies - Big Data AnalyticsCore concepts and Key technologies - Big Data Analytics
Core concepts and Key technologies - Big Data Analytics
 
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...
Kave Salamatian, Universite de Savoie and Eiko Yoneki, University of Cambridg...
 
Big data analysis using map/reduce
Big data analysis using map/reduceBig data analysis using map/reduce
Big data analysis using map/reduce
 
Big_data_ppt
Big_data_ppt Big_data_ppt
Big_data_ppt
 
Big data
Big dataBig data
Big data
 

Cisco event 6 05 2014v3 wwt only

  • 1. Leveraging Big Data to Create Value, June 5th, 2014
  • 2. Agenda: 12-12:30pm Registration and Lunch; 12:30-12:40pm Welcome and Introductions -- Art Hansen; 12:40-1:45pm Keynote Presentation -- Chris Ward, Brian Vaughan, James Bigger; 1:45-2:20pm Hadoop in the Real World by MapR -- David Feldman; 2:20-2:30pm Break; 2:30-2:45pm Cisco Unified Computing System Rack Mount Servers for Big Data -- Wade Ison; 2:45-3:30pm Big Data Brainstorm Breakouts; 3:30-4:30pm Refreshments, Q&A Session, and Conclusion; 4:30pm Raffle Drawing for iPad
  • 3. Big Data as a Competitive Strategy Harvard’s Michael Porter: 1. Cost Leadership Strategy (Wal-Mart) 2. Differentiation Strategy (Southwest) 3. Innovation Strategy (Apple) 4. Operational Effectiveness Strategy (UPS) 5. Technology-based Competitive Strategy
  • 4. What do we have that makes us different? • Custom Apps • Process (Workflow) • Big Data • People • Culture
  • 5. Big Data’s Financial Benefits: Gartner predicts that “Big Data will deliver transformational benefits to enterprises within 2 to 5 years, and by 2015 will enable enterprises adopting this technology to outperform competitors by 20% in every available financial metric.”
  • 6. Goals for Today: • Understand Big Data Major Building Blocks • Learn the major patterns • Understand how to introduce Big Data into the enterprise in practical ways • Identify a solid use case for Big Data. Tips for Winning: • High ROI in less than a year • Must be applied to things that are important to the business • Use of multiple patterns encouraged • New ways of correlating data that was formerly not correlated • Remember Big Data patterns usually require scale
  • 7. WWT Big Data Leadership Team • James Bigger, Principal Consultant: 20 years of management consulting and entrepreneurial experience. Expertise in financial services, insurance and telecom. Prior consulting experience with Opera Solutions and A. T. Kearney. Ph.D. in Physics from Oxford University. • Brian Vaughan, Principal Consultant: 15 years of management consulting, analytics and software experience. Expertise in healthcare and insurance. Prior experience with Opera Solutions, Mitchell Madison Group and Broadlane. Ph.D. in Physics from Stanford University. • Chris Ward, Principal Consultant: 20 years in management consulting and executive leadership. Expertise in retail, marketing, hospitality & financial services. Prior consulting experience with Opera Solutions and The Boston Consulting Group. BA from Princeton University, MBA from the University of Virginia Darden School of Business. • Matt DuBell, Principal Systems Engineer: Over 20 years of experience in a range of IT and security disciplines. Responsible for deploying large, secure, Hadoop-based platforms for the U. S. Government. 10 years of international experience implementing networking and virtual data center environments. Undergraduate degree from AIU. • Yoni Malchi, Engagement Manager: Over 7 years of experience in management and analytics consulting. Led engagements in telecom at Opera Solutions. Previous experience performing predictive analytics for NASA and USAF at The Aerospace Corporation. Ph.D. in Mechanical Engineering from Pennsylvania State University. • Jason Lu, Chief Scientist: 18 years of analytics and software development experience. Expertise in financial services, healthcare, insurance, retail and marketing science. Prior analytics development experience at Opera Solutions, FICO and J.D. Power and Associates. Ph.D. in Physics from Stanford University. • Jamie Milne, Engagement Manager: Over 7 years of management consulting and entrepreneurial experience. Expertise in financial services, travel, and retail sectors across US and Europe. Led Big Data strategy and analytical engagements at Opera Solutions. MSci in Astrophysics from the University of Cambridge. • Chris Infanti, Engagement Manager: Over 8 years of experience in analytics consulting and delivery management. Ran engagements in wealth management, corporate security, marketing, education and transportation at Opera Solutions and IBM Global Business Services. BS in Mathematics from Georgetown University. • Prem Jain, Principal Architect: Over 20 years of experience in enterprise datacenter, building innovative solutions in Big Data, storage, HPC, virtualization, data migration and enterprise applications. Formerly lead architect for NetApp's Big Data solutions, and led the development of the FlexPod select solutions. B.S. in Electrical Engineering.
  • 8. Volume, Variety and Velocity of Data are Exploding. The production of data is expanding at an astonishing rate. Drivers include the switch from analog to digital technologies and the creation of structured and unstructured data by individuals and companies via social media and the Web. • Every 60 Seconds: - 98,000+ tweets - 695,000 status updates - 11 million instant messages - 698,445 Google searches - 168 million+ emails sent - 1,820TB of data created - 217 new mobile web users • The need to process more data faster to respond to dynamic business trends has brought new requirements for database architectures • We believe the industry stands at the cusp of the most significant revolution in database and, therefore, application architectures in the past 20 years. [Charts labeled Volume, Variety, Velocity: Enterprise Managed Data vs. Enterprise Created Data in ZB, 2010-2020; Unstructured vs. Structured data storage in EB, 2009-2014] Source: IDC, Gartner, EMC, Worldwide File-Based Storage 2010-2014 Forecast
  • 9. Vendor Landscape Is Crowded and Growing Data Sources & Capture IT Infrastructure Data Management & Integration Analytics Platforms and Solutions Analytics Services and Support Data Vendors Infrastructure Vendors Open Data Platforms Proprietary Data Platforms Extended infrastructure + data platforms Systems Integrators Specialized End-to-End Solutions Analytics Service Providers Vertical Analytics Solutions
  • 10. Key Big Data Technologies [axis labels: Foundational, Emerging] • Hadoop (Distributed File System and Processing Language). Characteristics: parallel storage/processing; flexible programming model; horizontal scaling; batch processing. Enablement / Uses: pre-processing of data for analytics; ETL for transforming unstructured data to structured; data summarization. • NoSQL (Non-relational Key-Value Database). Characteristics: fast read/write; real time query; horizontal scaling; simple programming model; dynamic schema. Enablement / Uses: real-time ingest; rapid retrieval; input to MapReduce. • Columnar (Column-Oriented Analytics Database). Characteristics: relational; efficient compression; optimized for fast read of many/all records. Enablement / Uses: Online Analytical Processing (OLAP); data storage and retrieval for advanced analytics. • In-Memory (In-Memory Database and Processing). Characteristics: relational; random access; extremely fast. Enablement / Uses: Complex Event Processing; Real Time Analytics; potential to use a common database for transactions and analytics.
  • 11. The Big Data Software Stack. The big data ecosystem includes open source and proprietary distributions that span the stack from ingest through analytics. [Diagram labels: JobFlow USER/MACHINE WORKFLOW Enterprise Structured Enterprise Unstructured 3rd Party Web/ Unstructured Flexible interfaces: TRANSFORM ANALYTICS DATABASE ANALYTICS ACCESS/ QUERIES INGEST FILE SYSTEM/ DATABASE MANAGEMENT Columnar In Memory Parallel RDBMS EMC/PIVOTAL HD / GREENPLUM HP/VERTICA/CLOUDERA ORACLE BIG DATA EXADATA/EXALYTICS IBM INFOSPHERE BIGINSIGHTS SAP HANA TERRACOTTA BIGMEMORY ZOOKEEPER CLOUDERA HORTONWORKS MAPR PIVOTALHD HADOOP CASSANDRA HBASE MONGODB TERADATA NETEZZA GREENPLUM VERTICA OLAP Natural Language Custom Analytics Custom APIs SQL OPEN SOURCE COMMERCIAL OPEN SOURCE Fast, Scalable Provisioning Maintenance Flexible, Compressed, Fast Read Optimized for high vol reads Interfaces to accept data Real Time & Batch HDFS NoSQL - Document - Key-Value - Wide Column SQL PIG HIVE R PYTHON SAS SPSS Batch Streaming SQOOP FLUME SPLUNK TALEND LAYER PROPERTIES OPTIONS EXAMPLES OF PRODUCTS INTEGRATED OFFERINGS MapReduce HADOOP Parallel, Distributed ODS Data Warehouse Call Center Server Logs Financial Demographic OOZIE DATA ACQUIRE ORGANIZE ANALYZE DECIDE SOLUTIONS MICROSTRATEGY BUSINESS OBJECTS COGNOS ORACLE OBIEE PLUS]
  • 12. Technology: Expanding the Traditional Stack. Big Data requires a technology stack that leverages existing infrastructure and introduces new technology for distributed parallel processing. [Diagram labels: Queries (SQL) Relational Databases Monolithic Hardware (few CPUs and network computers) “Shared Disk/Memory” Architecture (centralized processing) Direct Record Access or Queries Monolithic Hardware (few CPUs and network computers) “Shared Disk/Memory” Architecture (centralized processing) NoSQL Database Parallel Relational Database Distributed File System High-Performance Traditional Relational Database MapReduce Programs Distributed Hardware (multicore CPUs, multiple computers connected via high-performance network) “Shared Nothing” Architecture (distributed parallel processing) INTERFACE DATABASE/ DISTRIBUTED PROCESSING FRAMEWORK HARDWARE TRADITIONAL RELATIONAL DATABASE STACK STACK FOR THE NEW DATA FOUNDATION] Source: IDC, CSC, Gartner
  • 13. Analytics: Translating Business Needs to Math. Regardless of industry, many use cases translate into a limited class of “math problems” that big-data platforms (unlike transactional platforms) are optimized to solve at scale. Business Need: Recommendation • Risk Scoring • Pricing • Capacity Planning • Cost Reduction • Matching • Retrieval. Class of Analytics: Regression • Classification • Clustering • Forecasting • Optimization • Simulation • Sparse Data Inference • Anomaly Detection • Natural Language Processing • Intelligent Data Design. Method: ARMA • Decision Trees • Genetic Algorithms • Graph Theory • Kalman Filter • KNN • Linear Regression • Logistic Regression • Matrix Factorization • Monte Carlo • Neural Networks • Sorting • Survival Time Analysis • Visualization. Analytics Ready Stack (Hardware & Software): Parallel • Distributed • Shared Nothing • Columnar • NoSQL • In-Memory
  • 14. Defining The Business Opportunity Is The Starting Point. The power of “Big Data” lies in bringing together data in a timely fashion from sources within and external to the enterprise, structured and unstructured, to create a complete view of critical business issues, thereby enabling advanced analytics to unlock key insights that drive significant business value. Outcome: Clearly defined use cases with the potential to deliver significant value by distilling vast data into new, previously unknowable intelligence. Analytics: Advanced machine learning techniques to analyze data and mine for insights to drive critical business decisions. Data: Structured or unstructured, internal or external, requiring new methods of storage/integration. Technology: Emerging/new technology stacks using scalable, distributed architectures.
  • 15. Telematics is Transforming Auto Insurance [Case Study: Insurance]. Business Imperative: To gain profitable market share, insurance companies need to offer the lowest “risk adjusted” pricing possible to consumers. Big Data Use Case: Combine driving-behavior data with actuarial data to create individualized risk models that more accurately predict claims losses, enabling risk-adjusted pricing that gains market share and increases margins. Science & Data: Methods: KNN • Linear Regression • SVD; Class of Analytics: Regression • Clustering • Anomaly Detection; Data: Sensors to capture routes, miles driven, time of day, braking patterns, driving speed • Geospatial maps tied to database layers. Technology: HDFS • MapReduce • NoSQL • Data W/H • In database Analytics • Data Marts
  • 16. Predictive Maintenance [Case Study: Mining]. Data Logger: one per truck (Logs, Sensors, OEM Alarms, VIMS Service Port), FTP over MESH. [Diagram labels: Equipment Maintenance Dispatch & Operator Fuel, Oil Analysis, etc. Hours] Stratifying Alarms: 1 Urgent Component Problem; 2 Critical Sensor Problem; 3 Important/Not Urgent Component/Sensor Problem; 4 Not Important Component or Sensor Problem; 5 Noise - Ignore. Data Driven Preventative Maintenance: Data/Analytics driven timing for preventative maintenance (e.g., oil changes) on individual trucks. Major Component Failure Model(s): 1 Urgent Component Problems, e.g., Engine, Transmission, Differentials, Torque Converters, Final Drives. Project Scope: • 252 Trucks, 200 sensors per truck • 7 Mine sites • 10,000 readings/second. Data Integration: • Integrating 15+ siloed data sources in multiple file formats • 10 Terabytes of data • 3-year historical data ecosystem. Business Impact: Higher equipment up-time; reduced critical component failure; better preventative maintenance and increased labor productivity
  • 17. Data Warehouse Augmentation: Value Proposition. Augmenting the Data Warehouse with a less expensive Hadoop system will allow companies to free up valuable space on their DW systems to run faster queries and analysis, whilst storing large volumes of their data universe. CURRENT (Traditional Data Warehouse holding Cold, Warm and Hot Data; Full Data Universe: CRM, Social Media, Billing, Web logs, Payments, Scheduling): (1) A significant amount of data that may be valuable in the future is thrown out during the ETL process. (2) About 50% of the data that is brought into a typical Data Warehouse system is rarely accessed: Cold Data. (3) About 80% of the queries and reporting performed on Hot Data do not need to be at DW speeds. PROPOSED (WWT Hadoop Appliance alongside the Traditional Data Warehouse): (1) Utilize additional Hadoop-based storage to store the full data universe; files can be stored in natural format. (2) Store Cold Data in Hadoop, taking advantage of lower cost per TB (Teradata: $17K; Hadoop: $2K). (3) Continue to take advantage of DW agility and speed in real-time analysis and querying. Potential jumping-off point for a Big Data Business Impact project.
  • 18. Integrating Many Data Sources To Provide Lift (Case Study: Global Airline). Purchase History: profiled 100+m transactions for millions of customers. Service History: linked data for millions of customer interactions and service records. Web Data: analyzed billions of page-views for behavioral indicators. Campaign Metadata: extracted meaning from tens of thousands of email campaigns. Destination Word Clouds: mapped destinations to key “feature tags” which explain selection. Partner Hotels: geotagged tens of thousands of partner hotels by understanding free-text descriptions. [Figure: Customer Travel Profile (ID = xxxx), a timeline from Nov 2010 to Sept 2011 across Hotel, Experience, Flight, Car Rental and Holiday; lift chart of Uptake % against % Offered, both axes 0% to 100%.]
  • 19. Social Media Analytics (Case Study: Consumer Electronics). Typically, social media tools focus on monitoring past/present activity. Predictive analytics allows users to identify important threads and intervene early, shifting the focus to future activity. • Details on particular themes or attributes • Trend forecasts and a mechanism to intervene in attributes that are going viral • Word cloud shows ongoing buzz and sentiment • Tabular view shows emerging themes and sentiment, virality score and recommended time window for action
  • 20. Curriculum Management Engine (Case Study: Food Grocer). Recommendations for Grocery Retailer’s Customers: delivered $100 million p.a. in EBIT. We designed a recommendation engine that generates a dynamic set of recommendations on a daily basis (over 1MM/day, from sales force handhelds, website, call centers) and that learns and adapts, through a Curriculum Management Engine, to increase its ability to change behaviors over time. Plan for Smith Household: Total Wallet = $600. Aspiration: achieve 60% share of wallet, up from 40%. How: • Habituate Pizza and Ice Cream and Increase Frequency • Move Into Dinner Entrees & Sides • Move Into Higher Margin Breakfast Entrees • Increase Frequency of Purchases. Visit #1: (1) Haven’t Bought In A While (2) Others On My Route Like (3) Would You Like Another? (4) Just for You: $1.00 Off. Household Response. Visit #2: (1) Would You Like Another? (2) Others On My Route Like (3) No pizza; not yet consumed (4) Just For You. Nature of Recommendations: • Individuated Offers – Especially for You • Cross-Sell/Up-sell – Based on latent needs • Reminders – Haven’t bought in a while • Trials – Never tried but similar people like it • Promotions – Being a loyal customer
  • 21. Using Internal and External Data with Advanced Analytics for Site Selection (Case Study: Retail Pharmacy). • Comprehensive performance data – Front store / pharmacy sales – Customer and patient demographics – Local area demographics • Web Scraping and Text Analytics – Neighborhood business profile – Competitor performance – Healthcare alternatives (ER, Urgent Care, PCPs) • Non-linear, multivariate predictive models – Linear/Logistic Regression – Decision Trees (CART) – Random Forest – Gradient Boosting Machine – Neural Networks • Incorporation of all data, including variables usually viewed as “qualitative”. [Figure: Model Performance, Predicted Patient Volume against Actual Patient Volume, R = 0.75.] [Figure: Potential Volume Impact, Original Expansion Plan 0.71 against Model Recommendation 0.83, +17%.]
  • 22. Designing Appropriate Reference Architectures. A reference architecture is a specific set of software and hardware components that together comprise an Analytics-Ready Infrastructure. Layers, each with its description and examples of products: USER/MACHINE WORKFLOW: Visualization, Forecasts, Pricing, Reports, Alerts, Scores, Offers. CUSTOM ANALYTICS: services; advanced models. ANALYTICS TOOLS: high-level programming languages with packaged analytical modules; can be either general purpose or industry/function specific (SAS, R, PYTHON, SPSS). ANALYTICS DATABASES: flexible, compressed, fast read; columnar, in-memory, parallel RDBMS (VERTICA, GREENPLUM, TERADATA, NETEZZA, EXADATA, SAP HANA). FILE SYSTEM/DATABASES: parallel, distributed; HDFS or NoSQL (CLOUDERA, MAPR, HORTONWORKS, PIVOTALHD, MARKLOGIC, DATATACTICS, ORACLE NOSQL). INGEST: interfaces to accept fast and varied data (FLUME, SQOOP, TALEND, VELOCIDATA). COMPUTE: commodity, rack mount; purpose-built servers (UCS-C240, UCS-C460, HP 380P, HP SL4540). NETWORK: 10Ge, low latency (UCS 6200, NEXUS 2200, HP 5800, DELL FORCE10). STORAGE: internal JBOD, direct attached, network (JBOD SATA, JBOD SSD, E-SERIES, ISILON). DATA: Enterprise Structured, Enterprise Unstructured, 3rd Party, Web/Unstructured (ODS, Data Warehouse, Call Center, Server Logs, Financial, Demographic).
  • 23. Four Major Big Data Challenges Facing Most Companies. In our meetings with customers, four issues are consistently brought up as major challenges in creating a big data capability that can effectively support the business units. Key Big Data Challenges: Defining the business value proposition • What problem/opportunity are we pursuing? • What is the value that can be created? Deploying new technologies and combining them with existing architecture • How do we create an effective integrated Big Data stack? • What new technologies do we need and how do they fit together? Organizing for success • Where does Big Data fit? • What belongs in the BUs vs. centralized? • Who is responsible for data integrity? • Where do we find the critical resources needed to deliver Big Data solutions? Navigating a crowded and evolving vendor landscape • How do we separate marketing hype from reality? • Who should we use? Who can we trust?
  • 24. Dual Approach to Delivering Big Data Solutions. WWT offers customers both strategic and tactical approaches to derive value from the application of Big Data analytics and technology. BIG DATA BUSINESS IMPACT (extract value from data to drive multiple Use Cases): • Strategic Roadmap − Big Data Strategy − Use Case Design • Use Case PoC − Analytics Development − Workflow Integration. BIG DATA TECHNOLOGY OPTIMIZATION (accomplish data tasks faster, cheaper, better): • Data Warehouse Augmentation − ETL Offload − Data Lake Creation • SAP HANA Implementation • Big Data Stack Build / Optimization • Production Support & Sustainment
  • 25. Service and Solution Offerings, by phase (Plan, Design, Pilot, Scale). Solution Development. Plan: • Develop a roadmap for implementing Big Data - Use case exploration - Data Governance, Infrastructure and Analytics ownership • Define high impact use cases • Design and test appropriate reference architectures. Design: • Create detailed description of selected pilot use cases - Analytics - Workflow integration • Test various reference architectures • “Stand-up” reference architecture. Pilot: • Design the pilot - Success criteria - Timeline - Scope • Identify and prepare data • Build analytical models • Design workflow • Implement, manage and monitor Analytics-Ready Infrastructure. Scale: • Implement design changes from pilot learnings • Invest in software development as necessary to improve UI • Prepare ETL process for scale • Build out infrastructure as required to support rollout. WWT Offerings: 1. Strategic Roadmap • Use case definition • Organizational alignment • Big Data Architecture high level design. 2. Big Data Stack Build • Detailed design of Big Data architecture and BOM • Procure, configure and deploy Big Data stack. 3. Proof of Concept • POC design • Analytical models • Customer data loaded, processed and analyzed. 4. Production Support • Operationalizing POC • Infrastructure Sustainment • Training • Ongoing support. Indicative Infrastructure: EXAMPLE STARTER KIT: • Cisco SAP HANA Medium Appliance (2 UCS-C460) • Big Data Solution Stack: o 2 UCS 6296PP o Each Big Data rack: 2 Nexus 2232PP; 8-16 HP DL380 or SL4540, UCS-C240, etc. o Initially: 1 – 2 racks o Software: MapR, E. EXAMPLE SCALE OUT HARDWARE: • Multiple Nexus 6000/7000 Series switches • 5 – 50 Big Data racks • Cisco SAP HANA scale-out (e.g. 8-16 UCS-B200) • Software scale-out
  • 26. Advanced Technology Center (ATC). A highly collaborative ecosystem to design, build, educate, demo & deploy advanced technology solutions for our customers & partners. Hands-on Access to over $50M in Equipment. • Point Product Demos • Tech. Training Sessions • EBCs / ATC Tours • Tech Days Demos • Customer Proof of Concepts • Reference Arch. Dev. • Product Training / PS • Version Upgrade Testing • Strategic Ref. Arch. Demo (RAD) • Product Comparison – Func. • Product Comparison – Perf. • Customer Access to Lab • Customer Environment • Workshop Demos • Early Field Trials / Beta Code • Certification. ENTERPRISE NETWORKS: • Next Generation Networking • Nexus (7K, 5K, 3K & 2K) • Virtual Networking (Nexus 1000v) • OTV, LISP, Fabric Path • Layer 2 Extension • DR/BC Networking. SECURITY: • BYOD (Bring Your Own Device) & Secure Mobility • Jukebox • ISE & RSA • ASA 1000v • VSG (Virtual Security Gateway) • Cyber Security Solutions. COLLABORATION: • Unified Communications • Tandberg Video • VXI (View & XenDesktop) • WebEx, Call Center & Collaboration Solutions • Phones, Backpacks & Soft Phone Clients • Telepresence & Business Video. DATA CENTER: • Vblock, FlexPod & CloudSystem Matrix • EMC & NetApp Storage • vSphere / XenServer • vCloud Director • VDI (View / XenDesktop) • Cisco CIAC & BMC CLM • EMC’s UIM & Cloupia • FAST MDC (Mobile Data Center) Solutions
  • 27. ATC Big Data Functions: Overview. Three functions of the ATC have been identified, which will support Sales (and other) processes. Proof of Concept. Description: test customer solutions prior to full onsite implementation, e.g. − run Use Case analytical models and architectures on Big Data machines − create a Big Data hardware/software stack, potentially with client data. Usage: mid-term project basis, to provide an environment for the customer, based on a running engagement. Technology Comparison. Description: compare Big Data solutions to provide insight into the strengths and weaknesses of each; run “bake-offs” to gauge how well a full solution can be solved using certain components. Usage: to test generic POCs, may be customer-driven; inform the Big Data Team on best solutions. Field Demo. Description: showcase Big Data capabilities by hosting demos of WWT PoCs and analysis − run Use Case analytical models and architectures on Big Data machines. Usage: tool for sales calls and EBCs.
  • 28. Big Data Environment Set-up: ATC Reference Architectures. Four analytics-ready infrastructure stacks have been developed in the ATC to showcase Big Data technologies. Reference Architecture 1: HP Internal Local Storage. Reference Architecture 2: UCS – NetApp Direct Attached Storage. Reference Architecture 3: UCS – Isilon Network Storage. Reference Architecture 4: SAP HANA. [Diagram: the four stacks laid out by layer (Data, Storage, Network, Compute, Ingest, File System/Databases, Analytics Databases, Analytics Tools), with stacks marked “Current” or “In Process”. Data: Enterprise Structured, Enterprise Unstructured, 3rd Party, Web/Unstructured (ODS, Data Warehouse, Call Center, Server Logs, Financial, Demographic). Components named in the diagram: HP DL 380, UCS-C220M3, UCS-C240, UCS B BLADES, UCS 6296UP, UCS 6296, NEXUS 2232PP, NEXUS 2200, JBOD SATA, NETAPP E5460, ISILON, HITACHI, CLOUDERA, MAPR, HORTON, PIVOTALHD, HBASE, IMPALA, HAWQ, GEMFIRE, SAP HANA, MICROSTRATEGY, R, PYTHON, JAVA, SPLUNK, VELOCIDATA.]
  • 29. First Step: Big Data Workshop
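
The cost argument on slide 17 (Data Warehouse Augmentation) can be made concrete with a little arithmetic. The sketch below is only an illustration: the per-TB costs ($17K for Teradata, $2K for Hadoop) and the roughly 50% cold-data share are the figures quoted on the slide, while the 100 TB warehouse size and the function name `augmentation_savings` are assumptions chosen for the example, not part of the deck.

```python
# Figures quoted on slide 17.
TERADATA_COST_PER_TB = 17_000  # USD per TB
HADOOP_COST_PER_TB = 2_000     # USD per TB
COLD_DATA_SHARE = 0.50         # "about 50%" of warehouse data is rarely accessed


def augmentation_savings(warehouse_tb, cold_share=COLD_DATA_SHARE):
    """Compare storage cost with all data on the DW vs. cold data moved to Hadoop.

    warehouse_tb is a hypothetical input; the deck does not state a warehouse size.
    """
    cold_tb = warehouse_tb * cold_share
    hot_warm_tb = warehouse_tb - cold_tb

    current_cost = warehouse_tb * TERADATA_COST_PER_TB
    proposed_cost = hot_warm_tb * TERADATA_COST_PER_TB + cold_tb * HADOOP_COST_PER_TB
    savings = current_cost - proposed_cost

    return {
        "current_cost": current_cost,
        "proposed_cost": proposed_cost,
        "savings": savings,
        "savings_pct": 100 * savings / current_cost,
    }


if __name__ == "__main__":
    # Hypothetical 100 TB warehouse.
    for name, value in augmentation_savings(100).items():
        print(f"{name}: {value:,.1f}")
```

Under these assumptions, moving the cold half of the data lowers the storage bill by a little under half, because the remaining hot and warm data still sits on the more expensive platform. The sketch ignores migration, licensing and operational costs, which the slide does not quantify.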

Editor's Notes

  1. “Transformational benefits,” however, will be delivered to very few enterprises, according to another Gartner prediction from December 2011: “Through 2015, more than 85 percent of Fortune 500 organizations will fail to effectively exploit big data for competitive advantage.”
  2. A key point about Big Data analytics is that it does not replace the BI tools or EDWs in use today. It is imperative that organizations include their historical structured data in their analysis; it is the ability to analyze across multiple data sets that delivers new understanding. A big driver of Big Data is the shift from the structured data we have historically analyzed to the data that is emerging today, which is often real-time data generated by sensors or click streams. Semi-structured data can often be transformed into structured data if manipulated in the correct manner; examples include log files and emails. Unstructured data is one of the fastest-growing data types and often the richest in information; examples include video, tweets and other social media, and machine-generated data. Because of the volume and complexity of the data, the preferred approach for processing big data is clustered computing environments and Massively Parallel Processing (MPP), which enable simultaneous, parallel data ingest, loading and analysis.
  3. Architecturally independent; multivendor. Tony Berg will cover this in more depth.
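
Note 2 says that semi-structured data such as log files can often be transformed into structured data. As a minimal sketch of what that transformation can look like, the Python below turns one log line into a structured record. The log layout (ISO timestamp, source host, level, free-text message) and the names `LOG_PATTERN` and `parse_log_line` are assumptions made for illustration; real log formats vary and would need their own pattern.

```python
import re
from datetime import datetime

# Hypothetical layout: "<ISO timestamp> <source> <LEVEL> <free-text message>"
LOG_PATTERN = re.compile(
    r"(?P<timestamp>\S+)\s+(?P<source>\S+)\s+(?P<level>[A-Z]+)\s+(?P<message>.*)"
)


def parse_log_line(line):
    """Return a dict of typed fields for one log line, or None if it does not fit."""
    match = LOG_PATTERN.fullmatch(line.strip())
    if match is None:
        return None
    record = match.groupdict()
    try:
        # Convert the timestamp text into a real datetime so it can be sorted and joined.
        record["timestamp"] = datetime.fromisoformat(record["timestamp"])
    except ValueError:
        return None
    return record


if __name__ == "__main__":
    sample = "2014-06-05T12:30:00 web01 ERROR disk quota exceeded"
    print(parse_log_line(sample))
```

Once each line is a record with named, typed fields, it can be loaded into a table and analyzed alongside the historical structured data the note refers to. Lines that do not fit the assumed layout are returned as `None` rather than guessed at.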