Pivotal data science_data_engineering_secret_weapons_of_the_strategic_enterprise
- 1. 1© Copyright 2014 EMC Corporation. All rights reserved.
Data Science + Data Engineering
Secret Weapon of the Strategic Enterprise
Annika Jimenez
- 2.
Agenda
Data Science: What is it and why do we do it?
The Importance of Data Engineering
An Example: Kaiser Code-a-thon
Transforming Your Enterprise with Pivotal
Pivotal Data Labs: Data Engineering + Data Science
Closing Advice
- 3.
What Matters: Apps. Data. Analytics.
Apps power business, and those apps generate data
Analytic insights from that data drive new app functionality, which in turn drives new data
The faster you can move around the cycle, the faster you learn, innovate and pull away from the competition
- 4.
Primary Motions for Pivotal
Agile: data-driven apps and rapid time to value
Data Lake: store everything, analyze anything
Enterprise PaaS: revolutionary software development and speed; build the right thing
- 5.
DATA SCIENCE
The use of statistical and machine learning techniques on big, multi-structured data – in a distributed computing environment – to identify correlations and causal relationships, classify and predict events, identify patterns and anomalies, and infer probabilities, interest, and sentiment.
- 6.
But why do we use Data Science?
- 8.
Is the Goal Any of These Things?
A. Cool Visualizations
B. Custom Querying
C. Decision Enablement
D. Insights
E. All of the above…
NO
- 9.
DRIVE AUTOMATED
LOW LATENCY ACTIONS
IN RESPONSE TO
EVENTS OF INTEREST
- 10.
YOUR DATA + DATA SCIENCE = MODELS
- 11.
Model Operationalization (“O16N”)
Drive Automated Low Latency Actions
Production Data Feeds → Low Latency Model Scoring → API Availability or Push to Apps → Business Logic → Application Response → New Events (aka, Data)
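The operationalization loop on this slide can be sketched in a few lines of Python. This is a toy illustration only, not Pivotal's implementation: the `Event` type, the `WEIGHTS` table, the `score` and `decide` functions, and the 0.5 threshold are all hypothetical names and values chosen for the example. A real deployment would load a trained model and read from a production feed.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List


@dataclass
class Event:
    """One record from a production data feed (hypothetical shape)."""
    event_id: str
    features: Dict[str, float]


# Hypothetical stand-in for a trained model: a fixed linear weighting.
WEIGHTS = {"signal_a": 0.6, "signal_b": 0.4}


def score(event: Event) -> float:
    """Low-latency model scoring: weighted sum clamped to [0, 1]."""
    raw = sum(WEIGHTS.get(name, 0.0) * value
              for name, value in event.features.items())
    return max(0.0, min(1.0, raw))


def decide(risk: float, threshold: float = 0.5) -> str:
    """Business logic: turn a score into an automated action."""
    return "act" if risk >= threshold else "ignore"


def run_loop(feed: Iterable[Event]) -> List[dict]:
    """Score each event, decide on an action, and record the response.

    Each response is itself a new event (aka, data) that could be
    captured and fed back into model development.
    """
    responses = []
    for event in feed:
        risk = score(event)
        responses.append({
            "event_id": event.event_id,
            "score": risk,
            "action": decide(risk),
        })
    return responses


if __name__ == "__main__":
    feed = [
        Event("e1", {"signal_a": 1.0, "signal_b": 0.5}),
        Event("e2", {"signal_a": 0.1}),
    ]
    for response in run_loop(feed):
        print(response)
```

The point of the sketch is the shape of the loop: scoring and the business rule sit between the data feed and the application response, so no person has to read a report before an action is taken.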
- 12.
Data Science Value Chain
Stages: Instrumentation, Logs, Capture, Store, Transform & Prepare, Access, Model Dev., Deploy, Apps, Process Change
Roles: Product Engineer, Data Engineer, DBA, Data Engineer, Data Engineer, Data Scientist, Data Engineer, Application Developer, PMO
- 14.
Code-a-Thon Details – Logistics
24-Hour Data Science Code-a-Thon
5 resources per vendor
Vendors were asked to be prepared for any use in the domain
A 15-minute presentation to senior leaders, executives, doctors and pharmacists
Teams were required to use Tableau in their presentation
- 15.
Code-a-Thon – Pivotal Team
Hulya Emir-Farinas, Data Scientist
Noah Zimmerman, Data Scientist
Jacque Istok, Application Developer
Dillon Woods, Application Developer
Randy Williard, Big Data Engineer
Jemish Patel, Big Data Engineer
Adam Shook, Big Data Engineer
Roy Mims, Coordinator
- 18.
Asthma Population Management Application
- 19.
Asthma Management Application
- 20.
What Did We Learn in 2013?
Pivotal has a world-class Data Science team, the best there is
Data Science alone is good, but Data Science + Expert Data Engineering and Architecture is great
Corollary: Data Science + Data Engineering + Apps trumps everything
– This is the path to rapid value creation and ROI
- 21.
Data Science + Data Engineering + Pivotal Labs = The Magic in the Middle
- 22.
What Is Pivotal Data Labs?
Data Science + Data Engineering
- 23.
What Is Pivotal Data Labs?
Data Science + Data Engineering
Pivotal Data Scientists are technical professionals with strong programming skills, anchored in vertical/horizontal domains or in specialized academic research, able to identify real-world problems requiring predictive analytics, formulate these mathematically, and solve them by applying machine learning and statistical algorithms, on Big Data, in Pivotal and third-party technologies.
Pivotal Data Engineers are Big Data experts and industry veterans with a passion to leverage these skills to drive business value for Pivotal customers. They possess expert knowledge and skills with the Pivotal data products and excel at architecting enterprise-scale solutions to the most demanding data and analytic challenges.
- 24.
What is a “Data Scientist”?
Programming Skills
Mathematical/Statistical Skills
- 25.
What is a “Data Engineer”?
Programming Skills
Architectural Skills
- 26.
World’s Leading Experts
Pivotal Labs – Pivotal Data Labs
Latency tiers: BATCH, NEAR TIME, REAL TIME
Products: HAWQ, Greenplum DB, Pivotal HD, GemFire XD, GemFire
- 27.
Pivotal Data Labs in Data Fabric
Building Towards Pivotal One
Pivotal One SOLUTIONS
Pivotal One SERVICES
Pivotal MySQL
Elastic Runtime Services: Java, Spring, Ruby, Node.JS
Value-adds: Installation, Management & Monitoring
(Core OSS)
• Data Lake Solutions (Security Analytics, Corp Comm Analytics, Business)
• RTI for Telco (RTI4T)
Pivotal GemFire XD
Pivotal Data Dispatch
Coming in 2014
Pivotal HD: Hadoop + Query
GPDB: MPP Analytics
GemFire: In-Memory Grid
Spring: App Framework
RabbitMQ, Redis…
Pivotal One MARKETPLACE
- 28.
Introducing Pivotal Data Labs
Our Charter:
Pivotal Data Labs is Pivotal’s differentiated and highly opinionated data-centric service delivery organization.
Our Goals:
Expedite customer time-to-value and ROI by driving business-aligned innovation and solutions assurance within Pivotal’s Data Fabric technologies.
Drive customer adoption and autonomy across the full spectrum of Pivotal Data technologies through best-in-class data science and data engineering services, with a deep emphasis on knowledge transfer.
- 29.
Highly-Opinionated & Differentiated
“Highly-Opinionated” – Highly prescriptive in our counsel of data best practices to customers and partners, drawing from best-in-class talent and deep experience operating on the Pivotal Data Fabric
“Differentiated” – An expert Data services business that is unique in its class and unlike the Data services available elsewhere
- 30.
What Will PDL Deliver For Customers?
Accelerated time-to-value and real ROI for customers
- 31.
How Do We Do This?
Best-in-class Data Science to drive value creation on Pivotal stack and customer data
Best-in-class Data Engineering to drive pragmatic, well-designed, customized architecture for end-to-end Pivotal stack
Assured solutions success in Pivotal data service delivery
Operationally-optimized predictive models
Collaboration with Pivotal Labs to deliver data-driven applications
- 32.
Pivotal Onboarding Services
Getting Started with Pivotal Software
INSTALL: Our experts work with your staff to plan, install and fully configure the Pivotal software based on your environment and requirements.
VERIFY: We will verify the installation, making sure it’s fully operational in your environment and ready to give you the Pivotal advantage.
ENABLE: Lastly, we’ll train your people and conduct knowledge transfer to make sure you are comfortable using and supporting Pivotal software.
Engagement Management – site prep, project management, customer support overview
Hardware Validation / Installation
Software Installation / Verification
Training & Knowledge Transfer
- 33.
Pivotal Solution Assurance
Leverage Pivotal Data Experts for Success
INCEPTION: Getting off the ground with a well-formed plan, a solid team and realistic expectations is fundamental to overall success. Our experts help with design, guidance, oversight and lessons from other customers to get your initiative going in the best direction from the start.
ADVISE: Leveraging expert services in Pivotal Data Labs, Pivotal Labs, and Certified Pivotal Partners, we’ll work with your architects and developers to assure that your system design and development are aligned with Pivotal best practices.
SUPPORT: We’ll act as your conduit to Certified Professional Services, Customer Support and Pivotal R&D to make you successful, quickly determine answers and bring in specialized expertise where needed.
Engagement / Success Alignment – Regular meetings for status and guidance, Resource advice
Architecture Design
Implementation Design and Assistance
Resource Assistance, Expedited Response
- 34.
Data Science Labs
Value Creation With Predictive Insights
DISCOVERY: Getting the most from your data requires understanding what you have today and discovering what your data can do for you. Combining our data scientists with your data starts the path to value creation.
INSIGHTS: Once the data is understood, we set ourselves apart by making optimal use of Pivotal’s Data Fabric, our analytical tools and our data science experts to build models creating deep actionable insights on key events of interest in any use case.
RESULTS: Driving insights into actionable results is enabled through data scoring and model code optimization, documentation and knowledge transfer.
Engagement / Business / Technical Alignment – Regular meetings for status and validation
Data exploration, readiness assessment, and prep
Combining domain knowledge and data science for predictive modeling
Code documentation and knowledge sharing
- 35.
Data Lab
Deep Partnering to Maximize Value Creation
INCEPTION: Getting off the ground with a well-formed holistic architectural, data science, and application plan is fundamental to overall success. Our experts will drive this process, leveraging deep experience in these areas to streamline the path to success for your initiative.
IMPLEMENT: Delivered by a dedicated team drawing from Pivotal Data Labs’ experts in architecture, data engineering, data science, and application development, implementations of Pivotal technologies will be targeted to maximize value creation against your specific technical and business goals.
EXCEL: Years of experience building successful data projects give us the know-how to quickly and efficiently work through the challenging phases of any project, including design, scaling, integration and production readiness.
Engagement / Business / Technical Alignment – Regular meetings for status and guidance
Data platform architecting and strategic analytic use-case roadmaps
Data and Application Fabric deployments, Data Science modeling & App development
Knowledge sharing, training and expert assistance
- 36.
Pivotal Data Labs + Pivotal Labs = The Magic in the Middle
RAPID VALUE!
- 37.
My Advice to Enterprises
1. Know your data and its potential value
2. Get “vision” and question status quo
3. Understand the technical paradigm shift underway
4. Hire or grow your Data Dream Team: Data Scientists and Data Engineers
5. Clear the path to operationalization (aka, value)
6. Manage the disruption, don’t reject it