Analytics and Big Data Analytics
Robin Bloor, PhD
View Dr. Robin Bloor's presentation from the Dec. 2013 Big Data Conference in Rome.

Published in: Technology, Education

Transcript of "Analytics and Big Data Analytics"

  1. Analytics and Big Data Analytics, Robin Bloor, PhD
  2. The Sequence of Topics
     1 Data Science?
     2 The Nature of Analytics
     3 Machine Learning Et Al
     4 The Business Perspective
     5 The Future
  3. 1 Data Science?
  4. What Is Data Science?
     • There is no “data science.” It’s a misnomer
     • All science is empirical and involves data analysis.
     • Science implements a method.
     • So do statisticians
  5. What Is A Data Scientist?
     • Project manager
     • Qualified statistician
     • Domain Business expert
     • Experienced data architect
     • Software engineer
     (It’s a team)
  6. Data Scientist v Business Analysts
     • Claims that business analysts can be data scientists are dubious
     • Good practitioners of statistics understand data (from years of training)
     • Software understands nothing, it simply implements algorithms
  7. Who Understands Data?
  8. Nevertheless! You can know more about a business from its data than by any other means
  9. 2 The Nature Of Analytics
  10. The Field of Business Intelligence
      • Hindsight: Regular reporting/operational BI
      • Oversight: Dashboards, OLAP, BPM, etc.
      • Insight: Data mining, statistical analysis
      • Foresight: Predictive analytics
  11. The Driving Force is Insight
  12. A Process Not An Activity
      • Data Analytics is a multidisciplinary end-to-end process
      • Until recently it was a walled garden. But recently the walls were torn down by:
        • Data availability
        • Scalable technology
        • Open source tools
  13. The Data Analytics Process - Detail
  14. The CRITICAL Workload Issue
      • Previously, we viewed database workloads as an I/O optimization problem
      • With analytics the workload is a very variable mix of I/O and calculation
      • No databases were built for this – not even Big Data databases
  15. 3 Machine Learning Et Al
  16. Analytical Latencies
      1 Data access
      2 Data preparation
      3 Model development
      4 Execution
      5 Implementation
      6 Model Audit & Update
      Speed = value (probably)
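One way to make the six analytical latencies concrete is to time each stage of a pipeline separately and see where the elapsed time goes. The sketch below is only an illustration: the stage names come from the slide, but `measure_latencies` is a hypothetical helper and the `time.sleep` calls are placeholders standing in for real work.

```python
import time

def measure_latencies(stages):
    """Run each (name, callable) pair in order and return {name: elapsed seconds}."""
    timings = {}
    for name, step in stages:
        start = time.perf_counter()
        step()
        timings[name] = time.perf_counter() - start
    return timings

# Placeholder workloads (assumptions): each sleep stands in for a real stage.
stages = [
    ("Data access", lambda: time.sleep(0.005)),
    ("Data preparation", lambda: time.sleep(0.05)),
    ("Model development", lambda: time.sleep(0.005)),
    ("Execution", lambda: time.sleep(0.005)),
    ("Implementation", lambda: time.sleep(0.005)),
    ("Model audit & update", lambda: time.sleep(0.005)),
]

timings = measure_latencies(stages)
total = sum(timings.values())
for name, seconds in timings.items():
    # Show each stage's latency and its share of the end-to-end time.
    print(f"{name:<22}{seconds:8.3f}s {seconds / total:6.1%}")
```

Reporting the share per stage, rather than only the total, shows which latency is worth attacking first.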
  17. The Open Source Dynamic
      • The R Language
        • Over 1 million users
      • Hadoop and its Ecosystem
        • Reduced latency for analytics
      • Machine Learning Algorithms
        • Raw power
      None of these are engineered for performance
  18. Machine Learning Algorithms - 1
      • There are many:
        • Neural network(s)
        • Bayesian networks
        • Decision trees/random forests
        • Support vector machines
        • K-means
        • Clustering
        • Regression(s)
        • Etc.
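To give a feel for one of the algorithms listed above, here is a minimal sketch of k-means (Lloyd's algorithm) in plain Python. It is not taken from the presentation: the function names, the toy data and the fixed random seed are all assumptions chosen for illustration, and a production system would normally use an optimized library instead.

```python
import random

def sq_dist(p, q):
    """Squared Euclidean distance between two equal-length tuples."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def assign(points, centroids):
    """Group each point with its nearest centroid."""
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)), key=lambda i: sq_dist(p, centroids[i]))
        clusters[nearest].append(p)
    return clusters

def kmeans(points, k, max_iterations=100, seed=0):
    """Alternate assignment and centroid update until the centroids stop moving."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    for _ in range(max_iterations):
        clusters = assign(points, centroids)
        updated = [
            tuple(sum(axis) / len(members) for axis in zip(*members))
            if members else centroids[i]  # keep an empty cluster's centroid in place
            for i, members in enumerate(clusters)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids, assign(points, centroids)

# Toy data (an assumption): two well-separated groups of three points each.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, clusters = kmeans(points, k=2)
print(sorted(centroids))
```

Every iteration touches every point, which is one reason the slides tie these algorithms to the availability of computing power.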
  19. Machine Learning Algorithms - 2
      • They are not newly invented
      • We did not previously use them much because we never had the computer power
      • Now that we have the power (at a price) we can employ them
  20. Machine Learning Algorithms - 3
      • Machine learning algorithms can check all possibilities
      • We never had the computer power
      • Now that we have the power (at a price) we can employ them
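The slide does not say what "check all possibilities" covers. One reading, offered here only as an illustration, is exhaustive search: the sketch below fits a one-split decision tree (a "stump") by trying every feature and every observed threshold and keeping the split with the fewest errors. The function name `best_stump` and the toy data are assumptions.

```python
def best_stump(rows, labels):
    """Try every (feature, threshold, direction) and return the split with fewest errors.

    Returns a tuple (errors, feature_index, threshold, inverted), where the rule
    predicts 1 when row[feature_index] > threshold (or 0 when inverted is True).
    """
    best = None
    for f in range(len(rows[0])):
        for threshold in sorted({r[f] for r in rows}):
            # Errors of the rule "predict 1 when the value exceeds the threshold".
            errors = sum((r[f] > threshold) != bool(y) for r, y in zip(rows, labels))
            # The inverted rule is wrong exactly where the original rule is right.
            for inverted, e in ((False, errors), (True, len(rows) - errors)):
                if best is None or e < best[0]:
                    best = (e, f, threshold, inverted)
    return best

# Toy data (an assumption): feature 0 separates the classes, feature 1 does not.
rows = [(1, 5), (2, 1), (3, 4), (4, 2)]
labels = [0, 0, 1, 1]
result = best_stump(rows, labels)
print(result)
```

The search visits every feature and threshold and rescans all rows for each one, so the cost grows quickly with data size, which fits the slide's point that the limiting factor was computer power.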
  21. The Impact?
      • Machine learning and processing power (parallelism) will change the data analysis process
      • The analytics team needs to understand IT
  22. 4 The Business Perspective
  23. Business Metamorphosis
      • The role of data analysis has not changed
      • Only the speed has changed
      • The process will evolve
      • It will be disruptive for incumbent vendors
  24. The Data Analysis Budget
      • Data Analysis is Business R&D
      • The focus is on business process
      • The outcome of successful R&D is a changed process
      • Think of manufacturing for a useful analogy
  25. The Data Analysis Budget
      • Data Analysis is Business R&D
      • The focus is on business process
      • The outcome of successful R&D is a changed process
      • Think of manufacturing for a useful analogy
  26. 5 The Future
  27. It isn't over until the fat lady sings
      • Hardware disruption
      • Software disruption
      • Business process disruption
      • All we know is:
        • Analytical processing will get faster
        • Analytic latencies will reduce
        • Data will continue to grow
        • Analytics will be a differentiator
  28. In Summary…
      1 Data Science?
      2 The Nature of Analytics
      3 Machine Learning Et Al
      4 The Business Perspective
      5 The Future