Keywords (slide word cloud): Spark, SPSS Modeler, Statistics, scoring, IBM, MLlib, R, variance, decision tree, algorithm, regression, distribution, propensity, accuracy, binomial, variable, stratified sample, Analytic Server, Hadoop, Map/Reduce, Gini, Weibull, PCA, gamma, Monte Carlo, decision management, neural network, type I error, cluster, K-means, SQL, learning, machine learning
Using IBM Analytics to learn to make better decisions
What Data Science is, and what it is not
CRISP-DM
Phases: Business Understanding, Data Understanding, Data Preparation, Modeling, Evaluation, Deployment
Cross Industry Standard Process for Data Mining, commonly known by its acronym CRISP-DM, is a data mining process model that describes the approaches data mining experts commonly use to tackle problems.
In other words, it is common practice (and common sense) put in a diagram.
But is it as simple as it seems?
Can we just walk in and start mining?
How can I be sure I have the right tool?
Simple case: One variable. Straightforward?
Source https://clevertap.com/blog/the-fallacy-of-seeing-patterns/
From the shape of the histogram, the distribution seems to be left-skewed, but does that tell the whole story?
The data is shown in 5 intervals between 35 and 85.
A little over 45% of the observations fall in the interval from 65 to 75.
What if we change the number of intervals from the current 5 to something higher that could spread the data better among the intervals?
[Figure: Histogram of X. Horizontal axis: X in intervals labeled 35, 45, 55, 65, 75, More. Vertical axis: Frequency, 0 to 160.]
Not that much
Source https://clevertap.com/blog/the-fallacy-of-seeing-patterns/
[Figure: Histogram of X with narrower intervals. Vertical axis: Frequency, 0 to 60.]
In this histogram, each interval is approximately of size 3.
The shape of the distribution now seems to change: the original inference of a left-skewed distribution is replaced by a shape that has 2 peaks.
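The effect of the number of intervals can be sketched in a few lines of Python. This is only an illustration: the sample is synthetic (a mixture of two normal clusters chosen for this sketch, not the data behind the slides), and the names `histogram` and `draw` and the seed are hypothetical. The range 35 to 85 and the two interval counts (5, and 17 for a width of about 3) follow the slides.

```python
import random

def histogram(values, lo, hi, bins):
    """Count values in `bins` equal-width intervals covering [lo, hi)."""
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        idx = int((v - lo) // width)
        # Values outside the range are counted in the nearest edge interval.
        idx = min(max(idx, 0), bins - 1)
        counts[idx] += 1
    return counts

def draw(counts, lo, hi, scale):
    """Print one text bar per interval; each '#' stands for `scale` observations."""
    width = (hi - lo) / len(counts)
    for i, c in enumerate(counts):
        start = lo + i * width
        print(f"{start:5.1f} to {start + width:5.1f} | {'#' * (c // scale)} {c}")

# Synthetic sample (an assumption for this sketch): two clusters of values.
rng = random.Random(42)
data = [rng.gauss(58, 3) for _ in range(160)] + [rng.gauss(72, 3) for _ in range(240)]

coarse = histogram(data, 35, 85, 5)   # 5 intervals of width 10
fine = histogram(data, 35, 85, 17)    # 17 intervals of width about 2.9

draw(coarse, 35, 85, scale=4)
print()
draw(fine, 35, 85, scale=2)
```

With a mixture like this one, the 5-interval view tends to look like a single hump with a longer left tail, while the 17-interval view can reveal two separate peaks. The exact counts depend on the sample.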
These series are correlated. Does it mean one causes the other?
Year                                  2000   2001   2002   2003   2004   2005   2006   2007   2008   2009
Divorce rate in Maine                 5      4.7    4.6    4.4    4.3    4.1    4.2    4.2    4.2    4.1
Per capita consumption of margarine   8.2    7      6.5    5.3    5.2    4      4.6    4.5    4.2    3.7
Correlation: 0.992558

Year                                  2000   2001   2002   2003   2004   2005   2006   2007   2008   2009
Per capita consumption of chicken     54.2   54     56.8   57.5   59.3   60.5   60.9   59.9   58.7   56
Total US crude oil imports            3.311  3.405  3.336  3.521  3.674  3.67   3.685  3.656  3.571  3.307
Correlation: 0.899899

Year                                           2000   2001   2002   2003   2004   2005   2006   2007   2008   2009
Number of people who died by becoming
tangled in their bedsheets                     327    456    509    497    596    573    661    741    809    717
Total revenue generated by skiing facilities   1.551  1.635  1.801  1.827  1.956  1.989  2.178  2.257  2.476  2.438
Correlation: 0.969724
Source: http://tylervigen.com/spurious-correlations
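The correlation figures above can be checked with a short Python sketch. It assumes the figure is the Pearson correlation coefficient (the slides do not name the measure); the function name `pearson` and the variable names are hypothetical, and the values are copied from the divorce rate and margarine rows above.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Values for 2000 to 2009, copied from the slide.
divorce_rate_maine = [5, 4.7, 4.6, 4.4, 4.3, 4.1, 4.2, 4.2, 4.2, 4.1]
margarine_per_capita = [8.2, 7, 6.5, 5.3, 5.2, 4, 4.6, 4.5, 4.2, 3.7]

r = pearson(divorce_rate_maine, margarine_per_capita)
print(f"Correlation: {r:.6f}")  # prints "Correlation: 0.992558"
```

The result matches the value on the slide, which says only that the two series move together over these ten years, not that one drives the other.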
Correlation does not mean causation
The assumption that correlation proves causation is considered a questionable-cause logical fallacy, in which two events occurring together are taken to have established a cause-and-effect relationship.
This fallacy is also known as cum hoc ergo propter hoc, Latin for "with this, therefore because of this," and as "false cause."
A similar fallacy, that an event that followed another was necessarily a consequence of the first event, is the post hoc ergo propter hoc fallacy, Latin for "after this, therefore because of this."
Source: http://tylervigen.com/spurious-correlations & Wikipedia
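One common way a strong correlation can appear without any causal link is a shared driver. The following is a minimal sketch under stated assumptions: `z` is a hypothetical hidden variable, `a` and `b` are synthetic series that each depend on `z` but never on each other, and the coefficients, seed, and names are invented for illustration.

```python
import math
import random

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

rng = random.Random(0)
n = 2000

# Hidden common cause (hypothetical). a and b are built from z plus
# independent noise, so neither one influences the other.
z = [rng.gauss(0, 1) for _ in range(n)]
a = [2 * zi + rng.gauss(0, 1) for zi in z]
b = [-3 * zi + rng.gauss(0, 1) for zi in z]

r_raw = pearson(a, b)

# Subtract the known contribution of z from both series and correlate the rest.
a_rest = [ai - 2 * zi for ai, zi in zip(a, z)]
b_rest = [bi + 3 * zi for bi, zi in zip(b, z)]
r_rest = pearson(a_rest, b_rest)

print(f"a vs b: {r_raw:.3f}")
print(f"a vs b with z removed: {r_rest:.3f}")
```

In this setup the raw correlation is strongly negative, and it shrinks to roughly zero once the shared driver is removed. In real data the driver is usually unknown, which is why a correlation alone cannot establish cause and effect.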
Need training (who doesn’t)? No problem.
Specifics on Data Science
Big Data with Hortonworks
Try Watson Analytics for free
Watson Analytics
Cognitive classes
Train yourself …
…for free …
… with a free and open environment available
I understand that IBM supports open source, but we need support and guarantees.
Is there any commercial software?
Data Science & Machine Learning
A product for each technique and user profile:
Data Science Experience
SPSS
Decision Optimization
Machine Learning
Watson Analytics
IBM is making data science and machine learning simple and open.
Data Science Experience
IBM SPSS Product portfolio
IBM SPSS Modeler Gold
IBM SPSS Modeler Professional
IBM SPSS Analytic Server
IBM SPSS Statistics
IBM SPSS C&DS
IBM SPSS Decision Management
IBM SPSS Modeler Premium
Doubts, concerns, questions, suggestions?
Let me know:
ramiro.rego@es.ibm.com
Thanks!

VII Jornadas eMadrid "Education in exponential times". "Using IBM Analytics to learn to make better decisions" (original title: "Uso de IBM Analytics para aprender a tomar mejores decisiones"). Ramiro Regó Álvarez. 05/07/2017.

  • 1. SparkSPSS ModelerStatisticsscoring IBMMlib Rvariance decission treealgorithm regressiondistributionpropensity accuracybinomial variable Stratified sample Analytic Server Hadoop Map/ReduceGini Weibull PCA Spark SPSS Modeler Statistics scoring IBMMlib R variance decission tree algorithm regression distribution propensity accuracy binomial Stratified sample Analytic Server Hadoop Map/Reduce Gini Weibull PCA gamma Montecarlo decision management neural network type I error cluster K-means SQL learning machine learning Using IBM Analytics to help learn taking better decisions
  • 2. SparkSPSS ModelerStatisticsscoring IBMMlib Rvariance decission treealgorithm regressiondistributionpropensity accuracybinomial variable Stratified sample Analytic Server Hadoop Map/ReduceGini Weibull PCA Spark SPSS Modeler Statistics scoring IBMMlib Rvariance decission tree algorithm regressiondistribution propensity accuracy binomialStratified sample Analytic Server Hadoop Map/Reduce Gini Weibull PCAgamma What is Data Science and what is not
  • 3. SparkSPSS ModelerStatisticsscoring IBMMlib Rvariance decission treealgorithm regressiondistributionpropensity accuracybinomial variable Stratified sample Analytic Server Hadoop Map/ReduceGini Weibull PCA CRISP-DM Business Understanding Data Understanding Data Preparation Modeling Evaluation Deployment Cross Industry Standard Process for Data Mining, commonly known by its acronym CRISP-DM, is a data mining process model that describes commonly used approaches that data mining experts use to tackle problems. In other words is common practice (and common sense) put in a diagram. But is it as simple as it seems? Can we just walk in and start mining? How I am sure I got the right tool ?
  • 4. SparkSPSS ModelerStatisticsscoring IBMMlib Rvariance decission treealgorithm regressiondistributionpropensity accuracybinomial variable Stratified sample Analytic Server Hadoop Map/ReduceGini Weibull PCA Simple case: One variable. Straightforward? Source https://clevertap.com/blog/the-fallacy-of-seeing-patterns/ From the shape of the histogram, it seems the distribution is left-skewed, but does it picture the entire story? The data is represented on 5 intervals between 35 and 85. A little over 45% of the observations are in the interval – 65 to 75. What if we change the number of intervals from the current 5 to something higher that could give a better distribution of data among the intervals? 0 20 40 60 80 100 120 140 160 35 45 55 65 75 More FREQUENCY X Histogram Frequency
  • 5. Not that much. Source: https://clevertap.com/blog/the-fallacy-of-seeing-patterns/ [Figure: histogram of X; frequency axis from 0 to 60] In this histogram, each interval has a width of approximately 3. The shape of the distribution now appears to change: the original inference of a left-skewed distribution is replaced by a shape with 2 peaks.
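The binning effect described on slides 4 and 5 can be reproduced with a minimal sketch. The sample below is synthetic and hypothetical (the slides do not include the underlying data), and `histogram` and `count_peaks` are illustrative helper names, not part of any library. It only shows that the same observations can look single-peaked with 5 intervals and two-peaked with intervals of width roughly 3.

```python
# Synthetic, hypothetical sample: value -> number of observations.
# This is NOT the data behind the slides; it only illustrates the effect.
FREQ = {36: 1, 39: 2, 42: 3, 45: 4, 48: 6, 51: 9, 54: 13, 57: 18, 60: 14,
        63: 9, 66: 12, 69: 20, 72: 30, 75: 18, 78: 8, 81: 3, 84: 1}
SAMPLE = [x for x, n in FREQ.items() for _ in range(n)]


def histogram(values, n_bins, lo, hi):
    """Count integer values into n_bins equal-width bins over [lo, hi]."""
    counts = [0] * n_bins
    for x in values:
        # Integer arithmetic avoids floating-point trouble at bin edges.
        idx = (x - lo) * n_bins // (hi - lo)
        counts[min(idx, n_bins - 1)] += 1  # a value equal to hi goes in the last bin
    return counts


def count_peaks(counts):
    """Number of bins strictly higher than both neighbouring bins."""
    padded = [-1] + counts + [-1]
    return sum(1 for i in range(1, len(padded) - 1)
               if padded[i - 1] < padded[i] > padded[i + 1])


coarse = histogram(SAMPLE, 5, 35, 85)   # 5 intervals of width 10
fine = histogram(SAMPLE, 17, 35, 85)    # 17 intervals of width about 3

print("5 bins: ", coarse, "peaks:", count_peaks(coarse))
print("17 bins:", fine, "peaks:", count_peaks(fine))
```

With 5 bins the counts climb to a single peak in the 65 to 75 interval; with 17 bins the same sample shows two separate peaks, so the apparent shape depends on the bin width chosen.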
  • 6. This is correlated; does it mean one causes the other?
      Year                                   2000   2001   2002   2003   2004   2005   2006   2007   2008   2009
      Divorce rate in Maine                  5      4.7    4.6    4.4    4.3    4.1    4.2    4.2    4.2    4.1
      Per capita consumption of margarine    8.2    7      6.5    5.3    5.2    4      4.6    4.5    4.2    3.7
      Correlation: 0.992558

      Year                                   2000   2001   2002   2003   2004   2005   2006   2007   2008   2009
      Per capita consumption of chicken      54.2   54     56.8   57.5   59.3   60.5   60.9   59.9   58.7   56
      Total US crude oil imports             3.311  3.405  3.336  3.521  3.674  3.67   3.685  3.656  3.571  3.307
      Correlation: 0.899899

      Year                                                                2000   2001   2002   2003   2004   2005   2006   2007   2008   2009
      Number of people who died by becoming tangled in their bedsheets    327    456    509    497    596    573    661    741    809    717
      Total revenue generated by skiing facilities                        1.551  1.635  1.801  1.827  1.956  1.989  2.178  2.257  2.476  2.438
      Correlation: 0.969724

      Source: http://tylervigen.com/spurious-correlations
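As a rough check, the first correlation on slide 6 can be recomputed from the figures shown there. This is a minimal sketch: `pearson` is an illustrative helper written for this example, and the slide's decimal commas are read as decimal points (an assumption about the slide's locale).

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Values for 2000 to 2009, copied from slide 6 (decimal commas read as points).
divorce_rate_maine = [5.0, 4.7, 4.6, 4.4, 4.3, 4.1, 4.2, 4.2, 4.2, 4.1]
margarine_per_capita = [8.2, 7.0, 6.5, 5.3, 5.2, 4.0, 4.6, 4.5, 4.2, 3.7]

r = pearson(divorce_rate_maine, margarine_per_capita)
print(f"correlation = {r:.6f}")
```

The result should agree with the slide's figure of about 0.9926. A coefficient this high says only that the two series moved together over these ten years; it gives no evidence that one causes the other.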
  • 7. Correlation does not mean causation. The claim that correlation proves causation is a questionable-cause logical fallacy: two events occurring together are taken to have established a cause-and-effect relationship. This fallacy is also known as cum hoc ergo propter hoc, Latin for "with this, therefore because of this," and as "false cause." A similar fallacy, that an event that followed another was necessarily a consequence of the first event, is post hoc ergo propter hoc, Latin for "after this, therefore because of this." Source: http://tylervigen.com/spurious-correlations & Wikipedia
  • 8. Need training (who doesn’t)? No problem.
  • 9. Specifics on Data Science
  • 10. Big data with Hortonworks
  • 11. Try Watson Analytics for free
  • 12. Watson Analytics
  • 13. Cognitive classes
  • 14. Train yourself …
  • 15. …for free …
  • 16. … with a free and open environment available
  • 17. I understand IBM is committed to open source, but we need support and guarantees. Any commercial software?
  • 18. Data Science & Machine Learning. A product for each technique / user profile: Data Science Experience, SPSS, Decision Optimization, Machine Learning, Watson Analytics. IBM is making data science and machine learning simple and open.
  • 19. Data Science Experience
  • 20. IBM SPSS product portfolio: IBM SPSS Modeler Gold, IBM SPSS Modeler Professional, IBM SPSS Analytic Server, IBM SPSS Statistics, IBM SPSS C&DS, IBM SPSS Decision Management, IBM SPSS Modeler Premium
  • 21. Doubts, concerns, questions, suggestions? Let me know: ramiro.rego@es.ibm.com Thanks!