This presentation is based on ``Statistical Modeling: The Two Cultures'' by Leo Breiman. It compares the data modeling culture (statistics) and the algorithmic modeling culture (machine learning).
1. STATISTICAL MODELING: THE TWO CULTURES
Based on Leo Breiman's paper
Christoph Molnar
Department of Statistics, LMU Munich
2. 2014-01-20
Abstract:
This presentation compares two cultures of statistical modeling. The data modeling culture assumes a stochastic process that produced the data; it is associated with traditional statistics. The other culture, the algorithmic modeling culture, reduces modeling to the optimization of a loss function with an algorithm; it is associated with machine learning. Breiman argues that algorithmic modeling should be used more often in statistics.
3. OUTLINE
1. Statistics: Data Modeling Culture
2. Machine Learning: Algorithmic Modeling Culture
3. Statistical Learning Principles
4. Personal Experience
5. Summary
Content heavily based on ``Statistical Modeling: The Two Cultures'' by Leo Breiman [1]
4. This presentation is based on ``Statistical Modeling: The Two Cultures'' by Leo Breiman [1].
The first segment introduces the data modeling culture; analogously, the second segment explains the algorithmic modeling culture, together with the presentation of three algorithms. The part about statistical learning principles presents aspects that help to compare the two cultures. Personal experiences in both cultures are addressed, and the conclusion summarizes the message of the paper [1].
7. WORK OF A DATA ANALYST
→ Predict
→ Reveal associations
→ Munge data, design experiments, visualize data, …
The work of a data analyst is very diverse. This presentation focuses on the modeling of data, which can be reduced to two targets: learn a model to predict the outcome for new covariates, and gain a better understanding of the relationship between covariates and outcome.
9. SIMPLIFIED WORLDVIEW
[Diagram: covariates x → nature → outcome y]
In a strongly simplified world, an arbitrary outcome y is produced by nature given the covariates x. The knowledge about nature's true mechanisms ranges from entirely unknown to established (scientific) explanations of the mechanism. One example: the outcome y is the rent of apartments, and the covariates x are size, number of bathrooms, and location.
11. DATA MODELING CULTURE
[Diagram: x → box (Logistic Regression, Cox Model, GEE, …) → y]
Find a stochastic model of the data-generating process:
y = f(x, parameters, random error)
The direct modeling of the mechanism in the ``box'' is labeled »Data Modeling Culture« by Leo Breiman. In this culture a stochastic model for the data-generating process is assumed. A common formulation of these models is: y is a function of x with corresponding weights and a random error. For example: given the covariates size, number of bathrooms and location, the rent of apartments is normally distributed.
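The data modeling culture described above can be sketched in a few lines of code: assume a parametric model y = b0 + b1·x + error, estimate the parameters in closed form, and interpret them. The apartment numbers below are hypothetical, invented purely for illustration.

```python
# A minimal sketch of the data modeling culture: assume a stochastic model
# y = b0 + b1*x + error, estimate its parameters, and interpret them.
# The rent data are hypothetical, invented for illustration.

def fit_simple_ols(xs, ys):
    """Closed-form least-squares estimates for y = b0 + b1*x + error."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b1 = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
          / sum((x - mean_x) ** 2 for x in xs))
    b0 = mean_y - b1 * mean_x
    return b0, b1

# Hypothetical apartments: size in square meters vs. monthly rent.
sizes = [30, 45, 60, 75, 90]
rents = [400, 520, 650, 760, 880]

b0, b1 = fit_simple_ols(sizes, rents)
# In this culture the parameters themselves are the point of interest:
# b1 is read as "rent increase per additional square meter".
```

Interpreting the fitted parameters (and their p-values) is exactly the step the data modeling culture cares about, and the step the next slides put into question.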
13. PROBLEMS
→ Conclusions about model, not about nature
→ Assumptions often violated
→ Often no model evaluation
→ ⇒ Can lead to irrelevant theory and questionable statistical conclusions
→ Focus not on prediction
→ Data models fail in areas like image and speech recognition
16. MACHINE LEARNING
[Diagram: x → unknown mechanism → y, imitated by an algorithm]
Find a function f(x) that minimizes the loss L(y, f(x))
In the »Algorithmic Modeling Culture«, the true mechanism is treated as unknown. The target is not to find the true data-generating mechanism but to use an algorithm that imitates the mechanism as well as possible. Modeling is reduced to a mere problem of function optimization: given the covariates x, the outcome y and a loss function, find a function f(x) which minimizes the loss for the prediction of the outcome. This culture is lived in the machine learning community.
Summary: the data modeling culture tries to find the true data-generating mechanism; the algorithmic modeling culture tries to imitate the true mechanism as well as possible.
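The "modeling as loss optimization" view above can be made concrete: pick a function family f(x) = w·x, a squared-error loss L(y, f(x)), and let an algorithm (here plain gradient descent) search for the w that minimizes the loss. The data points are hypothetical, and no stochastic model of nature is assumed.

```python
# The algorithmic modeling culture in miniature: choose a loss and an
# algorithm that minimizes it. The data are hypothetical (roughly y = 2x).

def squared_loss(w, data):
    """Total squared error of the prediction f(x) = w * x."""
    return sum((y - w * x) ** 2 for x, y in data)

def fit_by_gradient_descent(data, lr=0.001, steps=500):
    """Minimize the squared loss by following its gradient in w."""
    w = 0.0
    for _ in range(steps):
        grad = sum(-2 * x * (y - w * x) for x, y in data)
        w -= lr * grad
    return w

data = [(1, 2.1), (2, 3.9), (3, 6.2)]
w = fit_by_gradient_descent(data)
# Only the predictive quality of f matters here, not whether w corresponds
# to any "true" parameter of nature.
```

The same template covers far more flexible function families (trees, networks, …); only f and the optimizer change, the loss-minimization framing stays.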
18. ALGORITHMS IN MACHINE LEARNING
→ Boosting
→ Support Vector Machines
→ Artificial neural networks
→ Random Forests
→ Hidden Markov Models
→ Bayes nets
→ …¹
¹ Details and more algorithms in ``The Elements of Statistical Learning'' [2]
The algorithms used in machine learning are motivated differently. Three algorithms are presented in short.
20. ARTIFICIAL NEURAL NETWORKS
Artificial neural networks are used in classification and regression. They are inspired by the brain, which consists of a network of brain cells (neurons). Mathematically, artificial neural networks are a concatenation of weighted functions. An exemplary application is image processing.
² Image credits: http://commons.wikimedia.org/wiki/File:Mouse_cingulate_cortex_neurons.jpg and http://commons.wikimedia.org/wiki/File:Neural_network.svg
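"A concatenation of weighted functions" can be shown literally: below is a minimal two-layer forward pass in plain Python. The weights are arbitrary illustrative values, not trained ones.

```python
# A tiny neural network as a concatenation of weighted functions:
# each layer computes weighted sums of its inputs and passes them
# through a sigmoid. Weights are arbitrary, chosen only for illustration.

import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weights, biases):
    """One layer: weighted sums of the inputs, squashed by the sigmoid."""
    return [sigmoid(sum(w * x for w, x in zip(ws, inputs)) + b)
            for ws, b in zip(weights, biases)]

def forward(x):
    hidden = layer(x, weights=[[0.5, -0.6], [0.3, 0.8]], biases=[0.0, -0.1])
    output = layer(hidden, weights=[[1.2, -0.7]], biases=[0.05])
    return output[0]

y = forward([1.0, 2.0])  # a prediction in (0, 1)
```

Training would adjust the weights to minimize a loss, exactly in the spirit of the algorithmic modeling culture; the structure above is only the function being optimized.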
22. SUPPORT VECTOR MACHINES
Support Vector Machines (SVMs) were originally a classification method (regression is also possible). SVMs try to draw a border between two classes in the covariate space such that the distance between the border and the class points is maximized. They use a mathematical trick to implicitly project the covariates into a higher-dimensional space (yes, it sounds a bit crazy) in order to achieve class separation. Text classification is an exemplary usage.
³ Image credit: http://commons.wikimedia.org/wiki/File:Svm_10_perceptron.JPG
24. RANDOM FORESTS
Random Forests™ (invented by Leo Breiman) are used for regression and classification. A Random Forest consists of many decision trees. Two random mechanisms are used to train different trees on the data, and the results from all trees are averaged for the prediction.
⁴ Image credit: http://openclipart.org/detail/175304/forest-by-z-175304
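One of the two random mechanisms (bootstrap sampling of the training data) plus averaging can be sketched in a few lines. This is a deliberately reduced, hypothetical illustration with depth-1 "trees" (stumps), not the actual Random Forests algorithm.

```python
# A Random Forest-style ensemble in miniature: train each "tree" (here a
# depth-1 stump) on a bootstrap sample and average the predictions.
# This is an illustrative sketch, not Breiman's full algorithm (which also
# randomizes the covariates considered at each split).

import random

def train_stump(sample):
    """A depth-1 'tree': split at the median x, predict the mean y per side."""
    sample = sorted(sample)
    split = sample[len(sample) // 2][0]
    left = [y for x, y in sample if x < split] or [0.0]
    right = [y for x, y in sample if x >= split] or [0.0]
    return lambda x: (sum(left) / len(left) if x < split
                      else sum(right) / len(right))

def forest_predict(stumps, x):
    return sum(s(x) for s in stumps) / len(stumps)

random.seed(0)
data = [(i, float(2 * i)) for i in range(10)]               # truth: y = 2x
stumps = [train_stump([random.choice(data) for _ in data])  # bootstrap sample
          for _ in range(50)]
pred = forest_predict(stumps, 7)
```

Each individual stump is a crude predictor; the averaged ensemble is noticeably smoother and more stable, which is the whole point of the construction.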
27. 2014-01-20
.
.
Statistical Modeling: The Two Cultures
RASHOMON EFFECT
Principles
(Often) Many different models describe a situation
equally accurate.
Rashomon Effect
.
.
Rashomon is a Japanese movie in which four witnesses tell
different versions of a crime. All versions account for the facts,
but they contradict each other. In terms of statistical learning
this means that different models (e.g. same y but different
covariates) can often be equally accurate. Each model has a
different interpretation, which makes it difficult to find the
true mechanism in the data modeling culture. The algorithmic
modeling culture exploits the Rashomon effect by aggregating
over many models: Random Forests aggregate over many
trees, and it is also common to average the predictions of
different algorithms.
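A tiny illustration with invented data: two regressions on perfectly collinear covariates fit equally well (identical residual sum of squares) yet report different slopes, i.e. different interpretations of the same situation.

```python
# invented toy data; xs2 is perfectly collinear with xs1
xs1 = [1.0, 2.0, 3.0, 4.0, 5.0]
xs2 = [2.0, 4.0, 6.0, 8.0, 10.0]
ys  = [2.1, 3.9, 6.2, 7.8, 10.1]

def ols_slope(x, y):
    # least-squares slope through the origin
    return sum(a*b for a, b in zip(x, y)) / sum(a*a for a in x)

def rss(x, y, b):
    # residual sum of squares: how well the model accounts for the facts
    return sum((yi - b*xi)**2 for xi, yi in zip(x, y))

b1, b2 = ols_slope(xs1, ys), ols_slope(xs2, ys)
# two different models, equally accurate, different parameters
print(b1, rss(xs1, ys, b1))
print(b2, rss(xs2, ys, b2))
```

Neither slope is "the true mechanism"; the data alone cannot decide between the two stories, which is exactly the Rashomon effect.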
Principles
DIMENSIONALITY OF THE DATA
→ The higher the dimensionality of the data (number of
covariates), the more difficult the separation of signal
and noise becomes.
→ Common practice in data modeling: variable selection
(by expert choice or data driven) and reduction of
dimensionality (e.g. PCA).
→ Common practice in algorithmic modeling:
engineering of new features (covariates) to increase
predictive accuracy; algorithms that are robust to
many covariates.
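Dimensionality reduction in the PCA sense can be sketched on invented 2-D toy data: the leading eigenvalue of the covariance matrix (closed form for the 2×2 case) tells us how much variance a single principal component would retain.

```python
import math

# invented toy data: most variance lies along one direction
pts = [(1.0, 1.1), (2.0, 1.9), (3.0, 3.2), (4.0, 3.9), (5.0, 5.1)]

n = len(pts)
mx = sum(p[0] for p in pts) / n
my = sum(p[1] for p in pts) / n

# entries of the 2x2 sample covariance matrix
sxx = sum((p[0] - mx)**2 for p in pts) / (n - 1)
syy = sum((p[1] - my)**2 for p in pts) / (n - 1)
sxy = sum((p[0] - mx) * (p[1] - my) for p in pts) / (n - 1)

# leading eigenvalue of [[sxx, sxy], [sxy, syy]], closed form for 2x2
tr, det = sxx + syy, sxx * syy - sxy * sxy
lam = tr / 2.0 + math.sqrt(tr * tr / 4.0 - det)

# share of total variance kept by the first principal component
explained = lam / (sxx + syy)
print(round(explained, 3))  # close to 1: one component suffices
```

When this share is near 1, replacing the two covariates by a single component loses almost no information, which is the point of the data modeling culture's dimensionality reduction.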
PREDICTION VS. INTERPRETATION
Principles
[Figure: models placed along an interpretability vs. predictive
accuracy axis, from logistic regression and single trees
(interpretable) to Random Forests and neural networks
(accurate)]
There is a trade-off between interpretability and predictive
accuracy: models that are good at prediction are often
complex, and models that are easy to interpret are often poor
predictors. Compare trees and Random Forests: a single
decision tree is very intuitive and easy to read for
non-professionals, but it is unstable and gives weak
predictions. A complex aggregation of decision trees (a
Random Forest) has excellent predictive accuracy, but its
model structure is practically impossible to interpret.
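Why the complex aggregation predicts better can be sketched with a toy noise model (assumed purely for illustration): averaging many unbiased but noisy predictors cancels out much of their individual error.

```python
import random
import statistics

random.seed(1)

TRUTH = 10.0

def noisy_model():
    # an unbiased but noisy predictor of the truth (toy assumption)
    return TRUTH + random.gauss(0.0, 2.0)

# error of a single model vs. error of an average over 25 models
single_errors = [abs(noisy_model() - TRUTH) for _ in range(1000)]
ensemble_errors = [
    abs(statistics.mean(noisy_model() for _ in range(25)) - TRUTH)
    for _ in range(1000)
]

print(statistics.mean(single_errors))    # large: one unstable predictor
print(statistics.mean(ensemble_errors))  # much smaller after averaging
```

The price of this accuracy is interpretability: the average of 25 models no longer has a readable structure, which is exactly the trade-off described above.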
Principles
GOODNESS OF MODEL
→ Data modeling culture: goodness of fit is often based
on model assumptions (e.g. AIC) and calculated on the
training data.
→ Algorithmic modeling culture: evaluation of predictive
accuracy on a separate test set or via cross-validation.
How good is a statistical model if its predictive accuracy
is weak? Is it legitimate to interpret its parameters and
p-values?
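The algorithmic culture's evaluation scheme can be sketched on invented toy data: fit the model on a training set only, then measure its squared error on a held-out test set it has never seen.

```python
import random
random.seed(2)

# invented toy data: y = 3x + Gaussian noise
data = [(float(x), 3.0 * x + random.gauss(0.0, 1.0)) for x in range(30)]
random.shuffle(data)
train, test = data[:20], data[20:]   # held-out test set

def fit_slope(pairs):
    # least-squares slope through the origin
    return sum(x * y for x, y in pairs) / sum(x * x for x, y in pairs)

b = fit_slope(train)                 # fitted on training data only
test_mse = sum((y - b * x) ** 2 for x, y in test) / len(test)
print(b, test_mse)                   # slope near 3, honest error estimate
```

Cross-validation repeats this split several times and averages the test errors; either way, the accuracy claim rests on data the model never saw, not on model assumptions.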
STATISTICAL CONSULTING
Experiences
Stereotypical users ...
→ are e.g. veterinarians, linguists, biologists, …
→ crave p-values
→ want interpretability
→ ignore model diagnostics
From my experience in the statistical consulting unit of our
university, most users want to test their scientific hypotheses
with models. They want models that are easy to interpret with
regard to their questions. Thus it is more important to them to
have a model that yields parameters associated with covariates
and p-values than one that predicts the data well.
Experiences
KAGGLE
Algorithms of winners on Kaggle, a platform for
prediction challenges:
→ Job Salary Prediction: »I used deep neural networks«
→ Observing Dark Worlds: »Bayesian analysis provided
the winning recipe for solving this problem«
→ Give Me Some Credit: »In the end we only used five
supervised learning methods: a random forest of
classification trees, a random forest of regression
trees, a classification tree boosting algorithm, a
regression tree boosting algorithm, and a neural
network.«
Kaggle (http://www.kaggle.com) is a platform for prediction
challenges. Data are provided, and the participant with the best
prediction on a test set wins the challenge. Generating insights
about the mechanisms in the data is secondary, because only
the prediction counts towards winning. That is why the
algorithmic modeling culture is superior in this field.
FURTHER LITERATURE
L. Breiman: Statistical modeling: The two cultures (with
comments and a rejoinder by the author). Statistical Science
16(3), Institute of Mathematical Statistics, 2001.
T. Hastie, R. Tibshirani and J. Friedman: The elements of
statistical learning. Springer, New York, 2001.