This presentation covers data science buzzwords, an introduction to big data, predictive analytics, and model-building methods. It also contrasts structured with unstructured data and supervised with unsupervised learning.
2. Presentation Objectives
• Enable you to be smarter than your prospect (data history / lingo)
• Motivate you to be unstoppable and hyper-confident
• Motivate you to begin looking for data-driven opportunities
• Motivate you to become a data scientist
3. "What the hell is cloud computing?"
-Larry Ellison, CEO Oracle
5. What is big data?
Big data includes datasets or problems that exceed the
capacity of a single computer and require a distributed data
access system.
The concept of "big" is relative to conventional systems
and technology, and is subject to change as memory and
storage solutions advance.
http://www.pcmag.com/article2/0,2817,2453838,00.asp
30. Prediction process
[Diagram. Stages shown: Raw data, Data munging, Data cleaning, Feature selection, Training, Model]
Other labels on the diagram:
• GPA, Colleges, Hobbies
• LSR, SVM, Random Forest, Naïve Bayesian, Neural Net
• Retail > 15, Engineering > 95
• > 5.67
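The stages named on this slide can be sketched in a few lines of Python. This is a minimal sketch under stated assumptions, not the presenter's pipeline: the candidate records, the field names (`gpa`, `hobby`, `score`), and the helper functions are all hypothetical, and ordinary least squares regression (the "LSR" label on the slide) stands in for the training step.

```python
# Hypothetical raw candidate records; None marks a missing GPA.
raw = [
    {"gpa": "3.0", "hobby": "chess", "score": 6.0},
    {"gpa": "3.5", "hobby": "golf", "score": 7.0},
    {"gpa": None, "hobby": "chess", "score": 5.0},
    {"gpa": "4.0", "hobby": "none", "score": 8.0},
]

def clean(rows):
    """Data cleaning / munging: drop rows with a missing GPA, cast text to numbers."""
    return [{**r, "gpa": float(r["gpa"])} for r in rows if r["gpa"] is not None]

def select_features(rows):
    """Feature selection: keep GPA as the only input, score as the target."""
    xs = [r["gpa"] for r in rows]
    ys = [r["score"] for r in rows]
    return xs, ys

def train(xs, ys):
    """Training: fit y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def predict(model, x):
    """Model: apply the fitted line to a new input."""
    slope, intercept = model
    return slope * x + intercept

xs, ys = select_features(clean(raw))
model = train(xs, ys)
prediction = predict(model, 3.2)
```

Each stage is a separate function so that a different cleaning rule, feature set, or training method can be swapped in without touching the rest.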
39. Data Lingo
Supervised vs unsupervised learning
Supervised: A labeled training set is provided.
Unsupervised: No labeled training set; for example,
clustering based on similar attributes.
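The contrast can be sketched on one small dataset. This is an illustrative sketch only: the data (years of experience), the labels, and the function names are hypothetical. A nearest-centroid classifier stands in for supervised learning and a two-cluster k-means stands in for unsupervised clustering.

```python
# Hypothetical 1-D data: years of experience for six candidates.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]

# Supervised: a training set with labels is provided.
labels = ["junior", "junior", "junior", "senior", "senior", "senior"]

def fit_centroids(xs, ys):
    """Learn one mean per label from the labeled training set."""
    groups = {}
    for x, y in zip(xs, ys):
        groups.setdefault(y, []).append(x)
    return {label: sum(v) / len(v) for label, v in groups.items()}

def classify(centroids, x):
    """Predict the label whose mean is closest to x."""
    return min(centroids, key=lambda label: abs(centroids[label] - x))

centroids = fit_centroids(points, labels)
predicted = classify(centroids, 4.0)

# Unsupervised: no labels; group points by similarity (k-means with k=2).
def two_means(xs, iterations=10):
    """Cluster 1-D points into two groups around two moving centers."""
    centers = [min(xs), max(xs)]  # simple deterministic starting centers
    for _ in range(iterations):
        clusters = [[], []]
        for x in xs:
            nearest = 0 if abs(x - centers[0]) <= abs(x - centers[1]) else 1
            clusters[nearest].append(x)
        # Move each center to the mean of its assigned points.
        centers = [sum(c) / len(c) if c else centers[i]
                   for i, c in enumerate(clusters)]
    return clusters

clusters = two_means(points)
```

The supervised half returns a named label because names were supplied in training; the unsupervised half only returns groups, which a person must still interpret.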
40. Data Lingo
Analytic Layers
Descriptive Analytics: Telling a data story, plotting, or
visualization.
Predictive Analytics: Predict future outcomes, usually
with a model trained on historical data.
Prescriptive Analytics: Using the insight from your
predictive model to proactively change something.
Interview/Interaction Analytics: Any analytics
surrounding the interview or interaction.
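The first three layers can be sketched on a single series. This is a rough sketch with hypothetical numbers: the monthly applicant counts, the `capacity` threshold, and the naive trend forecast are assumptions chosen for illustration, not a recommended forecasting method.

```python
from statistics import mean

# Hypothetical history: monthly applicant counts.
history = [100, 110, 120, 130]

# Descriptive: summarize what already happened.
average = mean(history)

# Predictive: extend the average month-over-month change by one month
# (a naive trend model, used here only for illustration).
changes = [later - earlier for earlier, later in zip(history, history[1:])]
forecast = history[-1] + mean(changes)

# Prescriptive: use the prediction to decide on an action.
capacity = 135  # hypothetical number of applicants the team can handle
action = "add recruiters" if forecast > capacity else "no change"
```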
41. Data Lingo
Prediction methods
Regression: Predicting a continuous output (e.g. a stock price).
Classification: Predicting discrete category outputs,
e.g. Yes/Maybe/No.
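One way to see the difference is to run the same method in both modes. The sketch below uses k-nearest neighbors, which is not named on the slide, and hypothetical training rows (interview score, starting salary, hire decision): averaging the neighbors' numbers gives regression, and a majority vote over their categories gives classification.

```python
from collections import Counter

# Hypothetical training rows: (interview score, salary in $k, hire decision).
training = [
    (2.0, 40.0, "No"),
    (3.0, 45.0, "No"),
    (5.0, 55.0, "Maybe"),
    (6.0, 60.0, "Maybe"),
    (8.0, 80.0, "Yes"),
    (9.0, 85.0, "Yes"),
]

def nearest(x, k):
    """Return the k training rows whose interview score is closest to x."""
    return sorted(training, key=lambda row: abs(row[0] - x))[:k]

def knn_regress(x, k=3):
    """Regression: average the neighbors' continuous targets."""
    rows = nearest(x, k)
    return sum(row[1] for row in rows) / len(rows)

def knn_classify(x, k=3):
    """Classification: majority vote over the neighbors' categories."""
    votes = Counter(row[2] for row in nearest(x, k))
    return votes.most_common(1)[0][0]

salary = knn_regress(8.4)     # a continuous number
decision = knn_classify(8.4)  # one of "Yes", "Maybe", "No"
```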
42. Data Lingo
Data Types
Structured: Does it play well in Excel?
Unstructured: Raw text (Twitter), audio, video,
photos, resumes, etc…
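Unstructured text usually has to be turned into rows and columns before a model can use it. The sketch below shows one common approach, a bag-of-words count table; the sample posts and the `tokenize` helper are hypothetical.

```python
import re
from collections import Counter

# Hypothetical unstructured input: raw text, e.g. short posts.
posts = [
    "Loving the new data science course!",
    "Big data, big problems. Data everywhere.",
]

def tokenize(text):
    """Lowercase the text and split it into alphabetic words."""
    return re.findall(r"[a-z]+", text.lower())

# Structured output: one row per post, one column per vocabulary word.
counts = [Counter(tokenize(post)) for post in posts]
vocabulary = sorted({word for c in counts for word in c})
table = [[c[word] for word in vocabulary] for c in counts]
```

The resulting `table`, with `vocabulary` as its header row, is the kind of data that "plays well in Excel".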