4. Shopping:
How does Amazon
forecast how many
items it needs to store
in its warehouses?
From www.formaspace.com
From cdn.wonderfulengineering.com (top), formaspace.com (bottom) and linkedin.com (right)
5. Climate: How does NASA automatically detect
land changes using satellite image data?
From cimss.ssec.wisc.edu, ipcc.ch, and www.spot-7.com
6. Medicine: How can genomics help to
personalize medical recommendations?
Data Matrix:
Rows = genes, Columns = patients
From www.originlab.com
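A minimal sketch of the data matrix layout described above (rows = genes, columns = patients). The gene and patient labels and all values are hypothetical placeholders, not real measurements, and the helper name `patient_profile` is introduced only for illustration.

```python
# Hypothetical labels; the values are placeholders, not real measurements.
genes = ["gene_A", "gene_B", "gene_C"]
patients = ["patient_1", "patient_2"]

# Rows = genes, columns = patients, as on the slide.
matrix = [
    [0.2, 1.5],
    [3.1, 0.4],
    [0.9, 0.9],
]

def patient_profile(matrix, patients, genes, patient):
    """Return one patient's column as a gene -> value mapping."""
    col = patients.index(patient)
    return {gene: row[col] for gene, row in zip(genes, matrix)}

profile = patient_profile(matrix, patients, genes, "patient_2")
print(profile)
```

Reading down a column gives the profile of a single patient, which is the kind of per-patient view a personalized recommendation would start from.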
7. Physics: How do you write software to
search for new physics particles?
Large Hadron Collider:
700 Mbytes/second
60 Terabytes/day
20 Petabytes/year
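The three rates above can be related with a short calculation. This is only a sketch: it assumes decimal (SI) prefixes and continuous recording every second of the year, neither of which is stated on the slide.

```python
# Unit sizes, assuming decimal (SI) prefixes.
MB = 10**6
TB = 10**12
PB = 10**15

rate_bytes_per_second = 700 * MB   # figure from the slide
seconds_per_day = 24 * 60 * 60

# Assumes recording runs continuously (an assumption, not from the slide).
bytes_per_day = rate_bytes_per_second * seconds_per_day
bytes_per_year = bytes_per_day * 365

print(f"{bytes_per_day / TB:.1f} TB/day")    # 60.5 TB/day
print(f"{bytes_per_year / PB:.1f} PB/year")  # 22.1 PB/year
```

The daily figure matches the slide's 60 Terabytes/day. The yearly figure comes out slightly above the slide's 20 Petabytes/year; one possible explanation (an assumption) is that data is not recorded around the clock all year.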
9. Social media: How does
Facebook recognize
people in images?
From Le Cun and Ranzato 2013
10. How?
• All of these applications use Data Science
• These applications are built on
combinations of ideas from:
o Database systems
o Algorithms
o Machine learning
o Probabilistic models
o Statistical forecasting
o Data visualization
o and more…
13. What makes a good Data
Scientist?
• Interested in computing
– Enjoy working with algorithms, programming, machine
learning,…
• Have a good mathematics background
– Comfortable with mathematical ideas and concepts
– Interested in applying mathematical ideas to real-world
problems
• Enthusiastic about analyzing data
– Enjoy working with data: exploring, visualizing, modeling,
understanding
14. What classes might I take in the DS Major?
(Sample electives shown in parentheses)
Statistics
Stats 120 ABC: Intro to Prob and Stats
Stats 68: Exploratory Data Analysis
Stats 110-112: Statistical Methods
CS 178: Machine Learning
(Stats 140: Multivariate Statistics)
Computing
ICS 46: Data Structures
In4matx 43: Intro to Software Engineering
CS 122A: Intro to Data Management
CS 161: Design and Analysis of Algorithms
(CS 131: Parallel and Distributed Computing)
(CS 172: Neural Networks/Deep Learning)
Applications
Stats 170AB: Data Science Capstone Project
INF 143: Information Visualization
(INF 131: Human Computer Interaction)
(CS 121: Information Retrieval)
(CS 122B: Project in Databases/Web Applications)
(Summer internships, e.g., junior year)
15. Sample course of study in the major
Year 1 Sample Program:
Years 1 and 2 focus on foundational courses in computer science,
mathematics, and statistics, including statistical computing
• Fall (12 units)
– ICS 31, Social Analysis of Computerization (4 units)
– Math 2A, Calculus I (4 units)
– Writing 39A, Writing and Rhetoric (4 units)
• Winter (13 units)
– ICS 32 (4 units)
– Math 2B, Calculus II (4 units)
– Writing 39B, Critical Reading and Rhetoric (4 units)
– Stats 5, Seminar in DS (1 unit)
• Spring (16 units)
– ICS 33, Intermediate Programming (4 units)
– Math 2D, Multivariable Calculus (4 units)
– Stats 7, Basic Statistics (4 units)
– Writing 39C, Argument and Research (4 units)
16. Sample course of study in the major
Year 2 Sample Program:
Years 1 and 2 focus on foundational courses in computer science,
mathematics, and statistics, including statistical computing
• Fall (16 units)
– ICS 6B, Boolean Algebra and Logic (4 units)
– Math 3A, Intro to Linear Algebra (4 units)
– Stats 120A, Intro to Probability and Statistics I (4 units)
– GE III (4 units)
• Winter (14 units)
– ICS 45C, C/C++ (4 units)
– ICS 51, Intro to Computer Organization (6 units)
– Stats 120B, Intro to Probability and Statistics II (4 units)
• Spring (16 units)
– Stats 68, Stat Computing and Exploratory DA (4 units)
– Stats 120C, Intro to Probability and Statistics III (4 units)
– ICS 46, Data Structures (4 units)
– ICS 6D, Discrete Mathematics (4 units)
17. Sample course of study in the major
Year 3 Sample Program:
Year 3 includes more emphasis on, and specialization in, data science topics such
as machine learning, databases, visualization, and advanced statistics
• Fall (16 units)
– Stats 110, Statistical Methods for Data Analysis I (4 units)
– CS 161, Design and Analysis of Algorithms (4 units)
– In4matx 43, Introduction to Software Engineering (4 units)
– GE IV/VIII (4 units)
• Winter (16 units)
– Stats 111, Statistical Methods for Data Analysis II (4 units)
– CS 178, Machine Learning and Data-Mining (4 units)
– ICS 139W, Critical Writing on Information Technology (4 units)
– GE III/VII (4 units)
• Spring (16 units)
– Stats 112, Statistical Methods for Data Analysis III (4 units)
– CS 122A, Introduction to Data Management (4 units)
– In4matx 143, Information Visualization (4 units)
– GE VI (4 units)
18. Sample course of study in the major
Year 4 Sample Program:
• Stats 170A (4 units) + Stats 170B (4 units)
– Two-quarter data science capstone project courses
– “Data-intensive” projects
– Team-based
• Statistics electives
• CS electives
• Individual studies
• Internship
20. What can I do with a
Data Science Major?
• Careers in “Data-Oriented” Companies and Organizations
– Computing/internet companies: Google, Amazon, Facebook, IBM, …
– Engineering companies: Intel, Samsung, Boeing, …
– Finance/insurance companies
– Medical/pharmaceutical companies
– Government/national labs: NASA, NIST, DoD, …
– Many, many more…
• Option to specialize with a graduate degree (MS or PhD)
– Computer Science: specialize in a topic such as machine learning,
databases, etc.
– Statistics: specialize in a statistical topic, e.g., computational statistics
– MS/PhD degrees lead to a wide variety of careers
21. Are there jobs for Data Scientists?
“The fastest-growing roles are Data Scientists and Advanced Analysts, which
are projected to see demand spike by 28% by 2020. Employers are willing to pay
premium salaries for professionals with expertise in these areas as well.”
(Louis Columbus, Forbes, May 2017)
Job Postings Data from Indeed.com, September 2017
22. Are there (currently) jobs for Data
Scientists?
Glassdoor.com currently ranks Data Scientist as the #1 job in America, based on
number of job openings (4,184, compared to 1,736 in August 2016), median base
salary ($110,000), and career opportunity.
Source (September 14, 2017): https://www.glassdoor.com/List/Best-Jobs-in-America-LST_KQ0,20.htm
23. Do I need a Data Science
degree to do Data Science?
• Technically, no: many people currently work as “data scientists” with
quantitative degrees that are not in data science
– Some with statistics, some with computer science, some with a
combination
– Some with other quantitative degrees
• Advantages of the DS major
– Puts you on the “fast track” to becoming a Data Scientist
– Ensures that you will know the fundamentals of both
• Computing
• Statistics
– Provides you with skills that are likely to have lasting value (as
technology changes)
24. What are other degree options?
• Computer Science with a Statistics minor?
– More classes in “systems” aspects of computer science
– Fewer classes in statistics
– No capstone data science project class
• Another degree like Math or Economics with a Statistics minor?
– Far fewer classes in computer science
– Fewer classes in statistics
– No capstone data science project class
• Statistics undergraduate degree (e.g., at another UC)?
– More classes in mathematics and statistics
– Far fewer classes in computer science
– No capstone data science project class
25. A sample of internship requirements
(from a local firm)
• High energy self starter
• Experience with machine learning techniques such as Naive
Bayes, SVM, Random Forest, Neural Networks
• Experience with cloud platforms technology stacks: AWS,
Azure, Google Cloud Platform
• Data Concepts (ETL, streaming, data structures, metadata
management)
• Programming skills in Python or R
• BS in Data Science, Statistics, OR, Computer Science or
Engineering (MS a plus)
26. Want to learn more?
Visit us online!
• Additional information on the Data
Science major at UCI
http://www.stat.uci.edu/slider/b-s-in-data-science/
• Seminar announcements
• Job/Internship opportunities