SlideShare a Scribd company logo
1 of 54
Download to read offline
Lecture 1: Introduction
CIS 4190/5190 (Fall 2022)
August 31
CIS 4190/5190: Applied Machine Learning
• Course goals
• Identify opportunities for applying machine learning (ML) algorithms
• Train ML models
• Diagnose and debug issues in ML models
• Lectures will focus on developing mathematical understanding
• Assignments will focus on applying this understanding to
implementing ML solutions
Agenda
• Logistics
• Course description
• Course policies
• Tentative Schedule
• Grading
• Introduction
• Motivation
• Basic definitions
• Examples
Instructors
Prof. Zachary Ives
Prof. Osbert Bastani
Teaching Assistants
• Brian Chen
• Swati Gupta
• Jio Jeong
• Jinhui Luo
• Jason Ma
• Rushab Manthripragada
• Ramya Ramalingam
• Peizhi Wu
Course Website
• Course website
• https://www.seas.upenn.edu/~cis5190/fall2022/
• Syllabus
• Tentative schedule
• Links to all assignments
• Resources
Office Hours
• Each TA and instructor will have 1-2 hours of office hours each week
• Times still being decided
• We are aiming to have a mix of in person and remote office hours
Communication
• We will use Ed Discussion for questions and course discussions
• Send a message to “instructors” to contact professors and TAs
• You can contact the professors directly on Ed Discussion or by email (posted
on course website); always contact both of us
Attendance Policy
• We expect all students to attend classes regularly
• Weekly quizzes designed to make sure students are following course material
• Please do not come to class if you are not feeling well or if you test
positive for Covid-19
• We will provide course recordings and lecture notes for students unable to
make it to class
• These will be available on Canvas (though we reserve the right to change our
policy if attendance becomes an issue)
Masking Policy
• There are some students in the class who are at higher risk from
COVID exposure
• Thus, we are setting a policy that we will enforce wearing a mask in
class and during office hours
• We appreciate your consideration of your classmates
Waitlist Policy
• We are considering students roughly in the order (up to the cap):
1. Absolutely need to take this course during this semester (Fall 2022) for a
graduation requirement
2. Absolutely need to take this course during next semester (Spring 2023) for
a graduation requirement
3. Graduating in this semester (Fall 2022)
• In addition, you must satisfy prerequisites and have correctly
answered all waitlist questions (https://waitlist.cis.upenn.edu)
Waitlist Policy
• If you satisfy one of these categories, email me with which category
you are in and a clear explanation for why you are in that category
• Email: obastani@seas.upenn.edu
• Deadline: Thursday at 8pm
• Final decisions at the discretion of the instructors
Prerequisites
• Math: University-level courses in probability, linear algebra, and
multivariable calculus
• Understand prior and posterior probabilities, 𝔼 𝑋 = ∫ 𝑝 𝑥 𝑑𝑥, etc.
• Understand matrix ranks, inverses, and eigenvalues
• Understand how to compute ∇!(𝐴𝑥) for a matrix 𝐴 and vector 𝑥
• Tested in HW 1
• Programming: Previously coded up projects (preferably in Python)
that were at least 100 lines of code long
• We will provide Python help (primer + office hours) for students who know
how to program but do not know Python
Course Comparisons
• CIS 4190/5190 (this course)
• Basic mathematical ideas behind ML
• Apply existing ML algorithms to new problems as an engineer or researcher
Also see: https://priml.upenn.edu/courses
Course Comparisons
• CIS 4190/5190 (this course)
• Basic mathematical ideas behind ML
• Apply existing ML algorithms to new problems as an engineer or researcher
• CIS 5200
• Deeper, more mathematically demanding introduction to ML
• Perhaps do fundamental ML research in the future
Also see: https://priml.upenn.edu/courses
Course Comparisons
• CIS 4190/5190 (this course)
• Basic mathematical ideas behind ML
• Apply existing ML algorithms to new problems as an engineer or researcher
• CIS 5220
• Deep learning techniques and applications in more detail
• CIS 5450
• Data science workflow including data wrangling, ML modeling, and analytics
• Scaling ML to big datasets and clusters
Also see: https://priml.upenn.edu/courses
CIS 4190 vs. 5190
• 5190 will have extra, mandatory components in the HW, which are
optional for 4190
• Example
• HW may have 45 points for 4190, and 5 extra points for 5190 (total of 50)
• Student taking 4190 will get 100% if they get at least 45 points (typically by
skipping the 5190 problem, but not necessarily)
• You cannot score more than 100%
• The written and coding portion are counted separately; you cannot make up
written points using coding points and vice versa
Tentative Schedule
Week Content Homework Project
1 Introduction
2 Linear Regression
3 Linear Regression, Cont. HW 1 Due
4 Logistic Regression
5 Decision Trees
6 Neural Networks HW 2 Due
7 Computer Vision Milestone 1 Due
8 Bayesian Networks HW 3 Due
9 Unsupervised Learning
10 Natural Language Processing HW 4 Due
11 Reinforcement Learning Milestone 2 Due
12 Recommender Systems HW 5 Due
13 Ethics
14 Ethics HW 6 Due
15 Review Milestone 3 Due
16 Review
Grading Scheme
• Homeworks (6×): 30%
• Project (3 milestones, teams of 3): 20%
• Final exam (during exam week): 35%
• Quizzes (12 ×, roughly weekly): 10%
• 50% correct sufficient for full credit
• Class participation: 5%
Grading Scheme
• A+: 90+
• A: 85-90
• A-: 80-85
• B+: 75-80
• B: 70-75
• B-: 65-70
• Lower passing grades: 50-65
• May be curved up
Late Policy
• For each hour late, lose 0.5% on the points for that assignment
• Homeworks, quizzes, project milestones
• Max 48 late hours per assignment
• Example
• Submit HW 1 late by 20 hours
• Lose 20×0.5 = 10% on HW 1 (0.5% of overall grade)
• If you have a medical reason, email both professors a copy of your
medical visit report, and we will grant an extension (typically 2 days)
• We will consider other reasons on a case-by-case basis
Homework Schedule
• 6 homeworks
• Released every other Wednesday
• Due Wednesday 2 weeks later (with an exception for HW 2)
• Due at 8pm
Homework Structure
• Written problems: GradeScope submission
• LaTeX encouraged; handwritten + scanned at your own risk
• Won’t be graded if you don’t annotate your answers correctly!
• Coding problems: AutoGrader + GradeScope submission of notebook
• Colab/iPython notebook skeletons; AutoGrader as unit tests within skeleton
• Only difference between AutoGrader and unit tests is different data
• If code passes the unit tests and you didn’t “game” it, it should pass AutoGrader
• Discussion permitted for clarifications, but never share solutions or
code; acknowledge all your discussions at beginning of your report
Homework 1
• Designed to test mathematical background
• Opportunity to get used to the workflow
• Full points if you score 50% or more
• We will not answer questions about the HW 1 (except clarifying
questions)
Quiz Schedule
• 12 quizzes
• Released every Wednesday (starting next week)
• Due Thursday 8 days later
• Checks basic understanding of material covered the previous week
Agenda
• Logistics
• Course description
• Course policies
• Tentative Schedule
• Grading
• Introduction
• Motivation
• Basic definitions
• Examples
What is Machine Learning?
“Learning is any process by
which a system improves
performance from experience.”
Herbert Simon
What is Machine Learning?
“Machine learning … gives
computers the ability to learn
without being explicitly
programmed.”
Arthur Samuel
What is Machine Learning?
• Tom Mitchell: Algorithms that
• improve their performance 𝑃
• at task 𝑇
• with experience 𝐸
• A well-defined machine learning
task is given by 𝑃, 𝑇, 𝐸
Example: Game Playing
• Tom Mitchell: Algorithms that
• improve their performance 𝑃
• at task 𝑇
• with experience 𝐸
• 𝑇 = playing Checkers
• 𝑃 = win rate against opponents
• 𝐸 = playing games against itself
Example: Prediction
31
Image: https://www.flickr.com/photos/gsfc/5937599688/
Data from https://nsidc.org/arcticseaicenews/sea-ice-tools/
0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
1975 1985 1995 2005 2015 2025
Arctic
Sea
Ice
Extent
(millions
of
sq
km)
Year
NSIDC Index of Arctic Sea Ice in September
Photo by NASA Goddard
??
Example: Prediction
• Tom Mitchell: Algorithms that
• improve their performance 𝑃
• at some task 𝑇
• with experience 𝐸
• 𝑇 = predict Arctic sea ice extent
• 𝑃 = prediction error (e.g.,
absolute difference)
• 𝐸 = historical data
Machine Learning for Prediction
Data 𝑍 Machine learning
algorithm
Model 𝑓
New input
Predicted output
Example: Prediction
0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
1975 1985 1995 2005 2015 2025
Arctic
Sea
Ice
Extent
(millions
of
sq
km)
Year
NSIDC Index of Arctic Sea Ice in September
Photo by NASA Goddard
??
Image: https://www.flickr.com/photos/gsfc/5937599688/
Data from https://nsidc.org/arcticseaicenews/sea-ice-tools/
Our focus
Machine Learning Workflow
Framing an ML problem (Mitchell’s P, T, E)
Data curation (sourcing, scraping, collection, labeling)
Data analysis / visualization
ML Design (hypothesis class, loss function, optimizer, hyperparameters,
features)
Train model
Validate / Evaluate
Deploy (and generate new data)
Monitor performance on new data
Types of Learning
• Supervised learning
• Input: Examples of inputs and outputs
• Output: Model that predicts unknown output given a new input
• Unsupervised learning
• Input: Examples of some data (no “outputs”)
• Output: Representation of structure in the data
• Reinforcement learning
• Input: Sequence of interactions with an environment
• Output: Policy that performs a desired task
Supervised Learning
• Given 𝑥!, 𝑦! , … , 𝑥", 𝑦" , learn a function that predicts 𝑦 given 𝑥
37
Image: https://www.flickr.com/photos/gsfc/5937599688/
Data from https://nsidc.org/arcticseaicenews/sea-ice-tools/
0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
1975 1985 1995 2005 2015 2025
Arctic
Sea
Ice
Extent
(millions
of
sq
km) Year
NSIDC Index of Arctic Sea Ice in September
Photo by NASA Goddard
Supervised Learning
• Given 𝑥!, 𝑦! , … , 𝑥", 𝑦" , learn a function that predicts 𝑦 given 𝑥
• Regression: Labels 𝑦 are real-valued
38
Image: https://www.flickr.com/photos/gsfc/5937599688/
Data from https://nsidc.org/arcticseaicenews/sea-ice-tools/
0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
1975 1985 1995 2005 2015 2025
Arctic
Sea
Ice
Extent
(millions
of
sq
km) Year
NSIDC Index of Arctic Sea Ice in September
Photo by NASA Goddard
Supervised Learning
• Given 𝑥!, 𝑦! , … , 𝑥", 𝑦" , learn a function that predicts 𝑦 given 𝑥
• Classification: Labels 𝑦 are categories
Tumor Size
Ocular Tumor (Malignant / Benign)
Predict Malignant
Predict Benign
Malignant
Benign
𝑓(𝑥)
Image: https://eyecancer.com/uncategorized/choroidal-
metastasis-test/
Supervised Learning
• Given 𝑥!, 𝑦! , … , 𝑥", 𝑦" , learn a function that predicts 𝑦 given 𝑥
• Inputs 𝑥 can be multi-dimensional
Image: https://eyecancer.com/uncategorized/choroidal-
metastasis-test/
Tumor Size
Age
• Patient age
• Clump thickness
• Tumor Color
• Cell type
• …
Unsupervised Learning
• Given 𝑥!, … , 𝑥" (no labels), output hidden structure in 𝑥’s
• E.g., clustering
Unsupervised Learning
Image Credits:
https://medium.com/graph-commons/finding-organic-clusters-in-your-complex-data-networks-5c27e1d4645d
https://arxiv.org/pdf/1703.08893.pdf
https://en.wikipedia.org/wiki/Exoplanet
Find Subgroups in
Social Networks
Identify Types of Exoplanets Visualize Data
Reinforcement Learning
• Learn how to perform a task from
interactions with the environment
• Examples:
• Playing chess (interact with the game)
• Robot grasping an object (interact
with the object/real world)
• Optimize inventory allocations
(interact with the inventory system)
environment agent
Reinforcement Learning
https://www.youtube.com/watch?v=iaF43Ze1oeI
Applications of Machine Learning
45
Everyday Applications
Radiology and Medicine
https://www.nature.com/articles/s41746-020-00376-2
Input: Brain scans
Output: Neurological disease labels
https://www.nature.com/articles/s41573-019-0024-5
Scientific Discovery
https://deepmind.com/blog/article/AlphaFol
d-Using-AI-for-scientific-discovery
https://www.jpl.nasa.gov/edu/news/2019/4/19/how-scientists-captured-
the-first-image-of-a-black-hole/
Creating Images & Text
https://thispersondoesnotexist.com/
https://transformer.huggingface.co/doc/gpt2-large
Analytical
Modeling/
Understanding
Data Quantity and Quality
When should we use machine learning?
Adding two numbers
NO
Solving differential equations
YES, SOMETIMES
Flying rockets to other planets
NO
Checking large prime numbers
NO
Recognizing animals from pictures
YES!
Make art and music
YES!
Get robots to make sandwiches
YES, PROBABLY
Predict fashion in 20 years
NO, PROBABLY
Weather forecasting
MAYBE?
Danger of Out-of-Domain Machine Learning
0.0
1.0
2.0
3.0
4.0
5.0
6.0
7.0
8.0
9.0
1975 1985 1995 2005 2015 2025
Arctic
Sea
Ice
Extent
(millions
of
sq
km)
Year
NSIDC Index of Arctic Sea Ice in September ??
Any time you are evaluating on data “far” from your training data, beware!
Ethical Considerations
“The Pennsylvania Board of Probation and Parole has begun using machine learning forecasts to help inform
parole release decisions. In this paper, we evaluate the impact of the forecasts on those decisions and
subsequent recidivism.”
“In 2013, the University of Texas at Austin’s computer science department began using a machine-learning
system called GRADE to help make decisions about who gets into its Ph.D. program”
“Videos about vegetarianism led to videos about veganism. Videos about jogging led to videos about running
ultramarathons. It seems as if you are never ‘hard core’ enough for YouTube’s recommendation algorithm. It
promotes, recommends and disseminates videos in a manner that appears to constantly up the stakes. Given
its billion or so users, YouTube may be one of the most powerful radicalizing instruments of the 21st century.”
Tentative Schedule
Week Content Homework Project
1 Introduction
2 Linear Regression
3 Linear Regression, Cont. HW 1 Due
4 Logistic Regression
5 Decision Trees
6 Neural Networks HW 2 Due
7 Computer Vision Milestone 1 Due
8 Bayesian Networks HW 3 Due
9 Unsupervised Learning
10 Natural Language Processing HW 4 Due
11 Reinforcement Learning Milestone 2 Due
12 Recommender Systems HW 5 Due
13 Ethics
14 Ethics HW 6 Due
15 Review Milestone 3 Due
16 Review
First Assignments
• HW 1: Released today, due 9/14
• No office hours planned for HW 1
• 50% = full credit
• Quiz 1: Released 9/7, due 9/15

More Related Content

Similar to lecture1.pdf

Workplace Simulated Courses - Course Technology Computing Conference
Workplace Simulated Courses - Course Technology Computing ConferenceWorkplace Simulated Courses - Course Technology Computing Conference
Workplace Simulated Courses - Course Technology Computing ConferenceCengage Learning
 
Addictive links, Keynote talk at WWW 2014 workshop
Addictive links, Keynote talk at WWW 2014 workshopAddictive links, Keynote talk at WWW 2014 workshop
Addictive links, Keynote talk at WWW 2014 workshopPeter Brusilovsky
 
Essentials for a Better ICT Student in Palestine
Essentials for a Better ICT Student in PalestineEssentials for a Better ICT Student in Palestine
Essentials for a Better ICT Student in PalestineJafar Hajeer
 
Data Science: Introduction
Data Science: IntroductionData Science: Introduction
Data Science: IntroductionJinho Choi
 
OpenSubmit - How to grade 1200 code submissions
OpenSubmit - How to grade 1200 code submissionsOpenSubmit - How to grade 1200 code submissions
OpenSubmit - How to grade 1200 code submissionsPeter Tröger
 
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 7-8
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 7-8 CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 7-8
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 7-8 Patty Stephens
 
TLC2016 - Turning Blackboard Learn into a Digital Examination Platform: lesso...
TLC2016 - Turning Blackboard Learn into a Digital Examination Platform: lesso...TLC2016 - Turning Blackboard Learn into a Digital Examination Platform: lesso...
TLC2016 - Turning Blackboard Learn into a Digital Examination Platform: lesso...BlackboardEMEA
 
ITC12 Five Effective Practices for eLearning Professional Development
ITC12 Five Effective Practices for eLearning Professional DevelopmentITC12 Five Effective Practices for eLearning Professional Development
ITC12 Five Effective Practices for eLearning Professional DevelopmentBarry Dahl
 
Updating Teaching Techonologies - Real World Impact!
Updating Teaching Techonologies - Real World Impact!Updating Teaching Techonologies - Real World Impact!
Updating Teaching Techonologies - Real World Impact!afacct
 
Program Overview - Data Science and ML Bootcamp by Jovian (1).pdf
Program Overview - Data Science and ML Bootcamp by Jovian (1).pdfProgram Overview - Data Science and ML Bootcamp by Jovian (1).pdf
Program Overview - Data Science and ML Bootcamp by Jovian (1).pdfRishirajPatidar2
 
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 9-12
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 9-12CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 9-12
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 9-12Patty Stephens
 
Wilczewski iced report to nasa at ksc on 20180609 f
Wilczewski iced report to nasa at ksc on 20180609 fWilczewski iced report to nasa at ksc on 20180609 f
Wilczewski iced report to nasa at ksc on 20180609 fRich Wilczewski
 
Social Web: (Big) Data Mining | summer 2014/2015 course syllabus
Social Web: (Big) Data Mining | summer 2014/2015 course syllabusSocial Web: (Big) Data Mining | summer 2014/2015 course syllabus
Social Web: (Big) Data Mining | summer 2014/2015 course syllabusJakub Ruzicka
 
Pmp capm exam preparation
Pmp capm exam preparationPmp capm exam preparation
Pmp capm exam preparationFreedom Monk
 
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...Lucidworks
 

Similar to lecture1.pdf (20)

Workplace Simulated Courses - Course Technology Computing Conference
Workplace Simulated Courses - Course Technology Computing ConferenceWorkplace Simulated Courses - Course Technology Computing Conference
Workplace Simulated Courses - Course Technology Computing Conference
 
Addictive links, Keynote talk at WWW 2014 workshop
Addictive links, Keynote talk at WWW 2014 workshopAddictive links, Keynote talk at WWW 2014 workshop
Addictive links, Keynote talk at WWW 2014 workshop
 
Essentials for a Better ICT Student in Palestine
Essentials for a Better ICT Student in PalestineEssentials for a Better ICT Student in Palestine
Essentials for a Better ICT Student in Palestine
 
Data Science: Introduction
Data Science: IntroductionData Science: Introduction
Data Science: Introduction
 
BbWorld 2010 notes
BbWorld 2010 notesBbWorld 2010 notes
BbWorld 2010 notes
 
OpenSubmit - How to grade 1200 code submissions
OpenSubmit - How to grade 1200 code submissionsOpenSubmit - How to grade 1200 code submissions
OpenSubmit - How to grade 1200 code submissions
 
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 7-8
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 7-8 CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 7-8
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 7-8
 
TLC2016 - Turning Blackboard Learn into a Digital Examination Platform: lesso...
TLC2016 - Turning Blackboard Learn into a Digital Examination Platform: lesso...TLC2016 - Turning Blackboard Learn into a Digital Examination Platform: lesso...
TLC2016 - Turning Blackboard Learn into a Digital Examination Platform: lesso...
 
ITC12 Five Effective Practices for eLearning Professional Development
ITC12 Five Effective Practices for eLearning Professional DevelopmentITC12 Five Effective Practices for eLearning Professional Development
ITC12 Five Effective Practices for eLearning Professional Development
 
Wk1 - L1.ppsx
Wk1 - L1.ppsxWk1 - L1.ppsx
Wk1 - L1.ppsx
 
The Art of Project Estimation
The Art of Project EstimationThe Art of Project Estimation
The Art of Project Estimation
 
Updating Teaching Techonologies - Real World Impact!
Updating Teaching Techonologies - Real World Impact!Updating Teaching Techonologies - Real World Impact!
Updating Teaching Techonologies - Real World Impact!
 
Facebook Presentation, Facebook Promotion ppt
Facebook Presentation, Facebook Promotion pptFacebook Presentation, Facebook Promotion ppt
Facebook Presentation, Facebook Promotion ppt
 
Program Overview - Data Science and ML Bootcamp by Jovian (1).pdf
Program Overview - Data Science and ML Bootcamp by Jovian (1).pdfProgram Overview - Data Science and ML Bootcamp by Jovian (1).pdf
Program Overview - Data Science and ML Bootcamp by Jovian (1).pdf
 
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 9-12
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 9-12CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 9-12
CCSS Math and SBAC Assessments: Assessing the Common Core, Grades 9-12
 
Facebook
FacebookFacebook
Facebook
 
Wilczewski iced report to nasa at ksc on 20180609 f
Wilczewski iced report to nasa at ksc on 20180609 fWilczewski iced report to nasa at ksc on 20180609 f
Wilczewski iced report to nasa at ksc on 20180609 f
 
Social Web: (Big) Data Mining | summer 2014/2015 course syllabus
Social Web: (Big) Data Mining | summer 2014/2015 course syllabusSocial Web: (Big) Data Mining | summer 2014/2015 course syllabus
Social Web: (Big) Data Mining | summer 2014/2015 course syllabus
 
Pmp capm exam preparation
Pmp capm exam preparationPmp capm exam preparation
Pmp capm exam preparation
 
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
Enriching Solr with Deep Learning for a Question Answering System - Sanket Sh...
 

More from Tamer Nadeem

PresentationDNN.pptx
PresentationDNN.pptxPresentationDNN.pptx
PresentationDNN.pptxTamer Nadeem
 
Semi-Supervised.pptx
Semi-Supervised.pptxSemi-Supervised.pptx
Semi-Supervised.pptxTamer Nadeem
 
4_1_indoorLocalization_1_fingerprint_deadreckoning.pptx
4_1_indoorLocalization_1_fingerprint_deadreckoning.pptx4_1_indoorLocalization_1_fingerprint_deadreckoning.pptx
4_1_indoorLocalization_1_fingerprint_deadreckoning.pptxTamer Nadeem
 
cs229-probability_review_slides.pdf
cs229-probability_review_slides.pdfcs229-probability_review_slides.pdf
cs229-probability_review_slides.pdfTamer Nadeem
 

More from Tamer Nadeem (8)

PresentationDNN.pptx
PresentationDNN.pptxPresentationDNN.pptx
PresentationDNN.pptx
 
Chapter 1.ppt
Chapter 1.pptChapter 1.ppt
Chapter 1.ppt
 
Semi-Supervised.pptx
Semi-Supervised.pptxSemi-Supervised.pptx
Semi-Supervised.pptx
 
4_1_indoorLocalization_1_fingerprint_deadreckoning.pptx
4_1_indoorLocalization_1_fingerprint_deadreckoning.pptx4_1_indoorLocalization_1_fingerprint_deadreckoning.pptx
4_1_indoorLocalization_1_fingerprint_deadreckoning.pptx
 
qos-f05.pdf
qos-f05.pdfqos-f05.pdf
qos-f05.pdf
 
cs229-probability_review_slides.pdf
cs229-probability_review_slides.pdfcs229-probability_review_slides.pdf
cs229-probability_review_slides.pdf
 
N00014 21-s-f003
N00014 21-s-f003N00014 21-s-f003
N00014 21-s-f003
 
Ci carplay-cic
Ci carplay-cicCi carplay-cic
Ci carplay-cic
 

Recently uploaded

Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Celine George
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionSafetyChain Software
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTiammrhaywood
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...Marc Dusseiller Dusjagr
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxpboyjonauth
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxiammrhaywood
 
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptxENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptxAnaBeatriceAblay2
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Sapana Sha
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxGaneshChakor2
 
Science lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonScience lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonJericReyAuditor
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfsanyamsingh5019
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfadityarao40181
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Educationpboyjonauth
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Krashi Coaching
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsanshu789521
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,Virag Sontakke
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaVirag Sontakke
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdfssuser54595a
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application ) Sakshi Ghasle
 

Recently uploaded (20)

Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
Incoming and Outgoing Shipments in 1 STEP Using Odoo 17
 
Mastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory InspectionMastering the Unannounced Regulatory Inspection
Mastering the Unannounced Regulatory Inspection
 
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPTECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
ECONOMIC CONTEXT - LONG FORM TV DRAMA - PPT
 
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
“Oh GOSH! Reflecting on Hackteria's Collaborative Practices in a Global Do-It...
 
Introduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptxIntroduction to AI in Higher Education_draft.pptx
Introduction to AI in Higher Education_draft.pptx
 
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdfTataKelola dan KamSiber Kecerdasan Buatan v022.pdf
TataKelola dan KamSiber Kecerdasan Buatan v022.pdf
 
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptxSOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
SOCIAL AND HISTORICAL CONTEXT - LFTVD.pptx
 
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptxENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
ENGLISH5 QUARTER4 MODULE1 WEEK1-3 How Visual and Multimedia Elements.pptx
 
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111Call Girls in Dwarka Mor Delhi Contact Us 9654467111
Call Girls in Dwarka Mor Delhi Contact Us 9654467111
 
CARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptxCARE OF CHILD IN INCUBATOR..........pptx
CARE OF CHILD IN INCUBATOR..........pptx
 
Science lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lessonScience lesson Moon for 4th quarter lesson
Science lesson Moon for 4th quarter lesson
 
Sanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdfSanyam Choudhary Chemistry practical.pdf
Sanyam Choudhary Chemistry practical.pdf
 
Biting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdfBiting mechanism of poisonous snakes.pdf
Biting mechanism of poisonous snakes.pdf
 
Introduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher EducationIntroduction to ArtificiaI Intelligence in Higher Education
Introduction to ArtificiaI Intelligence in Higher Education
 
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
Kisan Call Centre - To harness potential of ICT in Agriculture by answer farm...
 
Presiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha electionsPresiding Officer Training module 2024 lok sabha elections
Presiding Officer Training module 2024 lok sabha elections
 
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,भारत-रोम व्यापार.pptx, Indo-Roman Trade,
भारत-रोम व्यापार.pptx, Indo-Roman Trade,
 
Painted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of IndiaPainted Grey Ware.pptx, PGW Culture of India
Painted Grey Ware.pptx, PGW Culture of India
 
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
18-04-UA_REPORT_MEDIALITERAСY_INDEX-DM_23-1-final-eng.pdf
 
Hybridoma Technology ( Production , Purification , and Application )
Hybridoma Technology  ( Production , Purification , and Application  ) Hybridoma Technology  ( Production , Purification , and Application  )
Hybridoma Technology ( Production , Purification , and Application )
 

lecture1.pdf

  • 1. Lecture 1: Introduction CIS 4190/5190 (Fall 2022) August 31
  • 2. CIS 4190/5190: Applied Machine Learning • Course goals • Identify opportunities for applying machine learning (ML) algorithms • Train ML models • Diagnose and debug issues in ML models • Lectures will focus on developing mathematical understanding • Assignments will focus on applying this understanding to implementing ML solutions
  • 3. Agenda • Logistics • Course description • Course policies • Tentative Schedule • Grading • Introduction • Motivation • Basic definitions • Examples
  • 5. Teaching Assistants • Brian Chen • Swati Gupta • Jio Jeong • Jinhui Luo • Jason Ma • Rushab Manthripragada • Ramya Ramalingam • Peizhi Wu
  • 6. Course Website • Course website • https://www.seas.upenn.edu/~cis5190/fall2022/ • Syllabus • Tentative schedule • Links to all assignments • Resources
  • 7. Office Hours • Each TA and instructor will have 1-2 hours of office hours each week • Times still being decided • We are aiming to have a mix of in person and remote office hours
  • 8. Communication • We will use Ed Discussion for questions and course discussions • Send a message to “instructors” to contact professors and TAs • You can contact the professors directly on Ed Discussion or by email (posted on course website); always contact both of us
  • 9. Attendance Policy • We expect all students to attend classes regularly • Weekly quizzes designed to make sure students are following course material • Please do not come to class if you are not feeling well or if you test positive for Covid-19 • We will provide course recordings and lecture notes for students unable to make it to class • These will be available on Canvas (though we reserve the right to change our policy if attendance becomes an issue)
  • 10. Masking Policy • There are some students in the class who are at higher risk from COVID exposure • Thus, we are setting a policy that we will enforce wearing a mask in class and during office hours • We appreciate your consideration of your classmates
  • 11. Waitlist Policy • We are considering students roughly in the order (up to the cap): 1. Absolutely need to take this course during this semester (Fall 2022) for a graduation requirement 2. Absolutely need to take this course during next semester (Spring 2023) for a graduation requirement 3. Graduating in this semester (Fall 2022) • In addition, you must satisfy prerequisites and have correctly answered all waitlist questions (https://waitlist.cis.upenn.edu)
  • 12. Waitlist Policy • If you satisfy one of these categories, email me with which category you are in and a clear explanation for why you are in that category • Email: obastani@seas.upenn.edu • Deadline: Thursday at 8pm • Final decisions at the discretion of the instructors
  • 13. Prerequisites • Math: University-level courses in probability, linear algebra, and multivariable calculus • Understand prior and posterior probabilities, 𝔼 𝑋 = ∫ 𝑝 𝑥 𝑑𝑥, etc. • Understand matrix ranks, inverses, and eigenvalues • Understand how to compute ∇!(𝐴𝑥) for a matrix 𝐴 and vector 𝑥 • Tested in HW 1 • Programming: Previously coded up projects (preferably in Python) that were at least 100 lines of code long • We will provide Python help (primer + office hours) for students who know how to program but do not know Python
  • 14. Course Comparisons • CIS 4190/5190 (this course) • Basic mathematical ideas behind ML • Apply existing ML algorithms to new problems as an engineer or researcher Also see: https://priml.upenn.edu/courses
  • 15. Course Comparisons • CIS 4190/5190 (this course) • Basic mathematical ideas behind ML • Apply existing ML algorithms to new problems as an engineer or researcher • CIS 5200 • Deeper, more mathematically demanding introduction to ML • Perhaps do fundamental ML research in the future Also see: https://priml.upenn.edu/courses
  • 16. Course Comparisons • CIS 4190/5190 (this course) • Basic mathematical ideas behind ML • Apply existing ML algorithms to new problems as an engineer or researcher • CIS 5220 • Deep learning techniques and applications in more detail • CIS 5450 • Data science workflow including data wrangling, ML modeling, and analytics • Scaling ML to big datasets and clusters Also see: https://priml.upenn.edu/courses
  • 17. CIS 4190 vs. 5190 • 5190 will have extra, mandatory components in the HW, which are optional for 4190 • Example • HW may have 45 points for 4190, and 5 extra points for 5190 (total of 50) • Student taking 4190 will get 100% if they get at least 45 points (typically by skipping the 5190 problem, but not necessarily) • You cannot score more than 100% • The written and coding portion are counted separately; you cannot make up written points using coding points and vice versa
  • 18. Tentative Schedule Week Content Homework Project 1 Introduction 2 Linear Regression 3 Linear Regression, Cont. HW 1 Due 4 Logistic Regression 5 Decision Trees 6 Neural Networks HW 2 Due 7 Computer Vision Milestone 1 Due 8 Bayesian Networks HW 3 Due 9 Unsupervised Learning 10 Natural Language Processing HW 4 Due 11 Reinforcement Learning Milestone 2 Due 12 Recommender Systems HW 5 Due 13 Ethics 14 Ethics HW 6 Due 15 Review Milestone 3 Due 16 Review
  • 19. Grading Scheme • Homeworks (6×): 30% • Project (3 milestones, teams of 3): 20% • Final exam (during exam week): 35% • Quizzes (12 ×, roughly weekly): 10% • 50% correct sufficient for full credit • Class participation: 5%
  • 20. Grading Scheme • A+: 90+ • A: 85-90 • A-: 80-85 • B+: 75-80 • B: 70-75 • B-: 65-70 • Lower passing grades: 50-65 • May be curved up
  • 21. Late Policy • For each hour late, lose 0.5% on the points for that assignment • Homeworks, quizzes, project milestones • Max 48 late hours per assignment • Example • Submit HW 1 late by 20 hours • Lose 20×0.5 = 10% on HW 1 (0.5% of overall grade) • If you have a medical reason, email both professors a copy of your medical visit report, and we will grant an extension (typically 2 days) • We will consider other reasons on a case-by-case basis
  • 22. Homework Schedule • 6 homeworks • Released every other Wednesday • Due Wednesday 2 weeks later (with an exception for HW 2) • Due at 8pm
  • 23. Homework Structure • Written problems: GradeScope submission • LaTeX encouraged; handwritten + scanned at your own risk • Won’t be graded if you don’t annotate your answers correctly! • Coding problems: AutoGrader + GradeScope submission of notebook • Colab/iPython notebook skeletons; AutoGrader as unit tests within skeleton • Only difference between AutoGrader and unit tests is different data • If code passes the unit tests and you didn’t “game” it, it should pass AutoGrader • Discussion permitted for clarifications, but never share solutions or code; acknowledge all your discussions at beginning of your report
  • 24. Homework 1 • Designed to test mathematical background • Opportunity to get used to the workflow • Full points if you score 50% or more • We will not answer questions about the HW 1 (except clarifying questions)
  • 25. Quiz Schedule • 12 quizzes • Released every Wednesday (starting next week) • Due Thursday 8 days later • Checks basic understanding of material covered the previous week
  • 26. Agenda • Logistics • Course description • Course policies • Tentative Schedule • Grading • Introduction • Motivation • Basic definitions • Examples
  • 27. What is Machine Learning? “Learning is any process by which a system improves performance from experience.” Herbert Simon
  • 28. What is Machine Learning? “Machine learning … gives computers the ability to learn without being explicitly programmed.” Arthur Samuel
  • 29. What is Machine Learning? • Tom Mitchell: Algorithms that • improve their performance 𝑃 • at task 𝑇 • with experience 𝐸 • A well-defined machine learning task is given by 𝑃, 𝑇, 𝐸
  • 30. Example: Game Playing • Tom Mitchell: Algorithms that • improve their performance 𝑃 • at task 𝑇 • with experience 𝐸 • 𝑇 = playing Checkers • 𝑃 = win rate against opponents • 𝐸 = playing games against itself
  • 31. Example: Prediction 31 Image: https://www.flickr.com/photos/gsfc/5937599688/ Data from https://nsidc.org/arcticseaicenews/sea-ice-tools/ 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 1975 1985 1995 2005 2015 2025 Arctic Sea Ice Extent (millions of sq km) Year NSIDC Index of Arctic Sea Ice in September Photo by NASA Goddard ??
  • 32. Example: Prediction • Tom Mitchell: Algorithms that • improve their performance 𝑃 • at some task 𝑇 • with experience 𝐸 • 𝑇 = predict Arctic sea ice extent • 𝑃 = prediction error (e.g., absolute difference) • 𝐸 = historical data
  • 33. Machine Learning for Prediction Data 𝑍 Machine learning algorithm Model 𝑓 New input Predicted output
  • 34. Example: Prediction 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 1975 1985 1995 2005 2015 2025 Arctic Sea Ice Extent (millions of sq km) Year NSIDC Index of Arctic Sea Ice in September Photo by NASA Goddard ?? Image: https://www.flickr.com/photos/gsfc/5937599688/ Data from https://nsidc.org/arcticseaicenews/sea-ice-tools/
  • 35. Our focus Machine Learning Workflow Framing an ML problem (Mitchell’s P, T, E) Data curation (sourcing, scraping, collection, labeling) Data analysis / visualization ML Design (hypothesis class, loss function, optimizer, hyperparameters, features) Train model Validate / Evaluate Deploy (and generate new data) Monitor performance on new data
  • 36. Types of Learning • Supervised learning • Input: Examples of inputs and outputs • Output: Model that predicts unknown output given a new input • Unsupervised learning • Input: Examples of some data (no “outputs”) • Output: Representation of structure in the data • Reinforcement learning • Input: Sequence of interactions with an environment • Output: Policy that performs a desired task
  • 37. Supervised Learning • Given 𝑥!, 𝑦! , … , 𝑥", 𝑦" , learn a function that predicts 𝑦 given 𝑥 37 Image: https://www.flickr.com/photos/gsfc/5937599688/ Data from https://nsidc.org/arcticseaicenews/sea-ice-tools/ 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 1975 1985 1995 2005 2015 2025 Arctic Sea Ice Extent (millions of sq km) Year NSIDC Index of Arctic Sea Ice in September Photo by NASA Goddard
  • 38. Supervised Learning • Given 𝑥!, 𝑦! , … , 𝑥", 𝑦" , learn a function that predicts 𝑦 given 𝑥 • Regression: Labels 𝑦 are real-valued 38 Image: https://www.flickr.com/photos/gsfc/5937599688/ Data from https://nsidc.org/arcticseaicenews/sea-ice-tools/ 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 1975 1985 1995 2005 2015 2025 Arctic Sea Ice Extent (millions of sq km) Year NSIDC Index of Arctic Sea Ice in September Photo by NASA Goddard
  • 39. Supervised Learning • Given 𝑥!, 𝑦! , … , 𝑥", 𝑦" , learn a function that predicts 𝑦 given 𝑥 • Classification: Labels 𝑦 are categories Tumor Size Ocular Tumor (Malignant / Benign) Predict Malignant Predict Benign Malignant Benign 𝑓(𝑥) Image: https://eyecancer.com/uncategorized/choroidal- metastasis-test/
  • 40. Supervised Learning • Given 𝑥!, 𝑦! , … , 𝑥", 𝑦" , learn a function that predicts 𝑦 given 𝑥 • Inputs 𝑥 can be multi-dimensional Image: https://eyecancer.com/uncategorized/choroidal- metastasis-test/ Tumor Size Age • Patient age • Clump thickness • Tumor Color • Cell type • …
  • 41. Unsupervised Learning • Given 𝑥!, … , 𝑥" (no labels), output hidden structure in 𝑥’s • E.g., clustering
  • 43. Reinforcement Learning • Learn how to perform a task from interactions with the environment • Examples: • Playing chess (interact with the game) • Robot grasping an object (interact with the object/real world) • Optimize inventory allocations (interact with the inventory system) environment agent
  • 45. Applications of Machine Learning 45
  • 47. Radiology and Medicine https://www.nature.com/articles/s41746-020-00376-2 Input: Brain scans Output: Neurological disease labels https://www.nature.com/articles/s41573-019-0024-5
  • 49. Creating Images & Text https://thispersondoesnotexist.com/ https://transformer.huggingface.co/doc/gpt2-large
  • 50. Analytical Modeling/ Understanding Data Quantity and Quality When should we use machine learning? Adding two numbers NO Solving differential equations YES, SOMETIMES Flying rockets to other planets NO Checking large prime numbers NO Recognizing animals from pictures YES! Make art and music YES! Get robots to make sandwiches YES, PROBABLY Predict fashion in 20 years NO, PROBABLY Weather forecasting MAYBE?
  • 51. Danger of Out-of-Domain Machine Learning 0.0 1.0 2.0 3.0 4.0 5.0 6.0 7.0 8.0 9.0 1975 1985 1995 2005 2015 2025 Arctic Sea Ice Extent (millions of sq km) Year NSIDC Index of Arctic Sea Ice in September ?? Any time you are evaluating on data “far” from your training data, beware!
  • 52. Ethical Considerations “The Pennsylvania Board of Probation and Parole has begun using machine learning forecasts to help inform parole release decisions. In this paper, we evaluate the impact of the forecasts on those decisions and subsequent recidivism.” “In 2013, the University of Texas at Austin’s computer science department began using a machine-learning system called GRADE to help make decisions about who gets into its Ph.D. program” “Videos about vegetarianism led to videos about veganism. Videos about jogging led to videos about running ultramarathons. It seems as if you are never ‘hard core’ enough for YouTube’s recommendation algorithm. It promotes, recommends and disseminates videos in a manner that appears to constantly up the stakes. Given its billion or so users, YouTube may be one of the most powerful radicalizing instruments of the 21st century.”
  • 53. Tentative Schedule Week Content Homework Project 1 Introduction 2 Linear Regression 3 Linear Regression, Cont. HW 1 Due 4 Logistic Regression 5 Decision Trees 6 Neural Networks HW 2 Due 7 Computer Vision Milestone 1 Due 8 Bayesian Networks HW 3 Due 9 Unsupervised Learning 10 Natural Language Processing HW 4 Due 11 Reinforcement Learning Milestone 2 Due 12 Recommender Systems HW 5 Due 13 Ethics 14 Ethics HW 6 Due 15 Review Milestone 3 Due 16 Review
  • 54. First Assignments • HW 1: Released today, due 9/14 • No office hours planned for HW 1 • 50% = full credit • Quiz 1: Released 9/7, due 9/15