Stepping into the AI Wave:
Words from an Industry Newbie
Sept. 29, 2017 @ AI Web Talk Series
Joanne Tseng
joanne@appdiff.com
About me
- B.S. degree in Mathematics and Statistics (DM), NCKU, Taiwan
- 2.5 years of working experience as a Machine Learning Engineer / Data Scientist
- SVIP (Silicon Valley Internship Programme) 2016-2017
- Data Scientist @ Appdiff Inc.
SVIP (Silicon Valley Internship Programme)
- Non-profit organization based in the UK
- Offers newly graduating students a one-year, full-time internship at a Silicon Valley startup
- Partners with Girls In Tech (GIT) to open opportunities to women around the world
This talk is about…
- My self-directed learning process
- The project I’m doing right now @ Appdiff
- Q&A
Three years ago...
- I was in my senior year at university
- With a mathematics and statistics background
- Heard the term “machine learning” for the first time
- No coding background
- Interested in data analysis
FYI: Machine learning (from Wikipedia) is a field of computer science that gives computers the ability to learn without being explicitly programmed.
I was wondering about two questions...
- What kind of job can I get in the future if I’m interested in data analysis?
- If I want to start a master's degree, which field should I go for?
Ask seniors who have already graduated!!
Suggestions I got
- Learn the Python language
- Take some basic CS courses, including Algorithms and Data Structures
- Take a machine learning course
The starting point of my self-directed learning
- Learn the Python language
- Take some basic CS courses, including Algorithms and Data Structures
- Take a machine learning course
Self-directed learning: online courses
- Both the upside and the downside: TOO FLEXIBLE
- Hard to keep going
Starting is easy, persistence is an art
- Have a study group!!!
- Plan both a long-term and a short-term schedule
- Meet regularly
- Can’t find study group members? I recommend meetup.com
My study group (three years ago...)
- The long-term path we followed: https://www.springboard.com/learning-paths/data-analysis/learn/
Suggested order to learn
- Learn Python on Codecademy
- Use an IPython notebook and one basic Kaggle dataset to practice the data analysis flow
- Take an algorithms course (I recommend the MIT course) and do the exercises in Python
- Try using the terminal to set up your Python environment
- Try using GitHub to manage your code
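As a rough illustration of the "data analysis flow" mentioned above, here is a minimal sketch of the load, clean, and summarize steps. The tiny inline dataset and its column names (`age`, `fare`, `survived`) are made up for this example, loosely in the style of a beginner Kaggle table. In a notebook you would more likely load a real CSV with pandas; this version sticks to the standard library so it runs anywhere.

```python
import csv
import io
from statistics import mean

# Hypothetical sample data, invented for this sketch (not a real Kaggle file).
RAW_CSV = """age,fare,survived
22,7.25,0
38,71.28,1
,8.05,0
35,53.10,1
"""

def load_rows(text):
    """Step 1: load the CSV text into a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))

def clean_rows(rows):
    """Step 2: drop rows with a missing age and convert strings to numbers."""
    cleaned = []
    for row in rows:
        if not row["age"]:
            continue  # skip rows with missing values
        cleaned.append({
            "age": float(row["age"]),
            "fare": float(row["fare"]),
            "survived": int(row["survived"]),
        })
    return cleaned

def summarize(rows):
    """Step 3: compute a simple per-group summary."""
    summary = {}
    for label in (0, 1):
        group = [r for r in rows if r["survived"] == label]
        summary[label] = {
            "count": len(group),
            "mean_fare": mean(r["fare"] for r in group),
        }
    return summary

rows = clean_rows(load_rows(RAW_CSV))
summary = summarize(rows)
print(summary)
```

Keeping each step in its own small function makes it easy to rerun one stage at a time, which is how a notebook session tends to go.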
After I started to work...
- Engineer’s Mindset
Engineer’s Mindset
- Learn how to solve problems independently, a.k.a. Google everything!
- Never stop learning
- Be a patient problem solver!! Don’t be afraid of new bugs :)
After I started to work...
- Engineer’s Mindset
- A new study group: keep up the self-directed learning
Our study group right now
- Deep Learning
- Follow online courses
  > Deep Learning A-Z™: Hands-On Artificial Neural Networks on Udemy
  > (Next course) Andrew Ng's Deep Learning course (https://www.deeplearning.ai/)
- Hold online meetups regularly (once every two weeks)
- If you are interested, find more info at https://dosudo.com/
About my current company - Appdiff Inc.
- Building an AI system for software testing
What’s Software Testing?
Software testing (from Wikipedia) is an investigation conducted to provide stakeholders with information about the quality of the software product or service under test. Test techniques include the process of executing a program or application with the intent of finding software bugs (errors or other defects), and verifying that the software product is fit for use.
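To make the definition concrete, here is a minimal sketch of an automated test written with Python's built-in `unittest` module. The function under test, `is_valid_password`, and its rules are hypothetical and exist only for this example; they are not taken from Appdiff's product.

```python
import unittest

def is_valid_password(password):
    """Hypothetical rule for this sketch: at least 8 characters and one digit."""
    return len(password) >= 8 and any(ch.isdigit() for ch in password)

class PasswordTests(unittest.TestCase):
    def test_accepts_long_password_with_digit(self):
        self.assertTrue(is_valid_password("sunflower42"))

    def test_rejects_short_password(self):
        self.assertFalse(is_valid_password("abc1"))

    def test_rejects_password_without_digit(self):
        self.assertFalse(is_valid_password("sunflowers"))

def run_tests():
    """Run the test case programmatically and return the result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(PasswordTests)
    return unittest.TextTestRunner(verbosity=0).run(suite)

result = run_tests()
```

Each test executes the program with a chosen input and checks the outcome, which is the "executing a program with the intent of finding bugs" part of the definition.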
Machine Learning Related Topics
- Build page classifiers and button classifiers
  - Page level: login page
  - Button level: login button, Facebook sign-in button, password button, etc.
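The talk does not describe how Appdiff's classifiers work. Purely as an illustration of what a "button classifier" could look like, the sketch below labels a button from its visible text by word overlap with a few hand-labeled examples. The training examples, the label names, and the `train` and `classify_button` helpers are all invented for this example; a real system would likely use richer features and a library such as scikit-learn or Keras.

```python
from collections import Counter, defaultdict

# Hand-labeled examples, invented for this sketch: (button text, label).
TRAINING_DATA = [
    ("log in", "login_button"),
    ("sign in", "login_button"),
    ("login", "login_button"),
    ("continue with facebook", "facebook_signin_button"),
    ("sign in with facebook", "facebook_signin_button"),
    ("password", "password_button"),
    ("enter your password", "password_button"),
]

def tokenize(text):
    """Lowercase the text and split it into words."""
    return text.lower().split()

def train(examples):
    """Count how often each word appears under each label."""
    word_counts = defaultdict(Counter)
    for text, label in examples:
        word_counts[label].update(tokenize(text))
    return word_counts

def classify_button(text, word_counts):
    """Return the label whose training words overlap most with the text.

    Scores are divided by the label's total word count so that labels
    with more training text are not automatically favored.
    """
    words = tokenize(text)
    scores = {
        label: sum(counts[w] for w in words) / sum(counts.values())
        for label, counts in word_counts.items()
    }
    return max(scores, key=scores.get)

model = train(TRAINING_DATA)
print(classify_button("Log in with Facebook", model))  # facebook_signin_button
```

A page classifier could follow the same pattern, with the page's text or element list as input instead of a single button's text.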
Data Scientist @ startup company
- What you really do is closer to a machine learning engineer's job
- Training classifiers only accounts for 30% of your job :)
What about the other 70%?
- Building the model training pipeline (40%)
  Building the system for the full cycle: getting labeled data → feature extraction → model training → model evaluation → storage → label correction → getting new labels.
- Label collection, by designing experiments or implementing a simple labeling interface (15%)
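The training cycle described above can be sketched as a chain of small functions, one per stage. This is only an assumed shape for such a pipeline, not Appdiff's actual system: the example data, the word-count "model", and every function name are hypothetical, and a JSON string stands in for real storage.

```python
import json
from collections import Counter, defaultdict

def get_labeled_data():
    """Stage 1: get labeled data (hard-coded, hypothetical examples)."""
    return [
        {"text": "log in", "label": "login"},
        {"text": "sign in", "label": "login"},
        {"text": "log in here", "label": "login"},
        {"text": "search", "label": "other"},
        {"text": "search here", "label": "other"},
        {"text": "log in", "label": "other"},  # deliberately mislabeled
    ]

def extract_features(text):
    """Stage 2: feature extraction (lowercased words)."""
    return text.lower().split()

def train_model(examples):
    """Stage 3: count, for each word, how often it appears under each label."""
    model = defaultdict(Counter)
    for ex in examples:
        for word in extract_features(ex["text"]):
            model[word][ex["label"]] += 1
    return model

def predict(model, text):
    """Sum the per-word label counts and return the best-scoring label."""
    votes = Counter()
    for word in extract_features(text):
        votes.update(model.get(word, {}))
    return votes.most_common(1)[0][0] if votes else None

def evaluate(model, examples):
    """Stage 4: return accuracy and the indices where prediction != label.

    Evaluated on the training data for brevity; a real pipeline would
    hold out a separate test set.
    """
    wrong = [i for i, ex in enumerate(examples)
             if predict(model, ex["text"]) != ex["label"]]
    return 1 - len(wrong) / len(examples), wrong

def store(model, accuracy):
    """Stage 5: serialize the model and its score to a JSON string."""
    return json.dumps({"accuracy": accuracy, "model": model}, sort_keys=True)

def correct_labels(examples, corrections):
    """Stage 6: apply human corrections, given as {index: new_label}."""
    return [dict(ex, label=corrections.get(i, ex["label"]))
            for i, ex in enumerate(examples)]

# One turn of the cycle, then retrain on the corrected labels.
data = get_labeled_data()
model = train_model(data)
accuracy, suspects = evaluate(model, data)
saved = store(model, accuracy)
data = correct_labels(data, {i: "login" for i in suspects})  # simulated review
new_accuracy, new_suspects = evaluate(train_model(data), data)
print(accuracy, suspects, new_accuracy, new_suspects)
```

The examples the model disagrees with are the natural candidates to send back for label review, which is where the cycle loops around to getting new labels.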
Skills I’m using
- Building data pipelines
  - Language: Python
  - Database: BigQuery, GCP API
  - System design
- Building ML classifiers
  - Python data libraries: pandas, NumPy, Matplotlib (plotting library), NLTK (Natural Language Toolkit), Keras (for training neural network models), scikit-learn (tools for data mining and data analysis)
References
- Data Analysis Learning Path (https://www.springboard.com/learning-paths/data-analysis/learn/)
- Kaggle (https://www.kaggle.com/)
- Meetup (https://www.meetup.com/)
- Learn Python, Codecademy
- Introduction to Algorithms, MIT OpenCourseWare
- Machine Learning, Andrew Ng, Coursera
- Deep Learning A-Z™: Hands-On Artificial Neural Networks, Udemy
- Deep Learning Specialization, deeplearning.ai