Roman Orac, 1Tap Machine Learning & Data Analysis
A Gentle introduction to Machine Learning
1Tap is a Fully Automated Accounting Platform
For the Self Employed*
* Sole Trader, Sole Proprietor, Freelancer, Contractor, Independent, Non-Incorporated Businesses
The Self Employed can’t buy the stuff they want
Profit…Welfare…
Taxes…
No idea
That is a problem for
the new year...
Denied...
Hopefully I get better
real soon...
Credit…
Our Mission
Making Self Employment > Employment
1Tap Receipts
1. Take a photo
2. Data Extracted
3. Tax Return updated
4. Customers Love it
The foundation of our apps
Ruby on Rails
RESTful JSON API
4.0 Code Climate GPA
Enough about us …
What is Machine Learning Anyway?
What is Machine Learning?
Diagram: Training data → Pre-processing → Machine Learning algorithm → Classifier
New samples → Classifier → Prediction
● Machine Learning is the science of getting computers to act without
being explicitly programmed
Predict survival on the Titanic
In 1912 the Titanic sank, killing 1,502 out
of 2,224 passengers and crew.
Some groups of people were more
likely to survive than others.
Let’s look at the data
Abbreviations
● Embarked: Port of embarkation
○ C = Cherbourg
○ Q = Queenstown
○ S = Southampton
● Parch: Number of parents/children aboard
● Pclass: Passenger's class
● SibSp: Number of siblings/spouses aboard
● Survived: Survived (1) or died (0)
● Ticket: Ticket number
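As a rough sketch, the abbreviations above can be captured in a small typed record. The class name `Passenger`, the helper `parse_row`, and the example row are hypothetical and only illustrate how the columns might be read; the values are made up and are not taken from the real dataset.

```python
from dataclasses import dataclass
from typing import Optional

# Port codes exactly as listed in the abbreviations above.
PORTS = {"C": "Cherbourg", "Q": "Queenstown", "S": "Southampton"}


@dataclass
class Passenger:
    survived: bool            # Survived: 1 = survived, 0 = died
    pclass: int               # Pclass: passenger's class
    sibsp: int                # SibSp: siblings/spouses aboard
    parch: int                # Parch: parents/children aboard
    ticket: str               # Ticket: ticket number
    embarked: Optional[str]   # full port name, or None when missing/unknown


def parse_row(row: dict) -> Passenger:
    """Turn one raw record (all values as strings) into a typed Passenger."""
    return Passenger(
        survived=row["Survived"] == "1",
        pclass=int(row["Pclass"]),
        sibsp=int(row["SibSp"]),
        parch=int(row["Parch"]),
        ticket=row["Ticket"],
        embarked=PORTS.get(row.get("Embarked", "")),
    )


# Made-up row, for illustration only.
example = parse_row({
    "Survived": "1", "Pclass": "1", "SibSp": "1",
    "Parch": "0", "Ticket": "T-0001", "Embarked": "C",
})
print(example)
```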
Understanding the data
● Distributions of the fare of passengers who survived or did not survive
● Many passengers with cheaper fares died
● Is fare a good predictive variable?
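One quick way to probe whether fare is predictive is to compare survival rates on either side of the median fare. The sketch below does this on a handful of made-up `(fare, survived)` pairs; the sample and the function names are assumptions for illustration and are not the real Titanic records.

```python
from statistics import median

# Hypothetical (fare, survived) pairs; NOT the real Titanic data.
sample = [
    (7.25, 0), (7.9, 0), (8.05, 0), (8.5, 1),
    (13.0, 1), (26.0, 0), (53.1, 1), (71.3, 1),
]


def survival_rate(rows):
    """Fraction of passengers in rows who survived."""
    return sum(survived for _, survived in rows) / len(rows)


def split_by_median_fare(rows):
    """Return the median fare and the survival rate below/above it."""
    cut = median(fare for fare, _ in rows)
    cheap = [r for r in rows if r[0] <= cut]
    pricey = [r for r in rows if r[0] > cut]
    return cut, survival_rate(cheap), survival_rate(pricey)


cut, cheap_rate, pricey_rate = split_by_median_fare(sample)
print(f"median fare {cut}: cheap {cheap_rate:.0%}, pricey {pricey_rate:.0%}")
```

A large gap between the two rates suggests fare carries signal, although it may simply stand in for passenger class.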
Most Important Step: Data preprocessing
Original data → preprocessing → Preprocessed data
● Clean the data
● Encode attributes
● Fill in missing values
● Add new attributes
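The four steps above could look roughly like this in plain Python. The `Sex` and `Age` columns, the default port `"S"`, and the derived `family_size` attribute are assumptions chosen for illustration; the slides do not say which transformations were actually applied.

```python
from statistics import median

# Hypothetical raw rows; None marks a missing value.
raw = [
    {"Sex": "male", "Age": 22.0, "SibSp": 1, "Parch": 0, "Embarked": "S"},
    {"Sex": "female", "Age": None, "SibSp": 0, "Parch": 2, "Embarked": "C"},
    {"Sex": " Female ", "Age": 30.0, "SibSp": 0, "Parch": 0, "Embarked": None},
]

PORTS = ["C", "Q", "S"]


def preprocess(rows):
    """Clean, encode, fill missing values and add one new attribute."""
    known_ages = [r["Age"] for r in rows if r["Age"] is not None]
    age_fill = median(known_ages)  # fill missing ages with the median
    out = []
    for r in rows:
        sex = r["Sex"].strip().lower()            # clean: trim and lowercase
        age = r["Age"] if r["Age"] is not None else age_fill
        port = r["Embarked"] or "S"               # assumed default port
        features = {
            "is_female": 1 if sex == "female" else 0,       # encode as 0/1
            "age": age,
            "family_size": r["SibSp"] + r["Parch"] + 1,     # new attribute
        }
        for p in PORTS:                           # one-hot encode the port
            features[f"embarked_{p}"] = int(port == p)
        out.append(features)
    return out


processed = preprocess(raw)
for row in processed:
    print(row)
```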
Decision Tree
● Use training set and build a decision tree model
● Use the model to predict new samples
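To make the two steps concrete, here is a minimal sketch of a one-level decision tree (a "stump") that picks the single split with the lowest weighted Gini impurity. It is a simplification of a full decision tree, and the feature layout `[is_female, pclass]` and the training rows are made up for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def majority(labels):
    """Most common 0/1 label (ties go to 1)."""
    return int(sum(labels) * 2 >= len(labels))


def fit_stump(X, y):
    """Find the (feature, threshold) split with the lowest weighted Gini."""
    best = None
    for j in range(len(X[0])):
        for t in sorted({row[j] for row in X}):
            left = [label for row, label in zip(X, y) if row[j] <= t]
            right = [label for row, label in zip(X, y) if row[j] > t]
            if not left or not right:
                continue  # a split must put samples on both sides
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, j, t, majority(left), majority(right))
    if best is None:
        raise ValueError("no valid split found")
    _, j, t, left_label, right_label = best
    return {"feature": j, "threshold": t, "left": left_label, "right": right_label}


def predict(stump, row):
    """Predict the label of a new sample with the fitted stump."""
    side = "left" if row[stump["feature"]] <= stump["threshold"] else "right"
    return stump[side]


# Hypothetical training set: each row is [is_female, pclass].
X_train = [[0, 3], [0, 3], [0, 2], [1, 1], [1, 2], [1, 3], [0, 1], [1, 1]]
y_train = [0, 0, 0, 1, 1, 1, 0, 1]

stump = fit_stump(X_train, y_train)
print(stump)
print(predict(stump, [1, 3]), predict(stump, [0, 1]))
```

A real decision tree repeats this split search recursively on each side until the leaves are pure or a depth limit is reached.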
What types of problems do we solve with
ML at 1Tap?
Receipt categorization
Initial receipt categorization
● Based on the company’s industry
● Deterministic categorization
● Many miscategorizations
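A deterministic, industry-based rule of this kind amounts to a lookup table. The sketch below is an assumption about its general shape; the industries, category names, and the function `categorize_by_industry` are hypothetical and not taken from 1Tap's code.

```python
# Hypothetical industry -> default category table; names are illustrative.
DEFAULT_CATEGORY = {
    "taxi driver": "Vehicle expenses",
    "photographer": "Equipment",
    "consultant": "Office costs",
}


def categorize_by_industry(industry: str) -> str:
    """Return the fixed category for an industry, ignoring the receipt itself."""
    return DEFAULT_CATEGORY.get(industry.strip().lower(), "Uncategorized")


# Every receipt from the same industry gets the same category, so a taxi
# driver's lunch receipt is labelled exactly like a fuel receipt.
print(categorize_by_industry("Taxi Driver"))
print(categorize_by_industry("Florist"))
```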
The Numbers
600K categorized receipts
40K users
80K new receipts every month
Receipt categorization with ML
Categorizing receipts in a smarter and more contextual way
● Features:
○ user’s profession
○ vendor name, date, expense total and text
● Preprocessing:
○ Filter receipts
○ Recategorize most obvious receipts
● Train a classifier that categorizes receipts
● This approach improves categorization as receipt text adds more context
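The slides do not name the classifier, so the following is only a sketch using a small multinomial naive Bayes model as a stand-in. It builds a bag of words from the user's profession, the vendor name, and the receipt text (date and expense total are left out for brevity). All receipts, labels, and names here are made up for illustration.

```python
import math
import re
from collections import Counter, defaultdict


def tokens(receipt):
    """Bag of words from profession, vendor name and receipt text."""
    text = " ".join([receipt["profession"], receipt["vendor"], receipt["text"]])
    return re.findall(r"[a-z]+", text.lower())


class NaiveBayes:
    """Multinomial naive Bayes with add-one smoothing (an assumed model choice)."""

    def fit(self, receipts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for receipt, label in zip(receipts, labels):
            self.word_counts[label].update(tokens(receipt))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, receipt):
        total = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n / total)            # log prior
            for w in tokens(receipt):
                if w in self.vocab:                # skip unseen words
                    score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical training receipts.
train = [
    {"profession": "taxi driver", "vendor": "Shell", "text": "unleaded fuel litres"},
    {"profession": "taxi driver", "vendor": "BP", "text": "diesel fuel"},
    {"profession": "designer", "vendor": "Pret", "text": "sandwich coffee"},
    {"profession": "taxi driver", "vendor": "Costa", "text": "coffee cake"},
]
labels = ["Fuel", "Fuel", "Meals", "Meals"]

model = NaiveBayes().fit(train, labels)
fuel_receipt = {"profession": "taxi driver", "vendor": "Esso", "text": "fuel unleaded"}
cafe_receipt = {"profession": "taxi driver", "vendor": "Costa", "text": "coffee"}
print(model.predict(fuel_receipt), model.predict(cafe_receipt))
```

Both test receipts come from the same profession, yet the receipt text pushes them into different categories, which is the extra context the deterministic rule lacks.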
Questions?
Come talk to us over pizza!
Nejc, Human Resources
Roman, Machine Learning
Vesna, Head of Product
