# Gentle introduction to Machine Learning

We start with a presentation of 1Tap, then give a gentle introduction to Machine Learning.

Published in: Technology
### Gentle introduction to Machine Learning

1. A Gentle Introduction to Machine Learning. Roman Orac, 1Tap, Machine Learning & Data Analysis
2. 1Tap is a Fully Automated Accounting Platform for the Self Employed* (* Sole Trader, Sole Proprietor, Freelancer, Contractor, Independent, Non-Incorporated Businesses)
3. The Self Employed can’t buy the stuff they want. Profit… Welfare… Taxes… "No idea." "That is a problem for the new year..." "Denied..." "Hopefully I get better real soon..." Credit…
4. Our Mission: Making Self Employment > Employment
5. 1Tap Receipts: (1) Take a photo, (2) Data extracted, (3) Tax return updated, (4) Customers love it
6. The foundation of our apps: Ruby on Rails, RESTful JSON API, 4.0 Code Climate GPA
7. Enough about us … What is Machine Learning anyway?
8. What is Machine Learning? Machine Learning is the science of getting computers to act without being explicitly programmed. Diagram labels: Training data, Pre-processing, Machine Learning algorithm, Classifier, New samples, Prediction.
9. Predict survival on the Titanic. In 1912 the Titanic sank, killing 1,502 out of 2,224 passengers and crew. Some groups of people were more likely to survive than others.
10. Let’s look at the data. Abbreviations:
    - Embarked: Port of embarkation (C = Cherbourg, Q = Queenstown, S = Southampton)
    - Parch: Number of parents/children aboard
    - Pclass: Passenger's class
    - SibSp: Number of siblings/spouses aboard
    - Survived: Survived (1) or died (0)
    - Ticket: Ticket number
11. Understanding the data
    - Distributions of the fare of passengers who survived or did not survive
    - Many passengers with cheaper fares died
    - Is fare a good predictive variable?
12. Most Important Step: Data preprocessing (Original data → preprocessing → Preprocessed data)
    - Clean the data
    - Encode attributes
    - Fill in missing values
    - Add new attributes
13. Decision Tree
    - Use the training set to build a decision tree model
    - Use the model to predict new samples
14. What types of problems do we solve with ML at 1Tap?
15. Receipt categorization. Initial receipt categorization was based on the company’s industry: deterministic categorization, many mis-categorizations. The Numbers: 600K categorized receipts, 40K users, 80K new receipts every month.
16. Receipt categorization with ML: categorizing receipts in a smarter and more contextual way
17. Receipt categorization with ML
    - Features: user’s profession; vendor name, date, expense total and text
    - Preprocessing: filter receipts; recategorize most obvious receipts
    - Train a classifier that categorizes receipts
    - This approach improves categorization as receipt text adds more context
18. Questions?
19. Come talk to us over pizza! Nejc, Human Resources; Roman, Machine Learning; Vesna, Head of Product
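
The preprocessing and decision tree slides (12 and 13) can be illustrated with a minimal, standard-library-only sketch. Everything specific here is an assumption rather than something the slides state: the six rows are invented (they only borrow the Titanic column names), the derived attributes `IsFemale` and `FamilySize` are hypothetical, and splitting on Gini impurity is one common way to grow a tree, not necessarily the one used in the talk.

```python
from collections import Counter
from statistics import median

# Invented rows in the Titanic column layout (NOT real passenger records).
RAW = [
    {"Pclass": 3, "Sex": "male", "Age": 30.0, "SibSp": 0, "Parch": 0, "Fare": 8.0, "Survived": 0},
    {"Pclass": 1, "Sex": "female", "Age": 40.0, "SibSp": 1, "Parch": 0, "Fare": 80.0, "Survived": 1},
    {"Pclass": 3, "Sex": "female", "Age": 25.0, "SibSp": 0, "Parch": 1, "Fare": 10.0, "Survived": 1},
    {"Pclass": 2, "Sex": "male", "Age": None, "SibSp": 0, "Parch": 0, "Fare": 13.0, "Survived": 0},
    {"Pclass": 1, "Sex": "male", "Age": 50.0, "SibSp": 1, "Parch": 1, "Fare": 60.0, "Survived": 0},
    {"Pclass": 2, "Sex": "female", "Age": 20.0, "SibSp": 0, "Parch": 0, "Fare": 15.0, "Survived": 1},
]
FEATURES = ["Pclass", "IsFemale", "Age", "Fare", "FamilySize"]


def preprocess(rows):
    """Fill missing ages, encode Sex as 0/1, and add a FamilySize attribute."""
    fill_age = median(r["Age"] for r in rows if r["Age"] is not None)
    return [
        {
            "Pclass": r["Pclass"],
            "IsFemale": 1 if r["Sex"] == "female" else 0,          # encode attribute
            "Age": r["Age"] if r["Age"] is not None else fill_age,  # fill missing value
            "Fare": r["Fare"],
            "FamilySize": r["SibSp"] + r["Parch"] + 1,              # new attribute
        }
        for r in rows
    ]


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(X, y):
    """Return (score, feature, threshold) with the lowest weighted Gini, or None."""
    best = None
    for f in FEATURES:
        values = sorted({x[f] for x in X})
        for lo, hi in zip(values, values[1:]):  # midpoints keep both sides non-empty
            t = (lo + hi) / 2
            left = [lab for x, lab in zip(X, y) if x[f] <= t]
            right = [lab for x, lab in zip(X, y) if x[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(y)
            if best is None or score < best[0]:
                best = (score, f, t)
    return best


def build_tree(X, y, depth=0, max_depth=3):
    """Grow a tree recursively; leaves are class labels, inner nodes are dicts."""
    majority = Counter(y).most_common(1)[0][0]
    split = None if depth == max_depth or len(set(y)) == 1 else best_split(X, y)
    if split is None:
        return majority
    _, f, t = split
    left = [(x, lab) for x, lab in zip(X, y) if x[f] <= t]
    right = [(x, lab) for x, lab in zip(X, y) if x[f] > t]
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree([x for x, _ in left], [l for _, l in left], depth + 1, max_depth),
        "right": build_tree([x for x, _ in right], [l for _, l in right], depth + 1, max_depth),
    }


def predict(tree, x):
    """Walk from the root to a leaf and return the leaf's label."""
    while isinstance(tree, dict):
        tree = tree["left"] if x[tree["feature"]] <= tree["threshold"] else tree["right"]
    return tree


rows = preprocess(RAW)
labels = [r["Survived"] for r in RAW]
tree = build_tree(rows, labels)

# A new, already preprocessed sample (also invented).
new_passenger = {"Pclass": 2, "IsFemale": 1, "Age": 28.0, "Fare": 20.0, "FamilySize": 1}
print(tree)
print(predict(tree, new_passenger))
```

With real data, the model would be evaluated on samples held out from training. A perfect fit on six hand-written rows says nothing about predictive quality.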
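
The receipt categorization approach on slide 17 (profession, vendor and receipt text as features for a trained classifier) can be sketched in the same spirit. The slides do not name the classifier, so the multinomial Naive Bayes below is a stand-in. The training receipts, category names and the `prof=` / `vendor=` token prefixes are all hypothetical, and the date and expense total features from the slide are left out to keep the sketch short.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labelled receipts: (profession, vendor, receipt text, category).
TRAIN = [
    ("photographer", "Camera World", "lens filter tripod", "equipment"),
    ("photographer", "Shell", "unleaded fuel litres", "travel"),
    ("developer", "Laptop Store", "laptop charger keyboard", "equipment"),
    ("developer", "City Rail", "train ticket return", "travel"),
    ("plumber", "Cafe Uno", "coffee sandwich lunch", "meals"),
    ("developer", "Pizza Place", "pizza lunch drink", "meals"),
]


def tokens(profession, vendor, text):
    """Turn one receipt into a bag of words; prefixes keep the fields apart."""
    words = [f"prof={profession.lower()}"]
    words += [f"vendor={w}" for w in vendor.lower().split()]
    words += text.lower().split()
    return words


def train(examples):
    """Count receipts per category and word occurrences per category."""
    class_counts = Counter()
    word_counts = defaultdict(Counter)
    vocab = set()
    for profession, vendor, text, category in examples:
        class_counts[category] += 1
        for w in tokens(profession, vendor, text):
            word_counts[category][w] += 1
            vocab.add(w)
    return class_counts, word_counts, vocab


def categorize(model, profession, vendor, text):
    """Return the category with the highest Naive Bayes log-probability."""
    class_counts, word_counts, vocab = model
    total = sum(class_counts.values())
    best_category, best_score = None, -math.inf
    for category, n in class_counts.items():
        score = math.log(n / total)  # log prior
        denom = sum(word_counts[category].values()) + len(vocab)
        for w in tokens(profession, vendor, text):
            if w in vocab:  # words never seen in training are ignored
                # add-one (Laplace) smoothing avoids log(0)
                score += math.log((word_counts[category][w] + 1) / denom)
        if score > best_score:
            best_category, best_score = category, score
    return best_category


model = train(TRAIN)
print(categorize(model, "developer", "Burger Bar", "burger lunch"))
print(categorize(model, "photographer", "Shell", "fuel"))
```

The first query shows the point made on the slide: the vendor "Burger Bar" never appears in the training data, but the word "lunch" in the receipt text still supplies enough context to pick a category.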