The State of ML for iOS
On the Advent of WWDC 2018
Meghan Kane, @meghafon
NSLondon
May 2018
Hey, I'm Meghan!
@meghafon
iOS Engineer @ Novoda Berlin
WWDC 2018
big picture
when is it practical to use ML for iOS?
what's available to us?
end-to-end examples
barriers to entry?
1. A large dataset
2. Access to high end compute power
3. PhD in machine learning
4. All the time in the world
...nope!
Is it practical for my app?
image classification
audio classification
speech recognition
text classification
gesture recognition
optical character recognition (OCR)
translation
voice synthesis
embrace idea generation
& experimentation
is it just hype?
machine learning is a powerful tool
but, it is still just another tool
how can we think about
ML as developers?
Can this be solved without ML?
if so, choose that
ML vs not ML
basic unit of solving problem = function ("model")
ML: enabling a machine to learn function on its own
classify sign language alphabet images
not ML: explicitly defining function
determining if a number is even/odd
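The "not ML" side can be written out directly. A minimal sketch in Python (the function name is only illustrative): the rule is fully known, so there is nothing to learn from data.

```python
def is_even(n: int) -> bool:
    """Explicitly defined function: the rule is written by hand, no training data needed."""
    return n % 2 == 0


print(is_even(4))  # True
print(is_even(7))  # False
```

For sign language images no such hand-written rule is practical, which is where a learned function comes in.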
If you decide to use ML
still go with the simplest solution
Why do ML (predictions) on mobile?
→ low latency user experience
→ user privacy
What's available from Apple?
image classification of 1000 common categories
→ trees, animals, food, vehicles, people
→ SqueezeNet (5 MB), MobileNet (17 MB), Inception V3 (95 MB), ResNet50 (103 MB), VGG16 (554 MB)
scene classification of 205 categories
→ airport terminal, bedroom, forest, coast
→ Places205-GoogLeNet (25 MB)
If not, train a custom ML model
step 1: use a framework for training
TensorFlow, Keras, Turi Create, Caffe, etc.
⚠
warning, there are a lot of them
step 2: convert to .mlmodel format (OSS)
→ coremltools github.com/apple/coremltools
→ tf-coreml github.com/tf-coreml
It has been
quite a year
beyond the cat/dog classifier (TM)
End-to-end Process as a developer?
0. Define problem
1. Collect data
2. Train ML model
3. Convert to Core ML .mlmodel
4. Import into Xcode project
5. Predict using Core ML (+Vision) framework
Mobile specific concerns
size of model
time it takes to run predictions
supported layers
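Prediction time can be roughly estimated while experimenting, before anything ships to a device. A minimal sketch, assuming the model is wrapped in a plain Python callable; `dummy_predict` is a hypothetical stand-in, and numbers measured on a laptop will differ from on-device numbers.

```python
import time


def average_latency_ms(predict, example, runs=50):
    """Average wall-clock time of predict(example), in milliseconds."""
    predict(example)  # warm-up call, not timed
    start = time.perf_counter()
    for _ in range(runs):
        predict(example)
    return (time.perf_counter() - start) * 1000 / runs


def dummy_predict(pixels):
    # Hypothetical stand-in for a real model call
    return sum(pixels) / len(pixels)


print(average_latency_ms(dummy_predict, [0.5] * 1000))
```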
examples!
0. Define problem
American Sign Language (ASL) alphabet classifier
1. Collect data
2. Train model
Quick Review: Deep Learning
neural network model with many layers
deep = many layers
-> deep neural network
Mobile Machine Learning 101: Glossary, Jameson Toole on Heartbeat blog
sometime way back in B.C.
people used to train deep neural networks from scratch
still some (more recent) time in B.C.
people stand on the shoulders of giants
utilizing transfer learning
enter... transfer learning
use knowledge learned from source task (MobileNet)
→ to train target task (ASL classifier)
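The idea can be sketched as a toy: keep the source model's feature layers frozen and train only a small new "head" for the target task. This is not the talk's training code. `pretrained_features` is a hypothetical stand-in for frozen MobileNet layers, and the head is a plain logistic regression trained with batch gradient descent on made-up data.

```python
import math


def pretrained_features(x):
    """Hypothetical stand-in for the frozen layers of a source model."""
    return [x[0] + x[1], x[0] - x[1]]


def train_head(inputs, labels, epochs=100, lr=0.1):
    """Train only a logistic-regression head on top of the frozen features."""
    features = [pretrained_features(x) for x in inputs]  # computed once, never updated
    w, b = [0.0, 0.0], 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0, 0.0], 0.0
        for f, y in zip(features, labels):
            p = 1 / (1 + math.exp(-(w[0] * f[0] + w[1] * f[1] + b)))
            error = p - y  # gradient of the log loss w.r.t. the logit
            grad_w = [gw + error * fi for gw, fi in zip(grad_w, f)]
            grad_b += error
        w = [wi - lr * gw for wi, gw in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def predict(x, w, b):
    f = pretrained_features(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0


# Tiny made-up target task: label 1 when the first value is larger than the second
inputs = [(1, 0), (0, 1), (2, 1), (1, 2), (0.5, 0.1), (0.1, 0.5), (3, 1), (1, 3)]
labels = [1, 0, 1, 0, 1, 0, 1, 0]
w, b = train_head(inputs, labels)
print(predict((5, 2), w, b), predict((2, 5), w, b))
```

Only `w` and `b` are learned here; the feature function is never touched, which is why so little data and compute are needed.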
don't reinvent the wheel
Why Transfer Learning works
neural networks are universal approximators
in theory, they can approximate any function
how much data do i need?
depends on problem
just 100s of images per category
where can i get it?
Kaggle
Google for them...
record video + extract frames (using e.g. FFmpeg)
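Frame extraction can be sketched as a small helper that builds the FFmpeg command. The file names and the 2 frames-per-second rate are assumptions for illustration; `-i` and the `fps` video filter are standard FFmpeg options.

```python
import shlex


def frame_extraction_command(video_path, out_dir, fps=2):
    """Build an FFmpeg command that saves `fps` frames per second as numbered JPEGs."""
    return [
        "ffmpeg",
        "-i", video_path,              # input video
        "-vf", f"fps={fps}",           # sample this many frames per second
        f"{out_dir}/frame_%04d.jpg",   # numbered output files
    ]


# Hypothetical paths: one short clip per letter, one folder per category
cmd = frame_extraction_command("sign_a.mov", "data/A")
print(shlex.join(cmd))
```

To actually run it, pass `cmd` to `subprocess.run(cmd, check=True)`; this requires FFmpeg to be installed and the output folder to exist.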
what if i don't have enough?
data augmentation!
deeplearning.ai: C4W2L10 Data Augmentation (~10 min video)
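A minimal sketch of data augmentation, assuming a grayscale image stored as a list of rows of 0-255 pixel values. Real pipelines usually rely on the training framework's image utilities; the helpers below are only illustrative. Whether a transform keeps the label intact depends on the problem (mirroring text, for example, does not).

```python
def flip_horizontal(image):
    """Mirror each row of a 2D image (list of rows of pixel values)."""
    return [row[::-1] for row in image]


def adjust_brightness(image, delta):
    """Add delta to every pixel, clamped to the 0-255 range."""
    return [[max(0, min(255, p + delta)) for p in row] for row in image]


def augment(image):
    """Return the original plus a few modified copies to enlarge the dataset."""
    return [
        image,
        flip_horizontal(image),
        adjust_brightness(image, 40),
        adjust_brightness(image, -40),
    ]


image = [[0, 100, 200],
         [50, 150, 250]]
for variant in augment(image):
    print(variant)
```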
Let's start training...
→ can we use Swift for TensorFlow?
→ for now, stick with regular TensorFlow
confession: this is in Python
Performance
so how did our training go?
~20 min to run on my MacBook
95.3% accuracy on the test data
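Accuracy on the test data is the fraction of held-out examples the model labels correctly. A minimal sketch with made-up labels (not the talk's data):

```python
def accuracy(predicted, actual):
    """Fraction of predictions that match the true labels."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


# Made-up example: 3 of 4 test images classified correctly
predicted = ["A", "B", "C", "A"]
actual = ["A", "B", "C", "B"]
print(accuracy(predicted, actual))  # 0.75
```

The test examples should be ones the model never saw during training; otherwise the number overstates how well it generalizes.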
3. Convert to .mlmodel
tf-coreml
It just works
4. Import into Xcode project
drag + drop
It actually just works
5. Predict using Core ML (+Vision) framework
Vision + Core ML
audio classification
0. Define problem
1. Gather data
2. Train ML model
0. Define problem
Audio classifier of urban sounds
air conditioner, car horn, children playing, drilling, siren, etc.
1. Gather data
UrbanSound8K open dataset
Urban Sound Datasets, NYU CUSP
should we use raw audio (.wav)?
no, it's too computationally expensive
convert wav to spectrogram
represent audio as image (3 dimensions)
1st dimension: time (x-axis)
2nd dimension: frequency (y-axis)
3rd dimension: sound intensity (color)
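The three dimensions can be sketched with a naive short-time Fourier transform in plain Python. This is an illustration, not the talk's preprocessing code: the frame size, hop size and Hann window are assumptions, and real pipelines would use an optimized audio library.

```python
import cmath
import math


def spectrogram(samples, frame_size=64, hop=32):
    """Magnitude spectrogram: rows are time frames, columns are frequency bins."""
    frames = []
    for start in range(0, len(samples) - frame_size + 1, hop):
        frame = samples[start:start + frame_size]
        # Hann window reduces leakage at the frame edges
        windowed = [
            s * (0.5 - 0.5 * math.cos(2 * math.pi * n / (frame_size - 1)))
            for n, s in enumerate(frame)
        ]
        # Naive DFT; keep only the non-negative frequencies
        row = []
        for k in range(frame_size // 2 + 1):
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * n / frame_size)
                for n, x in enumerate(windowed)
            )
            row.append(abs(coeff))  # intensity, i.e. the "color" of the pixel
        frames.append(row)
    return frames


# Synthetic tone that completes 8 cycles per 64-sample frame
tone = [math.sin(2 * math.pi * 8 * n / 64) for n in range(256)]
spec = spectrogram(tone)
print(len(spec), "time frames x", len(spec[0]), "frequency bins")
print("loudest bin in first frame:", spec[0].index(max(spec[0])))
```

Each row is one time step, each column one frequency, and each value the intensity, so the result can be saved as an image and fed to an image classifier.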
2. Train Model
Performance
so how did our training go?
~1 hour to run on my MacBook
77.1% accuracy on the test data
what to focus on
Where to find inspiration
look at open datasets
read research papers!
follow the Heartbeat blog, OpenAI
Reproduce results
research papers often include this
make sure to do the same if you publish
check licensing + attribute proper credit
Looking forward to the future
ML interpretability
Swift for TensorFlow
Review
big picture
when is it practical to use ML for iOS?
what's available to us?
end-to-end examples
Attributions & Mentions (1/4)
Apple Machine Learning
WWDC 2017 Videos
TensorFlow for Poets Google codelabs tutorial
Apple coremltools GitHub repo
tf-coreml GitHub repo: TensorFlow → Core ML converter
Attributions & Mentions (2/4)
Heartbeat by fritz.ai blog: Machine Learning at the edge
ASL Datasets
Kaggle Sign Language MNIST
Urban Sound Datasets, NYU CUSP
deeplearning.ai course: Data Augmentation
Attributions & Mentions (3/4)
Swift for TensorFlow GitHub repo
Dockerized Swift for TF GitHub repo, Alexis Gallagher
the morning paper by Adrian Colyer
OpenAI Research
"The Building Blocks of Interpretability" Google: C. Olah et al.
Attributions & Mentions (4/4)
"Strategically Ignorant" Devon Zuegel
"Transfer Learning of Temporal Information for Driver Action Classification" J. Lemley et al.
"Transfer Learning for Sound Classification" TataLab
Further Learning (1/3)
fast.ai Deep Learning course
My Udacity Core ML course
machinethink, ML for iOS blog by Matthijs Hollemans
TensorFlow Dev Summit 2018 Videos
TensorFlow playground
Further Learning (2/3)
Building Mobile Apps w/ TensorFlow, Pete Warden
Neural Networks & Deep Learning, Michael Nielsen
Stanford's Computer Vision course (CS231n)
Further Learning (3/3)
"Distilling the Knowledge in a Neural Network" Geoffrey Hinton et al.
"Transfer Learning - Machine Learning's Next Frontier" Sebastian Ruder
"Transfer learning for music classification and regression tasks" Keunwoo Choi et al.
Thank you
Keep in touch!
twitter: @meghafon
