Navigating the Apple ML Landscape
Meghan Kane, @meghafon
developer @novoda, Berlin
Swift & Fika, 9/2018
before we pack our bags..
why use machine learning .. at all?
ML tasks
→ detection: rectangles, face, barcode, text
→ classification: image, text, activity, column data
→ style transfer
→ image similarity
→ object detection
→ natural language processing
→ regression
Advantages of on-device inference
1. user privacy
2. low latency user experience
Let's start our road trip
Classify
board game
pieces
ML approach
Label: "Border"
1. Detect game box (rectangle detection)
2. Classify Agricola piece within rectangle (image classification)
Capabilities
... built right into the Vision framework, no model training needed
→ detection: rectangles, face, barcode, text
→ object tracking
→ image alignment
How?
1. create handler
2. create task specific request
3. send request to handler
4. handle results
// 1. Create handler
let handler = VNImageRequestHandler(cgImage: image,
                                    orientation: orientation,
                                    options: [:])

// 2. Create request
let request = VNDetectRectanglesRequest(completionHandler: self.handleDetectedRectangles)

// 3. Send request to handler
do {
    try handler.perform([request])
} catch {
    // handle error
    return
}

// 4. Handle results
func handleDetectedRectangles(request: VNRequest, error: Error?) {
    if let results = request.results as? [VNRectangleObservation] {
        // Do something with results (bounding box coordinates)
    }
}
ML approach
Label: "Border"
1. Detect game box (rectangle detection)
2. Classify Agricola piece within rectangle (image classification)
Why restrict input image to the box?
→ easier to train an accurate model
→ faster to collect image data
Capabilities
In Xcode playground, train custom model for:
→ image classification
→ text classification
→ classification & regression of column data
Collect Data
→ Collect images representative of real world use cases
→ Vary angle & lighting
→ >10 images per label, but ideally more
→ Equal # images for each label
→ Recommended: >299x299 pixels
Collecting Data Quickly
// Extract .jpg frames from .mov @ 5 frames/sec
ffmpeg -i stone.mov -r 5 data/stone/stone_%04d.jpg
Prepare Data
→ Split data: 80% train / 20% test
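The 80/20 split can be done with a few lines of Python before handing the folders to Create ML. This is an illustrative helper (not part of any Apple API); the file names follow the ffmpeg output pattern from the previous slide:

```python
import random

def train_test_split(paths, train_fraction=0.8, seed=42):
    """Shuffle file paths deterministically and split into train/test sets."""
    shuffled = sorted(paths)          # deterministic base order
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]

train, test = train_test_split(
    [f"data/stone/stone_{i:04d}.jpg" for i in range(100)])
print(len(train), len(test))  # 80 20
```

Fixing the seed keeps the split reproducible across training runs, so evaluation numbers stay comparable.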
what's happening behind the scenes?
Transfer
Learning
Evaluate
Save model
Capabilities
→ perform predictions using model
→ quantized weights (32 bit -> 16, 8, 4... bit)
→ perform batch predictions
→ create custom model layer
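To see why quantized weights matter on device, a back-of-the-envelope sketch of weight storage at different bit widths (the 25M weight count is purely illustrative, not taken from any particular model):

```python
def weights_size_mb(num_weights, bits):
    """Storage needed for num_weights values at the given bit width, in MB."""
    return num_weights * bits / 8 / 1e6

num_weights = 25_000_000  # illustrative weight count
for bits in (32, 16, 8, 4):
    print(f"{bits:>2}-bit weights: {weights_size_mb(num_weights, bits):.1f} MB")
# 100.0, 50.0, 25.0, 12.5 MB
```

Halving the bit width halves the bundled model size; accuracy should be re-checked after quantizing, since low bit widths can degrade predictions.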
Vision + Core ML
1. create Vision Core ML model
2. create handler
3. create task specific request
4. send request to handler
5. handle results
// 1. Create Vision Core ML model
let model = AgricolaPieceClassifier()
guard let visionCoreMLModel = try? VNCoreMLModel(for: model.model) else { return }

// 2. Create handler
let handler = VNImageRequestHandler(cgImage: cgImage, orientation: cgImageOrientation)

// 3. Create request
let request = VNCoreMLRequest(model: visionCoreMLModel,
                              completionHandler: self.handleClassificationResults)

// 4. Send request to handler
do {
    try handler.perform([request])
} catch {
    // handle error
    return
}

// 5. Handle results
func handleClassificationResults(request: VNRequest, error: Error?) {
    if let results = request.results as? [VNClassificationObservation] {
        // Do something with results
    }
}
Problem solved
✅
Label: "Border"
1. Detect game box (rectangle detection)
2. Classify Agricola piece within rectangle (image classification)
Next
challenge...
Capabilities
...robust Python training framework
→ style transfer
→ activity classification
→ image similarity
→ recommendations
→ object detection
Setting up Python
virtualenv venv
source venv/bin/activate
pip install requests==2.18.4 turicreate==5.0b2 jupyter
Train style transfer
jupyter notebook
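The notebook itself isn't shown in the deck; assuming Turi Create 5's API, the style transfer training loop boils down to a few calls like the following sketch (directory and file names are placeholders, and training is compute-hungry):

```python
import turicreate as tc

# Load style reference images and content training images
# (directory names are placeholders)
styles = tc.load_images("style/")
content = tc.load_images("content/")

# Train the style transfer model (transfer learning under the hood)
model = tc.style_transfer.create(styles, content)

# Export to Core ML for on-device inference
model.export_coreml("StyleTransfer.mlmodel")
```

The exported .mlmodel can then be dropped into Xcode and driven through Vision + Core ML exactly like the classifier earlier in the talk.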