Automation & Machine Learning
Dr. Alvaro Feito Boirac
22 Aug 2016, Bermuda
Where should we use automation?
- Repetitive Tasks
- Grunt work time/cost > development time/cost
- Task can be described as clear* steps
* the meaning of “clear” is subtle and will require clarification
Effortless for Humans ≠ Easy for Machines
3 Criteria for automation:
- Is it repetitive enough?
- Does it require enough man-hours?
- Can we describe it as an algorithm?
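The cost criterion (grunt work time/cost > development time/cost) can be sketched as a break-even check. This is a minimal illustration: the function name, the 12-month horizon, and all the numbers are assumptions, not figures from the talk.

```python
def worth_automating(runs_per_month, minutes_per_run, dev_hours,
                     maintenance_hours_per_month=0.0, horizon_months=12):
    """Compare manual effort with automation effort over a time horizon.

    Returns (decision, manual_hours, automation_hours).
    All inputs are estimates supplied by the caller.
    """
    # Hours spent doing the task by hand over the whole horizon.
    manual_hours = runs_per_month * minutes_per_run / 60 * horizon_months
    # One-off development plus recurring maintenance over the same horizon.
    automation_hours = dev_hours + maintenance_hours_per_month * horizon_months
    return manual_hours > automation_hours, manual_hours, automation_hours


# Hypothetical task: 20 runs a month, 30 minutes each, 40 hours to automate.
decision, manual, automated = worth_automating(
    runs_per_month=20, minutes_per_run=30, dev_hours=40,
    maintenance_hours_per_month=1)
print(decision, manual, automated)
```

The check only covers the first two criteria. Whether the task can be described as an algorithm still has to be judged separately.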
4 Tools (for different processes)
- Show by Clicking
- Show with Scripting
- Show with Code
- Machine Learning
Click & Tell
1. Click on
2. Type “Excel”
3. Click on
4. Paste text from previous task ...
Use software that reads and executes instructions:
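A rough sketch of what "reads and executes instructions" could mean in practice: a small interpreter that maps each instruction line to an action. The script format, the `run_script` helper, and the targets `target_a` and `target_b` are hypothetical placeholders. The actions only write to a log here, where a real tool such as Sikuli would drive the mouse and keyboard.

```python
# Hypothetical instruction script; target_a and target_b are placeholders
# for whatever the user would point at on screen.
SCRIPT = """\
click target_a
type Excel
click target_b
paste
"""


def run_script(text, actions):
    """Execute each non-empty line as '<verb> [argument]'."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue
        verb, _, argument = line.strip().partition(" ")
        if verb not in actions:
            raise ValueError(f"line {line_no}: unknown instruction {verb!r}")
        actions[verb](argument)


# Stand-in actions: they record what would be done instead of doing it.
log = []
actions = {
    "click": lambda target: log.append(f"click on {target}"),
    "type": lambda text: log.append(f"type {text!r}"),
    "paste": lambda _: log.append("paste clipboard"),
}

run_script(SCRIPT, actions)
print("\n".join(log))
```

The small, fixed set of verbs is also why this family of tools does not scale with complexity: anything outside the vocabulary cannot be expressed.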
PROS
- $ Cheap
- Easy to learn
- Quick-Start
CONS
- Not resilient (image change)
- Somewhat limited: click, open, close, save, write, copy, paste, move, etc.
- Does not scale with complexity
Click & Tell
Examples:
- Sikuli (or SikuliX)
- Automa
- PyWinAuto
- Etc ...
Click & Tell
Ref: Automate the Boring Stuff with Python
Open all documents in folder X, compare the 3rd item, and save the
result to an Excel sheet. At the end of every month, open a website,
take a screenshot of the stock, paste it into Excel, save that Excel
sheet in the VP’s drive, and delete all the documents.
Click & Tell
Ref: Automa
Script & tell
Combine scripts from:
Windows, VBA, your API
PROS
- Uses your current software
- Gentle learning curve
- Affordable
- Can automate progressively
CONS
- Not resilient (program change)
- Often incompatible: Program A can’t talk to Program B, which uses a different format.
- Not one single project: too many moving parts
Script & tell
Examples:
- VBA (Excel)
- Windows Automation API
- AutoIt
- Etc ...
Open all documents in folder X, compare the 3rd item, and save the
result to an Excel sheet. Calculate the rolling average of the price
and the contribution of each department. At the end of the month, open
a website, take a screenshot of the stock, and create a .doc in the
VP’s drive after deleting all the documents.
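The two calculations in this example task (rolling average of the price, contribution of each department) can be sketched as follows. Python is used here for illustration although the slide's tools are VBA and similar. The function names, the window of 3, and the data are made up.

```python
from collections import deque


def rolling_average(values, window):
    """Yield the mean of the last `window` values once enough are seen."""
    recent = deque(maxlen=window)  # old values fall off automatically
    for value in values:
        recent.append(value)
        if len(recent) == window:
            yield sum(recent) / window


def contributions(totals):
    """Map each department to its share of the overall total."""
    overall = sum(totals.values())
    return {name: amount / overall for name, amount in totals.items()}


prices = [10.0, 12.0, 11.0, 13.0, 15.0]            # made-up prices
by_department = {"A": 50.0, "B": 30.0, "C": 20.0}  # made-up totals

averages = list(rolling_average(prices, window=3))
shares = contributions(by_department)
print(averages)
print(shares)
```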
Script & tell
Code & Tell
Use a programming language to manipulate your documents
and interact with your software.
Code & Tell
[Diagram labels: Compiler, Click, Macros, Code]
Code & Tell
PROS
- More versatile & powerful
- All in one platform
- Can automate progressively
- Many building blocks already exist
- Data analysis is easy to add
CONS
- Steeper learning curve
- Investment in staff? Time?
- Requires some maintenance
Code & Tell
Examples:
- Python (PyWinAuto + Pandas + NumPy + …)
- .NET (White + RogueWave, …)
- Java, Perl, Bash, …
Pull the raw data, compute the usual statistics, and create a PDF
report from it with graphs. Make backups of all the documents and copy
the first line of each into an email that will go to the SVP of XYZ.
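The backup-and-email half of this example could look roughly like the sketch below, using only the Python standard library. The folder layout, the `.txt` extension, the `backup_and_summarize` helper, and the address `svp@example.com` are assumptions for illustration. The message is built but not sent, and the statistics and PDF steps are left out.

```python
import shutil
import tempfile
from email.message import EmailMessage
from pathlib import Path


def backup_and_summarize(folder, backup_dir, recipient):
    """Copy every .txt file to backup_dir and collect each first line."""
    folder, backup_dir = Path(folder), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    lines = []
    for path in sorted(folder.glob("*.txt")):
        shutil.copy2(path, backup_dir / path.name)  # backup keeps metadata
        with path.open(encoding="utf-8") as handle:
            first_line = handle.readline().strip()
        lines.append(f"{path.name}: {first_line}")
    message = EmailMessage()
    message["To"] = recipient
    message["Subject"] = "First line of each document"
    message.set_content("\n".join(lines))
    return message


# Demo on throwaway files so the sketch runs anywhere.
with tempfile.TemporaryDirectory() as tmp:
    docs = Path(tmp) / "docs"
    docs.mkdir()
    (docs / "a.txt").write_text("Revenue up\nmore detail\n", encoding="utf-8")
    (docs / "b.txt").write_text("Costs flat\nmore detail\n", encoding="utf-8")
    message = backup_and_summarize(docs, Path(tmp) / "backup", "svp@example.com")
    backed_up = sorted(p.name for p in (Path(tmp) / "backup").iterdir())

print(message.get_content())
```

Sending would be a separate step (for example with `smtplib`), which depends on the mail server available.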
Machine Learning
Use software and programming tools inspired by the brain.
For more fuzzy tasks:
- Recognize objects in an image
- Transcribe handwriting / solve Captchas
- Transcribe speech
- Make decisions based on data
- Identify trends or patterns
- Does this scan contain a seal and a signature?
Machine Learning (3 main schools*)
- Biology & Physics inspired networks
- Statistical learning (Bayesian)
- Learning by Analogy (SVM)
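To give a flavour of learning by analogy, here is a minimal k-nearest-neighbour sketch: a new case gets the label of the known cases it most resembles. Note that this is nearest-neighbour classification, a simpler relative of the SVM named on the slide, not an SVM itself. The two features and all the numbers are invented for illustration.

```python
import math
from collections import Counter


def knn_predict(examples, query, k=3):
    """Label `query` by majority vote among its k nearest labelled examples."""
    nearest = sorted(examples, key=lambda ex: math.dist(ex[0], query))[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up features per image region: (ink coverage, roundness).
examples = [
    ((0.80, 0.90), "seal"),
    ((0.75, 0.85), "seal"),
    ((0.70, 0.95), "seal"),
    ((0.20, 0.10), "signature"),
    ((0.25, 0.20), "signature"),
    ((0.30, 0.15), "signature"),
]

round_blob = knn_predict(examples, (0.72, 0.88))
thin_stroke = knn_predict(examples, (0.22, 0.18))
print(round_blob, thin_stroke)
```

In practice the hard part is turning a scan into useful features, which is where libraries such as OpenCV come in.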
Machine Learning
PROS
- Great for intuitive tasks (image, patterns, trends, voice)
- Many ready-to-use tools
- Can run in parallel with (or on top of) other tasks
- Mostly free & Open source
CONS
- Longer/Steeper learning curve
- Harder to hire experts
- May need large training data sets
Machine Learning
Examples:
- OpenCV (image)
- PyOCR, Tesseract, FreeOCR (OCR)
- Theano, scikit-learn, TensorFlow
Find object in image, check signature, check stamp, count bullet
points on a scan, find deep correlations in data, track point in
video, transcribe handwriting or voice, etc.

Automation and machine learning in the enterprise
