ML and AI are increasingly dominating the high-tech industry. Organizations and technology companies are leveraging their big data to create new products or improve their processes to reach the next level in their market. However, ML and AI are not a silver bullet and Software 2.0 is not the end of software developers or software engineering.
In this talk I will argue how software engineering can help ML and AI become the key technology for (autonomous) systems of the near future. Software engineering best practices and achievements of the last decades might help by, e.g., (i) democratising the use of ML/AI, (ii) composing, reusing, and chaining ML/AI models to solve more complex problems, and (iii) supporting reasoning about correctness, repeatability, explainability, traceability, fairness, and ethics while building an ML/AI pipeline.
Software Engineering for ML/AI, keynote at FAS*/ICAC/SASO 2019
1. Patrizio Pelliccione
Associate Professor (Docent), Chalmers|GU
Associate Professor, University of L’Aquila, Italy
www.patriziopelliccione.com
Software Engineering for ML/AI
6. Some success stories of AI/ML
https://www.shellypalmer.com/2016/03/alphago-vs-not-fair-fight
• March 15, 2016
• AlphaGo is the first computer program to defeat
a professional human Go player
• Lee Sedol - winner of 18 world titles
• widely considered to be the greatest player of the
past decade
• Over 200 million people watched online
• 4-1 victor of The Google DeepMind Challenge
match in Seoul, South Korea
• AlphaGo played a number of highly innovative
moves which contradicted centuries of received
Go knowledge
7. • August, 2017 - OpenAI vs Dendi
• OpenAI was the first ever to defeat the world’s best players
in competitive eSports
• Dendi is a Ukrainian Dota 2 player widely
recognized as one of the best in the world
• OpenAI’s bot appears to run a genetic machine learning algorithm
• It evolves and learns as it plays and discards inferior
versions of its code
• It learns to play the game only by playing against
itself
• It can learn from scratch in about two weeks of real
time
Some success stories of AI/ML
https://research.nvidia.com/sites/default/files/pubs/2017-10_Progressive-Growing-of/karras2018iclr-paper.pdf
“This guy is really scary”
“Please stop bullying me!”
Dendi
8. • January 2018 – Libratus versus humans
• Developed by researchers at Carnegie Mellon’s
computer science department
• Defeated four top human specialist professionals in
heads-up No-limit Texas Hold’em (HUNL)
• a game where bluffing is a core, necessary
component
• HUNL is an “imperfect-information game” -
not all information about all elements in play
is available to all players at all times
• Approach
• precomputing an overall strategy,
• adapting the strategy to actual gameplay, and
• learning from its opponent
Some success stories of AI/ML
https://science.sciencemag.org/content/359/6374/418.full
9. ML and AI are not a Silver bullet
“There is no single development, in either technology or management
technique, which by itself promises even one order-of-magnitude improvement
within a decade in productivity, in reliability, in simplicity.”
Frederick P. Brooks (Turing award) No Silver Bullet Essence and Accidents of Software Engineering.
Computer 20, 4 (April 1987), 10-19. DOI: https://doi.org/10.1109/MC.1987.1663532
10. ML and AI are not a Silver bullet
• ML/AI <<can fail in unintuitive and embarrassing ways, or worse, they can
“silently fail”, e.g., by silently adopting biases in their training data…>>
Andrej Karpathy, Director of AI – Tesla
• The (Un)Known Unknown: AI Can't Analyze What It Does Not Know
• AI is not going to magically solve problems without any significant investments
from our end
• Achieving success using AI/ML requires engineering, discipline, and
operationalization
11. Software 2.0 is not the end of software developers and SE
• Software 2.0
• Andrej Karpathy, Director of AI – Tesla, Identified a fundamental paradigm shift in how we build
software
• We are building, or rather training, systems whose inner workings we do not actually know
• Software 2.0 is about finding programs through optimization, i.e., directed search using training
data as the guide
• Learning programs from data means we need more data
“Accumulating a nice, varied, large, clean dataset for all the different tasks you want to do,
and worrying about all the edge cases and massaging it is where most of the action is”
Andrej Karpathy, 2018
13. Smart decisions under uncertainty
• Annie Duke was a professional poker player: she won over $4 million in tournaments, earned a World Series
of Poker bracelet, and is the only woman to have won the WSOP Tournament of Champions and the NBC
National Heads-Up Poker Championship.
• Annie has devoted her life to the study of decision-making under pressure.
14. Smart decisions under uncertainty
With 26 seconds remaining in Super Bowl XLIX -
2015, and trailing by four points at the Patriots'
one-yard line, [Pete Carroll] called for a pass
instead of a hand off to his star running back
[Marshawn Lynch]. The pass was intercepted
and the Seattle Seahawks lost.
15. Smart decisions under uncertainty
The headlines the next day were brutal:
• USA Today: "What on Earth Was Seattle Thinking with Worst Play Call in NFL
History?"
• Washington Post: "'Worst Play-Call in Super Bowl History' Will Forever Alter
Perception of Seahawks, Patriots"
• FoxSports.com: "Dumbest Call in Super Bowl History Could Be Beginning of the
End for Seattle Seahawks"
• Seattle Times: "Seahawks Lost Because of the Worst Call in Super Bowl History"
• The New Yorker: "A Coach's Terrible Super Bowl Mistake”
But was the call really that bad?
Or did Carroll actually make a great move that was ruined by bad luck?
16. Smart decisions under uncertainty
• An interception was an extremely unlucky outcome
• During the season, none of the sixty-six passes thrown
in that situation had been intercepted
• In the previous fifteen seasons, the interception rate in similar
situations was about 2%
• Pete Carroll got unlucky
• He made a good-quality decision that got a bad result
17. Smart decisions under uncertainty
<<Pete Carroll was a victim of our tendency to equate the quality of a decision with
the quality of its outcome. Poker players have a word for this: “resulting.”>>
18. Smart decisions under uncertainty
• We are very bad at separating luck from skill
• Results can be beyond our control
• The connection between results and the quality of
the decisions preceding them is not so strong
19. Smart decisions under uncertainty
Learning from mistakes, adjusting, and acting: a good
idea, but be careful about the quality of the feedback!
Be aware of the dangers of “resulting”!
21. ML/AI for Software Engineering
• Exploiting ML/AI for facilitating and
enhancing SE activities
• ICSE 2019, technical track papers
• Roughly 22% of the accepted papers
exploit AI/ML for SE activities
22. ML/AI for Software Engineering
•ICSE-SEIP 2019 best paper award
Software Engineering for Machine Learning: A Case Study
Saleema Amershi - Microsoft, Andrew Begel - Microsoft Research, Christian Bird
- Microsoft Research, Rob DeLine – Microsoft Research, Harald Gall - University of
Zurich, Ece Kamar - Microsoft, Nachiappan Nagappan - Microsoft
Research, Besmira Nushi – Microsoft Research, Thomas Zimmermann - Microsoft
Research
23. ML/AI for Software Engineering: AI at Microsoft
Goal
• Create a “map” of AI at Microsoft
• Identify best practices across teams and products using AI
• Discover research opportunities
Method
• 14 interviews
• 551 responses
24. ML/AI for Software Engineering: AI at Microsoft
Map of AI at Microsoft
• Microsoft puts AI in everything
• Many different AI algorithms in use
• 159 tools in use for ML
• AI and data scientists are a recent addition to
the team
• Data scientists bring their own workflow
25. ML/AI for Software Engineering: AI at Microsoft
Common Challenges
End-to-end tool fragmentation
Data collection and cleaning is arduous
Traditionally developers focus on code, not data
Low experience
• Education and training
• Integrating AI into larger systems
High experience
• Tools
• Scalability
• Educating others
• Model evolution, evaluation, and deployment
26. ML/AI for Software Engineering: AI at Microsoft
Best practices for ML
• ML tools need to be better integrated into the ML workflow
and the workflow needs to be automated
• Center development around data
• Use simple, explainable, and composable ML models
• Carefully design test sets and human-in-the-loop evaluation
• Do not decouple model building from the rest of software
27. Software Engineering for ML/AI
•ICSE 2019, technical track papers
• 3 papers trying to improve AI/ML algorithms
• deep learning software testing
• adversarial sample detection
28. • Software engineering can help ML and AI to become the key technology for
(autonomous) systems of the near future
• Software engineering best practices and achievements reached in the last
decades might help
• Democratizing the use of ML/AI
• Trustworthy ML/AI
• Ethics and privacy
Software Engineering for ML/AI
29. Democratize ML/AI
• Tools can be used in fields such as healthcare, education, manufacturing, retail, etc.
• Sharing AI’s power with the masses, allowing anyone and everyone to build the AI systems
they need
Putting the tools [ML/AI] “…in the hands of every developer, every
organization, every public sector organization around the world”
Satya Nadella (CEO of Microsoft) at the DLD (Digital-Life-Design)
conference in Munich, 2017
https://news.microsoft.com/europe/2017/01/17/democratizing-ai-satya-nadella-shares-vision-at-dld/
30. Democratize ML/AI
• Create building blocks that contain and embed ML/AI functionalities
Some Microsoft solutions
https://azure.microsoft.com/en-us/overview/ai-platform/
• Knowledge mining: uncover latent insights from all your content
• Machine Learning: quickly and easily build, train, deploy, and manage your models
• AI apps and agents: deliver breakthrough experiences in your apps
Some Google solutions
https://medium.com/vickdata/how-google-are-democratising-ai-7d47a07a7307
• BigQuery ML: enables users to create and execute machine learning models in BigQuery by using standard SQL queries
• Cloud AutoML: includes AutoML Vision for image classification, AutoML Natural Language for text classification, and AutoML Translation
• An end-to-end open source machine learning platform designed to simplify large-scale deep learning
31. Democratize ML/AI
some other initiatives
H2O.ai — https://www.h2o.ai/democratizing-ai/
• A movement to democratize AI for everyone
• Open source ML platform
• Increasing transparency, accountability, and trustworthiness in AI
Peltarion — https://peltarion.com
• Easy-to-use cloud-based operational AI platform to build and deploy deep learning models
• Integrated visual development environment for deep learning
32. Democratize ML/AI - challenges
• Package AI solutions in customizable and tunable
solutions
• Make clear the assumptions, properties, and limitations of
the solutions
• Support the composition of simple solutions into more
complex solutions
• Wizards to guide end-users in selecting the best solution
for the specific problem
• Guide the selection and creation of training data
• Guide the verification and validation of the defined
solution
• Easy to use
SE knowledge that can help:
• Component-based software engineering
• Design-by-contract
• V&V approaches
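As a sketch of how design-by-contract could package an ML solution with explicit assumptions, the wrapper below rejects inputs that violate a declared schema instead of failing silently. The component name, schema format, and the stand-in model are all hypothetical, not an existing API:

```python
# Design-by-contract applied to a packaged ML component: the wrapper makes
# the model's assumptions (expected features, value ranges) explicit and
# checks them before prediction, instead of silently accepting bad inputs.
class ContractViolation(Exception):
    pass

class MLComponent:
    """Wraps any object with a .predict(features) method behind a contract."""
    def __init__(self, model, schema):
        self.model = model
        self.schema = schema  # feature name -> (min, max) allowed range

    def predict(self, features):
        # Precondition: exactly the declared features, each within range.
        if set(features) != set(self.schema):
            raise ContractViolation(f"expected features {sorted(self.schema)}")
        for name, (lo, hi) in self.schema.items():
            if not lo <= features[name] <= hi:
                raise ContractViolation(f"{name}={features[name]} outside [{lo}, {hi}]")
        return self.model.predict(features)

# Usage with a trivial stand-in model:
class Threshold:
    def predict(self, f):
        return "high" if f["temperature"] > 30 else "normal"

component = MLComponent(Threshold(), {"temperature": (-40, 60), "humidity": (0, 100)})
```

Because the contract is explicit, composing such components (or building a selection wizard over them) can check compatibility mechanically rather than by reading documentation.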
35. Traditional Software vs ML
Traditional software: pre-engineering all behaviours at design time
Machine learning: learning the behaviours at run-time
https://xkcd.com/1838/
36. Reinforcement learning
The good:
• No prior knowledge of the
environment is needed
• Can continuously learn and adapt
[RL loop: the agent in state t takes action t, gets a reward, and moves to state t+1]
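The agent-environment loop above can be sketched with tabular Q-learning on a toy corridor environment; all names and hyperparameters below are illustrative assumptions, not from the talk:

```python
import random

# Minimal agent-environment loop (tabular Q-learning) on a toy 5-cell
# corridor: the agent starts in cell 0 and gets reward 1.0 on reaching cell 4.
N_STATES, ACTIONS = 5, (-1, +1)    # actions: step left / step right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

def env_step(state, action):
    """Environment: returns (next state, reward)."""
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def greedy(state):
    # Greedy action with random tie-breaking.
    best = max(Q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(state, a)] == best])

random.seed(0)
for _ in range(500):        # episodes: no prior model of the environment
    s = 0
    for _ in range(200):    # step cap per episode
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(s)
        s2, r = env_step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, a2)] for a2 in ACTIONS) - Q[(s, a)])
        s = s2
        if s == N_STATES - 1:
            break

# The learned greedy policy steps right (+1) in every non-terminal cell.
policy = [greedy(s) for s in range(N_STATES - 1)]
```

Nothing about the corridor is known in advance; the behaviour emerges purely from the reward signal, which is exactly why the reward function carries so much design weight.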
37. How to convey the goals to a Reinforcement Learning (RL) agent?
RL - Reward Function
• No knowledge about the
environment
• It learns the optimal policy to
play the game
• The reward function is simply
the score of the game
(or a function of the score)
P. Mallozzi, R. Pardo, V. Duplessis, P. Pelliccione, and G. Schneider “MoVEMo - A structured approach for engineering
reward functions” in International Conference on Robotic Computing (IRC), Laguna Hills (CA), 2018, IEEE
38. RL - Reward Hacking
Policies that maximise the reward functions
are not guaranteed to satisfy the
specifications.
• Reward hacking:
• Informal goal: complete the race
• Conveyed goal: hit as many targets as possible to get
more points
• Wrong assumption:
the score of the game conveys the implicit goal
of finishing the race
P. Mallozzi, R. Pardo, V. Duplessis, P. Pelliccione, and G. Schneider “MoVEMo - A structured approach for
engineering reward functions” in International Conference on Robotic Computing (IRC), Laguna Hills (CA), 2018, IEEE
39. How do we trust that a reward
function conveys the right goals?
• Motivation
• Trivial reward functions do not always work
• Real world applications typically involve complex
tasks
• The goals of the system are encoded in the reward
function
• Reward functions are typically handcrafted
• Problem
• Policies that maximize the reward functions are not
guaranteed to satisfy the specifications (Reward
hacking)
40. How do we trust that a reward
function conveys the right goals?
Solution: Engineering the reward function!
41. MoVEMo - A structured approach for
engineering reward functions
1. Model complex reward functions as a
network of state machines
2. Formally verify the correctness of the
reward model
3. Automatically enforce the reward model
to the RL agent at runtime
4. Monitor the behaviour of the agent as it
traverses the state machines to collect
the rewards
Results: improvement in the
achievement of the goals in less
time/episodes
P. Mallozzi, R. Pardo, V. Duplessis, P. Pelliccione, and G. Schneider “MoVEMo - A structured approach for engineering
reward functions” in International Conference on Robotic Computing (IRC), Laguna Hills (CA), 2018, IEEE
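MoVEMo’s four steps can be illustrated with a toy reward function expressed as an explicit state machine. The paper models rewards as UPPAAL timed automata; this plain-Python sketch is illustrative only, and the states, events, and reward values are hypothetical:

```python
# A reward function modeled as an explicit state machine: transitions are a
# declared table rather than ad-hoc code, so properties such as "the finish
# reward is reachable" can be checked mechanically before training.
class RewardMachine:
    # (current state, observed event) -> (next state, reward)
    TRANSITIONS = {
        ("driving", "checkpoint"):   ("driving",   1.0),
        ("driving", "off_track"):    ("off_track", -5.0),
        ("off_track", "recovered"):  ("driving",   0.0),
        ("driving", "finish_line"):  ("finished",  100.0),
    }

    def __init__(self):
        self.state = "driving"

    def reward(self, event):
        """Advance the machine on an event; unknown events yield 0 reward."""
        nxt, r = self.TRANSITIONS.get((self.state, event), (self.state, 0.0))
        self.state = nxt
        return r
```

Note that an agent that goes off track cannot collect the finish reward until it recovers, which is the kind of constraint that a raw game score (and hence a reward-hacking policy) would miss.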
42. MoVEMo: results
• We have compared our UPPAAL reward model (URM)
with a BRF (benchmark reward function) proposed in [4]
• This BRF had already been shown to improve on the original
reward function proposed by Google in [5]
• By building more complex and robust reward functions, the
agent can learn to achieve its goal faster.
[4] “Using Keras and Deep Deterministic Policy Gradient to play TORCS,”
https://github.com/yanpanlau/DDPG-Keras-Torcs
[5] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra,
“Continuous control with deep reinforcement learning”
P. Mallozzi, R. Pardo, V. Duplessis, P. Pelliccione, and G. Schneider “MoVEMo - A structured approach for
engineering reward functions” in International Conference on Robotic Computing (IRC), Laguna Hills (CA), 2018, IEEE
Results
• Average results of 100 iterations.
• To complete one iteration, the agent must learn how to
drive on the track and complete 20 laps.
• An episode of the algorithm ends when the vehicle is
perpendicular to the track axis.
• The Uppaal model contains 6 automata with a total of 21 states.
43. How do we ensure that the RL agent does
not violate important properties?
P.Mallozzi, E.G.Castellano, P.Pelliccione, G.Schneider, K.Tei (2019) A Runtime Monitoring Framework to Enforce Invariants on Reinforcement Learning Agents Exploring
Complex Environments In: RoSE 2019 : 2nd International Workshop on Robotics Software Engineering (RoSE’19).
• Proposed approach:
• Runtime verification (monitoring) to create a
safety envelope to control the ML agent
• Safety case
• A structured written argument, supported by
evidence, justifying that the system is acceptably
safe for its intended use
• Safety reason 1 / evidence for reason 1
• Safety reason 2 / evidence for reason 2
• …
• Invariants elicited from safety reasons
and evidence
44. How do we ensure that the RL agent does
not violate important properties?
1. modeling safety-critical requirements in
terms of invariants
2. monitoring the agent as it performs
actions freely in the environment
3. enforcing a safe behaviour of the agent
when it is about to violate the
requirements
4. shaping the reward of the agent so that
it learns to avoid hazardous situations in
the future and it converges faster to its goal
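A minimal sketch of steps 1–4 above, on a toy 1-D corridor with a water hazard. The environment, the invariant class, and the penalty value are my illustrative assumptions, not the WiseML implementation:

```python
import random

class Env:
    """1-D corridor, cells 0..4; water hazard at cell 1, goal at cell 4."""
    WATER, GOAL = 1, 4
    def step(self, state, action):              # action in {-1, +1}
        nxt = max(0, min(4, state + action))
        return nxt, (1.0 if nxt == self.GOAL else 0.0)

class NoWater:
    """Invariant (step 1): never step onto the water cell."""
    def allows(self, state, action):
        return state + action != Env.WATER
    def safe_action(self, state):
        return +1 if state > Env.WATER else -1  # step away from the water

class RandomAgent:
    """Stand-in learner that records shaped experience tuples."""
    def __init__(self):
        self.history = []
    def propose(self, state):
        return random.choice((-1, +1))
    def learn(self, s, a, r, s2):
        self.history.append((s, a, r, s2))

def monitored_step(env, agent, state, invariants, penalty=-1.0):
    action = agent.propose(state)
    shaping = 0.0
    for inv in invariants:                      # step 2: monitor the action
        if not inv.allows(state, action):
            action = inv.safe_action(state)     # step 3: enforce a safe action
            shaping = penalty                   # step 4: shape reward to teach avoidance
            break
    nxt, reward = env.step(state, action)
    agent.learn(state, action, reward + shaping, nxt)
    return nxt

random.seed(1)
env, agent, state, visited = Env(), RandomAgent(), 2, []
for _ in range(200):
    state = monitored_step(env, agent, state, [NoWater()])
    visited.append(state)
    if state == Env.GOAL:
        state = 2                               # restart the episode
```

The monitor guarantees the hazard is never entered during exploration, while the shaped penalty also pushes the learner toward policies that would avoid it even without the envelope.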
45. Case study: Unsafe Gridworld Environments with A2C
Enforcing invariants
Informal goal:
Reach the green square after turning the light on while avoiding water
Monitors:
• Absence: never step on the water
• Universality: the light should always be on
• Precedence: before entering a room, the light
should have been turned on in the past
• Response: if a light switch is detected and the
light is off, enforced action: turn it on
• Response: if a door is detected and the door is
closed, enforced action: open it
Temporal Logic:
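The pattern names above come from the classic specification-pattern catalogue; a plausible temporal-logic reading of the monitors (the proposition names are my own, not the paper’s) is:

```latex
\begin{aligned}
\textbf{Absence:}      &\quad \Box\,\lnot \mathit{on\_water}\\
\textbf{Universality:} &\quad \Box\,\mathit{light\_on}\\
\textbf{Precedence:}   &\quad \lnot\mathit{in\_room}\ \mathcal{W}\ \mathit{light\_on}\\
\textbf{Response:}     &\quad \Box\big(\mathit{switch\_seen}\land\lnot\mathit{light\_on}\rightarrow \mathit{turn\_on}\big)\\
                       &\quad \Box\big(\mathit{door\_seen}\land\mathit{door\_closed}\rightarrow \mathit{open}\big)
\end{aligned}
```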
46. Example of one episode of one run after the same number of steps
WiseML with A2C RL vs. A2C RL
Enforcing invariants
48. Randomly generated with safety hazards
Enforcing invariants
49. Convergence comparison of one run
Number of deaths comparison of one run
Enforcing invariants
50. Results of 3000 runs
[ALL RESULTS] https://goo.gl/FzgEdo/
[GITHUB] https://github.com/pierg/wiseml-patterns/
[DOCKER] https://hub.docker.com/r/pmallozzi/wiseml-patterns/
Never a catastrophic event
Always faster convergence
Always higher convergence rate
Enforcing invariants
• Large evaluation on randomly generated
environments of different sizes:
7x7, 9x9, 11x11, 13x13, and 15x15
• 30 gridworld environments generated for
each size
• 10 iterations for both WiseML and ClassicalML
51. • We no longer live online or offline, we live
onlife (cit. Luciano Floridi)
• We are threatened, vulnerable, and
unprotected
• Systems are increasingly able to make
autonomous decisions over and above us and
on our behalf
• Our moral rights, as well as the social,
economic and political spheres, can be
affected by the behavior of such systems
• Although unavoidable, the digital world is
becoming uncomfortable and potentially
hostile to us as human beings and as citizens
Exosoul - http://exosoul.disim.univaq.it/
52. Exoskeleton
[Architecture sketch: an Ethics layer (Ethical Actuator with a Monitor and Enforcer, ethical rules, an ethical-knob interface, and personal data) mediating I/O and internal actions, and a Privacy layer based on Active Data (operation APIs, a Monitor and Enforcer, life-cycle status, and privacy rules) mediating I/O and internal operations]
A software exoskeleton to protect and support
citizen’s ethics and privacy in the digital world
Ethics
Defining the scope for and inferring citizens'
ethical preferences
Automation
Automatically synthesizing software
exoskeletons
Privacy
Privacy managed through the notion of
active data
• P. Inverardi. 2019. The European perspective on responsible computing. Commun. ACM 62, 4
(March 2019), 64-64. DOI: https://doi.org/10.1145/3311783.
• M. Autili, D. Di Ruscio, P. Inverardi, P. Pelliccione, M. Tivoli (2019) A software exoskeleton to
protect and support citizen's ethics and privacy in the digital world. IEEE Access.
• Webpage: http://exosoul.disim.univaq.it/
• ML and AI are precious instruments for near-future smart and
autonomous systems
• However, ML and AI are not a silver bullet
• Software 2.0 is not the end of software developers and SE
• There is a need for multi- and cross-disciplinary teams working
together
• SE can help ML and AI to be a key technology for near-future
smart and autonomous systems
• SE for AI/ML
• Democratize AI and ML
• Trustworthy AI/ML
• Ethics and privacy
Main Takeaways
54. Thanks to Piergiuseppe Mallozzi for some of the slides
Patrizio Pelliccione
Associate Professor (Docent), Chalmers|GU
Associate Professor, University of L’Aquila, Italy
www.patriziopelliccione.com
Software Engineering for ML/AI
Editor's Notes
AlphaGo is the first computer program to defeat a professional human Go player, the first program to defeat a Go world champion, and arguably the strongest Go player in history.
AlphaGo’s first formal match was against the reigning 3-times European Champion, Mr Fan Hui, in October 2015. Its 5-0 win was the first ever against a Go professional, and the results were published in full technical detail in the international journal, Nature. AlphaGo then went on to compete against legendary player Mr Lee Sedol, winner of 18 world titles and widely considered to be the greatest player of the past decade.
AlphaGo's 4-1 victory in Seoul, South Korea, in March 2016 was watched by over 200 million people worldwide. It was a landmark achievement that experts agreed was a decade ahead of its time, and earned AlphaGo a 9 dan professional ranking (the highest certification) - the first time a computer Go player had ever received the accolade.
During the games, AlphaGo played a handful of highly inventive winning moves, several of which - including move 37 in game two - were so surprising they overturned hundreds of years of received wisdom, and have since been examined extensively by players of all levels. In the course of winning, AlphaGo somehow taught the world completely new knowledge about perhaps the most studied and contemplated game in history.
In March 2016 AlphaGo took on its ultimate challenge thus far: playing the legendary Lee Sedol, winner of 18 world titles, famed for his creativity and widely considered to be the greatest player of the past decade.
Over 200 million people watched online as AlphaGo emerged a surprise 4-1 victor of The Google DeepMind Challenge match in Seoul, South Korea, with the consensus among experts that this breakthrough was a decade ahead of its time. Throughout the course of the tournament, AlphaGo played a number of highly innovative moves which contradicted centuries of received Go knowledge.
In addition to prompting a new wave of creativity from Go players of all levels, who have been inspired by AlphaGo’s unconventional moves, AlphaGo also prompted Go’s popularity to surge in the west, where the game had been previously little-known and understood.
The OpenAI-programmed bot took on Dendi, a Ukrainian Dota 2 player who’s widely regarded as one of the best in the world, on the main stage at The International, the competitive video game’s biggest annual competition — and absolutely crushed him.
Each round, the teams choose from a list of roughly 100 characters, called heroes, who each have different strengths, weaknesses and special abilities, with one player controlling one hero each. They then battle over territory on a set map, killing smaller, computer-controlled units to increase their power and attempting to kill one another to give their team an advantage (think of a kill, which knocks a player’s hero out of the game for a set amount of time, like a power play in hockey). The characters have different roles, like offense, defense, and support, but the deep complexity and number of variables mean human players are often able to play a single hero in dozens of different styles, strategies, and roles.
The OpenAI bot appears to run off a modified version of a genetic machine learning algorithm, meaning it evolves and learns as it plays and discards inferior versions of its code (the company wasn’t specific with its language, but a programmer friend of mine who also plays Dota said it sounded like a genetic algorithm). The bot learned to play the game only by playing against itself. Greg Brockman, leader of the OpenAI Dota 2 team, said that in the early stages, the dueling Shadow Fiends just ran aimlessly around the map until they died. But slowly, they learned strategies that would get them closer to their programmed winning parameters, and after a few weeks and thousands and thousands of games, they became strong enough to defeat the pros. “This bot can learn from scratch in about two weeks of real time,” Brockman said.
Over the course of a 20 day competition, with 120,000 poker hands played in total and a prize pool of $200,000, Libratus defeated top human pros – all using techniques that the researchers say aren’t uniquely applicable to poker, but that could apply to a broad range of imperfect-information games in general.
Over the last weeks, I have been on the road a lot for various engagements with customers as well as for our research activities. In virtually every meeting, the whole area of artificial intelligence and especially machine and deep learning come up as a discussion topic. This is great as I think the whole AI/ML/DL area is incredibly exciting and I keep being surprised and impressed by the incredible applications and examples that make the headlines on a very regular basis.
It is clear from all the discussions that AI is at the top of the hype cycle, and this is concerning in that the expectations of what AI will deliver are perhaps inflated. Especially those not overly well versed in the concepts and underlying technology often start to expound on the fabulous opportunities that their products and services have if we just sprinkle a bit of AI dust over them.
Even worse are the cases where individuals start to talk about their expectations in terms that in no uncertain terms would require “General AI” rather than the “Narrow AI” technologies available today. Inflated expectations are not helping anyone and will only lead to disappointment.
The challenge is that we are at a stage in society, as I discussed in last week’s post, where we’re moving to the “post-intelligent design” era. In ML/DL, as humans we are building systems that are building (or rather training) systems that accomplish incredible feats, but we don’t actually know how these systems work. This is a major departure from the engineering approaches over the last decades and even the last centuries.
Over the last weeks, I have been writing about the software engineering challenges associated with building AI systems, but it seems that the key message that I am looking to get across is broader than what I have been communicating so far. The key idea is that AI is not a silver bullet. AI is not going to magically solve problems without any significant investments from our end.
For a machine or deep learning model to work well, we need accurate, clean and typically labelled training and validation data, a well designed model, iterations with several alternative designs to figure out which model performs best, reliable data pipelines to hook the model to, monitoring and logging to track model performance during operation, continuous remodeling and retraining to support continuous deployment, etc. As I’ve been outlining here, here, here and here, building production-quality ML/DL systems requires a solid engineering approach. In that sense, these technologies are tools in our toolbox rather than silver bullets that magically solve global warming, poverty, inequality and everything else that ails the world. This also goes the other way: it doesn’t make sense to blame AI for everything that goes bad in the world either.
Concluding, despite the hopes and expectations of lots of people that I meet, artificial intelligence, machine learning and deep learning are no silver bullets. These are novel technologies in our toolbox that help us solve problems that we were unable to solve earlier, or at least not solve as well. But, as the saying goes, there is no free lunch. Achieving success using AI/ML/DL requires engineering, discipline and operationalization. Although any advanced technology may seem indistinguishable from magic when looking at it from the outside, as Arthur C. Clarke – the famous science fiction writer – once quipped, on the inside it is typically heavyweight engineering. So, apply AI to your heart's content, but remember, as Thomas Edison said, that it comes dressed in overalls.
Over the last weeks, I have been on the road a lot for various engagements with customers as well as for our research activities. In virtually every meeting, the whole area of artificial intelligence and especially machine and deep learning come up as a discussion topic. This is great as I think the whole AI/ML/DL area is incredibly exciting and I keep being surprised and impressed by the incredible applications and examples that make the headlines on a very regular basis.
It is clear from all the discusses that AI is on the top of hype cycle and this is concerning in that the expectation on what AI will deliver are perhaps inflated. Especially those not overly well versed in the concept and underlying technology often start to expound on the fabulous opportunities that their products and services have if we just sprinkle a bit of AI dust over them.
Even worse are the cases where individuals start to talk about their expectations in terms that in no uncertain terms would require “General AI” rather than the “Narrow AI” technologies available today. Inflated expectations are not helping anyone and will only lead to disappointment.
The challenge is that we are at a stage in society, as I discussed in last week’s post, where we’re moving to the “post-intelligent design” era. In ML/DL, as humans we are building systems that are building (or rather training) systems that accomplish incredible feats, but we don’t actually know how these systems work. This is a major departure from the engineering approaches over the last decades and even the last centuries.
Over the last weeks, I have been writing about the software engineering challenges associated with building AI systems, but it seems that the key message that I am looking to get across is broader than what I have been communicating so far. The key idea is that AI is not a silver bullet. AI is not going to magically solve problems without any significant investments from our end.
For a machine or deep learning model to work well, we need accurate, clean and typically labelled training and validation data, a well-designed model, iterations with several alternative designs to figure out which model performs best, reliable data pipelines to hook the model to, monitoring and logging to track model performance during operation, continuous remodeling and retraining to support continuous deployment, etc. As I’ve been outlining here, here, here and here, building production-quality ML/DL systems requires a solid engineering approach. In that sense, these technologies are tools in our toolbox rather than silver bullets that magically solve global warming, poverty, inequality and everything else that ails the world. This also goes the other way: it doesn’t make sense to blame AI for everything that goes bad in the world either.
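The engineering loop described above can be sketched in a few lines. The following is a minimal, illustrative example (with synthetic data and deliberately trivial "models", not any real production pipeline): train several alternative designs, evaluate each on held-out validation data, select the best-performing one, and log its metrics for later monitoring.

```python
# Minimal sketch of the train / validate / select / log loop.
# All data and models here are synthetic placeholders.
import random

random.seed(0)

# Synthetic labelled data: y = 2*x + noise, split into train and validation.
data = [(x, 2 * x + random.uniform(-0.5, 0.5)) for x in range(100)]
random.shuffle(data)
train, valid = data[:80], data[80:]

def fit_constant(train):
    """Baseline design: always predict the mean of the training labels."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_linear(train):
    """Alternative design: least-squares line through the origin, y ~ w*x."""
    num = sum(x * y for x, y in train)
    den = sum(x * x for x, _ in train)
    w = num / den
    return lambda x: w * x

def mse(model, dataset):
    """Mean squared error on a labelled dataset."""
    return sum((model(x) - y) ** 2 for x, y in dataset) / len(dataset)

# Iterate over alternative designs and keep the one that performs best
# on the validation set; log the scores for monitoring.
candidates = {"constant": fit_constant(train), "linear": fit_linear(train)}
scores = {name: mse(m, valid) for name, m in candidates.items()}
best = min(scores, key=scores.get)
print(f"validation MSE per design: {scores}; selected: {best}")
```

In a real system the same loop recurs continuously: new data flows in through the pipelines, the candidate models are retrained, and the logged validation metrics drive redeployment decisions.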
Concluding, despite the hopes and expectations of many people that I meet, artificial intelligence, machine learning and deep learning are no silver bullets. These are novel technologies in our toolbox that help us solve problems that we were unable to solve earlier, or at least not as well. But, as the saying goes, there is no free lunch. Achieving success using AI/ML/DL requires engineering, discipline and operationalization. Although any sufficiently advanced technology may seem indistinguishable from magic when looking at it from the outside, as the famous science fiction writer Arthur C. Clarke once quipped, on the inside it is typically heavyweight engineering. So, apply AI to your heart’s content, but remember that, as Thomas Edison said of opportunity, it comes dressed in overalls.
The discount factor determines the importance of future rewards. A factor of 0 makes the agent "myopic" (or short-sighted) by only considering current rewards, while a factor approaching 1 makes it strive for long-term reward.
The learning rate or step size determines to what extent newly acquired information overrides old information. A factor of 0 makes the agent learn nothing, while a factor of 1 makes it consider only the most recent information.
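Both hyperparameters appear directly in the tabular Q-learning update rule, Q(s,a) ← Q(s,a) + α·(r + γ·max_a′ Q(s′,a′) − Q(s,a)). The toy sketch below (states, actions and rewards are made up for illustration) shows the effect of the extreme settings described above.

```python
# Tabular Q-learning update: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))
def q_update(Q, s, a, r, s_next, alpha, gamma):
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q[s][a]

# Two toy states, one action each; moving from s0 to s1 yields no immediate
# reward, but s1 already has an estimated future value of 10.
def fresh_table():
    return {"s0": {"a": 0.0}, "s1": {"a": 10.0}}

# gamma = 0: the "myopic" agent ignores the future value of s1 entirely.
myopic = q_update(fresh_table(), "s0", "a", 0.0, "s1", alpha=0.5, gamma=0.0)

# alpha = 0: the agent learns nothing; the estimate never moves.
frozen = q_update(fresh_table(), "s0", "a", 0.0, "s1", alpha=0.0, gamma=0.9)

# Moderate settings: the future reward propagates back to s0.
learned = q_update(fresh_table(), "s0", "a", 0.0, "s1", alpha=0.5, gamma=0.9)
print(myopic, frozen, learned)  # 0.0 0.0 4.5
```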
Policies that maximise the reward functions are not guaranteed to satisfy the specifications.
The goal of the game is to finish the boat race quickly and (preferably) ahead of other players
Reward hacking: the agent achieves a high score without ever finishing the course; instead, it keeps hitting the same targets along the way.
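The arithmetic behind this failure mode is easy to reproduce. The sketch below uses hypothetical point values (not the actual game's scoring) to show how a naive score-based reward can make an endlessly looping policy out-score one that actually finishes the race.

```python
# Toy illustration of reward hacking: all point values are invented.
TARGET_REWARD = 10     # points per target hit
FINISH_BONUS = 100     # points for completing the course
EPISODE_STEPS = 1000   # fixed episode length

def finishing_policy():
    # Hits 5 targets on the way and finishes the race after 50 steps.
    return 5 * TARGET_REWARD + FINISH_BONUS

def looping_policy():
    # Never finishes; circles back to hit a respawning target every 4 steps.
    return (EPISODE_STEPS // 4) * TARGET_REWARD

print(finishing_policy(), looping_policy())  # 150 2500
```

Because the score is the only signal the agent optimises, the looping behaviour is, by construction, the "better" policy: the mismatch lies between the reward function and the designer's intent, not in the learning algorithm.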
Some tasks, such as games, come with a well-defined reward function, such as the game score, which can be optimised to produce the desired behaviour. However, there are many other tasks where the “right” reward function is less clear, and optimisation of a naïvely selected one can lead to surprising results that do not match the expectations of the designer. This is particularly prevalent in continuous control tasks, such as locomotion, and it has become standard practice to carefully handcraft the reward function, or else elicit a reward function from demonstrations.
Reward functions are usually handcrafted and represent heuristics for relatively simple tasks.
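A typical handcrafted heuristic of this kind might look as follows: a sketch (with made-up goal coordinates and bonus values) of a shaped reward for a simple navigation task, where the reward is the negative distance to the goal plus a bonus on arrival. Such heuristics are easy to write for simple tasks, but rarely generalise to complex missions.

```python
# Sketch of a handcrafted, distance-based reward heuristic for navigation.
import math

GOAL = (10.0, 10.0)  # hypothetical goal position

def reward(position, goal=GOAL, arrival_bonus=100.0, tolerance=0.5):
    """Negative Euclidean distance to the goal; large bonus on arrival."""
    dist = math.dist(position, goal)
    return arrival_bonus if dist <= tolerance else -dist
```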
The results show that the agent learns much faster by using reward functions produced through our approach.
Monitors are based not on the mission but rather on properties of the robot.
Explain the properties: the simpler, the better.