ML and AI are increasingly dominating the high-tech industry. Organizations and technology companies are leveraging their big data to create new products or improve their processes to reach the next level in their market. However, ML and AI are not a silver bullet and Software 2.0 is not the end of software developers or software engineering.
In this talk I will argue how software engineering can help ML and AI become the key technology for (autonomous) systems of the near future. Software engineering best practices and achievements of the last decades might help by, e.g., (i) democratising the use of ML/AI, (ii) composing, reusing, and chaining ML/AI models to solve more complex problems, and (iii) supporting reasoning about correctness, repeatability, explainability, traceability, fairness, and ethics while building an ML/AI pipeline.
Software Engineering for ML/AI, keynote at FAS*/ICAC/SASO 2019
1. Patrizio Pelliccione
Associate Professor (Docent), Chalmers|GU
Associate Professor, University of L’Aquila, Italy
www.patriziopelliccione.com
Software Engineering for ML/AI
6. Some success stories of AI/ML
https://www.shellypalmer.com/2016/03/alphago-vs-not-fair-fight
• March 15, 2016
• AlphaGo is the first computer program to defeat
a professional human Go player
• Lee Sedol - winner of 18 world titles
• widely considered to be the greatest player of the
past decade
• Over 200 million people watched online
• 4-1 victor of The Google DeepMind Challenge
match in Seoul, South Korea
• AlphaGo played a number of highly innovative
moves which contradicted centuries of received
Go knowledge
7. • August, 2017 - OpenAI vs Dendi
• OpenAI was the first ever to defeat the world’s best players
in competitive eSports
• Dendi is a Ukrainian Dota 2 player widely
recognized as one of the best in the world
• OpenAI’s bot appears to run a genetic machine learning algorithm
• It evolves and learns as it plays and discards inferior
versions of its code
• It learns to play the game only by playing against
itself
• It can learn from scratch in about two weeks of real
time
Some success stories of AI/ML
https://research.nvidia.com/sites/default/files/pubs/2017-10_Progressive-Growing-of/karras2018iclr-paper.pdf
“This guy is really scary”
“Please stop bullying me!”
Dendi
8. • January 2018 – Libratus versus humans
• Developed by researchers at Carnegie Mellon’s
computer science department
• Defeated four top human specialist professionals in
heads-up No-limit Texas Hold’em (HUNL)
• a game where bluffing is a core, necessary
component
• HUNL is an “imperfect-information game” -
not all information about all elements in play
is available to all players at all times
• Approach
• precomputing an overall strategy,
• adapting the strategy to actual gameplay, and
• learning from its opponent
Some success stories of AI/ML
https://science.sciencemag.org/content/359/6374/418.full
9. ML and AI are not a Silver bullet
“There is no single development, in either technology or management
technique, which by itself promises even one order-of-magnitude improvement
within a decade in productivity, in reliability, in simplicity.”
Frederick P. Brooks (Turing award) No Silver Bullet Essence and Accidents of Software Engineering.
Computer 20, 4 (April 1987), 10-19. DOI: https://doi.org/10.1109/MC.1987.1663532
10. ML and AI are not a Silver bullet
• ML/AI <<can fail in unintuitive and embarrassing ways, or worse, they can
“silently fail”, e.g., by silently adopting biases in their training data…>>
Andrej Karpathy, Director of AI – Tesla
• The (Un)Known Unknown: AI Can't Analyze What It Does Not Know
• AI is not going to magically solve problems without any significant investments
from our end
• Achieving success using AI/ML requires engineering, discipline, and
operationalization
11. Software 2.0 is not the end of software developers and SE
• Software 2.0
• Andrej Karpathy, Director of AI – Tesla, Identified a fundamental paradigm shift in how we build
software
• We are building, or rather training, systems whose inner workings we do not actually know
• Software 2.0 is about finding programs through optimization, i.e., directed search using training
data as the guide
• Learning programs from data means we need more data
“Accumulating a nice, varied, large, clean dataset for all the different tasks you want to do,
and worrying about all the edge cases and massaging it is where most of the action is”
Andrej Karpathy, 2018
13. Smart decisions under uncertainty
• Annie Duke was a professional poker player: she won over $4 million in tournaments, earned a World Series
of Poker bracelet, and is the only woman to have won the WSOP Tournament of Champions and the NBC
National Heads-Up Poker Championship.
• Annie has devoted her life to the study of decision-making under pressure.
14. Smart decisions under uncertainty
With 26 seconds remaining in Super Bowl XLIX -
2015, and trailing by four points at the Patriots'
one-yard line, [Pete Carroll] called for a pass
instead of a hand off to his star running back
[Marshawn Lynch]. The pass was intercepted
and the Seattle Seahawks lost.
15. Smart decisions under uncertainty
The headlines the next day were brutal:
• USA Today: "What on Earth Was Seattle Thinking with Worst Play Call in NFL
History?"
• Washington Post: "'Worst Play-Call in Super Bowl History' Will Forever Alter
Perception of Seahawks, Patriots"
• FoxSports.com: "Dumbest Call in Super Bowl History Could Be Beginning of the
End for Seattle Seahawks"
• Seattle Times: "Seahawks Lost Because of the Worst Call in Super Bowl History"
• The New Yorker: "A Coach's Terrible Super Bowl Mistake”
But was the call really that bad?
Or did Carroll actually make a great move that was ruined by bad luck?
16. Smart decisions under uncertainty
• An interception was an extremely unlucky outcome
• During the season, none of the sixty-six passes thrown
in that situation had been intercepted
• In the previous fifteen seasons, the interception rate in similar
situations was about 2%
• Pete Carroll got unlucky
• He made a good-quality decision that got a bad result
17. Smart decisions under uncertainty
<<Pete Carroll was a victim of our tendency to equate the quality of a decision with
the quality of its outcome. Poker players have a word for this: “resulting.”>>
18. Smart decisions under uncertainty
• We are very bad at separating luck from skill
• Results can be beyond our control
• The connection between results and the quality of
the decisions preceding them is not so strong
19. Smart decisions under uncertainty
Learning from mistakes, adjusting, and acting: a good
idea, but be careful about the quality of the feedback!
Be aware of the dangers of “resulting”!
21. ML/AI for Software Engineering
• Exploiting ML/AI for facilitating and
enhancing SE activities
• ICSE 2019, technical track papers
• Roughly 22% of the accepted papers
exploit AI/ML for SE activities
22. ML/AI for Software Engineering
•ICSE-SEIP 2019 best paper award
Software Engineering for Machine Learning: A Case Study
Saleema Amershi - Microsoft, Andrew Begel - Microsoft Research, Christian Bird
- Microsoft Research, Rob DeLine – Microsoft Research, Harald Gall - University of
Zurich, Ece Kamar - Microsoft, Nachiappan Nagappan - Microsoft
Research, Besmira Nushi – Microsoft Research, Thomas Zimmermann - Microsoft
Research
23. ML/AI for Software Engineering: AI at Microsoft
Goal
• Create a “map” of AI at Microsoft
• Identify best practices across teams and products using AI
• Discover research opportunities
Method
• 14 interviews
• 551 responses
24. ML/AI for Software Engineering: AI at Microsoft
Map of AI at Microsoft
• Microsoft puts AI in everything
• Many different AI algorithms in use
• 159 tools in use for ML
• AI and data scientists are a recent addition to
the team
• Data scientists bring their own workflow
25. ML/AI for Software Engineering: AI at Microsoft
Common Challenges
End-to-end tool fragmentation
Data collection and cleaning is arduous
Traditionally developers focus on code, not data
Low experience
• Education and training
• Integrating AI into larger systems
High experience
• Tools
• Scalability
• Educating others
• Model evolution, evaluation, and deployment
26. ML/AI for Software Engineering: AI at Microsoft
Best practices for ML
• ML tools need to be better integrated into the ML workflow
and the workflow needs to be automated
• Center development around data
• Use simple, explainable, and composable ML models
• Carefully design test sets and human-in-the-loop evaluation
• Do not decouple model building from the rest of software
27. Software Engineering for ML/AI
•ICSE 2019, technical track papers
• 3 papers trying to improve AI/ML algorithms
• deep learning software testing
• adversarial sample detection
28. • Software engineering can help ML and AI to become the key technology for
(autonomous) systems of the near future
• Software engineering best practices and achievements reached in the last
decades might help
• Democratizing the use of ML/AI
• Trustworthy ML/AI
• Ethics and privacy
Software Engineering for ML/AI
29. Democratize ML/AI
• Tools can be used in fields such as healthcare, education, manufacturing, retail, etc.
• Sharing AI’s power with the masses, allowing anyone and everyone to build the AI systems
they need
Putting the tools [ML/AI] “…in the hands of every developer, every
organization, every public sector organization around the world”
Satya Nadella (CEO of Microsoft) at the DLD (Digital-Life-Design)
conference in Munich, 2017
https://news.microsoft.com/europe/2017/01/17/democratizing-ai-satya-nadella-shares-vision-at-dld/
30. Democratize ML/AI
• Create building blocks that contain and embed ML/AI functionalities
Some Microsoft solutions
https://azure.microsoft.com/en-us/overview/ai-platform/
• Knowledge mining: uncover latent insights from all your content
• Machine Learning: quickly and easily build, train, deploy, and manage your models
• AI apps and agents: deliver breakthrough experiences in your apps
Some Google solutions
https://medium.com/vickdata/how-google-are-democratising-ai-7d47a07a7307
• BigQuery ML: enables users to create and execute machine learning models in BigQuery by using standard SQL queries
• Cloud AutoML: includes AutoML Vision for image classification, AutoML Natural Language for text classification, and AutoML Translation
• An end-to-end open source machine learning platform designed to simplify large-scale deep learning
31. Democratize ML/AI
some other initiatives
H2O.ai — https://www.h2o.ai/democratizing-ai/
• A movement to democratize AI for everyone
• Open source ML platform
• Increasing transparency, accountability, and trustworthiness in AI
Peltarion — https://peltarion.com
• Easy-to-use cloud-based operational AI platform to build and deploy deep learning models
• Integrated visual development environment for deep learning
32. Democratize ML/AI - challenges
• Package AI solutions in customizable and tunable
solutions
• Make clear the assumptions, properties, and limitations of
the solutions
• Support the composition of simple solutions into more
complex solutions
• Wizards to guide end-users in selecting the best solution
for the specific problem
• Guide the selection and creation of training data
• Guide the verification and validation of the defined
solution
• Easy to use
SE knowledge that can help:
• Component-based software engineering
• Design-by-contract
• V&V approaches
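As a sketch of how design-by-contract could package an ML solution with explicit assumptions, the wrapper below rejects inputs that violate a declared schema instead of failing silently. The component name, schema format, and the stand-in model are all hypothetical, not an existing API:

```python
# Design-by-contract applied to a packaged ML component: the wrapper makes
# the model's assumptions (expected features, value ranges) explicit and
# checks them before prediction, instead of silently accepting bad inputs.
class ContractViolation(Exception):
    pass

class MLComponent:
    """Wraps any object with a .predict(features) method behind a contract."""
    def __init__(self, model, schema):
        self.model = model
        self.schema = schema  # feature name -> (min, max) allowed range

    def predict(self, features):
        # Precondition: exactly the declared features, each within range.
        if set(features) != set(self.schema):
            raise ContractViolation(f"expected features {sorted(self.schema)}")
        for name, (lo, hi) in self.schema.items():
            if not lo <= features[name] <= hi:
                raise ContractViolation(f"{name}={features[name]} outside [{lo}, {hi}]")
        return self.model.predict(features)

# Usage with a trivial stand-in model:
class Threshold:
    def predict(self, f):
        return "high" if f["temperature"] > 30 else "normal"

component = MLComponent(Threshold(), {"temperature": (-40, 60), "humidity": (0, 100)})
```

Because the contract is explicit, composing such components (or building a selection wizard over them) can check compatibility mechanically rather than by reading documentation.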
35. Traditional Software vs ML
Traditional software: pre-engineering all behaviours at design time
Machine learning: learning the behaviours at run-time
https://xkcd.com/1838/
36. Reinforcement learning
The good:
• No prior knowledge of the
environment is needed
• Can continuously learn and adapt
[RL loop: the agent in state t takes action t, gets a reward, and moves to state t+1]
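The agent-environment loop above can be sketched with tabular Q-learning on a toy corridor environment; all names and hyperparameters below are illustrative assumptions, not from the talk:

```python
import random

# Minimal agent-environment loop (tabular Q-learning) on a toy 5-cell
# corridor: the agent starts in cell 0 and gets reward 1.0 on reaching cell 4.
N_STATES, ACTIONS = 5, (-1, +1)    # actions: step left / step right
Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
ALPHA, GAMMA, EPS = 0.5, 0.9, 0.1  # learning rate, discount, exploration rate

def env_step(state, action):
    """Environment: returns (next state, reward)."""
    nxt = max(0, min(N_STATES - 1, state + action))
    return nxt, (1.0 if nxt == N_STATES - 1 else 0.0)

def greedy(state):
    # Greedy action with random tie-breaking.
    best = max(Q[(state, a)] for a in ACTIONS)
    return random.choice([a for a in ACTIONS if Q[(state, a)] == best])

random.seed(0)
for _ in range(500):        # episodes: no prior model of the environment
    s = 0
    for _ in range(200):    # step cap per episode
        a = random.choice(ACTIONS) if random.random() < EPS else greedy(s)
        s2, r = env_step(s, a)
        # Q-learning update: move Q(s,a) toward r + gamma * max_a' Q(s',a')
        Q[(s, a)] += ALPHA * (r + GAMMA * max(Q[(s2, a2)] for a2 in ACTIONS) - Q[(s, a)])
        s = s2
        if s == N_STATES - 1:
            break

# The learned greedy policy steps right (+1) in every non-terminal cell.
policy = [greedy(s) for s in range(N_STATES - 1)]
```

Nothing about the corridor is known in advance; the behaviour emerges purely from the reward signal, which is exactly why the reward function carries so much design weight.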
37. How to convey the goals to a Reinforcement Learning (RL) agent?
RL - Reward Function
• No knowledge about the
environment
• It learns the optimal policy to
play the game
• The reward function is simply
the score of the game
(or a function of the score)
P. Mallozzi, R. Pardo, V. Duplessis, P. Pelliccione, and G. Schneider “MoVEMo - A structured approach for engineering
reward functions” in International Conference on Robotic Computing (IRC), Laguna Hills (CA), 2018, IEEE
38. RL - Reward Hacking
Policies that maximise the reward functions
are not guaranteed to satisfy the
specifications.
• Reward hacking:
• Informal goal: complete the race
• Conveyed goal: hit as many targets as possible to get
more points
• Wrong assumption:
the score of the game conveys the implicit goal
of finishing the race
P. Mallozzi, R. Pardo, V. Duplessis, P. Pelliccione, and G. Schneider “MoVEMo - A structured approach for
engineering reward functions” in International Conference on Robotic Computing (IRC), Laguna Hills (CA), 2018, IEEE
39. How do we trust that a reward
function conveys the right goals?
• Motivation
• Trivial reward functions do not always work
• Real world applications typically involve complex
tasks
• The goals of the system are encoded in the reward
function
• Reward functions are typically handcrafted
• Problem
• Policies that maximize the reward functions are not
guaranteed to satisfy the specifications (Reward
hacking)
40. How do we trust that a reward
function conveys the right goals?
Solution: Engineering the reward function!
41. MoVEMo - A structured approach for
engineering reward functions
1. Model complex reward functions as a
network of state machines
2. Formally verify the correctness of the
reward model
3. Automatically enforce the reward model
to the RL agent at runtime
4. Monitor the behaviour of the agent as it
traverses the state machines to collect
the rewards
Results: improvement in the
achievement of the goals in less
time/episodes
P. Mallozzi, R. Pardo, V. Duplessis, P. Pelliccione, and G. Schneider “MoVEMo - A structured approach for engineering
reward functions” in International Conference on Robotic Computing (IRC), Laguna Hills (CA), 2018, IEEE
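MoVEMo’s four steps can be illustrated with a toy reward function expressed as an explicit state machine. The paper models rewards as UPPAAL timed automata; this plain-Python sketch is illustrative only, and the states, events, and reward values are hypothetical:

```python
# A reward function modeled as an explicit state machine: transitions are a
# declared table rather than ad-hoc code, so properties such as "the finish
# reward is reachable" can be checked mechanically before training.
class RewardMachine:
    # (current state, observed event) -> (next state, reward)
    TRANSITIONS = {
        ("driving", "checkpoint"):   ("driving",   1.0),
        ("driving", "off_track"):    ("off_track", -5.0),
        ("off_track", "recovered"):  ("driving",   0.0),
        ("driving", "finish_line"):  ("finished",  100.0),
    }

    def __init__(self):
        self.state = "driving"

    def reward(self, event):
        """Advance the machine on an event; unknown events yield 0 reward."""
        nxt, r = self.TRANSITIONS.get((self.state, event), (self.state, 0.0))
        self.state = nxt
        return r
```

Note that an agent that goes off track cannot collect the finish reward until it recovers, which is the kind of constraint that a raw game score (and hence a reward-hacking policy) would miss.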
42. MoVEMo: results
• We have compared our UPPAAL reward model (URM)
with a BRF (benchmark reward function) proposed in [4]
• This BRF had already been shown to improve on the original
reward function proposed by Google in [5]
• By building more complex and robust reward functions, the
agent can learn to achieve its goal faster.
[4] “Using Keras and Deep Deterministic Policy Gradient to play TORCS,”
https://github.com/yanpanlau/DDPG-Keras-Torcs
[5] T. P. Lillicrap, J. J. Hunt, A. Pritzel, N. Heess, T. Erez, Y. Tassa, D. Silver, and D. Wierstra,
“Continuous control with deep reinforcement learning”
P. Mallozzi, R. Pardo, V. Duplessis, P. Pelliccione, and G. Schneider “MoVEMo - A structured approach for
engineering reward functions” in International Conference on Robotic Computing (IRC), Laguna Hills (CA), 2018, IEEE
Results
• Average results of 100 iterations.
• To complete one iteration, the agent must learn how to
drive on the track and complete 20 laps.
• An episode of the algorithm ends when the vehicle is
perpendicular to the track axis.
• The Uppaal model contains 6 automata with a total of 21 states.
43. How do we ensure that the RL agent does
not violate important properties?
P.Mallozzi, E.G.Castellano, P.Pelliccione, G.Schneider, K.Tei (2019) A Runtime Monitoring Framework to Enforce Invariants on Reinforcement Learning Agents Exploring
Complex Environments In: RoSE 2019 : 2nd International Workshop on Robotics Software Engineering (RoSE’19).
• Proposed approach:
• Runtime verification (monitoring) to create a
safety envelope to control the ML agent
• Safety case
• A structured written argument, supported by
evidence, justifying that the system is acceptably
safe for its intended use
• Safety reason 1 / evidence for reason 1
• Safety reason 2 / evidence for reason 2
• …
• Invariants elicited from safety reasons
and evidence
44. How do we ensure that the RL agent does
not violate important properties?
1. modeling safety-critical requirements in
terms of invariants
2. monitoring the agent as it performs
actions freely in the environment
3. enforcing a safe behaviour of the agent
when it is about to violate the
requirements
4. shaping the reward of the agent so that
it learns to avoid hazardous situations in
the future and it converges faster to its goal
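A minimal sketch of steps 1–4 above, on a toy 1-D corridor with a water hazard. The environment, the invariant class, and the penalty value are my illustrative assumptions, not the WiseML implementation:

```python
import random

class Env:
    """1-D corridor, cells 0..4; water hazard at cell 1, goal at cell 4."""
    WATER, GOAL = 1, 4
    def step(self, state, action):              # action in {-1, +1}
        nxt = max(0, min(4, state + action))
        return nxt, (1.0 if nxt == self.GOAL else 0.0)

class NoWater:
    """Invariant (step 1): never step onto the water cell."""
    def allows(self, state, action):
        return state + action != Env.WATER
    def safe_action(self, state):
        return +1 if state > Env.WATER else -1  # step away from the water

class RandomAgent:
    """Stand-in learner that records shaped experience tuples."""
    def __init__(self):
        self.history = []
    def propose(self, state):
        return random.choice((-1, +1))
    def learn(self, s, a, r, s2):
        self.history.append((s, a, r, s2))

def monitored_step(env, agent, state, invariants, penalty=-1.0):
    action = agent.propose(state)
    shaping = 0.0
    for inv in invariants:                      # step 2: monitor the action
        if not inv.allows(state, action):
            action = inv.safe_action(state)     # step 3: enforce a safe action
            shaping = penalty                   # step 4: shape reward to teach avoidance
            break
    nxt, reward = env.step(state, action)
    agent.learn(state, action, reward + shaping, nxt)
    return nxt

random.seed(1)
env, agent, state, visited = Env(), RandomAgent(), 2, []
for _ in range(200):
    state = monitored_step(env, agent, state, [NoWater()])
    visited.append(state)
    if state == Env.GOAL:
        state = 2                               # restart the episode
```

The monitor guarantees the hazard is never entered during exploration, while the shaped penalty also pushes the learner toward policies that would avoid it even without the envelope.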
45. Case study: Unsafe Gridworld Environments with A2C
Enforcing invariants
Informal goal:
Reach the green square after turning the light on while avoiding water
Monitors:
• Absence: never step on the water
• Universality: the light should always be on
• Precedence: before entering a room, the light
should have been turned on in the past
• Response: if a light switch is detected and the
light is off, enforced action: turn it on
• Response: if a door is detected and the door is
closed, enforced action: open it
Temporal Logic:
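The pattern names above come from the classic specification-pattern catalogue; a plausible temporal-logic reading of the monitors (the proposition names are my own, not the paper’s) is:

```latex
\begin{aligned}
\textbf{Absence:}      &\quad \Box\,\lnot \mathit{on\_water}\\
\textbf{Universality:} &\quad \Box\,\mathit{light\_on}\\
\textbf{Precedence:}   &\quad \lnot\mathit{in\_room}\ \mathcal{W}\ \mathit{light\_on}\\
\textbf{Response:}     &\quad \Box\big(\mathit{switch\_seen}\land\lnot\mathit{light\_on}\rightarrow \mathit{turn\_on}\big)\\
                       &\quad \Box\big(\mathit{door\_seen}\land\mathit{door\_closed}\rightarrow \mathit{open}\big)
\end{aligned}
```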
46. Example of one episode of one run after the same number of steps
WiseML with A2C RL vs. A2C RL
Enforcing invariants
48. Randomly generated with safety hazards
Enforcing invariants
49. Convergence comparison of one run
Number of deaths comparison of one run
Enforcing invariants
50. Results of 3000 runs
[ALL RESULTS] https://goo.gl/FzgEdo/
[GITHUB] https://github.com/pierg/wiseml-patterns/
[DOCKER] https://hub.docker.com/r/pmallozzi/wiseml-patterns/
Never a catastrophic event
Always faster convergence
Always higher convergence rate
Enforcing invariants
• Large evaluation on randomly generated
environments of different sizes:
7x7, 9x9, 11x11, 13x13, and 15x15
• 30 gridworld environments generated for
each size
• 10 iterations for both WiseML and ClassicalML
51. • We no longer live online or offline, we live
onlife (cit. Luciano Floridi)
• We are threatened, vulnerable, and
unprotected
• Systems are increasingly able to make
autonomous decisions over and above us and
on our behalf
• Our moral rights, as well as the social,
economic and political spheres, can be
affected by the behavior of such systems
• Although unavoidable, the digital world is
becoming uncomfortable and potentially
hostile to us as human beings and as citizens
Exosoul - http://exosoul.disim.univaq.it/
52. Exoskeleton
[Architecture sketch: an Ethics layer (Ethical Actuator with a Monitor and Enforcer, ethical rules, an ethical-knob interface, and personal data) mediating I/O and internal actions, and a Privacy layer based on Active Data (operation APIs, a Monitor and Enforcer, life-cycle status, and privacy rules) mediating I/O and internal operations]
A software exoskeleton to protect and support
citizen’s ethics and privacy in the digital world
Ethics
Defining the scope for and inferring citizens'
ethical preferences
Automation
Automatically synthesizing software
exoskeletons
Privacy
Privacy managed through the notion of
active data
• P. Inverardi. 2019. The European perspective on responsible computing. Commun. ACM 62, 4
(March 2019), 64-64. DOI: https://doi.org/10.1145/3311783.
• M. Autili, D. Di Ruscio, P. Inverardi, P. Pelliccione, M. Tivoli (2019) A software exoskeleton to
protect and support citizen's ethics and privacy in the digital world. IEEE Access.
• Webpage: http://exosoul.disim.univaq.it/
• ML and AI are precious instruments for near-future smart and
autonomous systems
• However, ML and AI are not a silver bullet
• Software 2.0 is not the end of software developers and SE
• There is a need for multi- and cross-disciplinary teams working
together
• SE can help ML and AI to be a key technology for near-future
smart and autonomous systems
• SE for AI/ML
• Democratize AI and ML
• Trustworthy AI/ML
• Ethics and privacy
Main Takeaways
54. Thanks to Piergiuseppe Mallozzi for some of the slides
Patrizio Pelliccione
Associate Professor (Docent), Chalmers|GU
Associate Professor, University of L’Aquila, Italy
www.patriziopelliccione.com
Software Engineering for ML/AI
Editor's Notes
AlphaGo is the first computer program to defeat a professional human Go player, the first program to defeat a Go world champion, and arguably the strongest Go player in history.
AlphaGo’s first formal match was against the reigning 3-times European Champion, Mr Fan Hui, in October 2015. Its 5-0 win was the first ever against a Go professional, and the results were published in full technical detail in the international journal, Nature. AlphaGo then went on to compete against legendary player Mr Lee Sedol, winner of 18 world titles and widely considered to be the greatest player of the past decade.
AlphaGo's 4-1 victory in Seoul, South Korea, in March 2016 was watched by over 200 million people worldwide. It was a landmark achievement that experts agreed was a decade ahead of its time, and earned AlphaGo a 9 dan professional ranking (the highest certification) - the first time a computer Go player had ever received the accolade.
During the games, AlphaGo played a handful of highly inventive winning moves, several of which - including move 37 in game two - were so surprising they overturned hundreds of years of received wisdom, and have since been examined extensively by players of all levels. In the course of winning, AlphaGo somehow taught the world completely new knowledge about perhaps the most studied and contemplated game in history.
In March 2016 AlphaGo took on its ultimate challenge thus far: playing the legendary Lee Sedol, winner of 18 world titles, famed for his creativity and widely considered to be the greatest player of the past decade.
Over 200 million people watched online as AlphaGo emerged a surprise 4-1 victor of The Google DeepMind Challenge match in Seoul, South Korea, with the consensus among experts that this breakthrough was a decade ahead of its time. Throughout the course of the tournament, AlphaGo played a number of highly innovative moves which contradicted centuries of received Go knowledge.
In addition to prompting a new wave of creativity from Go players of all levels, who have been inspired by AlphaGo’s unconventional moves, AlphaGo also prompted Go’s popularity to surge in the west, where the game had been previously little-known and understood.
The OpenAI-programmed bot took on Dendi, a Ukrainian Dota 2 player who’s widely regarded as one of the best in the world, on the main stage at The International, the competitive video game’s biggest annual competition — and absolutely crushed him.
Each round, the teams choose from a list of roughly 100 characters, called heroes, who each have different strengths, weaknesses and special abilities, with one player controlling one hero each. They then battle over territory on a set map, killing smaller, computer-controlled units to increase their power and attempting to kill one another to give their team an advantage (think of a kill, which knocks a player’s hero out of the game for a set amount of time, like a power play in hockey). The characters have different roles, like offense, defense, and support, but the deep complexity and number of variables mean human players are often able to play a single hero in dozens of different styles, strategies, and roles.
The OpenAI bot appears to run off a modified version of a genetic machine learning algorithm, meaning it evolves and learns as it plays and discards inferior versions of its code (the company wasn’t specific with its language, but a programmer friend of mine who also plays Dota said it sounded like a genetic algorithm). The bot learned to play the game only by playing against itself. Greg Brockman, leader of the OpenAI Dota 2 team, said that in the early stages, the dueling Shadow Fiends just ran aimlessly around the map until they died. But slowly, they learned strategies that would get them closer to their programmed winning parameters, and after a few weeks and thousands and thousands of games, they became strong enough to defeat the pros. “This bot can learn from scratch in about two weeks of real time,” Brockman said.
Over the course of a 20 day competition, with 120,000 poker hands played in total and a prize pool of $200,000, Libratus defeated top human pros – all using techniques that the researchers say aren’t uniquely applicable to poker, but that could apply to a broad range of imperfect-information games in general.
Over the last weeks, I have been on the road a lot for various engagements with customers as well as for our research activities. In virtually every meeting, the whole area of artificial intelligence and especially machine and deep learning come up as a discussion topic. This is great as I think the whole AI/ML/DL area is incredibly exciting and I keep being surprised and impressed by the incredible applications and examples that make the headlines on a very regular basis.
It is clear from all the discussions that AI is at the top of the hype cycle, and this is concerning in that the expectations of what AI will deliver are perhaps inflated. Especially those not overly well versed in the concepts and underlying technology often start to expound on the fabulous opportunities that their products and services have if we just sprinkle a bit of AI dust over them.
Even worse are the cases where individuals start to talk about their expectations in terms that in no uncertain terms would require “General AI” rather than the “Narrow AI” technologies available today. Inflated expectations are not helping anyone and will only lead to disappointment.
The challenge is that we are at a stage in society, as I discussed in last week’s post, where we’re moving to the “post-intelligent design” era. In ML/DL, as humans we are building systems that are building (or rather training) systems that accomplish incredible feats, but we don’t actually know how these systems work. This is a major departure from the engineering approaches over the last decades and even the last centuries.
Over the last weeks, I have been writing about the software engineering challenges associated with building AI systems, but it seems that the key message that I am looking to get across is broader than what I have been communicating so far. The key idea is that AI is not a silver bullet. AI is not going to magically solve problems without any significant investments from our end.
For a machine or deep learning model to work well, we need accurate, clean and typically labelled training and validation data, a well designed model, iterations with several alternative designs to figure out which model performs best, reliable data pipelines to hook the model to, monitoring and logging to track model performance during operation, continuous remodeling and retraining to support continuous deployment, etc. As I’ve been outlining here, here, here and here, building production-quality ML/DL systems requires a solid engineering approach. In that sense, these technologies are tools in our toolbox rather than silver bullets that magically solve global warming, poverty, inequality and everything else that ails the world. This also goes the other way: it doesn’t make sense to blame AI for everything that goes bad in the world either.
Concluding, despite the hopes and expectations of lots of people that I meet, artificial intelligence, machine learning and deep learning are no silver bullets. These are novel technologies in our toolbox that help us solve problems that we were unable to solve earlier, or at least not solve as well. But, as the saying goes, there is no free lunch. Achieving success using AI/ML/DL requires engineering, discipline and operationalization. Although any advanced technology may seem indistinguishable from magic when looking at it from the outside, as Arthur C. Clarke – the famous science fiction writer – once quipped, on the inside it is typically heavyweight engineering. So, apply AI to your heart's content, but remember, as Thomas Edison said, that it comes dressed in overalls.
Over the last weeks, I have been on the road a lot for various engagements with customers as well as for our research activities. In virtually every meeting, the whole area of artificial intelligence and especially machine and deep learning come up as a discussion topic. This is great as I think the whole AI/ML/DL area is incredibly exciting and I keep being surprised and impressed by the incredible applications and examples that make the headlines on a very regular basis.
It is clear from all the discusses that AI is on the top of hype cycle and this is concerning in that the expectation on what AI will deliver are perhaps inflated. Especially those not overly well versed in the concept and underlying technology often start to expound on the fabulous opportunities that their products and services have if we just sprinkle a bit of AI dust over them.
Even worse are the cases where individuals start to talk about their expectations in terms that in no uncertain terms would require “General AI” rather than the “Narrow AI” technologies available today. Inflated expectations are not helping anyone and will only lead to disappointment.
The challenge is that we are at a stage in society, as I discussed in last week’s post, where we’re moving to the “post-intelligent design” era. In ML/DL, as humans we are building systems that are building (or rather training) systems that accomplish incredible feats, but we don’t actually know how these systems work. This is a major departure from the engineering approaches over the last decades and even the last centuries.
Over the last weeks, I have been writing about the software engineering challenges associated with building AI systems, but it seems that the key message that I am looking to get across is broader than what I have been communicating so far. The key idea is that AI is not a silver bullet. AI is not going to magically solve problems without any significant investments from our end.
For a machine or deep learning model to work well, we need accurate, clean and typically labelled training and validation data, a well-designed model, iterations with several alternative designs to figure out which model performs best, reliable data pipelines to hook the model to, monitoring and logging to track model performance during operation, continuous remodeling and retraining to support continuous deployment, etc. As I’ve been outlining here, here, here and here, building production-quality ML/DL systems requires a solid engineering approach. In that sense, these technologies are tools in our toolbox rather than silver bullets that magically solve global warming, poverty, inequality and everything else that ails the world. This also goes the other way: it doesn’t make sense to blame AI for everything that goes bad in the world either.
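The engineering loop described above can be sketched in a few lines. The following is a minimal, illustrative example (with synthetic data and deliberately trivial "models", not any real production pipeline): train several alternative designs, evaluate each on held-out validation data, select the best-performing one, and log its metrics for later monitoring.

```python
# Minimal sketch of the train / validate / select / log loop.
# All data and models here are synthetic placeholders.
import random

random.seed(0)

# Synthetic labelled data: y = 2*x + noise, split into train and validation.
data = [(x, 2 * x + random.uniform(-0.5, 0.5)) for x in range(100)]
random.shuffle(data)
train, valid = data[:80], data[80:]

def fit_constant(train):
    """Baseline design: always predict the mean of the training labels."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_linear(train):
    """Alternative design: least-squares line through the origin, y ~ w*x."""
    num = sum(x * y for x, y in train)
    den = sum(x * x for x, _ in train)
    w = num / den
    return lambda x: w * x

def mse(model, dataset):
    """Mean squared error on a labelled dataset."""
    return sum((model(x) - y) ** 2 for x, y in dataset) / len(dataset)

# Iterate over alternative designs and keep the one that performs best
# on the validation set; log the scores for monitoring.
candidates = {"constant": fit_constant(train), "linear": fit_linear(train)}
scores = {name: mse(m, valid) for name, m in candidates.items()}
best = min(scores, key=scores.get)
print(f"validation MSE per design: {scores}; selected: {best}")
```

In a real system the same loop recurs continuously: new data flows in through the pipelines, the candidate models are retrained, and the logged validation metrics drive redeployment decisions.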
Concluding, despite the hopes and expectations of many people that I meet, artificial intelligence, machine learning and deep learning are no silver bullets. These are novel technologies in our toolbox that help us solve problems that we were unable to solve earlier, or at least not as well. But, as the saying goes, there is no free lunch. Achieving success using AI/ML/DL requires engineering, discipline and operationalization. Although any sufficiently advanced technology may seem indistinguishable from magic when looking at it from the outside, as the famous science fiction writer Arthur C. Clarke once quipped, on the inside it is typically heavyweight engineering. So, apply AI to your heart’s content, but remember that, as Thomas Edison said of opportunity, it comes dressed in overalls.
The discount factor determines the importance of future rewards. A factor of 0 makes the agent "myopic" (or short-sighted) by only considering current rewards, while a factor approaching 1 makes it strive for long-term reward.
The learning rate or step size determines to what extent newly acquired information overrides old information. A factor of 0 makes the agent learn nothing, while a factor of 1 makes it consider only the most recent information.
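Both hyperparameters appear directly in the tabular Q-learning update rule, Q(s,a) ← Q(s,a) + α·(r + γ·max_a′ Q(s′,a′) − Q(s,a)). The toy sketch below (states, actions and rewards are made up for illustration) shows the effect of the extreme settings described above.

```python
# Tabular Q-learning update: Q(s,a) += alpha * (r + gamma * max Q(s',.) - Q(s,a))
def q_update(Q, s, a, r, s_next, alpha, gamma):
    best_next = max(Q[s_next].values()) if Q[s_next] else 0.0
    Q[s][a] += alpha * (r + gamma * best_next - Q[s][a])
    return Q[s][a]

# Two toy states, one action each; moving from s0 to s1 yields no immediate
# reward, but s1 already has an estimated future value of 10.
def fresh_table():
    return {"s0": {"a": 0.0}, "s1": {"a": 10.0}}

# gamma = 0: the "myopic" agent ignores the future value of s1 entirely.
myopic = q_update(fresh_table(), "s0", "a", 0.0, "s1", alpha=0.5, gamma=0.0)

# alpha = 0: the agent learns nothing; the estimate never moves.
frozen = q_update(fresh_table(), "s0", "a", 0.0, "s1", alpha=0.0, gamma=0.9)

# Moderate settings: the future reward propagates back to s0.
learned = q_update(fresh_table(), "s0", "a", 0.0, "s1", alpha=0.5, gamma=0.9)
print(myopic, frozen, learned)  # 0.0 0.0 4.5
```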
Policies that maximise the reward functions are not guaranteed to satisfy the specifications.
The goal of the game is to finish the boat race quickly and (preferably) ahead of other players
Reward hacking: the agent achieves a high score without ever finishing the course; instead, it keeps hitting the same targets along the way.
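The arithmetic behind this failure mode is easy to reproduce. The sketch below uses hypothetical point values (not the actual game's scoring) to show how a naive score-based reward can make an endlessly looping policy out-score one that actually finishes the race.

```python
# Toy illustration of reward hacking: all point values are invented.
TARGET_REWARD = 10     # points per target hit
FINISH_BONUS = 100     # points for completing the course
EPISODE_STEPS = 1000   # fixed episode length

def finishing_policy():
    # Hits 5 targets on the way and finishes the race after 50 steps.
    return 5 * TARGET_REWARD + FINISH_BONUS

def looping_policy():
    # Never finishes; circles back to hit a respawning target every 4 steps.
    return (EPISODE_STEPS // 4) * TARGET_REWARD

print(finishing_policy(), looping_policy())  # 150 2500
```

Because the score is the only signal the agent optimises, the looping behaviour is, by construction, the "better" policy: the mismatch lies between the reward function and the designer's intent, not in the learning algorithm.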
Some tasks, such as games, come with a well-defined reward function, such as the game score, which can be optimised to produce the desired behaviour. However, there are many other tasks where the “right” reward function is less clear, and optimisation of a naïvely selected one can lead to surprising results that do not match the expectations of the designer. This is particularly prevalent in continuous control tasks, such as locomotion, and it has become standard practice to carefully handcraft the reward function, or else elicit a reward function from demonstrations.
Reward functions are usually handcrafted and represent heuristics for relatively simple tasks.
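A typical handcrafted heuristic of this kind might look as follows: a sketch (with made-up goal coordinates and bonus values) of a shaped reward for a simple navigation task, where the reward is the negative distance to the goal plus a bonus on arrival. Such heuristics are easy to write for simple tasks, but rarely generalise to complex missions.

```python
# Sketch of a handcrafted, distance-based reward heuristic for navigation.
import math

GOAL = (10.0, 10.0)  # hypothetical goal position

def reward(position, goal=GOAL, arrival_bonus=100.0, tolerance=0.5):
    """Negative Euclidean distance to the goal; large bonus on arrival."""
    dist = math.dist(position, goal)
    return arrival_bonus if dist <= tolerance else -dist
```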
The results show that the agent learns much faster by using reward functions produced through our approach.
Monitors are based not on the mission but rather on properties of the robot.
Explain the properties: the simpler, the better.