1. Build an AI Auto Mechanic Using Reinforcement Learning
2. Disclaimer
All views expressed in this project are my own and do not represent the opinions of any entity with which I have been, am now, or will be affiliated.
This presentation does not dive deep into the technical methodology of reinforcement learning. Anyone interested in learning more should refer to "Reinforcement Learning: An Introduction" by Richard S. Sutton and Andrew G. Barto.
3. What is reinforcement learning?
A very simplified explanation using an example: imagine you are trying to build an AI agent that uses reinforcement learning (RL) to play "Space Invaders". You give it a set of action buttons, the live image, and the score from the game; the agent has no access to any of the game's back-end systems.
The agent knows there is a reward it has to maximize, but it has no idea what each button does or what each image represents. It therefore acts like a child exposed to a video game for the first time: it presses actions at random in the beginning and gradually learns to select the correct action at the right time to achieve the highest score.
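The trial-and-error loop described above can be sketched with tabular Q-learning on a toy environment. This is a deliberately simplified illustration of the idea, not the architecture used in this project:

```python
import random

# Toy environment: a 5-state corridor. The agent starts at state 0; moving
# right (+1) toward state 4 earns a reward of 1, while stepping left off the
# corridor ends the episode with no reward. Like the child at the video game,
# the agent acts randomly and learns which action is best from reward alone.
N_STATES, GOAL = 5, 4
ACTIONS = [-1, +1]

def train(episodes=500, alpha=0.1, gamma=0.9, seed=0):
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}
    for _ in range(episodes):
        s = 0
        while True:
            a = rng.choice(ACTIONS)          # explore: press buttons at random
            s2 = s + a
            if s2 < 0:                       # walked off the left end: no reward
                q[(s, a)] += alpha * (0.0 - q[(s, a)])
                break
            r = 1.0 if s2 == GOAL else 0.0
            # Q-learning update: move Q(s, a) toward reward + discounted best
            # future value; this is how reward information propagates backward.
            target = r + (0.0 if s2 == GOAL else gamma * max(q[(s2, x)] for x in ACTIONS))
            q[(s, a)] += alpha * (target - q[(s, a)])
            if s2 == GOAL:
                break
            s = s2
    return q

q = train()
# The greedy policy extracted from the learned values prefers +1 in every state.
policy = [max(ACTIONS, key=lambda a: q[(s, a)]) for s in range(GOAL)]
print(policy)
```

Even though the behavior here is purely random, Q-learning is off-policy, so the learned values still converge toward the best action in each state.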
4. Case Study
The Air Pressure System (APS) is an important component of a truck's everyday operation: it generates pressurized air that is used in various functions, such as braking and gear changes. Failure of the APS therefore results in costly repairs and renders the truck unusable for business operations, leading to further profit losses.
This project uses the publicized Scania Trucks dataset, the APS Failure and Operational Data, to simulate the environment required to train an AI agent with reinforcement learning on a real-life application.
A functioning AI agent that can predict APS failure and take action could therefore save the business a great deal of cost and time.
5. Information about the case study
Dataset – The training set contains about 60,000 observations and the test set about 16,000; each observation represents a truck.
Attribute – The attribute names of the data have been anonymized by the provider for proprietary reasons.
Class – The dataset's positive class consists of component failures for a specific component of the APS system. The negative class consists of trucks with failures for components not related to the APS.
Cost – Each missed APS failure costs $500, and each false alarm for an APS failure costs $10.
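This cost structure implies a simple evaluation metric: total cost = $500 per missed failure plus $10 per false alarm. A minimal sketch of that metric (the function name and 0/1 label encoding are assumptions for illustration, not part of the dataset):

```python
# Cost model from the case study: a missed APS failure (false negative)
# costs $500, while a false alarm (false positive) costs $10.
COST_MISSED_FAILURE = 500
COST_FALSE_ALARM = 10

def total_cost(y_true, y_pred):
    """y_true / y_pred: 1 = APS failure (positive class), 0 = not APS-related."""
    false_negatives = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    false_positives = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return COST_MISSED_FAILURE * false_negatives + COST_FALSE_ALARM * false_positives

# e.g. one missed failure and three false alarms:
print(total_cost([1, 1, 0, 0, 0, 0], [0, 1, 1, 1, 1, 0]))  # → 530
```

The 50:1 cost asymmetry is what makes the problem interesting: an agent that simply never raises an alarm is heavily penalized.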
6. Reinforcement Learning Architecture used in this case – A2CER PPO
A2CER PPO – Advantage Actor Critic with Prioritized Experience Replay and Proximal Policy Optimization.
• Advantage Actor Critic – a hybrid model that uses a policy-based method for the actor, which takes actions, and a value-based method for the critic, which evaluates those actions.
• Prioritized Experience Replay – a technique that allows rare experiences to be replayed, and therefore learned from, more frequently by the agent.
• Proximal Policy Optimization – prevents any single policy update from being too large.
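The idea of "preventing too large an update" can be made concrete with PPO's clipped surrogate objective. A minimal NumPy sketch of that objective, not the project's actual implementation (function and argument names are illustrative):

```python
import numpy as np

def ppo_clipped_objective(new_logp, old_logp, advantages, clip_eps=0.2):
    """PPO's clipped surrogate objective (to be maximized).

    ratio = pi_new(a|s) / pi_old(a|s). Clipping the ratio to
    [1 - eps, 1 + eps] removes any incentive for a single update to
    move the policy far from the one that collected the data.
    """
    ratio = np.exp(new_logp - old_logp)
    unclipped = ratio * advantages
    clipped = np.clip(ratio, 1.0 - clip_eps, 1.0 + clip_eps) * advantages
    # Taking the minimum keeps the pessimistic (lower) estimate either way.
    return np.minimum(unclipped, clipped).mean()
```

For example, if the new policy doubles an action's probability (ratio 2.0) on a positive-advantage sample, the objective only credits a ratio of 1.2, so there is no gradient pushing the policy further in that direction.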
8. Results
Cost – The agent failed to take action on 35 APS-failure trucks, costing $17,500, and raised 447 false alarms, costing $4,470, for a total cost of $21,970. Without the AI, all 375 APS failures would have cost $187,500, so the agent saves about $165,530 in total.
Continuous Learning – Because RL is built for continuous learning, the AI agent keeps improving as time passes and can adjust itself to handle a completely unseen environment.
* Total number of trucks in the test set is 16,000.
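The reported figures can be checked with simple arithmetic, using only the numbers on this slide:

```python
# Figures reported above: 35 missed APS failures, 447 false alarms,
# and 375 total APS failures in the test set.
missed, cost_per_missed = 35, 500
false_alarms, cost_per_alarm = 447, 10
total_failures = 375

with_agent = missed * cost_per_missed + false_alarms * cost_per_alarm
without_agent = total_failures * cost_per_missed
savings = without_agent - with_agent
print(with_agent, without_agent, savings)  # 21970 187500 165530
```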
9. Recommendation to improve RL performance
Reinforcement learning would perform better with a stream of data for each observation (truck), rather than a single dataset captured at one point in time. The agent could then learn more precisely about the chain of reactions or early symptoms of truck failure, and take action to alert the owner at a much earlier stage.