1. Fuzzy control, overview
Olivier Teytaud
Essentially mathematics-free methodology
Not widely used in Europe or the US
Very frequent in Asia
A pragmatic solution for control & human expertise
olivier.teytaud@gmail.com
2. Control : maximizing reward
At each time step, the agent
● receives a description of the world (state)
● makes a decision,
● gets a reward
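A minimal sketch of this loop. The environment `ToyRoom`, its dynamics and the policy are hypothetical and only serve to make the three steps concrete:

```python
import random

class ToyRoom:
    """Hypothetical environment: the state is a room temperature in °C."""
    def __init__(self, temperature=18.0, seed=0):
        self.temperature = temperature
        self.rng = random.Random(seed)

    def observe(self):
        return self.temperature

    def step(self, heater_on):
        # Assumed dynamics: the heater adds 0.5°C, the room loses 0.3°C, plus noise.
        gain = 0.5 if heater_on else 0.0
        self.temperature += gain - 0.3 + self.rng.uniform(-0.1, 0.1)
        # Reward: 1 when the room is comfortable, minus an assumed heating cost.
        comfort = 1.0 if 20 < self.temperature < 24 else 0.0
        return comfort - (0.2 if heater_on else 0.0)

def run_episode(policy, steps=100):
    env = ToyRoom()
    total = 0.0
    for _ in range(steps):
        state = env.observe()        # receive a description of the world
        decision = policy(state)     # make a decision
        total += env.step(decision)  # get a reward
    return total

# Example policy: heat whenever the temperature is below 21°C.
total_reward = run_episode(lambda temperature: temperature < 21)
```

The quantity to maximize is the cumulated reward returned by `run_episode`.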
3. Control : example
● Observation = temperature
● Temperature control :
– If temperature < 16°C, switch on heater
– If temperature > 20°C, switch off heater
– If temperature > 26°C, switch on air cond.
– If temperature < 24°C, switch off air cond.
● Reward
– reward = comfort reward – cost penalty
– comfort reward = 1 if 20 < temp < 24 (0 otherwise)
– Penalty = 0.2 per appliance switched on
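The rules and the reward above can be written down directly. This is a sketch: the function names are illustrative, and the penalty is assumed to be charged at every time step for each appliance that is currently on.

```python
def thermostat(temp, heater_on, ac_on):
    """Apply the four rules; an appliance keeps its state if no rule fires."""
    if temp < 16:
        heater_on = True
    if temp > 20:
        heater_on = False
    if temp > 26:
        ac_on = True
    if temp < 24:
        ac_on = False
    return heater_on, ac_on

def reward(temp, heater_on, ac_on):
    """reward = comfort reward - cost penalty."""
    comfort = 1.0 if 20 < temp < 24 else 0.0
    penalty = 0.2 * (int(heater_on) + int(ac_on))
    return comfort - penalty
```

For instance, `thermostat(15, False, False)` switches the heater on, and `reward(22, False, False)` is the maximal reward of 1.0.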
4. Tools for control
● Dynamic programming (Bellman)
● Model Predictive Control
– build a model and a predictor
– optimize the decisions so that the predicted reward
is maximal over the next H time steps
● Direct Policy Search :
– define a parametric control function
– optimize the parameters on simulations
● Reinforcement learning (many different things,
close to DP or close to DPS or combining both)
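To illustrate the Model Predictive Control item above, here is a brute-force sketch for a heater only. The model `predict` and the horizon are assumptions made for the example; a real application would use a learned or physical model and a smarter optimizer.

```python
from itertools import product

def predict(temp, heater_on):
    # Hypothetical model: the heater adds 1°C per step, the room loses 0.5°C.
    return temp + (1.0 if heater_on else 0.0) - 0.5

def step_reward(temp, heater_on):
    return (1.0 if 20 < temp < 24 else 0.0) - (0.2 if heater_on else 0.0)

def mpc_decision(temp, horizon=4):
    """Return the first action of the best predicted action sequence."""
    best_plan, best_value = None, float("-inf")
    # Enumerate every on/off sequence over the next `horizon` time steps.
    for plan in product([False, True], repeat=horizon):
        t, value = temp, 0.0
        for heater_on in plan:
            t = predict(t, heater_on)
            value += step_reward(t, heater_on)
        if value > best_value:
            best_plan, best_value = plan, value
    return best_plan[0]
```

Only the first decision of the best plan is applied; the optimization is repeated at the next time step with the new observation.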
5. Direct Policy Search
● Expert temperature control :
– If temperature < 16°C, switch on heater
– If temperature > 20°C, switch off heater
– If temperature > 26°C, switch on air cond.
– If temperature < 24°C, switch off air cond.
● DPS: replace constants with parameters
– If temperature < x1°C, switch on heater
– If temperature > x2°C, switch off heater
– If temperature > x3°C, switch on air cond.
– If temperature < x4°C, switch off air cond.
6. Direct Policy Search
● DPS: replace constants with parameters
– If temperature < x1°C, switch on heater
– If temperature > x2°C, switch off heater
– If temperature > x3°C, switch on air cond.
– If temperature < x4°C, switch off air cond.
● Then define a simulator depending on
x=(x1,x2,x3,x4):
– simulator(x) = average reward over 1000 simulations
– x* = argmax simulator(x) <== here you need an
optimization algorithm (not detailed today)
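A sketch of the whole DPS pipeline for the thermostat. The room dynamics are invented for the example, and the optimizer is a plain random local search chosen for brevity (the slides do not say which optimization algorithm to use).

```python
import random

def episode_reward(x, rng, steps=48):
    """Average reward of one simulated episode (assumed toy dynamics)."""
    x1, x2, x3, x4 = x
    temp = rng.uniform(10.0, 30.0)
    heater_on = ac_on = False
    total = 0.0
    for _ in range(steps):
        if temp < x1:
            heater_on = True
        if temp > x2:
            heater_on = False
        if temp > x3:
            ac_on = True
        if temp < x4:
            ac_on = False
        outside = rng.uniform(5.0, 35.0)
        temp += 0.1 * (outside - temp) + (1.0 if heater_on else 0.0) - (1.0 if ac_on else 0.0)
        comfort = 1.0 if 20 < temp < 24 else 0.0
        total += comfort - 0.2 * (heater_on + ac_on)
    return total / steps

def simulator(x, n_simulations=1000, seed=0):
    # Fixed seed: every x is evaluated on the same random draws.
    rng = random.Random(seed)
    return sum(episode_reward(x, rng) for _ in range(n_simulations)) / n_simulations

def local_random_search(n_candidates=20, n_simulations=100, seed=1):
    rng = random.Random(seed)
    best_x = (16.0, 20.0, 26.0, 24.0)  # start from the expert constants
    best_value = simulator(best_x, n_simulations)
    for _ in range(n_candidates):
        # Perturb the current best parameters and keep the candidate if it is better.
        x = tuple(v + rng.gauss(0.0, 1.0) for v in best_x)
        value = simulator(x, n_simulations)
        if value > best_value:
            best_x, best_value = x, value
    return best_x, best_value
```

Starting from the expert constants means the search can only return parameters that score at least as well as the human rules on the simulator.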
7. Direct Policy Search
● Simple, pragmatic, efficient
● Define a parametric policy :
– Either a generic approximator (neural network, sum of
Gaussians)
– Or parametric/generic (e.g. PID)
– Or parametric/specific (human expertise)
● Fuzzy control :
– A form of parametric DPS
– Oriented toward human readability
8. Fuzzy control : convert rules into fuzzy rules
If temperature > 26°C then switch on AC
becomes
switch on AC with power
– 100% if T > 27°C
– 0% if T < 25°C
– 50(T-25)% if 25<T<27
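The fuzzified rule is a simple piecewise-linear function. A sketch (the name `ac_power` is illustrative; the boundary values 25 and 27, which the rule leaves unspecified, are filled in by continuity):

```python
def ac_power(temperature):
    """AC power in percent for the fuzzy version of 'if T > 26°C then switch on AC'."""
    if temperature >= 27:
        return 100.0
    if temperature <= 25:
        return 0.0
    return 50.0 * (temperature - 25)
```

At the old threshold of 26°C the AC now runs at half power instead of jumping from off to on.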
9. Fuzzy control : convert rules into fuzzy rules
Optionally,
● replace constants with parameters
● optimize parameters
switch on AC with power
– 100% if T > A°C
– 0% if T < B°C
– 100(T-B)/(A-B)% if B<T<A
10. Fuzzy control : convert rules into fuzzy rules
● switch on AC with power
– 100% if T > A°C
– 0% if T < B°C
– 100(T-B)/(A-B)% if B<T<A
● In fuzzy terminology, this percentage is called the « membership ».
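With A and B as free parameters, the same ramp can be sketched as a generic membership function. Here it returns a fraction in [0, 1] rather than a percentage, which is a choice made for this example:

```python
def membership(temperature, a, b):
    """0 below b, 1 above a, linear in between (requires b < a)."""
    if not b < a:
        raise ValueError("expected b < a")
    return min(1.0, max(0.0, (temperature - b) / (a - b)))
```

With A = 27 and B = 25, `100 * membership(T, 27, 25)` gives back the 50(T-25)% rule; A and B are the parameters an optimizer would tune.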
11. Multidimensional set
● Membership for « if A and B then ... »
– Minimum : ms(A and B) = min(msA, msB)
– Or any other conjunction you like (e.g. product: msA × msB)
● Membership for « if A or B then ... »
– Maximum : ms(A or B) = max(msA, msB)
– Or any other disjunction you like (e.g. msA + msB − msA × msB)
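A sketch of these combinations. The `mode` argument and the function names are illustrative; the "sum" mode is the probabilistic sum, the usual counterpart of the product.

```python
def fuzzy_and(ms_a, ms_b, mode="min"):
    """Membership of 'A and B': minimum by default, product otherwise."""
    return min(ms_a, ms_b) if mode == "min" else ms_a * ms_b

def fuzzy_or(ms_a, ms_b, mode="max"):
    """Membership of 'A or B': maximum by default, probabilistic sum otherwise."""
    return max(ms_a, ms_b) if mode == "max" else ms_a + ms_b - ms_a * ms_b
```

Both variants agree with ordinary Boolean logic when the memberships are exactly 0 or 1.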
14. Natural language & fuzzy
« If the temperature is hot then switch on AC at
100% »
● Define linguistic terms : hot, cool, cold, warm,
comfortable
● One rule for each term
● You might lose readability if at the end of the
optimization of parameters you have « cold »
which is hotter than « hot ».
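One way to guard against this is to check the order of the terms after optimization. The sketch below assumes each linguistic term is summarized by a single center temperature, which is a simplification made for the example:

```python
TERM_ORDER = ["cold", "cool", "comfortable", "warm", "hot"]

def is_readable(centers):
    """True if the optimized term centers (°C) still follow the linguistic order."""
    values = [centers[term] for term in TERM_ORDER]
    return all(low < high for low, high in zip(values, values[1:]))
```

An optimizer can reject or penalize any parameter vector for which `is_readable` is false, so the rules keep their everyday meaning.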
15. Defuzzification
● When you have several rules, maybe several of
them will have positive membership and distinct
recommendations.
● Which one should we apply ?
– Max: choose the rule with maximum membership
– Weighted: weighted average by membership
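Both options can be sketched in a few lines. Each rule is represented here as a (membership, recommendation) pair, which is an assumed encoding:

```python
def defuzzify_max(rules):
    """Apply the recommendation of the rule with maximum membership."""
    return max(rules, key=lambda rule: rule[0])[1]

def defuzzify_weighted(rules):
    """Average the recommendations, weighted by membership."""
    total = sum(ms for ms, _ in rules)
    if total == 0:
        raise ValueError("no rule has positive membership")
    return sum(ms * rec for ms, rec in rules) / total

# Example: 'hot' (membership 0.25) recommends 100% AC, 'warm' (0.75) recommends 20%.
rules = [(0.25, 100.0), (0.75, 20.0)]
```

On this example the max option gives 20% and the weighted option gives 40%; the weighted version varies smoothly when the memberships change, while the max version can jump.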
16. Fuzzy control in short
● Nothing complicated (which may be why some people
don't like it)
● Very convenient
– Human expertise can be introduced
– Parameter optimization possibly on complex
simulations
– Readable
● Widely used in Asia
L. Zadeh