Machine learning applications in Performance Management
Bayesian learning tools: extending ABLE
Advancing theory
Summary and future directions
3.
Learning problems: examples of pattern discovery, classification, diagnosis and prediction.
System event mining: event streams from hosts over time.
End-user transaction recognition: segmenting a stream of Remote Procedure Calls (RPCs) such as BUY, SELL, OPEN_DB, SEARCH into Transaction1, Transaction2, and so on.
4.
Approach: Bayesian learning. Numerous important applications:
Medicine
Stock market
Bio-informatics
eCommerce
Military
etc.
Learn (probabilistic) dependency models: Bayesian networks. Example: a network over nodes S, C, B, X, D with factors P(S), P(C|S), P(B|S), P(X|C,S), P(D|C,B). Typical queries:
Diagnosis: P(cause | symptom) = ?
Prediction: P(symptom | cause) = ?
Pattern classification: P(class | data) = ?
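As a minimal illustration of the diagnosis query, Bayes' rule inverts a cause-to-symptom model: P(cause | symptom) = P(symptom | cause) P(cause) / P(symptom). A sketch in Python, with all probabilities hypothetical:

```python
# Hypothetical cause-to-symptom model (illustrative numbers only).
p_cause = 0.01                   # prior P(cause)
p_symptom_given_cause = 0.9      # P(symptom | cause)
p_symptom_given_other = 0.05     # P(symptom | no cause)

# P(symptom) via the law of total probability.
p_symptom = (p_symptom_given_cause * p_cause
             + p_symptom_given_other * (1 - p_cause))

# Diagnosis query by Bayes' rule: P(cause | symptom).
p_cause_given_symptom = p_symptom_given_cause * p_cause / p_symptom
print(round(p_cause_given_symptom, 3))
```

Note how a rare cause stays fairly unlikely even after observing the symptom; the prior matters as much as the likelihood.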
14.
Transaction recognition results (plot: accuracy vs. training set size).
Good EUT recognition accuracy: 64% (recognition is a harder problem than classification!).
Reversed order of results: the best classifier is not necessarily the best recognizer! This calls for further research.

Model          Classification   Segmentation
Multinomial    best             third best
Geometric      best             fourth best
Shift. Geom.   worst            best
Bernoulli      best             second best
27.
Why does Naïve Bayes do well, and when? In other words, when do its independence assumptions not hurt classification? Naïve Bayes assumes class-conditional feature independence: P(f1, ..., fn | class) = P(f1 | class) * ... * P(fn | class). This is an unrealistic assumption, but why and when does it still work? Intuition: wrong probability estimates do not necessarily lead to wrong classification. The Naïve Bayes estimate of P(class | f) can differ from the true posterior while still yielding the Bayes-optimal class.
Random problem generator: uniform P(class); random P(f|class):
1. A randomly selected entry in P(f|class) is assigned a value.
2. The remaining entries are filled by uniform random sampling followed by normalization.
Finding: feature dependence does NOT correlate with NB error.
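The claim that wrong probability estimates need not cause wrong classification can be checked directly on a toy two-feature, two-class problem (all numbers hypothetical): the features are dependent given the class, so the Naïve Bayes posterior is wrong on every input, yet its chosen class always matches the Bayes-optimal decision.

```python
import itertools

# True class-conditional joints P(f1, f2 | c) with DEPENDENT features
# (hypothetical numbers; uniform class prior).
P_f_given_c = {
    0: {(0, 0): 0.50, (0, 1): 0.20, (1, 0): 0.10, (1, 1): 0.20},
    1: {(0, 0): 0.05, (0, 1): 0.15, (1, 0): 0.30, (1, 1): 0.50},
}

def true_posterior(f):
    """Bayes-optimal posterior, using the full joint P(f1, f2 | c)."""
    scores = {c: 0.5 * P_f_given_c[c][f] for c in (0, 1)}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

def nb_posterior(f):
    """Naive Bayes posterior: replaces the joint with a product of marginals."""
    def marginal(c, i, v):
        return sum(p for ff, p in P_f_given_c[c].items() if ff[i] == v)
    scores = {c: 0.5 * marginal(c, 0, f[0]) * marginal(c, 1, f[1])
              for c in (0, 1)}
    z = sum(scores.values())
    return {c: s / z for c, s in scores.items()}

for f in itertools.product((0, 1), repeat=2):
    t, n = true_posterior(f), nb_posterior(f)
    # The estimates differ, but the chosen class is identical on every input.
    assert max(t, key=t.get) == max(n, key=n.get)
    print(f, round(t[0], 3), round(n[0], 3))
```

For instance, on input (0, 1) the true posterior for class 0 is about 0.571 while Naïve Bayes estimates about 0.683: a wrong probability, but the same argmax.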
32.
From Naïve Bayes to Bayesian networks.
Naïve Bayes model: independent features given the class.
Bayesian network (BN) model: can represent any joint probability distribution.
Example: nodes Smoking (S), lung Cancer (C), Bronchitis (B), X-ray (X), Dyspnoea (D), with
P(S, C, B, X, D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)
Query: P(lung cancer = yes | smoking = no, dyspnoea = yes) = ?
CPD for Dyspnoea, P(D|C,B):
C  B  D=0  D=1
0  0  0.1  0.9
0  1  0.7  0.3
1  0  0.8  0.2
1  1  0.9  0.1
33.
Example: Printer Troubleshooting (Microsoft Windows 95) [Heckerman, 95]. Network nodes include: Print Output OK, Application Output OK, GDI Data Input OK, GDI Data Output OK, Print Data OK, Spooled Data OK, Spool Process OK, Print Spooling On, Local Path OK, Net Path OK, PC to Printer Transport OK, Printer Data OK, Correct Driver, Uncorrupted Driver, Correct Driver Settings, Correct Printer Path, Correct Printer Selected, Correct Local Port, Local Cable Connected, Net Cable Connected, Net/Local Printing, Network Up, Printer On and Online, Printer Memory Adequate, Local Disk Space Adequate, Paper Loaded.
34.
How to use Bayesian networks?
Diagnosis: P(cause | symptom) = ?
Prediction: P(symptom | cause) = ?
Decision-making (given a utility function): maximum expected utility (MEU).
These inference problems are NP-complete, hence approximate algorithms. Applications:
Medicine
Stock market
Bio-informatics
eCommerce
Performance management
etc.
Classification: P(class | data) = ?
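The MEU decision-making step mentioned on this slide can be sketched as follows (the posterior and the utilities are hypothetical): given a diagnostic posterior over causes, pick the action that maximizes expected utility.

```python
# Hypothetical posterior over the fault, e.g. from a diagnosis query.
posterior = {"bad_driver": 0.7, "bad_cable": 0.3}

# Hypothetical utilities U(action, cause) for each repair action.
utility = {
    ("reinstall_driver", "bad_driver"): 10.0,
    ("reinstall_driver", "bad_cable"): -2.0,
    ("replace_cable", "bad_driver"): -5.0,
    ("replace_cable", "bad_cable"): 8.0,
}

def expected_utility(action):
    """EU(a) = sum over causes of P(cause) * U(a, cause)."""
    return sum(p * utility[action, cause] for cause, p in posterior.items())

actions = ["reinstall_driver", "replace_cable"]
best = max(actions, key=expected_utility)
print(best, expected_utility(best))
```

With these numbers the driver fix wins (expected utility 6.4 vs. -1.1), even though there is a 30% chance it is the wrong repair; MEU trades off probabilities against payoffs rather than just picking the most likely cause.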
A local approximation scheme, "mini-buckets" (paper submitted to JACM), reduces the complexity of inference by ignoring some dependencies.
Successfully used for approximating the Most Probable Explanation (MPE).
Very efficient on real-life (medical, decoding) and synthetic problems.
Less "noise" in the distributions means higher accuracy (plot: approximation accuracy vs. noise), similarly to Naïve Bayes!
A general theory is needed: independence assumptions and "almost-deterministic" distributions.
Potential impact: efficient inference in complex performance management models (e.g., event mining, system dependence models).
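The core mini-bucket trick can be shown on a toy elimination step (all factor values hypothetical): instead of maximizing x out of the product f(a, x) g(x, b), which couples a and b, each factor is maximized over x separately. This ignores the dependency through x and yields an upper bound on the exact result, which is how such schemes bound the MPE value.

```python
import itertools

# Two factors sharing variable x (hypothetical values):
f = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}   # f(a, x)
g = {(0, 0): 0.2, (0, 1): 0.8, (1, 0): 0.7, (1, 1): 0.3}   # g(x, b)

# Exact bucket elimination: h(a, b) = max_x f(a, x) * g(x, b).
exact = {(a, b): max(f[a, x] * g[x, b] for x in (0, 1))
         for a, b in itertools.product((0, 1), repeat=2)}

# Mini-bucket approximation: split {f, g} into two mini-buckets and
# eliminate x in each separately: h'(a, b) = (max_x f(a, x)) * (max_x g(x, b)).
approx = {(a, b): max(f[a, x] for x in (0, 1)) * max(g[x, b] for x in (0, 1))
          for a, b in itertools.product((0, 1), repeat=2)}

# Ignoring the dependency through x gives an upper bound on the exact max,
# since max_x f(a, x) g(x, b) <= (max_x f(a, x)) (max_x g(x, b)).
assert all(approx[k] >= exact[k] for k in exact)
print(exact, approx)
```

The bound is cheap because each mini-bucket stays small; the price is the gap between `approx` and `exact`, which shrinks as the distributions become less "noisy", echoing the Naïve Bayes observation above.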