final_v.pptx
- 1. 1
©2022 Check Point Software Technologies Ltd.
[Restricted] ONLY for designated groups and individuals
July 2023
PARAMETRIC PDF FOR
GOODNESS OF FIT
Natan Katz | AI Tech Lead
Natan.katz@gmail.com natank@checkpoint.com
- 2. 2
About me:
• At Check Point since fall 2021, after nearly 3 long months at Avanan
• M.Sc. from Weizmann in applied math (nonlinear dynamics)
• Two publications about ethics in AI
• AAIML: talk today
• Spent 5 years at NICE
My Coordinates:
https://www.linkedin.com/in/natan-katz-2936425/
https://natan-katz.medium.com/
- 4. Feed the test data D into the model M
Calculate the scores for D
Calculate the performance graph
Set a threshold
Confusion matrix
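The steps on this slide can be sketched in a few lines. This is only a minimal illustration in plain Python, assuming binary labels and scores in [0, 1]; the function names and the toy data are hypothetical and not taken from the talk.

```python
def confusion_at_threshold(scores, labels, threshold):
    """Count (TP, FP, TN, FN) when predicting positive for score >= threshold."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= threshold else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def precision_recall(tp, fp, tn, fn):
    """Precision and recall; reported as 0.0 when the denominator is empty."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# Toy scores that some model M assigned to test data D (made-up numbers).
scores = [0.10, 0.40, 0.35, 0.80, 0.65, 0.90]
labels = [0, 0, 1, 1, 0, 1]

# "Performance graph": sweep the threshold and record precision and recall.
for t in (0.3, 0.5, 0.7):
    counts = confusion_at_threshold(scores, labels, t)
    print(t, counts, precision_recall(*counts))
```

Every number produced this way depends on the finite test sample and on the chosen threshold, which is the limitation the following slides address.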
- 7. 7
What are the Disadvantages?
Mathematics
Difficult to estimate stability (threshold, data pattern)
Statistics
Difficult to measure a confidence/credible interval
- 8. 8
What can we do?
General Claim
o “Use” an analytic function in the evaluation
Our Trial
Measure the Beta distributions of the populations
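One simple way to "measure" a Beta distribution from a set of scores is the method of moments. The sketch below assumes that "populations" means the model scores of each class taken separately, and that scores lie strictly inside (0, 1); the function name `fit_beta_moments` is hypothetical, and the talk's own fitting procedure may differ (for example, maximum likelihood).

```python
import statistics


def fit_beta_moments(scores):
    """Estimate (alpha, beta) of a Beta distribution by matching mean and variance."""
    mean = statistics.fmean(scores)
    var = statistics.pvariance(scores)
    # A Beta distribution requires 0 < var < mean * (1 - mean).
    if not 0.0 < var < mean * (1.0 - mean):
        raise ValueError("sample moments are not compatible with a Beta distribution")
    common = mean * (1.0 - mean) / var - 1.0  # this equals alpha + beta
    return mean * common, (1.0 - mean) * common


# Made-up scores for one population (for instance, the positive class).
alpha, beta = fit_beta_moments([0.2, 0.4, 0.6, 0.8])
print(alpha, beta)
```

Fitting the positive and the negative population separately yields two parameter pairs, which is all that the later closed-form comparisons need.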
- 9. 9
Beta Distribution
• Two parameters $\alpha, \beta$, both positive
• Mean: $\frac{\alpha}{\alpha+\beta}$. Variance: $\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$
• Support on [0,1]
• Conjugate prior of the Binomial
• The KL divergence between two Beta distributions has a closed form
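The closed form for KL between two Beta distributions uses only the log-Beta and digamma functions. The sketch below is an illustration in standard-library Python; the helper `digamma` is a small hand-written approximation (recurrence plus asymptotic series) included only to keep the example self-contained, and in practice a library routine such as `scipy.special.digamma` would be the natural choice. All function names are hypothetical.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def digamma(x):
    """Digamma for x > 0: shift x up to >= 6, then use an asymptotic series."""
    result = 0.0
    while x < 6.0:
        result -= 1.0 / x  # psi(x) = psi(x + 1) - 1/x
        x += 1.0
    inv2 = 1.0 / (x * x)
    series = inv2 * (1.0 / 12.0 - inv2 * (1.0 / 120.0 - inv2 / 252.0))
    return result + math.log(x) - 0.5 / x - series


def kl_beta(a1, b1, a2, b2):
    """KL( Beta(a1, b1) || Beta(a2, b2) ) in nats."""
    return (log_beta(a2, b2) - log_beta(a1, b1)
            + (a1 - a2) * digamma(a1)
            + (b1 - b2) * digamma(b1)
            + (a2 - a1 + b2 - b1) * digamma(a1 + b1))


# Example with made-up parameters: divergence between two fitted populations.
print(kl_beta(2.0, 5.0, 2.5, 4.0))
```

Note that KL is not symmetric, so `kl_beta(a1, b1, a2, b2)` and `kl_beta(a2, b2, a1, b1)` generally differ.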
- 10. 10
What Do We Gain?
• Risk estimation
• Performance with respect to the threshold
• Estimate data fluctuations
• We are equipped with metrics: KL, $H_P$, $W_P$, $D_\alpha$.
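To illustrate "performance with respect to the threshold": once each population's scores are described by a Beta distribution, the true-positive and false-positive rates at any threshold follow from the two Beta CDFs. The sketch below is a hedged illustration, not the talk's code. It integrates the Beta density with a plain midpoint rule, which is adequate only when both parameters are at least 1 (no singularity at the endpoints); a library CDF such as `scipy.stats.beta.cdf` would normally be used instead. The names and parameter values are hypothetical.

```python
import math


def beta_pdf(x, a, b):
    """Density of Beta(a, b) at x in (0, 1)."""
    log_norm = math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)
    return math.exp((a - 1.0) * math.log(x) + (b - 1.0) * math.log(1.0 - x) - log_norm)


def beta_cdf(x, a, b, steps=2000):
    """CDF of Beta(a, b) by the midpoint rule (assumes a >= 1 and b >= 1)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    h = x / steps
    return h * sum(beta_pdf((i + 0.5) * h, a, b) for i in range(steps))


def rates_at_threshold(pos_params, neg_params, threshold):
    """(TPR, FPR) when a score >= threshold is predicted positive."""
    tpr = 1.0 - beta_cdf(threshold, *pos_params)
    fpr = 1.0 - beta_cdf(threshold, *neg_params)
    return tpr, fpr


# Hypothetical fitted parameters for the positive and negative populations.
for t in (0.3, 0.5, 0.7):
    print(t, rates_at_threshold((5.0, 2.0), (2.0, 5.0), t))
```

Because the rates are smooth functions of the threshold and of the four parameters, the effect of small data fluctuations can be studied by perturbing the parameters rather than resampling the test set.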
- 12. 12
IOT PROJECT
EREZ ISRAEL,
OFEK DADUSH
AMIT ELHELO
DANIEL COHEN SASON
REAL WORLD USE-CASE
- 13. 13
Danielle Adam
Discover the Assets
Discover all IoT devices
at the organization
- 14. 14
Cyber Caveats
• Real world data is always imbalanced:
We are always concerned about precision
• Data is temporal:
Hackers change their campaigns
- 15. 15
Function Detection Model
16 classes: 15 machine types and one “Unknown”
• High accuracy: more than 90%
• Errors are non-uniform:
“A/C” instead of printer
“A/C” instead of “Unknown”
“Unknown” instead of a known type
- 18. 19
Few Words on MCC
Take two binary random variables X, Y and measure their correlation
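A small check of this statement for the binary case: the Matthews correlation coefficient (MCC) computed from the confusion-matrix counts coincides with the Pearson correlation of the two 0/1 vectors. The sketch uses made-up labels and hypothetical function names.

```python
import math


def mcc_from_counts(tp, fp, tn, fn):
    """Binary MCC from confusion-matrix counts (0.0 if a margin is empty)."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


# X = true label, Y = predicted label (made-up data).
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]

tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)

print(mcc_from_counts(tp, fp, tn, fn), pearson(y_true, y_pred))
```

Unlike accuracy, MCC uses all four cells of the confusion matrix, which is why it is often preferred on imbalanced data.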
- 22. 23
Training
• We add the following regularization to XGBoost:
• min [ KL(R, P) + KL(Q, L) ]
Now we check whether it is a good regularization
- 24. 25
Further Research
• Different metrics
• Wasserstein 1 & 2
• Hellinger
• Different distributions
• Intuitive, though not sound:
Conformal prediction (fiducial function)
• Generalizing to more dimensions
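As an example of one of the alternative metrics listed above, the Hellinger distance between two Beta distributions also has a closed form through the Bhattacharyya coefficient. The sketch below assumes the convention H^2 = 1 - BC, where BC is the integral of sqrt(p * q); the function names and parameter values are hypothetical, and this is an illustration rather than the talk's implementation.

```python
import math


def log_beta(a, b):
    """Natural log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def hellinger_beta(a1, b1, a2, b2):
    """Hellinger distance between Beta(a1, b1) and Beta(a2, b2), in [0, 1]."""
    # Bhattacharyya coefficient:
    # BC = B((a1 + a2) / 2, (b1 + b2) / 2) / sqrt(B(a1, b1) * B(a2, b2))
    log_bc = (log_beta((a1 + a2) / 2.0, (b1 + b2) / 2.0)
              - 0.5 * (log_beta(a1, b1) + log_beta(a2, b2)))
    # Clamp at 0 to absorb floating-point error when the distributions coincide.
    return math.sqrt(max(0.0, 1.0 - math.exp(log_bc)))


# Made-up parameters for two fitted populations.
print(hellinger_beta(5.0, 2.0, 2.0, 5.0))
```

Unlike KL, this distance is symmetric and bounded, which can make comparisons across datasets easier to read.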
- 25. 26
Some Links
• The paper
https://www.oajaiml.com/public/index.php/archive/parametric-pdf-for-goodness-of-fit
• The code
https://github.com/natank1/Beta_paper
• Multiclass MCC paper
https://arxiv.org/pdf/2208.05651.pdf
• Nielsen paper
https://arxiv.org/pdf/1901.03732.pdf