Enhanced TaPR (eTaPR)
Accuracy Metric for Anomaly Detection
on Time-Series Data
2021. 8. 12.
Won-Seok Hwang
hws23@nsr.re.kr
Accuracy Evaluation on Non-Time-Series Data
• Evaluation setting
– Train a detection method on a training dataset
– Detect “anomalies” in a test dataset
• The test dataset contains many anomalies
• The detection method generates “predictions” that point out the anomalies
• Accuracy of detection
– Recall: the proportion of all anomalies that are detected
– Precision: the proportion of all predictions that are correct
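As a point of comparison for the range-based ideas that follow, conventional point-wise precision and recall can be sketched as below. The function name and the sample lists are illustrative assumptions and are not part of the eTaPR package.

```python
def precision_recall(anomalies, predictions):
    """Point-wise precision and recall for two equal-length lists of 0/1 labels."""
    if len(anomalies) != len(predictions):
        raise ValueError("both lists must have the same length")
    # A true positive is a time point that is labeled 1 and also predicted 1.
    true_positives = sum(1 for a, p in zip(anomalies, predictions) if a == 1 and p == 1)
    n_predicted = sum(predictions)
    n_anomalous = sum(anomalies)
    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_anomalous if n_anomalous else 0.0
    return precision, recall


# Hypothetical labels: three anomalous points, three predicted points, two in common.
labels = [0, 1, 1, 0, 1, 0]
predicted = [0, 1, 0, 1, 1, 0]
precision, recall = precision_recall(labels, predicted)
```

Here every time point is judged independently, which is exactly the two-outcome view that does not carry over to ranges.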
Necessity of an Accuracy Metric for Time-Series Data
• For non-time-series data (e.g., binary classification or information retrieval)
– An anomaly or a prediction always has exactly two possible outcomes
• An anomaly is either (1) detected or (2) not detected
• A prediction is either (1) correct or (2) incorrect
• For time-series data
– Only a part of an anomaly may be detected
– Only a part of a prediction may be correct
– This is because an anomaly or a prediction is represented as a range in time-series data
Characteristics of Anomalies in Time-Series Data
• Why an anomaly is a range in time-series data
– An anomalous event (e.g., an incident or a fraud) causes a series of values with similar patterns
– It is more reasonable to regard this series of values as a single anomaly
• Why a prediction is a range
– A human operator perceives a series of consecutive predictions as a single prediction that indicates a range
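The step from per-point labels to ranges can be sketched as follows. This is a minimal illustration that assumes a range is simply a maximal run of consecutive 1s; the function name `to_ranges` is hypothetical and not taken from the eTaPR package.

```python
def to_ranges(labels):
    """Return inclusive (start, end) index pairs, one per run of consecutive 1s."""
    ranges = []
    start = None
    for i, value in enumerate(labels):
        if value == 1 and start is None:
            start = i  # a new run begins
        elif value != 1 and start is not None:
            ranges.append((start, i - 1))  # the run ended at the previous index
            start = None
    if start is not None:
        ranges.append((start, len(labels) - 1))  # a run that reaches the end
    return ranges


# Hypothetical per-point predictions containing three separate runs of 1s.
point_predictions = [0, 1, 1, 1, 0, 0, 1, 1, 0, 1]
ranges = to_ranges(point_predictions)  # [(1, 3), (6, 7), (9, 9)]
```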
[Figure: observed values over time, starting with the observed value at 𝑡1 (example values: 2 1 2 1 2 1 2 1 2 1 and 1 9 9 9 9 9 …). An intrusion event (anomaly) is marked, and the range 𝑡7 to 𝑡11 is regarded as an anomaly.]
Evaluation by Comparing Ranges (Idea 1)
• Case: only a part of an anomaly is detected
– Evaluate how likely each anomaly is to be detected (Idea 1)
• If a person understands more than a certain portion of an anomaly, s/he can find its whole range
– This is because the operator tries to find an anomaly by analyzing a given prediction
– In the figure below, anomalies 𝑎2 and 𝑎3 are likely to be detected
– Give a non-zero score to each anomaly of which more than a certain portion is detected
• A given parameter (𝜃𝑟) determines this portion
• The larger the portion of an anomaly an operator understands, the more likely s/he is to detect its whole range
– 𝑎3 is detected more easily than 𝑎2
– Give each anomaly a score proportional to its detected portion
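Idea 1 can be illustrated with a simplified sketch. It is not the exact eTaR formula, which these slides do not give. The assumptions are: ranges are inclusive (start, end) index pairs, predictions do not overlap each other, the score equals the detected portion once that portion exceeds 𝜃𝑟, and the value 0.3 for `theta_r` is purely hypothetical.

```python
def overlap(a, b):
    """Number of time points shared by two inclusive (start, end) ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def anomaly_score(anomaly, predictions, theta_r):
    """Detected portion of the anomaly if it exceeds theta_r, otherwise 0."""
    length = anomaly[1] - anomaly[0] + 1
    # Assumes the prediction ranges are disjoint, so overlaps can be summed.
    detected = sum(overlap(anomaly, p) for p in predictions)
    portion = detected / length
    return portion if portion > theta_r else 0.0


# Hypothetical ranges arranged like a1..a3 and p1..p3 in the slide's figure.
anomalies = [(0, 9), (20, 29), (40, 49)]
predictions = [(9, 12), (24, 31), (41, 49)]
scores = [anomaly_score(a, predictions, theta_r=0.3) for a in anomalies]
# a1: 1 of 10 points detected -> 0.0; a2: 6 of 10 -> 0.6; a3: 9 of 10 -> 0.9
```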
[Figure: anomaly ranges 𝑎1, 𝑎2, 𝑎3 and prediction ranges 𝑝1, 𝑝2, 𝑝3 on a time axis. 𝑎1 is hard to detect, 𝑎2 is likely to be detected, and 𝑎3 is more likely to be detected.]
Evaluation by Comparing Ranges (Idea 2)
• Case: only a part of a prediction is correct
– Evaluate how useful each prediction is likely to be for detection (Idea 2)
• A prediction of which more than a certain portion identifies anomalies is useful to a person
– A person would analyze the whole range of a prediction even if some part of it incorrectly identifies a normal range
– In the figure below, 𝑝2 and 𝑝3 are useful for detecting anomalies
– Give a non-zero score to each prediction of which more than a certain portion correctly identifies anomalies
• A given parameter (𝜃𝑝) determines this portion
• The larger the portion of a prediction that identifies an anomaly, the more useful the prediction is for detection
– 𝑝3 is more useful than 𝑝2
– Give each prediction a score proportional to the portion of it that correctly identifies anomalies
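Idea 2 can be sketched from the prediction's point of view. As before, this is a simplified illustration and not the exact eTaP formula. It assumes inclusive (start, end) ranges, disjoint anomaly ranges, and a hypothetical `theta_p` of 0.3.

```python
def overlap(a, b):
    """Number of time points shared by two inclusive (start, end) ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def prediction_score(prediction, anomalies, theta_p):
    """Correct portion of the prediction if it exceeds theta_p, otherwise 0."""
    length = prediction[1] - prediction[0] + 1
    # Assumes the anomaly ranges are disjoint, so overlaps can be summed.
    correct = sum(overlap(prediction, a) for a in anomalies)
    portion = correct / length
    return portion if portion > theta_p else 0.0


# Hypothetical ranges arranged like a1..a3 and p1..p3 in the slide's figure.
anomalies = [(0, 9), (20, 29), (40, 49)]
predictions = [(9, 18), (25, 34), (41, 50)]
scores = [prediction_score(p, anomalies, theta_p=0.3) for p in predictions]
# p1: 1 of 10 points correct -> 0.0; p2: 5 of 10 -> 0.5; p3: 9 of 10 -> 0.9
```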
[Figure: anomaly ranges 𝑎1, 𝑎2, 𝑎3 and prediction ranges 𝑝1, 𝑝2, 𝑝3 on a time axis. 𝑝1 is useless for detection, 𝑝2 is useful, and 𝑝3 is more useful.]
Evaluation by Comparing Ranges (Idea 3)
• Evaluation of detection failure cases
– Only successful detections should get a non-zero score
– Under Ideas 1 and 2 alone, failed detections can also get a non-zero score
• A prediction is evaluated as useful even though it does not lead to detecting any anomaly (see 𝑝1 and 𝑎1)
• An anomaly is evaluated as detected even though no prediction is useful for detecting it (see 𝑝2 and 𝑎2)
• Success of detection depends on both predictions and anomalies (Idea 3)
– When anomalies and predictions are not ranges, this idea is unnecessary
• If a prediction identifies an anomaly, that anomaly is necessarily detected
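One way to read Idea 3 is that a detection succeeds only when both conditions hold at once. The sketch below checks a single anomaly and prediction pair under that reading. It is an assumption-laden simplification and not the eTaPR algorithm itself. Ranges are inclusive (start, end) pairs, and both threshold values are hypothetical.

```python
def overlap(a, b):
    """Number of time points shared by two inclusive (start, end) ranges."""
    return max(0, min(a[1], b[1]) - max(a[0], b[0]) + 1)


def detection_succeeds(anomaly, prediction, theta_r, theta_p):
    """True only if the pair passes both the Idea 1 and the Idea 2 threshold."""
    shared = overlap(anomaly, prediction)
    anomaly_portion = shared / (anomaly[1] - anomaly[0] + 1)
    prediction_portion = shared / (prediction[1] - prediction[0] + 1)
    return anomaly_portion > theta_r and prediction_portion > theta_p


# p1 lies inside a1 but covers only 3 of its 20 points: fails the Idea 1 check.
case_1 = detection_succeeds((0, 19), (17, 19), theta_r=0.3, theta_p=0.3)
# p2 covers half of a2, but only 5 of its own 40 points: fails the Idea 2 check.
case_2 = detection_succeeds((40, 49), (45, 84), theta_r=0.3, theta_p=0.3)
# Both portions are 0.8 here, so the detection counts as a success.
case_3 = detection_succeeds((100, 109), (102, 111), theta_r=0.3, theta_p=0.3)
```

The first two cases mirror the failure cases described for 𝑝1/𝑎1 and 𝑝2/𝑎2: each would receive a non-zero score under one idea alone.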
[Figure: anomaly ranges 𝑎1, 𝑎2 and prediction ranges 𝑝1, 𝑝2 on a time axis.]
𝑝1 detects no anomaly because it identifies too small a portion of 𝑎1 (not enough information) to understand 𝑎1. 𝑝1 seems useful only when Idea 2 alone is considered.
Most of 𝑝2 fails to identify any anomaly, so it is very hard to detect 𝑎2 with 𝑝2. 𝑎2 seems detected only when Idea 1 alone is considered.
Evaluation by Comparing Ranges (Idea 4)
• A lengthy incorrect prediction is penalized more than a short incorrect one (Idea 4)
– A person has to spend time proportional to the length of a prediction to check whether anomalies occurred
• A lengthy incorrect prediction requires more human effort
• On the other hand, the length of an anomaly is not considered
– For instance, the length of a cyber attack is unrelated to its effect
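A simple way to make a lengthy incorrect prediction cost more is to weight each prediction's score by its length. The slides do not state how eTaP applies the penalty, so the length-weighted average below is an assumed illustration only, and the function name is hypothetical.

```python
def length_weighted_precision(scored_predictions):
    """Average of prediction scores, weighted by prediction length.

    scored_predictions: list of ((start, end), score) with inclusive ranges.
    """
    total_length = sum(end - start + 1 for (start, end), _ in scored_predictions)
    if total_length == 0:
        return 0.0
    weighted = sum((end - start + 1) * score for (start, end), score in scored_predictions)
    return weighted / total_length


correct = ((0, 9), 1.0)             # a fully correct prediction of length 10
short_incorrect = ((20, 24), 0.0)   # an incorrect prediction of length 5
long_incorrect = ((20, 59), 0.0)    # an incorrect prediction of length 40

with_short = length_weighted_precision([correct, short_incorrect])  # 10 / 15
with_long = length_weighted_precision([correct, long_incorrect])    # 10 / 50
```

Under this weighting the longer false alarm lowers the result further, which matches the operator-effort argument of Idea 4.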
Proposed Accuracy Metric
• Enhanced Time-series aware Recall (eTaR)
– Average likelihood of detection over all anomalies in the test dataset
– Based on Ideas 1 and 3
• Enhanced Time-series aware Precision (eTaP)
– Average usefulness of all predictions produced by a detection method
– Based on Ideas 2, 3, and 4
• eTaF1
– The harmonic mean of eTaP and eTaR
– Your rank is determined by eTaF1!
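The harmonic mean that combines the two scores can be written out as below. Only the formula 2PR / (P + R) is standard; the function name and the sample values are illustrative, and returning 0 when both inputs are 0 is an assumed convention.

```python
def harmonic_mean(etap, etar):
    """Harmonic mean of a precision-like and a recall-like score."""
    if etap + etar == 0:
        return 0.0  # assumed convention to avoid division by zero
    return 2 * etap * etar / (etap + etar)


# Hypothetical scores: a perfect eTaR cannot fully compensate for a low eTaP.
etaf1 = harmonic_mean(0.5, 1.0)  # 2 * 0.5 * 1.0 / 1.5
```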
Reference
• To understand Ideas 1 and 2, see the paper below:
– W.-S. Hwang et al., “Time-Series Aware Precision and Recall for Anomaly Detection: Considering Variety of Detection Result and Addressing Ambiguous Labeling,” in Proc. of CIKM, pp. 2241-2244, 2019.
• eTaPR enhances TaPR by employing Ideas 3 and 4
How to Use
• Installation
– Command: python -m pip install eTaPR-[version]-py3-none-any.whl
• Execution
– TaPR_pkg.etapr.evaluate_haicon(anomalies: list, predictions: list) -> dict
• anomalies
– A list of 0s and 1s
– 0 indicates normal and 1 indicates an anomaly
• predictions
– A list of 0s and 1s
– 0 indicates that your prediction is normal and 1 that your prediction is an anomaly
• Returns a dictionary with the keys 'tar', 'tap', and 'f1'
– e.g.:
result = TaPR_pkg.etapr.evaluate_haicon(anomalies_list, predictions_list)
result['tar'], result['tap'], result['f1']
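A slightly fuller usage sketch follows. Only the call `evaluate_haicon(anomalies, predictions)` and the keys 'tar', 'tap', and 'f1' come from the slide. The import form, the thresholding helper, the sample scores, and the requirement that both lists have the same length are assumptions made for illustration.

```python
def threshold_scores(scores, threshold):
    """Convert per-timestamp anomaly scores into a list of 0/1 predictions."""
    return [1 if s > threshold else 0 for s in scores]


anomalies_list = [0, 0, 1, 1, 1, 0, 0, 0]           # ground-truth labels
scores = [0.1, 0.2, 0.7, 0.9, 0.4, 0.1, 0.8, 0.2]   # hypothetical detector output
predictions_list = threshold_scores(scores, threshold=0.5)

# Assumption: one label and one prediction per timestamp.
if len(anomalies_list) != len(predictions_list):
    raise ValueError("anomalies and predictions must have the same length")

try:
    # Assumed import form; available only after installing the eTaPR wheel.
    from TaPR_pkg import etapr
except ImportError:
    etapr = None

if etapr is not None:
    result = etapr.evaluate_haicon(anomalies_list, predictions_list)
    print(result['tar'], result['tap'], result['f1'])
```

The guard around the import keeps the snippet runnable on a machine where the wheel has not been installed yet.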
Appendix: Why We Do Not Consider Precision and Recall
• Precision and recall are the most well-known accuracy metrics
• They fail to evaluate the variety of detected anomalies
– Method 2 gets a higher score than Method 1 even though it detects only 𝑎1
Method   Precision   Recall
1        0.67        0.40
2        1.00        0.67
