Pattern-based Features
A Data Transformation Pattern
http://research.microsoft.com/en-us/projects/tark/
Venkatesh-Prasad Ranganath
Jithin Thomas
Microsoft Research, India
http://research.microsoft.com/ICSE2013
Context / Scenarios
• Compatibility Testing
• Test Prioritization / Test Suite Minimization
• Representative Identification
• Similar Case Recommendation
• Anomaly Detection
Constraints
Input Characteristics
– Data is sequential
– Data is structured
– Fields may be irrelevant
– Values may be irrelevant
– Value flow may be relevant
Output Constraints
– Usable with existing DM/ML algorithms
– Amenable to simple reasoning
– Accessible
– Possess explanatory power
A Data Transformation Pattern
• Use off-the-shelf techniques to mine patterns
– Item-set mining
– Temporal pattern mining
– Association rule mining*
– Graph mining*
• Use patterns as features
– Binary/Categorical features: Presence of patterns
– Numeric features: Properties of patterns
* We have not tried these pattern mining techniques.
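The transformation above can be sketched in a few lines of Python. This is a minimal illustration, not the Tark implementation: the brute-force item-set miner, the function names `mine_itemsets` and `to_features`, the toy traces, and the two numeric properties (number of matched patterns, size of the largest matched pattern) are all assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def mine_itemsets(transactions, min_support, max_size=2):
    """Brute-force item-set mining (hypothetical helper, fine for toy data).

    Returns {itemset: support} for every item-set of up to max_size items
    that occurs in at least min_support transactions.
    """
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        for size in range(1, max_size + 1):
            for combo in combinations(unique, size):
                counts[frozenset(combo)] += 1
    return {p: c for p, c in counts.items() if c >= min_support}


def to_features(transactions, patterns):
    """Turn mined patterns into binary and numeric feature rows."""
    # Fix a deterministic column order: smaller patterns first, then by name.
    ordered = sorted(patterns, key=lambda p: (len(p), sorted(p)))
    # Binary features: 1 if the pattern is present in the transaction.
    binary = [[int(p <= set(t)) for p in ordered] for t in transactions]
    # Numeric features (assumed properties): how many patterns matched,
    # and the size of the largest matched pattern.
    numeric = []
    for row in binary:
        sizes = [len(p) for p, hit in zip(ordered, row) if hit]
        numeric.append([len(sizes), max(sizes, default=0)])
    return ordered, binary, numeric


# Toy traces; the event names are borrowed from the deck, the data is invented.
traces = [
    ["DispatchIrp", "IrpCompletion", "PreIoCompleteRequest"],
    ["DispatchIrp", "IrpCompletion"],
    ["DispatchIrp", "IoCallDriverReturn"],
]
patterns = mine_itemsets(traces, min_support=2)
ordered, binary, numeric = to_features(traces, patterns)
for trace, b, n in zip(traces, binary, numeric):
    print(trace, b, n)
```

The binary rows can be fed directly to an off-the-shelf classifier or clustering algorithm, and each column maps back to a readable pattern, which is where the explanatory power comes from.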
Example
[Diagram: a Win8 USB 3.0 driver stack (XHCI) beside a Win7 USB 2.0 driver stack (EHCI, OHCI, UHCI), each with a USB 2.0 device attached; boxes labeled Driver1 and Driver2 also appear.]
When a USB 2.0 device is plugged into a USB 3.0 port on Win8, will the USB 3.0 driver in Win8 exhibit the same behavior as the USB 2.0 driver?
Example
[Diagram: USB2 Log → Tark → USB2 Patterns; USB3 Log → Tark → USB3 Patterns; the two pattern sets feed Structural and Temporal Pattern Diffing.]
DispatchIrp forward alternates with IrpCompletion && PreIoCompleteRequest when IOCTLType=IRP_MJ_PNP(0x1B),IRP_MN_START_DEVICE(0x00), irpID=SAME, and IrpSubmitDetails.irp.ioStackLocation.control=SAME
IOCTLType=URB_FUNCTION_BULK_OR_INTERRUPT_TRANSFER(0x09) && IoCallDriverReturn && IoCallDriverReturn.irql=2 && IoCallDriverReturn.status=0xC000000E
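The diffing step can be approximated with plain set operations once each mined pattern has a hashable form. The sketch below is an assumption-laden illustration, not Tark's structural and temporal diffing: `diff_patterns` is a hypothetical helper, patterns are modeled as frozensets of tokens, and the assignment of patterns to the USB2 and USB3 sets is invented for the example rather than taken from the study.

```python
def diff_patterns(baseline, candidate):
    """Compare two pattern collections by exact membership (hypothetical helper)."""
    baseline, candidate = set(baseline), set(candidate)
    return {
        "only_baseline": baseline - candidate,   # behavior seen only in the old stack
        "only_candidate": candidate - baseline,  # behavior seen only in the new stack
        "common": baseline & candidate,          # behavior shared by both
    }


# Invented pattern sets; tokens echo the deck, membership is illustrative only.
usb2_patterns = {
    frozenset({"DispatchIrp", "IrpCompletion"}),
    frozenset({"IoCallDriverReturn", "irql=2"}),
}
usb3_patterns = {
    frozenset({"DispatchIrp", "IrpCompletion"}),
    frozenset({"IoCallDriverReturn", "irql=2", "status=0xC000000E"}),
}

result = diff_patterns(usb2_patterns, usb3_patterns)
for side, found in result.items():
    print(side, [sorted(p) for p in found])
```

Patterns that appear on only one side are the candidates to inspect for compatibility differences; exact-match comparison is the simplest possible choice and would miss patterns that differ only slightly.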
Pattern-based Features
Input Characteristics
– Data is sequential
– Data is structured
– Fields may be irrelevant
– Values may be irrelevant
– Value flow may be relevant
Output Constraints
– Usable with existing DM/ML algorithms
– Amenable to simple reasoning
– Accessible
– Possess explanatory power
Pattern
– Use off-the-shelf techniques to mine patterns
• Item-set mining
• Temporal pattern mining
– Use patterns as features
• Binary/Categorical features
• Numeric features
http://research.microsoft.com/en-us/projects/tark/
