Using Similarity Metrics for Matching Lifelong Learners
Nicolas Van Labeke, Alexandra Poulovassilis, George Magoulas
The Context
L4ALL – Lifelong Learning for All: Supporting Engagement & Participation
L4ALL – Lifelong Learning for All: The System
L4ALL User Model
Similarity Metrics
Our approach
Hypothesis 1 & 2 : Time
Hypothesis 3 : Category of episode
Code  Description
SC    Attended school
CL    Attended college
UN    Attended University
DG    Obtained a degree
CS    Attended a particular course
WK    Employed
VL    Voluntary work in charity/voluntary organisation
BS    Started a business
ML    Attended military service
RE    Retired
UE    Unemployed
CR    Home carer
MV    Moved to a different location
TV    Spent some time abroad
CH    Birth in the family
AD    Adopted a child
DE    Death in the family
MA    Got married
SE    Divorced
DS    Developed a (permanent) disability
IL    Developed a (temporary) illness
OT    Any user-defined episode not covered previously
Hypothesis 4 : Classification of episodes
Episode Category (e.g. work, college, military service, …)
Primary classification (e.g. qualification, occupation)
Secondary classification (e.g. discipline, activity sector)
Example episode: category WK, classification codes 2.3.2.1 and 6.4.0.0

Code     Description
0.0.0.0  Unknown
1.0.0.0  Managers and Senior Officials
2.0.0.0  Professional Occupations
2.3.0.0  Teaching and Research Professionals
2.3.2.0  Research Professionals
2.3.2.1  Scientific Researchers
2.3.2.2  Social Science Researchers
2.3.2.9  Researchers N.E.C.

Code     Description
0.0.0.0  Unknown
1.0.0.0  Medicine and Dentistry
6.0.0.0  Mathematical and Computer Sciences
6.4.0.0  Computer Science
Tokenisation of Timelines
CL-10.1.0.0-3.1.0.0 DG-10.1.0.0-3.1.0.0 WK-4.0.0.0-7.2.1.2 WK-11.0.0.0-3.1.3.2 WK-3.0.0.0-4.1.3.6 MV-0.0.0.0-0.0.0.0 UN-6.4.0.0-6.3.0.0
CL-10-3 DG-10-3 WK-4-7 WK-11-3 WK-3-4 MV-0-0 UN-6-6
CL-- DG-- WK-- WK-- WK-- UN--
(Axis label: Expressivity)
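The three token rows on the tokenisation slide appear to encode a timeline at decreasing levels of detail: full four-level codes, top-level codes only, and category only. A minimal Python sketch of that truncation follows, assuming each token has the form CATEGORY-CODE-CODE with dot-separated code levels. The function names `truncate_code` and `tokenise` are hypothetical and are not taken from the L4ALL system.

```python
def truncate_code(code: str, depth: int) -> str:
    """Keep only the first `depth` dot-separated levels of a classification code."""
    levels = code.split(".")
    return ".".join(levels[:depth])


def tokenise(episode: str, depth: int) -> str:
    """Re-encode one episode token at the requested classification depth.

    Assumes the token looks like 'CL-10.1.0.0-3.1.0.0': an episode category
    followed by two classification codes, separated by hyphens.
    """
    category, first_code, second_code = episode.split("-")
    return "-".join(
        [category, truncate_code(first_code, depth), truncate_code(second_code, depth)]
    )


def tokenise_timeline(timeline: str, depth: int) -> str:
    """Apply tokenise() to every whitespace-separated episode in a timeline."""
    return " ".join(tokenise(episode, depth) for episode in timeline.split())
```

With these definitions, `tokenise("CL-10.1.0.0-3.1.0.0", 1)` gives `CL-10-3` and depth 0 gives `CL--`, matching the tokens shown on the slide. A coarser depth makes more episodes compare as equal, at the cost of detail.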
Similarity Metrics
SimMetrics Java package: http://www.dcs.shef.ac.uk/~sam/simmetrics.html
Metrics: Levenshtein, Needleman-Wunsch, Jaro, Matching Coefficient, Euclidean Distance, Block Distance, Jaccard Similarity, Cosine Similarity, Dice Similarity, Overlap Coefficient
Encoding of some timelines
ID      Encoding                       Description
Source  Cl-00 Un-00 Mv-00 Wk-00        The original timeline used as the source for the similarity measure.
Id      Cl-00 Un-00 Mv-00 Wk-00        A timeline similar to the source.
Re      Un-00 Wk-00 Cl-00 Mv-00        A timeline containing the same episodes as the source but in a totally different order (i.e. no episode is at the same position in the string).
AD e    Cl-00 Un-00 Mv-00 Wk-00 Wk-00  A new work episode (similar to an existing one) is added to the timeline.
AD n    Cl-00 Un-00 Mv-00 Wk-00 Bs-00  A new episode (different from all existing ones) is added to the timeline.
RM w    Cl-00 Un-00 Mv-00              The last episode is removed from the source timeline.
RM u    Cl-00 Mv-00 Wk-00              One of the episodes of the source timeline is removed.
SB n    Cl-00 Un-00 Mv-00 Bs-00        One of the episodes of the source timeline is substituted by a new one (different from all existing ones).
SB e    Cl-00 Un-00 Mv-00 Un-00        One of the episodes of the source timeline is substituted by an existing episode.
SB v    Cl-00 Un-00 Mv-00 Wk-10        One of the episodes of the source timeline is substituted by a variant of an existing episode.
Comparison of Metrics
Metric                ID    RE    AD e  AD n  RM w  RM u  SB n  SB e  SB v
Levenshtein           1     0     0.8   0.8   0.75  0.75  0.75  0.75  0.75
Needleman-Wunsch      1     0     0.8   0.8   0.75  0.75  0.75  0.75  0.88
Jaro                  1     0.72  0.93  0.93  0.92  0.92  0.83  0.83  0.83
Matching Coefficient  1     1     0.8   0.8   0.75  0.75  0.75  0.75  0.75
Euclidean Distance    1     1     0.84  0.84  0.8   0.8   0.75  0.75  0.75
Block Distance        1     1     0.89  0.89  0.86  0.86  0.75  0.75  0.75
Jaccard Similarity    1     1     1     0.8   0.75  0.75  0.6   0.75  0.6
Cosine Similarity     1     1     1     0.89  0.87  0.87  0.75  0.87  0.75
Dice Similarity       1     1     1     0.89  0.86  0.86  0.75  0.86  0.75
Overlap Coefficient   1     1     1     1     1     1     0.75  1     0.75
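Several rows of the comparison can be reproduced by treating each timeline as a set of episode tokens. The sketch below is written in Python rather than with the SimMetrics Java package and uses the textbook set-based definitions of the Jaccard, Dice, overlap and cosine measures. That SimMetrics splits on whitespace and computes exactly these formulas is an assumption, although the results agree with the slide's values for the SB n and AD n timelines.

```python
import math
from typing import Set


def token_set(timeline: str) -> Set[str]:
    """Split a timeline on whitespace and drop duplicate episode tokens."""
    return set(timeline.split())


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Shared tokens divided by all distinct tokens."""
    return len(a & b) / len(a | b)


def dice(a: Set[str], b: Set[str]) -> float:
    """Twice the shared tokens divided by the sum of both set sizes."""
    return 2 * len(a & b) / (len(a) + len(b))


def overlap(a: Set[str], b: Set[str]) -> float:
    """Shared tokens divided by the size of the smaller set."""
    return len(a & b) / min(len(a), len(b))


def cosine(a: Set[str], b: Set[str]) -> float:
    """Shared tokens divided by the geometric mean of both set sizes."""
    return len(a & b) / math.sqrt(len(a) * len(b))


# Encodings copied from the "Encoding of some timelines" slide.
source = token_set("Cl-00 Un-00 Mv-00 Wk-00")
sb_n = token_set("Cl-00 Un-00 Mv-00 Bs-00")        # one episode substituted by a new one
ad_n = token_set("Cl-00 Un-00 Mv-00 Wk-00 Bs-00")  # one new episode added

for name, metric in [("Jaccard", jaccard), ("Dice", dice),
                     ("Overlap", overlap), ("Cosine", cosine)]:
    print(name, round(metric(source, sb_n), 2), round(metric(source, ad_n), 2))
```

Because sets ignore order and repetition, these measures score the reordered timeline (RE) and the duplicated work episode (AD e) as identical to the source, which is consistent with the 1s in those columns.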
Search for “People like me”
Explaining Similarity Measures
(Figure labels: Learner’s Timeline, Target’s Timeline)
Conclusions
Which Measure of (Dis)similarity?
(Figure: aligned episode sequences A B C _ E, E B C and A B C D _ _ _; example episodes WK with codes 2.3.2.1 and 6.4.0.0, and WK with codes 1.0.0.0 and 4.2.0.0; labels Similarity and Dissimilarity with the values 50% (2/4) and 66% (4/6).)
Future Work

Editor's Notes

  1. Access to resources and facilities; share information about pathways; reflect on current and future pathways.
  2. String metrics (also known as similarity metrics) are a class of text-based metrics that give a similarity or dissimilarity (distance) score between two text strings, for approximate matching or comparison and for fuzzy string searching. For example, the strings "Sam" and "Samuel" are not the same but can be considered similar to a degree. A string metric returns a floating-point number that gives an algorithm-specific indication of similarity. The most widely known (although rudimentary) string metric is Levenshtein distance (also known as edit distance), which takes two input strings and returns the number of insertions, deletions and substitutions needed to transform one into the other. Beyond simple metrics such as Levenshtein distance, the family has expanded to include phonetic, token-based, grammatical and character-based methods of statistical comparison.
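The edit distance described in note 2 can be sketched with the standard dynamic-programming recurrence. The Python code below is an illustration only, not the SimMetrics implementation: the function names are hypothetical, and the normalisation to a similarity score (1 minus the distance divided by the longer length) is an assumption. Because it accepts any sequence, it works on characters as well as on whole episode tokens.

```python
from typing import Sequence


def levenshtein(a: Sequence, b: Sequence) -> int:
    """Minimum number of insertions, deletions and substitutions turning a into b."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, item_a in enumerate(a, start=1):
        current = [i]  # turning a[:i] into an empty sequence takes i deletions
        for j, item_b in enumerate(b, start=1):
            substitution_cost = 0 if item_a == item_b else 1
            current.append(min(
                previous[j] + 1,                      # delete item_a
                current[j - 1] + 1,                   # insert item_b
                previous[j - 1] + substitution_cost,  # match or substitute
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(a: Sequence, b: Sequence) -> float:
    """Assumed normalisation: 1 means identical, 0 means nothing aligns."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0
    return 1.0 - levenshtein(a, b) / longest


# Character level: "Sam" becomes "Samuel" with three insertions.
print(levenshtein("Sam", "Samuel"))

# Token level: each episode token is treated as one symbol.
source = "Cl-00 Un-00 Mv-00 Wk-00".split()
reordered = "Un-00 Wk-00 Cl-00 Mv-00".split()
print(levenshtein_similarity(source, reordered))
```

Here `levenshtein("Sam", "Samuel")` is 3. At token level the reordered timeline is at distance 4 from the source, so its similarity is 0, which is how an order-sensitive metric can separate timelines that set-based measures treat as equal.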