Machine Learning Techniques for the Semantic Web

Published in: Technology, Education
1 Comment
  • Nice to find someone looking at bridging machine learning with semweb :)

    I found my way here by searching for ruby + restricted boltzmann, ... hoping to find some nicely packaged RBM implementation that could be fed to the SemWeb community, so that structure implicit in eg dbpedia and social graph data can be explored. Any recommendations? Or maybe it'd be more productive teaching the machine learning folk where to go find RDF linked data themselves?

Machine Learning Techniques for the Semantic Web

  1. 1. Machine Learning Techniques for the Semantic Web Paul Dix http://pauldix.net paul@pauldix.net
  2. 2. Machine Learning
  3. 3. Semantic Web
  4. 4. What is Semantic Web?
  5. 5. Ontology
  6. 6. RDF
  7. 7. Machine Learning is about Data
  8. 8. actually...
  9. 9. Making Predictions Based on Data
  10. 10. FOAF Simple Example
  11. 11. Marco Neumann
      <http://www.marconeumann.org/foaf.rdf> <http://xmlns.com/foaf/0.1/knows> <http://community.linkeddata.org/dataspace/person/kidehen2/about.rdf> .
      <http://www.marconeumann.org/foaf.rdf> <http://xmlns.com/foaf/0.1/knows> <http://www.johnbreslin.com/foaf/foaf.rdf> .
      <http://www.marconeumann.org/foaf.rdf> <http://xmlns.com/foaf/0.1/knows> <http://swordfish.rdfweb.org/people/libby/rdfweb/webwho.xrdf> .
      <http://www.marconeumann.org/foaf.rdf> <http://xmlns.com/foaf/0.1/knows> <http://danbri.org/foaf.rdf> .
  12. 12. Marco only knows 4 people?
  13. 13. Two Degrees Out
      4 - <http://www.w3.org/People/Connolly/home-smart.rdf>
      4 - <http://jibbering.com/foaf.rdf>
      2 - <http://sw.deri.org/~haller/foaf.rdf>
      2 - <http://sw.deri.org/~knud/knudfoaf.rdf>
      2 - <http://www-cdr.stanford.edu/~petrie/foaf.rdf>
  14. 14. Three Degrees
      9 - <http://sw.deri.org/~knud/knudfoaf.rdf>
      8 - <http://www.w3.org/People/Connolly/home-smart.rdf>
      7 - <http://jibbering.com/foaf.rdf>
      6 - <http://www.aaronsw.com/about.xrdf>
      5 - <http://sw.deri.org/~aharth/foaf.rdf>
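The "degrees out" tallies on the two slides above can be sketched as a breadth-wise walk over the knows graph. This is a minimal sketch, assuming each number on the slides is how many paths of that length reach a person; the slides do not say how the counts were produced. The graph, the names in it, and the `degrees_out` helper are hypothetical stand-ins for crawled FOAF data.

```python
from collections import Counter

# Hypothetical toy "knows" graph (person -> people they list as known).
KNOWS = {
    "marco": ["ann", "bob", "cat", "dee"],
    "ann": ["dee", "eve"],
    "bob": ["eve", "fay"],
    "cat": ["dee", "eve"],
    "dee": ["eve", "cat"],
}

def degrees_out(graph, start, depth):
    """Count how many paths of exactly `depth` hops lead from `start` to each person."""
    frontier = Counter({start: 1})
    for _ in range(depth):
        next_frontier = Counter()
        for person, paths in frontier.items():
            for friend in graph.get(person, []):
                next_frontier[friend] += paths
        frontier = next_frontier
    frontier.pop(start, None)  # ignore paths that loop back to the start
    return frontier

two_out = degrees_out(KNOWS, "marco", 2)
for person, count in two_out.most_common():
    print(f"{count} - {person}")
```

People who are reachable through many of Marco's direct contacts float to the top, which is the kind of ranking the slides show.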
  15. 15. but that’s not really machine learning
  16. 16. Short
  17. 17. Machine Learning is • How you formulate the problem • How you represent the data
  18. 18. • Graphical Models • Vector Space Models
  19. 19. Back to FOAF Convert RDF triples to vector space
  20. 20. We Want to Find Groups of People
  21. 21. To make predictions on their interests...
  22. 22. (subject) (predicate) (object)
      Paul knows Jeff
      Paul knows Joe
      Paul knows Marco
      Jeff knows Joe
  23. 23. Vector Space Representation
             Jeff  Joe  Marco  Paul
      Jeff         1           1
      Joe    1                 1
      Marco                    1
      Paul   1     1    1
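Converting the four triples into the matrix on the slide can be sketched in a few lines. This is an illustrative sketch, not the author's code; it assumes "knows" is treated as symmetric, which is what the row counts on the slide suggest.

```python
# The four triples from the slides.
triples = [
    ("Paul", "knows", "Jeff"),
    ("Paul", "knows", "Joe"),
    ("Paul", "knows", "Marco"),
    ("Jeff", "knows", "Joe"),
]

# Every subject and object becomes both a row and a column.
people = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
index = {name: i for i, name in enumerate(people)}

matrix = [[0] * len(people) for _ in people]
for subj, pred, obj in triples:
    if pred == "knows":
        matrix[index[subj]][index[obj]] = 1
        matrix[index[obj]][index[subj]] = 1  # assumption: knows is symmetric

for name, row in zip(people, matrix):
    print(f"{name:<6}", row)
```

Each row is now a vector describing one person by who they know, which is the representation the later slides operate on.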
  24. 24. Latent Factors Analysis • Used in Latent Semantic Indexing (LSI) • Good for finding synonyms • Good for finding “genres”
  25. 25. Latent Factors Methods • Principal Component Analysis (PCA) • Singular Value Decomposition (SVD) • Restricted Boltzmann Machines (RBM)
  26. 26. Considerations for Semantic Web Data • Large Data Sets • Sparse Data Sets
  27. 27. Netflix Prize Research • Movie Review Data set has similar problems • Generalized Hebbian Algorithm for Dimensionality Reduction in NLP (Gorrell ’06)
  28. 28. Reduce Dimensions • 1m x 1m matrix with 1m people • Reduce to 1m x 100
  29. 29. 100 Latent Factors Represent different groups of people based on who they know.
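As a rough illustration of the dimensionality reduction step, the sketch below applies a plain SVD to the four-person matrix and keeps two factors instead of 100. It assumes NumPy is available and that a dense decomposition is acceptable, which only holds for toy data; for a large sparse matrix an incremental method such as the Generalized Hebbian Algorithm mentioned above, or a truncated sparse solver, would be the more realistic choice.

```python
import numpy as np

# Symmetric "knows" matrix for Jeff, Joe, Marco, Paul (rows and columns in that order).
A = np.array([
    [0, 1, 0, 1],
    [1, 0, 0, 1],
    [0, 0, 0, 1],
    [1, 1, 1, 0],
], dtype=float)

k = 2  # number of latent factors to keep (the slides use 100 for 1m people)

# Thin SVD: A = U @ diag(s) @ Vt, with singular values sorted in descending order.
U, s, Vt = np.linalg.svd(A, full_matrices=False)

# One row per person, one column per latent factor.
factors = U[:, :k] * s[:k]
print(factors.round(3))
```

The signs and exact values of the factors depend on the decomposition, so they will not match the illustrative numbers on the next slide.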
  30. 30. What the Data Might Look Like
             Factor 1  Factor 2
      Paul   0.678     0.311
      Joe    0.455     0.432
      Jeff   0.476     0.398
      Marco  0.203     0.789
  31. 31. Find Similar People k Nearest Neighbors
  32. 32. Pick a Similarity Metric • Euclidean Distance • Jaccard index • Cosine Similarity
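Two of the metrics listed above can be sketched as follows. Cosine similarity is applied to the factor vectors from slide 30 and the Jaccard index to the sets of known people from slide 22 (treating "knows" as symmetric, which is an assumption). The function names are illustrative only.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)

def jaccard_index(set_a, set_b):
    """Size of the intersection divided by the size of the union."""
    union = set_a | set_b
    if not union:
        return 0.0
    return len(set_a & set_b) / len(union)

# Factor vectors from slide 30.
paul, joe = (0.678, 0.311), (0.455, 0.432)
cos_paul_joe = cosine_similarity(paul, joe)

# Who each person knows, from the triples on slide 22.
knows = {"Jeff": {"Joe", "Paul"}, "Joe": {"Jeff", "Paul"}}
jac_jeff_joe = jaccard_index(knows["Jeff"], knows["Joe"])

print(round(cos_paul_joe, 3), round(jac_jeff_joe, 3))
```

Jaccard works directly on the raw, sparse "knows" sets, while cosine and Euclidean distance are the more natural fit once people are represented by dense latent factors.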
  33. 33. Joe’s Similarity to Paul ((Paul (f1) - Joe (f1))^2 + (Paul (f2) - Joe (f2))^2)^(1/2)
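The Euclidean distance above, combined with k nearest neighbors, can be sketched using the factor values from slide 30. The `nearest_neighbors` helper is a hypothetical name, and a brute-force scan like this is only reasonable for small data.

```python
import math

# Factor vectors from slide 30.
factors = {
    "Paul": (0.678, 0.311),
    "Joe": (0.455, 0.432),
    "Jeff": (0.476, 0.398),
    "Marco": (0.203, 0.789),
}

def nearest_neighbors(name, vectors, k):
    """Return the k people whose factor vectors are closest to `name` (Euclidean)."""
    target = vectors[name]
    distances = [
        (math.dist(target, vec), other)
        for other, vec in vectors.items()
        if other != name
    ]
    return [other for _, other in sorted(distances)[:k]]

paul_to_joe = math.dist(factors["Paul"], factors["Joe"])
print(round(paul_to_joe, 4))
print(nearest_neighbors("Paul", factors, 2))
```

With these numbers Jeff comes out slightly closer to Paul than Joe does, and Marco is the furthest away.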
  34. 34. Once We’ve Calculated Similarities • Fill In Missing Interests • Target Ads, Content, Products • ??? • Profit!
  35. 35. Generalizing RDF Triples to Vector Space
  36. 36. • Subjects are Rows • Objects are Columns • Predicates are values
  37. 37.
                 Object 1   Object 2
      Subject 1  Predicate
      Subject 2
  38. 38. Predicates Should be Mutually Exclusive • Paul likes Ruby • Paul hates PHP • Paul loves PHP
  39. 39. Assign Values to Predicates • 1 = Hates • 2 = Dislikes • 3 = Neutral • 4 = Likes • 5 = Loves
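The generalization above (subjects as rows, objects as columns, predicates as values) can be sketched with a sparse dictionary of dictionaries. The lowercase predicate names and the `triples_to_ratings` helper are assumptions made for illustration; the 1 to 5 scale is the one from the slide.

```python
# Scale from slide 39; lowercase predicate spellings are an assumption.
PREDICATE_VALUES = {"hates": 1, "dislikes": 2, "neutral": 3, "likes": 4, "loves": 5}

def triples_to_ratings(triples):
    """Build a sparse subject -> {object: value} matrix from (s, p, o) triples.

    Raises ValueError if two predicates contradict each other for the same
    subject and object, since predicates should be mutually exclusive.
    """
    ratings = {}
    for subj, pred, obj in triples:
        value = PREDICATE_VALUES[pred]
        row = ratings.setdefault(subj, {})
        if obj in row and row[obj] != value:
            raise ValueError(f"conflicting predicates for {subj} / {obj}")
        row[obj] = value
    return ratings

ratings = triples_to_ratings([
    ("Paul", "likes", "Ruby"),
    ("Paul", "hates", "PHP"),
])
print(ratings)
```

Only the cells that actually have a triple are stored, which suits the large, sparse data sets mentioned earlier. Feeding in both "Paul hates PHP" and "Paul loves PHP" raises an error instead of silently overwriting one value.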
  40. 40. More Applications
  41. 41. Supervised Learning • Classifiers • Ontology Mapping • Assigning Instances to Concepts
  42. 42. Ontology Mapping • Examples from Ontology A • Examples from Ontology B
  43. 43. Train Classifiers • One Classifier for each Concept in A • One Classifier for each Concept in B
  44. 44. Classify Instances • Use A Classifiers to predict which concepts B instances map to • Use B Classifiers to predict which concepts A instances map to
  45. 45. Use Classified Instances • Predict Concept Mappings • Which in A match ones in B
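The ontology mapping steps above can be sketched end to end on toy data. Everything here is hypothetical: the two small ontologies, the bag-of-words instances, and the word-count scorer that stands in for a real classifier. The point is only the shape of the procedure: train one classifier per concept, classify the other ontology's instances, then vote.

```python
from collections import Counter, defaultdict

# Hypothetical toy ontologies: concept -> example instances (sets of words).
ONTOLOGY_A = {
    "Film": [{"director", "runtime", "cast"}, {"cast", "runtime", "studio"}],
    "Book": [{"author", "pages", "publisher"}, {"author", "isbn", "pages"}],
}
ONTOLOGY_B = {
    "Movie": [{"director", "cast", "rating"}, {"studio", "cast", "runtime"}],
    "Novel": [{"author", "pages", "genre"}, {"publisher", "author", "isbn"}],
}

def train(ontology):
    """One 'classifier' per concept: here, just word counts over its examples."""
    return {
        concept: Counter(word for inst in instances for word in inst)
        for concept, instances in ontology.items()
    }

def classify(classifiers, instance):
    """Return the concept whose word counts score the instance highest."""
    return max(classifiers, key=lambda c: sum(classifiers[c][w] for w in instance))

def predict_mappings(source, target):
    """Classify target instances with source classifiers and majority-vote a mapping."""
    classifiers = train(source)
    votes = defaultdict(Counter)
    for concept, instances in target.items():
        for inst in instances:
            votes[concept][classify(classifiers, inst)] += 1
    return {concept: tally.most_common(1)[0][0] for concept, tally in votes.items()}

b_to_a = predict_mappings(ONTOLOGY_A, ONTOLOGY_B)  # A classifiers on B instances
a_to_b = predict_mappings(ONTOLOGY_B, ONTOLOGY_A)  # B classifiers on A instances
print(b_to_a, a_to_b)
```

When both directions agree, as they do here for Film/Movie and Book/Novel, the mapping is a stronger candidate than a match found in one direction only.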
  46. 46. Limitations • One Classifier per Concept • Large Ontologies Could be a Problem • Ontologies should be a little similar
  47. 47. Unsupervised Learning • Clustering • Hierarchical Clustering • Learning Ontologies from Text
  48. 48. Machine Learning as Triage • Automatically tag or recommend Examples the algorithm is Certain About • Send uncertain examples to human for review
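The triage idea above amounts to a confidence threshold. The sketch below assumes each prediction comes with a confidence score between 0 and 1; the threshold of 0.9, the example predictions, and the `triage` helper are all hypothetical.

```python
def triage(predictions, threshold=0.9):
    """Split (item, label, confidence) predictions into auto-accepted and human-review lists."""
    accepted, needs_review = [], []
    for item, label, confidence in predictions:
        if confidence >= threshold:
            accepted.append((item, label))
        else:
            needs_review.append((item, label))
    return accepted, needs_review

# Hypothetical classifier output: (item, predicted tag, confidence).
predictions = [
    ("doc-1", "ruby", 0.97),
    ("doc-2", "php", 0.55),
    ("doc-3", "semantic-web", 0.91),
]

accepted, needs_review = triage(predictions)
print("auto-tagged:", accepted)
print("for review:", needs_review)
```

Raising the threshold sends more examples to human reviewers and lowers the risk of wrong automatic tags; lowering it does the opposite.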
  49. 49. Thank You Paul Dix paul@pauldix.net http://pauldix.net
