Reconciling ourselves to what's out there: how one dataset talks to another
Tony Hirst
Dept of Computing and Communication...
I play with other people’s data….
Clustering and Approximate Matching
OpenRefine.org
Metaphone3 (soundalike)
metaphone('Epic Garments Limited')
EPKKRMNTSLMTT
metaphone('EPOCH GARMENTS LTD')
EPXKRMNTSLTT
Levenshtein (edit distance)
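
Edit distance is easy to try for yourself. The sketch below is a plain dynamic-programming Levenshtein function in Python, not OpenRefine's own implementation; the function name `levenshtein` is an illustrative choice. It is applied to the two Metaphone keys shown above to count how many single-character edits separate them.

```python
def levenshtein(a: str, b: str) -> int:
    """Count the single-character insertions, deletions and
    substitutions needed to turn string a into string b."""
    # Keep the shorter string in the inner loop so the rows stay small.
    if len(a) < len(b):
        a, b = b, a
    previous = list(range(len(b) + 1))  # distances from "" to each prefix of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to ""
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


# The two Metaphone keys from the slide differ by two edits:
# one substitution (K -> X) and one deletion (M).
print(levenshtein("EPKKRMNTSLMTT", "EPXKRMNTSLTT"))  # 2
```

A small distance between two keys suggests that the original names may refer to the same thing, but what counts as "small" is a judgment call that depends on your data.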
You know computers can do this anyway…
..it’s just that no-one’s told you how you can do it on your computer with your data…
Reconcile your data
http://bit.ly/ScoDa-bg-reconcile
http://schoolofdata.org/2013/10/18/in-support-of-the-bangladeshi-garm...
opencorporates.com
http://opencorporates.com/reconcile
cell.recon.match.name
cell.recon.match.id
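
Outside OpenRefine, the same exchange can be sketched in Python. The example below assumes the endpoint follows the usual OpenRefine-style reconciliation protocol: a `queries` parameter holding a JSON batch of `{"query": ...}` objects, answered by a list of candidates under `result` for each query key. The helper names and the sample response are hypothetical; the company ID is a placeholder, not a real OpenCorporates record. Picking the candidate flagged `match` is only a rough analogue of `cell.recon.match.name` and `cell.recon.match.id`.

```python
import json
from urllib.parse import urlencode

RECONCILE_URL = "http://opencorporates.com/reconcile"  # endpoint from the slide


def build_queries(names):
    """Build a batch payload: one keyed query (q0, q1, ...) per name."""
    return {f"q{i}": {"query": name} for i, name in enumerate(names)}


def encode_request_body(names):
    """Form-encode the batch so it could be POSTed to RECONCILE_URL."""
    return urlencode({"queries": json.dumps(build_queries(names))})


def matched_cells(names, response):
    """For each name, pull out the name and id of the candidate the
    service flagged as a match, or None if there was no such candidate."""
    rows = []
    for i, name in enumerate(names):
        candidates = response.get(f"q{i}", {}).get("result", [])
        match = next((c for c in candidates if c.get("match")), None)
        rows.append({
            "original": name,
            "match_name": match["name"] if match else None,
            "match_id": match["id"] if match else None,
        })
    return rows


names = ["Epic Garments Limited", "EPOCH GARMENTS LTD"]

# Invented response, shaped like a reconciliation reply (placeholder id).
sample_response = {
    "q0": {"result": [{"id": "/companies/xx/0000000",
                       "name": "EPIC GARMENTS LIMITED",
                       "score": 1.0, "match": True}]},
    "q1": {"result": []},
}

for row in matched_cells(names, sample_response):
    print(row)
```

Keeping the matched ID alongside the original name is what gives each row a stable identifier that other datasets can link to.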
In this way, we can make our data linkable…
Reconcile your data with what’s out there
And why not have a go at clustering too…?
Can you match your data to itself?
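
One way to match a column against itself is key-collision clustering: reduce every value to a normalised key and group the values whose keys collide. The sketch below is a simplified Python take on a "fingerprint" key (lowercase, strip punctuation, sort the unique words). It is written for illustration, is not OpenRefine's code, and the function names and sample company names are made up.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Normalise a string to a key that ignores case, accents,
    punctuation, word order and repeated words."""
    text = unicodedata.normalize("NFKD", value)
    text = text.encode("ascii", "ignore").decode("ascii")  # drop accents
    text = re.sub(r"[^\w\s]", "", text.strip().lower())    # drop punctuation
    return " ".join(sorted(set(text.split())))


def cluster(values):
    """Group values sharing a fingerprint; keep only groups that
    contain more than one distinct spelling."""
    groups = defaultdict(list)
    for value in values:
        groups[fingerprint(value)].append(value)
    return [group for group in groups.values() if len(set(group)) > 1]


names = [
    "Epic Garments Ltd",
    "epic garments ltd.",
    "Garments Epic Ltd",
    "Epoch Garments Ltd",
]
print(cluster(names))
# [['Epic Garments Ltd', 'epic garments ltd.', 'Garments Epic Ltd']]
```

Note that "Epoch Garments Ltd" is left out: a fingerprint only catches differences in case, punctuation and word order, which is where soundalike keys or edit distance come in.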
blog.ouseful.info

@psychemedia
Online info2013 reconciliation
Published in: Technology, Education

  1. Reconciling ourselves to what's out there: how one dataset talks to another Tony Hirst Dept of Computing and Communications, The Open University UK
  2. I play with other people’s data….
  3. Clustering and Approximate Matching
  4. OpenRefine.org
  5. Metaphone3 (soundalike)
  6. metaphone('Epic Garments Limited') EPKKRMNTSLMTT metaphone('EPOCH GARMENTS LTD') EPXKRMNTSLTT
  7. Levenshtein (edit distance)
  8. You know computers can do this anyway…
  9. ..it’s just that no-one’s told you how you can do it on your computer with your data…
  10. Reconcile your data http://bit.ly/ScoDa-bg-reconcile http://schoolofdata.org/2013/10/18/in-support-of-the-bangladeshi-garmentindustries-data-expedition/
  11. opencorporates.com
  12. http://opencorporates.com/reconcile
  13. cell.recon.match.name
  14. cell.recon.match.id
  15. In this way, we can make our data linkable…
  16. Reconcile your data with what’s out there
  17. And why not have a go at clustering too…?
  18. Can you match your data to itself?
  19. blog.ouseful.info @psychemedia