TXDHC OpenRefine Training

Presented by Jennifer Hecker and Elizabeth Grumbach and hosted by the Texas Consortium on Digital Humanities, these are the slides for the TXDHC training webcast on OpenRefine, February 12th, 2015.

  1. Intro to OpenRefine: an overview and walkthrough to get you started.
  2. • intro/overview (15 min)
     • walkthrough (45 min)
     • intro to advanced (10 min)
     • q&a (20 min)
     http://www.txdhc.org/txdhc-training-webcast-materials/
  3. Jennifer Hecker, Liz Grumbach
  4. “a tool for working with messy data”
  5. Cleaning up data that:
     • is in a simple tabular format
     • is inconsistently formatted
     • has inconsistent terminology
  6. • get an overview of a data set
     • resolve inconsistencies
     • split data up into more granular parts
     • match local data up to other data sets
     • enhance a data set with data from other sources
  7. https://cms-assets.tutsplus.com/uploads/users/199/posts/20843/image/text-facet-openrefine.png
  8. https://cms-assets.tutsplus.com/uploads/users/199/posts/20843/image/clustering-openrefine.png
  9. https://cms-assets.tutsplus.com/uploads/users/199/posts/20843/image/clustering-openrefine.png
  10. Freebase Gridworks = Google Refine = OpenRefine = Refine
  11. …ask some questions about your data set:
      • What type of data is it, and what format is it in?
      • What’s the size of your data set?
      • What question do you want to ask your data?
      • What do you need to do to find the answer?
  12. • Excel: familiar, better for data entry and cut-and-paste operations, no paging to navigate
      • Google Spreadsheets: similar to Excel, can get external data relatively easily, easy to collaborate and share
      • Google Fusion Tables: good if you just want to filter, easy to share
      • Text editor: a powerful text editor can do many things
      • Unix tools: more challenging to use, but quick, and some things (finding things, sorting) are easy
      • Writing code: most sophisticated, and the most to learn!
  13. <And now Liz attempts the dangerous LIVE DEMO!>
  14. Regular expressions
      • “wildcards on steroids” that allow for more granular data manipulation (http://www.regular-expressions.info)
  15. Transformations using the General Refine Expression Language (GREL)
      • kind of like a formula in Excel
  16. Retrieve data from online sources
      • example: use names to retrieve birth/death dates from the Virtual International Authority File (VIAF)
      Match data to external data sources using
      • extensions for RDF, DBpedia, Named-Entity Recognition (NER), etc.
      • ‘reconciliation’ services
  17. Use the ‘cross’ function to compare the contents of two Refine projects, or to share data between the two projects.
  18. • TxDHC blog post on this webinar http://www.txdhc.org/txdhc-training-webcast-materials/
      • The OpenRefine Wiki https://github.com/OpenRefine/OpenRefine/wiki
      • OpenRefine User Documentation https://github.com/OpenRefine/OpenRefine/wiki/Documentation-For-Users
      • The ‘Free your metadata’ site http://freeyourmetadata.org ...
      • …and book http://book.freeyourmetadata.org
      • The OpenRefine mailing list and forum http://groups.google.com/d/forum/openrefine
  19. http://bit.ly/1uGPd0f Please email us if you have any questions: Jennifer = jenniferraehecker@gmail.com, Liz = egrumbac@tamu.edu
  20. credits * acknowledgements * citations
      These slides were developed by Jennifer Hecker (j.hecker@Austin.utexas.edu) and Liz Grumbach (egrumbac@tamu.edu) on behalf of University of Texas Libraries, Texas A&M’s Initiative for Digital Humanities, Media and Culture, and the Texas Digital Humanities Consortium, using many resources, including the wonderful course material developed by Owen Stephens on behalf of the British Library (http://www.meanboyfriend.com/overdue_ideas/2014/11/working-with-data-using-openrefine/).
      Unless otherwise stated, all images, audio or video content are separate works with their own license, and should not be assumed to be CC-BY in their own right.
      This work is licensed under a Creative Commons Attribution 4.0 International License http://creativecommons.org/licenses/by/4.0/. When crediting this work, it is suggested that you include the phrase “Developed by Liz Grumbach and Jennifer Hecker on behalf of the University of Texas, Texas A&M, and the TXDHC.”
      Thanks to University of Texas Libraries, Texas A&M’s Initiative for Digital Humanities, and the Texas Digital Humanities Consortium for facilitating this presentation.
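
The slides mention resolving inconsistencies and show a clustering screenshot, but the live demo itself is not captured here. As a rough illustration of the idea, the Python sketch below groups near-duplicate values by reducing each one to a normalized key. It is modeled loosely on key-collision ("fingerprint") clustering and is not OpenRefine's own code; the function names `fingerprint` and `cluster` and the sample values are assumptions made for this example.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(value: str) -> str:
    """Reduce a string to a normalized key so near-duplicates collide.

    Hypothetical helper for illustration; not OpenRefine's implementation.
    """
    # Fold accented characters to their closest ASCII equivalents.
    ascii_text = (
        unicodedata.normalize("NFKD", value)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Trim, lowercase, then drop everything except letters, digits, whitespace.
    cleaned = re.sub(r"[^a-z0-9\s]", "", ascii_text.strip().lower())
    # Split on whitespace, remove duplicate tokens, and sort them.
    tokens = sorted(set(cleaned.split()))
    return " ".join(tokens)


def cluster(values):
    """Group values that share a fingerprint; keep only groups with variants."""
    groups = defaultdict(list)
    for v in values:
        groups[fingerprint(v)].append(v)
    return [g for g in groups.values() if len(set(g)) > 1]


# Made-up sample column with inconsistent formatting.
names = [
    "Texas A&M University",
    "texas a&m university ",
    "University, Texas A&M",
    "Rice University",
]

for group in cluster(names):
    print(group)
```

Each printed group is a set of candidate variants that a person would still review before merging, much as the clustering dialog asks for confirmation. The regular expression on the `cleaned` line is a small instance of the "wildcards on steroids" idea from slide 14.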