Towards Open Methods: Using Scientific Workflows in Linguistics

Published in: Technology, Education

Transcript of "Towards Open Methods: Using Scientific Workflows in Linguistics"

  1. Towards Open Methods: Using Scientific Workflows in Linguistics. Richard Littauer.
  2. Introduction: Various tools, such as Kepler, Taverna, VisTrails, and many others, have been designed to allow scientific workflows to be created, executed, and shared among scientists and laboratories.
  3-4. Introduction: Scientific workflows are typically used to automate the processing, analysis, and management of scientific data. They provide a way of tracing provenance and methodologies, which helps foster reproducible science and the publication of executable papers.
  5. Introduction: By providing front-end visualisations and adaptations of shell scripts and manual steps, workflow systems make it easier for scientists to do their work, especially when integrating grids, parallel processing, or external databases.
  6-8. Workflows in Linguistics: How does this relate to Linguistics? Many of the workflow systems I've been looking at would work in the field of corpus linguistics if we merely had open source databases online to mine. Most often, they provide a way of cleaning data and a way of processing repetitive tasks. This is directly applicable to linguistic work.
  9. Workflows in Linguistics: How does this relate to Open Linguistics?
  10-14. Open Linguistics:
     - Promote the idea and definition of open data, as specified at opendefinition.org, in linguistics and in relation to language data.
     - Act as a central point of reference and support for people interested in open linguistic data.
     - Provide guidance on legal issues surrounding linguistic data to the community.
     - Build an index of indexes of open linguistic data sources and tools, and link existing resources.
     - Facilitate communication between existing groups.
     - Serve as a mediator between providers and users of technical infrastructure.
     - Assemble best-practice guidelines and use cases to create, use, and distribute data.
  15-16. Examples: Example workflow. This grabs the most recent XKCD comic off the web. http://www.myexperiment.org/workflows/1370.html
  17-18. Examples: Another example workflow. This workflow retrieves relevant documents using an optimised query: a string is added to the original query so that the search output is ranked according to the most recent years. http://www.myexperiment.org/workflows/117.html
  19-24. Hypothetical Example (a diagram built up over six slides; labels listed in order of appearance):
     - Chinese character from a text
     - Dictionary Database
     - [zhi1], [zi2], [zhi2], [shi2], [ci1]
     - Geographical data from researcher
     - Character - Proper dialect reading - definition
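
The hypothetical example can be read as a small pipeline: look a character up in a dictionary database, get back several candidate readings, then use the researcher's geographical data to pick the reading for the relevant dialect. The Python sketch below is one possible reading of that diagram, not the author's implementation. The character key `"X"`, the region labels, the glosses, and every function name are placeholders invented for illustration; only the five readings come from the slide, and their assignment to regions here is arbitrary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    reading: str     # romanised reading with tone number, taken from the slide
    region: str      # placeholder region label (assumption)
    definition: str  # placeholder gloss (assumption)


# Stand-in for the "Dictionary Database" box. All keys, regions, and glosses
# are made up; a real workflow would query an online resource instead.
DICTIONARY = {
    "X": [
        Entry("zhi1", "region-A", "gloss 1"),
        Entry("zi2", "region-B", "gloss 2"),
        Entry("zhi2", "region-C", "gloss 3"),
        Entry("shi2", "region-D", "gloss 4"),
        Entry("ci1", "region-E", "gloss 5"),
    ]
}


def lookup_readings(character, dictionary):
    """Step 1: return every candidate entry for a character."""
    return list(dictionary.get(character, []))


def select_by_region(entries, region):
    """Step 2: keep only the entries attested for the researcher's region."""
    return [entry for entry in entries if entry.region == region]


def run_workflow(character, region, dictionary=DICTIONARY):
    """Chain the steps and return (character, reading, definition) triples."""
    candidates = lookup_readings(character, dictionary)
    matches = select_by_region(candidates, region)
    return [(character, entry.reading, entry.definition) for entry in matches]


print(run_workflow("X", "region-B"))
```

In a workflow system such as Kepler or Taverna, each function would roughly correspond to one node in the graph, and the dictionary would be a remote service rather than an in-memory table.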
  25-27. Use in Linguistics: So, if we have a linked network online that is queryable, it should hypothetically be possible to use current workflow systems to access and download data. My hope is to see how feasible this is.
  28-30. Use in Linguistics: Other use: Shims (data conversion workflows). As seen in the LexInfo slides, there are varying definitions for parts of speech (from 5 to 181 different types). Workflows could be used to standardise these after accessing the database.
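
A shim of this kind can be as small as a lookup table applied to every tagged token. The sketch below is a minimal illustration under stated assumptions: it does not use LexInfo, the mapping covers only a handful of Penn Treebank tags, the coarse labels follow the style of Universal Dependencies, and the function name and the `"X"` fallback for unmapped tags are choices made here for the example.

```python
# Illustrative, deliberately incomplete mapping from fine-grained
# Penn Treebank tags to coarse part-of-speech labels (assumption: a real
# shim would cover the full source tagset and target the project's scheme).
PTB_TO_COARSE = {
    "NN": "NOUN",
    "NNS": "NOUN",
    "NNP": "PROPN",
    "VB": "VERB",
    "VBD": "VERB",
    "VBZ": "VERB",
    "JJ": "ADJ",
    "RB": "ADV",
    "DT": "DET",
    "IN": "ADP",
}


def standardise_tags(tagged_tokens, mapping, unknown="X"):
    """Convert (token, tag) pairs to the coarse tagset.

    Tags missing from the mapping are replaced by `unknown` rather than
    dropped, so gaps in the mapping stay visible downstream.
    """
    return [(token, mapping.get(tag, unknown)) for token, tag in tagged_tokens]


source = [("The", "DT"), ("dogs", "NNS"), ("barked", "VBD"), ("loudly", "RB"), ("!", ".")]
print(standardise_tags(source, PTB_TO_COARSE))
```

Keeping the mapping as data rather than code means the same shim can be reused for another tagset by swapping the table.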
  31-35. Use in Linguistics: How does this help Open Methods? By keeping track of workflows and workflow systems before they become popular, we can make sure that users upload and share their workflows to a single repository (like myExperiment). These workflows could then be used by other linguists, along with data supplements, to produce replications and to check methodology. Also, most workflow systems are now focusing more on providing provenance solutions. This would make linguistics research more sharable, understandable, and repeatable.
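
To make the idea of provenance concrete, the sketch below runs a list of named steps and records, for each one, a fingerprint of its input and output plus a timestamp. It is a toy illustration only and does not reflect how Kepler, Taverna, or VisTrails store provenance; the function names, the log fields, and the two example steps are assumptions made for this example.

```python
import hashlib
import json
from datetime import datetime, timezone


def fingerprint(value):
    """Return a short, stable hash of a JSON-serialisable value."""
    payload = json.dumps(value, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()[:12]


def run_with_provenance(steps, data):
    """Apply each (name, function) step in order and log what happened."""
    log = []
    for name, func in steps:
        input_hash = fingerprint(data)
        data = func(data)
        log.append(
            {
                "step": name,
                "input": input_hash,
                "output": fingerprint(data),
                "finished_at": datetime.now(timezone.utc).isoformat(),
            }
        )
    return data, log


# Two hypothetical cleaning steps on a tokenised sentence.
steps = [
    ("lowercase", lambda tokens: [t.lower() for t in tokens]),
    ("drop_punctuation", lambda tokens: [t for t in tokens if t.isalpha()]),
]

result, log = run_with_provenance(steps, ["The", "dogs", "barked", "!"])
print(result)
for record in log:
    print(record)
```

Because each record's output fingerprint matches the next record's input fingerprint, another researcher who reruns the same steps on the same data can check that every intermediate result agrees.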
  36-37. Use in Linguistics: Work currently going on in this area: Steiner, Lydia, Peter F. Stadler, and Michael Cysouw. 2011. A Pipeline for Computational Historical Linguistics. Language Dynamics and Change, pp. 89-127.
  38-43. More Information: Places to look for more information:
     - http://notebooks.dataone.org/workflows
     - https://kepler-project.org/
     - http://www.taverna.org.uk/
     - http://www.myexperiment.org
     - http://www.mendeley.com/groups/1235381/workflows-in-linguistics/
     Thank you. Questions?