Towards Open Methods: Using Scientific Workflows in Linguistics
Richard Littauer
Introduction
Various tools, such as Kepler, Taverna, VisTrails, and many others, have been designed to allow scientific workflows to be created, executed, and shared among scientists and laboratories.
Scientific workflows are typically used to automate the processing, analysis, and management of scientific data. They provide a way of tracing provenance and methodologies, helping to foster reproducible science and the publication of executable papers.
By providing front-end visualisations and adaptations of shell scripts and manual steps, they make it easier for scientists to do their work, especially when integrating grids, parallel processing, or external databases.
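As a rough illustration of what tracing provenance can mean in practice, the sketch below chains a few processing steps and records, for each one, its name and a fingerprint of its input and output. It is a minimal Python sketch under my own assumptions, not how Kepler, Taverna, or VisTrails implement provenance; `run_workflow`, `digest`, and the two step functions are hypothetical names.

```python
import hashlib
import json


def digest(value):
    """Return a short, stable fingerprint of a JSON-serialisable value."""
    encoded = json.dumps(value, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()[:12]


def run_workflow(data, steps):
    """Apply each step in order and keep a provenance record for every step."""
    provenance = []
    for step in steps:
        result = step(data)
        provenance.append({
            "step": step.__name__,
            "input": digest(data),
            "output": digest(result),
        })
        data = result
    return data, provenance


# Hypothetical steps standing in for shell scripts or manual processing.
def lowercase(tokens):
    return [t.lower() for t in tokens]


def drop_punctuation(tokens):
    return [t for t in tokens if t.isalnum()]


result, provenance = run_workflow(
    ["The", "cat", ",", "SAT", "."],
    [lowercase, drop_punctuation],
)
```

Each provenance entry links to the next through matching fingerprints, so a reader can check which step produced which intermediate result.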
Workflows in Linguistics
How does this relate to Linguistics? Many of the workflow systems I have been looking at would work in corpus linguistics if only we had open-source databases online to mine. Most often they provide a way of cleaning data and of processing repetitive tasks, which is directly applicable to linguistic work.
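The kind of cleaning and repetition meant here can be sketched as one small function applied to every line of a corpus. This is only an illustration in Python; the sample lines and the names `clean_line` and `clean_corpus` are invented for the example.

```python
import re
import unicodedata


def clean_line(line):
    """Normalise Unicode, strip markup-like tags, and collapse whitespace."""
    line = unicodedata.normalize("NFC", line)
    line = re.sub(r"<[^>]+>", " ", line)      # drop anything that looks like a tag
    line = re.sub(r"\s+", " ", line).strip()  # collapse runs of whitespace
    return line


def clean_corpus(lines):
    """Run the same cleaning step over every line, skipping empty results."""
    cleaned = (clean_line(line) for line in lines)
    return [line for line in cleaned if line]


# Invented sample input.
raw = ["  <s>The cat   sat.</s> ", "<p></p>", "A\tdog\nbarked."]
corpus = clean_corpus(raw)
```

In a workflow system, `clean_line` would be one node, and the system would take care of applying it to every record.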
How does this relate to Open Linguistics?
Open Linguistics
- Promote the idea and definition of open data, as specified at opendefinition.org, in linguistics and in relation to language data.
- Act as a central point of reference and support for people interested in open linguistic data.
- Provide guidance on legal issues surrounding linguistic data to the community.
- Build an index of indexes of open linguistic data sources and tools, and link existing resources.
- Facilitate communication between existing groups.
- Serve as a mediator between providers and users of technical infrastructure.
- Assemble best-practice guidelines and use cases to create, use, and distribute data.
Examples
This grabs the most recent XKCD comic off the web.
http://www.myexperiment.org/workflows/1370.html
This workflow retrieves relevant documents using a query optimised by appending a string to the original query, so that the search output is ranked according to the most recent years.
http://www.myexperiment.org/workflows/117.html
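The slides do not say which string this workflow appends, so the following is only a guess at the general idea, not the contents of workflow 117. It assumes a search engine that accepts a parenthesised `OR` group; `add_recent_years` and its parameters are hypothetical.

```python
def add_recent_years(query, current_year, span=3):
    """Append the last `span` years to a query as an OR group (assumed syntax)."""
    if span < 1:
        raise ValueError("span must be at least 1")
    years = [str(current_year - offset) for offset in range(span)]
    return f"{query} ({' OR '.join(years)})"


optimised = add_recent_years("language evolution", current_year=2011)
```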
Hypothetical Example
Input: a Chinese character from a text
Dictionary database: [zhi1], [zi2], [zhi2], [shi2], [ci1]
Additional input: geographical data from the researcher
Output: character - proper dialect reading - definition
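The hypothetical example could be wired together roughly as follows. This is a sketch under stated assumptions: the character itself did not survive in the slides, so a placeholder key `"CHAR"` is used; the region labels and definitions are invented placeholders; and pairing each reading with a region is my own simplification of how geographical data might select a reading.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reading:
    romanisation: str
    region: str       # placeholder label, not real dialect geography
    definition: str   # placeholder text


# Stand-in for the dictionary database; all values except the
# romanisations from the slide are invented for illustration.
DICTIONARY = {
    "CHAR": [
        Reading("zhi1", "region-A", "placeholder definition A"),
        Reading("zi2", "region-B", "placeholder definition B"),
        Reading("zhi2", "region-C", "placeholder definition C"),
        Reading("shi2", "region-D", "placeholder definition D"),
        Reading("ci1", "region-E", "placeholder definition E"),
    ],
}


def lookup_readings(character, dictionary):
    """Step 1: fetch every candidate reading for the character."""
    return dictionary.get(character, [])


def select_reading(readings, region):
    """Step 2: use the researcher's geographical data to pick one reading."""
    for reading in readings:
        if reading.region == region:
            return reading
    return None


def run_example(character, region, dictionary=DICTIONARY):
    """Return (character, dialect reading, definition), or None if no match."""
    chosen = select_reading(lookup_readings(character, dictionary), region)
    if chosen is None:
        return None
    return (character, chosen.romanisation, chosen.definition)


output = run_example("CHAR", "region-B")
```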
Use in Linguistics
Hypothetically, it should be possible to use current workflow systems to access and download data.
My hope is to see how feasible this is.
Other use: shims, that is, data conversion workflows.
As seen in the LexInfo slides, there are varying definitions for parts of speech (from 5 to 181 different types). Workflows could be used to standardise these after accessing the database…
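A shim of this kind can be as small as a lookup table. The sketch below maps a handful of Penn Treebank tags onto coarse universal-style tags; the slides do not name any particular tagsets, so this pairing is my own choice of illustration, and the table is deliberately partial. `standardise` is a hypothetical name.

```python
# Illustrative, partial mapping (an assumption, not taken from the slides).
PENN_TO_UNIVERSAL = {
    "NN": "NOUN",
    "NNS": "NOUN",
    "VB": "VERB",
    "VBD": "VERB",
    "JJ": "ADJ",
    "RB": "ADV",
    "DT": "DET",
    "IN": "ADP",
}


def standardise(tagged_tokens, mapping, unknown="X"):
    """Convert (word, tag) pairs to the target tagset.

    Tags missing from the mapping are replaced by `unknown`
    rather than silently passed through.
    """
    return [(word, mapping.get(tag, unknown)) for word, tag in tagged_tokens]


tagged = [("the", "DT"), ("dogs", "NNS"), ("barked", "VBD"),
          ("loudly", "RB"), ("!", ".")]
converted = standardise(tagged, PENN_TO_UNIVERSAL)
```

Marking unmapped tags explicitly makes gaps in the conversion table visible downstream instead of mixing two tagsets in one output.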
How does this help Open Methods?
By keeping track of workflows and workflow systems before they become popular, we can make sure that users upload and share their workflows to a single repository (like myExperiment). These could then be used by other linguists, along with data supplements, to produce replications and to check methodology.
Also, most workflow systems are now focusing more on providing provenance solutions. This would make linguistics research more sharable, understandable, and repeatable.
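One way to picture a replication check: the author shares a workflow together with fingerprints of the data supplement and of the result, and another linguist re-runs the workflow and compares. This is a minimal sketch of that idea, not a feature of myExperiment; `publish`, `replicates`, and `count_types` are hypothetical names.

```python
import hashlib
import json


def fingerprint(value):
    """Stable SHA-256 fingerprint of a JSON-serialisable value."""
    encoded = json.dumps(value, sort_keys=True, ensure_ascii=False).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def publish(workflow, data):
    """Run the workflow and return a record to share alongside it."""
    return {
        "workflow": workflow.__name__,
        "data": fingerprint(data),
        "result": fingerprint(workflow(data)),
    }


def replicates(record, workflow, data):
    """Re-run the workflow and check that data and result both match."""
    return (record["data"] == fingerprint(data)
            and record["result"] == fingerprint(workflow(data)))


# Hypothetical workflow: count how often each token occurs.
def count_types(tokens):
    counts = {}
    for token in tokens:
        counts[token] = counts.get(token, 0) + 1
    return counts


record = publish(count_types, ["a", "b", "a"])
same = replicates(record, count_types, ["a", "b", "a"])
different = replicates(record, count_types, ["a", "b"])
```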
Work currently going on in this area:

Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Victor Rentea
 

Recently uploaded (20)

Simplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptxSimplifying Mobile A11y Presentation.pptx
Simplifying Mobile A11y Presentation.pptx
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
API Governance and Monetization - The evolution of API governance
API Governance and Monetization -  The evolution of API governanceAPI Governance and Monetization -  The evolution of API governance
API Governance and Monetization - The evolution of API governance
 
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
TEST BANK For Principles of Anatomy and Physiology, 16th Edition by Gerard J....
 
Six Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal OntologySix Myths about Ontologies: The Basics of Formal Ontology
Six Myths about Ontologies: The Basics of Formal Ontology
 
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data PlatformLess Is More: Utilizing Ballerina to Architect a Cloud Data Platform
Less Is More: Utilizing Ballerina to Architect a Cloud Data Platform
 
Navigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern EnterpriseNavigating Identity and Access Management in the Modern Enterprise
Navigating Identity and Access Management in the Modern Enterprise
 
WSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering DevelopersWSO2's API Vision: Unifying Control, Empowering Developers
WSO2's API Vision: Unifying Control, Empowering Developers
 
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers:  A Deep Dive into Serverless Spatial Data and FMECloud Frontiers:  A Deep Dive into Serverless Spatial Data and FME
Cloud Frontiers: A Deep Dive into Serverless Spatial Data and FME
 
The Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and InsightThe Zero-ETL Approach: Enhancing Data Agility and Insight
The Zero-ETL Approach: Enhancing Data Agility and Insight
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)Introduction to Multilingual Retrieval Augmented Generation (RAG)
Introduction to Multilingual Retrieval Augmented Generation (RAG)
 
JohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptxJohnPollard-hybrid-app-RailsConf2024.pptx
JohnPollard-hybrid-app-RailsConf2024.pptx
 
MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024MINDCTI Revenue Release Quarter One 2024
MINDCTI Revenue Release Quarter One 2024
 
[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf[BuildWithAI] Introduction to Gemini.pdf
[BuildWithAI] Introduction to Gemini.pdf
 
How to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cfHow to Check CNIC Information Online with Pakdata cf
How to Check CNIC Information Online with Pakdata cf
 
DBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor PresentationDBX First Quarter 2024 Investor Presentation
DBX First Quarter 2024 Investor Presentation
 
ChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps ProductivityChatGPT and Beyond - Elevating DevOps Productivity
ChatGPT and Beyond - Elevating DevOps Productivity
 
Vector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptxVector Search -An Introduction in Oracle Database 23ai.pptx
Vector Search -An Introduction in Oracle Database 23ai.pptx
 
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
Modular Monolith - a Practical Alternative to Microservices @ Devoxx UK 2024
 

Towards Open Methods: Using Scientific Workflows in Linguistics

  • 1. Towards Open Methods: Using Scientific Workflows in Linguistics Richard Littauer 1
  • 2. Various tools, such as Kepler, Taverna, VisTrails, and many others, have been designed to allow scientific workflows to be created, executed, and shared among scientists and laboratories. Introduction 2
  • 3. Scientific workflows are typically used to automate the processing, analysis, and management of scientific data. Introduction 3
  • 4. Scientific workflows are typically used to automate the processing, analysis, and management of scientific data. They provide a way of tracing provenance and methodologies to help foster reproducible science and the publication of executable papers. Introduction 4
  • 5. By providing front-end visualisations and adaptations of shell scripts and manual steps, workflow systems make it easier for scientists to do their work, especially when integrating grids and parallel processing or external databases. Introduction 5
  • 6. How does this relate to Linguistics? Workflows in Linguistics 6
  • 7. How does this relate to Linguistics? Many workflow systems I've been looking at would work in the field of corpus linguistics if we merely had open source databases online to mine. Workflows in Linguistics 7
  • 8. How does this relate to Linguistics? Many workflow systems I've been looking at would work in the field of corpus linguistics if we merely had open source databases online to mine. Most often, they provide a way of cleaning data and a way of processing repetitive tasks. This is directly applicable to linguistic work. Workflows in Linguistics 8
  • 9. How does this relate to Open Linguistics? Workflows in Linguistics 9
  • 10. Promote the idea and definition, as specified in opendefinition.org, of open data in linguistics and in relation to language data. Act as a central point of reference and support for people interested in open linguistic data. Provide guidance on legal issues surrounding linguistic data to the community. Build an index of indexes of open linguistic data sources and tools and link existing resources. Facilitate communication between existing groups. Serve as a mediator between providers and users of technical infrastructure. Assemble best-practice guidelines / use cases to create, use and distribute data. Open Linguistics 10
  • 17. This grabs the most recent XKCD comic off the web.
  • 21. This workflow retrieves relevant documents, based on a query optimized by adding a string to the original query that will rank the search output according to the most recent years.
  • 24. Hypothetical Example 20 Chinese character from a text
  • 25. Hypothetical Example 21 [zhi1], [zi2], [zhi2], [shi2], [ci1] Chinese character from a text Dictionary Database
  • 26. Hypothetical Example 22 [zhi1], [zi2], [zhi2], [shi2], [ci1] Chinese character from a text Dictionary Database Geographical data from researcher
  • 28. Hypothetical Example 24 [zhi1], [zi2], [zhi2], [shi2], [ci1] Chinese character from a text Dictionary Database Geographical data from researcher Character - Proper dialect reading - definition
  • 31. Hypothetically, it should be possible to use current workflow systems to access and download data. 26
  • 34. My hope is to see how feasible this is. 27
  • 35. Use in Linguistics 28 Other use:
  • 36. Use in Linguistics 29 Other use: Shims: data conversion workflows.
  • 37. Use in Linguistics 30 Other use: Shims: data conversion workflows. As seen in the LexInfo slides, there are varying definitions for parts of speech (from 5 to 181 different types). Workflows could be used to standardise these after accessing the database…
  • 38. Use in Linguistics 31 How does this help Open Methods?
  • 39. Use in Linguistics 32 How does this help Open Methods? By keeping track of workflows and workflow systems before they become popular, we can make sure that users upload and share their workflows to a single repository (like myExperiment).
  • 40. Use in Linguistics 33 How does this help Open Methods? By keeping track of workflows and workflow systems before they become popular, we can make sure that users upload and share their workflows to a single repository (like myExperiment). This could then be used by other linguists, along with data supplements, to produce replications and to check methodology.
  • 41. Use in Linguistics 34 How does this help Open Methods? Also, most workflows are now focusing more on providing provenance solutions.
  • 42. Use in Linguistics 35 How does this help Open Methods? Also, most workflows are now focusing more on providing provenance solutions. This would make linguistics research more sharable, understandable and repeatable.
  • 43. Use in Linguistics Current work on this: 36
  • 44. Use in Linguistics Current work on this: Steiner, Lydia, Peter F. Stadler, and Michael Cysouw. 2011. A Pipeline for Computational Historical Linguistics. Language Dynamics and Change, pp. 89-127. 37
  • 45. More Information Places to look for more information: http://notebooks.dataone.org/workflows 38
  • 46. More Information Places to look for more information: http://notebooks.dataone.org/workflows https://kepler-project.org/ 39
  • 47. More Information Places to look for more information: http://notebooks.dataone.org/workflows https://kepler-project.org/ http://www.taverna.org.uk/ 40
  • 48. More Information Places to look for more information: http://notebooks.dataone.org/workflows https://kepler-project.org/ http://www.taverna.org.uk/ http://www.myexperiment.org 41
  • 49. More Information Places to look for more information: http://notebooks.dataone.org/workflows https://kepler-project.org/ http://www.taverna.org.uk/ http://www.myexperiment.org http://www.mendeley.com/groups/1235381/workflows-in-linguistics/ 42
  • 50. More Information Places to look for more information: http://notebooks.dataone.org/workflows https://kepler-project.org/ http://www.taverna.org.uk/ http://www.myexperiment.org http://www.mendeley.com/groups/1235381/workflows-in-linguistics/ Thank you. Questions? 43
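
The hypothetical example and the shim idea from the slides can be sketched as a tiny workflow in Python. This is only an illustrative sketch, not how Kepler, Taverna, or any other workflow system implements it. Every name and value here is an assumption made for the example: the `CHAR` placeholder key, the region names, the region-to-reading table, and the part-of-speech mapping. Only the list of candidate readings comes from the slides. Each step is a plain function, and the runner records which steps were applied as a minimal form of provenance.

```python
from typing import Callable, Dict, List, Optional, Tuple

# --- Placeholder data (hypothetical; only the reading list is from the slides) ---
DICTIONARY: Dict[str, List[str]] = {
    "CHAR": ["zhi1", "zi2", "zhi2", "shi2", "ci1"],
}
# Hypothetical "geographical data from researcher": which reading a region uses.
REGION_TO_READING: Dict[Tuple[str, str], str] = {
    ("CHAR", "region_a"): "zhi1",
    ("CHAR", "region_b"): "shi2",
}
# Hypothetical shim: map tags from differing tagsets onto one shared label set.
POS_SHIM: Dict[str, str] = {"NN": "noun", "N": "noun", "VB": "verb", "V": "verb"}

Record = Dict[str, object]
Step = Callable[[Record], Record]


def lookup_readings(record: Record) -> Record:
    """Attach all dictionary readings for the character."""
    readings = DICTIONARY.get(str(record["character"]), [])
    return {**record, "readings": readings}


def select_dialect_reading(record: Record) -> Record:
    """Pick the reading for the record's region, if the dictionary lists it."""
    key = (str(record["character"]), str(record["region"]))
    reading: Optional[str] = REGION_TO_READING.get(key)
    if reading not in record["readings"]:
        reading = None
    return {**record, "reading": reading}


def normalise_pos(record: Record) -> Record:
    """Shim step: convert a source-specific POS tag to the shared label."""
    return {**record, "pos": POS_SHIM.get(str(record["pos"]), "unknown")}


def run_workflow(record: Record, steps: List[Step]) -> Record:
    """Apply each step in order and record the step names as provenance."""
    provenance: List[str] = []
    for step in steps:
        record = step(record)
        provenance.append(step.__name__)
    return {**record, "provenance": provenance}


STEPS: List[Step] = [lookup_readings, select_dialect_reading, normalise_pos]

if __name__ == "__main__":
    result = run_workflow(
        {"character": "CHAR", "region": "region_b", "pos": "NN"}, STEPS
    )
    print(result["reading"], result["pos"], result["provenance"])
```

Because the steps are ordinary functions with a shared record shape, a different dictionary or tagset could be swapped in by replacing one step, which is roughly the kind of reuse that sharing workflows in a repository is meant to enable.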