1. Challenges and opportunities to
open access to scientific
documents and data
Antonia Ferrer Sapena
Tony Hernández-Pérez
Maredata Research Group (maredata.net)
Workshop on Open Data and Language Processing
Technologies: An opportunity not to be missed
BEST PRACTICES
2. Project CSO2015-71867-REDT funded by:
Antonia Ferrer and Tony Hernández (Maredata), 5 October 2016
What is Maredata?
Synergies between research groups
Identify research groups producing research data
Liaisons with Special Interest Groups
Libraries, funders
Recommendations
3. Collaborative Work
Data, data, data everywhere
Is the experience transferable?
Promote open research data
From Open Access to
Open Research Data
4. Advantages & Obstacles for
Open Research Data
Advantages of Open Research Data
• Increasing the efficiency of research.
• Promoting scholarly rigour and
enhancements to the quality of research.
• Enhancing visibility and scope for
engagement.
• Enabling researchers to ask new research
questions.
• Enhancing collaboration and community-
building.
• Increasing the economic and social
impact of research.
Obstacles for researchers
• Lack of evidence of benefits and rewards.
• Lack of skills, time and other resources.
• Cultures of independence and
competition.
• Concerns about quality: is the data peer
reviewed? Fear of data being misread or
misapplied (through methodological errors
or missing key contextual information).
• Ethical, legal and other restrictions on
accessibility (anonymization, licenses…)
6. ESFRI Roadmap
European Strategy Forum on Research Infrastructures
Landmarks
7. Infrastructures for Language
Processing Technologies
• Language resources (corpora, lexical, linguistic data, audio)
• Linguistic tools (concordance, clusters, keywords…)
http://linguistlist.org/sp/GetWRListings.cfm?wrtypeid=2
• Not only for translation, and not only for linguists
• Text and data mining (TDM)
• Sentiment analysis (of text extracted from social media)
• Script analysis (of text extracted from TV or cinema scripts)
• Discourse analysis (of text extracted from all types of discourse)
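As a rough illustration of what a concordance tool from the list above does, the following sketch builds a keyword-in-context (KWIC) listing in plain Python. It is not how AntConc or any other named tool is implemented; the function name `kwic`, the `width` parameter and the sample sentence are all hypothetical, chosen only for this example.

```python
import re

def kwic(text, keyword, width=3):
    """Return (left context, keyword, right context) for each hit.

    Hypothetical helper: tokenises on word characters, lowercases
    everything, and keeps `width` tokens on either side of each match.
    """
    tokens = re.findall(r"\w+", text.lower())
    target = keyword.lower()
    rows = []
    for i, tok in enumerate(tokens):
        if tok == target:
            left = " ".join(tokens[max(0, i - width):i])
            right = " ".join(tokens[i + 1:i + 1 + width])
            rows.append((left, tok, right))
    return rows

# Made-up sample text, not taken from any real corpus.
sample = ("Open data helps research. Research data needs metadata, "
          "and open data needs licences.")

for left, kw, right in kwic(sample, "data", width=2):
    # Right-align the left context so the keyword lines up in a column.
    print(f"{left:>20} | {kw} | {right}")
```

Real concordancers add sorting, frequency counts and language-aware tokenisation; the sketch only shows the core idea of lining up every occurrence of a word with its neighbours.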
8. Tools – Data literacy
http://www.consorciomadrono.es/pagoda/index2.php
AntConc
Stanford CoreNLP
Data repository: 1 or N
23 Things (RDA)
Metadata
Privacy
Licences
Preservation
Citing Data
Community of practice
Research Data & Libraries
9. @maredataproject
http://www.maredata.net