Presentation slides on Open Science and research reproducibility. Presented by Gareth Knight (LSHTM Research Data Manager) on 18th September 2018, as part of an Open Science event for LSHTM Week 2018.
Enhance your research impact through open science
1. Enhance your research impact through open science
Gareth Knight
Research Data Manager
Library & Archives Service
researchdatamanagement@lshtm.ac.uk
2. Open Science
A broad movement that seeks to improve the quality of
research through greater:
• Transparency: Ensure methods are clearly explained and made
available earlier
• Consistency: Common standards, tools and services are used to
perform analysis.
• Collaboration: Opportunities are available for external
contribution & collaboration on research
• Access: All resources necessary to recreate the analysis are
made available in a form that enables verification & reuse
(Summary: it’s science with the benefit of 21st century tools)
3. Reproducibility Crisis
Vines et al (2014) investigated data availability for 516 articles
published 2-22 years previously: the odds of a dataset being
obtainable fell by 17% per year.
A 2016 Nature survey revealed 52% of 1,576 surveyed researchers
considered there to be a 'significant' reproducibility crisis in
science.
• Approx. 68% of respondents had failed to reproduce a medical experiment.
Research replication is time-consuming and expensive:
• Cancer Biology: https://osf.io/e81xl/wiki/home/
• Psychological Science: https://osf.io/ezcuj/wiki/home/
Retraction Watch lists 18,000+ papers that have been retracted,
many as a result of faulty science.
Vines et al (2014) https://doi.org/10.1016/j.cub.2013.11.014
Nature (2016) https://www.nature.com/news/1-500-scientists-lift-the-lid-on-reproducibility-1.19970
4. What are the benefits of open science?
Analysis of open research practices and motivations of
583 Wellcome & 259 ESRC funded researchers:
• Improved visibility of research
• More publications
• Higher citation rate – See Piwowar & Vision (2013)
• Contribute to academic profile
• Career benefits (e.g. promotion)
• New collaborations
Van den Eynden, V. et al. (2016) Towards Open Research: Practices, experiences, barriers and
opportunities. Wellcome Trust. https://doi.org/10.6084/m9.figshare.4055448
Piwowar HA, Vision TJ. (2013) Data reuse and the open data citation advantage.
https://doi.org/10.7717/peerj.175
5. Open Science by Design
Research lifecycle: Plan → Collect → Manage → Analyse → Publish
Open science practices that support each stage:
• Enhanced research standards
• Open education resources
• Open software
• Citizen science & peer review opportunities
• Open access
• Reusable resources
https://www.flaticon.com/free-icon/scientist_857648
7. Research Objectives
Research is reviewed for many purposes:
• Verification: check the analysis to confirm the conclusions are valid
• Replication: the same methods applied in a different environment to get the same result
• Reproduction: the same methods applied with a different setup
• Reuse: the same data applied to different research questions
What steps do you take to ensure your research is easier to
validate, replicate, reproduce or reuse by others?
The Difference
https://xkcd.com/242/
8. Plan for openness from the outset
Plan:
• Be aware of requirements
• Consider community engagement opportunities
• Document the research protocol & publish it
Data collection:
• Inform participants and relevant stakeholders
• Acquire raw data in electronic form using secure systems (e.g. ODK)
Data management:
• Organise resources logically
• Ensure raw data is read-only
• Assign unique IDs to relevant items
Data processing:
• Automate processing activities (as far as possible) in an open format so they can be re-applied
• Document activities performed to ensure an audit trail
Data analysis:
• Provide opportunities for relevant individuals to contribute
• Store resources used to underpin the analysis (inc. those used to produce graphs)
Reporting:
• Consider how resources can be made accessible
• Ensure resources are curated & accessible in the long term
https://doi.org/10.1371/journal.pcbi.1003285
9. Openness requirements
Research practice
• Demonstrate rigour of research
Funder requirements:
• Gold vs. Green
• Publication status, research data, other outputs
Domain-specific reporting guidelines:
• For study protocol and project outputs
https://www.equator-network.org/
Journal policies:
• Transparency and Openness Promotion (TOP)
https://cos.io/our-services/top-guidelines/
• Joint Data Archiving Policy (JDAP)
https://datadryad.org//pages/jdap
• Preregistration: https://cos.io/prereg/
10. Storage and organisation
• Ensure project resources are stored in a location that is
secure and available to relevant parties
• Can you find files from a project completed 10 years ago?
• Store on Secure Server or other defined location
• Adopt a consistent structure to organise & label content
• Content type (data, documents, code)
• Version (raw, processed)
• Sensitivity – store personal info in a secure location
• Create a file inventory spreadsheet
• Filename, location, content, source, sensitivity, etc.
https://xkcd.com/1459/
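The file inventory mentioned above can be bootstrapped with a short script rather than typed by hand. A minimal sketch in Python; the function name and column set are illustrative assumptions, not part of the slides:

```python
import csv
from pathlib import Path

# Columns mirror the slide's suggestion; descriptive fields are
# left blank for the researcher to fill in by hand.
FIELDS = ["filename", "location", "size_bytes", "content", "source", "sensitivity"]

def build_inventory(project_dir, out_csv):
    """Walk a project folder and write one inventory row per file."""
    rows = []
    for path in sorted(Path(project_dir).rglob("*")):
        if path.is_file():
            rows.append({
                "filename": path.name,
                "location": str(path.parent),
                "size_bytes": path.stat().st_size,
                "content": "",      # e.g. "survey responses, wave 1"
                "source": "",       # e.g. "ODK export"
                "sensitivity": "",  # e.g. "contains personal data"
            })
    with open(out_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=FIELDS)
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Re-running the script as the project grows keeps the inventory in step with the files actually on disk.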
11. Tidy data
Common issues:
• Column headers contain values
• Multiple variables held in 1
column.
• Variables held in both rows and
columns.
• Multiple types of observation
recorded in the same table.
Wickham applies 3rd Normal Form:
• One row for each observation
• One column for each variable
• One table for each type of observation
• Column headers (where used) should be variable names
Tidy data tools: tidyr, dplyr, ggplot2, data.table, pandas
Tidy data is a set of principles to make data more consistent.
https://www.jstatsoft.org/article/view/v059i10/v59i10.pdf
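The first common issue, column headers that contain values, can be fixed with a single reshape. A minimal sketch using pandas (one of the tools the slide names); the example data is invented for illustration:

```python
import pandas as pd

# Untidy: the year columns hold values of a 'year' variable
untidy = pd.DataFrame({
    "country": ["UK", "FR"],
    "2016": [10, 12],
    "2017": [11, 14],
})

# Tidy: one row per observation, one column per variable
tidy = untidy.melt(id_vars="country", var_name="year", value_name="cases")
```

After melting, each row is a single (country, year, cases) observation, which is the form most analysis and plotting tools expect.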
12. Documentation & metadata
What info is needed to replicate or re-apply your analysis?
What info is needed to analyse and use your data?
User guide:
• Study design and data collection methods
• Data Analysis and Preparation
• Quality checks applied
Codebook:
• Variable type (Continuous, Ordinal, Categorical,
Missing values, censored/redacted)
• Permitted responses & their meaning (what is 1?)
• Abbreviations & phrases
• Research protocols
• Standard Operating Procedures
• Codebooks & data dictionaries
• Informed Consent form &
participant information sheet
• Questionnaires, interview
guide and other collection tools
• Data papers and other
publications
• Other relevant documents
http://www.dcc.ac.uk/resources/metadata-standards
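A codebook can also be kept in machine-readable form so that coded values and missing-value conventions are applied consistently across scripts. A minimal sketch in Python; the variable name, codes and helper function are invented for illustration:

```python
# Illustrative codebook entry: variable type, permitted responses
# and their meaning, and which codes denote missing values.
codebook = {
    "smoking_status": {
        "type": "categorical",
        "values": {1: "never", 2: "former", 3: "current", 9: "missing"},
        "missing": [9],
    },
}

def decode(variable, code):
    """Translate a stored code into its documented meaning."""
    entry = codebook[variable]
    if code in entry["missing"]:
        return None  # treat documented missing codes as absent data
    return entry["values"][code]
```

Storing the codebook as data (rather than prose only) means the same definitions can be reused for validation, labelling and analysis.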
13. Working with code and scripts in workflows
• Use ‘open’ programming/scripting languages not dependent upon
proprietary software
• Don’t reinvent the wheel: reuse existing code if it serves purpose
• Don’t update the source data, generate a derived file & label the version
no.
• Add a header to code files that explains their purpose and indicates
who created them & when
• Add comments throughout code explaining purpose of functions/specific
lines (if not obvious)
• Document dependencies, including version number
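The header, commenting and versioned-output conventions above might look like this in practice. A minimal Python sketch; the filename scheme and cleaning step are illustrative assumptions:

```python
"""clean_survey.py -- derive an analysis-ready file from the raw export.

Purpose : drop completely blank records from the raw survey export
Author  : A. Researcher, 2018-09-18
Depends : Python 3.6+ (standard library only)
"""
import csv
from pathlib import Path

def clean(raw_csv, out_dir, version):
    """Write a cleaned, versioned copy; the raw file is never modified."""
    raw = Path(raw_csv)
    # Label the derived file with a version number instead of
    # overwriting the source data
    out = Path(out_dir) / f"{raw.stem}_clean_v{version}{raw.suffix}"
    with open(raw, newline="") as src, open(out, "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            # Keep any row that has at least one non-empty field
            if any(field.strip() for field in row):
                writer.writerow(row)
    return out
```

Because the raw file is opened read-only and the output carries an explicit version number, the processing step can be re-run at any time to regenerate the derived file, giving a clear audit trail.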
14. Providing access to resources
What do you make available?
• Anonymised data
• Code
• Research tools
• Workflows
When do you make it available?
• During the project lifetime
• On publication of findings
• Within 6-12 months of publication
Where do you host it?
• What platforms are appropriate to your needs?
How will access be provided?
• Open vs. controlled access
• Controlled access needs a reason (e.g. participant consent, identifiable data)
How will it be managed?
• Corresponding author, Data Access Committee, Data Sharing Agreement
https://www.flickr.com/photos/lwr/3897479560
https://www.flickr.com/photos/ryanr/142455033/
15. Data sharing principles
Findable:
• Publish a description in a research catalogue
• Obtain a permanent ID to make it easy to cite
Accessible:
• Provide a clear method to obtain files – open vs. safeguarded
• Handle access consistently (PLOS req.)
Interoperable:
• Use recognised domain standards & vocabularies
• Common formats, e.g. STATA, CSV
Reusable:
• Apply a clear usage licence – Creative Commons or other
• Provide documentation relevant to researchers in your field
The FAIR Guiding Principles for scientific data management and stewardship
16. Resource management tools
Functionality:
• Lifecycle management
• Object & version identifiers
• Workflow description standards that balance generic &
domain specific needs (E.g. DDI lifecycle, BPM variants)
Platforms:
• Electronic Lab Notebooks (RSpace, SciNote, LabArchives)
• Code hosting: myExperiment, RunMyCode, GitHub/GitLab
• Repository platforms: OSF, Data Compass
17. Analysis and reporting tools
Growing number of online tools allow you to
create and share interactive documents that
contain live code, data, and other resources
• R Markdown - https://rmarkdown.rstudio.com/
• Jupyter - http://jupyter.org/
• Colaboratory - https://colab.research.google.com/
Benefits:
• Dynamic content that combines data & analysis
• Development environment - R, Python, SQL
Disadvantages:
• Another complex platform to host & manage
• Content will become publicly accessible
Images sourced from project webpages
18. In summary
Open science requires you to consider:
• Research stakeholders who will be interested in
your work
• The value of research outputs for verification and
further use
• Systems that will be used to collect, manage,
analyse and provide access to research
https://www.flickr.com/photos/keith_marshall_avery/8132240925/