The challenge of sharing data well, how publishers can help
Varsha Khodiyar
Researchers, academic institutes and funders are increasingly recognizing the importance of data sharing for reproducible science. However, it is not always straightforward and clear to researchers as to how best to share data in a useful way. At Springer Nature we are working on several initiatives to help facilitate the sharing of research data in a reusable way, with our overarching goal being to publish research that is robust and reproducible. I will talk about the effort that goes into our flagship data journal, Scientific Data, to facilitate best practices in publication and sharing of research data, and share some of our experiences publishing Challenge datasets. I will also describe some of the newer Research Data Services that are now available to help all researchers (not only Springer Nature authors) to share their data in a useful way.
Presentation by Ruth Wilson on Nature Publishing Group's Scientific Data journal given at the Now and Future of Data Publishing Symposium, 22 May 2013, Oxford, UK
Identifying and tracking research resources using RRIDs: a practical approach
dkNET
In this presentation, you will learn (1) why you need to use Research Resource Identifiers (RRIDs), (2) what the Resource Identification Initiative is, (3) how dkNET.org supports RRIDs, and (4) what you can do with RRIDs.
DataONE Education Module 01: Why Data Management?
DataONE
Lesson 1 in a set of 10 created by DataONE on Best Practices for Data Management. The full module can be downloaded from the DataONE.org website at: http://www.dataone.org/educaiton-modules. Released under a CC0 license, attribution and citation requested.
Presentation slides on Open Science and research reproducibility. Presented by Gareth Knight (LSHTM Research Data Manager) on 18th September 2018, as part of an Open Science event for LSHTM Week 2018.
Transparency and reproducibility in research
Louise Corti
Talk given at the ESS Summer School: An introduction to using big data in the social sciences, 20-24 July 2020, University of Essex, Colchester, UK.
In the morning we look at publishing and sharing data and the importance of research replication and code sharing, examining what methodological issues peer reviewers might look for in a published paper using big data. An increasing number of journals in the sciences and social sciences expect a high degree of transparency, and knowing how best to publish high quality raw (or processed) data, methodology and code is a useful skill. We show how ‘data papers’ help to elucidate how datasets were constructed, compiled and processed, and help to showcase the value of data beyond the original research.
Wouter Haak's presentation on open science and research data management from the Elsevier Library Connect Event 2016 "Navigating the new publishing & open science terrain: what librarians need to know." Wouter is Elsevier's Vice President of Research Data Management Solutions.
Spring 2014 Data Management Lab: Session 2 Slides (more details at http://ulib.iupui.edu/digitalscholarship/dataservices/datamgmtlab)
What you will learn:
1. Build awareness of research data management issues associated with digital data.
2. Introduce methods to address common data management issues and facilitate data integrity.
3. Introduce institutional resources supporting effective data management methods.
4. Build proficiency in applying these methods.
5. Build strategic skills that enable attendees to solve new data management problems.
Lesson 7 in a set of 10 created by DataONE on Best Practices for Data Management. The full module can be downloaded from the DataONE.org website at: http://www.dataone.org/educaiton-modules. Released under a CC0 license, attribution and citation requested.
This presentation was provided by Patricia Payton of Proquest during the NISO webinar, Engineering Access Under the Hood, Part Two, held on November 15, 2017.
DataONE Education Module 03: Data Management Planning
DataONE
Lesson 3 in a set of 10 created by DataONE on Best Practices for Data Management. The full module can be downloaded from the DataONE.org website at: http://www.dataone.org/educaiton-modules. Released under a CC0 license, attribution and citation requested.
Preparing your data for sharing and publishing
Varsha Khodiyar
Talk given as part of the MRC Cognition and Brain Sciences Unit Open Science Day on 20th November 2018, University of Cambridge (https://www.eventbrite.co.uk/e/open-science-day-at-the-mrc-cbu-tickets-50363553745)
Presentation to IASSIST 2013, in the session Expanding Scholarship: Research Journals and Data Linkages. Describes PREPARDE workshop on repository accreditation for data publication and invites comments on guidelines.
INSERM Workshop 246 - Management and reuse of health data: methodological issues: https://ateliersinserm.dakini.fr/en/workshop.246.management.and.reuse.of.health.data.methodological.issues-66-22.php
Capturing and Analyzing Publication, Citation and Usage Data for Contextual C...
NASIG
Libraries have long sought to demonstrate the value of their collections through a variety of usage statistics. Traditionally, a strong emphasis is placed on high usage statistics when evaluating journals in collection development discussions. However, as budget pressures persist, administrators are increasingly concerned with looking beyond traditional usage metrics to determine the real impact of library services and collections. By examining journal usage in the context of scholarly communication, we hope to gain a more holistic understanding of the use and impact of our library’s resources. In this session, we begin by outlining our methodology for gathering comprehensive publication and citation data for authors affiliated with Northwestern University’s Feinberg School of Medicine, utilizing Web of Science as our primary data source and leveraging a custom Python script to manage the data. Using this data we discuss various potential metrics that could be employed to measure and evaluate journals in institutional and field-specific contexts, including but not limited to: number of publications and references per journal, co-citation networks, percentage of references per journal, and increases or decreases of references over time per title. We then consider the development of normalized benchmarks and criteria for creating field-specific core journal lists. We also discuss a process for establishing usage thresholds to evaluate existing journal subscriptions and to highlight potential gaps in the collection. Finally, we apply and compare these metrics to traditional collection development tools like COUNTER usage reports, cost-per-use analysis, Inter-Library Loan statistics and turnaway reports, to determine what correlations or discrepancies might exist. 
We finish by highlighting some use-cases which demonstrate the value of considering publication and citation metrics, and provide suggestions for incorporating these metrics into library collection development practices.
Speakers: Joelen Pastva and Jonathan Shank, Northwestern University
Project GitHub page: https://goo.gl/2C2Pcy
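Two of the metrics named in the abstract above, the number of references per journal and the percentage of references per journal, can be sketched in a few lines of Python. This is only an illustration: the function name and the journal names are made up, and it is not the speakers' script or their Web of Science workflow.

```python
from collections import Counter


def references_per_journal(cited_journals):
    """Map each journal to (reference count, percentage of all references).

    `cited_journals` is a list with one journal name per cited reference.
    """
    counts = Counter(cited_journals)
    total = sum(counts.values())
    # most_common() orders journals from most to least cited.
    return {journal: (n, 100.0 * n / total) for journal, n in counts.most_common()}


# Made-up journal names, purely for illustration.
refs = ["Journal A", "Journal B", "Journal A", "Journal C", "Journal A"]
stats = references_per_journal(refs)
for journal, (n, pct) in stats.items():
    print(f"{journal}: {n} references ({pct:.1f}%)")
```

Computing the same table for successive years would give the per-title increases or decreases over time that the abstract also mentions.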
FAIR data has flown up the hype curve without a clear sense of return from the required data stewardship investment. The killer use case for FAIR data is a science knowledge graph. It enables you to richly address novel questions of your and the world’s data. We started with data catalogues (findability) which exploited linked/referenced data using a few focused vocabularies (interoperability), for credentialed users (accessibility), with provenance and attribution (reusability) to make this happen.
This talk was presented at The Molecular Medicine Tri-Conference/Bio-IT West on March 11, 2019.
Introduction to research data management
dri_ireland
An Introduction to Research Data Management: slides from a presentation given online on May 12 2022, by Beth Knazook, Project Manager, Research Data. Covers topics such as: what are research data; why share research data; why DMPs are important; and where should you share your data?
FAIR Data Knowledge Graphs–from Theory to Practice
Tom Plasterer
FAIR data has flown up the hype curve without a clear sense of return from the required data stewardship investment. The killer use case for FAIR data is a science knowledge graph. It enables you to richly address novel questions of your and the world’s data. We started with data catalogues (findability) which exploited linked/referenced data using a few focused vocabularies (interoperability), for credentialed users (accessibility), with provenance and attribution (reusability) to make this happen. Our processes enable simple creation of dataset records and linking to source data, providing a seamless federated knowledge graph for novice and advanced users alike.
Presented May 7th, 2019 at the Knowledge Graph Conference, Columbia University.
Research Data Sharing and Re-Use: Practical Implications for Data Citation Pr...
SC CTSI at USC and CHLA
Date: Apr 4, 2018
Speakers: Hyoungjoo Park, PhD candidate, School of Information Studies, University of Wisconsin-Milwaukee, and Dietmar Wolfram, PhD
Overview: It is increasingly common for researchers to make their data freely available. This is often a requirement of funding agencies but also consistent with the principles of open science, according to which all research data should be shared and made available for reuse. Once data is reused, the researchers who have provided access to it should be acknowledged for their contributions, much as authors are recognised for their publications through citation. Hyoungjoo Park and Dietmar Wolfram have studied characteristics of data sharing, reuse, and citation and found that current data citation practices do not yet benefit data sharers, with little or no consistency in their format. More formalised citation practices might encourage more authors to make their data available for reuse.
GSmith Springer Nature Data policies and practices: HKU Open Data and Data Pu...
GrahamSmith646206
Supporting research data across Springer Nature: joining up policy and practice. Slides from Graham Smith (Research Data Manager, Springer Nature) at HKU Open Data and Data Publishing Seminar, 25th October 2021.
Digital transformation to enable a FAIR approach for health data science
Varsha Khodiyar
Invited talk for ConTech Pharma on 1st March 2022
Abstract
Health Data Research UK is the UK’s national institute for health data science, with a mission to unite the UK’s health data to enable discoveries that improve people’s lives. In this talk, Dr Varsha Khodiyar will outline how HDR UK is bringing together disparate health data from all four countries of the United Kingdom, creating the infrastructure to enable discovery of and access to health data, and the convening standards making bodies to improve data linkage and data reuse. Varsha will also discuss how HDR UK is moving beyond the traditional confines of FAIR data to also ensure that data sharing and data use is transparent and ‘fair’ for the patients and lay public who are the subjects of these datasets.
Lessons from the UK: Data access, patient trust & real-world impact with heal...
Varsha Khodiyar
Slides supporting presentation given at the virtual Beilstein Open Science Symposium in October 2021.
Abstract:
Health Data Research UK’s mission is to unite the UK’s health data to enable discoveries that improve people’s lives. Our 20-year vision is for large scale data and advanced analytics to benefit every patient interaction, clinical trial, biomedical discovery and enhance public health. A key part of HDR UK’s vision is our data portal, the Innovation Gateway. The Gateway facilitates discovery of healthcare data and simplifies data request procedures across multiple data custodians. The Gateway contains metadata on a variety of datasets, including those related to COVID-19, cardiovascular, maternal health, emergency care, primary care, secondary care, acute care, palliative care, biobanks, research cohorts and deeply phenotyped patient cohorts.
From the outset HDR UK has sought the voices, views and experiences of patient and lay-public groups to ensure there is transparency and clear public benefit in the use of the UK’s health data. Patient and public involvement is key to making the Gateway accessible, transparent and to ensure public confidence in research access to health data. The importance of public outreach combined with providing research access to data is illustrated with HDR UK’s contribution to the UK’s coronavirus pandemic response. HDR UK was tasked by the UK’s Chief Scientific Office to build and facilitate the infrastructure to support the National Core Studies, providing key insights on the evolving situation to UK policy makers during the course of the pandemic.
In this talk, I will show how HDR UK is enabling open science by facilitating the discovery of health data, and simplifying the process of requesting access to multiple datasets. I’ll discuss HDR UK’s approach to embedding transparency on research data usage for patients and public, and summarise some of the key ways in which HDR UK has contributed to the coronavirus pandemic.
The information in this slide deck was presented at the Covid Crisis in India - Information & Appeal on Sunday 23rd May 2021.
If you find the information in this slide deck useful, please donate to https://justgiving.com/fundraising/covidcrisisinindia
Data citation and sharing during article publication
Varsha Khodiyar
Deck presented to CHORUS forum on 21st Jan 2021, as part of panel on Data Citations & Sharing (https://www.chorusaccess.org/events/chorus-forum-new-connections/)
What role can publishers play in the open data ecosystem?
Varsha Khodiyar
Presentation at session 3 of the NIH workshop 'Role of Generalist Repositories to Enhance Data Discoverability and Reuse' on Feb 11th, at the NIH Main Campus.
New approaches to data management: supporting FAIR data sharing at Springer N...
Varsha Khodiyar
Presentation given at Biocuration 2019 Session 5 (Data standards and ontologies: Making data FAIR)
Abstract:
Since 2016, academic publishers including Springer Nature, Elsevier and Taylor & Francis have been providing standard research data policies to journal authors, reflecting key aspects of the FAIR Principles’ practical applications: sharing data in repositories, using persistent identifiers and citing data appropriately. In spite of the rise of FAIR and good data management practice, recent surveys found that nearly 60% of researchers had never heard of the FAIR Principles, and 46% are not sure how to organise their data in a presentable and useful way. In this presentation we will analyse the results of a white paper which assessed the key challenges faced by researchers in sharing their data, and discuss current initiatives and approaches to support researchers to adopt good data sharing practice.
These include the roll-out of research data policies since 2016, as well as the launch of a Helpdesk service which has provided support to authors and allowed the research data team to capture more granular information on the challenges they face in sharing their data. We will also discuss the development of a third-party curation service which assists authors in depositing their data into appropriate repositories, and drafting data availability statements.
Finally we will assess the impacts of some of these interventions, including an analysis of data availability statements and an overview of the methods authors are currently using to share their data, and how these align with FAIR.
The value of data curation as part of the publishing process
Varsha Khodiyar
Presentation given at Biocuration 2019 Session 5 (Interacting with the Research Community)
Abstract: Journals and publishers have an important role to play in the drive to increase the reproducibility of published science. Since its launch in 2014, the Nature Research journal Scientific Data has established a reputation for publishing data papers (‘Data Descriptors’) that are highly reusable, as evidenced by a strong citation record. One of the ways in which Scientific Data ensures maximum reusability of published data is via the in-house data curation workflow applied to every Data Descriptor. In 2017, Springer Nature launched its Research Data Support (RDS) service to provide data curation expertise to researchers publishing at other Springer Nature journals.
During curation at Scientific Data and RDS, our data editors familiarise themselves with the related manuscript and perform a thorough check of each data archive. This ensures the descriptions in the manuscript match the metadata and data at the data repositories. The curation process facilitates the identification of any discrepancies between the manuscript text and the information held at the data repository.
Over the last year, the curation team have been recording the types of discrepancies rectified as a direct result of our curation process. At Scientific Data approximately 10% of the discrepancies the team find are significant enough to potentially have warranted a formal correction had the issue not been resolved prior to publication.
In this presentation we give an overview of our observed outcomes from embedding data curation within the publishing process. We describe how we are monitoring the value of our curation work, and show examples of the types of discrepancy most commonly identified through curation at Scientific Data and RDS.
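The check described in this abstract, confirming that what the manuscript describes matches what is held at the repository, is carried out by human data editors. Purely as an illustration of the idea, one small part of it (comparing the files a manuscript mentions against the files actually deposited) could be sketched as a set comparison. The function and file names below are hypothetical and do not represent the Scientific Data or RDS workflow.

```python
def find_discrepancies(manuscript_files, repository_files):
    """Compare file names described in a manuscript with those in a repository.

    Returns the names that appear on one side only, sorted for stable output.
    """
    described = set(manuscript_files)
    deposited = set(repository_files)
    return {
        "missing_from_repository": sorted(described - deposited),
        "undescribed_in_manuscript": sorted(deposited - described),
    }


# Hypothetical file names, for illustration only.
report = find_discrepancies(
    manuscript_files=["samples.csv", "readme.txt", "images.zip"],
    repository_files=["samples.csv", "readme.txt", "raw_counts.csv"],
)
print(report)
```

A real curation pass also compares descriptions, units and metadata, which need editorial judgement rather than a name match.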
Facilitating good research data management practice as part of scholarly publ...
Varsha Khodiyar
Presentation given to the SciDataCon #IDW2018 session: Democratising Data Publishing: A Global Perspective, on Tuesday 6th November 2018, Gaborone, Botswana
Practical challenges for researchers in data sharing
Varsha Khodiyar
Presentation given at the Research Data Alliance Plenary 12 session: IG Open Questionnaire for Research Data Sharing Survey, on Tuesday 6th November 2018, Gaborone, Botswana
Update from Data policy standardisation and implementation IG
Varsha Khodiyar
Update given to the Research Data Alliance Plenary 12 joint meeting session: WG FAIRSharing Registry and Data Policy Standardisation and Implementation IG, on Monday 5th November 2018, Gaborone, Botswana
Data Publishing and Institutional Repositories
Varsha Khodiyar
Slides presented at the Force16 panel discussion on 18th April 2016 "Libraries united in opening new scholarly platforms" https://www.force11.org/meetings/force2016/program/agenda/concurrent-session-libraries-united-opening-new-scholarly
Presentation given at Open Science question and answer session hosted by the Institute for Quantitative Social Science (IQSS), and the Office for Scholarly Communication (OSC) at Harvard University, on July 16th 2014.
Slides shown to BOSC2014 (Bioinformatics Open Source Conference 2014) attendees as an introduction to the open science journal F1000Research, prior to a panel discussion on reproducibility.
Earliest Galaxies in the JADES Origins Field: Luminosity Function and Cosmic ...
Sérgio Sacani
We characterize the earliest galaxy population in the JADES Origins Field (JOF), the deepest
imaging field observed with JWST. We make use of the ancillary Hubble optical images (5 filters
spanning 0.4−0.9µm) and novel JWST images with 14 filters spanning 0.8−5µm, including 7 medium-band filters, and reaching total exposure times of up to 46 hours per filter. We combine all our data
at > 2.3µm to construct an ultradeep image, reaching as deep as ≈ 31.4 AB mag in the stack and
30.3-31.0 AB mag (5σ, r = 0.1” circular aperture) in individual filters. We measure photometric
redshifts and use robust selection criteria to identify a sample of eight galaxy candidates at redshifts
z = 11.5 − 15. These objects show compact half-light radii of $R_{1/2} \sim 50-200$ pc, stellar masses of
$M_\star \sim 10^{7}-10^{8}\,M_\odot$, and star-formation rates of SFR $\sim 0.1-1\,M_\odot\,\mathrm{yr}^{-1}$. Our search finds no candidates
at 15 < z < 20, placing upper limits at these redshifts. We develop a forward modeling approach to
infer the properties of the evolving luminosity function without binning in redshift or luminosity that
marginalizes over the photometric redshift uncertainty of our candidate galaxies and incorporates the
impact of non-detections. We find a z = 12 luminosity function in good agreement with prior results,
and that the luminosity function normalization and UV luminosity density decline by a factor of ∼ 2.5
from z = 12 to z = 14. We discuss the possible implications of our results in the context of theoretical
models for evolution of the dark matter halo mass function.
Richard's aventures in two entangled wonderlands
Richard Gill
Since the loophole-free Bell experiments of 2020 and the Nobel prizes in physics of 2022, critics of Bell's work have retreated to the fortress of super-determinism. Now, super-determinism is a derogatory word - it just means "determinism". Palmer, Hance and Hossenfelder argue that quantum mechanics and determinism are not incompatible, using a sophisticated mathematical construction based on a subtle thinning of allowed states and measurements in quantum mechanics, such that what is left appears to make Bell's argument fail, without altering the empirical predictions of quantum mechanics. I think however that it is a smoke screen, and the slogan "lost in math" comes to my mind. I will discuss some other recent disproofs of Bell's theorem using the language of causality based on causal graphs. Causal thinking is also central to law and justice. I will mention surprising connections to my work on serial killer nurse cases, in particular the Dutch case of Lucia de Berk and the current UK case of Lucy Letby.
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...
Scintica Instrumentation
Intravital microscopy (IVM) is a powerful tool utilized to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been accomplished using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed tissue imaging, IVM allows ultra-fast, high-resolution imaging of cellular processes over time and space as they occur in their natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provides insights into the progression of disease, response to treatments or developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system’s unique features and user-friendly software enable researchers to probe fast dynamic biological processes such as immune cell tracking, cell-cell interaction as well as vascularization and tumor metastasis with exceptional detail. This webinar will also give an overview of IVM being utilized in drug development, offering a view into the intricate interaction between drugs/nanoparticles and tissues in vivo and allowing for the evaluation of therapeutic intervention in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancements of novel therapeutic strategies.
Brief information about the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
Observation of Io’s Resurfacing via Plume Deposition Using Ground-based Adapt...
Sérgio Sacani
Since volcanic activity was first discovered on Io from Voyager images in 1979, changes
on Io’s surface have been monitored from both spacecraft and ground-based telescopes.
Here, we present the highest spatial resolution images of Io ever obtained from a ground-based telescope. These images, acquired by the SHARK-VIS instrument on the Large
Binocular Telescope, show evidence of a major resurfacing event on Io’s trailing hemisphere. When compared to the most recent spacecraft images, the SHARK-VIS images
show that a plume deposit from a powerful eruption at Pillan Patera has covered part
of the long-lived Pele plume deposit. Although this type of resurfacing event may be common on Io, few have been detected due to the rarity of spacecraft visits and the previously low spatial resolution available from Earth-based telescopes. The SHARK-VIS instrument ushers in a new era of high resolution imaging of Io’s surface using adaptive
optics at visible wavelengths.
Cancer cell metabolism: special Reference to Lactate Pathway
AADYARAJPANDEY1
Normal Cell Metabolism:
Cellular respiration describes the series of steps that cells use to break down sugar and other chemicals to get the energy we need to function.
Energy is stored in the bonds of glucose and when glucose is broken down, much of that energy is released.
Cells utilize energy in the form of ATP.
The first step of respiration is called glycolysis. In a series of steps, glycolysis breaks glucose into two smaller molecules - a chemical called pyruvate. A small amount of ATP is formed during this process.
Most healthy cells continue the breakdown in a second process, called the Krebs cycle. The Krebs cycle allows cells to “burn” the pyruvates made in glycolysis to get more ATP.
The last step in the breakdown of glucose is called oxidative phosphorylation (Ox-Phos).
It takes place in specialized cell structures called mitochondria. This process produces a large amount of ATP. Importantly, cells need oxygen to complete oxidative phosphorylation.
If a cell completes only glycolysis, only 2 molecules of ATP are made per glucose. However, if the cell completes the entire respiration process (glycolysis - Krebs cycle - oxidative phosphorylation), about 36 molecules of ATP are created, giving it much more energy to use.
IN CANCER CELL:
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
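The arithmetic behind "a lot more sugar molecules" can be made explicit. The short sketch below uses only the two yields quoted above (2 ATP per glucose from glycolysis alone, about 36 from full respiration); the ATP demand of 360 is an arbitrary illustrative number, not a measured value.

```python
import math

ATP_GLYCOLYSIS_ONLY = 2      # ATP per glucose when only glycolysis is completed
ATP_FULL_RESPIRATION = 36    # approximate ATP per glucose for full respiration


def glucose_needed(atp_required, atp_per_glucose):
    """Smallest whole number of glucose molecules that yields the required ATP."""
    return math.ceil(atp_required / atp_per_glucose)


# How many times more glucose a glycolysis-only cell needs for the same ATP.
fold_increase = ATP_FULL_RESPIRATION / ATP_GLYCOLYSIS_ONLY

atp_demand = 360  # arbitrary illustrative demand
healthy = glucose_needed(atp_demand, ATP_FULL_RESPIRATION)
cancer = glucose_needed(atp_demand, ATP_GLYCOLYSIS_ONLY)
print(f"Fold increase in glucose use: {fold_increase:.0f}x")
print(f"Glucose for {atp_demand} ATP: healthy cell {healthy}, glycolysis-only cell {cancer}")
```

With these figures a cell relying on glycolysis alone needs roughly 18 times as much glucose for the same ATP output.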
Introduction to the Warburg phenomenon:
Warburg effect: cancer cells are usually highly glycolytic ("glucose addiction") and take up more glucose from outside than normal cells do.
Otto Heinrich Warburg (8 October 1883 – 1 August 1970) was awarded the Nobel Prize in Physiology or Medicine in 1931 for his "discovery of the nature and mode of action of the respiratory enzyme".
The tendency of cancer cells under aerobic (well-oxygenated) conditions to metabolize glucose to lactate (aerobic glycolysis) is known as the Warburg effect. Warburg made the observation that tumor slices consume glucose and secrete lactate at a higher rate than normal tissues.
Cancer cell metabolism: special Reference to Lactate Pathway
Data peer review workshop
1. Show me the data!
Data peer review at Scientific Data
Varsha Khodiyar, Scientific Data
30.03.2017
2. Scientific Data, a Nature Research journal
• Data Descriptor: primary article type; sound science and facilitates data reuse
• Analysis: new analyses or meta-analyses of existing data
• Article: original reports on advances in data sharing & reuse
• Comment: announcements of broad interest; usually invited
www.nature.com/scientificdata
3. Under the hood of a Data Descriptor
• Context for data generation (background)
• How was data generated?
• How was data processed?
• Where is the data?
• Synthesis
• Analysis
• Conclusions
4. A key principle of publishing at Scientific Data
Wilkinson M.D., et al. The FAIR Guiding Principles for scientific data management and stewardship. Scientific Data 3; 160018 (2016) doi:10.1038/sdata.2016.18
Findable – (meta)data is uniquely and persistently identifiable.
Accessible – data is reachable and accessible by humans and machines, using standard formats and protocols.
Interoperable – (meta)data is machine readable and annotated with resolvable vocabularies and ontologies.
Reusable – (meta)data is sufficiently well-described to allow integration with compatible data.
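As a rough illustration of what the four principles ask of a dataset record, the sketch below checks a hypothetical metadata dictionary for one minimal signal per principle. The field names, the DOI pattern and the checks themselves are assumptions made for illustration; they are not Scientific Data's metadata schema or a formal FAIR assessment. The DOI is simply the one from the citation above, reused as a placeholder.

```python
import re

# Loose DOI shape check ("10.<registrant>/<suffix>"), optionally prefixed "doi:".
DOI_PATTERN = re.compile(r"^(?:doi:)?10\.\d{4,9}/\S+$", re.IGNORECASE)


def fair_gaps(record: dict) -> list[str]:
    """Return a list of FAIR principles the record visibly fails to address."""
    gaps = []
    if not DOI_PATTERN.match(record.get("identifier", "")):
        gaps.append("Findable: no DOI-style persistent identifier")
    if not record.get("access_url", "").startswith(("http://", "https://")):
        gaps.append("Accessible: no http(s) access URL")
    if not record.get("vocabulary_terms"):
        gaps.append("Interoperable: no vocabulary or ontology terms")
    if not record.get("description", "").strip():
        gaps.append("Reusable: no description")
    return gaps


# Hypothetical record; every value here is a placeholder.
record = {
    "identifier": "doi:10.1038/sdata.2016.18",
    "access_url": "https://doi.org/10.1038/sdata.2016.18",
    "vocabulary_terms": ["http://example.org/vocab/placeholder-term"],
    "description": "What was measured, how, and in which units.",
}

print(fair_gaps(record))  # []
print(len(fair_gaps({})))  # 4
```

A check like this only shows that the fields exist. Whether the description is good enough for reuse still needs a human reader.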
5. Data Descriptors have human and machine understandable components
Human readable representation of study, i.e. article (HTML & PDF)
6. Data Descriptors have human and machine understandable components
Machine accessible representation of study, i.e. metadata
7. What types of data can be published?
• Decades old dataset
• Standalone dataset
• Data that has been used in an analysis article
• Large consortium dataset
• Data from a single experiment
• Data associated with a high impact analysis article
Any data that the researcher finds valuable and that others might find useful too
8. When can a Data Descriptor be published?
• After data analysis has been published
• Before the analysis has been published
• Publication alongside analysis article
• Authors not intending to analyse data
Data Descriptors can be submitted and published at any point in the research workflow, i.e. whenever it makes most sense for your data
10. Researchers are sharing and reusing data
• Direct contact between researchers (on request) is the most common way of sharing data
• Repositories are the second most common method of sharing
Why might direct contact be the most preferred method?
Fig 2A & C; Kratz and Strasser, PLOS ONE (2015) doi: 10.1371/journal.pone.0117619
11. Researchers see peer review as a mark of data quality
• Respondents trust peer review above all else: 72% (n = 175) say peer review confers high or complete confidence in the data
Figure 6B; Kratz and Strasser, PLOS ONE (2015) doi: 10.1371/journal.pone.0117619
14. Selection of Editorial Board members
Experts in their discipline
AND
Demonstrable experience of data standards, data reuse or data analysis in their discipline
www.nature.com/sdata/about/editorial-board#eb
15. Data peer review
www.nature.com/sdata/policies/for-referees
Experimental Rigor and Technical Data Quality
• Were data produced in a sound manner?
• Technical quality of data – appropriate statistical analyses?
• Experimental rigor – appropriate depth, coverage?
Completeness of the Description
• Sufficient detail to allow others to reproduce these steps?
• Sufficient detail to allow others to reuse this data?
• Consistent with relevant minimum reporting standards?
Integrity of the Data Files and Repository Record
• Do data files appear complete and match manuscript descriptions?
• Are data archived to the most appropriate repository?
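Part of the file-integrity question can be pre-checked mechanically before a reviewer looks at the data. The sketch below compares the file names a manuscript says it provides with the files actually present in a local copy of the archive. The function name, the report keys and the example file names are hypothetical, and this is not a description of how Scientific Data's referees or editors carry out the check.

```python
from pathlib import Path
import tempfile


def compare_files(described: set[str], archive_dir: Path) -> dict[str, list[str]]:
    """Compare file names listed in a manuscript with files found on disk."""
    present = {
        p.relative_to(archive_dir).as_posix()
        for p in archive_dir.rglob("*")
        if p.is_file()
    }
    return {
        # Described in the manuscript but absent from the archive.
        "missing_from_archive": sorted(described - present),
        # Present in the archive but never mentioned in the manuscript.
        "not_described": sorted(present - described),
        # Described and present, but zero bytes long.
        "empty_files": sorted(
            name for name in present & described
            if (archive_dir / name).stat().st_size == 0
        ),
    }


# Demonstration on a throwaway directory with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "readings.csv").write_text("id,value\n1,2\n")
    (root / "notes.txt").write_text("")
    report = compare_files({"readings.csv", "protocol.pdf"}, root)

print(report["missing_from_archive"])  # ['protocol.pdf']
print(report["not_described"])         # ['notes.txt']
```

A clean report only shows that the names line up. Whether the contents match the manuscript's description remains a matter for the reviewer.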
16. Metadata curation and final data checking
We capture metadata about the dataset being described in each Data Descriptor.
During the metadata curation process
• Manuscript re-read
• Data archive checked
• Minor issues with the data and/or manuscript often identified
17. Why a Data Descriptor may be rejected
Reject without review
• Out of scope or no data present
Reject after review
• Serious flaws in the study design, e.g. lack of crucial controls
• Serious issues identified in the data files by the peer reviewers
After rejection
• Address concerns and resubmit to Scientific Data
• Resubmit to another data journal
• Withdraw data from Scientific Data integrated repositories
Data should be technically reliable and suitable for use by others
19. Create a data management plan
• Can avoid problems later
• Increasingly required by funders
• Critically evaluate existing practices – you may be setting standards for your field
• Some aspects of best practice may incur costs
• Find people and resources that can help you
Datasets, Code, Metadata, Research paper
Nature Genetics
20. Archive your data to the most appropriate repository
We currently list around 90 repositories, across biological, medical, physical and social sciences
www.nature.com/sdata/policies/repositories
Considerations:
1. Is there a discipline or data-specific repository for your data?
2. If no discipline or data-specific repository for your data exists, does your funder or institution mandate deposition to a particular repository?
21. Spot the mistakes
• Unhelpful document name
• Formatting used to convey information
• Special characters can cause text mining errors
• Meaningless column titles
• Undefined abbreviation
• No units are given
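Some of the mistakes listed here can be caught automatically before a table is shared. The sketch below scans CSV text for generic column titles, headers with no unit in brackets, and non-ASCII characters in cells. The heuristics, the function name, the `unitless_ok` parameter and the sample table are all assumptions made for illustration, and the input is assumed to have a header row.

```python
import csv
import io
import re

# Headers that carry no meaning: empty, "Column1", "col 2", "Unnamed: 0", "a", "x3".
GENERIC_HEADER = re.compile(r"^(|col(umn)?\s*\d*|unnamed.*|[a-z]|x\d+)$", re.IGNORECASE)
# A unit given in trailing brackets, e.g. "temperature (degC)" or "mass [kg]".
UNIT_SUFFIX = re.compile(r"[\(\[][^\)\]]+[\)\]]\s*$")


def check_table(text: str, unitless_ok=("sample_id",)) -> list[str]:
    """Return human-readable warnings for a CSV table given as text."""
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    problems = []
    for name in header:
        label = name.strip()
        if GENERIC_HEADER.match(label):
            problems.append(f"meaningless column title: {label!r}")
        elif label not in unitless_ok and not UNIT_SUFFIX.search(label):
            problems.append(f"no units given: {label!r}")
    # Line 1 is the header, so data rows start at line 2.
    for line_no, row in enumerate(body, start=2):
        for cell in row:
            if not cell.isascii():
                problems.append(f"special character on line {line_no}: {cell!r}")
    return problems


# Made-up sample with three deliberate mistakes.
sample = (
    "sample_id,Column1,weight,temperature (degC)\n"
    "S1,4.2,70,36.6\n"
    "S2,3.9,68,37.1 °C\n"
)
for problem in check_table(sample):
    print(problem)
```

Formatting used to convey information (colours, bold, merged cells) does not survive export to CSV, which is itself a reason to record that information in an explicit column instead.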
25. Increasing reproducibility
• Include any additional information needed to understand the data, methods, parameters, e.g. which instrument (make and model) was used to measure blood carbon dioxide levels?
• Include availability statements for any code that was used to view, parse or analyse the data, in support of the conclusions.
28. Data reuse by other researchers in the same field
“The Data Descriptor made it easier to use the data, for me it was critical that everything was there…all the technical details like voxel size.”
Professor Daniele Marinazzo
30. Data reuse by the non-research community
http://www.nytimes.com/interactive/2014/12/30/science/history-of-ebola-in-24-outbreaks.html
31. Data peer review at Scientific Data
Data Archive
• Checked multiple times
• Scientific reasoning underlying data reviewed by active researchers
• Technical validity reviewed by discipline experts
Data Citations
• Citation accuracy confirmed by specialist editor
• Citation format checked by editorial team
• Data linkage tested by production team
Data Peer Review
• Does not have to be onerous
• Can save overall reviewing time
• Results in data that is reusable and useful!