Genome sharing projects across the world
Did you ever wonder what happened to the exponential increase in genome sequencing data? It is out there around the world and a lot of it is consented for research use. This means that if you just know where to find the data, you can potentially analyse gigabytes of data to power your research.
In this talk Fiona will present community genome initiatives, the genome sharing projects across the world, how you can benefit from this wealth of data in your work, and how you can boost your academic career by sharing and collaboration.
by Fiona Nielsen, Founder and CEO of DNAdigest and Repositive
With a background in software development, Fiona pursued her career in bioinformatics research at Radboud University Nijmegen. Now a scientist-turned-entrepreneur, Fiona founded DNAdigest and its social enterprise spin-out Repositive Ltd. Both the charity and the company focus on efficient and ethical sharing of genetics data for research to accelerate diagnostics and cures for genetic diseases.
Workshop finding and accessing data - fiona - lunteren april 18 2016
by Fiona Nielsen
Workshop presentation on finding and accessing human genomics data for research.
Including statistics of publicly available data sources and tips on how to save time in your workflow of data access.
Presented at BioSB2016, pre-conference PhD retreat for young researchers in bioinformatics and systems biology at Congrescentrum De Werelt in Lunteren. #BioSB2016 #BioSB16
Link to event:
http://www.youngcb.nl/events/biosb-phd-retreat-2016/
Read more about my work:
http://DNAdigest.org
http://repositive.io
https://uk.linkedin.com/in/fionanielsen
Workshop finding and accessing data - fiona nadia charlotte - cambridge apr...
by Fiona Nielsen
Workshop presentation on finding and accessing human genomics data for research.
Including statistics of publicly available data sources and tips on how to save time in your workflow of data access.
Organised in collaboration between DNAdigest and Open Data Cambridge.
Read more about our work:
http://DNAdigest.org
http://repositive.io
https://uk.linkedin.com/in/fionanielsen
http://www.data.cam.ac.uk
Why I left my job in genomics R&D - Lunteren - april 18 - 2016
by Fiona Nielsen
Career talk about how I moved from bioinformatics scientist to become an entrepreneur.
Presented at BioSB2016, pre-conference PhD retreat for young researchers in bioinformatics and systems biology at Congrescentrum De Werelt in Lunteren. #BioSB2016 #BioSB16
Link to event:
http://www.youngcb.nl/events/biosb-phd-retreat-2016/
Read more about my work:
http://DNAdigest.org
http://repositive.io
https://uk.linkedin.com/in/fionanielsen
Workshop - finding and accessing data - Cambridge August 22 2016
by Fiona Nielsen
Finding and accessing human genomic data for research
University of Cambridge, United Kingdom | Seminar Room G
Monday, 22 August 2016 from 10:00 to 12:00 (BST)
Charlotte, Nadia and Fiona presented an overview of data sources around the world where you can find genomics data for your research and gave examples of the data access application for dbGaP and EGA with specific details relevant for University of Cambridge researchers.
Jonathan Tedds Distinguished Lecture at DLab, UC Berkeley, 12 Sep 2013: "The ...
by Jonathan Tedds
http://dlab.berkeley.edu/event/open-research-challenge-peer-review-and-publication-research-data
A talk by Dr. Jonathan Tedds, Senior Research Fellow, D2K Data to Knowledge, Dept of Health Sciences, University of Leicester.
PI: #BRISSKit www.brisskit.le.ac.uk
PI: #PREPARDE www.le.ac.uk/projects/preparde
The Peer REview for Publication & Accreditation of Research data in the Earth sciences (PREPARDE) project seeks to capture the processes and procedures required to publish a scientific dataset, ranging from ingestion into a data repository, through to formal publication in a data journal. It will also address key issues arising in the data publication paradigm, namely, how does one peer-review a dataset, what criteria are needed for a repository to be considered objectively trustworthy, and how can datasets and journal publications be effectively cross-linked for the benefit of the wider research community.
I will discuss this and alternative approaches to research data management and publishing through examples in astronomy, biomedical and interdisciplinary research including the arts and humanities. Who can help in the long tail of research where established data centers, archives or adequate institutional support are lacking? How much can we transfer from the so-called “big data” sciences to other settings and where does the institution fit in with all this? What about software?
Publishing research data brings a wide and differing range of challenges for all involved, whatever the discipline. In PREPARDE we also considered the pre- and post-publication peer review paradigm, as implemented in the F1000 Research Publishing Model for the life sciences. Finally, in an era of truly international research, how might we coordinate the many institutional, regional, national and international initiatives – has the time come for an international Research Data Alliance?
Why study Data Sharing? (+ why share your data)
by Heather Piwowar
A presentation to the DBMI department at the University of Pittsburgh about data sharing and reuse: what this means, why it is important, some of what we’ve learned, and what we still don’t know.
Recommendations for infrastructure and incentives for open science, presented to the Research Data Alliance 6th Plenary. Presenter: William Gunn, Director of Scholarly Communications for Mendeley.
Increasing transparency in Medical Education through Open Data
by Rebecca Grant
Slides presented at the AMEE Virtual Conference 2021, introducing the MedEdPublish platform and data policies. Approaches to sharing sensitive human data, and particularly qualitative data, are discussed.
Scott Edmunds: Quantifying how FAIR is Hong Kong: The Hong Kong Shareability ...
by GigaScience, BGI Hong Kong
Scott Edmunds' talk at CODATA2019 on Quantifying how FAIR is Hong Kong: The Hong Kong Shareability of Hong Kong University Research Experiment. 19th September 2019 in Beijing
Scott Edmunds slides for class 8 from the HKU Data Curation (module MLIM7350 from the Faculty of Education) course covering science data, medical data and ethics, and the FAIR data principles.
Research in the time of Covid: Surveying impacts on Early Career Researchers
by Rebecca Grant
Based on a survey of over 4,500 researchers published in the white paper The State of Open Data 2020, this session will explore the impacts of the pandemic on early career researchers (ECRs), their research practice, and how they interact with open data. We will discuss the specific challenges reported by ECRs, as well as the gaps in training and support that they have identified that would encourage their sharing and reuse of research data.
Presentation at the E-ARMA conference 2021.
With the explosion of interest in both enhanced knowledge management and open science, the past few years have seen considerable discussion about making scientific data “FAIR”: findable, accessible, interoperable, and reusable. The problem is that most scientific datasets are not FAIR. When left to their own devices, scientists do an absolutely terrible job creating the metadata that describe the experimental datasets that make their way into online repositories. The lack of standardization makes it extremely difficult for other investigators to locate relevant datasets, to re-analyse them, and to integrate those datasets with other data. The Center for Expanded Data Annotation and Retrieval (CEDAR) has the goal of enhancing the authoring of experimental metadata to make online datasets more useful to the scientific community. The CEDAR Workbench for metadata management will be presented in this webinar. CEDAR illustrates the importance of semantic technology in driving open science. It also demonstrates a means for simplifying access to scientific datasets and enhancing the reuse of the data to drive new discoveries.
RARE and FAIR Science: Reproducibility and Research Objects
by Carole Goble
Keynote at JISC Digifest 2015 on Reproducibility and Research Objects in Scholarly Communication
Includes hidden slides
All material except maybe the IT Crowd screengrab reusable
Do Open data badges influence author behaviour? A case study at Springer Nature
by Rebecca Grant
Digital badges have previously been shown to incentivise journal authors to share their data openly. In this paper we introduce an Open data badging project at the Springer Nature journal BMC Microbiology. The development of the Open data badge is described, as well as the challenges of developing standard badging criteria and ensuring authors’ awareness of the badges. Next steps for the badging project are outlined, which are based on the experiences of the team assessing the badges, the number of badges awarded at the journal to date, and the results of an author survey.
Managing Ireland's Research Data - 3 Research Methods
by Rebecca Grant
Slides providing an overview of the research methods used in the author's thesis, "Managing Ireland's Research Data: Recognising Roles for Recordkeepers". The methods discussed are online surveys, comparative case studies, and autoethnography.
Licensed as CC-BY.
Research Data Management Services at UWA (November 2015)
by Katina Toufexis
Research Data Management Services at the University of Western Australia (November 2015).
Created by Katina Toufexis of the eResearch Support Unit (University Library).
CC-BY
Presentation about OHSL's new initiative, Mycroft Cognitive Assistant®, which is intended to streamline the operational aspects of research using IBM Watson cognitive computing capabilities.
Session for MACPA's Beach Retreat on July 6, 2012. The 5 C's (seas): Change, Complexity, Compliance, Convergence, and Competition. We covered the latest trends in social business and what they mean from a leadership and change management perspective.
Barometer of Entrepreneurial Education and Culture - 2014
by Mihaela Matei
The first edition of the Barometer of Entrepreneurial Culture and Education among Students, Romania 2014, indicates that Romanian students would, to a greater extent, prefer to be entrepreneurs rather than employees after completing their studies (57%), and that the majority would like to start their own business within the next 2 years (52%). This refers mainly to final-year students at faculties with an economics profile.
These responses underline the importance of encouraging entrepreneurship as a valuable and respected career choice. Those who consider the entrepreneurial path must therefore feel that they are not stigmatised after a failure.
In addition, young people who are considering entrepreneurship need continuous entrepreneurial education, delivered through dedicated courses as well as through exposure to success stories and to entrepreneurial practice, including in the university environment.
The need to redefine genomic data sharing - moving towards Open Science Oct ...
by Fiona Nielsen
This presentation was given at the symposium: Genomics for Health and Environment in Nijmegen on Oct 30, 2014
http://www.studiegids.science.ru.nl/2014/science/prospectus/biology_bachelor/course/34732/
The presentation introduces Open Science and Open Access Publishing and discusses these concepts in relation to (human) genomics.
The discussion includes a presentation of the concept behind http://repositive.io, the social enterprise software platform which was spun out of the DNAdigest research activities.
As a special addition for the students in the audience who are curious about their future scientific career, I included a couple of slides about my move from academic research to being a social entrepreneur.
DNAdigest works to promote and enable easier and more efficient sharing of genomics data for research. We educate and engage the community about the hurdles and dilemmas for data sharing as faced from the perspective of stakeholders in academia, industry and patient communities. As part of our work we are working with our community and supporters to prototype new mechanisms and concepts for data sharing and data access.
Please visit our website to learn more about our activities and events: http://DNAdigest.org
Follow us on twitter: @DNAdigest
Presentation by Faith Bannier (Intellectual Property Office) and Alex Shirvani (Dept for Business, Innovation & Skills), covering working as a government economist and applying to the Analytical Fast Stream as an economist
Whole grain crackers deserve to be elected as the healthy hero to snack time. Eat on their own or pair with dips, spreads or proteins like meat and cheese to make easy and delicious snacks. This candidate also happens to come with health benefits like fiber for digestive health and whole grains for weight management.
After several emblematic anniversaries celebrated at Ifsttar's various regional sites, the Institute is closing its "Décennies" ("Decades") cycle with two days of conferences and open days at its Île-de-France sites on 22 and 23 September.
The title "Aujourd'hui l'Ifsttar" ("Ifsttar Today") was chosen for this new event in order to firmly anchor the institute in the present and the future. With these two additional days, the themes led by the Île-de-France sites will be showcased, presenting what the institute contributes to sustainable mobility, its vision for the city of tomorrow, and the solutions it proposes for managing risks and preserving our road heritage.
SciDataCon - How to increase accessibility and reuse for clinical and persona...
by Fiona Nielsen
Presented by Fiona Nielsen in session 48, Sharing of sensitive data, on September 12, 2016 at #SciDataCon http://scidatacon.org
We have addressed the most pressing problem for public genomic data, that of data discoverability, by indexing worldwide resources for genomic research data on an online platform (repositive.io) providing a single point of entry to find and access available genomic research data.
http://www.scidatacon.org/2016/sessions/48/paper/26/
http://www.scidatacon.org/2016/sessions/48/
International data week - #RDAPlenary #IDW2016
Data dialogue - Human Genomic Data Discovery
by Fiona Nielsen
Presenting at The Data Dialogue. Time to Share: Navigating Boundaries & Benefits - Afternoon session: Sharing difficult data.
July 28 - 2016 @ University of Cambridge
http://www.ses.ac.uk/event/data-dialogue-time-share-navigating-boundaries-benefits/
In this talk I present an overview of human genomic data sources around the world, their funding, access policies and the type of data they contain. I also discuss why data sharing is hard, including issues of data privacy and a research culture that does not incentivise sharing of data and results.
Presented by Fiona Nielsen, founder and CEO of Repositive
http://repositive.io
Data sharing promotes many goals of the NIH research endeavor. It is particularly important for unique data that cannot be readily replicated. Data sharing allows scientists to expedite the translation of research results into knowledge, products, and procedures to improve human health. Do you know what a data sharing plan should include? Are you aware of common practices and standards for data sharing? Do you know what services are available to help share your data responsibly? This workshop will begin to address these questions. Q&A will follow the presentation. Anyone interested in or planning to apply for NIH funding should attend. Note: The NIH data-sharing policy applies to applicants seeking $500,000 or more in direct costs in any year of the proposed research.
Being FAIR: FAIR data and model management SSBSS 2017 Summer School
by Carole Goble
Lecture 1:
Being FAIR: FAIR data and model management
In recent years we have seen a change in expectations for the management of all the outcomes of research – that is the “assets” of data, models, codes, SOPs, workflows. The “FAIR” (Findable, Accessible, Interoperable, Reusable) Guiding Principles for scientific data management and stewardship [1] have proved to be an effective rallying-cry. Funding agencies expect data (and increasingly software) management retention and access plans. Journals are raising their expectations of the availability of data and codes for pre- and post- publication. The multi-component, multi-disciplinary nature of Systems and Synthetic Biology demands the interlinking and exchange of assets and the systematic recording of metadata for their interpretation.
Our FAIRDOM project (http://www.fair-dom.org) supports Systems Biology research projects with their research data, methods and model management, with an emphasis on standards smuggled in by stealth and sensitivity to asset sharing and credit anxiety. The FAIRDOM Platform has been installed by over 30 labs or projects. Our public, centrally hosted Asset Commons, the FAIRDOMHub.org, supports the outcomes of 50+ projects.
Now established as a grassroots association, FAIRDOM has over 8 years of experience of practical asset sharing and data infrastructure at the researcher coal-face ranging across European programmes (SysMO and ERASysAPP ERANets), national initiatives (Germany's de.NBI and Systems Medicine of the Liver; Norway's Digital Life) and European Research Infrastructures (ISBE) as well as in PI's labs and Centres such as the SynBioChem Centre at Manchester.
In this talk I will explore how FAIRDOM has been designed to support Systems Biology projects and show examples of its configuration and use. I will also explore the technical and social challenges we face.
I will also refer to European efforts to support public archives for the life sciences. ELIXIR (http://www.elixir-europe.org/) is the European Research Infrastructure of 21 national nodes and a hub, funded by national agreements to coordinate and sustain key data repositories and archives for the Life Science community, improve access to them and related tools, support training and create a platform for dataset interoperability. As the Head of the ELIXIR-UK Node and co-lead of the ELIXIR Interoperability Platform I will show how this work relates to your projects.
[1] Wilkinson et al, The FAIR Guiding Principles for scientific data management and stewardship Scientific Data 3, doi:10.1038/sdata.2016.18
The slides that will accompany my live webcast for OpenCon 2014 attendees, all about open data in research. The benefits, the how to (both legally & technically), examples, pitfalls, and the future of open research data.
IRIDA: A Federated Bioinformatics Platform Enabling Richer Genomic Epidemiolo...
by William Hsiao
Introducing BCCDC and Public Health Microbiology (PHM)
Current State of PHM
Sequence Technology Advancement -> revolution of PHM
Genomic Epidemiology
Amount of Sequence Data Produced
Need to Process the data – Introduction to IRIDA
Need of Metadata and Ontology
Software to improve data sharing
How research microbiology and PHM can join efforts
"Open Science, Open Data" training for participants of Software Writing Skills for Your Research - Workshop for Proficient, Helmholtz Centre Potsdam - GFZ German Research Centre for Geosciences, Telegrafenberg, December 16, 2015
EICT Summer School August 2023 - Things I never knew I never knew - about bu...
by Fiona Nielsen
Expert workshop session delivered at IECT Summer School for Entrepreneurs on August 23, 2023, by Fiona Nielsen.
Fiona is a serial entrepreneur with lots of experience in hiring, leading and laying off people as part of her startup journey. In this presentation Fiona shares practical down to earth tips and examples on how to build a great team at your startup.
Topics include breakdowns of how to:
- Get great people on board
- Always improve your leadership
- Invest in good culture from the start
For example "1. Get great people on board"
Attract the right people to apply/express interest
Describe the role you are looking for and be specific about making the title reflect the job, e.g. “co-founder” or “marketing intern”
Always include the mission and vision of the company. Don’t fluff it.
Consider why anyone would work for you - beyond being paid a salary.
Great candidates have a choice of where to work, they will choose a place where they find meaning, feel motivated and challenged, and feel welcome.
Investing in innovation for genomic medicine - sept 5 2017
by Fiona Nielsen
Keynote at #TechBBQ 2017 by Fiona Nielsen
The journey of Repositive and how groundbreaking innovation is found in the crossover between traditional business and investment verticals.
Investing in innovation for genomic medicine - the journey of Repositive
by Fiona Nielsen
Presented for UK Pharmacogenetics and Stratified Medicine Network (UK PGx Network) - Entrepreneurship, Disruptive Innovation and Personalised Medicine University of Liverpool London Campus, Finsbury Square, Wednesday 7th June 2017
From bioinformatics scientist to entrepreneur - Women in Omics - ICG11 - 2016
by Fiona Nielsen
Presented by Fiona Nielsen at the International Conference of Genomics at China National Genebank, Shenzhen http://www.icg-11.org
I present the "WHY" of what I am doing, and how I got here. A personal story of frustration, science and family.
Session chaired by Laurie Goodman, Gigascience
ICG-11 - genomic data projects around the world - nov 5 2016
by Fiona Nielsen
How to find data for your research
Presented by Fiona Nielsen at the International Conference of Genomics 2016 www.icg-11.org in the session Data Sharing and Analysis chaired by Laurie Goodman, editor-in-chief, GigaScience
Genome sharing projects around the world - Open Access is not enough
by Fiona Nielsen
Presented by Fiona Nielsen at the 2016 conference on Electronic Publishing #Elpub2016 in Goettingen, Germany, June 8th 2016
Take home message 1: Open Access does not equal discoverability
Take home message 2: Lots of genomic research data is not found and reused because it is not discoverable
Take home message 3: Repositive is a portal for searching for genomics data
Read more:
- Elpub conference http://meetings.copernicus.org/elpub2016/programme.html
- Repositive http://repositive.io
From Bioinformatics Scientist to Entrepreneur
by Fiona Nielsen
A 15min presentation at the University of Southern Denmark (SDU) Alumni career event on May 28th, 2016.
Thanks to Jørgen Bang Nielsen and IMADA for organising.
Read more:
- SDU http://www.sdu.dk
- Institute for mathematics and computer science (IMADA): http://imada.sdu.dk
- DNAdigest http://DNAdigest.org
- Repositive http://repositive.io
Pistoia Alliance European Conference, Kings College London, April 19, 2016
Panel introduction to Big (Biomedical) Data and the challenges facing research in biomedical R&D with examples from genomics data around the world. #Pistoia2016
Event link:
https://www.eventbrite.co.uk/e/pistoia-alliance-european-conference-2016-tickets-19618953819
Read more about me and my work at:
http://dnadigest.org
http://repositive.io
https://uk.linkedin.com/in/fionanielsen
Overcoming barriers for genomic data sharing yaac presentation may 23 2015
by Fiona Nielsen
Overcoming barriers for genomic data sharing - presented at Young Alliance Against Cancer conference on May 23rd 2015 in Copenhagen. http://young-alliance.org
Repositive is a mission-driven company aiming to facilitate data sharing for genomics research via the online platform http://repositive.io
Repositive was spun out of the charity DNAdigest.
Read more: http://dnadigest.org/repositive-raises-300k-for-genomics-platform/
Find us on Twitter @repositiveio and @DNAdigest
DNAdigest Eagle Genomics Symposium March 27, 2014 Fiona Nielsen
At the Eagle Genomics Symposium, Fiona Nielsen, CEO and founder of DNAdigest, presented a discussion of the trade-off between privacy and open access and of how hard-to-access data is hindering progress in genetics research.
Read more at http://DNAdigest.org and have a look at our campaign in support of collaborative open science for human genetics: http://tiny.cc/funddna <-- this is where you can get one of those cool #OpenScience T-shirts ;)
Genome sharing projects around the world nijmegen oct 29 - 2015
1. Genome sharing projects
around the world
– and how you find data for
your research
Fiona Nielsen, October 2015
Find me on twitter: @glyn_dk
2. • In case my talk will be boring…
First the take home messages…
3. Do not forget:
By 2025 genome research will produce as much data
as Twitter /YouTube.
You do not have
enough statistical
power to interpret
your data
But
You can
improve your
study design
And
You can access
more data from
public genome
data repositories
5. Data output is going up
[Chart: human genomes sequenced per year, 2006 to 2015]
400K genomes sequenced: estimated number of human genomes sequenced in 2015
The output of human genome sequencing data is growing at exponential rates
6. Population scale genome sequencing projects
Population scale genome
sequencing projects have
been launched all over
the world
Soon every research lab
and every genetic clinic
will have a DNA
sequencer
7. How much data do you need to publish a paper?
2001: 1 human genome
2012: 1000 Genomes (1092 genomes, since increased to ~2500)
2015:
UK10K, Icelandic population (2,636 + 100k imputed),
The Cancer Genome Atlas ~11,000 genomes
ExAC consortium 65,000 exomes
?
8. Statistically speaking, you still need tens of thousands of samples for validation
The more severe the phenotype and the more complete penetrance, the
easier it will be for you to find your variant, but
“As the genetic complexity of the disease increases (for example,
reduced penetrance and increased locus heterogeneity), issues of
statistical power quickly become paramount.”
http://www.nature.com/nrg/journal/v15/n5/full/nrg3706.html
But I am just looking at this one disease…
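As a rough illustration of why sample sizes climb into the tens of thousands, here is a minimal sketch of a textbook sample-size approximation for comparing a variant's carrier frequency between cases and controls. The function name, the carrier frequencies and the 80% power target are assumptions made for illustration only; the 5e-8 threshold is the conventional genome-wide significance level, not a figure from the talk. A statistician would refine this for a real study design.

```python
from math import ceil
from statistics import NormalDist


def samples_per_group(p_cases, p_controls, alpha=5e-8, power=0.8):
    """Approximate number of cases (and equally many controls) needed for a
    two-sided two-proportion z-test to detect p_cases vs. p_controls."""
    z = NormalDist()
    z_alpha = z.inv_cdf(1 - alpha / 2)  # quantile for the significance threshold
    z_power = z.inv_cdf(power)          # quantile for the desired power
    variance = p_cases * (1 - p_cases) + p_controls * (1 - p_controls)
    effect = p_cases - p_controls
    return ceil((z_alpha + z_power) ** 2 * variance / effect ** 2)


# Hypothetical carrier frequencies: 1.5% in cases vs. 1.0% in controls.
n_rare = samples_per_group(0.015, 0.010)
# A larger (hypothetical) difference needs far fewer samples.
n_strong = samples_per_group(0.050, 0.010)
print(n_rare, n_strong)
```

With these assumed frequencies the first call lands in the tens of thousands per group, while the stronger effect needs far fewer, which matches the point that severe, highly penetrant phenotypes are easier to study.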
9. What can I do?
PRO TIP: involve a statistician early on in your study design!
10. How can I determine significance?
“One potentially powerful approach is to assess conservation across and within
multiple species as whole-genome sequence data become more abundant.”
Look at extreme phenotypes “Sampling cases or controls from the extremes of an
appropriate quantitative distribution can often increase power”
Look at non-SNP variants, they are more likely to have functional effects
- “how to account for the technical features of sequencing, such as incomplete
sequencing and biased coverage over the genome?”
11. Think of how you can provide evidence that your result is not just a local
technical variation or sampling bias
e.g. data from same cell type, same seq technology, same alignment…
How to account for bias?
PRO TIP: include more reference data in your analysis
12. • Know what data is available in your lab,
your dept, your org
• Survey from Qiagen showed that one of
the main reasons researchers collaborate
is to get access to data!
How can I access more data for my research?
13. How can I find collaborators?
PRO TIP: Search for collaborators who have the data you need
PRO TIP: Tell your colleagues and peers what type of data you
have in your lab
14. Where can I access data?
public repositories
• some you apply for access,
especially if data contains
clinical info or whole genome
PID
• some are open access: GEO,
SRA, PGP, OpenSNP, GigaDB, …
• some are consented for
general research use, some
have specific consent
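For the open access repositories hosted at NCBI (GEO and SRA), searches can be scripted through the NCBI E-utilities `esearch` endpoint. The sketch below only builds the query URLs and sends no request. The helper name and the search terms are hypothetical, and the other repositories on this slide have their own interfaces that are not covered here.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (covers SRA via db="sra", GEO DataSets via db="gds").
EUTILS_ESEARCH = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_esearch_url(database, term, max_results=20):
    """Build an esearch URL; hypothetical helper, it does not contact NCBI."""
    params = {
        "db": database,          # which NCBI database to search
        "term": term,            # free-text / fielded query
        "retmode": "json",       # ask for a JSON response
        "retmax": max_results,   # cap the number of returned record IDs
    }
    return EUTILS_ESEARCH + "?" + urlencode(params)


# Illustrative queries only; adapt the terms to your own research question.
sra_url = build_esearch_url("sra", "Homo sapiens[Organism] AND exome")
geo_url = build_esearch_url("gds", "breast cancer AND Homo sapiens[Organism]")
print(sra_url)
print(geo_url)
```

Fetching one of these URLs (for example with `urllib.request.urlopen`) returns a list of matching record IDs. Please check NCBI's usage guidelines before running queries in bulk.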
16. And it takes time
Bottlenecks:
• Finding relevant and usable
data
• Getting authorisation to
access data
• Formatting data
• Storing and moving data
We studied the problem by
qualitative interviews followed
by a survey of researchers in
human genetics
17. And it takes time
T. A. van Schaik et al
The need to redefine genomic
data sharing: a focus on data
accessibility, Applied &
Translational Genomics, 2014
10.1016/j.atg.2014.09.013
Researchers spend months to
find and access genomic data,
and often choose to not access
data at all
19. Barriers to access
[Flowchart: dbGaP Application Process]
Decision points (Yes/No): NIH / eRA Commons login; Organisation registered with eRA; Organisation has DUNS number
Steps: Write research proposal; Submit proposal; Access granted; Find/Download/Decrypt data; Science…
Delays shown on the chart: + 2-3 days; + 1-2 weeks; + 1 week; + days to weeks; variable, from weeks to months; + 1-2 days
20. Why the barrier?
• Benefits: strict governance, review of consent, applicant signs for full
responsibility for governance
• Disadvantages: No control of data once access is given, high barrier for
access – too high?
21. • Start planning your data needs early in your project
• When you find the data you need, start application
• Use Open Access data
How can I save time?
PRO Tip: If you use human genomic data, apply for the GRU
datasets in dbGaP, one application – access to all the GRU
datasets
22. • Some data is Open Access, which requires specific consent
• OpenSNP.org (Bastian)
• Personal Genomes Projects
• Individuals who put their genomes online, e.g. Manuel Corpas
and his family “the Corpasome”
• http://manuelcorpas.com/about/
Not all data is restricted
24. Personal Genome Project
Columns: PGP Harvard | PGP Canada | PGP UK | Genom Austria
Host institution: Harvard Medical School, Boston | SickKids, Toronto | University College London | CeMM Research Center for Molecular Medicine
Principal Investigator: George Church | Steven Scherer | Stephan Beck | Christoph Bock & Giulio Superti-Furga
Launch year: 2005 | 2012 | 2013 | 2014
Geographic scope: USA, mainly Boston | Canada | United Kingdom | Mainly Austria
Enrollment eligibility (all projects): At least 18 years old, able to make an informed decision, perfect score in the PGP enrollment exam, certain vulnerable groups excluded
Data generated: Whole genome sequencing, upload of additional data possible | Mainly whole genome sequencing | Whole genome sequencing, DNA methylome sequencing, RNA transcriptome sequencing | Mainly whole genome sequencing
Number of genomes: 100s | 10s | 10s | 10s
Data access: PGP Harvard: http://personalgenomes.org/harvard/data | Genom Austria: http://genomaustria.at/unser-genom/#genome-der-pionierinnen
Project funding: Discretional funds and corporate sponsoring | Institutional startup funds | Discretional funds and corporate sponsoring | Institutional startup funds
Areas of emphasis (three entries on the slide): Integration with phenotypic data, collaboration with other personal omics initiatives | Genome donations, synergy with massive-scale clinical genome sequencing projects | Genomes and society, genetic literacy, school projects, education
Website: http://personalgenomes.org/harvard/ | http://personalgenomes.org/canada/ | http://personalgenomes.org/uk/ | http://genomaustria.at/
25. Summary of data access barriers
Data is uploaded
to repository
Data is discovered
by potential user
Data is accessed
by potential user
26. Where is the data?
[Chart, 2006 to 2015: 400K genomes sequenced versus ≈ 5K genomes available]
Only a fraction of the data is findable or available through public repositories
27. • “even when researchers are authorised to share data they
report reluctance to do so because of the amount of effort
required” http://www.sciencedirect.com/science/article/pii/S2212066114000386
• “Clinical geneticists cited a lack of time because their main priority is
diagnosing patients. Industrial researchers cited a lack of time because of
the pressure to meet the deadlines in their job. Researchers in academia
cited both a concern about the potential loss of future publications once
unpublished data is shared, and the lack of time and incentive to share
data as this does not contribute to their publication record. Researchers
from all categories felt that they lacked sufficient resources to make their
data available.”
The barrier of making data available
But I do not want to share my data
28. • If you expect data to be available to you
– you have to make your data available too!
• Encourage collaborations: power by numbers
1. Get credit – publish and make your data available
2. Give credit – cite data sources
3. Understand consent – for all uses of clinical data
Best practices
29. • Use all available tools to make your life easier:
• Data publications: visibility and citations for your data, e.g. GigaScience
• Figshare, Zenodo, Dryad for sharing open access data
• PhenomeCentral, Matchmaker exchange for rare disease research
• Repositive for finding data across repositories and making your own data discoverable
Best practices: use the tools
31. “Weakness: Involvement of non-
academic beneficiaries is limited”
“Weakness: highly focused on academic activities, and
lacks an advanced communication strategy”
“Weakness: limited exposure to
non-academic partners & infrastructures”
Excellence
Impact
Implementation
“data accessibility is unclear!”
“data storage & access not considered”
32. “Strengths: extensive dissemination of data to the
scientific community (open access, databases)”
“outreach activities to a broad audience”
“research software is freely available”
Impact:
33.
34. Make the (research) world a better place by sharing in return
Best practices
35. • Digital consent: towards automatic processing of applications
• Dynamic consent and power to the patient, e.g.
PatientsKnowBest
• Privacy-preserving access to datasets: preserving control and
governance with data custodian, lower barrier for access
What the future holds
36. In the meantime: It is a jungle out there!
What if finding data was as easy as finding a book on
Amazon or booking a hotel on Expedia?
38. Repositive is a web platform
Discover new data sources
We are indexing all the public sources of
data, so users have an easy portal for
searching through data descriptions.
EASY
SEARCH
39. Repositive is a web platform
Make your data visible
As a two-sided marketplace, the users
can also make their own data findable.
SHARE
KNOWLEDGE
40. Active Repositive users increase benefits
Build a data community
BUILD
TRUST
Users can interact to find relevant
collaborators for their research either to
analyse their data or to combine data
sources.
41. Active Repositive users increase benefits
Find data collaborators
SAVE TIME
Feedback from other users through ratings
and comments helps users evaluate data
quality
42. Benefit for both sides
Data consumers Data producers
Find relevant data faster
Feedback from other users
through ratings and comments to
evaluate data quality
Find collaborators with data
Make your data visible
Build credibility as a trusted
provider of quality data
Find collaborators to analyse
your data
It has been shown that the combination of summary single-variant statistics from multiple data sets, rather than the joint analysis of a combined data set, does not result in an appreciable loss of information, and that taking into account heterogeneity in effect size across studies can improve statistical power
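To make the idea of combining summary statistics concrete, here is a minimal sketch of inverse-variance fixed-effect meta-analysis, a standard way to pool per-study effect estimates without access to individual-level data. It is a stand-in chosen for illustration and not necessarily the method the quoted review has in mind. The function name and the example numbers are hypothetical.

```python
from math import sqrt


def fixed_effect_meta(betas, std_errors):
    """Pool per-study effect sizes using inverse-variance weights.

    Returns the pooled effect, its standard error, and Cochran's Q,
    a simple measure of heterogeneity across studies.
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    total = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / total
    pooled_se = sqrt(1.0 / total)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return pooled, pooled_se, q


# Hypothetical summary statistics for one variant from two studies.
pooled, pooled_se, q = fixed_effect_meta([0.10, 0.20], [0.10, 0.10])
print(pooled, pooled_se, q)
```

The pooled standard error is smaller than either study's own, which is the gain in power from combining data sets. A large Q relative to the number of studies would suggest heterogeneity that a random-effects model could account for.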
“Although they are harder to call and annotate, insertion or deletions, multinucleotide variants and structural variants (including copy-number variants, translocations and inversions) constitute a smaller set of variation (in terms of the number of discrete events an individual is expected to carry) relative to all SNVs and are more likely to have functional effects.”
Because interpretation requires LOTS of data
And although data exists around the world, it is siloed, and even if available, it is not accessible
This is Jenn, a genetic researcher (our target customer), seeking to interpret data from genetic diseases and cancer
She needs data from other patients to compare and interpret Mabel's DNA
She also has data available in her own lab, but she cannot share it because of concerns about secure access to sensitive data and data governance, e.g. vetting of users
Public repositories: default is apply for access -> full access
Benefits: strict governance, review of consent, applicant signs for full responsibility for governance
Disadvantages: No control of data once access is given, high barrier for access – too high? (researchers giving up, even patients can’t get access to their own data)
Cost of data is going down
Data production is going up
Growing problem
Market opportunity for solutions!
ODP trained, EURO-BASIN manager, – a boring title, for a diverse job, in an exciting research domain.
DIP into EACH step of the research cycle, from proposal formulation to providing the best return-on-investment to the funders.
So I'd like to share with you some experiences from the last few years of OS advocacy in the Marine Science Community
Excellence at your Research Subject is … excellent, but is it ENOUGH?
To be successful, a candidate will be judged on being complete.
MESSAGE: FOCUSING only on IF (impact factor) could expose you to risk
So, if the IMPACT FACTOR is no good, how will it evolve in future?
Here is an example from the UK, on how Research Institutes are evaluated …
The key message here is that funders will place even more emphasis on “Societal Impact” in future; more pertinent for you right now is that it is already affecting your chances for Post-Doc funding.
For Jenn, the inaccessibility of data means it takes her up to 6 months to find and up to 6 months to access the data she needs for analysis.
But for clinical cases like Mabel, she only has days to finish her analysis!
THIS IS RIDICULOUS BECAUSE:
Today one can:
- Find any hotel on Trivago
- Find any book on Amazon, with feedback from other users
- But researchers have nowhere to find and access (human) genomics data!
The Repositive platform and technology will remove barriers to data sharing and will incentivise users to explore, contribute and collaborate in alignment with best practices
When Jenn needs data for a specific disease, she searches Repositive to find the data directly, understands the value of the data based on feedback from the Repositive user community, and accesses the data securely, because she knows that
…Repositive is the trusted broker for secure and efficient data exchange …
No more hassles of finding data or hassles of exchanging data with collaborators, [show dbGaP screenshots with a cross over it]
We are changing the landscape of genomics research through the Repositive platform which we have just launched in private beta.
Providing the search facility for rapid data discovery of existing data sources
and making your own data visible to the community
We are indexing all the public sources of data, so users have one easy portal for searching through data descriptions
The platform UI is described by our users as “slick” “easy” and “refreshing” compared to other bioinformatics tools
Providing the community for peer feedback to help you determine what data is relevant
Providing the technology to get data insights
for secure and efficient data access, e.g. privacy-preserving technology,
to remove the barriers for making data available and accessible
Our mission is to speed up research and diagnostics for genetic diseases by enabling efficient and ethical access to genomic research data