This document discusses a new open-access journal called GigaScience that publishes large biological datasets. It aims to improve data sharing by assigning digital object identifiers (DOIs) to published datasets to make them easily citable and trackable. The journal faces challenges regarding reproducibility, usability, and adherence to standards. It works to address these by providing tools for data access, encouraging standards compliance, and integrating datasets into its expanding repository.
Alexandra Basford, InCoB 2011: A Journal’s Perspective on Data Standards and Biocuration
1. A Journal’s Perspective on Data
Standards and Biocuration
Alexandra Basford, PhD
www.gigasciencejournal.com
2. Overview
Introduction
The Curation Challenges of a Journal/Database
- Reproducibility/Reuse
- Utility/Usability
- Standards/Searchability/Sharing
Data Publishing
- Our DOI Adventures
3. Overview
Introduction
- How do we deal with “big data”?
The Curation Challenges of a Journal/Database
- Reproducibility/Reuse
- Utility/Usability
- Standards/Searchability/Sharing
Data Publishing
- Our DOI Adventures
7. GigaScience is a new open-access, open-data journal for the publication of all types of biological studies that use or create large-scale data sets
The scope spans the biomedical and life sciences, including:
- “Omics”
- Imaging
- Neuroscience
- Ecology
- Medicine
- Systems biology
… “big and sharable”
Published by
in partnership with
8. Editorial Board – International
Stephan Beck, UK Stephen O'Brien, USA
Alvis Brazma, UK Hanchuan Peng, USA
Ann-Shyn Chiang, Taiwan Russell Poldrack, USA
Richard Durbin, UK Ming Qi, China/USA
Paul Flicek, UK Susanna-Assunta Sansone, UK
Robert Hanner, Canada Michael Schatz, USA
Yoshihide Hayashizaki, Japan David Schwartz, USA
Henning Hermjakob, UK Fritz Sommer, USA
Wolfgang Huber, Germany Lincoln Stein, Canada
Gary King, USA Sumio Sugano, Japan
Tin-Lap Lee, Hong Kong Thomas Wachtler, Germany
Donald Moerman, Canada Jun Wang, China
Karen Nelson, USA Alistair Young, New Zealand
Francis Ouellette, Canada Zang Yufeng, China
Lennart Hammarström, Sweden Marie Zins, France
Paul Horton, Japan
9. Editorial Board – Multidisciplinary
Stephan Beck, Epigenomics Stephen O'Brien, Genomics
Alvis Brazma, Transcriptomics Hanchuan Peng, Imaging/Neuro
Ann-Shyn Chiang, Neuroscience Russell Poldrack, Neuroscience
Richard Durbin, Genetics/Genomics Ming Qi, Genetics
Paul Flicek, Genomics Susanna-Assunta Sansone, Standards
Robert Hanner, DNA Barcoding/Ecology Michael Schatz, Cloud Computing
Yoshihide Hayashizaki, Genomics David Schwartz, Optical Mapping
Henning Hermjakob, Proteomics Fritz Sommer, Neuroscience
Wolfgang Huber, Functional Genomics Lincoln Stein, Cloud Computing
Gary King, Medicine Sumio Sugano, Genomics
Tin-Lap Lee, Genomics Thomas Wachtler, Neuroscience
Donald Moerman, Functional Genomics Jun Wang, Genomics
Karen Nelson, Metagenomics Alistair Young, Medical Imaging
Francis Ouellette, Genomics Zang Yufeng, Neuroscience
Lennart Hammarström, Immuno/Genetics Marie Zins, Medicine
Paul Horton, Genetics/Tools
14. An Unusual Format
• GigaScience combines standard manuscript publication with an ever-expanding database
• Evolving data repository
– Integrating tools for public access, viewing, and analysis of
the stored data
– Improvements driven by community input
• All datasets are assigned data digital object
identifiers (DOIs) to make them easy to access, track,
and cite
15. Data Sharing Hurdles
• Technical
– volumes too large
– too heterogeneous
– no home for many data types
• Economic
– too expensive
– no long-term funding
• Cultural
– inertia
– no incentives to share
– unaware of how?
– too time consuming
16. Changing Trends
Cultural shift towards data sharing.
Growing/widening user base.
The long tail of new “big-data” producers?
Curation, curation, curation
17. Use of Data = Importance (subjective?) + Usability (easier to assess)
18. Challenges for a Journal/Database
Reproducibility/Reuse
Utility/Usability
Standards/Searchability/Sharing
Data publishing/DOI
19. Why DOI®s?
• Guarantee of permanency
• Clear method for data tracking and data citation, allowing:
– Increased searchability (and hopefully use) of data
– Credit for data production, making it clear who produced the data and when
– Credit to original authors for their data’s use
– The ability to track and receive feedback on data usage
– A data citation metric potentially rivaling and complementary to the impact factor
– The potential to make the data available and receive credit for it earlier, with papers on the dataset published later
20. Largest Sequencing Capacity in the World
Sequencers
- 137 Illumina/HiSeq 2000
- 27 LifeTech/SOLiD 4
- 16 AB/3730xl + 110 MegaBACEs
- 2 Illumina iScan
Data Production
- 5.6 Tb / day
- > 1500X of human genome / day
Multiple Supercomputing Centers
- 157 TFlops
- 20 TB Memory
- 12.6 PB Storage
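The daily throughput figure on this slide can be sanity-checked with simple arithmetic. The sketch below is only illustrative and rests on two assumptions that the slide does not state: that "Tb" means terabases (10^12 bases), and that a haploid human genome is roughly 3.1 gigabases. The function name is made up for this example.

```python
# Rough check of the "> 1500X of human genome / day" figure.
# Assumptions (not from the slide): "Tb" = terabases (1e12 bases),
# and a haploid human genome is about 3.1e9 bases.

HUMAN_GENOME_BASES = 3.1e9  # assumed approximate genome size


def daily_genome_coverage(terabases_per_day: float,
                          genome_bases: float = HUMAN_GENOME_BASES) -> float:
    """Return how many times over the genome is sequenced per day."""
    return terabases_per_day * 1e12 / genome_bases


coverage = daily_genome_coverage(5.6)
print(f"{coverage:.0f}X of the human genome per day")
```

Under these assumptions the result lands above the 1500X quoted on the slide, so the two data-production figures are at least consistent with each other.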
23. Datasets
Vertebrates
Giant panda
Macaque
- Chinese rhesus
- Crab-eating
Naked mole rat
Penguin
- Emperor penguin
- Adelie penguin
Pigeon, domestic
Polar bear
Sheep
Tibetan antelope
Invertebrates
Ant
- Florida carpenter ant
- Jerdon’s jumping ant
- Leaf-cutter ant
Roundworm
Silkworm
Plants
Chinese cabbage
Cucumber
Foxtail millet
Pigeonpea
Potato
Sorghum
Human
Asian individual (YH)
- DNA Methylome
- Genome Assembly
- Transcriptome
Ancient DNA (coming soon)
- Saqqaq Eskimo
- Aboriginal Australian
Microbe
E. coli O104:H4 TY-2482
Cell Line
Chinese Hamster Ovary
25. Our First DOI®
To maximize its utility to the research community and aid those fighting the current
epidemic, genomic data is released here into the public domain under a CC0
license. Until the publication of research papers on the assembly and whole-
genome analysis of this isolate we would ask you to cite this dataset as:
Li, D; Xi, F; Zhao, M; Liang, Y; Chen, W; Cao, S; Xu, R; Wang, G; Wang, J; Zhang,
Z; Li, Y; Cui, Y; Chang, C; Cui, C; Luo, Y; Qin, J; Li, S; Li, J; Peng, Y; Pu, F; Sun,
Y; Chen, Y; Zong, Y; Ma, X; Yang, X; Cen, Z; Zhao, X; Chen, F; Yin, X; Song, Y;
Rohde, H; Li, Y; Wang, J; Wang, J and the Escherichia coli O104:H4 TY-2482
isolate genome sequencing consortium (2011)
Genomic data from Escherichia coli O104:H4 isolate TY-2482. BGI Shenzhen.
doi:10.5524/100001
http://dx.doi.org/10.5524/100001
To the extent possible under law, BGI Shenzhen has waived all copyright and related or neighboring
rights to Genomic Data from the 2011 E. coli outbreak. This work is published from: China.
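The citation format on this slide (authors, year, title, publisher, DOI) and the link between a DOI and its resolver URL can be sketched in a few lines. This is only an illustration: the `DatasetRecord` class and both function names are hypothetical, and the author list is shortened for brevity. Only the DOI, the title, the publisher, and the dx.doi.org resolver prefix come from the slide.

```python
from dataclasses import dataclass
from typing import List

DOI_RESOLVER = "http://dx.doi.org/"  # resolver prefix as printed on the slide


@dataclass
class DatasetRecord:
    """Hypothetical minimal metadata needed to cite a dataset."""
    authors: List[str]
    year: int
    title: str
    publisher: str
    doi: str  # bare DOI, e.g. "10.5524/100001"


def doi_to_url(doi: str) -> str:
    """Turn 'doi:10.5524/100001' or '10.5524/100001' into a resolver URL."""
    bare = doi.strip()
    if bare.lower().startswith("doi:"):
        bare = bare[4:]
    return DOI_RESOLVER + bare


def format_citation(rec: DatasetRecord) -> str:
    """Lay out a citation as: authors (year) title. publisher. doi:..."""
    authors = "; ".join(rec.authors)
    return f"{authors} ({rec.year}) {rec.title}. {rec.publisher}. doi:{rec.doi}"


record = DatasetRecord(
    authors=["Li, D", "Xi, F", "Zhao, M"],  # shortened; the slide lists the full consortium
    year=2011,
    title="Genomic data from Escherichia coli O104:H4 isolate TY-2482",
    publisher="BGI Shenzhen",
    doi="10.5524/100001",
)

print(format_citation(record))
print(doi_to_url("doi:10.5524/100001"))
```

Keeping the bare DOI as the stored value and deriving the URL from it means the citation stays valid even if the resolver address changes.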
34. • Data also submitted to NCBI (including SV data
to dbVar)
• Submission to public databases complemented
by its citable form in GigaDB:
- Assemblies of three strains
- SNPs
- CNVs
- Raw data
- InDels
- SV
38. Progress!
July: We begin issuing data DOIs
August: Journals accept articles with data that have data DOIs
October: Data DOIs listed in journal articles
November: Data DOIs are properly cited in the reference section of journal articles
(It’s been a busy year.)
39. Challenges for a Journal/Database
Reproducibility/Reuse
Utility/Usability
Standards/Searchability/Sharing
Data publishing/DOI®
40. Challenges for a Journal/Database
Reproducibility/Reuse
Utility/Usability
Standards/Searchability/Sharing
✔ Data publishing/DOI®
41. Reproducibility/Reuse
• BGI Cloud Computing resources for
handling and analyzing large-scale data.
• Integrated tools to promote more
widespread access, viewing, and analysis
of data.
• Encourage and aid use of workflow
systems for methods (e.g. submission of
Galaxy XML files).
42. Utility/Usability = ease of access
• Special series/hub for cloud-based tools
- Technical notes: test tools in the BGI-Cloud.
- Tools + test data (BGI or user) in one place.
- Aids reproducibility.
- Aids reviewers (free)
- Aids authors: visibility (PubMed, etc.),
hosting (included/free offers)
–contact us: editorial@gigasciencejournal.com
44. Standards/Searchability/Sharing
• ISA-Tab compatibility to aid and promote
best practice in metadata reporting.
• All supporting data must be publicly
available.
• Ask for MIBBI compliance and use of
reporting checklists.
• Part of the BioSharing network and the
International Neuroinformatics
Coordinating Facility.
45. Big Data
•Initiated 505 plant and animal genome
projects
•Completed fine or draft genome maps for
over 100 species
•Finished the sequencing of about 200
species
ldl.genomics.cn
46. Editor-in-Chief: Laurie Goodman, PhD
Editor: Scott Edmunds, PhD
Assistant Editor: Alexandra Basford, PhD
Contact: editorial@gigasciencejournal.com
Follow GigaScience on Twitter @GigaScience
www.gigasciencejournal.com
www.gigaDB.org
Editor's Notes
Integrated tools to promote more widespread access, viewing, and analysis of the stored data. BGI Cloud Computing resources for handling and analyzing large-scale data. All Data given a DOI to allow ease of finding and citing datasets, as well as for citation tracking.
Our facilities feature Sanger and next-generation sequencing technologies, providing the highest throughput sequencing capacity in the world. Powered by 137 Illumina HiSeq 2000 instruments and 27 Applied Biosystems SOLiD™ 4 Systems, we provide high-quality sequencing results with industry-leading turnaround time. As of December 2010, our sequencing capacity is 5 Tb raw data per day, supported by several supercomputing centers with a total peak performance up to 102 Tflops, 20 TB of memory, and 10 PB storage. We provide stable and efficient resources to store and analyze massive amounts of data generated by next-generation sequencing.
Raw data has been submitted to the SRA, the assembly to GenBank (no number), and SV data to dbVar (it’s the first plant data they’ve received). GigaDB complements the traditional public databases by holding all these “extra” data types in one place, in citable form.
Have all of the metadata fields, working on integrating the tools.