This presentation was provided by Neil Thakur of the NIH during the NISO virtual conference, The Preprint: Integrating the Form into the Scholarly Ecosystem, held on February 14, 2018.
This presentation was provided by John Inglis of Cold Spring Harbor Laboratory during the NISO virtual conference, The Preprint: Integrating the Form into the Scholarly Ecosystem, held on February 14, 2018.
UVa Library Scientific Data Consulting Group (SciDaC): New Partnerships and... - Andrew Sallans
The UVA Library Scientific Data Consulting Group (SciDaC) provides new partnerships and services to support scientific data management in research. SciDaC was formed in 2010 to focus on data consulting after restructuring from the Research Computing Lab. SciDaC conducts data interviews and assessments, assists with NSF Data Management Plan requirements, and works to integrate research data into the institutional repository. Future work includes expanding disciplinary support, integrating into the research proposal process, and advising on data policy.
BioMed Central is a large open access publisher that is committed to open data initiatives. They have implemented several solutions to promote open data practices, including data journals, an open data award, and enabling data citation. They also work to integrate data hosting and deposition, address data licensing issues, and provide guidance on best practices. Future goals include adding more value to text and data mining applications and building business models around open data.
Strand SmartLab - Enabling Precision Medicine at Community Hospitals - Harsha Rajasimha
Strand SmartLab is a complete, end-to-end solution that enables a community hospital to establish precision medicine testing services in-house, retaining revenues internally rather than losing them to external third-party laboratories. Genomics-driven precision medicine for cancer and other diseases requires highly skilled people, lab equipment, processes, regulatory experts, big-data software, databases and curation, medical geneticists to interpret results in clinical settings, and genetic counselors. Strand SmartLab brings all of these to an institution in a pre-packaged solution.
This document summarizes a presentation by Timothy Hoctor, VP of Professional Services at Elsevier, about Elsevier's strategic vision and professional services. The key points are:
1) Elsevier aims to increase R&D productivity by linking data across the development spectrum and increase return on information through enhanced search and visualization tools.
2) Elsevier's Professional Services team leverages Elsevier's capabilities to provide customized data management and analysis solutions.
3) Elsevier's strategic objective is to become a leading collaborator in R&D data management through services like data mapping, gap analysis, data governance, and integrated data management.
Improving Integrity, Transparency, and Reproducibility Through Connection of ... - Andrew Sallans
The Center for Open Science (COS) was founded as a non-profit technology start-up in 2013 with the goal of improving transparency and reproducibility by connecting the scholarly workflow. COS achieves this goal through the development of a free, open-source web application called the Open Science Framework (OSF), providing features like file sharing and citing, persistent URLs, provenance tracking, and automated versioning. Initial workflow API connections focused on storage services and included Figshare, GitHub, Amazon S3, Dropbox, and Dataverse. The team is now working to connect other parts of the workflow with services like DMPTool, Databib/re3data, and Databrary. This session will introduce the core architecture and the problems that it solves, and illustrate how connecting services can benefit everyone involved in supporting the research ecosystem. COS is funded through the generosity of grants from the Laura and John Arnold Foundation, the John Templeton Foundation, the Alfred P. Sloan Foundation, the Association of Research Libraries, and others.
Presented at CNI Fall 2014, Washington, DC.
The document discusses the central role of scholarly societies in developing preprint servers. It describes how the American Chemical Society (ACS) established ChemRxiv, a preprint server for chemistry. The ACS engaged widely with the chemistry community and other stakeholders. ChemRxiv offers a simple submission process and makes all preprints freely available. It provides version tracking, citations, and metrics to authors. The ACS aims for ChemRxiv to be sustainable and integrated into the scholarly communication cycle.
Introduction to research data management - Michael Day
Slides from a presentation given at the JIBS User Group / RLUK joint event "Demystifying research data: don't be scared, be prepared" held at the SOAS Brunei Gallery, London, 17 July 2012.
The document provides guidance on writing scientific research papers. It discusses the objectives of scientific research which include observing phenomena, developing hypotheses, testing hypotheses through experiments, and explaining results. It also outlines the typical structure of a research paper, including the introduction, literature review, methodology, results, discussion, and conclusion sections. Tips are provided for writing each section effectively, such as stating the research question or hypothesis in the introduction and interpreting findings in the discussion section.
PLOS Biology is launching a new section focused on meta-research to increase transparency in biosciences research. Meta-research examines issues related to research design, methods, reporting, evaluation and rewards. This will include exploring sources of bias, data sharing standards, and assessment metrics. Registered Reports will also be introduced, which accept studies for publication based on proposed methods rather than results, reducing bias against negative findings. However, most research data is lost within 10-15 years, highlighting the need for improved data sharing policies to maximize the value of research findings.
This slideshow was used in an Introduction to Research Data Management course for the Social Sciences Division, University of Oxford, on 2015-05-27. It provides an overview of some key issues, looking at both day-to-day data management, and longer term issues, including sharing, and curation.
Research Skills Session 10: Improve a Research Paper Quality - Nader Ale Ebrahim
In this workshop, Dr. Nader Ale Ebrahim introduces tools from his Research Tools Mind Map for improving the quality of a research paper. The Research Tools enable researchers to follow the correct path in research and ultimately produce high-quality research outputs with greater accuracy and efficiency. Beyond introducing the tools, he emphasizes ten techniques for improving research paper quality: collaborate with excellent researchers, choose a good research team, focus on quality instead of quantity, use recent and relevant references, avoid obvious errors, don’t forget a storytelling style, write clearly, concisely, and smartly, read your paper several times, target the top journals, and follow the patterns of well-written papers in your field.
Enhancing Our Capacity for Large Health Dataset Analysis - CTSI at UCSF
Overview of UCSF-CTSI Comparative Effectiveness Large Dataset Analysis Core, which offers resources for the analysis of large, public data sets on health and health care.
This presentation gives a quick insight into how Scopus can benefit the scientific community and which value it adds to research institutions.
Increasing the speed to discovery and making resources more visible are just a few key drivers for the worldwide success of www.scopus.com.
Read more at http://info.scopus.com
The document summarizes talks given by Jez Cope at the Oxford Open Science meeting about technology training provided to postgraduate students at the University of Bath. It describes two projects Cope worked on: 1) Connected Researcher @ Bath which involved workshops to promote social media use among researchers, and 2) research data management workshops for postgraduate students to develop practical data management strategies. Feedback from students indicated high satisfaction with the courses but low levels of actual implementation of data management plans. The document concludes that postgraduate students are aware of technologies' potential but wary of risks, and are influenced by supervisors but also influence them.
This document provides an overview of the publishing process for the Journal of Advanced Nursing (JAN). It discusses the peer review process, submission requirements, reasons for rejection, revisions, and production. Metrics on submissions and acceptances are presented. Guidelines that JAN adheres to are outlined, including those around authorship, plagiarism detection, and retractions. What authors can expect during submission and post-acceptance is also reviewed.
The document discusses the importance of managing research data. It notes that data management saves time, makes long-term data preservation easier, and supports sharing data with others. Data sharing is now required by most major funding agencies and academic journals. The document provides examples of problems caused by poor data management practices and outlines the key components of a data management plan, such as describing the data, file formats, sharing and archiving policies, and responsibilities. Researchers are encouraged to seek help from scientific consulting services for creating data management plans.
Research Transparency in the Social Sciences: DA-RT - ARDC
Transparency Protects the Legitimacy of Research
Transparency and Openness Promotion (TOP) Guidelines for Journals
What are we afraid of?
What can be gained?
This slideshow was used in an Introduction to Research Data Management course taught for the Mathematical, Physical and Life Sciences Division, University of Oxford, on 2016-02-03. It provides an overview of some key issues, looking at both day-to-day data management, and longer term issues, including sharing, and curation.
Oxford DTP - Sansone - Data publications and Scientific Data - Dec 2014 - Susanna-Assunta Sansone
- The document discusses the need for open and accessible data in research. It notes that over 50% of studies are not published due to selective reporting of results.
- There is a movement for "FAIR data" in life and medical sciences, where data is findable, accessible, interoperable, and reusable. However, not much data currently meets these standards.
- Publishers can play a role in incentivizing data sharing by implementing policies requiring data availability and format standards for publishing research. This includes supporting data citations and data journals.
This slideshow was used in a Preparing Your Research Material for the Future course for the Humanities Division, University of Oxford, on 2017-02-22. It provides an overview of some key issues, focusing on the long-term management of data and other research material, including sharing and curation.
The document summarizes a workshop hosted by the NIH Associate Director for Data Science to discuss charting the future of data science at NIH. The workshop goals were to get input from all stakeholders, identify strategic directions, policies, and funding initiatives, and have participants leave as advocates and supporters. The agenda included providing background, open discussion, identifying topics for breakout groups, subgroup discussions, and reporting back. The document provides context on current NIH data science efforts and examples of collaborators in building a biomedical research digital enterprise.
This document discusses the role of libraries in assessing and reporting research impact. It begins by outlining common metrics used to measure impact, such as citations, social media mentions, and altmetrics. It then discusses frameworks that can help contextualize impact, such as the Becker Model. The document emphasizes moving beyond simple counts to understand the true impact of research. It proposes strategies libraries can take, such as integrating altmetrics into institutional repositories and using research networking systems to map the diffusion of research outputs. Overall, the key points are that understanding research impact is complex, libraries can play an important role by supporting impact assessment, and stakeholder engagement is critical for local success.
June 18, 2014
NISO Virtual Conference: Transforming Assessment: Alternative Metrics and Other Trends
Assessing and Reporting Research Impact – A Role for the Library
- Kristi L. Holmes, Ph.D., Director, Galter Health Sciences Library, Northwestern University, Feinberg School of Medicine
Presentation slides on Open Science and research reproducibility. Presented by Gareth Knight (LSHTM Research Data Manager) on 18th September 2018, as part of an Open Science event for LSHTM Week 2018.
Curlew Research Brussels 2014 Electronic Data & Knowledge Management - Nick Lynch
An overview of Life Science externalisation and collaboration, and the challenges that Life Science companies face in delivering successful data sharing with their partners in either Open Innovation or pre-competitive workflows.
From logic model to data model: real and perceived barriers to research asses... - ORCID, Inc
The document discusses barriers to research assessment and describes how a web-based data collection and analysis system called iTRAQR helped address those barriers for the Physical Sciences-Oncology Centers (PS-OC) program. It summarizes how iTRAQR allowed automated collection of publication, collaboration, and other data; linking of individuals' contributions over time; and generation of charts and graphs to analyze outputs and outcomes at individual, center, and network levels. The document concludes that evaluation is improved by early design, engagement with participants, and consideration of follow-up actions informed by the evaluation.
Librarians can provide valuable data management services to researchers on campus. An effective strategy includes surveying researchers to identify needs, communicating service offerings through workshops and consultations, and providing in-depth guidance on data management plans and long-term data preservation. Developing workshops involves setting learning objectives, evaluating content, and securing resources like space and food. Consultations allow librarians to help with specific topics like choosing file formats or finding metadata standards. Creating a data management plan requires detailing a data inventory, metadata description, long-term preservation and access methods. Trusted disciplinary repositories and use of stable identifiers help ensure long-term findability and access.
The document summarizes the development of an evaluation framework and data collection system for the Physical Sciences-Oncology Centers (PS-OC) program. It describes how the program initially collected data manually but transitioned to an automated system called iTRAQR that allows for structured data entry and visualization of outputs like publications, collaborations, and personnel. The system helps analyze activities at the individual, center, and network levels. Lessons learned include starting with a logic model, having a flexible approach, and recognizing that evaluation depends on available data. Overall, the document outlines how the PS-OC program developed its evaluation strategy and an in-house system to systematically track outputs and outcomes over time.
This presentation covers:
Introduction: What Is “Research Data”? and the Data Lifecycle
Part 1:
Why Manage Your Data?
Formatting and organizing the data
Storage and Security of Data
Data documentation and metadata
Quality Control
Version control
Working with sensitive data
Controlled Vocabulary
Centralized Data Management
Part 2:
Data sharing
What are publishers & funders saying about data sharing?
Researchers’ Attitudes
Benefits of data sharing
Considerations before data sharing
Methods of Data Sharing
Uses of Shared Data and Their Limitations
Data management plans
Brief summary
Acknowledgments, References
A 45min presentation given at the 'Getting published in Nature's Scientific Data journal', hosted by the University of Cambridge Research Data Management team (www.data.cam.ac.uk). Presented on Monday 11th January 2016.
Getting to grips with research data management - Wendy Mears
This document provides an overview of research data management. It defines research data management and discusses its importance. It also outlines the data lifecycle model and provides guidance on sharing data, working with data, planning for data management, and useful resources for research data management. The document aims to help researchers effectively manage the data created throughout the research process.
PIDs, Data and Software: How Libraries Can Support Researchers in an Evolving... - Sarah Anna Stewart
Presentation given at the M25 Consortium of Academic Libraries, CPD25 Event on 'The Role of the Library in Supporting Research'. Provides an introduction to data, software and PIDs and a brief look at how libraries can enable researchers to gain impact and credit for their research data and software.
Identification of Early Career Researchers: How Universities and Funding Orga... - ORCID, Inc
Funding agencies, universities, and research institutes all face challenges in reliably identifying their researchers and monitoring outcomes over time. All researchers—and especially early career researchers seeking to establish their careers—need to be reliably connected to their research outputs, without the confusion that common or changeable names create. Graduate students and postdoctoral researchers supported by grants also face specific challenges: if they are not the PI, they are not included in grant information; they may not even know which grant(s) support them; and as a result, the existing challenges of reliably tying publications to grant funding are even more problematic. The use of the unique, persistent ORCID identifier can help support outcomes tracking and evaluation.
In 2012, the U.S. National Institutes of Health Biomedical Research Workforce Working Group made recommendations that the NIH should take to support a sustainable biomedical research workforce in the U.S. In the course of its study, working group members were “frustrated and sometimes stymied” by the lack of quality, comprehensive data about biomedical researchers. In response, NIH has recommended the development of a simple, comprehensive tracking system for trainees, implemented a shared, voluntary researcher profile system called the Science Experts Network Curriculum Vitae (SciENcv), and encouraged the adoption of unique, persistent ORCID identifiers for researchers. Additionally, NIH has begun collecting data about individuals in graduate and undergraduate student project roles who are supported by NIH grants.
Research universities like Texas A&M are also responding by incorporating the ORCID identifier into their systems, enabling the improved identification, data collection, and career outcome tracking of students and postdoctoral researchers--and educating these early career researchers about the benefits they will receive from a unique, persistent research identifier. They are also beginning to link Electronic Theses and Dissertations (ETDs) to early career researchers' ORCID records.
ORCID is an independent, non-profit organization that provides an open registry of unique and persistent identifiers for researchers and scholars. ORCID collaborates with the community to integrate ORCID identifiers into research systems and workflows, improving data management and accuracy across systems. ORCID enables interoperability between research systems worldwide, ensuring that researchers are correctly and automatically linked to their contributions. Since its launch in October 2012, ORCID has seen rapid adoption by more than 670,000 researchers and 130+ member organizations.
From Webinar 4/23/14, https://orcid.org/content/identification-early-career-researchers-how-universities-and-funding-organizations-are-using
This document discusses implementing ORCID identifiers at Northumbria University. It describes Northumbria as a research-rich university with over 1,300 academic staff across four faculties. The Scholarly Publications team provides support for research activities including the institutional repository and research data management. ORCID was first promoted in 2013 and is now integrated into the postgraduate researcher workflow and upcoming staff publishing workflows. ORCID helps with accurate attribution of authors in research metrics reporting and identifying collaborations. Maintaining central support and emphasizing the benefits to individuals have helped adoption.
This document discusses reproducible research and provides guidance on how to conduct research in a reproducible manner. It covers:
1. The importance of reproducible research due to large datasets, computational analyses, and the potential for human error. Ensuring reproducibility requires new expertise and infrastructure.
2. Key aspects of reproducible research include data management plans, version control, use of file formats and software/tools that allow reproducibility, and publishing data and code to allow others to replicate results.
3. Reproducible research benefits the scientific community by increasing transparency and allows researchers to re-analyze their own data in the future. Journals and funders are increasingly requiring reproducibility.
Closing the Loop in Healthcare Analytics - Correlating Clinical and Administrative Systems with Research Efforts to Deliver Clinical Efficiency in Real Time
2019-04-17 Bio-IT World G Suite-Jira Cloud Sample Tracking, Bruce Kozuma
The document describes building a low-cost sample tracking system using G Suite and Jira Cloud. It discusses using current off-the-shelf technology to create a serverless solution, how low-cost solutions can accelerate academic research, and developing the minimum viable product through iterative delivery. Permission to learn new skills can help develop capabilities to address problems and move research forward.
Perceptions of Project Managers in the Job Marketplace (and what to do about it), Bruce Kozuma
Given to the PMI Central Mass chapter on 2015-01-13. Can also be downloaded from here: https://pmicmass.org/document-repository/meetings-archive/2015-meetings-archive/381-2015-01-13-perceptions-of-pms-in-the-job-marketplace-bruce-kozuma
IT-focused Project Management in a Biopharmaceutical Manufacturing Environment, Bruce Kozuma
This document provides an overview of project management in a biopharmaceutical manufacturing environment. It discusses the key drivers in this environment beyond typical cost, schedule, and resource constraints, including supplying product to patients and compliance with regulations. The presentation focuses on an overview of a project to support manufacturing, differing aspects of quality between PMBOK and cGMP standards, and skills for project managers to thrive in a cGMP environment.
2016 Bio-IT World Cell Line Coordination Poster 2016-04-05v1, Bruce Kozuma
The document discusses establishing a common cell line metadata registry at the Broad Institute to facilitate collaboration. It proposes using an institutional database as the canonical source, and ingesting data into local systems to link project-specific information to parental cell line data. This would create a shared registry of parental cell lines available to all groups, along with project-specific daughter cell lines. The goals are to standardize metadata, enable discovery of related work, and accelerate research progress.
2016 Bio-IT World Cell Line Coordination 2016-04-06v1, Bruce Kozuma
The document discusses enabling cross-group collaboration on cell lines at the Broad Institute by developing a common platform for sharing cell line metadata. It proposes establishing a Cell Line Master Data Review Board to set standards for metadata categories and curation. A framework is described using an institutional database as the canonical source of cell line definitions, with local data management systems ingesting this data and linking it to project-specific metadata. This approach aims to facilitate collaboration by providing a shared understanding of cell lines across different research groups.
Essentials of Automations: The Art of Triggers and Actions in FME, Safe Software
In this second installment of our Essentials of Automations webinar series, we’ll explore the landscape of triggers and actions, guiding you through the nuances of authoring and adapting workspaces for seamless automations. Gain an understanding of the full spectrum of triggers and actions available in FME, empowering you to enhance your workspaces for efficient automation.
We’ll kick things off by showcasing the most commonly used event-based triggers, introducing you to various automation workflows like manual triggers, schedules, directory watchers, and more. Plus, see how these elements play out in real scenarios.
Whether you’re tweaking your current setup or building from the ground up, this session will arm you with the tools and insights needed to transform your FME usage into a powerhouse of productivity. Join us to discover effective strategies that simplify complex processes, enhancing your productivity and transforming your data management practices with FME. Let’s turn complexity into clarity and make your workspaces work wonders!
TrustArc Webinar - 2024 Global Privacy Survey, TrustArc
How does your privacy program stack up against your peers? What challenges are privacy teams tackling and prioritizing in 2024?
In the fifth annual Global Privacy Benchmarks Survey, we asked over 1,800 global privacy professionals and business executives to share their perspectives on the current state of privacy inside and outside of their organizations. This year’s report focused on emerging areas of importance for privacy and compliance professionals, including considerations and implications of Artificial Intelligence (AI) technologies, building brand trust, and different approaches for achieving higher privacy competence scores.
See how organizational priorities and strategic approaches to data security and privacy are evolving around the globe.
This webinar will review:
- The top 10 privacy insights from the fifth annual Global Privacy Benchmarks Survey
- The top challenges for privacy leaders, practitioners, and organizations in 2024
- Key themes to consider in developing and maintaining your privacy program
AI 101: An Introduction to the Basics and Impact of Artificial Intelligence, IndexBug
Imagine a world where machines not only perform tasks but also learn, adapt, and make decisions. This is the promise of Artificial Intelligence (AI), a technology that's not just enhancing our lives but revolutionizing entire industries.
Full-RAG: A modern architecture for hyper-personalization, Zilliz
Mike Del Balso, CEO & Co-Founder at Tecton, presents "Full RAG," a novel approach to AI recommendation systems, aiming to push beyond the limitations of traditional models through a deep integration of contextual insights and real-time data, leveraging the Retrieval-Augmented Generation architecture. This talk will outline Full RAG's potential to significantly enhance personalization, address engineering challenges such as data management and model training, and introduce data enrichment with reranking as a key solution. Attendees will gain crucial insights into the importance of hyperpersonalization in AI, the capabilities of Full RAG for advanced personalization, and strategies for managing complex data integrations for deploying cutting-edge AI solutions.
“An Outlook of the Ongoing and Future Relationship between Blockchain Technologies and Process-aware Information Systems.” Invited talk at the joint workshop on Blockchain for Information Systems (BC4IS) and Blockchain for Trusted Data Sharing (B4TDS), co-located with the 36th International Conference on Advanced Information Systems Engineering (CAiSE), 3 June 2024, Limassol, Cyprus.
Driving Business Innovation: Latest Generative AI Advancements & Success Story, Safe Software
Are you ready to revolutionize how you handle data? Join us for a webinar where we’ll bring you up to speed with the latest advancements in Generative AI technology and discover how leveraging FME with tools from giants like Google Gemini, Amazon, and Microsoft OpenAI can supercharge your workflow efficiency.
During the hour, we’ll take you through:
Guest Speaker Segment with Hannah Barrington: Dive into the world of dynamic real estate marketing with Hannah, the Marketing Manager at Workspace Group. Hear firsthand how their team generates engaging descriptions for thousands of office units by integrating diverse data sources—from PDF floorplans to web pages—using FME transformers, like OpenAIVisionConnector and AnthropicVisionConnector. This use case will show you how GenAI can streamline content creation for marketing across the board.
Ollama Use Case: Learn how Scenario Specialist Dmitri Bagh has utilized Ollama within FME to input data, create custom models, and enhance security protocols. This segment will include demos to illustrate the full capabilities of FME in AI-driven processes.
Custom AI Models: Discover how to leverage FME to build personalized AI models using your data. Whether it’s populating a model with local data for added security or integrating public AI tools, find out how FME facilitates a versatile and secure approach to AI.
We’ll wrap up with a live Q&A session where you can engage with our experts on your specific use cases, and learn more about optimizing your data workflows with AI.
This webinar is ideal for professionals seeking to harness the power of AI within their data management systems while ensuring high levels of customization and security. Whether you're a novice or an expert, gain actionable insights and strategies to elevate your data processes. Join us to see how FME and AI can revolutionize how you work with data!
Observability Concepts EVERY Developer Should Know -- DeveloperWeek Europe.pdf, Paige Cruz
Monitoring and observability aren’t traditionally found in software curriculums and many of us cobble this knowledge together from whatever vendor or ecosystem we were first introduced to and whatever is a part of your current company’s observability stack.
While the dev and ops silo continues to crumble….many organizations still relegate monitoring & observability as the purview of ops, infra and SRE teams. This is a mistake - achieving a highly observable system requires collaboration up and down the stack.
I, a former op, would like to extend an invitation to all application developers to join the observability party, and will share these foundational concepts to build on:
Let's Integrate MuleSoft RPA, COMPOSER, APM with AWS IDP along with Slack, shyamraj55
Discover the seamless integration of RPA (Robotic Process Automation), COMPOSER, and APM with AWS IDP enhanced with Slack notifications. Explore how these technologies converge to streamline workflows, optimize performance, and ensure secure access, all while leveraging the power of AWS IDP and real-time communication via Slack notifications.
In the rapidly evolving landscape of technologies, XML continues to play a vital role in structuring, storing, and transporting data across diverse systems. The recent advancements in artificial intelligence (AI) present new methodologies for enhancing XML development workflows, introducing efficiency, automation, and intelligent capabilities. This presentation will outline the scope and perspective of utilizing AI in XML development. The potential benefits and the possible pitfalls will be highlighted, providing a balanced view of the subject.
We will explore the capabilities of AI in understanding XML markup languages and autonomously creating structured XML content. Additionally, we will examine the capacity of AI to enrich plain text with appropriate XML markup. Practical examples and methodological guidelines will be provided to elucidate how AI can be effectively prompted to interpret and generate accurate XML markup.
Further emphasis will be placed on the role of AI in developing XSLT, or schemas such as XSD and Schematron. We will address the techniques and strategies adopted to create prompts for generating code, explaining code, or refactoring the code, and the results achieved.
The discussion will extend to how AI can be used to transform XML content. In particular, the focus will be on the use of AI XPath extension functions in XSLT, Schematron, Schematron Quick Fixes, or for XML content refactoring.
The presentation aims to deliver a comprehensive overview of AI usage in XML development, providing attendees with the necessary knowledge to make informed decisions. Whether you’re at the early stages of adopting AI or considering integrating it in advanced XML development, this presentation will cover all levels of expertise.
By highlighting the potential advantages and challenges of integrating AI with XML development tools and languages, the presentation seeks to inspire thoughtful conversation around the future of XML development. We’ll not only delve into the technical aspects of AI-powered XML development but also discuss practical implications and possible future directions.
Pushing the limits of ePRTC: 100ns holdover for 100 daysAdtran
At WSTS 2024, Alon Stern explored the topic of parametric holdover and explained how recent research findings can be implemented in real-world PNT networks to achieve 100 nanoseconds of accuracy for up to 100 days.
HCL Notes and Domino License Cost Reduction in the World of DLAU, panagenda
Webinar Recording: https://www.panagenda.com/webinars/hcl-notes-and-domino-license-cost-reduction-in-the-world-of-dlau/
The introduction of DLAU and the CCB & CCX licensing model caused quite a stir in the HCL community. As a Notes and Domino customer, you may have faced challenges with unexpected user counts and license costs. You probably have questions on how this new licensing approach works and how to benefit from it. Most importantly, you likely have budget constraints and want to save money where possible. Don’t worry, we can help with all of this!
We’ll show you how to fix common misconfigurations that cause higher-than-expected user counts, and how to identify accounts which you can deactivate to save money. There are also frequent patterns that can cause unnecessary cost, like using a person document instead of a mail-in for shared mailboxes. We’ll provide examples and solutions for those as well. And naturally we’ll explain the new licensing model.
Join HCL Ambassador Marc Thomas in this webinar with a special guest appearance from Franz Walder. It will give you the tools and know-how to stay on top of what is going on with Domino licensing. You will be able to lower your cost through an optimized configuration and keep it low going forward.
These topics will be covered
- Reducing license cost by finding and fixing misconfigurations and superfluous accounts
- How do CCB and CCX licenses really work?
- Understanding the DLAU tool and how to best utilize it
- Tips for common problem areas, like team mailboxes, functional/test users, etc
- Practical examples and best practices to implement right away
GraphSummit Singapore | The Art of the Possible with Graph - Q2 2024, Neo4j
Neha Bajwa, Vice President of Product Marketing, Neo4j
Join us as we explore breakthrough innovations enabled by interconnected data and AI. Discover firsthand how organizations use relationships in data to uncover contextual insights and solve our most pressing challenges – from optimizing supply chains, detecting fraud, and improving customer experiences to accelerating drug discoveries.
UiPath Test Automation using UiPath Test Suite series, part 6, DianaGray10
Welcome to UiPath Test Automation using UiPath Test Suite series part 6. In this session, we will cover Test Automation with generative AI and Open AI.
UiPath Test Automation with generative AI and Open AI webinar offers an in-depth exploration of leveraging cutting-edge technologies for test automation within the UiPath platform. Attendees will delve into the integration of generative AI, a test automation solution, with Open AI advanced natural language processing capabilities.
Throughout the session, participants will discover how this synergy empowers testers to automate repetitive tasks, enhance testing accuracy, and expedite the software testing life cycle. Topics covered include the seamless integration process, practical use cases, and the benefits of harnessing AI-driven automation for UiPath testing initiatives. By attending this webinar, testers, and automation professionals can gain valuable insights into harnessing the power of AI to optimize their test automation workflows within the UiPath ecosystem, ultimately driving efficiency and quality in software development processes.
What will you get from this session?
1. Insights into integrating generative AI.
2. Understanding how this integration enhances test automation within the UiPath platform
3. Practical demonstrations
4. Exploration of real-world use cases illustrating the benefits of AI-driven test automation for UiPath
Topics covered:
What is generative AI
Test Automation with generative AI and Open AI.
UiPath integration with generative AI
Speaker:
Deepak Rai, Automation Practice Lead, Boundaryless Group and UiPath MVP
What do a Lego brick and the XZ backdoor have in common? (Speck&Tech)
ABSTRACT: At first glance, a Lego brick and the XZ backdoor might seem to have in common only that both are building blocks: dependencies of creative and software projects. In reality, a Lego brick and the XZ backdoor case share much more than that.
Join the presentation to dive into a story of interoperability, standards, and open formats, and then discuss the important role that contributors play in a sustainable open source community.
BIO: An advocate for free software and for standard, open formats. She has been an active member of the Fedora and openSUSE projects and co-founded the LibreItalia Association, where she was involved in various LibreOffice-related events, migrations, and training. She previously worked on LibreOffice migrations and training courses for several public administrations and private companies. Since January 2020 she has worked at SUSE as a Software Release Engineer for Uyuni and SUSE Manager; when not following her passion for computers and for Geeko, she cultivates her curiosity about astronomy (the source of her nickname, deneb_alpha).
2018 Bio-IT World Agile in Wet Labs Speeds Big Data
1. Using Agile Techniques in Wet Labs to Speed the Creation of Even More Big Data
Bruce Kozuma, Principal System Analyst
Kendra West, Scrum Master, Data Sciences and Data Engineering
Thursday 2018/05/17, Bio-IT World
2. About the Authors
• Bruce Kozuma is a Principal Systems Analyst in IT
• Connect via LinkedIn: https://linkedin.com/in/bkozuma
• Kendra West is a Scrum Master in Data Sciences and Data Engineering
• Connect via LinkedIn: https://linkedin.com/in/kendraleighwest
3. Core Members: ~10
Institute Members: ~38
Associate Members: ~322
Employees: ~1000
Post-Docs, Fellows & Scholars in Residence: ~100
Visiting Scientists, Staff & Researchers: ~750
Students: ~550
Post-Docs/Partner Institutions: ~600
Over 3,400 Broadies working together
4. About the Broad Institute of MIT and Harvard
• Propelling the understanding and
treatment of disease
• Collaborating deeply
• Reaching globally
• Empowering scientists
• Building partnerships
• Sharing data and knowledge
• Promoting inclusion
5. The Agile Manifesto
Individuals & Interactions > Processes & Tools
*Delivering Value > Comprehensive Documentation
Customer Collaboration > Contract Negotiation
Responding to Change > Following a Plan
*adapted to fit organizational needs
6. What is the Agile approach?
• We follow the Twelve Agile Principles behind the Manifesto:
• Our highest priority is to satisfy the customer through early and continuous delivery of value
• Welcome changing requirements, even late in development; Agile processes harness change for the customer's competitive advantage
• Deliver frequently, from a couple of weeks to a couple of months, with a preference to the shorter timescale
• Value delivery is the primary measure of progress
Frequent delivery and feedback
7. What is the Agile approach?
• We follow the Twelve Agile Principles behind the Manifesto:
• Business people and developers must work together daily throughout the project
• The most efficient and effective method of conveying information to and within a development team is face-to-face conversation
• The best architectures, requirements, and designs emerge from self-organizing teams
• At regular intervals, the team reflects on how to become more effective, then tunes and adjusts its behavior accordingly
Teams communicating openly
8. What is the Agile approach?
• We follow the Twelve Agile Principles behind the Manifesto:
• Build projects around motivated individuals; give them the environment and support they need, and trust them to get the job done
• Agile processes promote sustainable development; the sponsors, developers, and users should be able to maintain a constant pace indefinitely
• Continuous attention to technical excellence and good design enhances agility
• Simplicity – the art of maximizing the amount of work not done – is essential
Doing our best work
9. What is Scrum?
• An Agile framework
• Born in Boston
• 90% of Agile teams worldwide use Scrum
• Borrows its name from rugby
10. Scrum Values, Pillars, and Elements
Scrum values
OpeneSs
Courage
Respect
FocUs
ComMitment
Scrum pillars
• Transparency
• Inspection
• Adaptation
Scrum team
• Product Owner
• Scrum Master
• Development Team
Scrum events
• The Sprint
• Sprint Planning
• Daily Scrum
• Sprint Review
• Sprint Retrospective
Scrum artifacts
• Product Backlog
• Sprint Backlog
• Increment
• Definition of Done
11. The Broad’s mission embodies many Agile values!
Broad Mission
• Propelling the understanding
and treatment of disease
• Collaborating deeply
• Reaching globally
• Empowering scientists
• Building partnerships
• Sharing data and knowledge
• Promoting inclusion
Agile themes
• Frequent delivery & feedback
• Teams communicating openly
• Doing our best work
Too many arrows!
12. How to measure Big Data?
• Classic way is via Doug Laney’s Volume, Velocity, Variety model
• Volume: size of data (e.g., total size of a data set, number of records, number of files, size of files)
• Velocity: rate at which data is produced and changed (e.g., production of BAMs, changes in UCSC genome releases GRCh37 vs hg17)
• Variety:
• Diversity of formats (e.g., FASTQ, BAM, VCF, CRAM)
• Non-aligned data structures (e.g., CDISC)
• Inconsistent data semantics (e.g., cell line names)
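The three-V model above can be made concrete with a short sketch. This is illustrative only — the function name `three_vs_snapshot` and the format-by-file-extension heuristic are our assumptions, not from the talk. It snapshots volume (bytes and file count) and variety (distribution of formats) for a directory of data files; velocity would be measured by comparing snapshots over time.

```python
from collections import Counter
from pathlib import Path

def three_vs_snapshot(root: str) -> dict:
    """Summarize a directory of data files along two of Laney's three V's."""
    sizes = []
    formats = Counter()
    for path in Path(root).rglob("*"):
        if path.is_file():
            sizes.append(path.stat().st_size)
            # Treat the final suffix (e.g., .bam, .vcf, .fastq) as the format.
            formats[path.suffix.lower() or "<none>"] += 1
    return {
        "volume_bytes": sum(sizes),   # Volume: total size of the data set
        "volume_files": len(sizes),   # Volume: number of files
        "variety_formats": dict(formats),  # Variety: diversity of formats
    }
```

Velocity could then be derived as the difference in `volume_bytes` between two snapshots divided by the elapsed time.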
13. Thesis of this talk!
• Using Agile techniques in wet labs and computational science speeds production of big data in multiple dimensions
• Volume
• Increases number of samples sequenced
• Lowers cost of sequencing/analysis and barriers to clinical sequencing
• Velocity
• Reduces cycle time of physical sample preparation prior to sequencing
• Improves use of people and resources in lab work
• Variety
• Increases types of samples being sequenced (e.g., types of cells, diseases, ethnic and geographic diversity, nomenclatures, APIs, and repositories)
14. Volume – Size of sequenced sample x # samples
Broad Institute timeline (years: 2002, 2004, 2006, 2007, 2008, 2009, 2010, 2012, 2013, 2014, 2015):
• Two major research groups come together: Whitehead/MIT Center for Genome Research; Harvard Institute of Chemistry and Cell Biology
• Broad Institute launched: initial $100M gift from Broad Foundations; a 10-year “experiment” in collaborative science
• Broad doubles in size: governed by MIT-Harvard leadership; administratively managed within MIT
• Headquarters building opens: 250,000 sq. ft. at 415 Main Street
• Broads double initial gift to $200M: unrestricted for Broad research and operations
• Creation of Stanley Center: founding $100M, 10-year gift from Stanley Medical Research Institute
• Broad Institute, Inc. established: 501(c)3 formed 9/08; operations begin 7/09
• “Experiment” declared a success: Broads announce new endowment of $400M; combined $600M current-use + endowment gift
• Carlos Slim Foundation provides $65M: new initiative in genomic disease research; 1st U.S. collaboration to receive funding
• Stanley building opens at 75 Ames Street
• Second gift of $74M: Slim Initiative for Genomic Medicine for the Americas
• 10th anniversary: $100M gift from Broad Foundations to launch next decade of science
• Creation of the Klarman Cell Observatory: Klarman Family Foundation gift of $33M
• Commitment of $650M: Ted Stanley invests in psychiatric research
• Broad Genomics: GP and DSP align; Genomics Platform; BSP Arrays and Sequencing merge
Volume: 100,000 genomes; ~70 PB of data; ~825K BAM files; ~1.2 billion hours of streaming music
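As a back-of-envelope check of the volume figures on this slide (100,000 genomes, ~70 PB), the implied average is about 700 GB of data per genome. The arithmetic below is ours, not the slide's, and assumes decimal storage units (1 PB = 10^6 GB):

```python
# Assumed decimal units: 1 PB = 1,000,000 GB.
genomes = 100_000
total_pb = 70
pb_in_gb = 1_000_000

gb_per_genome = total_pb * pb_in_gb / genomes
print(f"~{gb_per_genome:.0f} GB per genome")  # ~700 GB per genome
```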
15. Velocity
• Sequencing cost per genome has fallen to ~$1K
• Cost to analyze a genome has also fallen to ~$5
• Why does this matter? Precision/personalized medicine involves more sequencing
• Assert: Agile increases velocity of reducing costs via shorter cycle times, cheaper reagents, reusable software, better use of people, etc.
16. Velocity – Sample preparation and sequencing
• Reduces cycle time of physical sample preparation prior to sequencing
• Improves use of people and resources in lab work
• How? Using Dynamic Work Design
• Principle #1: Constant reconciliation of intent and activity
• Principle #2: Regular use of structured problem solving
• Principle #3: Optimal challenge
• Principle #4: Connect the human chain
17. Velocity – Sample preparation and sequencing
• Genomics Platform achieves these results through better technology:
• Instruments
• Software
• Reagents
• Training
• Organization
18. Velocity – Sample preparation and sequencing
• Dynamic Work Design shares many similarities with Agile/Scrum and uses many of the same techniques:
• Visual management
• Morning production meeting
• Pull system (Kanban)
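The pull system (Kanban) mentioned above can be sketched in a few lines. This is a generic illustration, not the Genomics Platform's actual tooling; the `KanbanStation` class and its work-in-progress (WIP) limit are hypothetical:

```python
from collections import deque

class KanbanStation:
    """A single station with a backlog and a work-in-progress (WIP) limit."""

    def __init__(self, wip_limit: int):
        self.wip_limit = wip_limit
        self.backlog = deque()
        self.in_progress = []

    def add(self, item):
        """Queue new work; nothing starts until the station pulls it."""
        self.backlog.append(item)

    def pull(self):
        """Pull the next item only when there is WIP capacity; else None."""
        if len(self.in_progress) < self.wip_limit and self.backlog:
            item = self.backlog.popleft()
            self.in_progress.append(item)
            return item
        return None

    def finish(self, item):
        """Completing an item frees capacity, signaling the next pull."""
        self.in_progress.remove(item)
```

The design point is that the WIP limit, not upstream pressure, decides when new work starts: a station pulls the next sample only when it has capacity, which keeps cycle times short and bottlenecks visible.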
21. Velocity – People and resources
• PRISM for multiplexed screening of compounds against cancer cell lines (wet lab)
• Dependency Map, a public portal for cancer data (wet lab, COTS software, software development)
Agile practices used
• Retrospectives
• Standups
• Sprints
• Kaizen
• Visual board
22. Velocity – People and resources
• Improving use of people and resources in data science by enabling reuse
• Data Biosphere: modular and interoperable components that can be assembled into diverse data environments. The Data Biosphere should be based on four governing principles. It should be:
• (1) modular, composed of functional components with well-specified interfaces;
• (2) community-driven, created by many groups to foster a diversity of ideas;
• (3) open, developed under open-source licenses that enable extensibility and reuse, with users able to add custom, proprietary modules as needed;
• (4) standards-based, consistent with standards developed by coalitions such as the Global Alliance for Genomics and Health (GA4GH)
Agile values
• Deliver value
• Work together
• Self-organizing teams
• Simplicity
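The first Data Biosphere principle — modular components with well-specified interfaces — can be illustrated with a minimal sketch. The `DataStore` interface below is purely hypothetical, not a Data Biosphere or GA4GH API; it shows how a well-specified interface lets implementations (a local store, a cloud bucket, a proprietary module) be swapped without changing the surrounding environment:

```python
from abc import ABC, abstractmethod
from typing import Iterable

class DataStore(ABC):
    """A well-specified interface any storage component must implement."""

    @abstractmethod
    def put(self, key: str, data: bytes) -> None: ...

    @abstractmethod
    def get(self, key: str) -> bytes: ...

    @abstractmethod
    def keys(self) -> Iterable[str]: ...

class InMemoryStore(DataStore):
    """A trivial implementation, standing in for a cloud-bucket module."""

    def __init__(self):
        self._data = {}

    def put(self, key, data):
        self._data[key] = data

    def get(self, key):
        return self._data[key]

    def keys(self):
        return self._data.keys()
```

Any environment written against `DataStore` works unchanged with a different backend, which is the sense in which modules become reusable and composable.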
23. Variety
• Increases types of samples being sequenced in additional dimensions, e.g.,
• Types and sources of cells
• Types of diseases
• Ethnic and geographic diversity
• Nomenclatures, APIs, and repositories
• Agile practices being applied in each case, speeding the processing of samples and the creation of both sample metadata and genomic data
24. Variety – Types and sources of cells
• Agile principles being used by Broad labs involved in the Human Cell Atlas to manage wet lab work (e.g., visual boards, retrospectives)
• Agile used to develop portals to enable patients, at scale, to sign up and consent for studies, and for sample processing
25. Variety – Ethnic and geographic diversity
• In 2016, 81% of participants in genome-wide association studies (GWAS) were of European descent, while people of African, Latin American, or native/indigenous ancestry made up less than 4%
• Agile practices used to further studies in under-represented populations (e.g., visual management, short delivery cycles)
26. Variety – Types of diseases
• Agile practices used to aid the study of a wider range of diseases, e.g.,
• The Sabeti Lab uses Agile practices in their work on infectious diseases to enable real-time sharing of genomic data
27. Variety – Nomenclatures, APIs, and repositories
• Nomenclatures are critically important to sharing data and promoting collaboration (e.g., cell lines)
• Broad scientists, both wet lab and data, are key contributors to organizations and alliances that enable and promote sharing of data through public (and coordinated) APIs
• Agile practices used by both groups in their daily work!
28. How the Broad encourages adoption of Agile
• Encourages collaboration within the Broad, e.g.,
• Platforms (e.g., Genomics, Data Sciences)
• Programs (e.g., Cancer, Infectious Disease and Microbiome)
• Academic labs (e.g., Sabeti Lab, Regev Lab)
• Employs Agile within scientific groups and administration, e.g.,
• Data Sciences Platform has Agile coaches, Scrum Masters, and Product Owners as job descriptions/titles
• Broad Information Technology Services employs Scrum for specific projects
• Supports affinity groups and offers related training
• Agile Academia, focused specifically on educating and spreading use of Agile
• PM@Broad, focused on traditional project management, but PMI is embracing Agile…
• People Development workshops (e.g., Influencing without Authority, Matrix Management)
29. Recapitulation – Thesis of this talk!
• Using Agile techniques in wet labs and computational science speeds production of big data in multiple dimensions
• Volume
• Increases number of samples sequenced
• Lowers cost of sequencing/analysis and barriers to clinical sequencing
• Velocity
• Reduces cycle time of physical sample preparation prior to sequencing
• Improves use of people and resources in lab work
• Variety
• Increases types of samples being sequenced (e.g., types of cells, diseases, ethnic and geographic diversity, nomenclatures, APIs, and repositories)
30. Acknowledgements
Thank you to the many people who helped pave the way for current and future success! A few notable individuals:
• Mark Baker
• Michelle Campo
• Jean Chang
• Raymond Coderre
• Sheila Dodge
• Vicky Guo
• Andrew Hollinger
• Eric Jones
• Jen Lapan
• Yenarae Lee
• Anthony Losada
• William Mayo
• Peter Ragone
• Jennifer Roth
• Katie Shakun
• David Siedzik
• Rocky Stroud
• Diolinda Vaz
• Sarah Winnicki
Broad Alumni
• Sadiya Akasha
• Zeyna Haddad
Editor's Notes
All citations in modified MLA format: <author>. <title of source>. <title of container>, <other contributors>, <version>, <number>, <publisher>, <publication date in format <year>, <month> <day>. Retrieved from <url> on <year>, <month> <day>.
“Manifesto for Agile Software Development”, 2001. Retrieved from http://agilemanifesto.org on 2018, May 14.
“Principles behind the Agile Manifesto”, 2001. Retrieved from http://agilemanifesto.org/principles.html on 2018, May 14
Kendra West, Zeyna Haddad. “Agile ToolKit @ Broad Workshop”, 2017, October 12.
“Principles behind the Agile Manifesto”, 2001. Retrieved from http://agilemanifesto.org/principles.html on 2018, May 14.
Kendra West, Zeyna Haddad. “Agile ToolKit @ Broad Workshop”, 2017, October 12.
“Principles behind the Agile Manifesto”, 2001. Retrieved from http://agilemanifesto.org/principles.html on 2018, May 14.
Kendra West, Zeyna Haddad. “Agile ToolKit @ Broad Workshop”, 2017, October 12.
Diego Lo Giudice, Holger Kisker, Nasry Angel. “How Can You Scale Your Agile Adoption?”, 2014, February 05. Forrester. Retrieved from https://www.forrester.com/report/How+Can+You+Scale+Your+Agile+Adoption/-/E-RES110444#AST962998%202013 on 2018, May 14.
Jeff Sutherland, J.J. Sutherland, “SCRUM The Art of Doing Twice the Work in Half the Time”, 2014. Retrieved from https://www.scruminc.com/new-scrum-the-book/ on 2018, May 14.
Kendra West, Zeyna Haddad. “Agile ToolKit @ Broad Workshop”, 2017, October 12.
The actual order is Commitment, Courage, Focus, Openness, Respect
It’s not an acrostic, it’s a mesostic.
Wikipedia. Acrostic. Retrieved from https://en.wikipedia.org/wiki/Acrostic on 2018, May 14.
Wikipedia. Mesostic. Retrieved from https://en.wikipedia.org/wiki/Mesostic on 2018, May 14.
Ken Schwaber, Jeff Sutherland. “The Scrum Guide™”, 2017, November. Retrieved from http://www.scrumguides.org/scrum-guide.html on 2018, May 14.
We are OPEN about what we’re working on and our progress.
We have COURAGE to change; to take on new challenges; to have frank conversations.
We RESPECT each other’s time; ideas; skills. We respect our customers.
We are FOCUSED on our goal; shield each other from distractions.
We COMMIT to completing our work; to delivering value to the customer.
Ken Schwaber, Jeff Sutherland. “The Scrum Guide™”, 2017, November. Retrieved from http://www.scrumguides.org/scrum-guide.html on 2018, May 14.
Doug Laney. “3D Data Management: Controlling Data Volume, Velocity, and Variety”. META Group (now Gartner Group). 2001, February 06. Retrieved from https://blogs.gartner.com/doug-laney/files/2012/01/ad949-3D-Data-Management-Controlling-Data-Volume-Velocity-and-Variety.pdf on 2018, May 14.
Gartner, Inc. “Big Data”. Gartner IT Glossary > Big Data. Retrieved from https://www.gartner.com/it-glossary/big-data/ on 2018, May 14.
Seth Grimes. “4 VS for Big Data Analytics”. Breakthrough Analysis (blog). 2013, July 31. Retrieved from https://breakthroughanalysis.com/2013/07/31/4-vs-for-big-data-analytics/ on 2018, May 14.
International Business Machines Corp. “The Four V’s of Big Data”. Retrieved from http://www.ibmbigdatahub.com/infographic/four-vs-big-data on 2018, May 14.
“Dimensions of Big Data”. Klarity, Social Media Broadcasts (SMB) Limited. 2015, July 27. Retrieved from http://www.klarity-analytics.com/2015/07/27/dimensions-of-big-data/ on 2018, May 14.
Gil Press. “A Very Short History of Big Data”. Forbes Media LLC. 2013, December 21. Retrieved from https://www.forbes.com/sites/gilpress/2013/05/09/a-very-short-history-of-big-data/#897211d65a18 on 2018, May 14.
Steve Lohr. “The Origins of ‘Big Data’: An Etymological Detective Story”. The New York Times Company. 2013, February 01. Retrieved from https://bits.blogs.nytimes.com/2013/02/01/the-origins-of-big-data-an-etymological-detective-story/?_r=0 on 2018, May 14.
University of California Santa Cruz, “List of UCSC genome releases”, “Frequently Asked Questions: Assembly Releases and Versions”. Retrieved from https://genome.ucsc.edu/FAQ/FAQreleases.html#release1 on 2018, May 14.
You can skip the rest of the talk if you get this
BAM files are 80–90 GB each
Just establishing Broad’s Big Data credentials in terms of size
Zachary D. Stephens, Skylar Y. Lee, Faraz Faghri, et al. “Big Data: Astronomical or Genomical?”, PLOS Biology, 2015, July 15. Retrieved from http://journals.plos.org/plosbiology/article?id=10.1371/journal.pbio.1002195 on 2018, May 14.
“DNA Sequencing Costs: Data”, National Human Genome Research Institute. Retrieved from https://www.genome.gov/sequencingcostsdata/ on 2018, May 14.
“Cost per Genome”, National Human Genome Research Institute. Retrieved from https://www.genome.gov/images/content/costpermb_2017.jpg on 2018, May 14.
Sheila Dodge, Don Kieffer, Nelson Repenning, et al. “Using Dynamic Work Design to Help Cure Cancer (and other diseases)”, MIT Sloan School of Management, 2016, June. Retrieved from http://mitsloan.mit.edu/shared/ods/documents/Repenning_Cancer_full.pdf&PubID=15032 on 2018, May 14.
MIT Sloan Executive Education. “Speeding the cure for cancer: Financial engineering and Dynamic Work Design”, 2017, August 25. Retrieved from https://executive.mit.edu/blogpost/speeding-the-cure-for-cancer-financial-engineering-and-dynamic-work-design on 2018, May 14.
Alix Stuart, “From Cogs to Creators: Fueling employee engagement with dynamic work design”, Alumni Magazine, 2016. Retrieved from http://mitsloan.mit.edu/alumnimagazine/2016/fall/innovation-at-work.php on 2018, May 14.
Sheila Dodge, et al. “Using Dynamic Work Design to Help Cure Cancer (and other diseases)”, 2016, June. Retrieved from mitsloan.mit.edu/shared/ods/documents/Repenning_Cancer_full.pdf&PubID=15032 on 2018, May 14.
MIT Sloan Executive Education. “Speeding the cure for cancer: Financial engineering and Dynamic Work Design”, 2017, August 25. Retrieved from https://executive.mit.edu/blogpost/speeding-the-cure-for-cancer-financial-engineering-and-dynamic-work-design on 2018, May 14.
Alix Stuart, “From Cogs to Creators: Fueling employee engagement with dynamic work design”, Alumni Magazine, 2016. Retrieved from http://mitsloan.mit.edu/alumnimagazine/2016/fall/innovation-at-work.php on 2018, May 14.
Sheila Dodge, et al. “Using Dynamic Work Design to Help Cure Cancer (and other diseases)”, 2016, June. Retrieved from mitsloan.mit.edu/shared/ods/documents/Repenning_Cancer_full.pdf&PubID=15032 on 2018, May 14.
MIT Sloan Executive Education. “Speeding the cure for cancer: Financial engineering and Dynamic Work Design”, 2017, August 25. Retrieved from https://executive.mit.edu/blogpost/speeding-the-cure-for-cancer-financial-engineering-and-dynamic-work-design on 2018, May 14.
Alix Stuart, “From Cogs to Creators: Fueling employee engagement with dynamic work design”, Alumni Magazine, 2016. Retrieved from http://mitsloan.mit.edu/alumnimagazine/2016/fall/innovation-at-work.php on 2018, May 14.
Achilles. Retrieved from https://portals.broadinstitute.org/achilles on 2018, May 14.
PRISM. Retrieved from https://www.broadinstitute.org/news/7944 on 2018, May 14.
Dependency Map. Retrieved from https://depmap.org/portal/ on 2018, May 14.
Jamie Ducharme, “Local Researchers Mapped the Many Ways Cancer Cells Dodge Death”, Boston Magazine, 2017, July 31. Retrieved from https://www.bostonmagazine.com/health/2017/07/31/broad-cancer-dependency-map/ on 2018, May 14.
Benedict Paten, et al. “A Data Biosphere for Biomedical Research”, Medium, 2017, Oct 16. Retrieved from https://medium.com/@benedictpaten/a-data-biosphere-for-biomedical-research-d212bbfae95d on 2018, May 14.
Human Cell Atlas. Retrieved from https://www.humancellatlas.org on 2018, May 14.
Metastatic Prostate Cancer Project. Retrieved from https://mpcproject.org/home on 2018, May 14.
Emily Mullin. “Solving the Lack of Diversity in Genomic Research”, MIT Technology Review, 2016, October 25. Retrieved from https://www.technologyreview.com/s/602671/solving-the-lack-of-diversity-in-genomic-research/ on 2018, May 14.
Heather Lindsey, “Bringing Diversity to Genomic Data: Under-Represented Ethnic Minorities Sometimes Misclassified, Misdiagnosed”, Clinical Laboratory News, 2017, June 1. Retrieved from https://www.aacc.org/publications/cln/articles/2017/june/bringing-diversity-to-genomic-data-under-represented-ethnic-minorities on 2018, May 14.
Zhai Yun Tan. “Genetic test accuracy stymied by lack of diversity in genomic research”, MedCityNews, 2016, August 18. Retrieved from https://medcitynews.com/2016/08/ack-of-diversity-in-genomic-research/?rf=1 on 2018, May 14.
NeuroGAP-Psychosis. Retrieved from https://www.broadinstitute.org/neurogap/neurogap-psychosis on 2018, May 14.
“Sherlock: Detecting disease with CRISPR”. Retrieved from https://www.broadinstitute.org/videos/sherlock-detecting-disease-crispr on 2018, May 14.
Sabeti Lab. Retrieved from https://www.sabetilab.org/ on 2018, May 14.
PRISM. Retrieved from https://www.broadinstitute.org/news/7944 on 2018, May 14.
Global Alliance for Genomics & Health. Retrieved from https://www.ga4gh.org/ on 2018, May 14.
Genomic Data Commons. Retrieved from https://www.cancer.gov/about-nci/organization/ccg/research/computational-genomics/gdc on 2018, May 14.
Horia Slusanschi, “Introducing the PMI Agile Practice Guide”, Projectmanagement.com, 2017, May 21. Retrieved from https://www.projectmanagement.com/blog-post/29761/Introducing-the-PMI-Agile-Practice-Guide on 2018, May 14.