Fostering Serendipity through Big Linked Data - Muhammad Saleem
This document discusses fostering serendipity by linking large biomedical datasets: over 30 billion triples from The Cancer Genome Atlas (TCGA) were linked with over 23 million publications from PubMed. The authors developed an architecture called TopFed to continuously integrate new data through parallel querying. TopFed was evaluated against the FedX system and showed significantly better performance, with query runtimes over 75 times faster for some queries. A visualization interface was also created to explore the linked data.
CPTAC Data Portal and Proteomics Data Commons - imgcommcall
The CPTAC Data Coordinating Center houses proteomic datasets from CPTAC studies in its public data portal and assay portal. It analyzes data through a common pipeline and enables high-speed access. The Proteomic Data Commons is being developed to provide unified access to mass spectrometry data from multiple sources and allow analysis tools to access data in the cloud. It currently hosts data from CPTAC studies and is working to integrate with other cancer research data clouds. The goal is to improve data sharing, reuse and reproducibility across proteomic studies.
Establishing a UQ Research Data Management Service - ARDC
The University of Queensland established a new Research Data Management service called the UQ Research Data Manager to address research integrity issues and satisfy requirements for research data management. The service will provide centralized infrastructure for the entire research lifecycle including data creation, curation, usage, and publication. It will initially roll out working data storage and access functionality to high-risk schools while emphasizing training and education. The goals are to ensure research data remains secure, accurate and reusable according to FAIR data principles and to meet various compliance requirements.
Integrating repositories and eLab notebooks through an open science framework - rmacneil88
An overview of Jisc's investigation of including electronic lab notebooks in the Research Data Shared Service, and of the benefits of Connected ELNs like RSpace.
Rots RDAP11 Data Archives in Federal Agencies - ASIS&T
Arnold Rots, VAO; Data Archives in Federal Agencies; RDAP11 Summit
The 2nd Research Data Access and Preservation (RDAP) Summit
An ASIS&T Summit
March 31-April 1, 2011 Denver, CO
In cooperation with the Coalition for Networked Information
http://asist.org/Conferences/RDAP11/index.html
Open science, open-source, and open data: Collaboration as an emergent property? - Hilmar Lapp
Talk I gave as part of the panel "How will cyberinfrastructure capabilities shape the future of scientific collaboration?" at the Cyberinfrastructure for Collaborative Science workshop, held at the National Evolutionary Synthesis Center (NESCent), May 18-20, 2011.
More information about the workshop at
https://www.nescent.org/wg_collabsci/2011_Workshop
The document discusses a global initiative to facilitate open access to scholarly resources and research data across boundaries by building a federation of registries. It provides use cases of how such a system could help postgraduate students, research project leaders, administrators, and ICT specialists discover and monitor globally accessible data relevant to their work. The proposed strategy is to create a "Register of Registries" that would enable consistent discovery services for finding data in collections through a standardized, interoperable model. An initial scoping meeting was held in 2007, with annual meetings since then to develop the strategy.
The Cancer Genomics Cloud (CGC) Pilots NIH IC Show and Tell - Steve Tsang
The document discusses three cloud pilot projects funded by the NCI to address challenges with large cancer genomics datasets. It describes the FireCloud, Cancer Genomics Cloud, and Seven Bridges Genomics Cancer Genomics Cloud platforms, which provide co-located storage and computation of datasets like TCGA to enable broader access and analysis of the data. Key goals are making the data usable, supporting collaboration, ensuring reproducibility, and allowing integration of public and private data with analysis tools.
The document outlines plans to transition the cBioPortal cancer genomics platform to an open source model with coordinated development between Memorial Sloan Kettering Cancer Center, Dana-Farber Cancer Institute, and Princess Margaret Cancer Centre. It discusses expanding usage, new features, funding options, and establishing an advisory committee. The goal is to build a sustainable open source community through collaborative development, additional funding, and engagement with users and potential contributors.
NCI Cancer Research Data Commons - Overview - imgcommcall
The NCI Cancer Research Data Commons aims to enable sharing of diverse cancer research data across institutions by providing easy access to data stored in domain-specific repositories through a common authentication and authorization mechanism. It utilizes a framework of reusable components including data nodes, a cancer data aggregator, and cloud resources to integrate genomic, imaging, proteomic, and other data types while controlling access. The goals are to facilitate discovery and analysis tools and to sustainably share data publicly to advance cancer research.
This webinar discusses opportunities for high performance computing (HPC) in pharmaceutical research and development. It features presentations from experts in the field including Peter Coveney from University College London, Matt Gianni from Cray Inc., and Darren Green from GlaxoSmithKline. The webinar covers topics such as using HPC for virtual drug screening and binding affinity calculations, automation and integration of HPC resources, and examples of HPC facilities being used for genomic sequencing and analyzing large biomedicine datasets.
The document discusses several US grid projects, including campus and regional grids like Purdue and UCLA that provide tens of thousands of CPUs and petabytes of storage. It describes national grids like TeraGrid and Open Science Grid that provide over a petaflop of computing power through resource sharing agreements. It outlines specific communities and projects using these grids for sciences like high energy physics, astronomy, biosciences, and earthquake modeling through the Southern California Earthquake Center. Software providers and toolkits that enable these grids are also mentioned, such as Globus, the Virtual Data Toolkit, and services like Introduce.
This document discusses the challenges and opportunities biology faces with increasing data generation. It outlines four key points:
1) Research approaches for analyzing infinite genomic data streams, such as digital normalization which compresses data while retaining information.
2) The need for usable software and decentralized infrastructure to perform real-time, streaming data analysis.
3) The importance of open science and reproducibility given most researchers cannot replicate their own computational analyses.
4) The lack of data analysis training in biology and efforts at UC Davis to address this through workshops and community building.
This document summarizes the accomplishments of the National Resource for Network Biology over a reporting period. It lists numerous quantitative metrics of success, including over 100 publications citing their grants, thousands of daily downloads and uses of their software tools, and training over 100 users. It also provides details on improvements and developments made to several of their modeling frameworks, algorithms, and software applications. Finally, it outlines the formation of a new working group on single-cell RNA-seq analysis and visualization, and improvements made to their computing infrastructure.
FDA NGS and Big Data Conference September 2014 - Warren Kibbe
The document discusses the National Cancer Institute's efforts to address challenges in cancer data access and analysis through the development of the NCI Genomics Data Commons and NCI Cloud Pilots. The NCI Genomics Data Commons will provide integrated genomic and clinical cancer data from projects like TCGA to researchers. The NCI Cloud Pilots aim to explore cloud-based models for analyzing large cancer genomics datasets without having to download the full datasets locally, helping to enable more widespread data access and analysis. The goal is to build a national learning health system for cancer clinical genomics through open data sharing and cloud-based approaches.
The pulse of cloud computing with bioinformatics as an example - Enis Afgan
The document discusses how cloud computing can enable large-scale genomic analysis by providing on-demand access to computational resources and petabytes of reference data. It describes how tools like Galaxy and CloudMan allow researchers to perform genomic analysis in the cloud through a web browser by automating the provisioning and configuration of cloud resources. This approach makes genomic research more accessible and enables the elastic scaling of analysis as needed.
This presentation was given at the GlobusWorld 2020 Virtual Conference, by Ian Foster, Rachana Ananthakrishnan, and Vas Vasiliadis from the University of Chicago.
Adelaide Rhodes has over 15 years of experience in bioinformatics and analyzing large datasets for environmental, metagenomic, and human health applications. She has strong skills in programming languages like R, Python, and UNIX/Linux and has experience building workflows for analyzing NGS data using tools like Nextflow, Snakemake, and Kubernetes on cloud platforms. She currently works as a Senior Bioinformatics Scientist at Tufts University, where she provides strategic consulting and trains researchers on analytical methods and cloud resources.
EBI Industry programme TCGA Warren Kibbe November 2013 - Warren Kibbe
This document discusses strategic objectives and activities of the National Cancer Institute's Center for Biomedical Informatics and Information Technology (NCI CBIIT). The key objectives are to reduce cancer risk, improve cancer outcomes, provide cancer information to the public, and enable precision oncology through data access and modeling. Specific activities mentioned include the Genomic Data Commons, cloud computing initiatives, clinical trials repositories, and The Cancer Genome Atlas (TCGA) project. TCGA has collected over 700 terabytes of genomic and clinical data on 20+ cancer types to date. The data provides a platform for understanding cancer drivers, molecular subtypes of cancers, and the implications of data sharing policies.
The dkNET annual meeting provided progress reports on recommendations from 2016. New resources were added to the registry including datasets, centers, and repositories. The dkNET site was improved with updated interfaces for browsing resources and new visualization tools. Efforts to support rigor and reproducibility included expanding RRID adoption and developing API services to integrate dkNET data. Plans are underway to provide more robust tracking of research resources.
GCAT Update June 2013 @ The Clinical Genome Conference - David Mittelman
GCAT (Genome Comparison and Analytic Testing) is a free online platform for benchmarking NGS methods and developing standards and metrics. It allows users to process sequencing data through different analysis tools and compare results. Since launching in April 2013, GCAT has been viewed over 20,000 times and has processed large amounts of sequencing data. New features continue to be added, including comparing variant calls to validation datasets and support for additional sequencing applications like RNAseq and de novo assembly. The goal is to accelerate adoption of NGS technologies by providing a common system for experimentation and validation of analysis methods.
UC San Diego's BioBurst cluster provides additional resources for bioinformatics workloads through an I/O accelerator, FPGA-based computational accelerator, and 672 additional compute cores. The I/O accelerator uses 40TB of flash memory to alleviate small block/file I/O issues in bioinformatics applications. The FPGA accelerator can perform genome analysis tasks much faster than standard hardware. The resources are integrated with UC San Diego's existing high performance computing cluster to improve research productivity and address bottlenecks in genomics and other bioinformatics applications and pipelines.
This document summarizes a presentation about Globus Genomics, a service that provides genomic data analysis tools and workflows through a web interface. It allows users to securely transfer data, run standardized analysis pipelines, access computational resources on demand through Amazon Web Services, and collaborate on shared data and workflows. The service aims to make genomic analysis more accessible, reproducible, and sustainable through various pricing models and support for individual labs and bioinformatics cores.
dkNET Webinar: Creating and Sustaining a FAIR Biomedical Data Ecosystem 10/09... - dkNET
Abstract
In this presentation, Susan Gregurick, Ph.D., Associate Director of Data Science and Director, Office of Data Science Strategy at the National Institutes of Health, will share the NIH's vision for a modernized, integrated FAIR biomedical data ecosystem and the strategic roadmap that NIH is following to achieve this vision. Dr. Gregurick will highlight projects being implemented by team members across the NIH's 27 institutes and centers and will discuss ways that industry, academia, and other communities can help NIH enable a FAIR data ecosystem. Finally, she will weave in how this strategy is being leveraged to address the COVID-19 pandemic.
Presenter: Susan Gregurick, Ph.D., Associate Director of Data Science and Director, Office of Data Science Strategy at the National Institutes of Health
dkNET Webinar Information: https://dknet.org/about/webinar
What is Data Commons and How Can Your Organization Build One? - Robert Grossman
1. Data commons co-locate large biomedical datasets with cloud computing infrastructure and analysis tools to create shared resources for the research community.
2. The NCI Genomic Data Commons is an example of a data commons that makes over 2.5 petabytes of cancer genomics data available through web portals, APIs, and harmonized analysis pipelines.
3. The Gen3 platform is an open source software stack for building data commons that can interoperate through common APIs and data models to support reproducible, collaborative research across projects.
This presentation focuses on the networking requirements of using open source to treat diseases through cell-based analysis at the molecular level. Transporting this knowledge across devices and centers requires a whole new structure for networking. Terabits per second with high availability and guaranteed delivery are required to meet these needs. Shared knowledge is critical for real-time analysis. The talk discusses data flows, open networking, and databases that are all open source and have been optimized for this problem.
Similar to The Cancer Genomics Cloud (CGC) pilots - an Introduction (20)
Mending Clothing to Support Sustainable Fashion_CIMaR 2024.pdf - Selcen Ozturkcan
Ozturkcan, S., Berndt, A., & Angelakis, A. (2024). Mending clothing to support sustainable fashion. Presented at the 31st Annual Conference by the Consortium for International Marketing Research (CIMaR), 10-13 Jun 2024, University of Gävle, Sweden.
_Extraction of Ethylene oxide and 2-Chloroethanol from alternate matrices Li... - LucyHearn1
How do you know your food is safe?
Last Friday was World Food Safety Day, facilitated by the Food and Agriculture Organization of the United Nations (FAO) and the World Health Organization (WHO), whose slogan rightly says, 'food safety is everyone's business'. Because of this, I thought it would be worth sharing some data that I have worked on in this field!
Working at Markes International has really opened my eyes (and unfortunately my friends and family 🤣) to food safety and quality, especially with my recent application work on ethylene oxide and 2-chloroethanol residues in foodstuffs, as one of the biggest global food recalls in history was implemented by the Rapid Alert System for Food and Feed (RASFF) in 2021 for high levels of these carcinogenic compounds, and is still ongoing.
The binding of cosmological structures by massless topological defects - Sérgio Sacani
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or modified gravity theory is mitigated, at least in part.
Describing and Interpreting an Immersive Learning Case with the Immersion Cub... - Leonel Morgado
Current descriptions of immersive learning cases are often difficult or impossible to compare. This is due to a myriad of different options on what details to include, which aspects are relevant, and on the descriptive approaches employed. Also, these aspects often combine very specific details with more general guidelines or indicate intents and rationales without clarifying their implementation. In this paper we provide a method to describe immersive learning cases that is structured to enable comparisons, yet flexible enough to allow researchers and practitioners to decide which aspects to include. This method leverages a taxonomy that classifies educational aspects at three levels (uses, practices, and strategies) and then utilizes two frameworks, the Immersive Learning Brain and the Immersion Cube, to enable a structured description and interpretation of immersive learning cases. The method is then demonstrated on a published immersive learning case on training for wind turbine maintenance using virtual reality. Applying the method results in a structured artifact, the Immersive Learning Case Sheet, that tags the case with its proximal uses, practices, and strategies, and refines the free text case description to ensure that matching details are included. This contribution is thus a case description method in support of future comparative research of immersive learning cases. We then discuss how the resulting description and interpretation can be leveraged to change immersion learning cases, by enriching them (considering low-effort changes or additions) or innovating (exploring more challenging avenues of transformation). The method holds significant promise to support better-grounded research in immersive learning.
Travis Hills of MN is Making Clean Water Accessible to All Through High Flux ... - Travis Hills MN
By harnessing the power of High Flux Vacuum Membrane Distillation, Travis Hills from MN envisions a future where clean and safe drinking water is accessible to all, regardless of geographical location or economic status.
Authoring a personal GPT for your research and practice: How we created the Q... - Leonel Morgado
Thematic analysis in qualitative research is a time-consuming and systematic task, typically done using teams. Team members must ground their activities on common understandings of the major concepts underlying the thematic analysis, and define criteria for its development. However, conceptual misunderstandings, equivocations, and lack of adherence to criteria are challenges to the quality and speed of this process. Given the distributed and uncertain nature of this process, we wondered if the tasks in thematic analysis could be supported by readily available artificial intelligence chatbots. Our early efforts point to potential benefits: not just saving time in the coding process but better adherence to criteria and grounding, by increasing triangulation between humans and artificial intelligence. This tutorial will provide a description and demonstration of the process we followed, as two academic researchers, to develop a custom ChatGPT to assist with qualitative coding in the thematic data analysis process of immersive learning accounts in a survey of the academic literature: QUAL-E Immersive Learning Thematic Analysis Helper. In the hands-on time, participants will try out QUAL-E and develop their ideas for their own qualitative coding ChatGPT. Participants that have the paid ChatGPT Plus subscription can create a draft of their assistants. The organizers will provide course materials and slide deck that participants will be able to utilize to continue development of their custom GPT. The paid subscription to ChatGPT Plus is not required to participate in this workshop, just for trying out personal GPTs during it.
Sexuality - Issues, Attitude and Behaviour - Applied Social Psychology - Psyc... - PsychoTech Services
A proprietary approach developed by bringing together the best of learning theories from Psychology, design principles from the world of visualization, and pedagogical methods from over a decade of training experience, that enables you to: Learn better, faster!
PPT on Alternate Wetting and Drying presented at the three-day 'Training and Validation Workshop on Modules of Climate Smart Agriculture (CSA) Technologies in South Asia' workshop on April 22, 2024.
SDSS1335+0728: The awakening of a ∼10⁶ M⊙ black hole⋆ - Sérgio Sacani
Context. The early-type galaxy SDSS J133519.91+072807.4 (hereafter SDSS1335+0728), which had exhibited no prior optical variations during the preceding two decades, began showing significant nuclear variability in the Zwicky Transient Facility (ZTF) alert stream from December 2019 (as ZTF19acnskyy). This variability behaviour, coupled with the host-galaxy properties, suggests that SDSS1335+0728 hosts a ∼10⁶ M⊙ black hole (BH) that is currently in the process of 'turning on'. Aims. We present a multi-wavelength photometric analysis and spectroscopic follow-up performed with the aim of better understanding the origin of the nuclear variations detected in SDSS1335+0728. Methods. We used archival photometry (from WISE, 2MASS, SDSS, GALEX, eROSITA) and spectroscopic data (from SDSS and LAMOST) to study the state of SDSS1335+0728 prior to December 2019, and new observations from Swift, SOAR/Goodman, VLT/X-shooter, and Keck/LRIS taken after its turn-on to characterise its current state. We analysed the variability of SDSS1335+0728 in the X-ray/UV/optical/mid-infrared range, modelled its spectral energy distribution prior to and after December 2019, and studied the evolution of its UV/optical spectra. Results. From our multi-wavelength photometric analysis, we find that: (a) since 2021, the UV flux (from Swift/UVOT observations) is four times brighter than the flux reported by GALEX in 2004; (b) since June 2022, the mid-infrared flux has risen more than two times, and the W1−W2 WISE colour has become redder; and (c) since February 2024, the source has begun showing X-ray emission. From our spectroscopic follow-up, we see that (i) the narrow emission line ratios are now consistent with a more energetic ionising continuum; (ii) broad emission lines are not detected; and (iii) the [OIII] line increased its flux ∼3.6 years after the first ZTF alert, which implies a relatively compact narrow-line-emitting region. Conclusions. We conclude that the variations observed in SDSS1335+0728 could be explained either by a ∼10⁶ M⊙ AGN that is just turning on or by an exotic tidal disruption event (TDE). If the former is true, SDSS1335+0728 is one of the strongest cases of an AGN observed in the process of activating. If the latter were found to be the case, it would correspond to the longest and faintest TDE ever observed (or another class of still unknown nuclear transient). Future observations of SDSS1335+0728 are crucial to further understand its behaviour. Key words: galaxies: active – accretion, accretion discs – galaxies: individual: SDSS J133519.91+072807.4
(June 12, 2024) Webinar: Development of PET theranostics targeting the molecu... - Scintica Instrumentation
Targeting Hsp90 and its pathogen Orthologs with Tethered Inhibitors as a Diagnostic and Therapeutic Strategy for cancer and infectious diseases with Dr. Timothy Haystead.
The Cancer Genomics Cloud (CGC) pilots - an Introduction
1. NCI Cancer Genomics Cloud (CGC) Pilots
Steve Tsang
Attain LLC
National Cancer Institute
2. Disclaimer
The opinions, comments, and assessments expressed in this article are the author's own and do not necessarily reflect the views of the National Cancer Institute or the National Institutes of Health.
https://ethics.od.nih.gov/topics/Disclaimer.htm
3. Cancer Genomic Data Challenges
● > 2.5 PB of TCGA data (WXS, RNASeq, WGS)
● Fragmented repositories of cancer genomic data
○ TCGA, TARGET and CGCI have their own data repositories (DCCs)
○ Sequencing data: BAM files at CGHub, while VCF/MAF files are at the DCCs
● Assuming the 2.5 PB TCGA data set
○ Storage and Data Protection cost approximately $2,000,000 per year
○ Downloading TCGA data at 10 Gb/sec ≈ 23 days (see the back-of-envelope check below)
○ Only large institutions have the ability to utilize this data
○ These data types will continue to grow
Slide Courtesy of Tanja Davidsen, NCI
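The 23-day figure on slide 3 follows directly from the stated numbers. A minimal back-of-envelope check in Python, assuming 2.5 PB means 2.5 × 10^15 bytes and a fully sustained 10 Gb/s link:

```python
# Back-of-envelope check of slide 3's download estimate:
# 2.5 PB of TCGA data pulled over a sustained 10 Gb/s link.

DATASET_BYTES = 2.5e15      # 2.5 PB, taking 1 PB = 10^15 bytes
LINK_BITS_PER_SEC = 10e9    # 10 Gb/s

seconds = DATASET_BYTES * 8 / LINK_BITS_PER_SEC
days = seconds / 86400      # 86,400 seconds per day

print(f"Transfer time: {days:.1f} days")  # ~23.1 days
```

Real transfers rarely sustain full line rate, so the wall-clock time would typically be longer; that is the core argument for co-locating compute with the data instead of downloading it.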
6. FireCloud Concepts - http://firecloud.org
● Data Files reside in Google Cloud Storage
● Workspaces
● Tasks and Workflows
● Method Repositories
● Provenance captured for every analysis run (i.e., what version of which methods was run on what data at what time)
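The provenance bullet above amounts to keeping one record per analysis run. A minimal sketch of what such a record could carry; the class and field names are illustrative, not FireCloud's actual schema:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class RunProvenance:
    """One record per analysis run: what version of which method ran
    on what data, and when. Illustrative only, not FireCloud's schema."""
    method_name: str
    method_version: str
    inputs: list               # e.g. gs:// URIs of the input files
    outputs: list = field(default_factory=list)
    started_at: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc))

# Example record for a hypothetical variant-calling run.
run = RunProvenance(
    method_name="variant-calling",
    method_version="v2.1",
    inputs=["gs://workspace-bucket/bams/TCGA-XX-0001-T.bam"],
)
print(run)
```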
7. FireCloud Overview
● The Workspace is the organizing principle for FireCloud
○ When a workspace is created, a Google bucket is automatically attached to that workspace
● The Data Model is the backbone within the workspace
○ Holds metadata and bucket pointers to inputs and outputs
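To make the Data Model concrete: FireCloud workspaces are typically populated from tab-separated load files whose rows are entities (e.g. samples) and whose columns hold metadata and bucket pointers. A minimal sketch in Python; the `entity:sample_id` header convention is my assumption about the load-file format, and the attribute names and gs:// paths are invented for illustration:

```python
import csv

# Hypothetical sample-level entities: each row becomes an entity in the
# workspace data model, with attributes pointing into the workspace bucket.
samples = [
    {"entity:sample_id": "TCGA-XX-0001-T",
     "participant": "TCGA-XX-0001",
     "bam": "gs://workspace-bucket/bams/TCGA-XX-0001-T.bam"},
    {"entity:sample_id": "TCGA-XX-0002-T",
     "participant": "TCGA-XX-0002",
     "bam": "gs://workspace-bucket/bams/TCGA-XX-0002-T.bam"},
]

# Write a load file: the first column names the entity type and ID;
# the remaining columns are attributes (metadata and gs:// pointers).
with open("sample_entities.tsv", "w", newline="") as fh:
    writer = csv.DictWriter(fh, fieldnames=list(samples[0]), delimiter="\t")
    writer.writeheader()
    writer.writerows(samples)
```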
8. http://cgc.systemsbiology.net/
… is to make TCGA data, together with tools and compute-power, available and accessible to a broad range of users using multiple access modes:
❏ Interactive web application
❏ Scripting languages: R, Python, SQL
❏ Direct programmatic access
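The SQL access mode reflects TCGA-derived tables hosted in Google BigQuery (the editor's notes at the end of this deck also point to BigQuery). A minimal sketch using the google-cloud-bigquery Python client; the dataset and table names are placeholders, not actual ISB-CGC table paths:

```python
from google.cloud import bigquery

# Requires Google Cloud credentials, e.g. via GOOGLE_APPLICATION_CREDENTIALS.
client = bigquery.Client()

# Placeholder table path -- substitute a real ISB-CGC BigQuery table.
query = """
    SELECT project_short_name, COUNT(*) AS n_cases
    FROM `some-project.some_tcga_dataset.clinical`
    GROUP BY project_short_name
    ORDER BY n_cases DESC
    LIMIT 10
"""

# client.query() submits the job; iterating it fetches the result rows.
for row in client.query(query):
    print(row.project_short_name, row.n_cases)
```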
9. ❏ Build an open platform that can grow and evolve to satisfy a broad range of users and use-cases
❏ Leverage the best existing tools and technologies, as they are released
❏ Collaborate with the research community in areas of data standards, containers, workflows, etc.
❏ Provide a range of examples and tutorials to get newcomers up and running quickly
10. http://www.cancergenomicscloud.org/
Seven Bridges Genomics CGC Objectives
❖ The CGC aims to provide a collaborative environment where researchers can take advantage of co-localized public data (like TCGA) and public tools, but also recombine these with their private data and tools.
❖ Guiding Principles
➢ Making data available isn't enough to make it usable.
➢ The best science happens in teams.
➢ Reproducibility shouldn't be hard.
➢ The impact of TCGA is extended by new data & tools
11. Seven Bridges Genomics CGC Features
❖ Explore processed TCGA data for mutations, copy number variations and expression levels
❖ Analyze data from their private cohorts alongside TCGA data
❖ Use standard bioinformatics pipelines to perform analyses
❖ Bring their own analysis tools directly to the TCGA dataset
❖ Collaborate with researchers around the world
❖ Access storage and compute resources on the cloud on demand
❖ Access the CGC using the API
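For the API access mode, Seven Bridges maintains an official Python client, sevenbridges-python. A minimal sketch of connecting to the CGC and listing resources; the endpoint URL reflects the CGC's documented v2 API base, while the token and project id are placeholders:

```python
import sevenbridges as sbg

# Authenticate against the CGC API with a developer token
# (generated in the CGC web interface; placeholder shown here).
api = sbg.Api(url="https://cgc-api.sbgenomics.com/v2",
              token="<your-auth-token>")

# List the projects visible to this account ...
for project in api.projects.query():
    print(project.id, project.name)

# ... and the first few files in one project (hypothetical project id).
for f in api.files.query(project="my-username/my-cgc-project", limit=10):
    print(f.name, f.size)
```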
12. Acknowledgement
Team CGC - https://goo.gl/f21Lqq
National Cancer Institute CBIIT
CGC Fact sheet - https://cbiit.nci.nih.gov/sites/nci-cbiit/files/Cloud_Pilot_Handout.pdf
Access Cloud Pilots - https://cbiit.nci.nih.gov/ncip/nci-cancer-genomics-cloud-pilots/access-the-cloud-pilot-platforms
Broad Institute - FireCloud - http://firecloud.org
Institute of Systems Biology - Cancer Genomics Cloud - http://cgc.systemsbiology.net/
Seven Bridges Genomics - Cancer Genomics Cloud - http://www.cancergenomicscloud.org/
Attain, LLC - http://www.attain.com/
Editor's Notes
This is good, but I would focus on how the native Google platform has been fully exploited: BigQuery and Google Genomics in addition to Google Storage.
It would be nice to have a visual of the case explorer or something else.
Do you plan to explain why 3 pilots, what was uniquely evaluated in each of the three?
Also, do you plan a concluding slide:
- on next steps from the program's perspective and how these would become part of the Commons vision, or something like that
- a call to action for those who want to use it to access cancer data (including the availability of free credits), and/or to mimic it for their ICs using the open-source code of the platforms, which is available for others to use.