Exposome Data Challenge results presented June 23rd 2022 by
@Lea_Maitre
as a showcase of "open science and causality", with Shani Evenstein from the Wikimedia Foundation and Marc Chadeau-Hyam (ICL), in a session on "open science is an ethical duty"
#openscience #exposome #ISGlobalExposome #OneEU2022 #ATHLETE
The document outlines the Food Hygiene Regulations 2009 in Malaysia. It provides background on the regulations and their objectives to ensure food safety and hygiene. It details the application and registration requirements for food premises. Specific hygiene practices and standards are outlined for food handling, premises maintenance, food protection and carriage. Non-compliance can result in penalties. The regulations provide comprehensive standards to control food hygiene and safety.
This is a mini project on an agricultural system that differs from the traditional agricultural system in that it is directed by IoT devices. Some relevant information about the conventional system is also discussed to differentiate between the two systems.
The document provides an overview of the history and development of the Internet and World Wide Web. It discusses how computer networks and protocols like TCP/IP were developed in the 1960s and 1970s, allowing computers to connect and share information. It then explains how the commercialization of the Internet in the 1990s led to rapid growth in users and traffic.
Exposome data challenge - ISGlobal hub prez July 2022.pptx (LeaMaitre1)
The document summarizes an exposome data challenge event organized by ISGlobal. The event aimed to promote open science and interdisciplinary collaboration around analyzing exposome data. Participants were given a simulated exposome dataset based on real data from the HELIX project and asked to apply their statistical methods to analyze the data. Twenty-five teams were selected to present their approaches at the event. The goal was to accelerate innovation in exposome research through this collaborative data analysis challenge.
The document discusses evidence farming as an approach to evaluating mobile health (mHealth) applications and interventions. Evidence farming involves extracting evidence from care process data and manipulating care processes with flexible protocols. This allows generating hypotheses at the population level while maintaining some internal validity. The document proposes an open architecture called OpenmHealth to support evidence farming and crowdsourcing evidence through modules for usage analytics, randomized trials, individualized studies, and sharing findings. The goal is a learning community that can rapidly disseminate and iterate on evaluation methods that matter to improving mHealth.
Bioinformatics Strategies for Exposome 100416 (Chirag Patel)
This document discusses the challenges of using big data from the exposome for robust biomedical discovery. It notes that the exposome generates a huge number of potential exposure-phenotype hypotheses but observational data is susceptible to biases. It proposes bioinformatics-inspired guidelines to enhance discovery, including systematically testing hypotheses while addressing multiplicity, replicating findings, developing databases to disseminate results, and practicing reproducible research. Specific examples are given of searching large exposure-phenotype spaces and developing phenotype-exposure association maps to systematically explore connections in a big data context.
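One of the guidelines above, addressing multiplicity when testing many exposure-phenotype hypotheses, can be sketched with the Benjamini-Hochberg procedure for false discovery rate control. The p-values below are made-up placeholders for illustration, not results from the document.

```python
# Benjamini-Hochberg FDR control over many exposure-phenotype tests.
# A minimal sketch of the "address multiplicity" guideline; the p-values
# used below are invented placeholders.

def benjamini_hochberg(pvals, alpha=0.05):
    """Return indices of hypotheses rejected at FDR level alpha."""
    m = len(pvals)
    # Sort p-values while remembering their original positions.
    order = sorted(range(m), key=lambda i: pvals[i])
    # Find the largest rank k with p_(k) <= (k/m) * alpha,
    # then reject every hypothesis ranked at or below k.
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank / m * alpha:
            k_max = rank
    return sorted(order[:k_max])

# Hypothetical p-values from, say, 8 exposure-phenotype regressions.
pvals = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
rejected = benjamini_hochberg(pvals)
# rejected -> [0, 1]: only the two smallest p-values survive at FDR 5%
```

Note that a naive per-test threshold of 0.05 would have declared five of the eight associations significant; controlling the FDR is what makes a systematic exposure-wide search credible.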
Open Data and the Social Sciences - OpenCon Community Webcast (Right to Research)
The document discusses issues with transparency and reproducibility in social science research. It notes that research influences policy and decisions that affect millions of lives. However, weak academic norms like publication bias, p-hacking, non-disclosure, and failure to replicate can distort the body of evidence. The document proposes solutions like pre-registering studies and pre-specifying analyses to address these issues. It also discusses resources and efforts like the Berkeley Initiative for Transparency in the Social Sciences to raise awareness, foster adoption of transparent practices, and identify strategies to improve reproducibility.
Innovative research approaches to improve evidence in global health (Emilie Robert)
Presentation given at the Canadian Conference on Global Health in 2015 in Montreal, with Federica Fregonese, Pierre Minn, Emilie Robert and Georges-Charles Thiebaut
Sherri Rose wrote a fascinating article about the statistician's role in big data. One line I particularly liked: "This may require implementing commonly used methods, developing a new method, or integrating techniques from other fields to answer our problem." I like the idea that integrating and applying standard methods in new and creative ways can itself be viewed as a statistical contribution.
20160811 Big Data for Health and Medicine (Brian Bot)
This document summarizes Brian Bot's presentation on biomedical research in an increasingly digital world. It discusses how biomedical data is increasingly being produced, aggregated, and shared digitally. Only a small percentage of studies can be fully reproduced due to issues with data availability and documentation. However, organizations like Sage Bionetworks are working to promote open sharing of complex biological data through diverse collaborations and by empowering citizens to contribute their own health data to research. One example is the mPower study on Parkinson's disease which has involved thousands of participants sharing passive and active data through mobile apps.
Genomics in Society: Genomics, Cellular Networks, Preventive Medicine, and So... (Larry Smarr)
10.10.06
Guest Lecture
UCSD Medical and Pharmaceutical Students Foundations of Human Biology--Lecture #41
Title: Genomics in Society: Genomics, Cellular Networks, Preventive Medicine, and Society
La Jolla, CA
This document discusses the need for open science due to a reproducibility crisis in many scientific disciplines. It notes that many published findings cannot be replicated and estimates that at least two-thirds of published results in psychology and biomedicine may be incorrect. This represents a credibility crisis that undermines public trust in science. The document argues that adopting practices of open science such as preregistration, open data, and detailed documentation can help address this crisis by reducing biases, enabling replication, and increasing transparency and reproducibility. Open science is presented as a means of improving research quality and accelerating discovery for the benefit of both science and society.
Beacon Network: A System for Global Genomic Data Sharing (Miro Cupak)
The Beacon Network provides a system for global genomic data sharing by allowing users to query a network of genetic data sources to determine if a particular genetic variant or mutation exists in their databases. It began as a web service called Beacon that responds with "yes" or "no" to questions about genetic mutations. The Beacon Network expands this by distributing queries across multiple beacons and aggregating the results. It currently includes over 25 genomic organizations with access to over 2 million samples and 2 billion genetic variants, serving hundreds of thousands of queries from users around the world. The goal is to facilitate discovery of new links between genetic data and health conditions.
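The aggregation idea described above can be sketched in a few lines: fan a yes/no variant query out to several "beacons" and collect the answers. Real beacons are REST services defined by the GA4GH Beacon protocol; here each beacon is stubbed as a local set of variant strings, an assumption made purely for illustration, and the variant notation is hypothetical.

```python
# Sketch of the Beacon Network's fan-out-and-aggregate pattern.
# Each beacon answers only "is this variant present?" - never who carries it -
# which is what makes network-wide querying privacy-preserving.

def make_beacon(variants):
    """A stub beacon: answers True/False for a variant-presence query."""
    known = set(variants)
    return lambda query: query in known

def query_network(beacons, variant):
    """Ask every beacon in the network and aggregate the responses."""
    return {name: beacon(variant) for name, beacon in beacons.items()}

# Hypothetical beacons with hypothetical variant identifiers.
beacons = {
    "beacon-a": make_beacon({"17:41197708 G>A", "13:32315474 T>C"}),
    "beacon-b": make_beacon({"17:41197708 G>A"}),
    "beacon-c": make_beacon(set()),
}
responses = query_network(beacons, "17:41197708 G>A")
# responses -> {"beacon-a": True, "beacon-b": True, "beacon-c": False}
```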
Our classification technique uses a deep CNN to classify skin lesions. An image is warped through the CNN architecture into a probability distribution over clinical skin disease classes. The CNN was pretrained on a large generic image dataset and fine-tuned on a dataset of over 129,000 skin lesions spanning 2,032 diseases. Data integration from multiple sources is key to future digital medicine, but challenges include data quality, availability, and privacy. Techniques like distributed learning models and homomorphic encryption can help address privacy concerns while enabling large-scale data sharing and analysis.
These slides are from a workshop "Putting bioscience into context: exercises to enhance engagement" run at the Society for Experimental Biology conference in Canterbury (April 2006).
Math, Stats and CS in Public Health and Medical Research (Jessica Minnier)
Jessica Minnier gave a talk on her career path from studying mathematics at Lewis & Clark College to her current position as an Assistant Professor in biostatistics. She discussed how biostatistics, bioinformatics, and computational biology are applied in medical research, using examples like analyzing RNA sequencing data and building predictive models from electronic health records. Minnier also shared resources for learning more about careers in public health research and statistics.
The shared value of personal and population data (Wessel Kraaij)
Wessel Kraaij discusses the shared value of personal and population health data. Health is complex with many interrelated factors beyond just molecules. Quantified self data from sensors in smartphones and devices could provide insights if measured across lifetimes, but raises privacy issues. Different stakeholders have varying interests in healthcare data. Future scenarios may see companies owning patient data. Reference data is needed for clinical reasoning and self-management. A balanced health data infrastructure respecting privacy and shared benefits for all is needed.
Doing more with fewer resources used to be a situation common only for academic scientists. This is unfortunately still true for academics, but we are seeing others face many of the same challenges. With the squeeze on budgets and the cost cutting resulting from recent worldwide economic challenges, the failure of many drugs to make it through the pipeline to the market, and the increasing costs associated with the drug development process, we are now seeing in the pharmaceutical industry a dramatic shift, perhaps belated, toward accommodating the same challenge of doing more with less.
XNN001 Introductory epidemiological concepts - Study design (ramseyr)
This document provides an overview of key epidemiological concepts and study designs. It defines epidemiology and discusses why epidemiological data are collected, namely for monitoring and surveillance and to identify relationships between exposures and disease. The main designs covered are the observational designs (ecological, cross-sectional, case-control, and cohort studies) as well as randomized controlled trials; for each study design, the document outlines its structure, advantages, and limitations.
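As a worked example for one of those designs, the case-control study, the standard effect measure is the odds ratio computed from a 2x2 exposure-by-disease table. The counts below are invented purely for illustration.

```python
# Odds ratio from a standard 2x2 case-control table:
#                 cases   controls
#   exposed         a        c
#   unexposed       b        d
# OR = (a/b) / (c/d) = a*d / (b*c). Counts here are hypothetical.

def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Cross-product odds ratio for a 2x2 table (assumes no zero cells)."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)

# Hypothetical study: 40 of 100 cases exposed vs 20 of 100 controls exposed.
or_value = odds_ratio(exposed_cases=40, unexposed_cases=60,
                      exposed_controls=20, unexposed_controls=80)
# or_value -> (40*80)/(60*20) = 2.67: exposure associated with disease
```

An OR of 1.0 would indicate no association; values above 1.0 suggest the exposure is more common among cases, which is exactly the kind of signal a case-control design is built to detect.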
The document summarizes the Institute for Systems Biology (ISB) and its work in systems biology. It discusses three pillars of ISB's work: systems biology, collaborations/partnerships/commercial spinouts, and stewardship/education. It provides details on ISB's research thrusts in medical research, sustainability, and healthcare using systems approaches. ISB seeks to solve complex biological problems through cross-disciplinary team science and driving innovation. Examples are provided of ISB's partnerships and commercial spinoffs applying systems biology. ISB also contributes to sustainability through its education programs.
The Learning Health System: Thinking and Acting Across Scales (Philip Payne)
A Learning Health System (LHS) can be defined as an environment in which knowledge generation processes are embedded into daily clinical practice in order to continually improve the quality, safety, and outcomes of healthcare delivery. While still largely an aspirational goal, the promise of the LHS is a future in which every patient encounter is an opportunity to learn and improve that patient’s care, as well as the care their family and broader community receives. The foundation for building such an LHS can and should be the Electronic Health Record (EHR), which provides the basis for the comprehensive instrumentation and measurement of clinical phenotypes, as well as a means of delivering new evidence at the patient- and population levels. In this presentation, we will explore the ways in which such EHR-derived phenotypes can be combined with complementary data across a spectrum from biomolecules to population level trends, to both generate insights and deliver such knowledge in the right time, place, and format, ultimately improving clinical outcomes and value.
Repurposing large datasets for exposomic discovery in disease (Chirag Patel)
This document discusses the need for large-scale studies of environmental exposures (the exposome) analogous to genome-wide association studies (GWAS) to better understand environmental contributions to health and disease. It notes that while genetics research has made great strides with GWAS, understanding of environmental influences lacks comparable methods and data. The author argues that characterizing the exposome through high-throughput methods could discover new environmental factors in phenotypes, as GWAS did for genetics. National Health and Nutrition Examination Survey is presented as a model for collecting comprehensive human exposure data on a large scale.
Talk by Christoph Steinbeck, European Bioinformatics Institute (EMBL-EBI) on challenges for data sharing of clinical data in metabolomics research. This workshop was co-organised with the European BBMRI Biobanking infrastructure as part of the BioMedBridge symposium at the Wellcome Trust Conference Centre in Hinxton, UK.
Or: Beyond linear.
Abstract: Equivariant neural networks are neural networks that incorporate symmetries. The nonlinear activation functions in these networks result in interesting nonlinear equivariant maps between simple representations, and motivate the key player of this talk: piecewise linear representation theory.
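The abstract's starting point, that pointwise nonlinearities interact with symmetries, can be checked concretely for the simplest case: an elementwise ReLU commutes with permutation representations, i.e. relu(P·x) = P·relu(x). This is a pure-Python sketch for permutations only; general representations are exactly where the talk's piecewise linear theory takes over.

```python
# Check that a pointwise nonlinearity (ReLU) is permutation-equivariant:
# applying ReLU then permuting equals permuting then applying ReLU.

def relu(xs):
    """Elementwise ReLU on a list of floats."""
    return [max(0.0, x) for x in xs]

def permute(perm, xs):
    """Apply a permutation: output position i receives xs[perm[i]]."""
    return [xs[i] for i in perm]

x = [1.5, -2.0, 0.25, -0.75]
perm = [2, 0, 3, 1]

lhs = relu(permute(perm, x))  # act with the symmetry, then the nonlinearity
rhs = permute(perm, relu(x))  # nonlinearity first, then the symmetry
# lhs == rhs for every x and every perm: ReLU commutes with permutations
```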
Disclaimer: No one is perfect, so please mind that there might be mistakes and typos.
dtubbenhauer@gmail.com
Corrected slides: dtubbenhauer.com/talks.html
The ability to recreate computational results with minimal effort and actionable metrics provides a solid foundation for scientific research and software development. When people can replicate an analysis at the touch of a button using open-source software, open data, and methods to assess and compare proposals, it significantly eases verification of results, engagement with a diverse range of contributors, and progress. However, we have yet to fully achieve this; there are still many sociotechnical frictions.
Inspired by David Donoho's vision, this talk aims to revisit the three crucial pillars of frictionless reproducibility (data sharing, code sharing, and competitive challenges) with the perspective of deep software variability.
Our observation is that multiple layers — hardware, operating systems, third-party libraries, software versions, input data, compile-time options, and parameters — are subject to variability that exacerbates frictions but is also essential for achieving robust, generalizable results and fostering innovation. I will first review the literature, providing evidence of how the complex variability interactions across these layers affect qualitative and quantitative software properties, thereby complicating the reproduction and replication of scientific studies in various fields.
I will then present some software engineering and AI techniques that can support the strategic exploration of variability spaces. These include the use of abstractions and models (e.g., feature models), sampling strategies (e.g., uniform, random), cost-effective measurements (e.g., incremental build of software configurations), and dimensionality reduction methods (e.g., transfer learning, feature selection, software debloating).
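One of the sampling strategies mentioned above can be sketched concretely: draw a uniform random sample of valid configurations from a small feature model. The feature names and the single constraint are invented for illustration; real feature models are far larger, and practical samplers avoid the exhaustive enumeration used here.

```python
# Uniform random sampling of a software configuration space under a
# constraint. Feature names and the constraint are hypothetical.
import itertools
import random

FEATURES = ["optimize", "debug_symbols", "static_link", "lto"]

def valid(config):
    # Hypothetical constraint: link-time optimization requires optimize.
    return not (config["lto"] and not config["optimize"])

# Enumerate all valid configurations (2^4 = 16 candidates, feasible here;
# large spaces need SAT-based or heuristic samplers instead).
space = [dict(zip(FEATURES, bits))
         for bits in itertools.product([False, True], repeat=len(FEATURES))
         if valid(dict(zip(FEATURES, bits)))]

random.seed(0)                       # reproducible draw
sample = random.sample(space, k=3)   # uniform sample of valid configs
# len(space) -> 12: the constraint removes 4 of the 16 raw combinations
```

Each sampled configuration would then be built and measured, which is exactly where the variability layers discussed above (hardware, OS, library versions) start to interact with the sampled options.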
I will finally argue that deep variability is both the problem and the solution of frictionless reproducibility, calling on the software science community to develop new methods and tools to manage variability and foster reproducibility in software systems.
Invited talk at the Journées Nationales du GDR GPL 2024
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep... (University of Maribor)
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
The binding of cosmological structures by massless topological defects (Sérgio Sacani)
Assuming spherical symmetry and weak field, it is shown that if one solves the Poisson equation or the Einstein field equations sourced by a topological defect, i.e. a singularity of a very specific form, the result is a localized gravitational field capable of driving flat rotation (i.e. Keplerian circular orbits at a constant speed for all radii) of test masses on a thin spherical shell without any underlying mass. Moreover, a large-scale structure which exploits this solution by assembling concentrically a number of such topological defects can establish a flat stellar or galactic rotation curve, and can also deflect light in the same manner as an equipotential (isothermal) sphere. Thus, the need for dark matter or a modified gravity theory is mitigated, at least in part.
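As a quick reminder of the standard inference this construction sidesteps: for a test mass on a Newtonian circular orbit around enclosed mass M(r), force balance gives

```latex
\frac{v^2}{r} = \frac{G\,M(r)}{r^2}
\quad\Longrightarrow\quad
v(r) = \sqrt{\frac{G\,M(r)}{r}},
```

so a flat rotation curve, v(r) = const, ordinarily requires the enclosed mass to grow as M(r) ∝ r. The dark-matter hypothesis supplies that mass; the topological-defect solution above produces flat rotation without it.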
Phenomics assisted breeding in crop improvement (IshaGoswami9)
As the population is increasing and will reach about 9 billion by 2050, and with climate change, it is becoming difficult to meet the food requirements of such a large population. Facing the challenges presented by resource shortages, climate change, and a growing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding the complex characteristics controlled by multiple genes, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data linkable to genomic information at all growth stages have become as important as genotyping, and high-throughput phenotyping has thus become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology, and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
What is greenhouse gasses and how many gasses are there to affect the Earth.moosaasad1975
What are greenhouse gasses how they affect the earth and its environment what is the future of the environment and earth how the weather and the climate effects.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Ana Luísa Pinho
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich on features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization. 
To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...Sérgio Sacani
Context. With a mass exceeding several 104 M⊙ and a rich and dense population of massive stars, supermassive young star clusters
represent the most massive star-forming environment that is dominated by the feedback from massive stars and gravitational interactions
among stars.
Aims. In this paper we present the Extended Westerlund 1 and 2 Open Clusters Survey (EWOCS) project, which aims to investigate
the influence of the starburst environment on the formation of stars and planets, and on the evolution of both low and high mass stars.
The primary targets of this project are Westerlund 1 and 2, the closest supermassive star clusters to the Sun.
Methods. The project is based primarily on recent observations conducted with the Chandra and JWST observatories. Specifically,
the Chandra survey of Westerlund 1 consists of 36 new ACIS-I observations, nearly co-pointed, for a total exposure time of 1 Msec.
Additionally, we included 8 archival Chandra/ACIS-S observations. This paper presents the resulting catalog of X-ray sources within
and around Westerlund 1. Sources were detected by combining various existing methods, and photon extraction and source validation
were carried out using the ACIS-Extract software.
Results. The EWOCS X-ray catalog comprises 5963 validated sources out of the 9420 initially provided to ACIS-Extract, reaching a
photon flux threshold of approximately 2 × 10−8 photons cm−2
s
−1
. The X-ray sources exhibit a highly concentrated spatial distribution,
with 1075 sources located within the central 1 arcmin. We have successfully detected X-ray emissions from 126 out of the 166 known
massive stars of the cluster, and we have collected over 71 000 photons from the magnetar CXO J164710.20-455217.
ESPP presentation to EU Waste Water Network, 4th June 2024 “EU policies driving nutrient removal and recycling
and the revised UWWTD (Urban Waste Water Treatment Directive)”
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz) I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long standing, and ongoing, scientific development as an exemplar. And so, I chose the ever evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of >200 years, Thermodynamics R&D, and application, benefitted from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at a micro and macro level.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science/engineering/technology, spanning micro-tech, to aerospace and cosmology. I can think of no better a story to illustrate the breadth of scientific methodologies and applications at their best.
1. Open science and causality in the Exposome era
Lessons learned from the Exposome Data Challenge
Léa Maitre, PhD
Barcelona Institute for Global Health (ISGlobal)
June 23rd 2022 – One Health Conference
2. The Exposome concept
The totality of environmental exposures (meaning all non-genetic factors) that
a person experiences, from conception onwards.
Chris Wild, 2005.
Léa Maitre - ISGlobal - Exposome data challenge
3. Classical approach in environmental epidemiology:
single-exposure association
• Selective reporting of associations
• Publication bias
• No correction for multiple testing (separate papers for each exposure)
• Cannot take into account confounding by co-exposures
• Lack of consideration of the "mixture effect"
The exposome approach calls for a holistic view of the effects of
environmental exposures on human health.
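The contrast above — one uncorrected association per paper versus one corrected, exposome-wide scan — can be sketched on simulated data. Everything in this example is invented for illustration (variable sizes, effect, the Fisher-z p-value approximation); it is not the HELIX data or the challenge protocol. Each exposure is tested against the outcome, then a single Benjamini-Hochberg correction is applied across all tests:

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# Toy stand-in for an exposome matrix: n children, p exposures.
n, p = 300, 20
exposome = rng.normal(size=(n, p))
# Outcome driven by exposure 0 only.
outcome = 0.5 * exposome[:, 0] + rng.normal(size=n)

def corr_pvalue(x, y):
    # Two-sided p-value for a correlation via the Fisher z-transform
    # (a large-sample approximation; real ExWAS analyses also adjust
    # for covariates, omitted here for brevity).
    r = np.corrcoef(x, y)[0, 1]
    z = math.atanh(r) * math.sqrt(len(x) - 3)
    return math.erfc(abs(z) / math.sqrt(2))

pvals = np.array([corr_pvalue(exposome[:, j], outcome) for j in range(p)])

def benjamini_hochberg(pvals, alpha=0.05):
    # Step-up BH procedure: find the largest rank k whose p-value is
    # under alpha*k/m, then reject the k smallest p-values.
    m = len(pvals)
    order = np.argsort(pvals)
    thresh = alpha * np.arange(1, m + 1) / m
    passed = pvals[order] <= thresh
    k = np.max(np.nonzero(passed)[0]) + 1 if passed.any() else 0
    rejected = np.zeros(m, dtype=bool)
    rejected[order[:k]] = True
    return rejected

hits = np.nonzero(benjamini_hochberg(pvals))[0]
print(hits.tolist())
```

One scan, one correction: the true signal (exposure 0) survives, while the per-paper single-exposure workflow would have reported 20 uncorrected tests.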
5. The Exposome data challenge
Event created in the framework of the ISGlobal Exposome hub and H2020 ATHLETE
project
Scientific publication pre-print https://arxiv.org/abs/2202.01680 (under review)
Simulated data (based on the HELIX project) publicly available to challenge
researchers on statistical tools to study exposome-health associations
6.
- 25 teams selected out of 39
abstracts submitted
- 307 participants online:
• 101 North America
• 186 Europe
• 9 Asia
• 8 Latin America
• 2 Africa
• 1 Australia
- Awards attributed by the public
and by the committee
7. The Human early-life exposome project (HELIX)
Study design
Six mother-child cohorts in Europe (n=1301)
Pregnant women enrolled at the beginning of their pregnancy
Follow-up of the children with a standardized clinical examination at 6-11 years old
Aim
To study the association between multiple exposures, molecular signatures, and
child health outcomes.
8. Exposure assessment
>100 environmental factors
Assessed during pregnancy and childhood (at the children's follow-up, 6-11 yo)
Outdoor exposures
(Geographic Information System)
Air pollution*
Noise†
Built environment†
Natural spaces†
Traffic
Meteorology*
Water disinfection by-products (DBPs)
Indoor air
Chemicals
(blood or urine biomarkers)
Organochlorines
PBDE
PFAS
Metals
Phthalates
Phenols
Organophosphate pesticides
Lifestyles
(questionnaires)
Smoking
Diet
Physical activity
Social and economic capital
Sleep
* Postnatal exposures available within different time windows
† Postnatal exposures available at different locations: home and school
9. Health outcomes
6 health outcomes, measured at birth or at the children's follow-up (6-11 yo)
Continuous variables
• Birth weight
• Body mass index at 6-11 yo
Categorical variables
• Asthma at 6-11 yo (binary)
• Body mass index at 6-11 yo (4 categories)
Count variables
• Intelligence quotient at 6-11 yo: total correct answers (Raven test)
• Neurobehavior at 6-11 yo: internalizing and externalizing problems (CBCL scale)
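As an illustration only — this mapping is not stated on the slide — each outcome type above suggests a different regression family when modelling exposome-health associations. Outcome names below are shorthand, and the family choices follow standard practice for each variable type, not the challenge protocol:

```python
# Hypothetical shorthand names for the six outcomes; the suggested
# families follow textbook practice for each variable type.
outcome_families = {
    "birth_weight":  "gaussian (linear regression)",            # continuous
    "bmi_6_11yo":    "gaussian, or multinomial for 4 categories",# continuous/categorical
    "asthma_6_11yo": "binomial (logistic regression)",          # binary
    "raven_correct": "poisson or negative binomial",            # count
    "cbcl_problems": "poisson or negative binomial",            # count
}
for name, family in outcome_families.items():
    print(f"{name}: {family}")
```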
10. Covariates, potential confounders
Maternal and child data
Maternal data
Cohort of inclusion
Age
Education
Pre-pregnancy body mass index
Weight gain during pregnancy
Parity
Child data at birth
Sex
Gestational age
Year of birth
Native origin
Child data at 6-11yo
Age
Weight
Height
11. Omics data
>400,000 features
Assessed during childhood only (at the children's follow-up, 6-11 yo)
White blood cells
DNA methylation
(386,518 CpGs)
Transcription
Gene expression
(58,254 transcripts)
Plasma & serum
Proteins
(36 proteins)
Metabolites
(177 metabolites)
Urine
Metabolites
(44 metabolites)
12. Data summary
[Cohort timeline figure, originally labelled in French: inclusion in the 1st trimester of pregnancy (questionnaires); at birth, maternity records plus cord blood, placenta fragments and hair; questionnaires (mother, father, child) at 2 years; questionnaires and school medical follow-up at 4 years; HELIX clinical follow-up at 6-11 yo. Exposures, health outcomes and covariates are available at both the early-life and 6-11 yo time points; omics data at 6-11 yo only.]
13. Particularities of the data
• Correlation between exposures
• Causal relations between exposures/omic layers
• High dimension (exposures, omics)
• Multi-center study (6 mother-child cohorts)
• Repeated exposure data (pregnancy and childhood)
• Small sample size (n=1301 mother-child pairs)
• Missing data
• Non-linear associations
• Continuous vs categorical variables
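Two of these particularities — correlated exposures and missing data — can be made concrete with a small simulated example. Sizes, the shared-factor construction, and the median-imputation baseline are all illustrative choices; the actual challenge data used richer, model-based imputation:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy exposome: a shared latent factor induces correlation between
# exposures, and ~10% of values are set missing at random.
n, p = 200, 5
base = rng.normal(size=(n, 1))
X = base + 0.5 * rng.normal(size=(n, p))
X[rng.random(size=(n, p)) < 0.1] = np.nan

def pairwise_corr(X):
    # Correlation matrix using pairwise-complete observations.
    p = X.shape[1]
    C = np.eye(p)
    for i in range(p):
        for j in range(i + 1, p):
            ok = ~np.isnan(X[:, i]) & ~np.isnan(X[:, j])
            C[i, j] = C[j, i] = np.corrcoef(X[ok, i], X[ok, j])[0, 1]
    return C

C = pairwise_corr(X)
print("mean off-diagonal correlation:",
      round(C[~np.eye(p, dtype=bool)].mean(), 2))

# Median imputation as a simple baseline (the challenge organizers
# used model-based imputation for the released data).
medians = np.nanmedian(X, axis=0)
X_imp = np.where(np.isnan(X), medians, X)
print("remaining NaNs:", int(np.isnan(X_imp).sum()))
```

The strong off-diagonal correlations are exactly what makes naive one-exposure-at-a-time inference misleading, and why missingness must be handled before high-dimensional methods are applied.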
14. Challenge examples
Challenge 1: Combined effects of exposures
Identify combinations of exposures, high-order
interactions or exposure patterns that are particularly
harmful or beneficial for one or several health outcomes.
Challenge 2: Using omics data to improve inference
on the link between exposome and health.
Incorporate the different omics layers into the analysis
linking the exposome and one or more health outcomes.
Challenge 3: Multi-omics analysis
Incorporate different layers of omics data (including exposome as one of the layers) to find
patterns that can explain variations in one or more health outcomes.
Challenge 4: Causal structure in the exposome
Define hypothesized causal relationships between the
different exposures and one health outcome, and
incorporate this information into the analysis.
Challenge 5: Visualization techniques
Tools to visualize the complex relationships between the
different components of the analysis, with the main aim of
illustrating determinants of health effects.
⚠ For all challenges: control for potential confounders and the
multicenter design, and handle missing data.
15. Popular vote: winner
Using causal random forest to determine exposure environments with high
sexual dimorphisms
Alejandro Caceres, ISGlobal
16. Committee vote: winner 1
Quantifying Exposome-Health Associations with Bayesian Multiple Index
Models
Glen McGee, University of Waterloo
github.com/glenmcgee/bsmim2
glen.mcgee@uwaterloo.ca
Interactions between multipollutant index and individual covariates
17. Committee vote: winner 2
Decoding unknown links between the exposome and health outcomes by
multi-omics analysis
Xiaotao Shen, Stanford University
18. Resources
Exposome data challenge
Dataset: https://github.com/isglobal-exposomeHub/ExposomeDataChallenge2021/blob/main/README.md
Data description: https://docs.google.com/document/d/1ul3v-sIniLuTjFB1F1CrFQIX8mrEXVnvSzOF7BCOnpQ/edit
Scientific publication https://arxiv.org/abs/2202.01680 (under review)
Slides and videos of the presentations https://www.isglobal.org/-/exposome-data-analysis-challenge
https://www.youtube.com/channel/UC0F3hR04UzUeKkcfAyikltA/featured
Code shared on GitHub: https://github.com/isglobal-exposomeHub/ExposomeDataChallenge2021/tree/main/R_Code_Presentations
HELIX project
Data inventory: https://www.projecthelix.eu/index.php/es/data-inventory
Tamayo-Uria I, et al. The early-life exposome: Description and patterns in six European countries. Environ Int. 2019.
Maitre L, et al. Human Early Life Exposome (HELIX) study: a European population-based exposome cohort. BMJ Open. 2018.
Vrijheid M, et al. The human early-life exposome (HELIX): project rationale and design. Environ Health Perspect. 2014
Methods publicized for developers
Source code for analysts
19. Challenge challenges
• Making real-case, sensitive personal data publicly available
-> addressed by partial imputation of the data
• Finding a balance between well-defined research questions and
generalizability of the results to a wide community
The first edition of the challenge focused on general data-analysis challenges rather
than specific research questions, so the methods presented could not be directly
compared; the advantage was a broad catalogue of methods.
• A short time frame, sometimes a lack of adjustment for known biases
• Appropriate recognition for the data-generating team
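The first challenge above — sharing sensitive cohort data publicly — is worth a sketch. The actual release used partial imputation of HELIX data; the Gaussian recipe below is only a simple stand-in for the general idea that synthetic rows drawn from a distribution fitted to the real data preserve the correlation structure analysts need while containing no real individual:

```python
import numpy as np

rng = np.random.default_rng(2)

# Toy "real" cohort matrix (3 variables, illustrative covariance).
real = rng.multivariate_normal(
    mean=[0.0, 0.0, 0.0],
    cov=[[1.0, 0.6, 0.3],
         [0.6, 1.0, 0.5],
         [0.3, 0.5, 1.0]],
    size=1000)

# Fit mean and covariance to the real data, then draw entirely new
# synthetic rows from that fitted Gaussian.
mu = real.mean(axis=0)
cov = np.cov(real, rowvar=False)
synthetic = rng.multivariate_normal(mu, cov, size=1000)

# Pairwise correlations survive the substitution.
real_corr = np.corrcoef(real, rowvar=False)
synth_corr = np.corrcoef(synthetic, rowvar=False)
print("max correlation gap:",
      round(float(np.abs(real_corr - synth_corr).max()), 2))
```

Real exposome data are far from multivariate normal, which is one reason the organizers imputed parts of the real data instead of fully simulating it; but the trade-off — fidelity of the joint structure versus disclosure risk — is the same.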
20. Challenge successes
• Knowledge transfer -> training tools for students, already used by
universities such as CentraleSupélec (France) and by PhD students
• New collaborations formed
• Accelerated the rate of scientific discovery: out-of-the-box ideas
created by multidisciplinary teams in a short time
• 25 teams working over one month achieved what even a consortium could
not over 5 years (only 5 methods were tested for the HELIX statistical
protocol, run mainly by one experienced statistician)
21. Thank you
Contact me at
lea.maitre@isglobal.org
23. The early life period
Vulnerable period of rapid
organ development
Many chronic diseases have
part of their origin in early life
(Barker hypothesis)
Starting point for a life-course
exposome
24. The Exposome data challenge
Event created in the framework of the ISGlobal Exposome hub and H2020 ATHLETE
project
Organized by ISGlobal, Barcelona
Simulated data (based on the HELIX project) publicly available to challenge
researchers on statistical tools to study exposome-health associations
25. Organization of the datasets
Exposures  -> data.frame: exposome
Health     -> data.frame: phenotype
Covariates -> data.frame: covariates
Omics
  DNA methylation     -> GenomicRatioSet: methyl
  Gene expression     -> ExpressionSet: genexpr
  Metabolites (urine) -> ExpressionSet: metabol_urine
  Metabolites (serum) -> ExpressionSet: metabol_serum
  Proteins            -> ExpressionSet: proteom
Data available here: https://github.com/isglobal-exposomeHub/ExposomeDataChallenge2021/blob/main/README.md
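The actual objects are R data.frames and Bioconductor sets keyed by a child identifier. As a hedged sketch of how the tabular pieces relate — column names and values below are invented, not the real challenge variables — the join into one analysis-ready table looks like this in Python:

```python
import pandas as pd

# Illustrative miniatures of the three tabular objects; the real
# data.frames live in the challenge's .RData release (see README).
exposome   = pd.DataFrame({"ID": [1, 2, 3], "exposure_pm25": [10.2, 8.1, 12.5]})
covariates = pd.DataFrame({"ID": [1, 2, 3], "child_age": [7, 9, 10]})
phenotype  = pd.DataFrame({"ID": [1, 2, 3], "asthma": [0, 1, 0]})

# One row per child: exposures + covariates + outcome, joined on the
# shared child identifier.
analysis = exposome.merge(covariates, on="ID").merge(phenotype, on="ID")
print(analysis.shape)  # (3, 4)
```

The omics objects (GenomicRatioSet, ExpressionSet) carry their sample identifiers in the object metadata, so the same join logic applies once their feature matrices are extracted.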