My talk at the International Congress for Conservation Biology 2015 in Montpellier.
Data collected through citizen science programs make it possible to address many important questions in conservation biology, e.g. shifts in species ranges, the ecology of infectious diseases, or the effects of habitat loss and fragmentation on biodiversity. However, analysing citizen science data and reliably extracting the information they contain raises serious statistical challenges, mainly because of sampling biases generated by variation in the observation process. Numerous methods have been proposed to address this issue. They can be split into two main strategies: either a new approach is developed to deal with a specific problem, or an existing approach is used after some pre-treatment of the data or post-processing of the results. I review these methods, drawing links between them and emphasizing their advantages and drawbacks with respect to the question at hand. I illustrate my talk with case studies drawn from the research conducted in our group, mainly on large carnivores. Based on this review, I end this contribution with recommendations on the use of existing methods and with perspectives on future developments.
Guided tutorial of the Neuroscience Information Framework (Maryann Martone)
A guided tutorial showing how to use the Neuroscience Information Framework to find data and tools related to the genetics of addiction. Presented at the Genetics of Addiction Workshop, Jackson Labs, Aug 28-Sept 1, 2014.
A data-intensive assessment of the species abundance distribution (Elita Baldridge)
Doctoral defense for Elita Baldridge from the Weecology lab at Utah State University. Slides for the talk (defense_pres.pdf) and a transcript are available on GitHub with the analysis code to fully reproduce the analyses presented. In addition, a fully closed captioned video of the talk is available on YouTube.
https://github.com/weecology/sad-comparison
https://www.youtube.com/watch?v=tkXUD0MSRCo#t=202
Computational Reproducibility vs. Transparency: Is It FAIR Enough? (Bertram Ludäscher)
Keynote at CLIR Workshop (Webinar): Toward Open, Reproducible, and Reusable Research. February 10, 2021. https://reusableresearch.com/
ABSTRACT. The “reproducibility crisis” has resulted in much interest in methods and tools to improve computational reproducibility. FAIR data principles (data should be findable, accessible, interoperable, and reusable) are also being adapted and evolved to apply to other artifacts, notably computational analyses (scientific workflows, Jupyter notebooks, etc.). The current focus on computational reproducibility of scripts and other computational workflows sometimes overshadows a somewhat neglected and arguably more important issue: transparency of data analysis, including data wrangling and cleaning. In this talk I will ask the question: What information is gained by conducting a reproducibility experiment? This leads to a simple model (PRIMAD) that aims to answer this question by sorting out different scenarios. Finally, I will present some features of Whole-Tale, a computational platform for reproducible and transparent computational experiments.
Sampling design, sampling errors, sample size determination (Vishnupriya T H)
This presentation covers census and sample surveys, the implications of a sample design, the steps in sample design, and the criteria for selecting a sampling procedure.
Managing sensitive data at the Australian Data Archive (ARDC)
Dr Steven McEachern, Director, Australian Data Archive, presenting at the Managing and publishing sensitive data in the Social Sciences webinar on 29/3/17
FULL webinar recording: https://youtu.be/7wxfeHNfKiQ
Webinar description:
1) Dr Steve McEachern (Director, Australian Data Archive) discussed how the Australian Data Archive manages and publishes sensitive social science data.
More about ADA: -- The Australian Data Archive (ADA) provides a national service for the collection and preservation of digital research data and makes these data available for secondary analysis by academic researchers and other users. -- The ADA comprises seven sub-archives: Social Science, Historical, Indigenous, Longitudinal, Qualitative, Crime & Justice and International. -- ADA data is free of charge to all users. -- The archive is managed by the ADA central office based in the ANU Centre for Social Research and Methods at the Australian National University (ANU). https://www.ada.edu.au/
Analysing a Complex Agent-Based Model Using Data-Mining Techniques (Bruce Edmonds)
A talk given at "Social Simulation 2014" at Barcelona in September.
A complex “Data Integration Model” of voter behaviour is described. However, it is very complex and hard to analyse. For such a model, “thin” samples of the outcomes using classic parameter sweeps are inadequate. To get a more holistic picture of its behaviour, data-mining techniques are applied to the data generated by many runs of the model, each with randomised parameter values.
Paper is at: http://cfpm.org/aacabm/analysing a complex model-v3.4.pdf
The process of obtaining information from a subset (sample) of a larger group (population)
The results for the sample are then used to make estimates of the larger group
Faster and cheaper than asking the entire population
This PDF is about schizophrenia.
For more details, visit the YouTube channel @SELF-EXPLANATORY:
https://www.youtube.com/channel/UCAiarMZDNhe1A3Rnpr_WkzA/videos
Thanks...!
Introduction:
RNA interference (RNAi) or Post-Transcriptional Gene Silencing (PTGS) is an important biological process for modulating eukaryotic gene expression.
It is a highly conserved process of post-transcriptional gene silencing by which double-stranded RNA (dsRNA) causes sequence-specific degradation of mRNA sequences.
dsRNA-induced gene silencing (RNAi) is reported in a wide range of eukaryotes, including worms, insects, mammals and plants.
This process mediates resistance to both endogenous parasitic and exogenous pathogenic nucleic acids, and regulates the expression of protein-coding genes.
What are small ncRNAs?
micro RNA (miRNA)
short interfering RNA (siRNA)
Properties of small non-coding RNA:
Involved in silencing mRNA transcripts.
Called “small” because they are usually only about 21-24 nucleotides long.
Synthesized by first cutting up longer precursor sequences (like the 61nt one that Lee discovered).
Silence an mRNA by base pairing with some sequence on the mRNA.
Discovery of siRNA?
The first small RNA:
In 1993 Rosalind Lee (Victor Ambros lab) was studying a non-coding gene in C. elegans, lin-4, that was involved in silencing another gene, lin-14, at the appropriate time in the development of the worm.
Two small transcripts of lin-4 (22nt and 61nt) were found to be complementary to a sequence in the 3' UTR of lin-14.
Because lin-4 encoded no protein, she deduced that it must be these transcripts that are causing the silencing by RNA-RNA interactions.
Types of RNAi (non-coding RNA)
miRNA
Length: 23-25 nt
Trans-acting
Binds to target mRNA with mismatches
Translation inhibition
siRNA
Length: 21 nt
Cis-acting
Binds to target mRNA with perfect complementarity
piRNA (Piwi-interacting RNA)
Length: 25 to 36 nt
Expressed in germ cells
Regulates transposon activity
MECHANISM OF RNAi:
First the double-stranded RNA teams up with a protein complex named Dicer, which cuts the long RNA into short pieces.
Then another protein complex called RISC (RNA-induced silencing complex) discards one of the two RNA strands.
The RISC-docked, single-stranded RNA then pairs with the homologous mRNA and destroys it.
THE RISC COMPLEX:
RISC is a large (>500 kDa) RNA-multiprotein complex that triggers mRNA degradation in response to siRNA.
Unwinding of double-stranded siRNA by an ATP-independent helicase
The active components of RISC are Ago proteins (endonucleases), which cleave the target mRNA.
DICER: endonuclease (RNase III family)
Argonaute: central component of the RNA-induced silencing complex (RISC)
One strand of the dsRNA produced by Dicer is retained in the RISC complex in association with Argonaute
ARGONAUTE PROTEIN:
1. PAZ (PIWI/Argonaute/Zwille): recognition of target mRNA
2. PIWI (P-element induced wimpy testis): breaks the phosphodiester bond of mRNA (RNase H activity)
miRNA:
Double-stranded RNAs are naturally produced in eukaryotic cells during development, and they have a key role in regulating gene expression.
Multi-source connectivity as the driver of solar wind variability in the heli... (Sérgio Sacani)
The ambient solar wind that fills the heliosphere originates from multiple sources in the solar corona and is highly structured. It is often described as high-speed, relatively homogeneous, plasma streams from coronal holes and slow-speed, highly variable, streams whose source regions are under debate. A key goal of ESA/NASA’s Solar Orbiter mission is to identify solar wind sources and understand what drives the complexity seen in the heliosphere. By combining magnetic field modelling and spectroscopic techniques with high-resolution observations and measurements, we show that the solar wind variability detected in situ by Solar Orbiter in March 2022 is driven by spatio-temporal changes in the magnetic connectivity to multiple sources in the solar atmosphere. The magnetic field footpoints connected to the spacecraft moved from the boundaries of a coronal hole to one active region (12961) and then across to another region (12957). This is reflected in the in situ measurements, which show the transition from fast to highly Alfvénic then to slow solar wind that is disrupted by the arrival of a coronal mass ejection. Our results describe solar wind variability at 0.5 au but are applicable to near-Earth observatories.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN. (Sérgio Sacani)
The return of a sample of near-surface atmosphere from Mars would facilitate answers to several first-order science questions surrounding the formation and evolution of the planet. One of the important aspects of terrestrial planet formation in general is the role that primary atmospheres played in influencing the chemistry and structure of the planets and their antecedents. Studies of the martian atmosphere can be used to investigate the role of a primary atmosphere in its history. Atmosphere samples would also inform our understanding of the near-surface chemistry of the planet, and ultimately the prospects for life. High-precision isotopic analyses of constituent gases are needed to address these questions, requiring that the analyses are made on returned samples rather than in situ.
Professional air quality monitoring systems provide immediate, on-site data for analysis, compliance, and decision-making.
Monitor common gases, weather parameters, particulates.
3. Motivation
• Recent interest in large terrestrial and marine mammals
• Hardly amenable to standard field protocols
• Growing curiosity in citizen science data (CSD), but where to start?
4. What are the biases in CSD?
• Observer bias
• Spatial bias
• Detection bias
You see me
You don’t see me
5. Review of the literature
• List all papers with ‘Citizen Science’ in them
• Scan and check those actually analysing CSD
• Add papers found randomly (ignoring observer bias…)
• Can we build a taxonomy of methods?
• It’s going to be clumsy and non-exhaustive
And boring…
6. 1 - the ‘comparative’ approach
• Comparison of results from (classic) analyses of CSD vs. standardized protocols
- Deemed to be study/species specific
- Results are often convergent
• My review stops here then…
7. 2 - ‘filtering’ and ‘correction’ approaches
• Methods to filter, select data
• Correction methods: List Length Analysis, Ball’s approach, Telfer’s approach, Frescalo’s method, …
[Figure: sample completed Least Bittern survey data sheet]
8. 2 - ‘filtering’ and ‘correction’ approaches
• These methods are not robust to bias in CSD, except the Frescalo method
Check out our paper, it’s awesome!
9. 3 - the ‘simulation’ approach (Virtual Ecologist)
• Simulate the bias, and check how your favorite method behaves
• Case study with wolverine in Scandinavia
• Counts on den sites to infer abundance
• Accumulation of knowledge about the sites falsely increases observed counts
V. Gervasi
10. 3 - the ‘simulation’ approach (Virtual Ecologist)
[Figure axes: Year, Log(N)]
• Tool to design protocols adequately and explore potential bias
• Convincing way to prove that raw indices are biased
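The den-count example above can be sketched as a small simulation. This is a minimal illustration under stated assumptions, not the model used in the wolverine case study: the function names, the number of dens, and all probabilities are hypothetical. True occupancy is held stable, observers only check dens they already know, and knowledge of den locations accumulates over the years.

```python
import random


def simulate_den_counts(n_years=20, n_dens=100, p_known_start=0.2,
                        p_discover=0.15, p_occupied=0.5, seed=1):
    """Simulate yearly counts of occupied dens when only known dens are checked.

    All parameter values are arbitrary assumptions chosen for illustration.
    """
    rng = random.Random(seed)
    # Dens whose location is known to observers at the start of the survey.
    known = [rng.random() < p_known_start for _ in range(n_dens)]
    true_counts, observed_counts = [], []
    for _ in range(n_years):
        # True occupancy has no trend: each den is occupied with a fixed probability.
        occupied = [rng.random() < p_occupied for _ in range(n_dens)]
        true_counts.append(sum(occupied))
        # Observers can only record occupied dens that they already know.
        observed_counts.append(sum(o and k for o, k in zip(occupied, known)))
        # Knowledge accumulates: each unknown den may be discovered this year.
        known = [k or rng.random() < p_discover for k in known]
    return true_counts, observed_counts


def slope(values):
    """Ordinary least-squares slope of values against years 0, 1, 2, ..."""
    n = len(values)
    x_mean = (n - 1) / 2
    y_mean = sum(values) / n
    numerator = sum((x - x_mean) * (y - y_mean) for x, y in enumerate(values))
    denominator = sum((x - x_mean) ** 2 for x in range(n))
    return numerator / denominator


true_counts, observed_counts = simulate_den_counts()
print("Trend in true counts:    ", round(slope(true_counts), 2))
print("Trend in observed counts:", round(slope(observed_counts), 2))
```

In this sketch the raw index increases even though the simulated population is stable, which is the kind of artefact a virtual-ecologist study is meant to expose before a method is applied to real data.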
11. 4 - the ‘regression’ approach
• Use relevant variables to account for biases
Ian Renner & David Warton
12. 4 - the ‘regression’ approach
• Use relevant variables to account for biases
• Ecological variables
- Affect species’ presence
- Used for building models and predicting
• Observer bias variables
- Affect species detection
- Used only for building models
- Prediction with common level of bias
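Prediction with a common level of bias can be sketched as follows. This is a minimal illustration assuming a log-linear intensity model; the coefficient values, the covariate names (`rainfall`, `distance_to_road_km`) and the function names are hypothetical and do not come from the talk.

```python
import math


def intensity(eco, bias, coef_eco, coef_bias, intercept):
    """Log-linear intensity: exp(intercept + ecological terms + observer-bias terms)."""
    linear = intercept
    linear += sum(coef_eco[name] * value for name, value in eco.items())
    linear += sum(coef_bias[name] * value for name, value in bias.items())
    return math.exp(linear)


def predict_common_bias(eco, common_bias, coef_eco, coef_bias, intercept):
    """Predict with every site's bias covariates replaced by one common level."""
    return intensity(eco, common_bias, coef_eco, coef_bias, intercept)


# Hypothetical coefficients, standing in for the output of a fitted model.
COEF_ECO = {"rainfall": 0.8}
COEF_BIAS = {"distance_to_road_km": -0.5}
INTERCEPT = -2.0
# Common level of bias used for prediction: every site treated as if on a road.
COMMON_BIAS = {"distance_to_road_km": 0.0}

# Two sites with identical ecology but very different accessibility.
site_a = {"eco": {"rainfall": 1.2}, "bias": {"distance_to_road_km": 0.5}}
site_b = {"eco": {"rainfall": 1.2}, "bias": {"distance_to_road_km": 10.0}}

fitted_a = intensity(site_a["eco"], site_a["bias"], COEF_ECO, COEF_BIAS, INTERCEPT)
fitted_b = intensity(site_b["eco"], site_b["bias"], COEF_ECO, COEF_BIAS, INTERCEPT)
pred_a = predict_common_bias(site_a["eco"], COMMON_BIAS, COEF_ECO, COEF_BIAS, INTERCEPT)
pred_b = predict_common_bias(site_b["eco"], COMMON_BIAS, COEF_ECO, COEF_BIAS, INTERCEPT)

print("Fitted record intensities differ:", fitted_a, fitted_b)
print("Predictions at common bias agree:", pred_a, pred_b)
```

The bias covariates are needed to fit the model, but once they are fixed at a common level, differences between predicted values reflect the ecological covariates only.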
13. 4 - the ‘regression’ approach
Maps of estimated intensity of Eucalyptus apiculata in Australia (# detections / km2)
Panel: Ecological variables only
Panel: Ecological + observer bias variables, conditioning on a common level of bias
Map labels: Sydney, Wollemi Nat Park
14. 5 - the ‘combination’ approach
• Combine CSD with data collected via standard protocols (detection/non-detection)
- DND data allow correcting for bias in opportunistic data
- If no DND for one species, share information with other species assuming similar bias
[Diagram labels: opportunistic data; detection/non-detection data; actual presence-absence of the species]
Will Fithian
15. 5 - the ‘combination’ approach
• Combine CSD with data collected via standard protocols (detection/non-detection)
- DND data allow correcting for bias in opportunistic data
- If no DND for one species, share information with other species assuming similar bias
• Several clever people are on it: Pagel, Giraud, Dorazio, Fithian, O’Hara, …
16. 6 - the ‘occupancy’ approach
• Correct for false-negatives, and time/spatial variation in detection
- Account for false-positives
- Extension to multiple species
• How to get the non-detections?
- Relatively easy for checklist data
- But otherwise? You need to know something about the observer effort…
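How an occupancy model corrects for false negatives can be sketched with the basic single-season likelihood. This is a minimal illustration assuming constant occupancy probability `psi`, constant detection probability `p`, no false positives, and a crude grid search in place of a proper optimiser; the function names and simulation settings are hypothetical.

```python
import math
import random
from collections import Counter


def site_likelihood(detections, visits, psi, p):
    """Likelihood of one site's detection history with `detections` hits in `visits`."""
    occupied_term = psi * p ** detections * (1 - p) ** (visits - detections)
    if detections > 0:
        # At least one detection: the site must be occupied.
        return occupied_term
    # Never detected: either occupied but missed every time, or truly unoccupied.
    return occupied_term + (1 - psi)


def fit_occupancy(detection_counts, visits, step=0.01):
    """Grid-search maximum-likelihood estimates of (psi, p)."""
    tally = Counter(detection_counts)
    grid = [i * step for i in range(1, round(1 / step))]
    best_loglik, best_psi, best_p = -math.inf, None, None
    for psi in grid:
        for p in grid:
            loglik = sum(n * math.log(site_likelihood(d, visits, psi, p))
                         for d, n in tally.items())
            if loglik > best_loglik:
                best_loglik, best_psi, best_p = loglik, psi, p
    return best_psi, best_p


def simulate_detections(n_sites, visits, psi, p, seed=42):
    """Simulate the number of detections per site under the same assumptions."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_sites):
        occupied = rng.random() < psi
        counts.append(sum(rng.random() < p for _ in range(visits)) if occupied else 0)
    return counts


counts = simulate_detections(n_sites=300, visits=4, psi=0.6, p=0.4)
naive = sum(c > 0 for c in counts) / len(counts)
psi_hat, p_hat = fit_occupancy(counts, visits=4)
print("Naive occupancy (sites with at least one detection):", round(naive, 2))
print("Estimated occupancy and detection:", round(psi_hat, 2), round(p_hat, 2))
```

The naive proportion ignores occupied sites where the species was missed on every visit, while the likelihood accounts for them. The sketch also shows why non-detections and observer effort matter: without the number of visits per site, the all-zero histories cannot be interpreted.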
17. Wolf range dynamics in France
• Typical example of human-wildlife conflict
• Network of observers all over the country
• Map its range, and assess its dynamics
21. Conclusions
• CSD are great!
• But, we need to deal with bias if we want to extract meaningful ecological signal
22. Recommendations (at your own risk)
• A myriad of approaches; no decision tree
• Use simulations to explore effect of bias
• If possible, incorporate detectability via occupancy / capture-recapture models
• If not, the regression approach, with covariates to correct for observer bias, is an avenue to explore
23. Perspectives
• The combination approach holds great promise
• The (inhomogeneous) Poisson point process modeling framework seems to be a unifying framework
[Diagram labels: opportunistic data; detection/non-detection data; actual presence-absence of the species]
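One way to see why the point process view can tie these data types together is sketched below. The notation is an assumption introduced here for illustration and is not taken from the slides: s is a location, x(s) are ecological covariates, z(s) are observer-bias covariates, D is the study region and A is a survey plot.

```latex
\begin{align}
  \lambda(s) &= \exp\{\beta^\top x(s)\}
    && \text{intensity of the species} \\
  \lambda_{\mathrm{opp}}(s) &= \lambda(s)\, b(s), \qquad b(s) = \exp\{\delta^\top z(s)\}
    && \text{thinned intensity of opportunistic records} \\
  \ell(\beta, \delta) &= \sum_{i=1}^{n} \log \lambda_{\mathrm{opp}}(s_i)
      - \int_{D} \lambda_{\mathrm{opp}}(s)\, \mathrm{d}s
    && \text{log-likelihood of records } s_1, \dots, s_n \text{ (up to a constant)} \\
  \Pr(\text{present in } A) &= 1 - \exp\Bigl\{-\int_{A} \lambda(s)\, \mathrm{d}s\Bigr\}
    && \text{link to presence-absence in a plot}
\end{align}
```

Under these assumptions, opportunistic records and plot-level presence-absence both depend on the same intensity, which is what allows them to be analysed jointly. The last line assumes perfect detection within the plot; imperfect detection would add a further detection term.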
24. Perspectives
• We should focus more on the citizens
- Fieldwork sheet for recording data on observers too?
- A protocol to collect/store data on both species and citizens
• Technology will help
• As well as social sciences
25. Thank you!
… and Barney Stinson from How I Met Your Mother, Tom from the Minions, a random cute cat, Boromir from Lord of the Rings, James Montgomery Flagg (Uncle Sam), Karine and Wesley, Anne-Sophie and Julie from our boulet team, and the meme generators