神戸大学政治学研究会2015年12月例会報告資料
一部の表が見にくい場合は
http://www.JaySong.net/wp-content/uploads/2015/12/Kobepolisci_Conjoint.pdf
をご参照下さい。
( PDF version is also available in http://www.jaysong.net )
這是巨量資料探勘與統計應用課程的投影片「類別變項的相關檢定:卡方獨立性檢定」。本單元是屬於系列課程中的「資料檢定級」中的第三個單元,處理資料類型是「類別」類型的資料,可以檢測出兩兩類別資料之間的關係。本單元要講的分析技術是推論統計的卡方獨立性檢定(Chi-Square Test of Independence),相當適合質性研究所蒐集的類別資料或行為分析。本單元的分析工具是我額外開發的「卡方獨立性檢定計算器」,在投影片裡面還談到了隱含在卡方檢定之後的陷阱:辛普森詭論(Simpson's paradox)。這個單元包含了四個實作學習單,供同學邊看邊練習。
The document discusses Chi-square tests and their applications. Chi-square tests can be used for goodness of fit, independence, and homogeneity. They are non-parametric tests used to analyze categorical data. The three main types are: 1) goodness of fit tests determine if a sample fits a hypothesized distribution, 2) independence tests determine if two categorical variables are associated, and 3) homogeneity tests determine if a categorical variable is distributed identically across populations. Chi-square tests involve calculating expected frequencies, observed frequencies, and a test statistic to determine if the null hypothesis can be rejected.
This document summarizes a doctoral thesis presentation on statistical learning theory for parameter-restricted singular models. It discusses how singular models like hierarchical and latent variable models are important in statistical model design but traditional learning theory cannot analyze the generalization error of such singular models. The presentation analyzes the generalization error of non-negative matrix factorization (NMF) and latent Dirichlet allocation (LDA) as examples of parameter-restricted singular models. It derives an upper bound for the real log-threshold of NMF which determines the generalization error of singular models, and precisely analyzes the real log-threshold of LDA by relating it to a constrained matrix factorization.
The document discusses Demographic Health Surveys (DHS) conducted in India called the National Family Health Survey (NFHS). It notes that DHS are nationally representative household surveys that collect data on population, health, and nutrition trends in developing countries. In India, NFHS surveys collect data from hundreds of thousands of households, women, and men through standardized questionnaires on topics like marriage, fertility, family planning, and health. The data are used widely by policymakers and researchers to inform health policies and programs in India.
- Add Health is a longitudinal study that began in 1994 to examine the health and behaviors of adolescents in the United States from adolescence into adulthood. It utilizes a nationally representative sample and collects extensive data through in-home interviews and biological specimens.
- The study collects data on the social, family, school, and neighborhood environments of participants and how these impact health outcomes. It also examines genetic and biological factors. This allows for analysis of gene-environment interactions and pathways from social experiences to health.
- Add Health has completed 4 waves of data collection from adolescence through age 42. It includes measures of physical health, mental health, substance use, relationships, attitudes, and more. This provides a valuable resource for understanding
神戸大学政治学研究会2015年12月例会報告資料
一部の表が見にくい場合は
http://www.JaySong.net/wp-content/uploads/2015/12/Kobepolisci_Conjoint.pdf
をご参照下さい。
( PDF version is also available in http://www.jaysong.net )
這是巨量資料探勘與統計應用課程的投影片「類別變項的相關檢定:卡方獨立性檢定」。本單元是屬於系列課程中的「資料檢定級」中的第三個單元,處理資料類型是「類別」類型的資料,可以檢測出兩兩類別資料之間的關係。本單元要講的分析技術是推論統計的卡方獨立性檢定(Chi-Square Test of Independence),相當適合質性研究所蒐集的類別資料或行為分析。本單元的分析工具是我額外開發的「卡方獨立性檢定計算器」,在投影片裡面還談到了隱含在卡方檢定之後的陷阱:辛普森詭論(Simpson's paradox)。這個單元包含了四個實作學習單,供同學邊看邊練習。
The document discusses Chi-square tests and their applications. Chi-square tests can be used for goodness of fit, independence, and homogeneity. They are non-parametric tests used to analyze categorical data. The three main types are: 1) goodness of fit tests determine if a sample fits a hypothesized distribution, 2) independence tests determine if two categorical variables are associated, and 3) homogeneity tests determine if a categorical variable is distributed identically across populations. Chi-square tests involve calculating expected frequencies, observed frequencies, and a test statistic to determine if the null hypothesis can be rejected.
This document summarizes a doctoral thesis presentation on statistical learning theory for parameter-restricted singular models. It discusses how singular models like hierarchical and latent variable models are important in statistical model design but traditional learning theory cannot analyze the generalization error of such singular models. The presentation analyzes the generalization error of non-negative matrix factorization (NMF) and latent Dirichlet allocation (LDA) as examples of parameter-restricted singular models. It derives an upper bound for the real log-threshold of NMF which determines the generalization error of singular models, and precisely analyzes the real log-threshold of LDA by relating it to a constrained matrix factorization.
The document discusses Demographic Health Surveys (DHS) conducted in India called the National Family Health Survey (NFHS). It notes that DHS are nationally representative household surveys that collect data on population, health, and nutrition trends in developing countries. In India, NFHS surveys collect data from hundreds of thousands of households, women, and men through standardized questionnaires on topics like marriage, fertility, family planning, and health. The data are used widely by policymakers and researchers to inform health policies and programs in India.
- Add Health is a longitudinal study that began in 1994 to examine the health and behaviors of adolescents in the United States from adolescence into adulthood. It utilizes a nationally representative sample and collects extensive data through in-home interviews and biological specimens.
- The study collects data on the social, family, school, and neighborhood environments of participants and how these impact health outcomes. It also examines genetic and biological factors. This allows for analysis of gene-environment interactions and pathways from social experiences to health.
- Add Health has completed 4 waves of data collection from adolescence through age 42. It includes measures of physical health, mental health, substance use, relationships, attitudes, and more. This provides a valuable resource for understanding
Health Evidence hosted a 60 minute webinar examining the effectiveness of school-based interventions for preventing HIV, sexually transmitted infections and pregnancy in adolescents. Click here for access to the audio recording for this webinar: https://youtu.be/yCeIEQ4OTCc
Amanda Mason-Jones, Senior Lecturer in Global Public Health, Faculty of Science, University of York led the session and presented findings from her recent Cochrane review:
Mason-Jones A, Sinclair D, Mathews C, Kagee A, Hillman A, & Lombard C. (2016). School-based interventions for preventing HIV, sexually transmitted infections, and pregnancy in adolescents.Cochrane Database of Systematic Reviews, 2016(11), CD006417
http://healthevidence.org/view-article.aspx?a=school-based-interventions-preventing-hiv-sexually-transmitted-infections-29881
Sexually active adolescents are at risk of contracting HIV and STIs. Unintended pregnancy can have detrimental impact on young people’s lives. This review examines the impact of school sexual education programs on number of young people that contract STIs and number of adolescent pregnancies. Eight cluster randomized control trials, including 55,157 participants are included in this review. Findings suggest there is little evidence that school programs alone are effective in improving sexual and reproductive health outcomes for adolescents. This webinar examined the effectiveness and components of interventions that prevent HIV, STIs and adolescent pregnancy.
Evidence about Social Work Outcomes from Cohort and Panel StudiesBASPCAN
Jonathan Scourfield, Cardiff University
Morag Henderson, UCL Inst of Education
Sin Yi Cheung, Cardiff University
Elaine Sharland, University of Sussex
Luke Sloan, Cardiff University
Meng Le Zhang, Cardiff University
This dissertation examines student fears and perceptions of safety on secondary school campuses. The study surveyed students about their fears related to safety, how those fears impact their well-being, and which security measures increase their feelings of safety. It found that most students feel safe in at least one class and have an adult they trust. However, it also identified fears around drug use, bullying, prejudice, and property crimes. The study recommends improving relationships, publicizing policies, addressing drug use, reporting bullying, examining prejudice, and involving students in safety measures. It suggests future studies on academic performance, teacher perceptions, student participation, and bullying reporting.
This document summarizes an outcome-based program evaluation of Eastern Michigan University's LGBTQ Safe Campaign training program. It provides background on the goals of Safe programs and EMU's reputation as an LGBTQ-friendly campus. The evaluation examined whether a single Safe Campaign session increased knowledge about LGBTQ topics and if socio-demographic factors influenced knowledge levels. Results showed a significant increase in knowledge from pre- to post-test. Student assessments also showed the training was informative and increased awareness and support for LGBTQ individuals. Faculty interviews provided feedback on benefits and ways to improve and increase participation in the training.
Finding and Using Secondary Data and Resources for ResearchDr. Karen Whiteman
This document provides an overview of secondary data sources and how to find and use secondary data for research. It discusses what secondary data is, common myths about secondary data, pros and cons of using secondary data, questions to consider when choosing secondary data, examples of large data banks like ICPSR, and how to create a personalized dataset from secondary sources. The document aims to dispel myths and provide guidance on successfully utilizing large secondary datasets for research.
This ESS IA talks about To compare the family planning in a two areas. A rural village (A- Hilaswali, Mahendra Pur, Dehradun) and a metropolitan city (B- Gurgaon, Haryana) and evaluate whether education influences it.
The document summarizes an HIV & Global Health Rounds presentation on using mHealth to address the HIV continuum among sexual and gender minorities. It provides background on the presenter and collaborators. It then discusses how mHealth can enhance HIV research among sexual and gender minorities by providing tailored interventions, addressing co-occurring health issues, and reducing costs. Examples are given of mHealth interventions that have addressed these areas. The document advocates for matching intervention intensity to individual needs and integrating evidence-based interventions to impact multiple outcomes.
The document summarizes a study on the effects of cyberbullying on fourth year education students at Capiz State University Dayao Satellite College. It defines cyberbullying and discusses its psychological and social impacts, including depression, anxiety, low self-esteem, and suicidal thoughts. The study aims to determine the forms of cyberbullying encountered by students and the effects on their mental health and academic performance. It will utilize a survey questionnaire to gather data on students' experiences with cyberbullying and its impacts. The results could help school administrators address cyberbullying issues and support affected students.
This tutorial provides instructions for using the 2008 Ohio Family Health Survey (OFHS) Public Use File with Stata. It describes the survey design, questions, and constructed variables. Missing data was imputed for important questions. Each respondent is assigned a weight reflecting their probability of selection. The tutorial demonstrates how to set the survey design in Stata and run basic analyses using survey commands like svy: tabulate to produce weighted proportions and confidence intervals. An example shows insurance type proportions for children overall and by poverty level.
This tutorial provides instructions for using the 2008 Ohio Family Health Survey (OFHS) Public Use File with Stata. It describes the survey design, questions, and constructed variables. Missing data was imputed for important questions. Each respondent is assigned a weight reflecting their probability of selection. The tutorial demonstrates how to set the survey design in Stata and run basic analyses using survey commands like svy: tabulate to produce weighted proportions with confidence intervals of children's insurance types overall and by poverty level.
You Geaux Girl! Internet based Pregnancy Prevention in New OrleansYTH
Jakevia Green at Tulane University presented the results of a clinical trial of the BUtiful program - Be yoU! Talented Informed Fearless Uncompromised and Loved- a social media pregnancy prevention program for New Orleans women.
The document discusses survey research methods. It defines surveys as a method used to collect data from a sample of individuals through questionnaires or interviews to make inferences about a population. The key steps of survey design are outlined, including developing hypotheses, writing questions, sampling, data collection, analysis, and reporting findings. Advantages of surveys include wide scope and low cost, while limitations include superficiality and response biases. Surveys are useful for exploratory, descriptive, and explanatory research.
This document summarizes a study on consumer use of internet health information. The study found:
1) 72% of respondents were internet users, and 56% of internet users sought health information online.
2) Women were more likely than men to have negative assessments of their search experience. Older adults found searches more difficult than younger users.
3) Over half of those who searched online contacted a healthcare provider as a result, with no differences by age or sex.
The study suggests the internet influences health behaviors and could be used to deliver interventions to different groups.
The document discusses how partnerships between state agencies and higher education can use early childhood data for decision-making. It provides examples from Ohio of integrating data from the state to county and local levels, including the Ohio Education Research Center, the Cuyahoga County CHILD integrated data system, and projects analyzing third grade reading outcomes, homeless families, and child healthcare utilization. The examples show how integrated longitudinal data can inform policy and services to improve early childhood outcomes.
This document summarizes a workshop on preparing and curating research data from the Prevention and Early Intervention Initiative (PEII) in Ireland. It describes several data collections that were generated from evaluations of PEII programs, including the Preparing for Life (PFL) study, the Children's Profile at School Entry (CPSE) study, and others. The PFL study involved home visiting and supports for families from pregnancy to age 4, while the CPSE study collected data on school readiness for junior infants. Both studies used mixed methods and longitudinal designs. The document outlines the process of preparing, anonymizing, and curating these datasets so they can be safely and ethically archived and reused.
2022-06-07 Berman Lew Great Plains Conference FINAL.pptxLew Berman
The document discusses strategies for addressing gaps in electronic health record (EHR) data collected by the All of Us Research Program. It notes that EHR data is often fragmented across different providers due to participant mobility and care received from multiple organizations. The program has begun linking claims data and exploring participant-mediated linkages using FHIR and Apple Health. Additionally, acquiring data from national health networks/health information exchanges could help fill gaps by providing a broader view of participant health histories. Privacy-preserving record linkage techniques are also discussed as a way to match participant records across different data sources while protecting identities.
Stochastic actor-oriented models (SAOMs) allow for the modeling of interdependent network and behavioral change over time. SAOMs conceptualize change as occurring through a sequence of micro-steps, with actors making decisions to change their ties or behaviors based on maximizing an objective function. The objective function specifies how different network and behavioral effects influence these decisions. SAOMs can be used to estimate the effects of network structure on behaviors or behaviors on network structure while accounting for endogenous network and behavioral processes.
This document discusses different types of network experiments and interventions. It describes (1) using roommate assignments to make social connections exogenous, assessing peer effects on outcomes like GPA. It also discusses (2) natural experiments that manipulate exposure over existing networks, like popularity or voter turnout. Finally, it outlines (3) different types of network interventions, including targeting influential individuals, segmenting groups, inducing new connections, and altering network structure. The conclusion is that evidence from these experiments shows peer influence is real and we can now focus on how to leverage networks most effectively.
The document discusses network diffusion and peer influence. It begins by defining diffusion and compartment models used to model disease spread. It then discusses how network structure, including topology, timing of connections, and structural transmission, can impact diffusion. Simulation is proposed to test how network features like distance, clustering, redundancy, and high-degree nodes influence spread. The relationships between contact networks, exposure networks based on timing, and actual transmission networks are also introduced.
Statistical models for networks aim to compare observed networks to random graphs in order to assess statistical significance. Simple random graphs are commonly used as a baseline null model but are unrealistic. More developed null models condition on key network structures like degree distribution or mixing patterns to generate more reasonable random graphs for comparison. Network inference problems evaluate whether an observed network exhibits random or non-random properties relative to an appropriate null model.
This document provides a history of network visualization from its origins in the 18th century to modern applications. It discusses:
- Early work by Euler and kinship diagrams that relied on visual representations to study networks.
- J.L. Moreno's sociograms in the 1930s that sparked interest in social network analysis through visualizations.
- Developments through the 20th century as researchers sought to move from artistic to more scientific visualizations.
- Challenges of visualizing large, complex networks while representing multiple node and edge attributes simultaneously.
- Benefits of visualization in making invisible network structures visible and providing multi-dimensional insights beyond metrics alone.
The document discusses community detection in networks and multilayer networks. It begins with an introduction to community detection, how to calculate communities using various algorithms, and the importance of resolution parameters. It then provides a short introduction to multilayer networks and examples of community detection applied to real-world networks, including social networks, protein interaction networks, and legislative voting networks. The key points are that network representations provide a flexible framework for studying complex data, community detection is a powerful tool for exploring network structures, and network structures can identify essential features for modeling and understanding data applications.
This document provides an overview of Respondent-Driven Sampling (RDS), a method for sampling hidden populations using their social networks. RDS begins by selecting initial participants, called "seeds", who each recruit a small number of new participants into the study. Those new participants can then recruit others, creating a chain-referral sample. The document discusses how RDS works, its applications across many hidden populations, and some of its promises and pitfalls, including high sampling variance requiring large sample sizes compared to simple random sampling. It also reviews recent progress on estimating sampling variance from RDS studies.
This document discusses ego network analysis and its advantages over sociocentric network analysis. It begins with an overview of ego networks and sociocentric networks. Ego networks have several practical advantages, including flexibility in data collection, broader inference potential, and the ability to examine overlapping social circles. However, ego networks also have disadvantages like inability to measure reciprocated ties and map broader social structure. The document then reviews common measures used in ego network analysis, including measures of network size, tie strength, composition, and homophily. It provides examples of how to operationalize these concepts.
This document provides an overview of different measures for analyzing social networks, including centrality measures, connectivity and cohesion measures, and roles. It discusses centrality measures like degree, closeness, betweenness, and eigenvector centrality for individual nodes. It also covers whole network measures like degree distribution, density, and centralization. The document describes local connectivity and cohesion measures including reciprocity, triad census, transitivity, and clustering coefficients. It discusses how these measures can be applied and interpreted for one-mode projections of two-mode networks.
This document discusses network data collection. It begins by providing examples of how social structure matters and influences outcomes. It then discusses different ways to detect social structure through network data collection, including small group questionnaires, large surveys, observations, and digital data scraping. The document outlines key network questions that can shape data collection, such as how networks form and their consequences. It also discusses sampling and defining network boundaries. Overall, the document provides an overview of network data collection methods and considerations.
Networks & Health
This document provides an introduction and overview of social network analysis and its relevance to health research. It discusses key concepts such as what networks are, different types of network data including one-mode and two-mode data, and different levels of analysis including ego networks, partial networks, and complete networks. The document also discusses why networks matter for health through connectionist mechanisms like diffusion and positional mechanisms like social roles. Overall, the document serves as a high-level introduction to social network concepts and their application to health research.
This study investigated the reciprocal relationships between social network characteristics, social support, and mental health among older adults in the United States. The study found reciprocal associations between social support and depressive symptoms, as well as between social support and certain measures of social network structure. While social support was protective of depression, depression could undermine received support over time. The strongest link between social networks and depression was indirect, through levels of social support. Future research should focus on social support as an important pathway through which social networks impact mental health.
Graph and language embeddings were used to analyze user data from Reddit to predict whether authors would post in the SuicideWatch subreddit. Metapath2vec was used to generate graph embeddings from subreddit and author relationships. Doc2vec was used to generate document embeddings based on language similarity between submissions and subreddits. Combining the graph and document embeddings in a logistic regression achieved 90% accuracy in predicting SuicideWatch posters, reducing both false positives and false negatives compared to using the embeddings separately. Next steps proposed using the embeddings to better understand similarities between related subreddits and predict risk factors in posts.
This document discusses using social network analysis to design and implement behavior change interventions. It begins by outlining key network concepts like diffusion of innovations and mathematical models of diffusion. It then discusses how social networks influence behaviors through concepts like network exposure, tie strength, and thresholds. The document concludes by describing how to use social network analysis at different stages of intervention including needs assessment, program design, implementation, and monitoring through approaches like network ethnography, identifying opinion leaders, and using network diagnostics.
Stochastic actor-oriented models (SAOMs) summarize the key components and estimation process of these models. SAOMs model how networks and behaviors change over time as a result of endogenous network effects and influence between connected individuals. The models estimate parameters representing these effects to predict tie formation and changes in behaviors. SAOMs account for selection into the network based on attributes as well as social influence processes within the network. Estimation involves maximum likelihood to estimate parameters of network and behavior functions that represent how individuals make network and behavioral decisions.
Deep Behavioral Phenotyping in Systems Neuroscience for Functional Atlasing a...Ana Luísa Pinho
Functional Magnetic Resonance Imaging (fMRI) provides means to characterize brain activations in response to behavior. However, cognitive neuroscience has been limited to group-level effects referring to the performance of specific tasks. To obtain the functional profile of elementary cognitive mechanisms, the combination of brain responses to many tasks is required. Yet, to date, both structural atlases and parcellation-based activations do not fully account for cognitive function and still present several limitations. Further, they do not adapt overall to individual characteristics. In this talk, I will give an account of deep-behavioral phenotyping strategies, namely data-driven methods in large task-fMRI datasets, to optimize functional brain-data collection and improve inference of effects-of-interest related to mental processes. Key to this approach is the employment of fast multi-functional paradigms rich on features that can be well parametrized and, consequently, facilitate the creation of psycho-physiological constructs to be modelled with imaging data. Particular emphasis will be given to music stimuli when studying high-order cognitive mechanisms, due to their ecological nature and quality to enable complex behavior compounded by discrete entities. I will also discuss how deep-behavioral phenotyping and individualized models applied to neuroimaging data can better account for the subject-specific organization of domain-general cognitive systems in the human brain. Finally, the accumulation of functional brain signatures brings the possibility to clarify relationships among tasks and create a univocal link between brain systems and mental functions through: (1) the development of ontologies proposing an organization of cognitive processes; and (2) brain-network taxonomies describing functional specialization. To this end, tools to improve commensurability in cognitive science are necessary, such as public repositories, ontology-based platforms and automated meta-analysis tools. I will thus discuss some brain-atlasing resources currently under development, and their applicability in cognitive as well as clinical neuroscience.
What is greenhouse gasses and how many gasses are there to affect the Earth.moosaasad1975
What are greenhouse gasses how they affect the earth and its environment what is the future of the environment and earth how the weather and the climate effects.
Current Ms word generated power point presentation covers major details about the micronuclei test. It's significance and assays to conduct it. It is used to detect the micronuclei formation inside the cells of nearly every multicellular organism. It's formation takes place during chromosomal sepration at metaphase.
Remote Sensing and Computational, Evolutionary, Supercomputing, and Intellige...University of Maribor
Slides from talk:
Aleš Zamuda: Remote Sensing and Computational, Evolutionary, Supercomputing, and Intelligent Systems.
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Inter-Society Networking Panel GRSS/MTT-S/CIS Panel Session: Promoting Connection and Cooperation
https://www.etran.rs/2024/en/home-english/
This presentation explores a brief idea about the structural and functional attributes of nucleotides, the structure and function of genetic materials along with the impact of UV rays and pH upon them.
EWOCS-I: The catalog of X-ray sources in Westerlund 1 from the Extended Weste...Sérgio Sacani
Context. With a mass exceeding several 104 M⊙ and a rich and dense population of massive stars, supermassive young star clusters
represent the most massive star-forming environment that is dominated by the feedback from massive stars and gravitational interactions
among stars.
Aims. In this paper we present the Extended Westerlund 1 and 2 Open Clusters Survey (EWOCS) project, which aims to investigate
the influence of the starburst environment on the formation of stars and planets, and on the evolution of both low and high mass stars.
The primary targets of this project are Westerlund 1 and 2, the closest supermassive star clusters to the Sun.
Methods. The project is based primarily on recent observations conducted with the Chandra and JWST observatories. Specifically,
the Chandra survey of Westerlund 1 consists of 36 new ACIS-I observations, nearly co-pointed, for a total exposure time of 1 Msec.
Additionally, we included 8 archival Chandra/ACIS-S observations. This paper presents the resulting catalog of X-ray sources within
and around Westerlund 1. Sources were detected by combining various existing methods, and photon extraction and source validation
were carried out using the ACIS-Extract software.
Results. The EWOCS X-ray catalog comprises 5963 validated sources out of the 9420 initially provided to ACIS-Extract, reaching a
photon flux threshold of approximately 2 × 10−8 photons cm−2
s
−1
. The X-ray sources exhibit a highly concentrated spatial distribution,
with 1075 sources located within the central 1 arcmin. We have successfully detected X-ray emissions from 126 out of the 166 known
massive stars of the cluster, and we have collected over 71 000 photons from the magnetar CXO J164710.20-455217.
Travis Hills' Endeavors in Minnesota: Fostering Environmental and Economic Pr...Travis Hills MN
Travis Hills of Minnesota developed a method to convert waste into high-value dry fertilizer, significantly enriching soil quality. By providing farmers with a valuable resource derived from waste, Travis Hills helps enhance farm profitability while promoting environmental stewardship. Travis Hills' sustainable practices lead to cost savings and increased revenue for farmers by improving resource efficiency and reducing waste.
Comparing Evolved Extractive Text Summary Scores of Bidirectional Encoder Rep...University of Maribor
Slides from:
11th International Conference on Electrical, Electronics and Computer Engineering (IcETRAN), Niš, 3-6 June 2024
Track: Artificial Intelligence
https://www.etran.rs/2024/en/home-english/
Nucleophilic Addition of carbonyl compounds.pptxSSR02
Nucleophilic addition is the most important reaction of carbonyls. Not just aldehydes and ketones, but also carboxylic acid derivatives in general.
Carbonyls undergo addition reactions with a large range of nucleophiles.
Comparing the relative basicity of the nucleophile and the product is extremely helpful in determining how reversible the addition reaction is. Reactions with Grignards and hydrides are irreversible. Reactions with weak bases like halides and carboxylates generally don’t happen.
Electronic effects (inductive effects, electron donation) have a large impact on reactivity.
Large groups adjacent to the carbonyl will slow the rate of reaction.
Neutral nucleophiles can also add to carbonyls, although their additions are generally slower and more reversible. Acid catalysis is sometimes employed to increase the rate of addition.
ANAMOLOUS SECONDARY GROWTH IN DICOT ROOTS.pptxRASHMI M G
Abnormal or anomalous secondary growth in plants. It defines secondary growth as an increase in plant girth due to vascular cambium or cork cambium. Anomalous secondary growth does not follow the normal pattern of a single vascular cambium producing xylem internally and phloem externally.
Phenomics assisted breeding in crop improvementIshaGoswami9
As the population is increasing and will reach about 9 billion upto 2050. Also due to climate change, it is difficult to meet the food requirement of such a large population. Facing the challenges presented by resource shortages, climate
change, and increasing global population, crop yield and quality need to be improved in a sustainable way over the coming decades. Genetic improvement by breeding is the best way to increase crop productivity. With the rapid progression of functional
genomics, an increasing number of crop genomes have been sequenced and dozens of genes influencing key agronomic traits have been identified. However, current genome sequence information has not been adequately exploited for understanding
the complex characteristics of multiple gene, owing to a lack of crop phenotypic data. Efficient, automatic, and accurate technologies and platforms that can capture phenotypic data that can
be linked to genomics information for crop improvement at all growth stages have become as important as genotyping. Thus,
high-throughput phenotyping has become the major bottleneck restricting crop breeding. Plant phenomics has been defined as the high-throughput, accurate acquisition and analysis of multi-dimensional phenotypes
during crop growing stages at the organism level, including the cell, tissue, organ, individual plant, plot, and field levels. With the rapid development of novel sensors, imaging technology,
and analysis methods, numerous infrastructure platforms have been developed for phenotyping.
BREEDING METHODS FOR DISEASE RESISTANCE.pptxRASHMI M G
Plant breeding for disease resistance is a strategy to reduce crop losses caused by disease. Plants have an innate immune system that allows them to recognize pathogens and provide resistance. However, breeding for long-lasting resistance often involves combining multiple resistance genes
When I was asked to give a companion lecture in support of ‘The Philosophy of Science’ (https://shorturl.at/4pUXz) I decided not to walk through the detail of the many methodologies in order of use. Instead, I chose to employ a long standing, and ongoing, scientific development as an exemplar. And so, I chose the ever evolving story of Thermodynamics as a scientific investigation at its best.
Conducted over a period of >200 years, Thermodynamics R&D, and application, benefitted from the highest levels of professionalism, collaboration, and technical thoroughness. New layers of application, methodology, and practice were made possible by the progressive advance of technology. In turn, this has seen measurement and modelling accuracy continually improved at a micro and macro level.
Perhaps most importantly, Thermodynamics rapidly became a primary tool in the advance of applied science/engineering/technology, spanning micro-tech, to aerospace and cosmology. I can think of no better a story to illustrate the breadth of scientific methodologies and applications at their best.
01 Add Health Network Data Challenges: IRB and Security Issues
1. Add Health Network Data Challenges:
IRB and Security Issues
Kathleen Mullan Harris
University of North Carolina at Chapel Hill
Social Networks and Health Workshop
May 13, 2019
2. Topics to cover
• Design of Add Health Study
• Unique network data
• Main IRB issues
• Security risks in data dissemination and user access
• Data security protocols
• Data sharing plan
• Successes and ongoing challenges
3. Add Health
• National Longitudinal Study of Adolescent to
Adult Health
• On-going program project that began in 1994
funded by NIH.
• Developed in response to a congressional
mandate to fund a study of adolescent health.
• Mandate to understand the role of social
environments in which adolescents live.
5. Key Features of Add Health Design
• Nationally representative study that explores the cause
of health and health-related behaviors of adolescents
and their outcomes in adulthood.
• Multi-survey, multi-wave inter-disciplinary design.
• Direct measurement of the social contexts of adolescent
life and their effects on health and health behavior.
• Unprecedented racial and ethnic diversity and
genetically informed sibling samples.
• Extensive biomarker collection across all waves.
6. Sampling Structure
Disabled SampleSaturation
Samples
from 16 Schools
Main Sample 200/Community
Ethnic Samples
Genetic
Samples
High Educ
Black
Puerto Rican
Chinese
Identical Twins Full SibsFraternal Twins
Unrelated Pairs
in Same HH
Half Sibs
Sampling Frame of Adolescents and Parents N = 100,000+ (100 to 4,000 per pair of schools)
School Sampling Frame = QED
H
Feeder
HS HS HS HS
Feeder Feeder Feeder Feeder
Cuban
HS
8. Unique Network Data
• Nominations of 5 best male and 5 best female friends
during in-school administration;
• Nomination of sexual and romantic partners in Wave I in-
home interview;
• Nomination of best friend in Wave I and Wave II in-home
interview;
• Nomination of 5 best male and female friends, sexual and
romantic partners in Wave I in-home interview in saturated
samples (N~3,000).
• 1,500 couple sample at Wave III.
• Contact with [potential] Wave I (1995) friends at Wave III
(2001-02) (asked of 7th and 8th graders from Wave I).
9. Wave I Friendship nominations
• Version A: [For R’s asked to nominate up to 5 male
and 5 female friends.]
• First, please tell me the name of your 5 best male
friends, starting with your best male friend. (If R is
female, add: If you have a boyfriend, list him first. If
not, begin with your best male friend.)
• Version B: [For R’s asked to nominate 1 male and 1
female friends.]
• The next questions are about your friends. First,
please think of your best male friend. What is his
name?
11. Race/Ethnicity N %
Mexico 1,767 8.5
Cuba 508 2.5
Central-South America 647 3.1
Puerto Rico 570 2.8
China 341 1.7
Philippines 643 3.1
Other Asia 601 2.9
Black (Africa/Afro-Caribbean) 4,601 22.2
Non-Hispanic White (Eur/Canada) 10,760 52.0
Native American (non-Hispanic) 248 1.2
Total N 20,686 100.0
Race and Ethnic Diversity in Add Health
Missing on race/ethnicity=59
12. Family Structure of Adolescents
Add Health 1995
N %
2 biological parents 10,339 53.3
2 adoptive parents 403 0.7
Bio Mom/ Step Dad 2,756 13.6
Bio Dad/ Step Mom 591 2.6
Single Mom 4,520 20.4
Single Dad 637 3.1
Surrogate parent(s) 1,499 6.3
Total 20,745 100.0
13. Domingue et al. 2018. The social genome of friends and schoolmates
in the National Longitudinal Study of Adolescent to Adult Health.
14. Main IRB Issues
• Protecting confidentiality of respondents’ identity;
• Protecting the confidentiality of friendship and
partner nominations;
• Minimizing deductive disclosure risks.
• All 3 of these issues combined with wide
dissemination of highly sensitive data.
15. Strong Motivation for Data Sharing
• Add Health very expensive;
• National representation;
• Design provided unprecedented and unique opportunities
for research that no other study allowed;
• Multi-disciplinary design;
• Omnibus study with comprehensive coverage of health
and health behavior in adolescence;
• Longitudinal study to help sort out cause and effect.
• Add Health policy: No proprietary period for investigators.
16. Special Data Security Concerns to Address
in Add Health Data Sharing Plan
• Contextual nature of design;
• Third-party nominations of friends;
• Third-party nominations of romantic and sex partners:
– No consent from nominated partners
• Geocodes of adolescent’s home address;
• Sensitive nature of data, especially during adolescence;
• With each additional wave of longitudinal data and new
data sources added, data security risks increase.
17. Pledge of Confidentiality to
Add Health Participants
Add Health Wave I Consent form for Adolescent,
1994-95.
• “I understand that my statements and answers will
be completely protected so that no one will be able
to connect my answers to me. My answers will not
be given to my parents/guardians or to any
unauthorized person by project staff.”
18. Two Major Security Risks to Privacy and
Confidentiality of Add Health Respondents
• Direct disclosure
– link between name and questionnaire information.
• Deductive disclosure
– Discerning a respondent’s identity and their responses
through use of known characteristics of that person;
– Add Health especially high risk of deductive disclosure
because of the clustered study design;
– Almost 400,000 people knew of the participation of at
least one, if not more, the adolescents in Add Health at
Wave I.
19. For Example
• In more than 90,000 cases in in-school
administration, takes only 5 variables to distinguish
as few as 7 respondents:
• Number of respondents = 90,118
– Gender = female (44,482)
– Age = 16 (8,250)
– Race = Asian (570)
– Does not live with father figure (104)
– Participates in chorus (7)
20. Data Security Procedures
• Add Health Data Management Security Plan:
• Protects against breach of confidentiality from both
direct and deductive disclosure
• Protects from subpoena (or hostile sharing)
• Also have a certificate of confidentiality
21. Protections from Direct Disclosure
• Based on the separation of identities and data.
• Names and addresses of respondents reside with a
security manager (i.e., honest broker) outside U.S.
• Identifying information only in U.S. during data
collection.
• Multiple identification numbers used for each wave
of data collection.
• Respondent names never connected to data.
22. Wave IV IDs
• F4ID – Wave IV Field ID
• B4ID – Wave IV Biospecimen ID
• C4ID – Wave IV Cortisol ID (pretest only)
• H4ID – Wave IV Household ID
• M4ID – Wave IV Military ID
• Q4ID – Wave IV Questionnaire ID
• S4ID –School ID used at Wave IV
• AID (altered ID)- analysis file identification number that
links data from all waves of data collection
23. Protections from Deductive Disclosure
• Tiered data dissemination plan:
– Differs by level of risk of deductive disclosure;
– Requirements for access and use of data differ by the
level of risk;
– Add Health users must meet these requirements and
assurances for safeguarding Add Health data.
• Users also required to abide by procedures for the
dissemination of research results.
• Access to all data via restricted data use contract.
24. Four tier data dissemination
according to disclosure risks
• Public-use data (no friendship or romantic pairs
linkages)
• Restricted-use data (no romantic pairs, HIV)
• High-security restricted data
• Secure data facility for analyzing high school
transcript data and for using geocodes to link
contextual data.
• [Secure server for access to genome-wide data.]
26. Data Users
• User requirements to protect from deductive disclosure:
– Pledge of confidentiality
– Monitoring of data use
– Store data securely
– Deletion of temporary data analysis files every six
months.
– Security of printed information
– Password protected screen saver, set to activate after
3 minutes of idle time.
– Access data only from approved locations.
27. Wave II
1996
(88.6%)
Wave III
2001-2002
(77.4%)
School
Admin
144
Adolescents
in grades 7-12
20,745
Adolescents
in grades 8-12
14,738
Adults
Aged 24-32
15,701
Wave IV
2008
(80.3%)
Wave I
1994-1995
(79%)
In-School
Administration
Survey
Administration
Students
90,118
Partners
1,507
Parent
17,670
Young Adults
Aged 18-26
15,197
School
Admin
128
IIV Study
~100
Wave V
2016-18
Adults
Aged 32-42
Target: 12,000
Social, Behavioral, and Biological Linkages Across the Life Course
Parent
3,001
IIV Study
~100
28. Adolescence
Transition to
Adulthood
Young Adulthood Adulthood
Wave I-II (Ages 12-20) Wave III (Ages 18-26) Wave IV (Ages 24-32) Wave V (Ages 32-42)
Embedded genetic sample of ~3,000 pairs
Physical development
Height, weight Height, weight Height, weight, waist Height, weight, waist
STI tests (urine) Metabolic Metabolic
HIV test (saliva) Immune function Immune function
Genetic
(buccal cell DNA)
Inflammation Inflammation
Cardiovascular Cardiovascular
Genetic
(buccal cell DNA)
Genetic
(whole blood)
Medications Medications
Renal
Biological Data Across Waves
29. Social and Biological Longitudinal Data in Add Health
Adolescence Adulthood
Wave I-II Wave III Wave IV Wave V
(12-20) (18-26) (24-32) (32-42)
Social environmental data:
school college college work
family family family family
romantic rel romantic rel romantic rel romantic rel
neighborhood neighborhood neighborhood neighborhood
community community community community
peer peer
Biological data:
Biological resemblance to siblings in household on 3,000 pairs
height height ht, wt, waist, BMI ht, wt, waist, BMI
weight weight, BMI BP, pulse BP, pulse
BMI STI test results immune immune
HIV test results inflammation inflammation
DNA diabetes diabetes
DNA kidney disease
GWAS mRNA, DNAm
30. Original Data Sharing Plan able to
accommodate collection of new sensitive data
• STD and HIV results (on both members of
couple sample)
• Links to military and other administrative data
• Biomarkers
• Genetic data
• Longitudinal parent data
31. Wave III Questions on Friends
This section is administered only to respondents who were
in grades 7 or 8 at Wave I.
Moody algorithm lists of 10 potential friends displayed
on laptop screen:
• Please indicate which of the following people who went
to <SCHOOL> are your friends now or were your friends
when you were in school with them.
– Current relationship with <FNAME>
– How often see <FNAME>
– How often you contact <FNAME>
– <FNAME> one of your closest friends
– Friendship beginning and end dates
32. Other Data Security
• Keep up with technological advances in data
access, hacking, and intruding
• SANS Institute
– Organization that provides computer security training and
certification
– Continuing training and updating security systems on 20
top threats to data security
– Add Health staff data security specialist
• Procedures for PI, study administrators, survey
field staff and interviewers, data manager, and
data users
33. Dissemination Successes
• 3,500 peer-reviewed articles in more than 800 different
disciplinary journals
– New article based on Add Health published every
day of the work week
• 200 books, book chapters, and reports; 4,000
conference presentations
• 900 master’s theses and doctoral dissertations
• Over 1,000 independent grants awarded using Add
Health data
• Over 200,000 citations to Add Health research
36. Ongoing Data Security Challenges
• Time-consuming
• Multiple staff demands and skills
• Resort to custom-approach with users to set up security
plans at IT-poor institutions
• Educate
– IRBs
– Users
– Investigators
– Never-ending with rapidly changing technology and
methods of data intrusion.
37. Ongoing Data Security Challenges
• Genetic Data
– Candidate and SNP data
– Genome-wide data
• Other genetic and biological data
– Microbiome
– Gene expression data
• Administrative records
– Social security
– Birth, death record
38. Ancillary Studies
• Researchers may propose ancillary studies to add
data to Add Health;
– No access to respondents;
– Geographic, political, environmental (via geocodes);
– Biological and genetic data (via biological specimen
archive);
– Review ancillary study proposals for scientific merit,
value and burden to Add Health, scarcity and amount
requested of specimen archive.
– Cover all costs, including Add Health costs to check
and merge data via Data Security Manager outside
the U.S. and disclosure analysis;
39. Ancillary Studies
• Add Health policies:
– Ancillary investigators must check and clean
data unlinked to Add Health longitudinal data;
– Contextual data has no proprietary period
• The day we provide the data to the Ancillary Study PI
is the day we distribute to community of Add Health
users.
– Biological and genetic data have a 1-year
proprietary period.
40. Add Health Info
• Website: www.cpc.unc.edu/addhealth
• Description of data access and dissemination:
http://www.cpc.unc.edu/projects/addhealth/data
• Restricted data use contract:
http://www.cpc.unc.edu/projects/addhealth/data/contract
41. Wave IV Consent Form (excerpt)
• Add Health has always attempted to protect privacy to the greatest extent
possible. Your answers, measurements, and test results will be held in strict
privacy by the Add Health and RTI project staff and not given to unauthorized
persons. Extensive security procedures are in place to make sure that
participants’ answers are not linked to their names. Your name will not be
reported with your responses or test results; these data are labeled only with an
ID number. Similarly, no names or other identifying information will be stored
with the DNA or blood drops. Only an ID number will identify the samples. This
ID number is different from the one associated with your interview responses.
Researchers and technicians at the laboratories that will conduct the tests do
not have access to your identifying information.
• After data collection and cleaning are completed, your identifying information will
be electronically transmitted to the Add Health Security Manager in Canada, and
your interview answers without identifying information will be sent to the Add
Health project staff at the University of North Carolina at Chapel Hill. Storing
different types of information in different places helps keep your identity
confidential.
42. Add Health Co-Funders
• National Institute of Child Health and Human Development
• National Cancer Institute*
• National Center for Health Statistics, Centers for Disease Control and Prevention, DHHS
• National Center for Injury Prevention and Control, Centers for Disease Control and Prevention, DHHS*
• National Center for Minority Health and Health Disparities*
• National Institute of Allergy and Infectious Diseases*
• National Institute of Deafness and Other Communication Disorders*
• National Institute of General Medical Sciences
• National Institute of Mental Health
• National Institute of Nursing Research*
• National Institute on Aging*
• National Institute on Alcohol Abuse and Alcoholism*
• National Institute on Drug Abuse*
• National Science Foundation*
• Office of AIDS Research, NIH*
• Office of the Assistant Secretary for Planning and Evaluation, DHHS*
• Office of Behavioral and Social Sciences Research, NIH*
• Office of the Director, NIH
• Office of Minority Health, Centers for Disease Control and Prevention, DHHS
• Office of Minority Health, Office of Public Health and Science, DHHS
• Office of Population Affairs, DHHS*
• Office of Research on Women's Health, NIH*
*Wave IV co-funders
Wave V co-funders