This document discusses interaction, which occurs when the effect of one variable depends on the level of another variable. It defines quantitative versus qualitative interaction and notes that qualitative interaction indicates interaction is present on both additive and multiplicative scales. An example of qualitative interaction between caffeine consumption and smoking on time to conception is provided. The document also discusses the reciprocity of interaction, how interaction can be distinguished from confounding, and challenges in interpreting observed heterogeneity, including the possibilities of random variability, confounding, bias, and differences in exposure levels across groups.
Defining and Assessing Heterogeneity of Effects: Interaction
1. Defining and Assessing Heterogeneity of Effects: Interaction
Sunhong Kwon
School of Pharmacy, SKKU
2. 6.7 The Nature and Reciprocity of Interaction
6.7.1 Quantitative Versus Qualitative Interaction
• Quantitative interaction: the association between factor A and outcome Y is in the same direction in each stratum formed by Z, but the strength of the association varies across strata.
• Qualitative interaction: the effects of A on outcome Y are in opposite directions (crossover) according to the presence of the third variable Z, or an association is present in one of the strata formed by Z but not in the other.
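The distinction above can be made mechanical. A minimal Python sketch (the stratum-specific relative risks are hypothetical, invented only for illustration) classifies the pattern from two stratum-specific estimates:

```python
# Classify the interaction pattern between exposure A and outcome Y
# across two strata of a third variable Z, from stratum-specific
# relative risks (RRs). The example RRs below are hypothetical.

def interaction_type(rr_stratum1: float, rr_stratum2: float) -> str:
    null = 1.0  # an RR of 1.0 means no association
    if rr_stratum1 == rr_stratum2:
        return "no interaction (homogeneous effect)"
    # Crossover, or an association in one stratum but none in the other
    if (rr_stratum1 - null) * (rr_stratum2 - null) <= 0:
        return "qualitative interaction"
    # Same direction, different strength
    return "quantitative interaction"

print(interaction_type(2.0, 4.0))  # same direction -> quantitative
print(interaction_type(2.0, 0.5))  # crossover -> qualitative
print(interaction_type(2.0, 1.0))  # effect in one stratum only -> qualitative
```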
3. Example of qualitative interaction
• To examine the effects of caffeine consumption (the main exposure of interest) on waiting time to conception.
• The point estimates of the effects of high caffeine consumption appear to cross over as a function of smoking: there is a positive association of high caffeine intake with delayed conception in nonsmokers and a negative association in smokers.
4. When qualitative interaction is present:
• It is always present in both the additive and the multiplicative models and is thus independent of the measurement scale; the scale does not need to be specified.
• The occurrence of qualitative interaction indicates that interaction is present on both scales.
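A small numeric sketch (hypothetical risks) shows why: with a crossover, both the risk difference (additive scale) and the risk ratio (multiplicative scale) change direction across strata, so no choice of scale removes the heterogeneity.

```python
# Hypothetical absolute risks of the outcome:
# (risk in exposed, risk in unexposed) within each stratum of Z.
strata = {"Z absent": (0.20, 0.10), "Z present": (0.05, 0.10)}

results = {}
for name, (r_exp, r_unexp) in strata.items():
    rd = r_exp - r_unexp   # additive scale: risk difference
    rr = r_exp / r_unexp   # multiplicative scale: risk ratio
    results[name] = (rd, rr)
    print(f"{name}: RD = {rd:+.2f}, RR = {rr:.1f}")

# Z absent:  RD = +0.10, RR = 2.0 (exposure harmful)
# Z present: RD = -0.05, RR = 0.5 (exposure protective)
```

The effect flips sign on the additive scale and crosses 1.0 on the multiplicative scale at the same time, which is exactly the scale-independence the slide describes.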
5. • The figure implies the presence of an additive interaction between hypertension status and anger proneness in relation to the risk of CHD.
• A difference was shown in normotensives but not among hypertensives, i.e., qualitative interaction.
• It must therefore also be present on a multiplicative scale.
(Age-adjusted relative hazards comparing individuals with high and lower anger scores: 2.97 vs. 1.05.)
6. 6.7.2 Reciprocity of Interaction
• If Z modifies the effect of A, then A modifies the effect of Z: interaction is completely reciprocal.
• The choice of A as the suspected risk factor of interest and Z as the potential effect modifier is arbitrary and a function of the hypothesis being evaluated.
• When deciding which variable should be treated as the effect modifier and which as the factor of primary interest, there is no intrinsic hierarchical value.
(Table 6-22: either variable can be taken as the risk factor of interest and the other as the effect modifier; the association is positive in one stratum and negative in the other.)
7. 6.8 INTERACTION, CONFOUNDING EFFECT AND ADJUSTMENT
• Although on occasion the same variable may be both a confounder and an effect modifier, confounding and interaction are generally distinct phenomena.
• Confounding effects are undesirable, as they make it difficult to evaluate whether a statistical association is also causal.
• Interaction is part of the web of causation and may have important implications for prevention.
• When a variable is found to be both a confounding variable and an effect modifier, adjustment for this variable is contraindicated.
– This is because, when there is interaction, the notion of an overall adjusted (weighted) mean value (main effect) makes little sense.
– Examples:
• OR of 2.0 for men and 25.0 for women: the average is meaningless.
• OR of 0.3 for men and 3.5 for women: the "average, gender-adjusted" OR may denote no association.
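The arithmetic behind the second example can be sketched in Python, using the ORs from the slide (0.3 for men, 3.5 for women); the equal weighting of the two strata is an assumption made only for illustration:

```python
import math

or_men, or_women = 0.3, 3.5  # stratum-specific odds ratios from the slide

# Adjusted (pooled) estimates are weighted means on the log-odds scale;
# with equal weights this is just the geometric mean of the two ORs.
pooled = math.exp((math.log(or_men) + math.log(or_women)) / 2)
print(f"pooled OR = {pooled:.2f}")  # about 1.02, i.e. apparently no association
```

The pooled value sits essentially at the null even though the exposure is strongly protective in one stratum and strongly harmful in the other, which is why adjustment is contraindicated here.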
8. • Regardless of whether a "Z-adjusted" effect is reported, it is often informative to report the stratum-specific values as well.
• One solution: carry out statistical testing and do not adjust if the homogeneity null hypothesis is rejected.
9. 6.8.1 Joint Presence of Two Factors that Interact as a Confounding Variable
• When there is interaction, the joint presence of the variables that interact may produce a confounding effect, even if each individual variable is not identified as a confounder.
• Because the prevalence of the joint presence of B and C is higher in those exposed to A and because, in addition, there is strong interaction between B and C, the crude incidence is greater in the individuals exposed to A than in the unexposed.
10. 6.9 STATISTICAL MODELING AND STATISTICAL TESTS FOR INTERACTION
• To examine interaction:
– Use complex statistical approaches to evaluate interaction, e.g., a regression equation including "interaction terms".
– Assess whether an observed heterogeneity is statistically significant.
• Statistical tests of homogeneity are not sufficient to evaluate interaction fully.
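One classical test of the homogeneity null is Woolf's test, which compares stratum-specific log odds ratios weighted by their inverse variances. A self-contained sketch, with hypothetical 2x2 counts chosen to show a clear crossover:

```python
import math

# Each stratum is a 2x2 table (a, b, c, d) = (exposed cases,
# exposed controls, unexposed cases, unexposed controls).
# The counts are hypothetical.
strata = [(40, 60, 20, 80),   # OR = (40*80)/(60*20), about 2.67
          (10, 90, 30, 70)]   # OR = (10*70)/(90*30), about 0.26

log_ors, weights = [], []
for a, b, c, d in strata:
    log_ors.append(math.log((a * d) / (b * c)))
    weights.append(1 / (1/a + 1/b + 1/c + 1/d))  # 1 / var(ln OR)

# Woolf's statistic: weighted sum of squared deviations of the
# stratum log ORs around their weighted mean.
mean = sum(w * l for w, l in zip(weights, log_ors)) / sum(weights)
chi2 = sum(w * (l - mean) ** 2 for w, l in zip(weights, log_ors))
print(f"Woolf chi-square = {chi2:.1f} on {len(strata) - 1} df")
# With 1 df, values above 3.84 reject homogeneity at the 5% level,
# i.e. they point to interaction on the multiplicative scale.
```

As the slide notes, such tests are necessary but not sufficient: a non-significant result in small strata does not rule out interaction, and significance alone says nothing about its plausibility.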
11. 6.10 INTERPRETING INTERACTION
6.10.1 Heterogeneity Due to Random Variability
• Random variability:
– may be produced by the stratification by a suspected effect modifier.
– may occur in spite of an a priori specification of interaction in the context of the hypothesis to be evaluated.
– A more common situation is when interaction is not specified a priori but the investigator decides to carry out subgroup analysis.
• Sample size inevitably decreases as more strata are created in subgroup analysis, making it likely that heterogeneity will occur by chance alone.
• The detection of heterogeneity should be assessed vis-à-vis its plausibility.
• After being observed by means of subgroup analysis, interaction has to be confirmed in a study especially designed to evaluate it.
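This can be illustrated with a short simulation (all numbers hypothetical): one population with a single true risk ratio, carved into small subgroups, yields subgroup estimates that scatter widely by chance alone.

```python
import random

random.seed(1)  # fixed seed so the run is reproducible
true_risk_exposed, true_risk_unexposed = 0.20, 0.10  # true RR = 2.0 everywhere

def subgroup_rr(n_per_arm: int) -> float:
    """Estimate the RR in one small subgroup by simulating outcomes."""
    cases_exp = sum(random.random() < true_risk_exposed for _ in range(n_per_arm))
    cases_unexp = sum(random.random() < true_risk_unexposed for _ in range(n_per_arm))
    # Guard against division by zero when no unexposed cases occur
    return (cases_exp / n_per_arm) / max(cases_unexp / n_per_arm, 1e-9)

rrs = [subgroup_rr(50) for _ in range(10)]  # ten small "subgroups"
print([round(rr, 1) for rr in rrs])
# Every subgroup shares the same true RR of 2.0, yet the estimates
# vary from subgroup to subgroup; read naively, this scatter looks
# like effect modification.
```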
12. 6.10.2 Heterogeneity Due to Confounding
• When associations between A and Y in strata formed by Z are being explored, differential confounding effects across strata may be responsible for the heterogeneity of effects.
13. • The possibility that interaction may be explained partially or entirely by a confounding effect makes it essential to adjust for potential confounders when assessing interaction.
• In most real-life instances, confounding may either exaggerate or decrease heterogeneity.
14. 6.10.3 Heterogeneity Due to Bias
• The observed heterogeneity may also result from differential bias across strata.
• Example: when stratification according to educational status was undertaken, the apparent decreased risk of miscarriage in blacks was seen only in the lower educational strata. This pattern of an apparent modification of the race effect by educational level is probably due to underascertainment bias operating only in less educated blacks.
15. • Example of possible misclassification resulting in apparent interaction: an earlier, aggressive treatment of preeclampsia in those with "high" prenatal care may be the explanation for the lesser increase in the gestational diabetes-related odds of severe eclampsia; on the other hand, the authors also suggested that, in those with a low level of care, preexisting diabetes may have been misclassified as gestational, which may have artificially increased the strength of the association in these individuals.
16. • Example of heterogeneity due to information bias: validity levels differ between smokers and nonsmokers.
17. 6.10.4 Heterogeneity Due to Differential Intensity of Exposure
• Heterogeneity in the levels of exposure to the risk factor of interest according to the alleged effect modifier.
• Example: the potential effect modification by gender of the relationship of smoking to respiratory diseases may be created or exaggerated by the fact that the level of exposure to smoking is higher in men than in women.
18. 6.10.5 Interaction and Host Factors
• Facilitation and level of exposure are also the result of anatomical or pathophysiological characteristics of the host.
– Example: short-nosed dogs tend to develop lung cancer, whereas long-nosed dogs tend to develop nasal cancer, underscoring the importance of considering the intensity and/or facilitation of exposure when attempting to explain heterogeneity of effects.
– The effective exposure dose is obviously a function of the net result of the amount of "exposure" in the individual's environment.
• Effect modifiers can act on different portals of entry.
– Example: exposure to the same intensity of a skin pathogen (e.g., streptococcus) may cause infection in skin with a rash but not in normal skin.
• The biological mechanism of effect modification can also vary at the metabolic or cellular level (e.g., genetic disorders such as phenylketonuria).
19. 6.11 INTERACTION AND SEARCH FOR NEW RISK FACTORS IN LOW-RISK GROUPS
• The strength of an association measured by a relative difference (e.g., a relative risk) is a function of the relative prevalence of other risk factors.
• The idea of studying "emergent" risk factors in individuals with no known risk factors is on occasion considered in the design of a study.
– It may limit the generalizability of the study findings to the general population, which includes both low- and high-risk individuals.
– Associations that rely on synergism between risk factors may be missed altogether.
• The "low-risk" approach may therefore underestimate the potential impact.
20. 6.12 INTERACTION AND "REPRESENTATIVENESS" OF ASSOCIATIONS
• An important assumption when generalizing results from a study is that the study population should have an "average" susceptibility to the exposure under study with regard to a given outcome.
• When susceptibility is unusual, results cannot be easily generalized (e.g., Swiss children vs. African children).
• Although it is difficult to establish to what extent the susceptibility of a given study population differs from an "average" susceptibility, the assessment of its epidemiological profile (based on well-known risk factors) may indicate how "usual" or "unusual" that population is.
• This strategy is limited because the level of susceptibility to a known risk factor may not be representative of the level of susceptibility regarding the exposure under study.