Causality Introspection Through Regression Variable Relationship Analysis
Causality Introspection through Regression Variable Relationship Analysis
by Edgardo Donovan
RES 600 – Dr. Yufeng Tu
Module 5 – Case Analysis
Monday, December 15, 2008
RES600 – Introductory Data Analysis
After 4 decades of severe criticism, the ritual of null hypothesis significance testing—mechanical dichotomous decisions around a sacred .05 criterion—still persists.
JACOB COHEN, American Psychologist, 1994

Formal hypotheses and significance testing of null hypotheses, when applied within the boundaries of their limitations, remain the most effective and practical method for discerning causality by using regression to model relationships among variables. The method in no way guarantees that a researcher will deliver valid and reliable conclusions. However, it does provide a rigid framework, useful in dealing with quantitative research data, that can be understood universally. Researchers rely on this method in the same manner that physicists rely on mathematical equations: to operationalize and convey a sense of reason, order, and predictability to what may otherwise seem abstract ideas, variables, or theories.

Null-hypothesis testing can help us find examples of variable correlation and covariance, which can be useful in understanding the relationship dynamics of the phenomena we are trying to study. Covariance, being the measure of how much two variables change together, is more difficult to ascertain and may require both cross-sectional and longitudinal analysis, because negative or positive correlations may appear in part due to a number of related qualifying variables. The interpretation of the magnitude of correlations and covariances can be helpful in understanding the
statistical dynamics of certain phenomena. A correlation of .8 compared to one of .2 is not necessarily four times more important, but it does show a stronger linear positive or negative relationship. Effect sizes, being measures of the strength of the relationship between two variables, are relationship-strength indicators but do not necessarily define the totality of the interaction between two variables.

The best starting point for attempting to prove actual causation is proving the existence of a mechanism. Finding and examining statistical correlations and covariation of varying effects may be helpful in that task. The latter most of the time do not imply a mechanism relative to the research at hand, but they constitute an opportunity to examine the dynamics among a set of related variables. Perhaps no example better illustrates this than smoking and cancer/heart disease. Despite all of the statistical evidence, the causal relationship between smoking and disease will not be nailed down by the numbers but by the identification of the substances in tobacco that trigger the diseases (Dallal 3).

The fundamental difference between cross-sectional and longitudinal studies is that cross-sectional studies take place at a single point in time, whereas a longitudinal study involves a series of measurements taken over a period of time. Finding the proverbial "needle in the haystack" requires systematic cross-sectional and longitudinal research approaches. Certain correlations between variables may appear to be mechanisms yet may not be reproducible at different times, owing to the unknown effects of variables yet to be quantified. Longitudinal research methodologies that sample over long-term time periods are designed to sift out coincidental one-time events. Cross-sectional research methodology is designed to prevent oversampling a particular area
so as to avoid "cherry picking" or other forms of research myopia geared towards supporting a hypothesis rather than uncovering the truth about a target phenomenon.

M. D. Gall, in his 2001 paper "Figuring Out the Importance of Research Results: Statistical Significance versus Practical Significance," presented at the 2001 annual meeting of the American Educational Research Association, harshly criticizes the null-hypothesis status quo. He studies the impact of different approaches to educational testing. He affirms that significance and effect sizes are not enough and that additional reference points are needed to ascertain the value of the results. In the case of educational testing analysis, Gall proposes scoring-guide scale comparison of a sample to a meaningful reference group as a way of better contextualizing and operationalizing data. Even if one may disagree with the author's overall premise, many will agree that data and analytical hypotheses need to be sound from a practical standpoint if they are to be successful in conjunction with null-hypothesis utilization.

A good example of how a traditional null-hypothesis study failed, due largely to faulty practical design, is a study entitled "Internet and Society: A Preliminary Report" by Norman Nie and Lutz Erbring. This research fails to provide representative conclusions not because of null-hypothesis testing but because of data quality. Rather than utilizing traditional means of conducting surveys, the authors relied upon a novel survey application named "Intersurvey" (Nie page). Intersurvey was an application exclusively bundled with WebTV, which attempted to market Internet set-top boxes throughout the mid-to-late nineties. Set-top boxes were priced between $300 and $500, relatively cheaper than typical personal computers at the time. The demographic
targets for these products at the time were low-income households and/or people who wanted to access the Internet without having to learn basic computer operation skills. Limiting oneself to a very distinct demographic when attempting to analyze a quasi-universal phenomenon cannot be deemed an effective example of sampling and is prone to result in huge inconsistencies.

It is very important that a researcher attempt to utilize statistical significance to better formulate an understanding of the dynamics of a particular phenomenon. Although it has great predictive value and is useful in uncovering statistical mechanisms (Dallal 1), if taken to careless extremes null-hypothesis testing may take the form of an impractical pseudoscience (Johnson 1). Some researchers have argued that tests of statistical significance are confusing and should be discouraged or perhaps even banned by journals (Armstrong 2). They contend that even when done properly, statistical significance tests are of no value, and that such tests are harmful to the development of scientific knowledge because they distract researchers from the use of proper methods.

The existence of alternative research strategies revolving around large effect sizes and chance outcomes, coupled with the inability of null-hypothesis testing to guarantee successful research, does not justify entirely discrediting null-hypothesis testing, thereby throwing out the baby with the bathwater (Levin 49). Large effect sizes treat the research of localized and universal theories unevenly, giving more importance to the latter. Chance outcomes are just that and are difficult to predict in terms of probability. Ideally, we would like to see a situation where all studies were adequately designed, controlled, and free of issues concerning null-hypothesis testing. Until then, we will have to accept that null-hypothesis testing is not a guarantee of success but is useful because it is impartial, measurable, and duplicable across a wide spectrum of research. One of the reasons for its popularity and universal acceptance in society is that it appeals to the facet of the human psyche that longs to render the surrounding world less mysterious, more discernible, and less unpredictable so that it can be managed more effectively (Levin 46). Operationalizing concepts into qualitative variables, extending that process into quantitative data-gathering, and conducting null-hypothesis analysis conveys a sense of order to what may otherwise seem abstract ideas or theories.

Part of operationalizing abstract ideas involves bringing a sense of mathematical proportion to the quantitative data utilized in null-hypothesis analysis. One such technique, a priori or post-hoc power analysis, is conducted to determine the sample size needed to achieve an acceptable power level. The power of a statistical test is the probability that the test will reject a false null hypothesis. As power increases, the chances of error decrease, and vice versa (Myoung 4). Factors affecting power include data sample size, data variance, and data statistical significance. This technique may seem banal when attempting to prove a hypothesis that involves a small group or niche of circumstances. An example of this could be a study involving the popularity or mindshare of a particular restaurant in a small town. However, when sampling is employed in an attempt to prove universal connotations, such as in the aforementioned study involving North American Internet usage, having the appropriate power is critical.

Null-hypothesis testing is sometimes perceived by its critics as a method that favors quantitative over qualitative factors. Although it is possible for it to be misused towards
that end, null-hypothesis testing can be effectively used to dig deeper into seemingly straightforward theories and to extrapolate a variety of qualitative questions, thereby adding depth and variety. In G. Blount-Nuss's work "G. I. Average Joe: The Clothes Do Not Necessarily Make the Man," published in the Journal of Articles in Support of the Null Hypothesis, the authors attempt to prove that men in uniform are perceived to be more attractive than when presented in civilian attire. Although the data in the study generally supported the main hypothesis, the extreme variances in the limited data samples do not make for adequate results that can be parlayed into an indicator of universal phenomena. What is particularly interesting is that the data variances raised questions pertaining to the attraction effect of civilian attire taste, the attractiveness of a short military haircut with civilian attire, the attractiveness of serious facial expressions with civilian attire, etc. (Blount-Nuss 9).

Formal hypotheses and significance testing of null hypotheses, when applied within the boundaries of their limitations, remain the most effective and practical method for discerning causality by using regression to model relationships among variables. The method in no way guarantees that a researcher will deliver valid and reliable conclusions. However, it does provide a rigid framework, useful in dealing with quantitative research data, that can be understood universally. Researchers rely on this method in the same manner that physicists rely on mathematical equations: to operationalize and convey a sense of reason, order, and predictability to what may otherwise seem abstract ideas, variables, or theories.
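The statistical ideas this paper leans on most heavily, covariance, correlation, null-hypothesis testing, and power, can be illustrated with a short simulation. The Python sketch below is not part of the original study; the sample sizes, the target correlation of .5, and the use of a Fisher z-test for the null hypothesis of zero correlation are illustrative choices. It estimates power empirically: the fraction of repeated studies of a given size that reject a false null hypothesis at the .05 level.

```python
import math
import random
import statistics

random.seed(2008)

def pearson(x, y):
    """Sample covariance and Pearson correlation of two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    r = cov / (statistics.stdev(x) * statistics.stdev(y))
    return cov, r

def simulate_pair(n, rho):
    """Draw n (x, y) pairs whose population correlation is rho."""
    x = [random.gauss(0, 1) for _ in range(n)]
    y = [rho * a + math.sqrt(1 - rho ** 2) * random.gauss(0, 1) for a in x]
    return x, y

def p_value(r, n):
    """Two-sided p-value for H0: rho = 0, via the Fisher z-transform."""
    z = math.atanh(r) * math.sqrt(n - 3)
    return 1 - math.erf(abs(z) / math.sqrt(2))

def power(n, rho, alpha=0.05, trials=1000):
    """Fraction of repeated studies that reject the (false) null at level alpha."""
    hits = 0
    for _ in range(trials):
        x, y = simulate_pair(n, rho)
        _, r = pearson(x, y)
        if p_value(r, n) < alpha:
            hits += 1
    return hits / trials

p_small = power(10, 0.5)   # few observations: the real effect is often missed
p_large = power(50, 0.5)   # same effect, larger sample: power rises sharply
print(f"power at n=10: {p_small:.2f}, at n=50: {p_large:.2f}")
```

The simulation makes the paper's point about sample size concrete: the underlying effect (a true correlation of .5) is identical in both conditions, yet the small-sample study fails to reject the false null most of the time, while the larger study rejects it reliably. An a priori power analysis runs this reasoning in reverse, choosing n so that power reaches an acceptable level before the data are collected.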
Bibliography

Armstrong, Scott. (2005). Significance tests harm progress in forecasting. The Wharton School, University of Pennsylvania.

Blount-Nuss, G., Cate, K. L., & Lattimer, H. (2006). G. I. Average Joe: The clothes do not necessarily make the man. Journal of Articles in Support of the Null Hypothesis, Vol. 4, No. 1.

Cohen, Jacob. (1994). The earth is round (p < .05). American Psychologist, 49(12), 997-1003.

Dallal, Gerard. (2000). Cause & effect. TUI University.

Dallal, Gerard. (2006). The most important lesson you'll ever learn about multiple linear regression analysis. TUI University.

Gall, M. D. (2001). Figuring out the importance of research results: statistical significance versus practical significance. University of Oregon.

Johnson, Douglas. (1998). Hypothesis testing: statistics as pseudoscience. Presented at the Fifth Annual Conference of the Wildlife Society, Buffalo, New York, 26 September 1998.

Levin, Joel. (1998). What if there were no more bickering about statistical significance tests? Research in the Schools, Vol. 5, No. 2, 43-53.
Nie, Norman. (2004). Internet and society: a preliminary report. Stanford Institute for the Quantitative Study of Internet and Society.

Park, Hun Myoung. (2005). Understanding the statistical power of a test. UITS Center for Statistical and Mathematical Computing.