This document outlines the final project assignment, which revisits and analyzes datasets from previous weekly assignments. It gives instructions to analyze mortality data with all cancer causes pooled together, analyze birth data by racial group and by month for seasonality, evaluate a diagnostic test for liver cancer using biomarker data, test for period effects in a crossover trial, and analyze differences in physical and cognitive variables by gender and birth order using youth data. It also specifies the requirements for completing and formatting the final paper.
This is the final project:
In this final assignment, we will revisit datasets that we have
utilized in previous assignments, but with new objectives.
In the Week One assignment, you looked at mortality in your
particular state, with two different metrics: the first was
numbers of deaths, and the second was years of life lost. For
this question, return to the original dataset, but this time first
pool all cancer causes of death together, so that cancer constitutes the only
category for cause of death. Then, repeat your analyses from
Week One. How do your conclusions change?
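The pooling step can be sketched as follows. This is a minimal illustration, not the required method: the record layout, the cause labels, and every number are hypothetical placeholders, and the function name `pool_cancer` is invented for this example. Your state's dataset may label cancers differently (for example, as "malignant neoplasms"), in which case the matching keyword would need to change.

```python
from collections import defaultdict

# Hypothetical records: (cause of death, deaths, years of life lost).
# These values are placeholders, not real state data.
records = [
    ("Lung cancer", 120, 1800.0),
    ("Breast cancer", 60, 1100.0),
    ("Heart disease", 300, 3200.0),
    ("Colon cancer", 45, 700.0),
    ("Stroke", 90, 900.0),
]

def pool_cancer(rows, keyword="cancer", pooled_label="Cancer (all sites)"):
    """Merge every cause whose name contains `keyword` into one category."""
    totals = defaultdict(lambda: [0, 0.0])
    for cause, deaths, yll in rows:
        label = pooled_label if keyword in cause.lower() else cause
        totals[label][0] += deaths
        totals[label][1] += yll
    return {label: tuple(values) for label, values in totals.items()}

pooled = pool_cancer(records)

# Rank the causes by each of the two Week One metrics.
by_deaths = sorted(pooled, key=lambda c: pooled[c][0], reverse=True)
by_yll = sorted(pooled, key=lambda c: pooled[c][1], reverse=True)
```

In this made-up example the two rankings disagree after pooling (heart disease leads on deaths, pooled cancer leads on years of life lost), which is the kind of change in conclusions the question asks you to look for.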
In the Week Two assignment, you looked at sex ratios for births
in your state.
Take the data you have assembled from the second part of your
Week Two assignment, namely, numbers of first-born boy and
girl births in your state between 2007 and 2012, separately by
racial group (i.e., American Indians, Asians, Blacks, and
Whites). Form a two-by-four contingency table from these data:
the two row categories are female (girl) and male (boy), and the
four column categories are the four racial groups. Calculate the
chi-square statistic from this contingency table, and interpret
the result.
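As a sketch of the calculation, the chi-square statistic for a contingency table sums (observed - expected)^2 / expected over all cells, where each expected count is (row total * column total) / grand total. The counts below are placeholders, not CDC Wonder data, and the function name is invented for this example.

```python
def chi_square_independence(table):
    """Return (chi-square statistic, degrees of freedom) for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Placeholder counts. Columns: American Indian, Asian, Black, White.
table = [
    [480, 950, 2900, 14500],   # girls
    [520, 1050, 3100, 15500],  # boys
]
stat, df = chi_square_independence(table)

# Upper 5% critical value of the chi-square distribution with 3 degrees of freedom.
CRITICAL_05_DF3 = 7.815
reject_null = stat > CRITICAL_05_DF3
```

A two-by-four table has (2 - 1) * (4 - 1) = 3 degrees of freedom. A statistic above 7.815 would suggest, at the 5% level, that the sex ratio of first births differs across the racial groups.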
Return to the CDC Wonder website, and obtain the numbers of births in your state between 2007
and 2012, by month. (Disregard gender, race, and birth order;
you want all births.) Calculate a chi-square statistic to assess
whether there is any seasonality to births. (Your null hypothesis
is that births should be equally likely to occur in any of the 12
months. We are ignoring the varying lengths of the months to
simplify calculations.) How would you interpret your findings?
Explain in 500 words in APA format supported by scholarly
sources.
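The seasonality test described above is a chi-square goodness-of-fit test against equal expected counts in all 12 months. A minimal sketch follows; the monthly counts are placeholders, not CDC Wonder data, and the function name is invented for this example.

```python
def chi_square_uniform(counts):
    """Goodness-of-fit statistic against equal expected counts in every category."""
    total = sum(counts)
    expected = total / len(counts)
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, len(counts) - 1

# Placeholder birth counts for January through December (not real data).
monthly_births = [
    9800, 9100, 9900, 9600, 10000, 10100,
    10600, 10700, 10400, 10100, 9600, 9900,
]
stat, df = chi_square_uniform(monthly_births)

# Upper 5% critical value of the chi-square distribution with 11 degrees of freedom.
CRITICAL_05_DF11 = 19.675
reject_null = stat > CRITICAL_05_DF11
```

With counts in the tens of thousands, even modest month-to-month differences can produce a very large statistic, so it is worth reporting the size of the deviations as well as whether the null hypothesis is rejected.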
BONUS:
Give a graphical representation of your findings for this
portion highlighting what you consider significant.
In the Week Three assignment, you were given levels of tumor-
associated antigens in a sample of 90 normal (non-cancer)
individuals, and 160 hepatocellular carcinoma (HCC) patients.
Here is a proposed diagnostic test for HCC:
For each individual, calculate a numerical score:
score = -3.95 + 10.7 * HCC1 - 4.14 * P16 + 13.95 * P53 + 28.92
* P90 + 6.48 * survivin
(This equation was derived from logistic regression.)
If this score is positive (i.e., > 0), diagnose this individual as an
HCC patient; if this score is negative (i.e., < 0), diagnose this
individual as normal (i.e., non-cancer).
Apply this rule to the entire cohort of 250 individuals. Report
the sensitivity of this rule, the specificity, the false positive
rate, the false negative rate, and the overall accuracy. Do you
think the score function provides a good diagnostic test for
HCC? Explain.
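One way to apply the rule and tabulate the requested measures is sketched below. The coefficients come from the score equation above; the data layout (a dictionary of antigen levels paired with a true-status flag) and the function names are assumptions made for this illustration. The rule as stated does not say what to do with a score of exactly 0, so this sketch treats it as normal.

```python
INTERCEPT = -3.95
COEFFS = {"HCC1": 10.7, "P16": -4.14, "P53": 13.95, "P90": 28.92, "survivin": 6.48}

def score(antigens):
    """Score from the assignment: intercept plus weighted antigen levels."""
    return INTERCEPT + sum(coef * antigens[name] for name, coef in COEFFS.items())

def evaluate(cohort):
    """cohort: list of (antigen dict, has_hcc bool). Returns the five measures."""
    tp = fp = tn = fn = 0
    for antigens, has_hcc in cohort:
        # Assumption: a score of exactly 0 is classified as normal.
        predicted_hcc = score(antigens) > 0
        if predicted_hcc and has_hcc:
            tp += 1
        elif predicted_hcc:
            fp += 1
        elif has_hcc:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": tp / (tp + fn),           # HCC patients correctly flagged
        "specificity": tn / (tn + fp),           # normals correctly cleared
        "false_positive_rate": fp / (fp + tn),   # 1 - specificity
        "false_negative_rate": fn / (fn + tp),   # 1 - sensitivity
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }
```

Note that the false positive rate and false negative rate are the complements of specificity and sensitivity, so only the two-by-two table of true status against predicted status is needed to report all five numbers.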
In the Week Four assignment, we considered a simple two-by-
two crossover trial of a new experimental treatment for
interstitial cystitis. We calculated t tests for carryover and
treatment effects, but we have not yet considered period effects.
It is unlikely that there are any period effects in this trial, but
we may want to test this formally. If there were a period effect,
then patient responses under either treatment would likely be
systematically higher in one period than the other. (Here's an
analogy: Think of taking the same test twice. You would likely
perform better on the test the second time, since you have
learned from your experience of taking the first test.) Explain
how you would devise a t test for assessing a period effect in
this trial. (Hint: look at the explanation of the t test for
treatment effects given in the Week Four assignment. There, we
based the test on the random variable X - Y. Suppose we look
instead at X + Y?)
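One possible reading of the hint is sketched below. It assumes that X and Y are the mean within-patient differences (period 1 minus period 2) in the two sequence groups, which is the usual setup for a two-by-two crossover; the Week Four material is not reproduced here, so check this against its definitions. Under that assumption, and with no carryover, X - Y isolates the treatment effect while X + Y isolates the period effect. Because the two groups are independent, X + Y has the same variance as X - Y, so the same pooled standard error applies. The function name and the data layout are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

def period_effect_t(diffs_seq1, diffs_seq2):
    """t statistic for a period effect in a two-by-two crossover.

    diffs_seq1, diffs_seq2: per-patient (period 1 - period 2) differences
    in the two sequence groups (assumed layout, see the note above).
    """
    n1, n2 = len(diffs_seq1), len(diffs_seq2)
    pooled_var = (
        (n1 - 1) * variance(diffs_seq1) + (n2 - 1) * variance(diffs_seq2)
    ) / (n1 + n2 - 2)
    standard_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    # Sum of the group means, where the treatment test would use the difference.
    t = (mean(diffs_seq1) + mean(diffs_seq2)) / standard_error
    return t, n1 + n2 - 2
```

The statistic would be compared with a t distribution on n1 + n2 - 2 degrees of freedom. Equivalently, one could flip the sign of the differences in the second group and run an ordinary two-sample t test.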
In the Week Five assignment, you investigated measures of
brain size and intelligence in a sample of 20 youths. A potential
shortcoming of your prior analyses is that you did not take into
account all available information in the dataset, in particular,
gender. Answer the following questions and explain your
answers:
Do any of the physiologic variables CCSA, HC, TOTSA,
TOTVOL, and WEIGHT differ significantly between males and
females?
Do IQs differ significantly by gender?
Undertake a paired analysis of IQs, in order to assess whether
firstborns have higher IQs than non-firstborns. In this regard,
there are 10 pairs of related youths, as denoted by the variable
PAIR.
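The two kinds of comparison asked for here can be sketched as follows: an unpaired two-sample test for the gender questions and a paired test for the birth-order question. This is a sketch under assumptions, not the required method: it uses Welch's version of the two-sample t test (which does not assume equal variances), and the function names and list-based inputs are invented for the illustration.

```python
from math import sqrt
from statistics import mean, stdev, variance

def welch_t(a, b):
    """Two-sample t statistic and approximate degrees of freedom (Welch)."""
    va, vb = variance(a) / len(a), variance(b) / len(b)
    t = (mean(a) - mean(b)) / sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (len(a) - 1) + vb ** 2 / (len(b) - 1))
    return t, df

def paired_t(firstborn, non_firstborn):
    """Paired t statistic; the two lists must be in the same PAIR order."""
    diffs = [x - y for x, y in zip(firstborn, non_firstborn)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1
```

For example, `welch_t(male_values, female_values)` would be called once per variable (CCSA, HC, TOTSA, TOTVOL, WEIGHT, and IQ), and `paired_t(firstborn_iq, non_firstborn_iq)` once for the 10 pairs. With 10 pairs the paired test has 9 degrees of freedom, and because the question is directional (firstborns higher), a one-sided comparison against the critical value 1.833 at the 5% level is one reasonable choice.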
Completing the Final Project
The Final Project:
Must include a title page with the following:
Title of paper
Student’s name
Course name and number
Instructor’s name
Date submitted
Must begin with an introductory paragraph that has a succinct
thesis statement.
Must address the topic of the paper with critical thought.
Must end with a conclusion that reaffirms your thesis.
Must use at least three scholarly, peer-reviewed sources
published within the last five years (not including the course
text) or those applicable to the data sets.
Must document all sources in APA style, as outlined in the
Ashford Writing Center.
Must include a separate reference page, formatted according to
APA style as outlined in the Ashford Writing Center. The
number of pages must be applicable to the specific data sets
outlined in the Final Project assignment.