N University: Office of Strategic Research and Analytics
OBJECTIVE
To understand how variables related to students accepted into N University’s programs
affect their likelihood of actually enrolling in these programs.
DATA
About: We have data from 1268 subjects. Each subject is a student who has been admitted to one of 4
program types at N University: Bachelors Degree, Fast-Track 18-Month BS Degree Completion Blended
Class, Fast-Track 18-Month BS Degree Completion Online Class, and Undergraduate Certificate. The
variables available to us range from demographic information, to their major, to time elapsed between
the application submitted date and the decision date.
Variables of Interest: To ensure that our analysis is applicable to future students, we replaced
factor variables that had a large number of very specific levels with factor variables that had fewer,
more general levels. For example, the variable Country presented an issue because its observed levels
did not cover every country that a future student could come from. By creating a new variable coded
“Domestic” for students from the United States and “International” otherwise, we retained the main
information from the Country variable and ensured that every future student falls into one of its
levels. Factor variables whose levels could not be grouped in a clear way, such as High School, were
omitted. Variables that were the same for every subject, such as Level (all students were listed as
undergraduate), were also omitted.
Missing Values: Missing values of a continuous variable can be filled in with methods such as
substituting the column average, but no comparably simple method applies to categorical variables.
Applicants with a missing value for any of the categorical variables were therefore excluded from the
analysis. After omitting these subjects, 931 of the 1268 subjects remained.
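The exclusion rule above amounts to a complete-case filter on the categorical columns. The sketch below is a minimal Python illustration, not the code used for the report (the analysis itself was run in R). The function name, the toy records, and the convention that a missing value shows up as None or an empty string are all assumptions.

```python
def drop_incomplete(rows, categorical_columns):
    """Keep only the rows that have a value for every categorical column."""
    def is_missing(value):
        # Assumed convention: None or a blank string marks a missing value.
        return value is None or str(value).strip() == ""

    return [
        row for row in rows
        if not any(is_missing(row.get(col)) for col in categorical_columns)
    ]


# Toy records for illustration only; these are not rows from the report's data.
applicants = [
    {"Gender": "Female", "Veteran_YN": "N", "Age": 24},
    {"Gender": None, "Veteran_YN": "Y", "Age": 31},
    {"Gender": "Male", "Veteran_YN": "", "Age": 45},
]
complete_cases = drop_incomplete(applicants, ["Gender", "Veteran_YN"])
print(len(complete_cases))
```

Only the first toy record survives, because each of the other two lacks one categorical value.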
MODELING PROCEDURE
The response variable was Enrolled, which was coded 1 if the variable “Registered for Class” was “Yes”
and 0 if it was “No.” Because this response is binary, we fit a generalized linear model with a
binomial family (a logistic regression), which models the log(odds) of enrolling.
We split the 931 subjects into a test set of 121 randomly selected subjects and the remaining 810
subjects became the training set on which we ran our analysis.
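A random split of this kind can be sketched as follows. This is a Python illustration under stated assumptions rather than the report's own R code: the function name, the seed value, and the use of integer subject IDs are hypothetical. Only the sizes (931 subjects, 121 held out) come from the text.

```python
import random


def split_train_test(subjects, test_size, seed=0):
    """Randomly split subjects into (training set, test set) with a fixed test size."""
    rng = random.Random(seed)      # fixed seed so the split is reproducible
    shuffled = list(subjects)      # copy, so the caller's list is left untouched
    rng.shuffle(shuffled)
    return shuffled[test_size:], shuffled[:test_size]


subject_ids = list(range(931))     # stand-in IDs for the 931 complete cases
train_set, test_set = split_train_test(subject_ids, test_size=121)
print(len(train_set), len(test_set))
```

Fixing the seed is a design choice that makes the evaluation repeatable; the report does not say whether a seed was set.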
We started with the model that contained only an intercept and no predictor variables. From there we
implemented a forward stepwise model selection procedure, using AIC as the criterion for deciding which
variables to keep in the model.
The final model chosen by the procedure is presented in the following section.
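The selection loop described above can be sketched in Python as a greedy search. This is an illustration of forward selection in general, not a reproduction of the R procedure used for the report. The function forward_stepwise, the scoring callback aic_of, and the toy numbers are all hypothetical; in a real run, aic_of would fit a binomial model on the training set and return its AIC (-2 log-likelihood plus twice the number of parameters).

```python
def forward_stepwise(candidates, aic_of):
    """Greedy forward selection driven by AIC.

    candidates: iterable of variable names.
    aic_of: callable mapping a tuple of selected names to the AIC of the
            model containing an intercept plus those variables.
    Returns (selected names in the order added, AIC of the final model).
    """
    selected = []
    remaining = list(candidates)
    best_aic = aic_of(tuple(selected))          # intercept-only model
    while remaining:
        # Score every model that adds exactly one of the remaining variables.
        scored = [(aic_of(tuple(selected + [v])), v) for v in remaining]
        trial_aic, trial_var = min(scored)
        if trial_aic >= best_aic:
            break                               # no single addition lowers AIC
        selected.append(trial_var)
        remaining.remove(trial_var)
        best_aic = trial_aic
    return selected, best_aic


# Stand-in for model fitting: made-up reductions in -2*log-likelihood.
# These numbers are NOT results from the report.
TOY_DROP = {"var_a": 30.0, "var_b": 5.0, "var_c": 1.0}


def toy_aic(selected):
    neg2_loglik = 1000.0 - sum(TOY_DROP[v] for v in selected)
    n_params = len(selected) + 1                # +1 for the intercept
    return neg2_loglik + 2 * n_params


chosen, final_aic = forward_stepwise(["var_a", "var_b", "var_c"], toy_aic)
print(chosen, final_aic)
```

In the toy run, var_c is left out because it improves the fit by less than the AIC penalty of 2 per added parameter.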
THE MODEL
R Output: [coefficient table not reproduced here]
In terms of log(odds):
$$
\begin{aligned}
\log(\text{odds of enrolling}) = {} & 1.301818 \\
& - 1.733925 \cdot [\text{Dom\_InternInternational}] \\
& + 1.478557 \cdot [\text{Prog\_TypeFast-Track 18-Month BS Degree Completion Blended Classes}] \\
& + 0.654839 \cdot [\text{Prog\_TypeFast-Track 18-Month BS Degree Completion Online Classes}] \\
& - 0.249308 \cdot [\text{Prog\_TypeUndergraduate Certificate}] \\
& - 0.671543 \cdot [\text{Campus\_PreferenceNo Preference}] \\
& - 0.536188 \cdot [\text{Campus\_PreferenceOnline}] \\
& + 0.878252 \cdot [\text{Veteran\_YNY}] \\
& - 0.017988 \cdot [\text{Age}] \\
& + 0.425373 \cdot [\text{Hispanic\_Latino\_YNY}] \\
& - 0.251832 \cdot [\text{GenderMale}]
\end{aligned}
$$
In terms of odds:
$$
\begin{aligned}
\text{odds of enrolling} = {} & e^{1.301818} \\
& \times e^{-1.733925 \cdot \text{Dom\_InternInternational}} \\
& \times e^{1.478557 \cdot \text{Prog\_TypeFast-Track 18-Month BS Degree Completion Blended Classes}} \\
& \times e^{0.654839 \cdot \text{Prog\_TypeFast-Track 18-Month BS Degree Completion Online Classes}} \\
& \times e^{-0.249308 \cdot \text{Prog\_TypeUndergraduate Certificate}} \\
& \times e^{-0.671543 \cdot \text{Campus\_PreferenceNo Preference}} \\
& \times e^{-0.536188 \cdot \text{Campus\_PreferenceOnline}} \\
& \times e^{0.878252 \cdot \text{Veteran\_YNY}} \\
& \times e^{-0.017988 \cdot \text{Age}} \\
& \times e^{0.425373 \cdot \text{Hispanic\_Latino\_YNY}} \\
& \times e^{-0.251832 \cdot \text{GenderMale}}
\end{aligned}
$$
We interpret each variable’s coefficient in terms of its effect on the odds of a student enrolling in
their intended program, as that is easier for a general audience to understand; the model in terms of
log(odds) is given above as well. The statistical significance of each coefficient at specific
significance levels is reported so that readers may choose their own significance level and draw their
own conclusions. Each coefficient is interpreted holding all other variables constant.
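To make the two forms of the model concrete, the sketch below evaluates the fitted equation for a single applicant and converts the resulting odds to a probability with p = odds / (1 + odds). It is a Python illustration, not the report's R code. The coefficients are copied from the model above; the dictionary keys, the function names, and the example applicant are assumptions.

```python
import math

# Coefficients copied from the fitted model (log-odds scale).
COEFFICIENTS = {
    "Intercept": 1.301818,
    "Dom_InternInternational": -1.733925,
    "Prog_TypeFast-Track 18-Month BS Degree Completion Blended Classes": 1.478557,
    "Prog_TypeFast-Track 18-Month BS Degree Completion Online Classes": 0.654839,
    "Prog_TypeUndergraduate Certificate": -0.249308,
    "Campus_PreferenceNo Preference": -0.671543,
    "Campus_PreferenceOnline": -0.536188,
    "Veteran_YNY": 0.878252,
    "Age": -0.017988,
    "Hispanic_Latino_YNY": 0.425373,
    "GenderMale": -0.251832,
}


def log_odds(profile):
    """profile maps term names to values: 1 or 0 for indicators, years for Age.
    Terms left out of the profile are treated as 0 (the reference level)."""
    total = COEFFICIENTS["Intercept"]
    for name, value in profile.items():
        total += COEFFICIENTS[name] * value
    return total


def enrollment_probability(profile):
    odds = math.exp(log_odds(profile))
    return odds / (1.0 + odds)


# Hypothetical applicant: a 30-year-old at every reference level (domestic,
# female, Bachelors Degree, Boston Main Campus, not a veteran, not Hispanic/Latino).
applicant = {"Age": 30}
print(round(log_odds(applicant), 6), round(enrollment_probability(applicant), 3))
```

For this applicant only the intercept and the Age term contribute, so the log(odds) is 1.301818 - 0.017988 * 30.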
The coefficient for the Intercept implies that the odds of enrolling for a female student who is
domestic, admitted to the Bachelors Degree program, prefers to complete their program on the Boston
Main Campus, is not a veteran, is not Hispanic/Latino, and is zero years of age are
$e^{1.301818} = 3.675974$. This coefficient has no practical meaning on its own because a person who is
zero years of age cannot apply to the university.
Dom_Intern is a factor variable with two levels, indicating whether a student is domestic or
international.
• Dom_InternInternational
$$
\text{Dom\_InternInternational} =
\begin{cases}
1, & \text{student is international} \\
0, & \text{student is domestic}
\end{cases}
$$
The coefficient for this variable implies that, holding all other variables constant, the odds of an
international student enrolling in their intended program are those of a domestic student multiplied
by a factor of $e^{-1.733925} = 0.1765899$ (about 82% lower). At any common significance level, there
is a statistically significant difference between the odds of an international and a domestic student
enrolling in their intended program.
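The same conversion applies to every coefficient in the model: exponentiating gives the multiplicative factor on the odds, and subtracting 1 gives the percent change. A minimal Python sketch follows (the report's own calculations were done in R); the function names are hypothetical and the coefficient is the Dom_Intern value quoted above.

```python
import math


def odds_factor(coefficient):
    """Multiplicative change in the odds for a one-unit increase in the term."""
    return math.exp(coefficient)


def percent_change_in_odds(coefficient):
    """Same effect expressed as a percent change (negative means lower odds)."""
    return (math.exp(coefficient) - 1.0) * 100.0


factor = odds_factor(-1.733925)
change = percent_change_in_odds(-1.733925)
print(f"{factor:.7f} {change:.1f}%")
```

A factor below 1 (equivalently, a negative coefficient) lowers the odds, and a factor above 1 raises them.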
Prog_Type is a factor variable with four levels, indicating what type of program the student was
admitted into: Bachelors Degree, Fast-Track 18-Month BS Degree Completion Blended Classes,
Fast-Track 18-Month BS Degree Completion Online Classes, and Undergraduate Certificate. Bachelors
Degree is the reference level.
• Prog_TypeFast-Track 18-Month BS Degree Completion Blended Classes
$$
=
\begin{cases}
1, & \text{student's program type is Fast-Track 18-Month BS Degree Completion Blended} \\
0, & \text{otherwise}
\end{cases}
$$
The coefficient for this variable implies that, holding all other variables constant, the odds of a
student in the Fast-Track 18-Month BS Degree Completion Blended program enrolling are those of a
student in the Bachelors Degree program multiplied by a factor of $e^{1.478557} = 4.386611$ (about
4.4 times as high). At a significance level of .001, this difference in odds of enrolling between
students in the Bachelors Degree program and students in the Fast-Track 18-Month BS Degree Completion
Blended program is statistically significant.
• Prog_TypeFast-Track 18-Month BS Degree Completion Online Classes
$$
=
\begin{cases}
1, & \text{student's program type is Fast-Track 18-Month BS Degree Completion Online} \\
0, & \text{otherwise}
\end{cases}
$$
The coefficient for this variable implies that, holding all other variables constant, the odds of a
student in the Fast-Track 18-Month BS Degree Completion Online program enrolling are those of a
student in the Bachelors Degree program multiplied by a factor of $e^{0.654839} = 1.924833$ (about
92% higher). At a significance level of .001, this difference in odds of enrolling between students in
the Bachelors Degree program and students in the Fast-Track 18-Month BS Degree Completion Online
program is statistically significant.
• Prog_TypeUndergraduate Certificate
$$
=
\begin{cases}
1, & \text{student's program type is Undergraduate Certificate} \\
0, & \text{otherwise}
\end{cases}
$$
The coefficient for this variable implies that, holding all other variables constant, the odds of a
student admitted to the Undergraduate Certificate program enrolling are those of a Bachelors Degree
student multiplied by a factor of $e^{-0.249308} = 0.7793399$ (about 22% lower). This difference in
odds of enrolling between students in the Undergraduate Certificate program and students in the
Bachelors Degree program is not statistically significant.
Campus_Preference is a factor variable with three levels, indicating whether a student prefers to
complete their program on the Boston Main Campus, prefers to complete it online, or has no preference.
Boston Main Campus is the reference level.
• Campus_PreferenceNo Preference
$$
=
\begin{cases}
1, & \text{student has no campus preference} \\
0, & \text{otherwise}
\end{cases}
$$
The coefficient for this variable implies that, holding all other variables constant, the odds of a
student who has no campus preference enrolling in their program are those of a student whose campus
preference is the Boston Main Campus multiplied by a factor of $e^{-0.671543} = 0.5109196$ (about 49%
lower). The difference in odds of enrolling between students who prefer the Boston Main Campus and
students who have no preference is not statistically significant.
• Campus_PreferenceOnline
$$
=
\begin{cases}
1, & \text{student's campus preference is online} \\
0, & \text{otherwise}
\end{cases}
$$
The coefficient for this variable implies that, holding all other variables constant, the odds of a
student who prefers to complete their program online enrolling in classes are those of a student whose
campus preference is the Boston Main Campus multiplied by a factor of $e^{-0.536188} = 0.5849739$
(about 42% lower). At a significance level of .001, this difference in odds of enrolling between
students who prefer to complete their program online and students who prefer the Boston Main Campus
is statistically significant.
Veteran_YN is a two-level factor variable indicating the veteran status of a student.
• Veteran_YNY
$$
\text{Veteran\_YNY} =
\begin{cases}
1, & \text{student is a veteran} \\
0, & \text{otherwise}
\end{cases}
$$
The coefficient for this variable implies that, holding all other variables constant, the odds of a
student who is a veteran enrolling in their program are those of a student who is not a veteran
multiplied by a factor of $e^{0.878252} = 2.406689$ (about 2.4 times as high). At a significance level
of .01, the difference in odds of enrolling between students who are and are not veterans is
statistically significant.
Age is a continuous variable indicating the age of the student in years. The ages of the students in
our training data set ranged from 17 to 64. The coefficient for this variable implies that, holding all
other variables constant, for every one-year increase in the age of the student, the odds of that
student enrolling in their intended program are multiplied by a factor of $e^{-0.017988} = 0.9821728$
(about 1.8% lower per year). At a significance level of .01, this change in odds of enrolling for each
additional year of age is statistically significant.
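Because the Age effect is multiplicative on the odds scale, the factor for an age gap of more than one year follows by raising the one-year factor to that power. As a worked illustration (the ten-year gap is an arbitrary example, not a figure from the analysis), with all other variables held constant:

$$
\frac{\text{odds}(\text{Age} = a + 10)}{\text{odds}(\text{Age} = a)}
= \left(e^{-0.017988}\right)^{10}
= e^{-0.17988}
\approx 0.835
$$

That is, a student ten years older than an otherwise identical student has odds of enrolling that are roughly 16% lower.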
Hispanic_Latino_YN is a two-level factor variable indicating whether or not the student is
Hispanic/Latino.
• Hispanic_Latino_YNY
$$
\text{Hispanic\_Latino\_YNY} =
\begin{cases}
1, & \text{student is Hispanic/Latino} \\
0, & \text{otherwise}
\end{cases}
$$
The coefficient for this variable implies that, holding all other variables constant, the odds of a
Hispanic/Latino student enrolling in their intended program are those of a non-Hispanic/Latino student
multiplied by a factor of $e^{0.425373} = 1.530161$ (about 53% higher). However, this difference in
odds of enrolling between Hispanic/Latino and non-Hispanic/Latino students is statistically
significant only at the .05 level.
Gender is a two-level factor variable indicating the gender of the student, either male or female.
• GenderMale
$$
\text{GenderMale} =
\begin{cases}
1, & \text{student is male} \\
0, & \text{otherwise}
\end{cases}
$$
The coefficient for this variable implies that, holding all other variables constant, the odds of a
male student enrolling in his intended program are those of a female student multiplied by a factor of
$e^{-0.251832} = 0.7773753$ (about 22% lower). However, the difference in odds of enrolling between
females and males is not statistically significant.
EVALUATION OF THE MODEL
To evaluate the accuracy of the model, we used it to classify each student in the test set as enrolled
or not enrolled. The decision rule (as a function of the predicted probability of enrolling based on
our model) used to classify the predicted enrollment status of each student was the following:
$$
\text{Predicted Enrollment Status} =
\begin{cases}
0 \ (\text{Did Not Enroll}), & P(\text{enrolling}) < 0.5 \\
1 \ (\text{Enrolled}), & P(\text{enrolling}) \ge 0.5
\end{cases}
$$
Comparing the actual enrollment status of each student in the test set to the predicted enrollment
status based on our model, the accuracy rate was 67%.
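The decision rule and the accuracy calculation can be sketched as follows. This is a Python illustration with made-up values, not the report's R code or its test data; the function names and the toy lists are assumptions.

```python
def classify(probability, threshold=0.5):
    """Return 1 (Enrolled) if the predicted probability is at least the threshold, else 0."""
    return 1 if probability >= threshold else 0


def accuracy(actual, predicted_probabilities, threshold=0.5):
    """Share of subjects whose predicted status matches their actual status."""
    predictions = [classify(p, threshold) for p in predicted_probabilities]
    correct = sum(1 for a, y in zip(actual, predictions) if a == y)
    return correct / len(actual)


# Toy values for illustration only; these are not the report's test set.
actual_status = [1, 0, 1, 1, 0, 0]
predicted_probs = [0.81, 0.40, 0.50, 0.35, 0.62, 0.10]
toy_accuracy = accuracy(actual_status, predicted_probs)
print(round(toy_accuracy, 3))
```

Note that a probability of exactly 0.5 is classified as Enrolled, matching the "at least 0.5" branch of the rule.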
DISCUSSION
In this section, we will qualitatively explain the results of our model and offer suggestions based on our
conclusions. Our chosen significance level is .001, and thus we only discuss variables that had a
significant effect on likelihood of enrollment at that level.
International students are less likely to enroll in their intended program than domestic students. This
could be due to a multitude of reasons. Two possible reasons are that 1) it is more costly for
international students to go to school in the United States, or 2) international students have a larger
pool of schools to choose from when those in their own country are included. It would take considerably
more convincing to get an international student to make the effort to move to the United States and
participate in N University’s program than it would a student who only has to travel from another US
state. This difference in the likelihood of enrollment between international and domestic students
could be reduced if the University improves its international student services and/or improves the
marketing of its existing programs, making clear how attending the University would contribute to the
future success of international students.
Students admitted to the Fast-Track programs are much more likely to enroll than students admitted to
the Bachelors Degree program. This could be because there are more traditional Bachelors Degree
programs for students to choose from, creating more competition for the University. It also suggests
that the flexibility and efficiency the University offers through the Fast-Track programs are
attractive to students, and that these programs tend to pull in a higher proportion of their admitted
applicants. Thus it would be advisable to invest more in marketing and expanding these programs.
Holding the other variables constant, a student who prefers to complete their program online is less
likely to enroll in their intended program than a student who prefers the Boston Main Campus: the
coefficient for Campus_PreferenceOnline is negative, corresponding to an odds factor of about 0.58.
This effect is separate from program type, since students admitted to the Fast-Track Online program
are still more likely to enroll than students admitted to the Bachelors Degree program. The model
itself does not explain why a preference for online study is associated with lower enrollment. It
would therefore be advisable to examine why these admitted students do not enroll before investing
more money in recruiting students who prefer to attend online classes.
At our chosen significance level, the other factors in our model do not have a statistically
significant effect on the likelihood of enrolling.

Non-Traditional Student Enrollment Analytics

  • 1. N University: Office of Strategic Research and Analytics OBJECTIVE To understand how variables related to students accepted into N University’s programs affect their likelihood of actually enrolling in these programs. DATA About: We have data from 1268 subjects. Each subject is a student who has been admitted to one of 4 program types at N University: Bachelors Degree, Fast-Track 18-Month BS Degree Completion Blended Class, Fast-Track 18-Month BS Degree Completion Online Class, and Undergraduate Certificate. The variables available to us range from demographic information, to their major, to time elapsed between the application submitted date and the decision date. Variables of Interest: In order to ensure that our analysis is applicable to future students, we replaced factor variables that had a large amount of very specific levels, with factor variables that had a reduced number of levels that were more general. For example, the variable Country presented an issue because it did not encompass all the possible countries that a student could be from. However, by creating a new variable that was coded “Domestic” for students from the United States and “International” otherwise, not only did we retain the information from the Country variable, but we ensured that future students would have a level they can be placed in for analytical purposes. Factor variables with levels that did not have clear ways of being grouped together, such as High School, were omitted. Variables that were the same for each subject, such as Level, where all the students were listed as undergraduate, were also omitted. Missing Values: While with continuous variables, missing data could be filled in using methods such as using the average of the column, it is not as simple with categorical variables. Thus applicants that had missing values for any of the categorical variables were not included in the analysis. After omitting these subjects, 931 subjects remained. 
MODELING PROCEDURE The response variable was Enrolled which was coded 1 if the variable “Registered for Class” was a “Yes” and 0 if it was a “No.” This type of variable is binomially distributed and thus we ran a generalized linear binomial model. We split the 931 subjects into a test set of 121 randomly selected subjects and the remaining 810 subjects became the training set on which we ran our analysis. We started with the model that contained only an intercept and no predictive variables. From there we implemented a forward stepwise model selection procedure, using AIC as the criteria for deciding which variables to keep in the model. The final model chosen by the procedure is presented in the following section.
  • 2. THE MODEL R- Output: In terms of log(odds): In terms of odds: We will interpret each variable’s coefficient in order to explain the effect of that variable on the odds of a student enrolling in their intended program, as that is easier for a general audience to understand. However, we have included the model in terms of log (odds) above as well. The statistical significance of each coefficient at specific significance levels is included so that the reader may designate their own significance level and draw their own conclusions. The interpretation of each coefficient is holding all other variables constant. Log(odds of enrolling) = 1.301818 -1.733925*[Dom_InternInternational] +1.478557*[Prog_typeFast-Track 18-Month BS Degree Completion Blended Classes] +0.654839*[Prog_typeFast-Track 18-Month BS Degree Completion Online Classes] -0.249308*[Prog_typeUndergraduate Certificate] -0.671543*[Campus_PreferenceNo Preference] -0.536188*[Campus_PreferenceOnline] +0.878252*[Veteran_YNY] -0.017988*[Age] +0.425373*[Hispanic Latino_YNY] -0.251832*[GenderMale] odds of enrolling = 𝑒1.301818 *𝑒−1.733925∗Dom_Intern𝑰𝒏𝒕𝒆𝒓𝒏𝒂𝒕𝒊𝒐𝒏𝒂𝒍 *𝑒1.478557∗Prog_type𝑭𝒂𝒔𝒕−𝑻𝒓𝒂𝒄𝒌 𝟏𝟖−𝑴𝒐𝒏𝒕𝒉 𝑩𝑺 𝑫𝒆𝒈𝒓𝒆𝒆 𝑪𝒐𝒎𝒑𝒍𝒆𝒕𝒊𝒐𝒏 𝑩𝒍𝒆𝒏𝒅𝒆𝒅 𝑪𝒍𝒂𝒔𝒔𝒆𝒔 *𝑒0.654839∗Prog_type𝑭𝒂𝒔𝒕−𝑻𝒓𝒂𝒄𝒌 𝟏𝟖−𝑴𝒐𝒏𝒕𝒉 𝑩𝑺 𝑫𝒆𝒈𝒓𝒆𝒆 𝑪𝒐𝒎𝒑𝒍𝒆𝒕𝒊𝒐𝒏 𝑶𝒏𝒍𝒊𝒏𝒆 𝑪𝒍𝒂𝒔𝒔𝒆𝒔 *𝑒−0.249308∗Prog_type𝑼𝒏𝒅𝒆𝒓𝒈𝒓𝒂𝒅𝒖𝒂𝒕𝒆 𝑪𝒆𝒓𝒕𝒊𝒇𝒊𝒄𝒂𝒕𝒆 *𝑒−0.671543∗Campus_Preference𝑵𝒐 𝑷𝒓𝒆𝒇𝒆𝒓𝒆𝒏𝒄𝒆 *𝑒−0.536188∗Campus_Preference𝑶𝒏−𝑳𝒊𝒏𝒆 *𝑒0.878252∗Veteran_YN𝒀 *𝑒−0.017988∗Age *𝑒0.425373∗Hispanic_Latino_YN𝒀 *𝑒−0.251832∗Gender𝑴𝒂𝒍𝒆
  • 3. The coefficient for the Intercept implies that the odds of a female student who is domestic, admitted to the bachelors degree program, prefers to complete their program on the Boston Main Campus, is not a veteran, is not Hispanic/Latino, and is of zero years of age is 𝑒1.301818 =3.675974. This coefficient has no qualitative meaning on its own because it does not make sense for a person who is of zero years of age to apply to the university. Dom_Intern is a factor variable with two levels, indicating whether a student is domestic or international.  Dome_InternInternational= { 𝟏, 𝒔𝒕𝒖𝒅𝒆𝒏𝒕 𝒊𝒔 𝒊𝒏𝒕𝒆𝒓𝒏𝒂𝒕𝒊𝒐𝒏𝒂𝒍 𝟎, 𝒔𝒕𝒖𝒅𝒆𝒏𝒕 𝒊𝒔 𝒅𝒐𝒎𝒆𝒔𝒕𝒊𝒄 The coefficient for this variable implies that, holding all other variables constant, the odds of an international student enrolling in their intended program decreases by a factor of 𝑒−1.733925 =0.1765899 compared to a domestic student. At any common significance level, there is a statistically significant difference between the odds of an international and domestic student enrolling in their intended program. Prog_Type is a factor variable with four levels, indicating what type of program the student was admitted into: Bachelors Degree, Fast-Track 18-Month BS Degree Completion Blended Classes, Fast- Track 18-Month BS Degree Completion Online Classes, and Undergraduate Certificate.  Prog_TypeFast-Track 18-Month BS Degree Completion Blended Classes = { 𝟏, 𝒔𝒕𝒖𝒅𝒆𝒏𝒕′𝒔 𝒑𝒓𝒐𝒈𝒓𝒂𝒎 𝒕𝒚𝒑𝒆 𝒊𝒔 𝑭𝒂𝒔𝒕 𝑻𝒓𝒂𝒄𝒌 𝟏𝟖 𝑴𝒐𝒏𝒕𝒉 𝑩𝑺 𝑫𝒆𝒈𝒓𝒆𝒆 𝑪𝒐𝒎𝒑𝒍𝒆𝒕𝒊𝒐𝒏 𝑩𝒍𝒆𝒏𝒅𝒆𝒅 𝟎, 𝒐𝒕𝒉𝒆𝒓𝒘𝒊𝒔𝒆 The coefficient for this variable implies that, holding all other variables constant, the odds of a student in the Fast-Track 18-Month BS Degree Completion Blended program enrolling increases by a factor of 𝑒1.478557 =4.386611 compared to a student in the Bachelors Degree Program. 
At a significance level of .001, this difference in odds of enrolling between students in the Bachelors Degree program and students in the Fast-Track 18-Month BS Degree Completion Blended program is statistically significant.  Prog_TypeFast-Track 18-Month BS Degree Completion Online Classes = { 𝟏, 𝒔𝒕𝒖𝒅𝒆𝒏𝒕′𝒔 𝒑𝒓𝒐𝒈𝒓𝒂𝒎 𝒕𝒚𝒑𝒆 𝒊𝒔 𝑭𝒂𝒔𝒕 𝑻𝒓𝒂𝒄𝒌 𝟏𝟖 𝑴𝒐𝒏𝒕𝒉 𝑩𝑺 𝑫𝒆𝒈𝒓𝒆𝒆 𝑪𝒐𝒎𝒑𝒍𝒆𝒕𝒊𝒐𝒏 𝑶𝒏𝒍𝒊𝒏𝒆 𝟎, 𝒐𝒕𝒉𝒆𝒓𝒘𝒊𝒔𝒆 The coefficient for this variable implies that, holding all other variables constant, the odds of a student in the Fast-Track 18-Month BS Degree Completion Online program increases by a factor of 𝑒0.654839 =1.924833 compared to students in the Bachelors Degree program. At a significance level of .001, this difference in odds of enrolling between students in the Bachelors Degree program and students in the Fast-Track 18-Month BS Degree Completion Online program is statistically significant.  Prog_TypeUndergraduate Certificate = { 𝟏, 𝒔𝒕𝒖𝒅𝒆𝒏𝒕′𝒔 𝒑𝒓𝒐𝒈𝒓𝒂𝒎 𝒕𝒚𝒑𝒆 𝒊𝒔 𝑼𝒏𝒅𝒆𝒓𝒈𝒓𝒂𝒅𝒖𝒂𝒕𝒆 𝑪𝒆𝒓𝒕𝒊𝒇𝒊𝒄𝒂𝒕𝒆 𝟎, 𝒐𝒕𝒉𝒆𝒓𝒘𝒊𝒔𝒆 The coefficient for this variable implies that, holding all other variables constant, the odds of a student admitted to the Undergraduate Certificate program enrolling decreases by a factor of 𝑒−0.249308 =0.7793399 compared to Bachelors Degree students. This difference in odds of
  • 4. enrolling between students in the Undergraduate Certificate program and students in the Bachelors Degree program is not statistically significant. Campus_Preference is a factor variable with three levels, indicating whether a student prefers to complete their program on the Boston Main Campus, online, or have no preference.  Campus_PreferenceNoPreference = { 𝟏, 𝒔𝒕𝒖𝒅𝒆𝒏𝒕 𝒉𝒂𝒔 𝒏𝒐 𝒑𝒓𝒆𝒇𝒆𝒓𝒆𝒏𝒄𝒆 𝒊𝒏 𝒄𝒂𝒎𝒑𝒖𝒔 𝟎, 𝒐𝒕𝒉𝒆𝒓𝒘𝒊𝒔𝒆 The coefficient for this variable implies that, holding all other variables constant, the odds of a student that has no campus preference enrolling in their program decreases by a factor of 𝑒−0.671543 =0.5109196 compared to students whose campus preference is the Boston Main Campus. The difference in odds of enrolling between students who prefer Boston Main Campus and students who have no preference is not statistically significant.  Campus_PreferenceOnLine = { 𝟏, 𝒔𝒕𝒖𝒅𝒆𝒏𝒕′𝒔 𝒄𝒂𝒎𝒑𝒖𝒔 𝒑𝒓𝒆𝒇𝒆𝒓𝒆𝒏𝒄𝒆 𝒊𝒔 𝒐𝒏𝒍𝒊𝒏𝒆 𝟎, 𝒐𝒕𝒉𝒆𝒓𝒘𝒊𝒔𝒆 The coefficient for this variable implies that, holding all other variables constant, the odds of a student that prefers to complete their program online enrolling in classes decrease by a factor of 𝑒−0.536188 =0.5849739 compared to students who’s campus preference is Boston Main Campus. At a significance level of .001, this difference in odds of enrolling between students who prefer to complete their program Online versus at the Boston Main Campus is statistically significant. Veteran_YN is a two level factor variable indicating the veteran status of a student.  Veteran_YNY = { 𝟏, 𝒔𝒕𝒖𝒅𝒆𝒏𝒕 𝒊𝒔 𝒂 𝒗𝒆𝒕𝒆𝒓𝒂𝒏 𝟎, 𝒐𝒕𝒉𝒆𝒓𝒘𝒊𝒔𝒆 The coefficient for this variable implies that, holding all other variables constant, the odds of a student who is a veteran enrolling in their program increases by a factor of 𝑒0.878252 =2.406689 compared to a student who is not a veteran. At a significance level of .01, the difference in odds of enrolling between students who are and are not veterans is statistically significant. 
Age is a continuous variable indicating the age of the student in years. The ages of the students in our training data set ranged from 17 to 64. The coefficient for this variable implies that, holding all other variables constant, for every 1-year increase in the age of the student, the odds of that student enrolling in their intended program decrease by a factor of $e^{-0.017988} = 0.9821728$. At a significance level of .01, this difference in odds of enrolling for each additional year of age is statistically significant.

Hispanic_Latino_YN is a two-level factor variable indicating whether or not the student is Hispanic/Latino.

• Hispanic_Latino_YNY = 1 if the student is Hispanic/Latino; 0 otherwise.

The coefficient for this variable implies that, holding all other variables constant, the odds of a Hispanic/Latino student enrolling in their intended program increase by a factor of $e^{0.425373} = 1.530161$ compared to non-Hispanic/Latino students. However, this difference in odds of enrolling between Hispanic/Latino and non-Hispanic/Latino students is only statistically significant at the .05 level.

Gender is a two-level factor variable indicating the gender of the student, either male or female.
• GenderMale = 1 if the student is male; 0 otherwise.

The coefficient for this variable implies that, holding all other variables constant, the odds of a male student enrolling in his intended program decrease by a factor of $e^{-0.251832} = 0.7773753$ compared to female students. However, the difference in odds of enrolling between females and males is not statistically significant.

EVALUATION OF THE MODEL
In order to evaluate the accuracy of the model, we performed binary classification on the test set of the data. The decision rule (as a function of the predicted probability of enrolling based on our model) used to classify the predicted enrollment status of each student was the following:

Predicted Enrollment Status = 0 (Did Not Enroll) if Probability(enrolling) < .5; 1 (Enrolled) if Probability(enrolling) ≥ .5

Comparing the actual enrollment status of each student in the test set to the predicted enrollment status based on our model, the accuracy rate was 67%.

DISCUSSION
In this section, we qualitatively explain the results of our model and offer suggestions based on our conclusions. Our chosen significance level is .001, and thus we only discuss variables that had a significant effect on likelihood of enrollment at that level.

International students are less likely to enroll in their intended program than domestic students. This could be due to a multitude of reasons. Two possible reasons are that 1) it is more costly for international students to go to school in the United States, or 2) international students have a larger set of schools to choose from when including those in their own country. It would take a lot more convincing to get an international student to make the effort to move to the United States and participate in N University's program than it would a student who just has to travel from another US state.
This gap in likelihood of enrollment between international and domestic students could be narrowed if the University improved its international student services and/or its marketing of existing programs, making clear how attending the University would contribute to the future success of international students.

Students admitted to the Fast-Track programs are much more likely to enroll than students admitted to the Bachelors Degree program. This could be because there are more traditional Bachelors Degree programs for students to choose from, creating more competition for the University. It also suggests that the flexibility and efficiency the University offers through the Fast-Track programs are attractive to students, and that these programs tend to pull in a higher proportion of the admitted applicants. Thus it would be advisable to invest more in marketing and expanding these programs.

The likelihood of a student who prefers to complete their program online enrolling in their intended program is higher than the likelihood of a student who prefers the Boston Main Campus. Again, this may be because there are many schools that offer (or even require) on-campus learning. Therefore students who do not have a campus preference, or who prefer the Boston Main Campus, have more options to choose from, and this poses more competition for the University. However, a higher
proportion of students who prefer to complete their courses online will actually enroll in their admitted program because the University offers that as an option. Thus, investing more in recruiting students who prefer to attend online classes will likely offer a higher return.

At our chosen significance level, the differences in likelihood of enrolling associated with the other factors in our model are not statistically significant.
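The decision rule and accuracy calculation described under EVALUATION OF THE MODEL can be sketched as follows. This is an illustrative sketch only, not the code used for the analysis: the function names `classify` and `accuracy` are hypothetical, and the toy values are made up for demonstration and are not the study's test set.

```python
from typing import Sequence

THRESHOLD = 0.5  # cut-off stated in the decision rule


def classify(probability: float, threshold: float = THRESHOLD) -> int:
    """Return 1 (Enrolled) if the predicted probability is at least the threshold, else 0."""
    return 1 if probability >= threshold else 0


def accuracy(actual: Sequence[int], predicted_probabilities: Sequence[float]) -> float:
    """Share of students whose predicted status matches their actual status."""
    if len(actual) != len(predicted_probabilities):
        raise ValueError("actual and predicted_probabilities must have the same length")
    if len(actual) == 0:
        raise ValueError("at least one observation is required")
    predictions = [classify(p) for p in predicted_probabilities]
    correct = sum(1 for a, p in zip(actual, predictions) if a == p)
    return correct / len(actual)


# Made-up toy values for illustration only (NOT data from the study).
toy_actual = [1, 0, 1, 1, 0, 0]
toy_probabilities = [0.81, 0.35, 0.50, 0.42, 0.64, 0.10]
print(f"Toy accuracy: {accuracy(toy_actual, toy_probabilities):.2f}")
```

Note that a probability of exactly .5 is classified as Enrolled, matching the "≥ .5" branch of the rule.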