Predict student behavior to increase retention
Predict student behavior to increase retention Presentation Transcript

  • 1. Predict student behavior to increase retention Online seminar presented by: Jing Luan, Ph.D., Cabrillo College Bob Valencic, SPSS Inc. August 22, 2002
  • 2. Seminar agenda
    • Business issues in higher education
    • How to predict student behavior and increase retention
      • Data mining concepts
      • Data mining methods
    • Case studies
    • Getting started on data mining
    • Q&A
  • 3. Higher education business issues
    • Institutional effectiveness
    • Student learning outcome assessment
    • Enrollment management
      • Achieving optimum attraction, retention and persistence goals
    • Marketing
      • Increasing competition for students
    • Alumni
    How can data mining help?
  • 4. Institutional effectiveness
    • Which students make greatest use of institutional services?
    • What courses provide high full-time equivalent students (FTES) and allow better use of space?
    • What are the patterns in course taking?
    • What courses tend to be taken as a group?
    Getting to know your students
  • 5. Enrollment management
    • Who are our best students?
    • Where do our students come from?
    • Who is most likely to return for another semester?
    • Who is most likely to fail or drop out?
    Helping your students succeed
  • 6. Marketing
    • Who is most likely to respond to our new campaign?
    • Which type of marketing/recruiting works best?
    • Where should we focus our advertising and recruiting?
    Making the best use of tight budgets
  • 7. Alumni
    • What are the different types/groups of alumni?
    • Who is likely to pledge, for how much, and when?
    • Where and on whom should we focus our fundraising drives?
    Continuing the relationship
  • 8. Our focus today: Predicting student behavior
    • Acquiring new students
    • Retaining students
    • Increasing persistence to and beyond graduation
  • 9. Data mining defined
    • “The process of discovering meaningful new correlations, patterns, and trends by sifting through large amounts of data stored in repositories and by using pattern recognition technologies as well as statistical and mathematical techniques.”
    • The Gartner Group
  • 10. Another definition
    • “Simply put, data mining is used to discover patterns and relationships in your data in order to help you make better business decisions.”
    • Robert Small, Two Crows
  • 11. CRISP-DM
    • Business Understanding
    • Data Understanding
    • Data Preparation
    • Modeling
    • Evaluation
    • Deployment
  • 12. Two types of data mining
    • Supervised
      • Purpose: For classification and estimation
      • Algorithms
        • C5.0
        • C&RT
        • Neural Network, etc.
    • Unsupervised
      • Purpose: For clustering and association
      • Algorithms
        • Kohonen
        • K-means
        • TwoStep
        • GRI, etc.
  • 13. Algorithm vs. model
    • Algorithm
      • A technical term describing a specific, mathematically driven data mining function
    • Model
      • A set of representative rules, behaviors or characteristics against which data are analyzed to find similarities
  • 14. Neural networks
    • A family of machine learning methods
    • Identifies complex relations
    • Somewhat difficult to interpret
    • Long computation times
    [Figure: neural network diagram showing the input layer, hidden layer, and output]
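As a rough illustration of the input layer, hidden layer, and output named on this slide, the sketch below runs one forward pass through a tiny network. It is not how Clementine implements its Neural node. The function names, the sigmoid activation, the absence of bias terms, and all weights are assumptions chosen for illustration.

```python
import math


def sigmoid(x: float) -> float:
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(inputs, hidden_weights, output_weights):
    """One forward pass: input layer -> hidden layer -> single output.

    hidden_weights holds one list of weights per hidden unit.
    output_weights holds one weight per hidden unit.
    Bias terms are left out to keep the sketch short (an assumption).
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(unit_weights, inputs)))
        for unit_weights in hidden_weights
    ]
    return sigmoid(sum(w * h for w, h in zip(output_weights, hidden)))


# Hypothetical student record: two scaled input features, invented weights.
score = forward(
    inputs=[0.8, 0.3],
    hidden_weights=[[0.5, -1.2], [1.1, 0.4]],
    output_weights=[0.9, -0.7],
)
```

The output is a number between 0 and 1 that could be read as a score for, say, persistence. Training (adjusting the weights from known outcomes) is what takes the long computation times mentioned above and is not shown here.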
  • 15. Decision trees
    • Easy to interpret
    • income < $40K
      • job > 5 yrs then yes
      • job < 5 yrs then no
    • income > $40K
      • high debt then no
      • low debt then yes
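Part of what makes a decision tree easy to interpret is that its rules translate directly into nested conditions. The sketch below writes the slide's example tree as a Python function. The function name `decide` and its parameter names are hypothetical. The slide does not say what happens at exactly $40K of income or exactly 5 years on the job, so the handling of those boundary values here is an assumption.

```python
def decide(income: float, years_on_job: float, high_debt: bool) -> bool:
    """Return True for "yes" and False for "no", following the slide's tree."""
    if income < 40_000:
        # Lower-income branch: the outcome depends on job tenure.
        # Exactly 5 years is treated as "no" (assumption; the slide is silent).
        return years_on_job > 5
    # Higher-income branch: the outcome depends on debt level.
    # Exactly $40K falls in this branch (assumption; the slide is silent).
    return not high_debt


examples = [
    decide(30_000, 8, high_debt=True),    # low income, long tenure
    decide(30_000, 2, high_debt=False),   # low income, short tenure
    decide(60_000, 2, high_debt=True),    # high income, high debt
    decide(60_000, 2, high_debt=False),   # high income, low debt
]
```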
  • 16. Apriori
    • Discovers events that occur together
    • Often called ‘market basket’ analysis
    • Example – Which groups of classes do certain students take in the same semester, and how might that affect facilities and course scheduling?
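To make the market basket idea concrete, the sketch below counts which pairs of courses appear together in student schedules. It covers only the first two passes of an Apriori-style search (frequent single items, then frequent pairs built from them), not the full algorithm or Clementine's implementation. The function name, the course names, and the schedules are invented for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_pairs(baskets, min_support):
    """Return {(item_a, item_b): support} for pairs meeting min_support.

    Support is the fraction of baskets that contain both items.
    """
    n = len(baskets)
    # Pass 1: keep only items that are frequent on their own.
    item_counts = Counter(item for basket in baskets for item in set(basket))
    frequent_items = {i for i, c in item_counts.items() if c / n >= min_support}
    # Pass 2: count pairs made up of frequent items only (the Apriori pruning idea).
    pair_counts = Counter()
    for basket in baskets:
        kept = sorted(set(basket) & frequent_items)
        pair_counts.update(combinations(kept, 2))
    return {pair: c / n for pair, c in pair_counts.items() if c / n >= min_support}


# Hypothetical same-semester schedules, one set of courses per student.
schedules = [
    {"MATH 1", "ENGL 1", "CHEM 1"},
    {"MATH 1", "CHEM 1"},
    {"ENGL 1", "HIST 1"},
    {"MATH 1", "CHEM 1", "HIST 1"},
]
pairs = frequent_pairs(schedules, min_support=0.5)
```

With this invented data, the only pair taken together by at least half of the students is CHEM 1 with MATH 1, which is the kind of finding that could feed into room and timetable planning.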
  • 17. Kohonen network
    • Seeks to describe dataset in terms of natural clusters of cases
    • Example – identify similar groups of students
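The idea of grouping similar students can be sketched with k-means, one of the other unsupervised algorithms listed earlier. This is a swapped-in technique chosen because it is short to write out. It is not a Kohonen network. The function name, the two features (units attempted and GPA), the data, and the starting centroids are all assumptions for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means: assign each point to its nearest centroid, then recompute."""
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
            )
            clusters[nearest].append(point)
        # Move each centroid to the mean of its cluster (keep it if the cluster is empty).
        centroids = [
            tuple(sum(dim) / len(cluster) for dim in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
    return centroids, clusters


# Hypothetical students described by (units attempted, GPA).
students = [(3, 2.0), (4, 2.5), (15, 3.5), (16, 3.0)]
centers, groups = kmeans(students, centroids=[(3, 2.0), (16, 3.0)])
```

In practice the features would usually be rescaled first so that units do not dominate GPA in the distance calculation.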
  • 18. Case study using Clementine®
    • Predicting student persistence
  • 19. Examining data
  • 20. Clustering using TwoStep
  • 21. Building models for persistence in streams A node is being executed (notice the red arrows denoting the flow of data).
  • 22. Seeing the work of neural thinking Graphic display showing an ANN learning the data.
  • 23. Results of neural node These are the outputs of the neural network. Overall accuracy and significance of features (left). Predicted number of policies using fresh data vs. known data (above).
  • 24. Examining C5.0 The control panel of the C5.0 node (Expert).
  • 25. Results of C5.0 node View the prediction by individual records (PNXT vs. $C-PNXT). View the overall prediction accuracy.
  • 26. Comparing C&RT and C5.0 Use the Analysis node to examine the difference in accuracy for C&RT and C5.0.
  • 27. Which one is better: C&RT or C5.0? C5.0 has an accuracy rate of 66.3% and C&RT 63.7%. They agree 72% of the time.
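Accuracy and agreement figures like those on this slide come from simple counts: how often each model matches the known outcome, and how often the two models match each other. The sketch below shows the arithmetic on invented data. It does not reproduce the case study's 66.3%, 63.7%, or 72% figures, and the function and variable names are hypothetical.

```python
def accuracy(predicted, actual):
    """Fraction of records where the prediction equals the known outcome."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def agreement(pred_a, pred_b):
    """Fraction of records where two models make the same prediction."""
    return sum(a == b for a, b in zip(pred_a, pred_b)) / len(pred_a)


# Invented outcomes for ten students: 1 = persisted, 0 = did not persist.
actual     = [1, 1, 0, 0, 1, 0, 1, 1, 0, 1]
c50_preds  = [1, 1, 0, 1, 1, 0, 0, 1, 0, 1]
cart_preds = [1, 0, 0, 1, 1, 1, 0, 1, 0, 1]

c50_accuracy = accuracy(c50_preds, actual)
cart_accuracy = accuracy(cart_preds, actual)
models_agree = agreement(c50_preds, cart_preds)
```

Two models can have similar accuracy while disagreeing on many individual records, which is why the agreement rate is worth reporting alongside the two accuracy rates.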
  • 28. Visualizing Results
  • 29. Visualizing Results
  • 30. Scoring new data Moment of truth. The most powerful feature of data mining is using learned “rules” to predict (score) fresh data for business purposes. Shown here is the switch to a fresh dataset that Clementine has not seen before.
  • 31. Using models to score new data Model Results Scored Results
  • 32. Additional case study
    • How best to identify future transfer students so the college can groom them?
    • What can a community college do to increase transfer rates?
    • Using decision tree models, the top rule for successful transfers was: taking more than 12 units, taking fewer than 5 non-transfer courses, and taking at least one math course.
    Predicting the behavior of transfer students
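The top rule above can be applied to new student records as a simple filter, which is the same "scoring" idea shown earlier. The sketch below is a hedged illustration. The function name, the record layout, and the student data are hypothetical, and the slide does not say whether "more than 12 units" is per term or cumulative, so the code just takes a single units figure.

```python
def matches_top_transfer_rule(units: float, non_transfer_courses: int, math_courses: int) -> bool:
    """Check the slide's top rule: >12 units, <5 non-transfer courses, >=1 math course."""
    return units > 12 and non_transfer_courses < 5 and math_courses >= 1


# Invented records: student id -> (units, non-transfer courses, math courses).
students = {
    "A": (15, 2, 1),
    "B": (9, 0, 2),
    "C": (14, 6, 1),
    "D": (13, 1, 0),
}

likely_transfers = [
    student_id
    for student_id, record in students.items()
    if matches_top_transfer_rule(*record)
]
```

A list like `likely_transfers` is what a counseling office could use to decide whom to contact about transfer planning.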
  • 33. Getting started
    • Company stability and customer feedback
    • User interface
    • Scalability
    • Server/Client
    • Modeling capacities
    • Learning curve
    • Join a listserv, such as CLUG
    • Cost
    Evaluate data mining software
  • 34. Getting started
    • Determine business needs
    • Determine technology infrastructure and management support
    • Identify mining area and business problems
    • Determine data source(s)
    • Invite an expert to jump start
    • Pilot test mining results
    • CRISP-DM and Real-time data mining, Knowledge Discovery in Databases (KDD)
    Develop a data mining plan for your institution
  • 35. Want to learn more?
    • Full training course descriptions at:
    • www.spss.com/training
    • Contact us or one of our other data mining experts by calling 800-543-5815 .
    • Check out the Knowledge Management/Data Mining Discussion Group:
    • http://www.kdl1.com/kmdm
    • Obtain the book, “Knowledge Management – Building A Competitive Advantage in Higher Education,” published by Jossey-Bass:
    • http://josseybass.com/cda/product/0,,0787962910,00.html
    • Bob Valencic [email_address]
    • Jing Luan [email_address]
  • 36. Thank you!
    • Predict student behavior to increase retention
    • 2nd Annual Public Sector Roadshow
    • October 15 in Washington, D.C.
    • www.spss.com/psroadshow