Predict student behavior to increase retention - PowerPoint PPT Presentation

Transcript and Presenter's Notes

Title: Predict student behavior to increase retention


1
Predict student behavior to increase retention
  • Online seminar presented by
  • Jing Luan, Ph.D., Cabrillo College
  • Bob Valencic, SPSS Inc.
  • August 22, 2002

2
Seminar agenda
  • Business issues in higher education
  • How to predict student behavior and increase
    retention?
  • Data mining concepts
  • Data mining methods
  • Case studies
  • Getting started on data mining
  • Q&A

3
Higher education business issues
  • Institutional effectiveness
  • Student learning outcome assessment
  • Enrollment management
  • Achieving optimum attraction, retention and
    persistence goals
  • Marketing
  • Increasing competition for students
  • Alumni

How can data mining help?
4
Institutional effectiveness
Getting to know your students
  • Which students make greatest use of institutional
    services?
  • What courses provide high full-time equivalent
    students (FTES) and allow better use of space?
  • What are the patterns in course taking?
  • What courses tend to be taken as a group?

5
Enrollment management
Helping your students succeed
  • Who are our best students?
  • Where do our students come from?
  • Who is most likely to return for another
    semester?
  • Who is most likely to fail or drop out?

6
Marketing
Making the best use of tight budgets
  • Who is most likely to respond to our new
    campaign?
  • Which type of marketing/recruiting works best?
  • Where should we focus our advertising and
    recruiting?

7
Alumni
Continuing the relationship
  • What are the different types/groups of alumni?
  • Who is likely to pledge, for how much, and when?
  • Where and on whom should we focus our fundraising
    drives?

8
Our focus today: Predicting student behavior
  • Acquiring new students
  • Retaining students
  • Increasing persistence to and beyond graduation

9
Data mining defined
  • The process of discovering meaningful new
    correlations, patterns, and trends by sifting
    through large amounts of data stored in
    repositories and by using pattern recognition
    technologies as well as statistical and
    mathematical techniques.
  • Source: The Gartner Group

10
Another definition
  • Simply put, data mining is used to discover
    patterns and relationships in your data in order
    to help you make better business decisions.
  • Source: Robert Small, Two Crows

11
CRISP-DM
  • Business Understanding
  • Data Understanding
  • Data Preparation
  • Modeling
  • Evaluation
  • Deployment

12
Two types of data mining
  • Supervised
    • Purpose: classification and estimation
    • Algorithms: C5.0, CRT, Neural Network, etc.
  • Unsupervised
    • Purpose: clustering and association
    • Algorithms: Kohonen, K-means, TwoStep, GRI, etc.

13
Algorithm vs. model
  • Algorithm
    • A technical term describing a specific
      mathematically driven data mining function
  • Model
    • A set of representative rules, behaviors or
      characteristics against which data are analyzed
      to find similarities

14
Neural networks
  • Often treated as synonymous with machine learning
    (strictly, one family of machine learning methods)
  • Identifies complex relations
  • Somewhat difficult to interpret
  • Long computation times

15
Decision trees
  • Easy to interpret
  • Example:
    • income < 40K
      • job > 5 yrs, then yes
      • job < 5 yrs, then no
    • income > 40K
      • high debt, then no
      • low debt, then yes
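
The rules on this slide read directly as nested conditions, which is why decision trees are easy to interpret. A minimal sketch in Python follows. The function and parameter names are hypothetical, and the slide does not say what happens at exactly 40K of income or exactly 5 years on the job, so the handling of those boundary values is an assumption.

```python
def decide(income, years_on_job, high_debt):
    """Apply the slide's example rules and return "yes" or "no".

    Assumption: income of exactly 40K falls in the upper branch and
    exactly 5 years on the job counts as "no"; the slide leaves both open.
    """
    if income < 40_000:
        # Lower-income branch: decided by job tenure.
        return "yes" if years_on_job > 5 else "no"
    # Higher-income branch: decided by debt level.
    return "no" if high_debt else "yes"


print(decide(30_000, 6, high_debt=False))
print(decide(55_000, 2, high_debt=True))
```

Each path from the root to a leaf corresponds to one readable rule, which is the property the slide highlights.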

16
Apriori
  • Discovers events that occur together
  • Often called market basket analysis
  • Example: What groups of classes do certain
    students take in the same semester that may
    impact facilities and course scheduling?
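
The idea of finding classes taken together can be illustrated by counting how often each pair of courses appears in the same schedule. The sketch below is only the pair-counting step, not the full Apriori algorithm, which also builds larger itemsets level by level and prunes candidates. The course names, the schedules, and the `min_support` threshold are made up for illustration.

```python
from collections import Counter
from itertools import combinations


def frequent_course_pairs(schedules, min_support):
    """Return course pairs whose share of schedules is at least min_support."""
    pair_counts = Counter()
    for courses in schedules:
        # Sort so each pair is always counted under the same key.
        for pair in combinations(sorted(set(courses)), 2):
            pair_counts[pair] += 1
    total = len(schedules)
    return {
        pair: count / total
        for pair, count in pair_counts.items()
        if count / total >= min_support
    }


# Hypothetical one-semester schedules, one list per student.
schedules = [
    ["MATH 1", "PHYS 1", "ENGL 1"],
    ["MATH 1", "PHYS 1"],
    ["ENGL 1", "HIST 1"],
    ["MATH 1", "PHYS 1", "HIST 1"],
]
pairs = frequent_course_pairs(schedules, min_support=0.5)
print(pairs)
```

With this sample data, only the MATH 1 and PHYS 1 pair clears the 50% threshold, appearing in three of the four schedules.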

17
Kohonen network
  • Seeks to describe dataset in terms of natural
    clusters of cases
  • Example: Identify similar groups of students
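
A Kohonen network is too long for a short example. As a stand-in, the sketch below uses k-means, which the algorithm list on slide 12 also names as an unsupervised method, to show the same idea of grouping similar students. The two features (units per term, GPA), the student records, and the starting centers are all hypothetical.

```python
def kmeans(points, centers, iterations=10):
    """Plain k-means on tuples of numbers, starting from the given centers."""
    centers = [tuple(c) for c in centers]
    groups = []
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest center.
        groups = [[] for _ in centers]
        for p in points:
            distances = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centers]
            groups[distances.index(min(distances))].append(p)
        # Update step: move each center to the mean of its group.
        new_centers = []
        for group, old in zip(groups, centers):
            if group:
                new_centers.append(tuple(sum(dim) / len(group) for dim in zip(*group)))
            else:
                new_centers.append(old)  # an empty cluster keeps its center
        if new_centers == centers:
            break  # converged: assignments no longer change the centers
        centers = new_centers
    return centers, groups


# Hypothetical students as (units per term, GPA).
students = [(3, 2.0), (4, 2.5), (5, 3.0), (14, 3.0), (15, 3.5), (16, 4.0)]
centers, groups = kmeans(students, centers=[(3, 2.0), (16, 4.0)])
print(centers)
```

Here the part-time and full-time students separate into two groups. Unlike a Kohonen network, k-means needs the number of clusters chosen up front and does not arrange the clusters on a map.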

18
Case study using Clementine
  • Predicting student persistence

19
Examining data
20
Clustering using TwoStep
21
Building models for persistence in streams
A node is being executed (notice the red arrows
denoting the flow of data).
22
Seeing the work of neural thinking
Graphic display showing an artificial neural
network (ANN) learning the data.
23
Results of neural node
These are the outputs of the neural network node:
overall accuracy and significance of features
(left), and predicted number of policies using
fresh data vs. known data (above).
24
Examining C5.0
The control panel of the C5.0 node (Expert).
25
Results of C5.0 node
View the prediction by individual records (PNXT
vs. C-PNXT).
View the overall prediction accuracy.
26
Comparing CRT and C5.0
Use the Analysis node to examine the difference
in accuracy for CRT and C5.0.
27
Which one is better: CRT or C5.0?
C5.0 has an accuracy rate of 66.3% and CRT
63.7%. They agree 72% of the time.
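
The two figures compared here, each model's accuracy and how often the models agree, can be computed from three aligned lists: the known outcomes and each model's predictions. The sketch below uses made-up data, so it does not reproduce the percentages on the slide, and the variable names are hypothetical.

```python
def accuracy(predicted, actual):
    """Share of records where the prediction matches the known outcome."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


def agreement(first, second):
    """Share of records where two models make the same prediction."""
    return sum(x == y for x, y in zip(first, second)) / len(first)


# Made-up data: 1 = student persisted, 0 = student did not.
actual   = [1, 1, 0, 0, 1, 0, 1, 0]
c50_pred = [1, 1, 0, 1, 1, 0, 0, 0]
crt_pred = [1, 0, 0, 1, 1, 0, 1, 1]

print(accuracy(c50_pred, actual))
print(accuracy(crt_pred, actual))
print(agreement(c50_pred, crt_pred))
```

Agreement is measured between the two models only, so two models can agree on a record and both be wrong.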
28
Visualizing Results
29
Visualizing Results
30
Scoring new data
Moment of truth. The most powerful feature of
data mining is using learned rules to predict
(score) fresh data for business purposes.
Shown here is the switch to a fresh dataset
that Clementine has not seen before.
31
Using models to score new data
Model Results
Scored Results
32
Additional case study
Predicting the behavior of transfer students
  • How best to identify future transfer students so
    the college can groom them?
  • What can a community college do to increase
    transfer rates?
  • Using decision tree models, the top rule for
    successful transfers was: taking more than 12
    units, taking fewer than 5 non-transfer courses,
    and taking at least one math course.
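
The top rule can be written as a single condition and then used to score new student records, in the spirit of the scoring step shown earlier. This is a minimal sketch: the field names and student records are hypothetical, and the slide does not state the period over which units and courses are counted.

```python
def matches_top_transfer_rule(units, non_transfer_courses, math_courses):
    """True when a record satisfies all three conditions of the top rule."""
    return units > 12 and non_transfer_courses < 5 and math_courses >= 1


# Hypothetical new records to score.
new_students = [
    {"id": "A", "units": 15, "non_transfer_courses": 2, "math_courses": 1},
    {"id": "B", "units": 15, "non_transfer_courses": 6, "math_courses": 2},
    {"id": "C", "units": 9, "non_transfer_courses": 0, "math_courses": 1},
]

flagged = [
    s["id"]
    for s in new_students
    if matches_top_transfer_rule(
        s["units"], s["non_transfer_courses"], s["math_courses"]
    )
]
print(flagged)
```

Only student A meets all three conditions. B has too many non-transfer courses and C has too few units.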

33
Getting started
Evaluate data mining software
  • Company stability and customer feedback
  • User interface
  • Scalability
  • Server/Client
  • Modeling capacities
  • Learning curve
  • Join a listserv, such as CLUG
  • Cost

34
Getting started
Develop a data mining plan for your institution
  • Determine business needs
  • Determine technology infrastructure and
    management support
  • Identify mining area and business problems
  • Determine data source(s)
  • Invite an expert to jump start
  • Pilot test mining results
  • CRISP-DM and real-time data mining, Knowledge
    Discovery in Databases (KDD)

35
Want to Learn More?
  • Full training course descriptions at
  • www.spss.com/training
  • Contact us or one of our other data mining
    experts by calling 800-543-5815.
  • Check out the Knowledge Management/Data Mining
    Discussion Group
  • http://www.kdl1.com/kmdm
  • Obtain the book, Knowledge Management: Building
    a Competitive Advantage in Higher Education,
    published by Jossey-Bass
  • http://josseybass.com/cda/product/0,,0787962910,00.html
  • Bob Valencic rvalencic_at_spss.com
  • Jing Luan jing_at_cabrillo.edu

36
Thank you!
  • Predict student behavior to increase retention
  • 2nd Annual Public Sector Roadshow
  • October 15 in Washington, D.C.
  • www.spss.com/psroadshow