Title: Applying Bayesian Belief Networks to the Examination of Student Outcomes
Applying Bayesian Belief Networks to the
Examination of Student Outcomes
- Xiaohong Li, Graduate Research Asst.
- Rita Caso, Director
Sam Houston State University Office of
Institutional Research and Assessment
Outline
- Purpose of the Study
- Why Study Freshman Outcomes?
- Why Bayesian Networks?
- Method
- Example Inferences
- Conclusions
TX Association of Institutional Research (TAIR)
2008 Conference, 2/5-7/08
Purpose
- Apply Bayesian Belief Network (BBN) techniques to examine student
outcomes, in order to identify families of factors associated with
students' college success at Sam Houston State University (SHSU)
- Identify which factors affect retention and graduation for
First Time Freshmen (FTF)
- Retention and graduation rates are key performance indicators
- Provide management information by analyzing and interpreting these
data for use in planning and policy decisions

Why Study Freshman Outcomes?
- To determine whether we are providing the best environment and
experiences to promote success for our diverse freshman population
- To make tailored improvements in the learning environment and the
learning experiences we offer, in order to maximize successful outcomes
for all students across preparation backgrounds, needs, learning styles
and life-styles
- To satisfy external accountability requirements

Why Study Freshman Outcomes?
- University stakeholders who need detailed insights into the conditions
and combinations of factors that influence new student success:
  - Enrollment Management
  - Enrichment and Support Programs
  - Student Services
  - Academic Departments

Why Bayesian Networks?
- Graphical model with an associated set of probability tables
- Learn causal relationships easily
- Better understand the problem domain and predict the consequences
- Flexible and robust recommendation strategies

About Bayesian Networks
- Definitions of Basic Terms
  - Independent
    - Event A does not affect the probability of B occurring:
      P(A, B) = P(A) P(B)
  - Conditional probability
    - The probability of event C occurring, given that event A has
      already occurred: P(C | A)
  - Conditional Independence
    - E is independent of A and B given D
    - E and F are conditionally independent of each other, given D
  - Causal Theory
    - A or B can cause D to occur
  - Node: a variable
    - Leaf nodes: no outcome depends on them (E, F)
    - Root nodes: do not depend on any outcome (A, B)
[Figure: example network with nodes A, B, D, E and F]

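The definitions of independence and conditional independence can be checked numerically. The following is a minimal Python sketch, assuming the five-node example has the edges A -> D, B -> D, D -> E and D -> F. Every probability in it is invented here for illustration and does not come from the study; the helper names (`bern`, `joint`, `prob`) are likewise hypothetical.

```python
from itertools import product

# Hypothetical CPTs for the five-node example (A, B -> D -> E, F).
# All numbers are invented for illustration only.
P_A = {True: 0.3, False: 0.7}
P_B = {True: 0.6, False: 0.4}
P_D_GIVEN_AB = {  # P(D = True | A, B)
    (True, True): 0.9, (True, False): 0.7,
    (False, True): 0.5, (False, False): 0.1,
}
P_E_GIVEN_D = {True: 0.8, False: 0.2}  # P(E = True | D)
P_F_GIVEN_D = {True: 0.4, False: 0.3}  # P(F = True | D)


def bern(p_true, value):
    """Probability that a binary variable takes `value`, given P(True)."""
    return p_true if value else 1.0 - p_true


def joint(a, b, d, e, f):
    """Joint probability from the network factorization."""
    return (P_A[a] * P_B[b]
            * bern(P_D_GIVEN_AB[(a, b)], d)
            * bern(P_E_GIVEN_D[d], e)
            * bern(P_F_GIVEN_D[d], f))


def prob(**fixed):
    """Sum the joint over every assignment consistent with `fixed`."""
    names = ("a", "b", "d", "e", "f")
    total = 0.0
    for values in product((True, False), repeat=5):
        state = dict(zip(names, values))
        if all(state[k] == v for k, v in fixed.items()):
            total += joint(**state)
    return total


# Independence of the root nodes: P(A, B) = P(A) P(B)
lhs_roots = prob(a=True, b=True)
rhs_roots = prob(a=True) * prob(b=True)

# Conditional independence of the leaves: P(E, F | D) = P(E | D) P(F | D)
p_d = prob(d=True)
lhs_leaves = prob(e=True, f=True, d=True) / p_d
rhs_leaves = (prob(e=True, d=True) / p_d) * (prob(f=True, d=True) / p_d)

print(abs(lhs_roots - rhs_roots) < 1e-12)
print(abs(lhs_leaves - rhs_leaves) < 1e-12)
```

With these invented numbers E and F are not independent on their own; they become independent only once D is known, which is what "conditionally independent given D" means.
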
About Bayesian Networks
- A graphical model that encodes probabilistic relationships among
variables of interest
- Named after the Reverend Thomas Bayes, a British theologian and
mathematician who wrote down a basic law of probability
- Bayes' Rule
  - P(A | B) = P(B | A) P(A) / P(B)
[Figure: two-node example network with nodes Smoking and Cancer]

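As a worked illustration of Bayes' Rule, P(A | B) = P(B | A) P(A) / P(B), the sketch below updates the probability of smoking once cancer has been observed. It assumes a two-node network in which Smoking is the parent of Cancer. The function name `posterior` and all three input probabilities are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def posterior(prior, likelihood, likelihood_given_not):
    """Bayes' Rule for a binary cause: P(cause | effect).

    prior                 P(cause)
    likelihood            P(effect | cause)
    likelihood_given_not  P(effect | not cause)
    """
    # P(effect), by the law of total probability
    evidence = likelihood * prior + likelihood_given_not * (1.0 - prior)
    return likelihood * prior / evidence


# Hypothetical numbers, invented for illustration only.
p_smoking = 0.2
p_cancer_given_smoking = 0.05
p_cancer_given_nonsmoking = 0.01

p_smoking_given_cancer = posterior(
    p_smoking, p_cancer_given_smoking, p_cancer_given_nonsmoking
)
print(round(p_smoking_given_cancer, 4))
```

Under these assumed inputs the belief that a person smokes rises from the prior of 0.2 to a little over one half after cancer is observed.
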
About Bayesian Networks
- Bayesian Networks Contain
  - A Network Structure
    - Directed, acyclic (non-circular) graph
    - Encodes a set of conditional independence and dependence
      information about variables
  - Probability
    - Probability distributions associated with each variable
    - Represented in the data and computed from the data

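The requirement that the structure be directed and acyclic can be tested mechanically. The sketch below is one way to do so in Python (3.9 or later) with the standard-library `graphlib` module. The parent maps and the helper names `is_acyclic` and `roots_and_leaves` are assumptions made for illustration; the example graph assumes the edges A -> D, B -> D, D -> E and D -> F.

```python
from graphlib import CycleError, TopologicalSorter  # Python 3.9+


def is_acyclic(parents):
    """Return True if the {node: [parents]} map is a directed acyclic graph."""
    try:
        # TopologicalSorter takes {node: predecessors}; static_order()
        # raises CycleError when the graph contains a cycle.
        list(TopologicalSorter(parents).static_order())
        return True
    except CycleError:
        return False


def roots_and_leaves(parents):
    """Roots have no parents; leaves are never listed as a parent."""
    nodes = set(parents) | {p for ps in parents.values() for p in ps}
    used_as_parent = {p for ps in parents.values() for p in ps}
    roots = sorted(n for n in nodes if not parents.get(n))
    leaves = sorted(n for n in nodes if n not in used_as_parent)
    return roots, leaves


# Assumed structure of a small example network.
example = {"A": [], "B": [], "D": ["A", "B"], "E": ["D"], "F": ["D"]}
# A deliberately invalid structure: A -> D -> F -> A forms a cycle.
cyclic = {"A": ["F"], "D": ["A"], "F": ["D"]}

print(is_acyclic(example), is_acyclic(cyclic))
print(roots_and_leaves(example))
```

Storing the graph as a map from each node to its parents mirrors how the probability tables are organized, since each table is indexed by the states of a node's parents.
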
About Bayesian Networks
- Example of a Bayesian Network
- Example data below is invented

FAID
Yes   No
0.2   0.8

Full/Part, given FAID
FAID  Part  Full
Yes   0.4   0.6
No    0.11  0.99

Retention, given Full/Part and FAID
Full/Part  FAID  Yes   No
Full       Yes   0.95  0.05
Full       No    0.8   0.2
Part       Yes   0.9   0.1
Part       No    0.99  0.01

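The invented tables in this example are enough to run a small inference by hand. The Python sketch below multiplies the three tables into a joint distribution and sums over it, assuming the structure FAID -> Full/Part, with FAID and Full/Part both parents of Retention. One caveat: the FAID = No row of the Full/Part table is printed as 0.11 / 0.99, which sums to 1.10, so the sketch assumes 0.01 / 0.99 in order to have a valid distribution. That substitution, like the function names, is an assumption and not a value confirmed by the source.

```python
# Tables from the invented example. FAID = financial aid.
P_FAID = {"Yes": 0.2, "No": 0.8}

# P(Full/Part | FAID). The FAID = "No" row is printed as 0.11 / 0.99,
# which sums to 1.10; 0.01 / 0.99 is ASSUMED here so the row is valid.
P_STATUS_GIVEN_FAID = {
    "Yes": {"Part": 0.4, "Full": 0.6},
    "No": {"Part": 0.01, "Full": 0.99},
}

# P(Retention = Yes | Full/Part, FAID)
P_RETAINED = {
    ("Full", "Yes"): 0.95, ("Full", "No"): 0.80,
    ("Part", "Yes"): 0.90, ("Part", "No"): 0.99,
}


def joint(faid, status, retained):
    """P(FAID, Full/Part, Retention) from the network factorization."""
    p_ret = P_RETAINED[(status, faid)]
    return (P_FAID[faid]
            * P_STATUS_GIVEN_FAID[faid][status]
            * (p_ret if retained else 1.0 - p_ret))


def query(retained, faid=None):
    """P(Retention = retained, FAID = faid); faid=None sums over FAID."""
    faids = list(P_FAID) if faid is None else [faid]
    return sum(joint(f, s, retained) for f in faids for s in ("Full", "Part"))


# Prediction: overall probability of being retained.
p_retained = query(True)
# Diagnosis: probability of having financial aid, given NOT retained.
p_aid_given_not_retained = query(False, "Yes") / query(False)

print(round(p_retained, 5), round(p_aid_given_not_retained, 5))
```

The same joint distribution answers both directions of question: from causes to outcome (prediction) and from an observed outcome back to its likely causes (diagnosis).
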
Method
- Data Processing
  - Data Source
    - Institutional Research and Assessment Office data files from
      which Fall FTF cohorts for 2000 through 2006 were extracted
  - Working Data File
    - Merge extracted FTF cohort data into an aggregated data file
    - Records: 13542, variables: 216
  - Dependent variables: retention rate and graduation rate, computed
    from enrollment and graduation variables in the working data file
  - Discretization: transform continuous variables into categorical
    variables

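Discretization can be sketched in a few lines of Python. The cut points and labels below are hypothetical: the slides report how many categories each discretized variable has, but not where the boundaries fall, so `GPA_EDGES`, `GPA_LABELS` and the `discretize` helper are illustrative assumptions only.

```python
from bisect import bisect_right

# Hypothetical cut points and labels, invented for illustration.
GPA_EDGES = [2.0, 2.5, 3.0, 3.5]
GPA_LABELS = ["below 2.0", "2.0-2.49", "2.5-2.99", "3.0-3.49", "3.5 and above"]


def discretize(value, edges, labels):
    """Map a continuous value to the label of the bin it falls in.

    Bins are closed on the left: a value equal to a cut point goes
    into the higher bin.
    """
    if len(labels) != len(edges) + 1:
        raise ValueError("need exactly one more label than cut points")
    return labels[bisect_right(edges, value)]


gpas = [1.7, 2.0, 2.49, 3.2, 3.9]
binned = [discretize(g, GPA_EDGES, GPA_LABELS) for g in gpas]
print(binned)
```

The choice of cut points matters, because each bin becomes one state of the node and therefore one row or column in every probability table that involves it.
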
Method
- Developing the Bayesian Belief Network (BBN) model using a computer
application program called Netica™ 3.25
- Selection of Variables
  - Input variables selected from those commonly used in SHSU IRA
    Office studies of freshman outcomes
  - Variable selection reinforced by variables used in "Data Mining
    with Bayesian Belief Networks to Examine Retention and Graduation
    at a Public University" by P. Edamatsu, D. Jankovic and Pokrajac,
    presented at AIR 2007 Forum

Data Description
Name                 Label                                   Type         No. of Values
1 Year Retention     1 Year Retention                        Discrete     2
Admitted_HscholGrad  Year Admitted - Graduation Year         Discretized  4
College              College of Student's Enrollment         Discrete     7
Ethnicity            Ethnicity                               Discrete     6
Gender               Gender                                  Discrete     2
I_O                  In-state (I) / Out-of-state (O)         Discrete     2
F_T                  Full or Part Time                       Discrete     2
BKLC                 Bearkat Learning Community Cohort       Discrete     2
PBSP                 Probation or Suspension                 Discrete     6
ONOFF                Whether or not student lives on campus  Discrete     2
FAID                 Financial Aid                           Discrete     2
HSrank               Rank in High School                     Discretized  5
SAT_Total            SAT Total Score                         Discretized  6
GPA                  End of Semester GPA                     Discretized  8
Graduated_6yrs       6 Year Graduation                       Discrete     2

Method
- Assumptions in the Model Structure
  - Graduation and Retention (dependent variables) are leaf nodes
  - Gender, Ethnicity, Full/Part, and Probation and Suspension (PBSP)
    are root nodes and are independent of each other

Method
- Building the Model Structure
  - In order to specify the relationships between the selected
    variables from PRIOR information, I took inspiration from:
    - The structure used by Edamatsu, D. Jankovic and Pokrajac in
      their study
    - Knowledge about variables related to the dependent outcome
      variables from other SHSU IRA Office studies
    - Knowledge about relationships between pairs of variables from
      correlation matrices that included all selected variables

Structure Encoded with Data Probability

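Once a structure is fixed, the simplest way to attach probabilities from data is to count how often each child state occurs for each combination of parent states. The Python sketch below shows that relative-frequency estimate. It is not a description of how Netica learns its tables, and the records, the state names and the `estimate_cpt` helper are all invented for illustration.

```python
from collections import Counter, defaultdict


def estimate_cpt(records, child, parents):
    """Estimate P(child | parents) by relative frequency.

    Returns {parent_state_tuple: {child_state: probability}}.
    """
    counts = defaultdict(Counter)
    for rec in records:
        key = tuple(rec[p] for p in parents)
        counts[key][rec[child]] += 1
    return {
        key: {state: n / sum(c.values()) for state, n in c.items()}
        for key, c in counts.items()
    }


# Invented toy records; PBSP = probation or suspension status.
records = [
    {"PBSP": "Good standing", "Retained": "Yes"},
    {"PBSP": "Good standing", "Retained": "Yes"},
    {"PBSP": "Good standing", "Retained": "Yes"},
    {"PBSP": "Good standing", "Retained": "No"},
    {"PBSP": "Probation", "Retained": "Yes"},
    {"PBSP": "Probation", "Retained": "No"},
]

cpt = estimate_cpt(records, "Retained", ["PBSP"])
print(cpt)
```

Parent combinations that never appear in the data get no row at all under this estimate, which is one reason sparse data makes a network with many parents per node hard to fill in.
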
Main Results
- Posterior Analysis
  - Students' gender determines their college choice and high school
    rank
  - Ethnicity influences students' college choice
  - 1-year retention rate and 6-year graduation rate depend directly
    on GPA and on students' probation or suspension status
  - Students' in-state or out-of-state status and ethnicity are related
    to how many years after high school graduation they applied to the
    university
  - Students living on campus perform slightly better than those living
    off campus

Results Pertaining to Gender
- There is no significant difference in graduation rate and retention
rate between males and females
- More females than males have high school ranks above the 1st quartile
(counted from the top)
- Females
  - Tend to study majors in the Colleges of Arts and Sciences and
    Humanities and Social Sciences
- Males
  - Tend to study majors in the Colleges of Arts and Sciences and
    Business Administration

Results Pertaining to Ethnicity
- No significant difference in graduation rate and retention rate among
ethnicities
- Native Americans are less likely (86.7%) to attend university within
1 year after high school compared to other ethnicities (around 95%),
and 91% are in-state students, while 99% of other ethnicities are
in-state
- 46.6% of White Americans enrolled in the College of Arts and Sciences,
compared to 39% of other ethnicities
- 94% of African Americans live on campus, compared to 75% - 86% of
other ethnicities

Results Pertaining to GPA
- Bearkat Learning Community students have a higher probability of
having a higher GPA
- Students with low GPA (below 2)
  - Have only a 27% graduation rate and a 55% 1-year retention rate
- Students with higher GPA (2 to 2.5)
  - Have a 43% graduation rate and a 75% retention rate
- Students with highest GPA (above 3.75)
  - Have a 70% graduation rate and an 85% retention rate

Results Pertaining to Probation and Suspension,
In-State or Out-of-State Status, and Living On or Off Campus
- Students on probation or suspended in the first year
  - Have only a 22% graduation rate and a 45% retention rate
- Good standing students
  - Have a 53% graduation rate and a 76% retention rate
- Out-of-state students are less likely (87%) to attend university
within 1 year after high school, compared to in-state students (95%)
- There are no GPA distribution differences between in-state students
and out-of-state students
- Students living on campus have a slightly higher GPA, retention rate
and graduation rate

Conclusion
- Bayesian Belief Networks are good tools for analyzing institutional
research data
- BBN is a powerful methodology for graphically demonstrating
probability theory and can provide good references for university
administration
- Users could have difficulty using BBN if they do not have sufficient
data or theory base to provide prior probabilities. This is
particularly problematic when exploring a previously unknown network
- The validity and reliability of prior beliefs used in Bayesian
inference processing are critical. If this prior knowledge is not
reliable, then the Bayesian network is not useful

Bibliography
- P. Edamatsu, D. Jankovic and Pokrajac, Data Mining with Bayesian
Belief Networks to Examine Retention and Graduation at a Public
University, presented at AIR 2007 Forum
- David Heckerman, A Tutorial on Learning with Bayesian Networks, 1997
- Bruce G. Marcot, What Are Bayesian Belief Network Models?, 2005
- E. Castillo, J.M. Gutierrez and A.S. Hadi, Expert Systems and
Probabilistic Network Models, Springer Verlag, 1997
- Jie Cheng, Russell Greiner, Learning Bayesian Belief Network
Classifiers: Algorithms and System, 1995

Questions?