Improving quality of graduate students by data mining - PowerPoint PPT Presentation

Transcript and Presenter's Notes

1
Improving quality of graduate students by data
mining
  • Asst. Prof. Kitsana Waiyamai, Ph.D.
  • Dept. of Computer Engineering
  • Faculty of Engineering, Kasetsart University
  • Bangkok, Thailand

2
Content
  • PART I
  • Introduction to data mining
  • Data mining technique: association rule discovery
  • Data mining technique: data classification
  • PART II
  • Improving quality of graduate students by data
    mining
  • Conclusion

3
What Is Data Mining?
  • Knowledge Discovery from Data, KDD (Data Mining)
  • The process of nontrivial extraction of patterns
    from data. Patterns that are:
  • implicit,
  • previously unknown, and
  • potentially useful
  • Patterns must be comprehensible to human users.

4
Knowledge Discovery Process: Iterative and
Interactive Process
[Figure: knowledge discovery process diagram; surviving label: Mining Objective]
5
What kind of data can be mined?
  • Relational databases
  • Data warehouses
  • Transactional databases and Flat files
  • Advanced DB systems and information repositories
  • Object-oriented and object-relational databases
  • Spatial databases
  • Time-series data and temporal data
  • Text databases, multimedia databases
  • Heterogeneous and legacy databases
  • World Wide Web
  • Bioinformatic data

6
Two modes of data mining
  • Predictive data mining
  • Predict behavior based on historic data
  • Use data with known results to build a model that
    can be later used to explicitly predict values
    for different data
  • Methods: classification, prediction, etc.
  • Descriptive data mining
  • Describe patterns in existing data that may be
    used to guide decisions
  • Methods: association rule discovery, sequence
    pattern discovery, clustering, etc.

7
Data Mining Techniques
  • Data Clustering
  • Association rule discovery
  • Data Classification
  • Outlier detection
  • Data regression
  • Etc.

9
Data Classification
  • Classification is the process of assigning new
    objects to predefined categories or classes
  • Given a set of labeled records
  • Build a model
  • Predict labels for future unlabeled records
  • Example
  • Age, Educational background, Annual income,
    Current debts, Housing location => Making
    Decision
  • Degree = Master and Income = 7500 =>
    Credit = Excellent
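The example rule above can be sketched as a tiny rule-based classifier. This is a minimal illustration, not the presenter's implementation. Reading the rule as attribute = value tests is an assumption, and the names `RULES`, `classify`, and the `age` value are hypothetical.

```python
# Each rule pairs a set of attribute = value tests with a class label.
# Assumption: the slide's rule is read as equality tests on Degree and Income.
RULES = [
    ({"degree": "Master", "income": 7500}, "Excellent"),
]


def classify(record, rules, default=None):
    """Return the label of the first rule whose tests all match the record."""
    for conditions, label in rules:
        if all(record.get(attr) == value for attr, value in conditions.items()):
            return label
    return default  # no rule fired


# Hypothetical applicant record; "age" is not used by the rule.
applicant = {"age": 30, "degree": "Master", "income": 7500}
credit = classify(applicant, RULES)
```

A record that fails any test of a rule falls through to `default`, which is how an unlabeled case with no matching rule is reported here.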

10
Three-Step Process of Classification
  • Model construction: Training Data -> Classifier Model
  • Model evaluation: Testing Data -> Classifier Model
  • Classification: Unseen Data -> Classifier Model
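The three steps can be sketched in a few lines of plain Python. The learner below is a deliberately simple stand-in (it predicts the most common label for each value of one attribute), not the decision tree used later in the talk. All records and names are hypothetical.

```python
from collections import Counter, defaultdict


def build_model(training, attribute):
    """Step 1, model construction: most common label per attribute value."""
    counts = defaultdict(Counter)
    for features, label in training:
        counts[features[attribute]][label] += 1
    # Fallback label for attribute values never seen in training.
    default = Counter(label for _, label in training).most_common(1)[0][0]
    table = {value: c.most_common(1)[0][0] for value, c in counts.items()}
    return {"attribute": attribute, "table": table, "default": default}


def classify(model, features):
    """Step 3, classification: predict a label for one unlabeled record."""
    return model["table"].get(features[model["attribute"]], model["default"])


def evaluate(model, testing):
    """Step 2, model evaluation: fraction of test records predicted correctly."""
    correct = sum(classify(model, f) == label for f, label in testing)
    return correct / len(testing)


# Hypothetical labeled records, split into training and testing data.
training = [
    ({"degree": "Master"}, "Excellent"),
    ({"degree": "Master"}, "Excellent"),
    ({"degree": "Bachelor"}, "Fair"),
    ({"degree": "Bachelor"}, "Excellent"),
    ({"degree": "Bachelor"}, "Fair"),
]
testing = [
    ({"degree": "Master"}, "Excellent"),
    ({"degree": "Bachelor"}, "Excellent"),
]

model = build_model(training, "degree")
accuracy = evaluate(model, testing)            # measured on held-out data only
prediction = classify(model, {"degree": "PhD"})  # unseen data
```

Keeping the testing data out of `build_model` is the point of the second step: the accuracy is only meaningful if it is measured on records the model has not seen.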
11
Data Mining Tools
  • ANGOSS KnowledgeStudio
  • IBM Intelligent Miner
  • Megaputer PolyAnalyst
  • SAS Enterprise Miner
  • SGI Mineset
  • SPSS Clementine
  • Many others
  • More at http://www.kdnuggets.com/software

12
Data Mining Projects
  • Checklist
  • Start with well-defined questions
  • Define measures of success and failure
  • Main difficulty: no automation
  • Understanding the problem
  • Data preparation
  • Selection of the right mining methods
  • Interpretation

13
Using Data Mining for Improving Quality of
Engineering Graduates
  • Objective
  • Discover knowledge from large databases of
    engineering student records.
  • Discovered knowledge is useful in:
  • - Assisting in development of new curricula,
  • - Improvement of existing curricula,
  • - Helping students to select the appropriate
    major

14
Using a data mining technique to help students in
selecting their majors
  • Motivation
  • - A student's choice of major is a very important
    factor in his/her success.
  • - Students lack experience and information about
    each major.
  • Solution
  • - Find out the profiles of good students for
    each major using student profile database and
    course enrollment student databases (10 years)
  • - Determine the most appropriate major for each
    student

15
A Data Mining based Approach for Improving
Quality of Engineering Graduates
[Figure: architecture diagram with components: User, Java Servlet, Data Mining Tool, student profile database, course enrollment student databases]
16
Data for Data Mining

Student profile database:
Stu_code   Sex    Address   Sch_GPA   .....   GPA
37058063   male   Bangkok   2.5       .....   2.3
37058167   male   Songkla   3.4       .....   3.2
........   ....   .......   ......    ....    ....

Course enrollment student databases:
Stu_code   Sub_code   Term   Year   Grade
37058063   204111     1      2537   C
37058063   403111     1      2537   D
37058063   208111     1      2537   B
17
Data preparation: a classification model

Stu_code   Sex    Address   Sch_GPA   .....   GPA
37058063   male   Bangkok   2.5       .....   2.3
37058167   male   Songkla   3.4       .....   3.2
........   ....   .......   ......    ....    ....

Stu_code   Sub_code   Term   Year   Grade
37058063   204111     1      2537   C
37058063   403111     1      2537   D
37058063   208111     1      2537   B

Stu_code   Sex    204111   403111   .....   GPA
37058063   male   Medium   Low      .....   2.3
37058167   male   High     High     .....   3.2
........   ....   ......   ......   .....   ...
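The preparation step joins the two sources into one row per student, with one column per subject holding a discretized grade. A minimal sketch follows. The sample rows come from the slide, where C maps to Medium and D maps to Low. The rest of the `GRADE_LEVEL` mapping (A and B as High, F as Low) and the function name `prepare` are assumptions.

```python
# Grade discretization. C -> Medium and D -> Low match the slide's example;
# the other entries are assumptions made for illustration.
GRADE_LEVEL = {"A": "High", "B": "High", "C": "Medium", "D": "Low", "F": "Low"}

profiles = [
    {"Stu_code": "37058063", "Sex": "male", "Address": "Bangkok",
     "Sch_GPA": 2.5, "GPA": 2.3},
]
enrollments = [
    {"Stu_code": "37058063", "Sub_code": "204111", "Term": 1, "Year": 2537, "Grade": "C"},
    {"Stu_code": "37058063", "Sub_code": "403111", "Term": 1, "Year": 2537, "Grade": "D"},
    {"Stu_code": "37058063", "Sub_code": "208111", "Term": 1, "Year": 2537, "Grade": "B"},
]


def prepare(profiles, enrollments, subjects):
    """Build one training row per student: profile fields plus subject levels."""
    levels = {}
    for e in enrollments:
        levels[(e["Stu_code"], e["Sub_code"])] = GRADE_LEVEL.get(e["Grade"])
    rows = []
    for p in profiles:
        row = {"Stu_code": p["Stu_code"], "Sex": p["Sex"]}
        for subject in subjects:
            # None marks a subject the student has no record for.
            row[subject] = levels.get((p["Stu_code"], subject))
        row["GPA"] = p["GPA"]
        rows.append(row)
    return rows


rows = prepare(profiles, enrollments, subjects=["204111", "403111"])
```

Turning each subject into its own column is what lets a decision tree test conditions such as "204111 = High" at an internal node.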
18
Global Classification Model
A global decision tree determines which major
is appropriate for each student. Each internal
node represents a test on the student's profile.
Each leaf node represents an appropriate major
to be selected.
19
Drawbacks of Global Classification Model
  • - Low precision (50%) due to the large number of
    majors
  • - The number of students differs across
    departments => the model cannot correctly predict
    the best major to be selected.
  • - The model proposes a single major; a set of
    possible majors ordered by appropriateness score
    would be preferred.

20
Classification Model for Each Major
  • - A decision tree predicts whether a student is
    likely to be a good student in a given major.
  • Good students are those who graduate within 4
    years and rank within the first 40 in a given
    major.
  • - Leaf nodes represent two classes: Good and
    Bad

21
Advantages of the Per-Major Classification Model
  • Good precision (80%)
  • The model predicts the best major to be selected
    even if the number of students in each major is
    different
  • It proposes a set of possible majors to be
    selected, ordered by appropriateness score.
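Ranking majors with one model per major can be sketched as follows. The slides do not say how the appropriateness score is computed. This sketch assumes each per-major model returns a number between 0 and 1 (for instance, the share of Good students at the matching leaf). The major names, scores, and scoring functions are hypothetical stand-ins for trained decision trees.

```python
def rank_majors(student, major_models):
    """Score the student with every per-major model, best score first."""
    scored = [(major, model(student)) for major, model in major_models.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical stand-ins for trained per-major decision trees. Each returns
# an assumed appropriateness score for being a Good student in that major.
major_models = {
    "Computer Engineering": lambda s: 0.9 if s["204111"] == "High" else 0.3,
    "Electrical Engineering": lambda s: 0.8 if s["403111"] == "High" else 0.4,
}

student = {"204111": "High", "403111": "Low"}
ranking = rank_majors(student, major_models)
best_major = ranking[0][0]
```

Because every major is scored independently, a department with few students does not get crowded out by larger ones, and the full ordered list is available instead of a single answer.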

Problems encountered
  • Database size
  • Other factors that could affect students'
    decisions
  • Teacher preference, etc.

22
Presentation of Discovered Knowledge
23
Applying Association rule discovery for Grade
prediction
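The transcript keeps only the title of this slide. As a general illustration of association rule discovery applied to grades, the sketch below computes the two standard measures, support and confidence, for a rule of the form "High in one subject => High in another". The transactions are hypothetical and are not the presenter's data or results.

```python
def support(transactions, itemset):
    """Fraction of transactions that contain every item in itemset."""
    itemset = set(itemset)
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(transactions, antecedent, consequent):
    """Support of antecedent and consequent together, divided by antecedent support."""
    both = support(transactions, set(antecedent) | set(consequent))
    ante = support(transactions, antecedent)
    return both / ante if ante else 0.0


# Hypothetical data: one set of "subject=level" items per student.
transactions = [
    {"204111=High", "403111=High"},
    {"204111=High", "403111=High"},
    {"204111=High", "403111=Low"},
    {"204111=Low", "403111=Low"},
]

rule_support = support(transactions, {"204111=High", "403111=High"})
rule_confidence = confidence(transactions, {"204111=High"}, {"403111=High"})
```

A rule with high confidence could then be used to anticipate a student's level in the consequent subject once the antecedent grade is known.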
24
Grade Prediction for the Coming Term
25
Presentation of Discovered Knowledge
26
Conclusion and Future Work
  • Application of data mining in Education
  • Use data mining techniques for improving quality
    of engineering students
  • Apply data mining techniques to several other
    educational domains.