Title: A Classification Approach for Effective Noninvasive Diagnosis of Coronary Artery Disease
1A Classification Approach for Effective
Noninvasive Diagnosis of Coronary Artery Disease
- Advisor??? ??
- Students: D95402001, D95402005, D95402007
2Outline
- Background
- Motivation
- The Data Mining Process
- Conclusion
- Limitation
3Background
- Heart disease is a leading cause of death in Taiwan.
- It ranks third among causes of death by number of deaths.
- Number of deaths: 12,970
- Death rate: 57.1 per 100,000 people
- Coronary artery disease (CAD) is the most common type of heart disease.
4The illustration of CAD
Source: http://images.medicinenet.com/images/illustrations/heart_attack.jpg
5The Diagnosis of CAD
- Noninvasive approaches
- Laboratory tests
- Electrocardiogram (ECG)
- Ultrasound tests
- Invasive approach
- Coronary angiography
6The Important Risk Factors of CAD
- Smoking
- High blood pressure
- High blood cholesterol
- Diabetes
- Being overweight or obese
- Physical inactivity
7Motivation
- The invasive approach carries higher risk and cost than the noninvasive approaches.
- Are noninvasive approaches sufficient to diagnose the likelihood of CAD?
- How well do noninvasive approaches perform?
9The Data Mining Procedure Step 1
- Use medical examination results to predict whether a person has heart disease.
- This is a classification problem.
10The Data Mining Procedure Step 2
- Source: UCI KDD Archive website
- The raw data come from three hospitals in the United States.
11The Data Mining Procedure Step 3
Attributes: Range
- age: min 28, max 77, average 54
- sex: 1 = male; 0 = female
- chest pain type: 1 = typical angina; 2 = atypical angina; 3 = non-anginal pain; 4 = asymptomatic
- resting blood pressure: 0 = 120 or below; 1 = above 120
- cholesterol: 0 = below 200; 1 = [200, 240); 2 = above 240
- fasting blood sugar: 0 = below 120; 1 = above 120
- electrocardiographic results: 0 = normal; 1 = ST-T wave abnormality; 2 = left ventricular hypertrophy
- maximum heart rate: 60,138
- exercise-induced angina: 1 = yes; 0 = no
- ST depression induced by exercise
- slope of the peak exercise ST segment: 1 = upsloping; 2 = flat; 3 = downsloping
- number of major vessels: 0, 1, 2, 3
- thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
- diagnosis: 0 = less than 50% diameter narrowing; 1 = more than 50% diameter narrowing
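The coding of continuous attributes into categories can be sketched as below. This is only an illustration under stated assumptions: the function names are hypothetical, cholesterol code 0 is taken to mean "below 200", a value of exactly 240 is assigned to code 2, and the units are assumed to be mg/dl.

```python
def encode_cholesterol(value):
    """Map a cholesterol reading to a code: 0 below 200, 1 for [200, 240), 2 otherwise.

    Assumption: exactly 240 falls into code 2; the slide leaves this boundary open.
    """
    if value < 200:
        return 0
    if value < 240:
        return 1
    return 2


def encode_fasting_blood_sugar(value):
    """Map fasting blood sugar to 1 if above 120, else 0."""
    return 1 if value > 120 else 0


def encode_diagnosis(narrowing_percent):
    """Map vessel diameter narrowing (in percent) to 1 if above 50, else 0."""
    return 1 if narrowing_percent > 50 else 0


# Toy readings, not taken from the data set.
print(encode_cholesterol(233), encode_fasting_blood_sugar(110), encode_diagnosis(65))
```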
12The Data Mining Procedure Step 4
- Source: UCI KDD Archive
- Training set: Cleveland Clinic Foundation, 303 records
- Testing set: Hungarian Institute of Cardiology, 294 records
13The Data Mining Procedure Step 5
- Problem with the data: missing values
- Approach: discard the records containing missing values
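Discarding incomplete records can be sketched as follows. This is a minimal illustration, not the preprocessing actually run for this project: the helper name `drop_incomplete` is hypothetical, the rows are made-up toy values, and the use of "?" as the missing-value marker is an assumption about the file format.

```python
MISSING = "?"  # assumption: missing fields are marked with "?"


def drop_incomplete(records, missing=MISSING):
    """Return only the records in which no field equals the missing marker."""
    return [record for record in records if missing not in record]


# Toy rows (age, sex, chest pain type, resting blood pressure, cholesterol).
rows = [
    ["55", "1", "2", "130", "210"],
    ["61", "0", "4", "?", "250"],
    ["48", "1", "3", "118", "?"],
]

complete = drop_incomplete(rows)
print(len(rows) - len(complete), "records discarded")
```

Dropping whole records is the simplest option; it shrinks the data set but avoids having to guess the missing values.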
14The Data Mining Procedure Step 6
- This step is skipped because aggregating records or combining the original attributes would not be useful.
15The Data Mining Procedure Step 7
- The goal is to obtain rules from the models to support medical decisions.
- Decision tree (C4.5) and Bayesian network are used as our data mining approaches.
- WEKA is used as our data mining tool.
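C4.5 chooses the attribute to split on by gain ratio (information gain divided by split information). The sketch below computes it for one categorical attribute on toy data; it is an illustration of the criterion only, not WEKA's implementation, and the function names are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(items):
    """Shannon entropy (in bits) of a list of discrete values."""
    total = len(items)
    return -sum((n / total) * log2(n / total) for n in Counter(items).values())


def gain_ratio(values, labels):
    """Gain ratio of splitting `labels` by the categorical attribute `values`."""
    total = len(labels)
    partitions = {}
    for value, label in zip(values, labels):
        partitions.setdefault(value, []).append(label)
    # Expected entropy of the class labels after the split.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    info_gain = entropy(labels) - remainder
    split_info = entropy(values)
    # An attribute with a single value cannot split the data.
    return info_gain / split_info if split_info > 0 else 0.0


# Toy data: exercise-induced angina (1 = yes, 0 = no) against diagnosis.
angina = [1, 1, 0, 0, 1, 0]
diagnosis = [1, 1, 0, 0, 0, 0]
print(round(gain_ratio(angina, diagnosis), 3))
```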
16The Data Mining Procedure Step 7 (Cont.)
17The Data Mining Procedure Step 8
- To assess the models, we conduct experiments in two phases using comprehensive measures: sensitivity, specificity, and accuracy.
18The Data Mining Procedure Step 8 (Cont.)
- First phase
- Diverse combinations of fields were used as input variables. The fields are divided into four groups.
19The Data Mining Procedure Step 8: Model Assessment
- We use sensitivity, specificity, and accuracy as our model measures.
- Sensitivity reflects the risk of a missed diagnosis: the lower it is, the more patients with CAD are classified as healthy.
- Specificity reflects the risk of wasting medical resources: the lower it is, the more healthy people are sent for unnecessary further examination.
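The three measures can be computed from the actual and predicted labels as sketched below. This is a minimal illustration on made-up labels; the function name `assess` is hypothetical, and the code assumes that 1 marks a CAD diagnosis and that both classes occur in the data.

```python
def assess(actual, predicted, positive=1):
    """Return sensitivity, specificity, and accuracy for binary labels.

    Assumes at least one positive and one negative case in `actual`.
    """
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == positive and p == positive)
    fn = sum(1 for a, p in pairs if a == positive and p != positive)
    tn = sum(1 for a, p in pairs if a != positive and p != positive)
    fp = sum(1 for a, p in pairs if a != positive and p == positive)
    return {
        "sensitivity": tp / (tp + fn),  # share of CAD patients detected
        "specificity": tn / (tn + fp),  # share of healthy people cleared
        "accuracy": (tp + tn) / len(pairs),
    }


# Toy labels: 4 patients with CAD, 6 without.
actual = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
predicted = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
print(assess(actual, predicted))
```

In this toy case one of four CAD patients is missed (sensitivity 0.75) and two of six healthy people are flagged (specificity about 0.67).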
20The Data Mining Procedure Step 8: The Result of First Phase
21The Data Mining Procedure Step 8: The Result of First Phase (Cont.)
22The Data Mining Procedure Step 8: The Result of Second Phase
23The Data Mining Procedure Step 9
- We lack the medical resources to support the project, so it is difficult to deploy our models in practice. Although we could not put the models to use in a real-world setting, we gained considerable experience and knowledge from the data mining process.
24Conclusion
- Two classification approaches were used: decision tree and Bayesian network.
- Noninvasive and invasive approaches are used step by step.
25Limitation
- Limited noninvasive approaches
- Other noninvasive examinations, such as ultrasound tests, could be applied to improve the performance of the data mining model.
- Lack of explanation of the rules using domain knowledge