
New Reduction Algorithm Based on Decision Power of Decision Table

- Jiucheng Xu, Lin Sun
- College of Computer Information Technology, Henan Normal University, Xinxiang, Henan, China

Introduction

- Rough set theory is a valid mathematical tool for dealing with imprecise, uncertain, vague or incomplete knowledge in a decision system (see [1]). Reduction of knowledge has always been one of its most important topics. Pawlak (see [1]) first proposed attribute reduction from the algebraic point of view. Wang (see [2, 3]) proposed several reduction theories based on the information point of view and introduced two novel heuristic algorithms of knowledge reduction with time complexities O(|C||U|²) + O(|U|³) and O(|C|²|U|) + O(|C||U|³) respectively, where |C| denotes the number of conditional attributes and |U| the number of objects in U; the heuristic algorithm based on mutual information (see [4]) has time complexity O(|C||U|²) + O(|U|³). These reduction algorithms still have their own limitations, such as sensitivity to noise, relatively high complexity, nonequivalence in the representation of knowledge reduction, and drawbacks in dealing with inconsistent decision tables.

- It is known that the reliability and the coverage of a decision rule are the most important standards for estimating decision quality (see [5, 6]), but the algorithms above (see [1, 2, 3, 7, 8, 9]) cannot reflect the change of decision quality objectively. To compensate for their limitations, we construct a new method for separating consistent objects from inconsistent objects, together with the corresponding judgment criterion, stated as an inequality, used in searching for the minimal or optimal reducts. We then design a new heuristic reduction algorithm with relatively lower time complexity. For large decision tables, since usually |U| >> |C|, this reduction algorithm is more efficient than the algorithms discussed above. Finally, six data sets from the UCI repository are used to illustrate the performance of the proposed algorithm, and a comparison with the existing methods is reported.

The Proposed Approach

- Limitations of Current Reduction Algorithms
- One can analyze the algorithms based on the positive region and on the conditional entropy in depth. Firstly, if for any P ⊆ C the P-quality of approximation relative to D equals the C-quality of approximation relative to D, i.e., γ_P(D) = γ_C(D), and there is no P′ ⊂ P such that γ_{P′}(D) = γ_C(D), then P is called a reduct of C relative to D (see [1, 7, 8, 9]). In these algorithms, whether a conditional attribute is redundant depends on whether the lower approximations of the decision classes change after the attribute is deleted. Accordingly, when new inconsistent objects are added to the decision table, it is not taken into account whether the conditional probability distributions of the primary inconsistent objects change in the corresponding decision classes (see [10]). Hence, as long as the generated deterministic decision rules remain the same, i.e., the prediction of these rules does not change, these algorithms regard the decision quality as unchanged; they only consider whether the prediction of the deterministic decision rules changes after reduction (a sketch of this criterion is given after this discussion).

- Secondly, if for any P ⊆ C, H(D|P) = H(D|C) and P is independent relative to D, then P is called a reduct of C relative to D (see [2, 3, 10, 11]). Here, whether a conditional attribute is redundant depends on whether the conditional entropy of the decision table changes after the attribute is deleted. It is known that the conditional entropy generated by POS_C(D) is 0, so only U − POS_C(D) can lead to a change of conditional entropy: if the conditional probability distributions of the newly added and the primary inconsistent objects change in the corresponding decision classes, the conditional entropy of the whole decision table changes. Therefore, the main criteria of these algorithms for estimating decision quality cover two aspects: the invariability of the deterministic decision rules and the invariability of the reliability of the nondeterministic decision rules (a sketch of this criterion also follows this discussion).
- So, the researchers above only consider the change of reliability of the decision rules after reduction. However, in decision applications, besides the reliability of decision rules, the object coverage of decision rules is also one of the most important standards for estimating decision quality, so the current reduction algorithms above cannot reflect the change of decision quality objectively. Meanwhile, in the algebraic view the significance of an attribute is computed from the cardinality of the positive region, which merely describes the subsets of certain classes in U, while in the information view the significance of an attribute only indicates how objects of different decision classes are detached under the equivalence relation of the conditional attribute subset. For the inconsistent objects, however, these measures for attribute reduction do not divide U into consistent and inconsistent object sets in an inconsistent decision table. Therefore, these algorithms are not equivalent in the representation of knowledge reduction for inconsistent decision tables (see [12]). It is necessary to seek a new kind of measure to search for precise reducts effectively.
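
To make the first (algebraic) criterion concrete, the following Python sketch computes the positive region POS_P(D) and the quality of approximation γ_P(D) for a decision table stored as a list of rows (e.g., dictionaries keyed by attribute name). The data layout and the helper names partition, positive_region and gamma are assumptions made for this summary, not part of the original paper.

    from collections import defaultdict

    def partition(rows, attrs):
        # Group object indices by their values on attrs, i.e., compute U/attrs.
        blocks = defaultdict(list)
        for i, row in enumerate(rows):
            blocks[tuple(row[a] for a in attrs)].append(i)
        return list(blocks.values())

    def positive_region(rows, cond_attrs, dec_attr):
        # POS_P(D): objects whose P-equivalence class is consistent on the decision.
        pos = set()
        for block in partition(rows, cond_attrs):
            if len({rows[i][dec_attr] for i in block}) == 1:   # consistent class
                pos.update(block)
        return pos

    def gamma(rows, cond_attrs, dec_attr):
        # Quality of approximation: gamma_P(D) = |POS_P(D)| / |U|.
        return len(positive_region(rows, cond_attrs, dec_attr)) / len(rows)

In this setting, P is free of redundant attributes in the algebraic sense when gamma(rows, P, d) equals gamma(rows, C, d) and no proper subset of P preserves that value.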
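
The second (information-view) criterion can be sketched in the same style, reusing partition from the sketch above; the code evaluates the standard conditional entropy H(D|P) = −Σ_i p(X_i) Σ_j p(Y_j|X_i) log2 p(Y_j|X_i) over the partition U/P. It is again only an illustrative sketch under the same assumed data layout.

    from collections import Counter
    from math import log2

    def conditional_entropy(rows, cond_attrs, dec_attr):
        # H(D|P) over the partition U/P (partition() as defined above).
        n = len(rows)
        h = 0.0
        for block in partition(rows, cond_attrs):
            p_block = len(block) / n
            counts = Counter(rows[i][dec_attr] for i in block)
            for c in counts.values():
                p = c / len(block)
                h -= p_block * p * log2(p)
        return h

An entropy-based reduct keeps H(D|P) equal to H(D|C); as noted above, this constrains only the reliability of the rules, not their coverage.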

- Representation of Decision Power on a Decision Table
- Now, in a decision table S = (U, C, D, V, f), suppose D0 = U − POS_C(D); from the definition of the positive region we have CD0 = D0. Suppose that none of the sets AD0, AD1, AD2, …, ADm is empty; then these sets also form a decision partition of U. If some decision class ADi is empty, then ADi is called a redundant set of the new decision partition; removing the redundant sets makes no difference to the decision partition.
- Suppose that the condition attribute subset A is a reduction of C. The partition AD0, AD1, AD2, …, ADm is thus divided into consistent and inconsistent object sets respectively, with all the inconsistent objects detached into the separate set AD0. On the basis of this idea, the new partition under the condition attribute set C is CD0, CD1, CD2, …, CDm, and it generates a new equivalence relation, denoted by RD, with U/RD = {CD0, CD1, CD2, …, CDm}. Accordingly, the presented decision partition U/RD not only detaches the consistent objects of different decision classes in U, but also separates the consistent objects from the inconsistent ones, whereas U/D is obtained merely by detaching objects of different decision classes into the corresponding equivalence classes.
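
A minimal sketch of how the partition U/RD described above can be built, under the row-list layout assumed earlier and reusing positive_region from the first sketch: all inconsistent objects (U − POS_C(D)) form one block and the consistent objects are split by decision value. The function name and the placement of the inconsistent block first are illustrative choices.

    from collections import defaultdict

    def decision_partition_rd(rows, cond_attrs, dec_attr):
        # Return [CD0, CD1, ..., CDm]: CD0 holds the inconsistent objects,
        # the other blocks split the consistent objects by decision value.
        pos = positive_region(rows, cond_attrs, dec_attr)
        cd0 = [i for i in range(len(rows)) if i not in pos]
        by_decision = defaultdict(list)
        for i in sorted(pos):
            by_decision[rows[i][dec_attr]].append(i)
        # Empty decision classes would be redundant sets; they are simply never created.
        return ([cd0] if cd0 else []) + list(by_decision.values())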

- Definition 1. Given a decision table S = (U, C, D, V, f), let P ⊆ C with U/P = {X1, X2, …, Xt}, D = {d} with U/D = {Y1, Y2, …, Ym}, and U/RD = {CY0, CY1, CY2, …, CYm}. Then the decision power of the equivalence relation RD with respect to P is denoted by S(RD|P) and defined by formula (1).

- Theorem 1. Let r ∈ P ⊆ C; then S(RD|P) ≥ S(RD|P − {r}).
- Theorem 2. If S is a consistent decision table, then U/RD = U/D. Assume that …; then S(RD|P) = S(RD|P − {r}) ⇔ H(D|P) = H(D|P − {r}) ⇔ γ_P(D) = γ_{P−{r}}(D). If S is an inconsistent decision table, then CY0 = Y0 ≠ ∅; assume that …; then S(RD|P) = S(RD|P − {r}) ⇔ γ_P(D) = γ_{P−{r}}(D).

- Theorem 3. Let P be a subset of the condition attribute set C on U; then any r ∈ P is said to be dispensable in P with respect to D if and only if S(RD|P) = S(RD|P − {r}).
- Definition 2. If P ⊆ C, then the significance of any attribute r ∈ C − P with respect to D, defined in the algebraic view, is denoted by
  SGF(r, P, D) = S(RD|P ∪ {r}) − S(RD|P).    (2)
- Definition 3. Let P ⊆ C be a family of equivalence relations on U; then P is an attribute reduction of C with respect to D if it satisfies S(RD|P) = S(RD|C) and S(RD|P′) < S(RD|P) for any P′ ⊂ P.
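
Given any concrete implementation of the decision power S(RD|·) (its closed form is the formula numbered (1) above), the attribute significance of Definition 2 is a one-line difference. The decision_power callable below is only a placeholder signature assumed for illustration.

    def significance(decision_power, rows, P, r, dec_attr):
        # SGF(r, P, D) = S(RD|P ∪ {r}) − S(RD|P) for an attribute r in C − P.
        return (decision_power(rows, set(P) | {r}, dec_attr)
                - decision_power(rows, set(P), dec_attr))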

- Design of the Reduction Algorithm Based on Decision Power
- Input: a decision table S = (U, C, D, V, f).
- Output: a relative reduction P.
- (1) Calculate POS_C(D) and U − POS_C(D) for the new partition U/RD.
- (2) Calculate S(RD|C) and CORE_D(C), and let P = CORE_D(C).
- (3) If P = ∅, then turn to (4); if S(RD|P) = S(RD|C), then turn to (6).
- (4) Calculate S(RD|P ∪ {r}) for every attribute r ∈ C − P, and select an attribute r with the maximum S(RD|P ∪ {r}); if this r is not unique, select the one with the maximum |U/(P ∪ {r})|.
- (5) Let P = P ∪ {r}; if S(RD|P) ≠ S(RD|C), then turn to (4), else let P = P − CORE_D(C) and t = |P|;
  for (i = 1; i ≤ t; i++)
  { take ri ∈ P and let P = P − {ri};
    if S(RD|P ∪ CORE_D(C)) < S(RD|C) then P = P ∪ {ri}; }
  P = P ∪ CORE_D(C).
- (6) The output P is a minimum relative reduction.
- (7) End.
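
The sketch below mirrors steps (1)-(7) in Python, reusing partition from the earlier sketch and taking decision_power, the measure S(RD|·) of Definition 1, as an argument together with the relative core; the function and parameter names are assumptions made for illustration, and the exact equality tests assume decision_power returns exact values (e.g., fractions.Fraction).

    def reduce_by_decision_power(rows, C, dec_attr, decision_power, core):
        # Greedy forward selection from the core, then a backward check of the
        # selected non-core attributes, following steps (1)-(6) above.
        C = list(C)
        target = decision_power(rows, set(C), dec_attr)      # S(RD|C), step (2)
        P = set(core)
        # Steps (3)-(5): add the attribute maximising S(RD|P ∪ {r}) until the
        # decision power of P reaches that of C; ties broken by |U/(P ∪ {r})|.
        while P != set(C) and (not P or decision_power(rows, P, dec_attr) != target):
            r_best = max((r for r in C if r not in P),
                         key=lambda r: (decision_power(rows, P | {r}, dec_attr),
                                        len(partition(rows, P | {r}))))
            P.add(r_best)
        # Second half of step (5): drop any selected non-core attribute that is
        # dispensable (the core always stays inside P).
        for r in list(P - set(core)):
            if decision_power(rows, P - {r}, dec_attr) == target:
                P.discard(r)
        return P                                             # step (6)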

- Experimental Results
- Example 1. A decision table S = (U, C, D, V, f) is shown in Table 1 below, where U = {x1, x2, …, x10}, C = {a1, a2, …, a5}, and D = {d}.
- Table 2 below gives the significance of each attribute relative to the core {a2} and the relative reducts; the algorithm in [7], CEBARKCC in [3], Algorithm 2 in [12], and the proposed algorithm are denoted by A1, A2, A3, and A4 respectively, and m, n denote the numbers of attributes and of objects in the universe respectively.
- From Table 2, the significance of attribute a4 in [3, 7] is relatively minimal, and their reducts are {a1, a2, a3, a5} rather than the minimum relative reduct {a2, a4, a5}; however, SGF(a4, {a2}, D) is relatively maximal, so the minimum relative reduction {a2, a4, a5} is generated by A3 and A4. Compared with A1 and A2, the proposed algorithm does not need much mathematical computation, logarithm computation in particular. Meanwhile, the general schema of adding attributes is typical of older approaches to forward selection of attributes, although they use different evaluation measures; it is clear that, on the basis of U/RD, the proposed decision power is a feasible way to discuss the roughness of rough sets. Hence, the new heuristic information compensates for the limitations of the current algorithms pointed out above, and the algorithm's effect on reduction of knowledge is remarkable.

- Here we choose six discrete data sets from the UCI repository and five algorithms to carry out further experiments on a PC (P4 2.6 GHz, 256 MB RAM, Windows XP) under JDK 1.4.2; the results are shown in Table 3 below, where T or F indicates whether the data set is consistent, m and n are the numbers of attributes before and after reduction respectively, t is the running time, and A5 denotes the algorithm in [6].

Conclusion

- In this paper, to reflect the change of decision quality objectively, a measure for reduction of knowledge and its judgment theorem with an inequality are established by introducing the decision power from the algebraic point of view. To compensate for the current disadvantages of the classical algorithms, we design an efficient complete algorithm for reduction of knowledge with the time complexity reduced to O(|C|²|U|) (in preprocessing, the complexity of computing U/C based on radix sorting is cut down to O(|C||U|), and the complexity of measuring attribute significance based on the positive region is reduced to O(|C\P||U| − |U\P|) (see [9])), and the result of this method is objective.
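
The preprocessing step mentioned above can be illustrated as follows: a radix-style computation of U/C that refines the partition one attribute at a time, so the total work stays linear in |C| and |U|. This is an illustrative substitute written for this summary under the earlier row-list assumption, not the exact procedure of [9].

    from collections import defaultdict

    def partition_u_c(rows, C):
        # Compute U/C by refining the trivial partition {U} one attribute at a
        # time; each pass touches every object once, giving O(|C||U|) work.
        blocks = [list(range(len(rows)))]
        for a in C:
            refined = []
            for block in blocks:
                by_value = defaultdict(list)
                for i in block:
                    by_value[rows[i][a]].append(i)
                refined.extend(by_value.values())
            blocks = refined
        return blocks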

References

- 1. Pawlak, Z.: Rough Sets and Intelligent Data Analysis. Information Sciences 147, 1-12 (2002)
- 2. Wang, G.Y.: Rough Reduction in Algebra View and Information View. International Journal of Intelligent Systems 18, 679-688 (2003)
- 3. Wang, G.Y., Yu, H., Yang, D.C.: Decision Table Reduction Based on Conditional Information Entropy. Journal of Computers 25(7), 759-766 (2002)
- 4. Miao, D.Q., Hu, G.R.: A Heuristic Algorithm for Reduction of Knowledge. Journal of Computer Research and Development 36(6), 681-684 (1999)
- 5. Liang, J.Y., Shi, Z.Z., Li, D.Y.: Applications of Inclusion Degree in Rough Set Theory. International Journal of Computational Cognition 1(2), 67-68 (2003)
- 6. Jiang, S.Y., Lu, Y.S.: Two New Reduction Definitions of Decision Table. Mini-Micro Systems 27(3), 512-515 (2006)
- 7. Guan, J.W., Bell, D.A.: Rough Computational Methods for Information Systems. Artificial Intelligence 105, 77-103 (1998)
- 8. Liu, S.H., Sheng, Q.J., Wu, B., et al.: Research on Efficient Algorithms for Rough Set Methods. Journal of Computers 26(5), 524-529 (2003)
- 9. Xu, Z.Y., Liu, Z.P., et al.: A Quick Attribute Reduction Algorithm with Complexity of max(O(|C||U|), O(|C|²|U/C|)). Journal of Computers 29(3), 391-399 (2006)
- 10. Slezak, D.: Approximate Entropy Reducts. Fundamenta Informaticae 53, 365-390 (2002)
- 11. Slezak, D., Wroblewski, J.: Order Based Genetic Algorithms for the Search of Approximate Entropy Reducts. In: Wang, G.Y., Liu, Q., Yao, Y.Y., Skowron, A. (eds.) RSFDGrC 2003. LNCS, vol. 2639, p. 570. Springer, Berlin, Heidelberg (2003)
- 12. Liu, Q.H., Li, F., et al.: An Efficient Knowledge Reduction Algorithm Based on New Conditional Information Entropy. Control and Decision 20(8), 878-882 (2005)
- 13. Jiang, S.Y.: An Incremental Algorithm for the New Reduction Model of Decision Table. Computer Engineering and Applications 28, 21-25 (2005)
- 14. Slezak, D.: Various Approaches to Reasoning with Frequency-Based Decision Reducts: A Survey. In: Polkowski, L., Lin, T.Y., Tsumoto, S. (eds.) Rough Set Methods and Applications: New Developments in Knowledge Discovery in Information Systems, vol. 56, pp. 235-285. Springer, Heidelberg (2000)
- 15. Han, J.C., Hu, X.H., Lin, T.Y.: An Efficient Algorithm for Computing Core Attributes in Database Systems. LNCS, vol. 2871, pp. 663-667. Springer (2003)

- THANK YOU VERY MUCH!