Classification Methods for Data Mining: Tasks, Issues

1
Classification Methods for Data Mining: Tasks,
Issues and Challenges
  • Sankar K. Pal
  • Indian Statistical Institute
  • Calcutta
  • http://www.isical.ac.in/sankar

2
  • Contents
  • Pattern Recognition and Machine Intelligence
  • Relevance of FL, ANN and GAs
  • Different Integrations of Soft Computing Tools
  • Emergence of Data Mining
  • Need
  • KDD Process
  • Relevance of Soft Computing Tools with Examples
  • Rough Sets and Granular Computing
  • Information granules
  • Rough set rules

3
  • Rough-Fuzzy Case Generation
  • What is Case Based Reasoning?
  • Case Generation with Granules
  • Fuzzy Granulation
  • Mapping Rough Set Dependency Rules to Cases
  • Case Retrieval
  • Experimental Results and Merits
  • Other Applications
  • Rough Self-Organizing Map
  • Rough Cases with EM and MST for Multi-Spectral
    Image Segmentation
  • Conclusions

4
Hybrid Systems
  • Neuro-fuzzy
  • Genetic neural
  • Rough fuzzy
  • Fuzzy neuro genetic

Knowledge-based Systems
  • Probabilistic reasoning
  • Approximate reasoning
  • Case based reasoning

Data Driven Systems
Machine Intelligence
  • Neural network system
  • Evolutionary computing
  • Fuzzy logic
  • Rough sets

Non-linear Dynamics
  • Chaos theory
  • Rescaled range analysis (wavelet)
  • Fractal analysis
  • Pattern recognition and learning

Machine Intelligence: A core concept for
grouping various advanced technologies with
Pattern Recognition and Learning
5
Pattern Recognition System (PRS)
  • Measurement Space → Feature Space → Decision Space
  • Uncertainties arise from deficiencies of
    information available from a situation
  • Deficiencies may result from incomplete,
    imprecise, ill-defined, not fully reliable,
    vague, contradictory information in various
    stages of a PRS

6
M: Height, Weight, Complexion, Diet.
D → Classifier Design
8
Clustering
Mother
Father
Daughter
Son
9
Tasks and Challenges
  • Classification: Sampled data are given about the
    pattern space, and the challenge is to estimate
    the unknown regions of the pattern space based on
    the sampled data (incomplete information)
  • Clustering: The entire data set is given, and the
    challenge is to partition it into meaningful
    regions. The number of regions may be known or
    unknown

10
Clustering: Some Points
  • More than 50% of the literature in PR research is
    related to clustering
  • Still unsolved and provides open problems
    (e.g., cluster validity, indexes)
  • Acts like a basic module in decision-making and
    machine learning problems, particularly for mining
    large data sets in unsupervised mode
    (e.g., prototype selection, feature
    selection, data condensation/compression)
  • Significance in Bioinformatics and Web data

11
Relevance of Fuzzy Sets in PR
  • Representing linguistically phrased input
    features for processing
  • Representing multi-class membership of ambiguous
    patterns
  • Generating rules and inferences in
    linguistic form
  • Extracting ill-defined image regions, primitives,
    properties and describing relations among them as
    fuzzy subsets

12
ANNs provide Natural Classifiers having
  • Resistance to Noise,
  • Tolerance to Distorted Patterns /Images (Ability
    to Generalize)
  • Superior Ability to Recognize Overlapping Pattern
    Classes or Classes with Highly Nonlinear
    Boundaries or Partially Occluded or Degraded
    Images
  • Potential for Parallel Processing
  • Non parametric

13
Why GAs in PR ?
  • Methods developed for Pattern Recognition and
    Image Processing are usually problem dependent.
  • Many tasks involved in analyzing/identifying a
    pattern need Appropriate Parameter Selection and
    Efficient Search in complex spaces to obtain
    Optimal Solutions
  • This makes the processes
    - Computationally intensive
    - Prone to losing the exact solution

14
  • GAs: Efficient, adaptive and robust search
    processes, producing near-optimal solutions and
    having a large amount of implicit parallelism
  • GAs are an appropriate and natural choice for
    problems which need optimized computation
    requirements, and robust, fast and close
    approximate solutions

15
Role of GAs
  • Robust, parallel, adaptive search methods
    suitable when the search space is large.
  • Used more in Prediction (P) than Description (D)
  • D: Finding human-interpretable patterns
    describing the data
  • P: Using some variables or attributes in the
    database to predict unknown/future values of
    other variables of interest.

16
Example of GA based Classification
  • Automatic selection of the number of hyperplanes for
    approximating class boundaries for minimum
    misclassification (VGA classifier)
  • Chromosome (sexual) discrimination to reduce
    computation time (VGACD classifier)
  • Robust Searching Ability (suitable when the
    search space is large)
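
The GA-based search for a class boundary described above can be illustrated with a minimal sketch. It assumes a fixed-length, real-valued chromosome that encodes a single line w1*x + w2*y + b = 0, so it is not the variable-length VGA or VGACD classifier cited on this slide. The toy data, population size, mutation settings and function names are all hypothetical.

```python
import random

def misclassified(chrom, data):
    """Count points falling on the wrong side of the line w1*x + w2*y + b = 0."""
    w1, w2, b = chrom
    errors = 0
    for (x, y), label in data:
        predicted = 1 if w1 * x + w2 * y + b > 0 else 0
        errors += predicted != label
    return errors

def run_ga(data, pop_size=30, generations=40, mutation_prob=0.2, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    pop = [[rng.uniform(-1, 1) for _ in range(3)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=lambda c: misclassified(c, data))
        history.append(misclassified(pop[0], data))
        next_pop = [pop[0][:]]  # elitism: the best chromosome always survives
        while len(next_pop) < pop_size:
            # Tournament selection of two parents.
            p1 = min(rng.sample(pop, 3), key=lambda c: misclassified(c, data))
            p2 = min(rng.sample(pop, 3), key=lambda c: misclassified(c, data))
            # Uniform crossover followed by Gaussian mutation.
            child = [rng.choice(pair) for pair in zip(p1, p2)]
            child = [g + rng.gauss(0, 0.3) if rng.random() < mutation_prob else g
                     for g in child]
            next_pop.append(child)
        pop = next_pop
    best = min(pop, key=lambda c: misclassified(c, data))
    history.append(misclassified(best, data))
    return best, history

# Hypothetical two-class training points (illustrative only).
data = [((0.1, 0.2), 0), ((0.2, 0.1), 0), ((0.3, 0.3), 0), ((0.2, 0.4), 0),
        ((0.7, 0.8), 1), ((0.8, 0.7), 1), ((0.9, 0.9), 1), ((0.6, 0.9), 1)]
best, history = run_ga(data)
```

Here `history` records the number of points misclassified by the best chromosome in each generation; because of elitism it never increases.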

17
SPOT Image of Calcutta in the Near Infra Red Band
(spatial resolution 20 m x 20 m, wavelength
0.79 µm-0.89 µm)
Garden Reach Lake
IEEE Trans. Geosci. Remote Sensing, 39(2),
303-308, 2001
Intl. J. Remote Sensing, 22(13), 2545-2569, 2001
18
Scatter plot of the training set of SPOT image of
Calcutta, containing seven classes.
19
Classified SPOT image of Calcutta (zooming the
race course R only) using (a)
VGACD-Classifier, Hmax = 15, final value of H = 13,
(b) VGA classifier, Hmax = 15, final value of H = 10,
(c) Bayes maximum likelihood Classifier, (d) k-NN
rule, k = 1, (e) k-NN rule, k = 3, (f) k-NN rule,
k = sqrt(n).
IEEE Trans. Geosci. Remote Sensing 39(2),
303-308, 2001
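
The k-NN rule used as a baseline above (k = 1, k = 3 and k = sqrt(n)) can be sketched as follows. This is a minimal illustration assuming Euclidean distance and simple majority voting; the two-band pixel values and land-cover labels are made up and are not taken from the SPOT training set.

```python
import math
from collections import Counter

def knn_classify(train, query, k):
    """Return the majority label among the k training points nearest to query.

    train is a list of (feature_vector, label) pairs; distances are Euclidean.
    """
    # Sort training points by distance to the query and keep the k closest.
    neighbours = sorted(train, key=lambda item: math.dist(item[0], query))[:k]
    votes = Counter(label for _, label in neighbours)
    return votes.most_common(1)[0][0]

# Hypothetical two-band pixel values with land-cover labels (illustrative only).
train = [
    ((0.10, 0.20), "water"), ((0.15, 0.25), "water"), ((0.20, 0.15), "water"),
    ((0.80, 0.90), "vegetation"), ((0.85, 0.80), "vegetation"),
    ((0.90, 0.85), "vegetation"),
    ((0.50, 0.50), "habitation"), ((0.55, 0.45), "habitation"),
    ((0.45, 0.55), "habitation"),
]

query = (0.18, 0.22)
k_sqrt = round(math.sqrt(len(train)))  # k = sqrt(n); n = 9 gives k = 3
labels = {k: knn_classify(train, query, k) for k in (1, 3, k_sqrt)}
```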
20
IEEE Trans. Geosci. Remote Sensing 39(2),
303-308, 2001
Variation of the number of points misclassified
by the best Chromosome with generations for
VGACD classifier and VGA classifier
21
Fuzzy Logic, Neuro-computing and Genetic
Algorithms are the major components of what
is called Soft Computing where these tools
work synergistically
22
Role of Major Soft Computing Components
  • FL: algorithms for dealing with imprecision and
    uncertainty
  • NC: machinery for learning and curve fitting
  • GA: algorithms for search and optimization
  • Handling uncertainty arising from the granularity
    in the domain of discourse
23
  • Exploit the tolerance for
    - imprecision
    - uncertainty
    - approximate reasoning
    - partial truth
    to achieve tractability, robustness, low-cost
    solutions and close resemblance with human-like
    decision making
  • Provides flexible information processing
    capability for representation and evaluation of
    real-life ambiguous/uncertain situations.

24
  • It may be argued that it is soft computing
    rather than hard computing that should be viewed
    as the foundation for Artificial Intelligence.

25
  • Relevance of FL, ANN, GAs individually
    to PR problems is established

26
Integration of Soft Computing Tools
27
In the late eighties scientists thought: Why NOT
Integrations?
  • Fuzzy Logic + ANN
  • ANN + GA
  • Fuzzy Logic + ANN + GA
  • Fuzzy Logic + ANN + GA + Rough Set
Neuro-fuzzy hybridization is the most
visible integration realized so far.
28
Why Fusion
  • Fuzzy set theoretic models try to mimic human
    reasoning and the capability of handling
    uncertainty (SW)
  • Neural network models attempt
    to emulate the architecture and information
    representation scheme of the human brain (HW)
NEURO-FUZZY Computing
(for More Intelligent System)
29
  • NFS: FUZZY SYSTEM, with ANN used for learning and adaptation
  • FNN: ANN, with fuzzy sets used to augment its application
    domain
30
Merits and Challenges
  • GENERIC
  • APPLICATION SPECIFIC

31
Rough-Fuzzy Hybridization
  • Fuzzy set theory assigns to each object a degree
    of belongingness (membership) to represent an
    imprecise/vague concept.
  • The focus of rough set theory is on the
    ambiguity caused by limited discernibility of objects
    (lower and upper approximation of concept).

Rough sets and Fuzzy sets can be integrated to
develop a model of uncertainty stronger than
either.
32
Rough Fuzzy Hybridization: A New Trend in
Decision Making, S. K. Pal and A. Skowron (eds),
Springer-Verlag, Singapore, 1999
33
Neuro-Rough Hybridization
  • Rough set models are used to generate network
    parameters (weights).
  • Roughness is incorporated in inputs and output
    of networks for uncertainty handling, performance
    enhancement and extended domain of application.
  • Networks consisting of rough neurons are used.
  • Neurocomputing, Spl. Issue on Rough-Neuro
    Computing, S. K. Pal,
    W. Pedrycz, A. Skowron and R. Swiniarski (eds),
    vol. 36 (1-4), 2001.

34
Challenges (e.g., RN and RF)
  • Improve performance
  • Reduce network learning time
  • Reduce network size (Compact Network)
  • Preserving identity of clusters irrespective of
    their sizes
  • Stronger model of uncertainty handling
  • Reduce computation time

35
Example of Compact Network
Connectivity of the network obtained for
six-class vowel recognition using Modular Rough
Fuzzy MLP
36
  • Rough-Neural Computing Techniques for Computing
    with Words, S.K. Pal, L. Polkowski and A. Skowron
    (eds.), Springer, Heidelberg, 2003.

37
  • Neuro-Rough-Fuzzy-Genetic Hybridization
  • Rough sets are used to extract domain knowledge
    in the form of linguistic rules
    → generates fuzzy knowledge-based networks
    → evolved using genetic algorithms.
  • Integration offers several advantages like fast
    training, compact network and performance
    enhancement.

38
IEEE TNN, 9, 1203-1216, 1998
Incorporate Domain Knowledge using Rough Sets
39
  • Data Mining
  • Today, PR activity remains incomplete without
    mention of its significance to DM
  • DM from Pattern Recognition and Machine Learning
    perspectives
    (DBMS, Statistical)

40
One of the applications of Information Technology
that has drawn the attention of researchers is
DATA MINING where Pattern Recognition/Image
Processing/Machine Intelligence are directly
related.
41
Why Data Mining ?
IEEE Trans. Neural Networks, 13(1), 3-14, 2002
  • Digital revolution has made digitized information
    easy to capture and fairly inexpensive to store.
  • With the development of computer hardware and
    software and the rapid computerization of
    business, huge amounts of data have been collected
    and stored in centralized or distributed
    databases.
  • Data is heterogeneous (mixture of text, symbolic,
    numeric, texture, image), huge (both in
    dimension and size) and scattered.
  • The rate at which such data is stored is growing
    phenomenally.

42
  • As a result, traditional ad hoc mixtures of
    statistical techniques and data management tools
    are no longer adequate for analyzing this vast
    collection of data.

43
  • Pattern Recognition and Machine Learning
    principles applied to a very large (both in size
    and dimension) heterogeneous database
    → Data Mining
  • Data Mining + Knowledge Interpretation
    → Knowledge Discovery
  • Process of identifying valid, novel, potentially
    useful, and ultimately understandable patterns
    in data

44
Pattern Recognition, World Scientific, 2001
Knowledge Discovery in Database (KDD)
  • Huge Raw Data → Data Cleaning, Data Condensation,
    Dimensionality Reduction, Data Wrapping/Description
    → Preprocessed Data
  • Preprocessed Data → Machine Learning
    (Classification, Clustering, Rule Generation)
    → Mathematical Model of Data (Patterns)
  • Mathematical Model of Data (Patterns) → Knowledge
    Interpretation (Knowledge Extraction, Knowledge
    Evaluation) → Useful Knowledge
  • Data Mining (DM) covers the steps up to the
    mathematical model; KDD also includes Knowledge
    Interpretation
45
Data Mining Algorithm Components
  • Model: Function of the model (e.g.,
    classification, clustering, rule generation) and
    its representational form (e.g., linear
    discriminants, neural networks, fuzzy logic, GAs,
    rough sets).
  • Preference criterion: Basis for preference of
    one model or set of parameters over another.
  • Search algorithm: Specification of an algorithm
    for finding particular patterns of interest (or
    models and parameters), given the data, family of
    models, and preference criterion.
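
The three components listed above can be made concrete with a small sketch. It assumes the simplest possible choices, which are illustrative and not taken from the slides: the model is a one-feature threshold discriminant, the preference criterion is the misclassification count, and the search algorithm is an exhaustive scan over candidate thresholds. All names and data are hypothetical.

```python
def threshold_model(threshold):
    """Model: a one-feature linear discriminant, class 1 if x > threshold."""
    return lambda x: 1 if x > threshold else 0

def error_count(model, data):
    """Preference criterion: number of misclassified samples (lower is better)."""
    return sum(model(x) != label for x, label in data)

def exhaustive_search(data, candidates):
    """Search algorithm: try every candidate parameter and keep the best."""
    return min(candidates, key=lambda t: error_count(threshold_model(t), data))

# Hypothetical labelled samples (feature value, class label).
data = [(1.0, 0), (1.5, 0), (2.0, 0), (3.5, 1), (4.0, 1), (4.5, 1)]
candidates = [0.5 * i for i in range(11)]  # thresholds 0.0, 0.5, ..., 5.0
best_threshold = exhaustive_search(data, candidates)
best_error = error_count(threshold_model(best_threshold), data)
```

Swapping any one component (for example, a GA in place of the exhaustive scan) leaves the other two unchanged, which is the point of the decomposition.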

46
Why Growth of Interest ?
  • Falling cost of large storage devices and
    increasing ease of collecting data over networks.
  • Availability of Robust/Efficient machine learning
    algorithms to process data.
  • Falling cost of computational power → enabling
    use of computationally intensive methods for data
    analysis.

47
Applications
  • Financial Investment: Dynamic huge data of stock
    indices and prices, interest rates, credit card
    information, fraud detection
  • Health Care: Diverse diagnostic information
    stored by hospital management systems
  • WWW: Vast collection of uncontrolled, diverse
    dynamic documents
  • Bio-informatics: Heterogeneous database of gene
    sequence, protein structures, micro arrays, gene
    expressions with imprecise/partial information
  • Data is heterogeneous (mixture of text, symbolic,
    numeric, texture, image) and huge (both in
    dimension and size)

48
Example: Medical Data
  • Numeric and textual information may be
    interspersed
  • Different symbols can be used with same meaning
  • Redundancy often exists
  • Erroneous/misspelled medical terms are common
  • Data is often sparsely distributed

49
Example: Web Mining
Discovery/ analysis of useful information from WWW
Characteristics of web data
  • Unlabelled
  • Distributed
  • Heterogeneous (mixed media)
  • Semi-structured
  • Time varying
  • High dimensional
  • Web mining deals with large hyper-linked
    information having these characteristics with
    Interactive Medium (Human Interface)

50
Issues arising out of Human Interface
  • Need for handling context sensitive and imprecise
    queries
  • Need for summarization and deduction
  • Need for personalization and learning
  • Web mining, though considered an application of
    DM, warrants a separate field of research because
    of these characteristics and human related issues

51
Example: Human Genome Data
  • Laboratory operations on DNA inherently involve
    errors
  • Heterogeneous data base of gene sequence, protein
    structures, micro arrays, gene expressions
  • Partial/incomplete information

52
  • Robust preprocessing system is required to
    extract any kind of knowledge
  • The data must not only be cleaned of errors and
    redundancy, but organized in a fashion that makes
    sense for the problem

53
  • So, we NEED
    - Efficient
    - Robust
    - Flexible
    Machine Learning Algorithms
  • → NEED for Soft Computing Paradigm

54
Role of Fuzzy Sets
  • Modeling of imprecise/qualitative
    knowledge
  • Transmission and handling uncertainties at
    various stages
  • Supporting, to an extent, human type
    reasoning in natural form

55
  • Classification/ Clustering
  • Discovering association rules (describing
    interesting association relationship among
    different attributes)
  • Inferencing
  • Data summarization/condensation (abstracting the
    essence from a large amount of information).

56
Role of ANN
  • Adaptivity, robustness, parallelism, optimality
  • Machinery for learning and curve fitting (Learns
    from examples)
  • Initially thought to be unsuitable because of its
    black-box nature: no information is available in
    symbolic form (suitable for human interpretation)
  • Recently, embedded knowledge is extracted in the
    form of symbolic rules, making it
    suitable for rule generation.

57
IEEE Trans. Knowledge Data Engg., 15(1), 14-25,
2003
Example: Modular Rough-Fuzzy Evolutionary MLP
  • Enhances
    - Classification performance
    - Training time
    - Network compactness
  • Generates rules of
    - Higher accuracy
    - Smaller size
    - Less confusion

58
Knowledge Flow in Modular Rough Fuzzy MLP
IEEE Trans. Knowledge Data Engg., 15(1), 14-25,
2003
[Figure: feature space (F1, F2) with classes C1 and C2 → rough set rules R1 (for C1), R2 and R3 (for C2) → network mapping of R1, R2, R3 to Subnet 1, Subnet 2, Subnet 3 → partial training with ordinary GA → partially refined subnetworks SN1, SN2, SN3]
59
[Figure: concatenation of subnetworks (links marked with high or low mutation probability) → evolution of the population of concatenated networks with GA having variable mutation operator → final solution network separating C1 and C2 in the feature space (F1, F2)]
60
Vowel Data
61
Speech Data: 3 Features, 6 Classes
Classification Accuracy
62
Training Time (hrs), DEC Alpha
Workstation @ 400 MHz
63
Network Size (No. of Links)
64
Results for Speech data
1. MLP
2. Fuzzy MLP
3. Modular Fuzzy MLP
4. Rough Fuzzy MLP
5. Modular Rough Fuzzy MLP
65
Connectivity of the network obtained using
Modular Rough Fuzzy MLP
66
Without Soft Computing, Machine Intelligence
and Data Mining Research Remains Incomplete.
67
Rough Sets and Granular Computing
68
  • Rough Sets
  • Offer math tools to discover hidden patterns in
    data
  • Offer learning systems to discover redundancies
    and dependencies between the given features of
    data
  • Approximate a given concept both from below and
    from above, using lower and upper approximations
  • Offer learning algorithms to obtain rules in
    IF-THEN form from a decision table w.r.t. objects
    and attributes
  • Extract knowledge from a database (decision table
    → remove undesirable attributes → analyze data
    dependency → minimum subset of attributes
    (reducts))

69
Z. Pawlak 1982, Int. J. Comp. Inf. Sci
Rough Sets
  • Set X
  • Upper approximation: $\overline{B}X$
  • Lower approximation: $\underline{B}X$
  • Granules: $[x]_B$
$[x]_B$ = set of all points belonging to the same
granule as the point x
in feature space $\Omega_B$.
$[x]_B$ is the set of all points which are
indiscernible from point x in terms of feature
subset B.
70
Approximations of the set X
w.r.t. feature subset B
  • B-lower, $\underline{B}X$: granules definitely belonging to X
  • B-upper, $\overline{B}X$: granules definitely and possibly belonging to X
If $\underline{B}X = \overline{B}X$, X is B-exact or B-definable; otherwise
it is roughly definable.
Rough sets are crisp sets, but with rough
description
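
The lower and upper approximations above can be computed directly from the granules. The sketch below is a minimal illustration under stated assumptions: objects are described by discrete feature values, two objects are indiscernible when they agree on every feature in B, and the object names and feature values are hypothetical.

```python
from collections import defaultdict

def granules(objects, B):
    """Partition objects into granules [x]_B: objects indiscernible on features B."""
    classes = defaultdict(set)
    for name, features in objects.items():
        key = tuple(features[f] for f in B)
        classes[key].add(name)
    return list(classes.values())

def approximations(objects, B, X):
    """Return (lower, upper) approximations of the set X w.r.t. feature subset B."""
    lower, upper = set(), set()
    for g in granules(objects, B):
        if g <= X:        # granule lies entirely inside X
            lower |= g
        if g & X:         # granule overlaps X
            upper |= g
    return lower, upper

# Hypothetical decision table with two linguistic features.
objects = {
    "o1": {"F1": "low", "F2": "high"},
    "o2": {"F1": "low", "F2": "high"},
    "o3": {"F1": "medium", "F2": "low"},
    "o4": {"F1": "high", "F2": "low"},
    "o5": {"F1": "high", "F2": "low"},
}
X = {"o1", "o2", "o4"}
lower, upper = approximations(objects, ["F1", "F2"], X)
is_exact = lower == upper  # X is B-exact only if both approximations coincide
```

In this example o4 and o5 are indiscernible but only o4 belongs to X, so their granule appears in the upper approximation and not in the lower one, and X is roughly definable.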
71
Rough Sets
  • Uncertainty Handling
    (using lower and upper approximations)
  • Granular Computing
    (using information granules)
72
  • Information Granules: A group of similar objects
    clubbed together by an indiscernibility
    relation
  • Granular Computing: Computation is
    performed using information granules and not the
    data points (objects)
  • Information compression, computational gain
73
Information Granules and Rough Rules
[Figure: feature space with axes F1 and F2, each partitioned into low, medium and high]
Rule
  • Rule provides crude description of the class
    using granule

74
  • Note:
  • For non-convex clusters, there would be more than
    one granule or rough rule to represent it crudely
  • Unsupervised: No. of granules is determined
    automatically
  • Granules/rules may be viewed as Cases
  • All features may not occur in a rule
  • Cases may be represented by different, reduced
    numbers of features.

75
  • Case Selection → Cases belong to the set of
    examples encountered.
  • Case Generation → Constructed Cases need not be
    any of the examples.

76
Granular Computing and Case Generation
  • Cases: Informative patterns (prototypes)
    characterizing the problems.
  • In rough set theoretic framework:
    Cases correspond to Information Granules
  • In rough-fuzzy framework:
    Cases correspond to Fuzzy Information Granules

77
Case Generation Characteristics and Merits
  • Cases are cluster granules, not sample points
  • Involves only a reduced number of relevant
    features, with variable size
  • Less storage requirements
  • Fast retrieval
  • Suitable for mining data with large dimension
    and size

78
Fuzzy (F)-Granulation
[Figure: π-function membership curves µ_low, µ_medium, µ_high over feature j, with centers c_L, c_M, c_H and radii λ_L, λ_M; the membership value axis is marked at 0.5 and 1]
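
Fuzzy granulation with π-functions can be sketched as below. The slide names the π-function but does not define it, so this sketch assumes the standard piecewise-quadratic form with center c and radius λ, which equals 1 at the center, 0.5 at distance λ/2 and 0 beyond distance λ. The centers and radii used here are hypothetical.

```python
def pi_membership(x, c, lam):
    """pi-function with center c and radius lam (> 0); returns a value in [0, 1]."""
    d = abs(x - c) / lam
    if d <= 0.5:
        return 1.0 - 2.0 * d * d
    if d <= 1.0:
        return 2.0 * (1.0 - d) ** 2
    return 0.0

def fuzzify(x, params):
    """Map a feature value to its memberships in low, medium and high."""
    return {name: pi_membership(x, c, lam) for name, (c, lam) in params.items()}

# Hypothetical centers and radii (c, lambda) for one feature scaled to [0, 1].
params = {"low": (0.2, 0.3), "medium": (0.5, 0.3), "high": (0.8, 0.3)}
memberships = fuzzify(0.35, params)
```

With these parameters the value 0.35 lies halfway between the low and medium centers, so it belongs to both sets with membership close to 0.5 and has zero membership in high.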
79
Example: IEEE Trans.
Knowledge Data Engg., 16(3), 292, 2004
[Figure: CASE 1 and CASE 2 shown as granules in the F1-F2 feature space with sample points (X); the marked values 0.1, 0.2, 0.4, 0.5, 0.7 and 0.9 are parameters of the fuzzy linguistic sets low, medium, high]
Note: All features may not occur in a rule
80
Case Retrieval
  • Similarity (sim(x,c)) between a pattern x and a
    case c is defined as
  • n: number of features present in case c

81
Iris Flowers: 4 features, 3 classes, 150 samples
Number of cases: 3 (for all methods)
82
Forest Cover Types: 10 features, 7 classes,
5,86,012 samples
Number of cases: 545 (for all methods), GIS
(cartographic RS measurements)
83
Hand Written Numerals: 649 features, 10 classes,
2000 samples
Number of cases: 50 (for all methods),
Collection of Dutch Utility Map
84
Applications of Rough Granules
  • Case Based Reasoning (evidence is sparse)
  • Prototype generation and class representation
  • Clustering and image segmentation (k selected
    automatically)
  • Case representation and indexing
  • Knowledge encoding
  • Dimensionality reduction
  • Data compression and storing
  • Granular information retrieval

85
Certain Issues
  • Selection of granules and sizes
  • Fuzzy granules
  • Granular fuzzy computing
  • Fuzzy granular computing

86
Rough Set Knowledge Encoding, EM and MST for
Multi-spectral Image Segmentation
87
EM Algorithm
  • Handles uncertainty arising from overlapping classes
  • Number of clusters (k) needs to be known
  • Solution depends strongly on initial conditions
  • Models only convex clusters

Minimal Spanning Tree (MST) Clustering
  • Can model non-convex clusters, but time consuming
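
The EM properties listed above (k fixed in advance, dependence on initial conditions) show up directly in a minimal implementation. The sketch below assumes a one-dimensional Gaussian mixture on synthetic data, with the number of components and the initial means supplied by the caller. It is an illustration only and not the multi-spectral pipeline of the cited work; all names and data are hypothetical.

```python
import math
import random

def gaussian(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

def em_gmm_1d(data, means, iterations=50):
    """Fit a 1-D Gaussian mixture with k = len(means) components by EM."""
    k = len(means)
    means = list(means)
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: soft responsibility of each component for each point.
        resp = []
        for x in data:
            p = [weights[j] * gaussian(x, means[j], variances[j]) for j in range(k)]
            total = sum(p)
            resp.append([pj / total for pj in p])
        # M-step: re-estimate weights, means and variances from responsibilities.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            variances[j] = max(
                sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj,
                1e-6,  # floor keeps a component from collapsing to zero variance
            )
    return weights, means, variances

# Synthetic data: two overlapping classes centered at 0 and 6 (illustrative only).
rng = random.Random(0)
data = ([rng.gauss(0.0, 1.0) for _ in range(200)]
        + [rng.gauss(6.0, 1.0) for _ in range(200)])
weights, means, variances = em_gmm_1d(data, means=[1.0, 5.0])
```

The `means` argument is where an external initialization would plug in: any method that proposes the number of clusters and their starting parameters can replace the hand-picked values used here.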

Rough Set Theoretic Knowledge Encoding
  • Automatically determines the number of clusters
    k
  • Provides good initialization
  • (avoidance of local minima, fast convergence)
  • Granular computing

RS Knowledge Encoding + EM + MST
88
[Flowchart: input multi-spectral image bands (Band 1, Band 2, Band 3, ..., Band n) → gray-level thresholding of individual bands → granulated n-dimensional image space → rule generation → mapping rules to distribution parameters → initial mixture model parameters → EM → refined mixture model parameters → MST → final clusters → segmented multi-spectral image]
89
Multi-Spectral IRS Image of Calcutta
(Spatial resolution 36.25 m x 36.25 m,
wavelengths 0.77-0.86 µm)
[Figure panels: Band 1, Band 2, Band 3, Band 4]
90
Quantitative Index β Measuring Segmentation
Quality (IRS-1A image of Calcutta, No. of
bands = 4)
Final no. of clusters (land cover type) = 5
  • EM/KM: Random initialization + EM/K-means
  • REM/RKM: Rough set theoretic initialization + EM/K-means
  • KMEM: K-means initialization + EM
  • EMMST: Random init. + EM + MST
  • FKM: Fuzzy K-means
  • REMMST: Rough set init. + EM + MST
91
Computation Time (seconds)
92
Segmented image of Calcutta using EM algorithm
with random initialization (EM), β =
5.91, No. of Clusters = 5
93
Segmented image of Calcutta using EM algorithm
with Rough set theoretic initialization and MST
clustering (REMMST), β = 7.37, No. of Clusters =
5
94
  • Related Subsequent Work
  • Unsupervised case generation: Rough-SOM
    (Applied Intelligence, 21(3), 289-299,
    2004)
  • Application to multi-spectral image segmentation
    (IEEE Trans. Geoscience and Remote Sensing,
    40(11), 2495-2501, 2002)
  • Rough case-based reasoner for text categorization
    (Int. J. Approx. Reasoning, 41, 229-255,
    2006)
  • Building CBR classifiers combining both feature
    reduction and case selection
    (IEEE Trans. Knowledge and Data Engg.,
    18(3), 415-429, 2006)
  • Bioinformatics in Neurocomputing Framework
    (IEE Proc. Circuits, Devices and Systems,
    152 (5), 556-564, 2005)
  • Evolutionary computation in Bioinformatics: A
    Review
    (IEEE Trans. Syst., Man and Cyberns. Part C,
    36(5), 601-615, 2006)
  • Rough-fuzzy c-medoids algorithms and selection of
    biobasis for amino acid sequence analysis
    (IEEE Trans. Knowledge Data Engg., 19(6),
    859-872, 2007)

95
Some Challenges of Data Mining
  • Multimedia mining and retrieval, that involves
    simultaneous manipulation of heterogeneous data
    like text, image, audio, video, etc.
  • Data stream mining, for handling a sequence of
    digitally encoded coherent signals that is in
    transmission. This has implications for
    Internet service providers.
  • Biological data mining, encompassing sequence,
    structure and high-dimensional data.
  • Scalability issues
  • Real-time processing of time-dependent data
    streams
  • Ensembling for distributed data mining (modular
    approach), including classification and
    clustering.
  • Quantitative Indices
  • CTP: Computational Theory of Perception

96
S K Pal and S C K Shiu, Foundations of Soft
Case-Based Reasoning, Wiley, N.Y., 2004
97
SK Pal and P Mitra, Pattern Recognition
Algorithms for Data Mining, CRC/ Chapman Hall,
Florida, 2004
98
  • S Bandyopadhyay and S K Pal,
    Classification and Learning Using
    Genetic Algorithms: Applications in
    Bioinformatics and Web Intelligence, Springer,
    Heidelberg, 2007

99
About the Soft Computing Center at ISI
http://www.isical.ac.in/scc
100
Objectives
  • The center will focus mainly on basic research
    and, to some extent, on manpower development
    keeping in mind that the research excellence is
    the main objective. The activities of the center
    will include
  • (a) conducting basic research in pattern
    recognition, image processing, computer vision,
    neural networks, genetic algorithms, wavelets,
    support vector machines, data mining, hybrid
    techniques, rough sets, video image processing,
    fractals etc.
  • (b) demonstrating applications to some focused
    areas like web mining (e.g., page ranking,
    personalization etc.), bioinformatics (e.g.,
    protein structure analysis), medical image (e.g.,
    ultrasonographic and MRI) analysis, and VLSI
    layout design, to be decided from time to time,

101
  • (c) developing manpower: (i) imparting training
    to researchers/students from industry and
    academia including R&D labs, (ii) disseminating
    teaching and training material for distance
    education using multimedia and video facilities,
    and (iii) offering regular short-term advanced
    courses on upcoming research areas,
  • (d) organizing seminars/workshops/schools by
    eminent faculty from abroad and India 
  • (e) providing a forum of exchanging ideas or
    establishing a linkage among scientists of
    leading institutions and industry working in
    similar areas by inviting interested faculty/
    research personnel,
  • (f) providing fellowships for helping faculty and
    scholars from less endowed institutions,
    especially from neighboring regions.

102
Collaboration with CIMPA, France
  • International Center for Pure and Applied
    Mathematics (CIMPA), France has recently
    partially supported an International Workshop on
    Soft Computing Approaches to Pattern Recognition
    and Image Processing organized by the Machine
    Intelligence Unit of ISI. They have promised to
    support similar endeavors of the center in future
    by providing travel support to foreign delegates.

103
Mechanism for collaborative projects
  • Since research excellence is the main object of
    the center, the collaborative projects would
    focus mainly on research.
  • At least one investigator of the center or a
    faculty of ISI deputed to the center would be
    involved in such a project.
  • Merits of the project proposals, routed through
    proper channel, will be evaluated.
  • Infrastructural facilities will be provided by
    the center.
  • Travel expense and local hospitality (in the form
    of fellowship) of the visitors will be borne by
    the center.
  • Less endowed Institutes will be given due
    preference.

104
Thank You!!