1
Ph.D. Research Proposal: Leveraging Operational
Data For Intelligent Decision Support in
Construction Equipment Management
  • Hongqin Fan
  • Provisional Ph.D. Candidate in
    Construction Engineering and Management
  • University of Alberta
  • April 24, 2006

2
Agenda
  • Introduction
  • Problem Statement and research motivation
  • Related research
  • Research methodology
  • Data warehousing for equipment management
  • Resolution-based outlier mining algorithm
  • AutoRegression Tree based Prediction
  • AutoRegression Tree based time series forecasting
  • Expected contributions
  • Summary and conclusions

3
Introduction
Introduction
Research Motivation
Related Research
Research Methodology
Expected Contributions
Conclusions
  • Definition of Construction Equipment Management
  • Manage equipment resources to maximize the return on
    capital investments and satisfy the needs of
    project management in a timely and cost-effective
    manner (adapted from Vorster and
    Livermore 1994).
  • Major construction equipment management tasks
  • Corporate level: equipment acquisition, finance,
    life cycle costing
  • Operational level: logistics, maintenance and
    repair.

4
Introduction
  • Recent trends in construction equipment
    management
  • Computerized construction equipment management
  • Increasing automation in data collection and
    control
  • Management of large fleets
  • Equipment acquisition and service market
    diversification.

5
Introduction
  • Most data are collected and stored in electronic
    format.
  • Commonly cited problems with the current
    equipment data:
  • Data problems: noisy, fragmented, and difficult
    to retrieve
  • Underutilization of data assets, due to a lack of
    advanced computer tools.
  • This research will improve the current situation
    through data warehousing and data mining.

6
Introduction
The equipment management team has more advanced
tools for decision support in addition to the
summary reports.
7
Introduction
  • The research will achieve the following
    objectives:
  • Build a prototype construction equipment data
    warehouse as the enterprise data source for
    decision support. Explore the opportunities and
    challenges at different stages of data
    warehousing, including planning, design and
    implementation, for equipment management.
  • Design and test a novel nonparametric outlier
    mining algorithm for generic problem detection in
    construction equipment data, as well as other
    engineering data.
  • Test, evaluate and modify current
    data mining algorithms for decision support in
    construction equipment management.
  • Design and implement a prototype intelligent
    equipment management system using an integrated
    equipment data warehouse and embedded data mining
    models, and make recommendations on the planning
    and design of an intelligent system.

8
Problem Statement and Research Motivation
  • An equipment maintenance management system
    called MTrack was developed by the NSERC/Alberta
    Construction Research Chair and has been used by
    Standard General Inc. (SGI) since 1997, generating
    a large collection of equipment management data.
  • Based on the case of SGI, and common to the
    industry, a number of problems are found to
    undermine the usability of the collected data:
  • Equipment data
  • Data quality issues
  • Scattered data sources
  • Most data are stored in relational databases, with
    high performance in data storage/updating but
    limited capability in data analysis
  • Equipment data utilization
  • Large amounts of data, but a low rate of utilization
    for decision support in equipment management
  • Lack of advanced computer tools for decision
    support using these data. Data analysis is
    commonly conducted by exporting data to
    statistical tools or spreadsheets.

9
Problem Statement and Research Motivation
  • Relational databases are the back ends of most
    equipment management systems
  • Equipment data stored in a relational database are
    considered superior for analysis to those held in
    applications, spreadsheets, or text files
  • Still, equipment data in a relational database
    suffer from two drawbacks in terms of data analysis:
  • They are organized in the relational data model,
    which is optimized for adding/updating data but
    does not perform well in data retrieval
  • Extracting data from the equipment database can only
    be performed by database specialists; as a
    result, the user has limited control over what
    data can be extracted
10
Problem Statement and Research Motivation
  • This research is motivated by recent
    developments in data warehousing and data mining
    techniques:
  • Affordable as commercial products
  • Integration with other applications through
    standard communication protocols
  • Their capability to improve data quality,
    data structure and knowledge extraction in terms
    of data utilization.

11
Problem Statement and Research Motivation
  • Data warehousing for construction equipment
    management
  • A data warehouse is a subject-oriented, integrated,
    time-variant and non-volatile collection of data
    in support of management's decision-making
    process [Inmon 1996].
  • A construction equipment data warehouse improves
    data quality by preprocessing, and integrates data
    by pulling needed data from various sources.
  • A construction equipment data warehouse uses a
    multidimensional data model and is built around
    the subjects of equipment management, making it
    possible for end users to perform interactive
    data analysis.
  • A construction equipment data warehouse can serve
    as a significantly improved data source for
    knowledge discovery and prediction.
12
Problem Statement and Research Motivation
  • Data mining for identifying patterns and
    making predictions and forecasts
  • Data mining is the process of extracting
    nontrivial, implicit, previously unknown and
    potentially useful information from large
    collections of data [Frawley et al. 1992].
  • Data mining can
  • Identify patterns (common patterns or unusual
    patterns)
  • Create data mining models for exploration of
    hidden knowledge in data
  • Use models for prediction and forecasting.

These tasks are common problems in decision
making for equipment management.
13
Problem Statement and Research Motivation
  • Data mining models are data-driven: the data
    mining algorithms derive models from
    historical data, and can update the models
    dynamically when the data change.
  • Comparison with statistical, mathematical and
    simulation models

14
Problem Statement and Research Motivation
Three research topics on data mining are selected
15
Related Research
  • Research on construction equipment management
    is primarily focused on
  • Automation and robotic technologies
  • Real-time data communications and information
    processing
  • Statistical and analytical modeling for decision
    support in equipment management.
  • Research on intelligent equipment/asset
    management systems has been conducted for the
    maintenance operations of the following
    facilities
  • Power plants,
  • Industrial plants,
  • Military facilities
16
Related Research
  • Research on applying data warehousing
    technology in the following areas of the
    construction industry
  • Inventory management of construction materials
    [Chau et al. 2002]
  • Document management for multiple parties and
    purposes [Ma et al. 2005]
  • Data mining techniques have been applied in various
    research projects and applications in the
    construction industry
  • Estimation of construction productivity using an
    artificial neural network [Lu et al. 2002]
  • Construction delay evaluation using the C4.5
    decision tree [Soibelman and Kim 2002]
  • Classifying and quantifying the cumulative impact
    of change orders on productivity using a decision
    tree [Lee et al. 2004]
  • Automatic classification of construction documents
    using a Support Vector Machine (SVM) [Caldas et al.
    2002]
  • Preliminary research has been conducted on data
    preprocessing and knowledge discovery from
    construction data [Soibelman and Kim 2002]
17
Research Methodology
Equip. data warehousing
RB-outlier mining
ART prediction
ART forecasting
  • The research is based on the case of equipment
    management in Standard General Inc., Alberta,
    Canada.
  • An overview of research scope

19
Research Methodology (Data warehousing for
equipment management)
  • Procedures for construction equipment data
    warehousing
  • Identify all the data sources containing related
    data for equipment management
  • Extract, transform and load data into the data
    warehouse repository
  • Expose data for interactive data analysis,
    knowledge discovery or reports.
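
As a hedged illustration of the extract, transform and load step above, the sketch below cleans a few hypothetical work-order rows and loads them into an in-memory SQLite table. The row fields, the table name repair_cost and the cleaning rules are assumptions made for this example; they are not taken from MTrack or from the proposal.

```python
import sqlite3

# Hypothetical raw work-order rows, as they might arrive from a source system.
RAW_ROWS = [
    {"unit": " EX-101 ", "date": "2005-03-14", "cost": "1250.50"},
    {"unit": "ex-101", "date": "2005-03-20", "cost": ""},  # missing cost
    {"unit": "DZ-007", "date": "2005-04-02", "cost": "830"},
]


def transform(row):
    """Clean one raw row; return None if it cannot be repaired."""
    if not row["cost"].strip():
        return None  # drop records with no cost rather than guess a value
    return (row["unit"].strip().upper(), row["date"], float(row["cost"]))


def load(conn, rows):
    """Append cleaned rows to the (assumed) repair_cost table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS repair_cost "
        "(unit TEXT, work_date TEXT, cost REAL)"
    )
    conn.executemany("INSERT INTO repair_cost VALUES (?, ?, ?)", rows)
    conn.commit()


def run_etl(conn, raw_rows):
    """Extract (here: an in-memory list), transform, then load."""
    cleaned = [t for t in map(transform, raw_rows) if t is not None]
    load(conn, cleaned)
    return len(cleaned)


conn = sqlite3.connect(":memory:")
loaded = run_etl(conn, RAW_ROWS)
totals = conn.execute(
    "SELECT unit, SUM(cost) FROM repair_cost GROUP BY unit ORDER BY unit"
).fetchall()
print(loaded, totals)
```

Dropping incomplete rows is only one possible cleaning policy; a real warehouse load might instead flag them for review.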

20
Research Methodology (Data warehousing for
equipment management)
Procedures for building equipment data warehouse
21
Research Methodology (Data warehousing for
equipment management)
  • Data modeling and architectural design
  • Design a multidimensional data model for each
    subject, using a star schema
  • Each data model contains a fact table surrounded
    by a number of dimension tables
  • Descriptive attributes are usually organized in a
    hierarchy to allow for data analysis at different
    levels of detail.
  • Because many data cubes share dimensions in the
    equipment data warehouse, the bus matrix model
    proposed by [Kimball and Ross 2002] is used for
    architectural design
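
A minimal sketch of a star schema, assuming a repair-cost subject with two dimensions, may make the fact/dimension split and the attribute hierarchy concrete. All table names, column names and figures below are hypothetical and chosen only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE dim_equipment (
    equipment_id INTEGER PRIMARY KEY, unit TEXT, equip_class TEXT);
CREATE TABLE dim_date (
    date_id INTEGER PRIMARY KEY, year INTEGER, month INTEGER);
CREATE TABLE fact_repair (
    equipment_id INTEGER REFERENCES dim_equipment(equipment_id),
    date_id INTEGER REFERENCES dim_date(date_id),
    repair_cost REAL);
""")
conn.executemany("INSERT INTO dim_equipment VALUES (?, ?, ?)", [
    (1, "EX-101", "Excavator"),
    (2, "EX-102", "Excavator"),
    (3, "DZ-007", "Dozer"),
])
conn.executemany("INSERT INTO dim_date VALUES (?, ?, ?)", [
    (1, 2005, 3), (2, 2005, 4),
])
conn.executemany("INSERT INTO fact_repair VALUES (?, ?, ?)", [
    (1, 1, 1200.0), (2, 1, 300.0), (3, 1, 500.0),
    (1, 2, 700.0), (3, 2, 250.0),
])


def cost_by(level):
    """Roll repair cost up to one level of the equipment hierarchy."""
    if level not in ("unit", "equip_class"):
        raise ValueError("unknown hierarchy level")
    sql = (
        f"SELECT e.{level}, SUM(f.repair_cost) FROM fact_repair f "
        f"JOIN dim_equipment e ON e.equipment_id = f.equipment_id "
        f"GROUP BY e.{level} ORDER BY e.{level}"
    )
    return conn.execute(sql).fetchall()


by_class = cost_by("equip_class")  # coarse level of the hierarchy
by_unit = cost_by("unit")          # detailed level of the hierarchy
print(by_class)
print(by_unit)
```

The same dim_equipment table could be joined to other fact tables, which is the kind of dimension sharing the bus matrix is meant to document.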

22
Research Methodology (Data warehousing for
equipment management)
Figure: Multidimensional data model for the subject
Repair Cost. The fact table holds the measurements of
performance in equipment management, and the
dimensions describe those measurements.
23
Research Methodology (Resolution-based outlier
mining algorithm)
  • Outlier mining identifies irregular patterns or
    inconsistent records in a dataset.
  • Current statistical methods (e.g. multivariate
    outlier detection) and outlier mining algorithms
    do not perform well in engineering applications.
  • A non-parametric outlier mining algorithm based
    on resolution change is proposed in this
    research.

24
Research Methodology (Resolution-based outlier
mining algorithm)
  • Why is it possible to identify outliers during
    resolution change?
  • Given a set of data objects, the underlying
    clusters and outliers change when the resolution
    of the data objects is increased or decreased.
  • This makes it possible to identify outliers by
    successively changing the resolution of a set of
    data objects and collecting pre-defined statistical
    properties.
  • A resolution-based outlier factor (ROF) is
    defined to measure the degree to which a
    data point is outlying.
  • Overall procedures for the algorithm
25
Flowchart for resolution-based outlier mining
algorithm
26
Research Methodology (Resolution-based outlier
mining algorithm)
Preliminary experimental results on a synthetic
dataset
Top-10 outliers detected from a synthetic
200-tuple Dataset
27
Research Methodology (Resolution-based outlier
mining algorithm)
  • The research will compare the proposed algorithm
    with the current distance-based outlier mining
    algorithm [Knorr and Ng 1998] and the local
    density-based outlier mining algorithm [Breunig et
    al. 2000] from the following perspectives
  • Results of experimental tests on synthetic
    datasets and real-life equipment datasets
  • Explanation of the test results in terms of the
    outlier definitions and outlier mining algorithms
  • Pros and cons of the three algorithms
    in engineering applications.

28
Research Methodology (AutoRegression Tree based
prediction)
  • For a numerical target attribute, the prediction
    problem in data mining is to estimate the target
    attribute from a set of known attributes
    (categorical or numerical values).
  • The AutoRegression Tree (ART) data mining algorithm
    [Meek et al. 2002] is a hybrid of a
    decision tree and multivariate linear regression,
    designed for prediction.
  • Using the training dataset, the ART algorithm grows a
    top-down decision tree with a linear regression
    model in each leaf node.
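
The structure described above, a tree whose leaves hold regression models, can be sketched with a one-level tree. This is not the ART algorithm of Meek et al.: it assumes a single categorical split chosen by lowest squared error and a single numeric predictor per leaf. The attribute names and work-order figures are hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b)."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    if sxx == 0:
        return my, 0.0  # constant predictor: fall back to the mean
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return my - b * mx, b


def sse(xs, ys, model):
    """Sum of squared errors of a fitted line on the given points."""
    a, b = model
    return sum((y - (a + b * x)) ** 2 for x, y in zip(xs, ys))


def grow_stump(rows, split_attrs, x_key, y_key):
    """Pick the categorical attribute whose split gives the lowest total
    squared error when each branch gets its own fitted line."""
    best = None
    for attr in split_attrs:
        leaves, total = {}, 0.0
        for value in sorted({r[attr] for r in rows}):
            part = [r for r in rows if r[attr] == value]
            xs = [r[x_key] for r in part]
            ys = [r[y_key] for r in part]
            leaves[value] = fit_line(xs, ys)
            total += sse(xs, ys, leaves[value])
        if best is None or total < best[0]:
            best = (total, attr, leaves)
    return best[1], best[2]


def predict(tree, row, x_key):
    """Route the row to its leaf and apply that leaf's line."""
    attr, leaves = tree
    a, b = leaves[row[attr]]
    return a + b * row[x_key]


rows = [
    {"repair_type": "engine", "maker": "A", "est_hours": 2.0, "actual_hours": 3.0},
    {"repair_type": "engine", "maker": "A", "est_hours": 4.0, "actual_hours": 6.0},
    {"repair_type": "engine", "maker": "B", "est_hours": 6.0, "actual_hours": 9.0},
    {"repair_type": "hydraulic", "maker": "A", "est_hours": 2.0, "actual_hours": 4.0},
    {"repair_type": "hydraulic", "maker": "B", "est_hours": 4.0, "actual_hours": 6.0},
    {"repair_type": "hydraulic", "maker": "B", "est_hours": 6.0, "actual_hours": 8.0},
]
tree = grow_stump(rows, ["maker", "repair_type"], "est_hours", "actual_hours")
engine_pred = predict(tree, {"repair_type": "engine", "maker": "A", "est_hours": 8.0}, "est_hours")
hydraulic_pred = predict(tree, {"repair_type": "hydraulic", "maker": "A", "est_hours": 8.0}, "est_hours")
print(tree[0], engine_pred, hydraulic_pred)
```

A full tree would apply the same split search recursively inside each branch and would need a stopping rule, both of which are omitted here.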

29
Research Methodology (AutoRegression Tree based
prediction)
  • Information gain is used to select attributes and
    splits when growing a C4.5 decision tree [Kantardzic
    2003].
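
Information gain for a candidate split can be computed as the entropy of the target before the split minus the weighted entropy after it. The sketch below shows plain information gain on hypothetical rows; the attribute names are assumptions, and refinements such as gain ratio or numeric thresholds are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of a list of class labels."""
    total = len(labels)
    return -sum(
        (c / total) * math.log2(c / total) for c in Counter(labels).values()
    )


def information_gain(rows, attr, target):
    """Entropy of the target minus its weighted entropy after splitting
    the rows on the values of `attr`."""
    base = entropy([r[target] for r in rows])
    remainder = 0.0
    for value in {r[attr] for r in rows}:
        subset = [r[target] for r in rows if r[attr] == value]
        remainder += len(subset) / len(rows) * entropy(subset)
    return base - remainder


# Hypothetical work orders: did the repair overrun its estimate?
rows = [
    {"repair_type": "engine", "maker": "A", "overrun": "yes"},
    {"repair_type": "engine", "maker": "B", "overrun": "yes"},
    {"repair_type": "hydraulic", "maker": "A", "overrun": "no"},
    {"repair_type": "hydraulic", "maker": "B", "overrun": "no"},
]
gain_type = information_gain(rows, "repair_type", "overrun")
gain_maker = information_gain(rows, "maker", "overrun")
print(gain_type, gain_maker)
```

Here repair_type separates the two classes completely, so it would be chosen as the split attribute over maker.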

30
Research Methodology (AutoRegression Tree based
prediction)
  • The least squares method is used to build the
    multivariate linear regression model in each leaf
    node.
  • An example of ART estimation is the evaluation of
    work orders estimated by the equipment superintendent
  • Given the impact factors, such as equipment
    manufacturer, age, component, repair type and
    estimated hours, what is the likely error of the
    estimates?

31
Research Methodology (AutoRegression Tree based
prediction)
Part of the Induced Work Order Evaluation ART
Model
32
Research Methodology (AutoRegression Tree based
prediction)
  • The research will evaluate the ART algorithm in
    terms of
  • Estimation accuracy
  • Interpretability of the derived model
  • Flexibility in solving different problems in
    equipment management.
  • It will also compare ART with Classification and
    Regression Trees (CART) [Breiman et al. 1984].

33
Research Methodology (AutoRegression Tree based
time series forecasting)
  • A time series is a sequence of observations
    collected over successive increments of time.
  • Time series forecasting predicts future values of
    a time series, based on historical observations
    and assuming the current trend continues.

34
Research Methodology (AutoRegression Tree based
time series forecasting)
  • Traditional statistical methods decompose a time
    series into four basic movements: trend, cyclic,
    seasonal, and irregular movements.
  • The basic assumption of a time series forecasting
    model is autoregression: the current value of a
    time series depends on its previous n observed
    values, plus a noise term.
  • In the simplest case, use linear regression
    and solve the problem using the least squares method.
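
The linear case can be sketched as a least-squares fit of y[t] = c + w1*y[t-1] + ... + wn*y[t-n]. The code below is a minimal pure-Python illustration that solves the normal equations directly; the function names and the noise-free sample series are assumptions, and singular systems are not handled.

```python
def solve(a, b):
    """Solve the square system a.x = b by Gaussian elimination with
    partial pivoting (a is a small list of lists)."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def fit_ar(series, n_lags):
    """Least-squares fit of an AR(n_lags) model with intercept.
    Returns [c, w1, ..., wn], where wk multiplies y[t-k]."""
    rows = [
        [1.0] + [series[t - k] for k in range(1, n_lags + 1)]
        for t in range(n_lags, len(series))
    ]
    targets = series[n_lags:]
    p = n_lags + 1
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(p)]
    return solve(xtx, xty)


def forecast_next(series, coeffs):
    """One-step-ahead forecast from the most recent observations."""
    n_lags = len(coeffs) - 1
    return coeffs[0] + sum(
        coeffs[k] * series[-k] for k in range(1, n_lags + 1)
    )


# Noise-free series generated by y[t] = 2 + 0.5 * y[t-1], starting at 10.
series = [10.0, 7.0, 5.5, 4.75, 4.375, 4.1875]
coeffs = fit_ar(series, n_lags=1)
next_value = forecast_next(series, coeffs)
print(coeffs, next_value)
```

With real cost data the fit would not be exact, and the residuals would play the role of the noise term.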

35
Research Methodology (AutoRegression Tree based
time series forecasting)
  • Current solution approaches
  • Statistical modeling and forecasting
  • ARMA (AutoRegressive Moving Average) is a
    representative statistical approach for modeling
    and forecasting.
  • Neural network forecasting
  • Use a neural network (NN) to replace the regression
    model.
  • This research uses the ART model for forecasting
  • Use the AutoRegression Tree (ART) data mining model
    to replace the regression model.

36
Preliminary test results
Forecasting results using the ART model for
predicting monthly equipment repair and
maintenance costs at Standard General Inc.
37
Research Methodology (AutoRegression Tree based
time series forecasting)
  • Time series prediction based on the ART data mining
    algorithm will be compared with the ARMA statistical
    method and neural networks from the following
    perspectives
  • Accuracy of prediction
  • Extensibility and transparency of prediction
    model
  • Pros and cons in system integration for equipment
    management.

38
Expected Contributions
  • This research will provide guidelines for
    applying data warehousing technology to
    construction equipment management for improved
    decision support. These include the
    opportunities, challenges, and suggestions for
    the planning and design of an equipment data
    warehouse.
  • A novel non-parametric outlier mining algorithm
    is proposed for generic problem detection in both
    equipment management and other engineering
    applications. This will contribute to the body of
    knowledge of the data mining community.
  • Current data mining algorithms, such as
    AutoRegression Tree, will be tested, evaluated
    and modified for intelligent decision support in
    construction equipment management. This research
    will report the findings and make recommendations
    on the general application of data mining
    technology in construction equipment management.
  • This research will summarize and make
    recommendations on the architectural design and
    implementation of an intelligent equipment
    management information system using combined data
    warehousing/data mining techniques, to meet
    industrial expectations.

39
Conclusions
  • This research will address data management issues
    and facilitate the transformation of data into useful
    information and knowledge which the equipment
    management team can act upon.
  • The research focuses on the high-level design of
    intelligent systems for equipment management, a
    detailed study of a novel data mining algorithm,
    and the evaluation of current ART data mining
    algorithms for engineering applications.
  • This research addresses both academic issues and
    real-life application issues; it will therefore
    directly benefit the construction industry.

40
References
  • Breiman, L., Friedman, J., Olshen, R. and Stone,
    C. (1984). Classification and Regression Trees.
    Chapman & Hall/CRC Press, Boca Raton, FL.
  • Breunig, M., Kriegel, H., Ng, R. and Sander, J.
    (2000). LOF: identifying density-based local
    outliers. Proceedings of the ACM SIGMOD 2000
    International Conference on Management of Data,
    Dallas, TX, USA.
  • Caldas, C. H., Soibelman, L. and Han, J. (2002).
    Automated Classification of Construction Project
    Documents. ASCE Journal of Computing in Civil
    Engineering, 16(4), 234-243.
  • Chau, K.W., Cao, Y., Anson, M. and Zhang, J.
    (2002). Application of Data Warehouse and Decision
    Support System in Construction Management.
    Automation in Construction, 12, 213-224.
  • Frawley, W., Piatetsky-Shapiro, G. and Matheus,
    C. (1992). Knowledge Discovery in Databases: An
    Overview. AI Magazine, Fall 1992, pp. 213-228.
  • Inmon, W.H. (1996). Building the Data Warehouse.
    John Wiley & Sons, New York.
  • Kantardzic, M. (2003). Data Mining: Concepts,
    Models, Methods, and Algorithms. John Wiley &
    Sons, Inc., NJ, USA.
  • Kimball, R. and Ross, M. (2002). The Data Warehouse
    Toolkit: The Complete Guide to Dimensional
    Modeling, second edition. John Wiley & Sons,
    Inc., New York, pp. 1388.
  • Knorr, E. and Ng, R. (1998). Algorithms for
    mining distance-based outliers in large
    datasets. Proceedings of the Very Large Data Bases
    Conference, New York, USA.
  • Lee, M., Hanna, A.S. and Loh, W.Y. (2004).
    Decision Tree Approach to Classify and Quantify
    Cumulative Impact of Change Orders on
    Productivity. J. Comp. in Civ. Engrg., ASCE,
    18(2), 132.
  • Lu, M., AbouRizk, S.M. and Hermann, U.H. (2002).
    Estimating labor productivity using probability
    inference neural network. J. Comp. in Civ.
    Engrg., ASCE, 14(4), 241-248.
  • Ma, Z., Wong, K.D., Heng, L. and Jun, Y. (2005).
    Utilizing exchanged documents in construction
    projects for decision support based on data
    warehousing technique. Automation in
    Construction, 14(3), 405-412.
  • Meek, C., Chickering, D.M. and Heckerman, D.
    (2002). Autoregressive Tree Models for
    Time-Series Analysis. Proceedings of the 2nd
    SIAM International Conference on Data Mining,
    Arlington, VA, USA.
  • Soibelman, L. and Kim, H. (2002). Data
    Preparation Process for Construction Knowledge
    Generation through Knowledge Discovery in
    Databases. ASCE Journal of Computing in Civil
    Engineering, 16(1), 39-48.
  • Vorster, M. C. and Livermore, M. E. (1994).
    Executive development for equipment managers.
    Proceedings of the conference Equipment Resource
    Management into the 21st Century, Nashville,
    Tennessee, pp. 87-95.

41
H. Fan, Prov. Ph.D. Candidate
University of Alberta