Component Search and Retrieval - PowerPoint PPT Presentation

1
Component Search and Retrieval
Advanced Reuse Seminars Eduardo Cruz
2
Information Retrieval - 1948
  • Structured Documents
  • Unstructured Documents
  • No software documentation standard
  • Semi-Structured Documents

Calvin Northrup Mooers
3
Mooers' Law: An information retrieval
system will tend not to be used whenever it is
more painful and troublesome for a customer to
have information than for him not to have it
(1959)
Calvin Northrup Mooers
4
Mass Production Software components
McIlroy, 1968
5
  • software industry is weakly founded, and that
    one aspect of this weakness is the absence of a
    software components subindustry (McIlroy, 1968)

6
  • The storage and retrieval of software assets is
    nothing but a specialized form of information
    storage and retrieval (Mili, 1998)

7
Software Library
  • Browsing: inspecting without a predefined
    criterion
  • Retrieval: satisfying a predefined matching
    criterion

8
Classification Scheme
  • Facet-based
  • Better than hierarchical classification
  • Manual classification different facets
  • Automatic classification
  • Controlled Vocabulary
  • Semantic information
  • Uncontrolled Vocabulary
  • Big software libraries
  • Little or no descriptors
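
A minimal sketch of facet-based retrieval, assuming each asset is described by a small set of facet/value pairs drawn from a controlled vocabulary. The facet names (function, object, language), the asset names, and match_facets are hypothetical and chosen only for illustration.

```python
def match_facets(library, query):
    """Return the names of assets whose facets agree with every facet in the query."""
    return [
        name
        for name, facets in library.items()
        if all(facets.get(facet) == value for facet, value in query.items())
    ]


# Hypothetical library: each asset is classified under three facets.
library = {
    "QuickSort": {"function": "sort", "object": "array", "language": "java"},
    "MergeLists": {"function": "merge", "object": "list", "language": "java"},
    "BinarySearch": {"function": "search", "object": "array", "language": "c"},
}

found = match_facets(library, {"object": "array", "language": "java"})
print(found)
```

A query may name any subset of facets, which is one reason facets are more flexible than a single hierarchy: the same asset can be reached from several directions.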

9
Recall and Precision
  • High Precision: most retrieved elements are
    relevant
  • High Recall: few elements left behind
  • Spreading Activation (Relaxed Search): related
    matches are retrieved
  • Coverage: the average number of assets that are
    visited over the total size of the library
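
These measures can be sketched in a few lines, assuming the retrieved and relevant assets of a query are available as sets of names. The function names and the sample data are illustrative assumptions, not part of the presentation.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for a single query."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = retrieved & relevant
    # Precision: share of retrieved assets that are relevant.
    precision = len(hits) / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant assets that were retrieved.
    recall = len(hits) / len(relevant) if relevant else 0.0
    return precision, recall


def coverage(visited_per_query, library_size):
    """Average number of assets visited per query, over the library size."""
    if not visited_per_query or library_size == 0:
        return 0.0
    average_visited = sum(visited_per_query) / len(visited_per_query)
    return average_visited / library_size


# Hypothetical query result and ground truth.
retrieved = {"Stack", "Queue", "LinkedList", "HashMap"}
relevant = {"Stack", "Queue", "Deque"}
p, r = precision_recall(retrieved, relevant)
print(p, r, coverage([10, 30], 100))
```

Here two of the four retrieved assets are relevant (precision 0.5) and two of the three relevant assets were found (recall 2/3).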

10
Asset Representation
  • Library representation is made in full knowledge
    of the artifact. User representation is made in
    ignorance of the artifact
  • Asset representation is purposefully abstract to
    capture important features while overlooking
    minor or irrelevant details
  • The retrieval literature calls this
    representation the asset's surrogate

11
Asset retrieval Goals
  • Exact retrieval: black box reuse
  • Approximate retrieval: white box reuse
  • Generative modification: reusing the design
  • Compositional modification: using building
    blocks of the retrieved asset

12
Information usually not included
  • Interface description
  • Non-functional requirements
  • Interoperability

13
Situational Model x System Model
Component retrieval model (Lucrédio et al., 2004)
14
  • Repository representation is made in full
    knowledge of the artifact at hand
  • User representation is made in ignorance of the
    artifact (Mili, 1998)

15
Scott Henninger
16
Tools
17
Component Search Tools
  • Web
  • Delphi Search Engine
  • Ispey
  • CSourceSearch.net (2004)
  • Gonzui
  • SourceBank
  • Koders (2004)
  • Codase (2005)
  • Applications
  • Agora (1998)
  • Codebroker (2002)
  • Koders Enterprise (2004)
  • Maracatu (2005)

18
(No Transcript)
19
Delphi Search Engine
20
Ispey.com
21
SPARS-J (2003)
22
SourceBank
Filter
23
CSourceSearch.Net (2004)
24
Koders.com (2004)
25
CODASE Launched Sep 9, 2005
Multiple Search Options
Example Searches
Browsing
based on the number of people in your company,
starting from 5,000 USD
26
CODASE - Browsing
27
Other Tools
28
(No Transcript)
29
AGORA - Location and Indexing (1998)
(Figure: Internet, index, AltaVista search index server, filter, AltaVista query server, web server)
30
Component Rank (1998)
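(Figure: example component graph G with nodes V1, V2, V3. Distribution ratios: D12 = 0.5, D13 = 0.5, D23 = 1, D31 = 1. Node and edge weights shown: 0.2 and 0.4. Legend: nodes v, edges e, graph G, weight w, distribution ratio d.)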
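
The weights in this example can be reproduced with a simple repeated propagation, sketched below under stated assumptions: every node passes its whole weight to the nodes it uses, split by the distribution ratios, and the process repeats until the weights settle. The function name component_rank and the fixed iteration count are assumptions; the published Component Rank method has further details (for instance, how nodes without outgoing edges are handled) that are not shown here.

```python
def component_rank(ratios, iterations=100):
    """Propagate weights along use relations until they stabilise.

    ratios[u][v] is the distribution ratio d_uv: the share of u's weight
    handed to v. Each node's outgoing ratios are assumed to sum to 1.
    """
    nodes = sorted(set(ratios) | {v for out in ratios.values() for v in out})
    # Start from a uniform distribution; total weight stays 1 throughout.
    weights = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        updated = {n: 0.0 for n in nodes}
        for u, out in ratios.items():
            for v, d in out.items():
                updated[v] += weights[u] * d
        weights = updated
    return weights


# Distribution ratios from the slide: D12 = 0.5, D13 = 0.5, D23 = 1, D31 = 1.
ratios = {
    "V1": {"V2": 0.5, "V3": 0.5},
    "V2": {"V3": 1.0},
    "V3": {"V1": 1.0},
}
weights = component_rank(ratios)
print({n: round(w, 3) for n, w in weights.items()})
```

For this graph the weights settle at roughly 0.4 for V1, 0.2 for V2, and 0.4 for V3, which agrees with the 0.2 and 0.4 values on the slide.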
31
  • Classes defining data structures and their
    containers are highly ranked

32
Clustered Component Graph
(Figure: clustered component graph with node labels V7, V3, V26, V14, V5, V1 V4, V2 V6)
33
  • NO MORE MULTIPLE DISCONNECTED COMPONENTS

34
Component Rank System Architecture
INPUT: .java files (components)
(1) Similarity Measurement
(2) Clustering
(3) Use Relation Extraction
(4) Component Graph Construction
(5) Component Rank Computation by Repetition
(6) De-Clustering to Original Component Graph
OUTPUT: Order of Weights (Component Rank of .java files)
35
Simple Copied Components
(Figure: copied components and other components in two panels. Panel "Clustering Before Weight Computation": weights 1/4, 1/4, 1/4, 1/4. Panel "Non-clustered component graph, Clustering After Weight Computation": weights 1/6, 1/3, 1/6, 1/3.)
36
  • DO NOT COUNT SIMPLY DUPLICATED COMPONENTS
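
One way to picture clustering before weight computation is to collapse every group of copies into a single node before the graph is ranked, so that a duplicated component is counted once. The sketch below assumes the clusters are already known (for example from a similarity measurement); cluster_use_relations, the cluster label "Util*", and the component names are hypothetical, and dropping relations inside a cluster is an assumption of this sketch.

```python
def cluster_use_relations(use_relations, cluster_of):
    """Collapse copied components into one node per cluster.

    use_relations: iterable of (user, used) component pairs.
    cluster_of:    maps a component to its cluster label; components
                   not listed are treated as their own cluster.
    """
    merged = set()
    for user, used in use_relations:
        a = cluster_of.get(user, user)
        b = cluster_of.get(used, used)
        if a != b:  # assumption: relations inside one cluster are ignored
            merged.add((a, b))
    return merged


# Three applications each use a different copy of the same utility.
uses = [("App1", "Util"), ("App2", "UtilCopy"), ("App3", "UtilCopy2")]
copies = {"Util": "Util*", "UtilCopy": "Util*", "UtilCopy2": "Util*"}
clustered = cluster_use_relations(uses, copies)
print(sorted(clustered))
```

After clustering, all three use relations point at the single node "Util*", so the copies reinforce one component instead of splitting its weight.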

37
Copied AND MODIFIED Components
(Figure: original components A, B, C, copied and modified components, and other components in two panels. First panel, labelled "Clustering Before Weight Computation": weights 1/5, 2/5, 1/5, 1/5, 1/5. Second panel, labelled "Non-clustered component graph" and "Clustering Before Weight Computation": weights 1/5, 1/3, 1/6, 1/6, 1/6.)
38
Beyond Searching and Browsing
  • Searching and browsing require users to initiate
    the information-seeking process
  • Information access and Information Delivery

39
CodeBroker (2001)
  • Components repositories are often so large that
    software developers cannot learn about all of the
    components
  • Component repositories are not static
  • New components added
  • Old components updated
  • Context-Aware browsing

40
  • May not have sufficient knowledge about the reuse
    repository
  • May perceive that reuse costs more than
    developing from scratch
  • May not be able to use the repository by
    formulating a proper query
  • May not be able to understand the found components

41
Information Islands
L4 Entire Information Space
Belief
Vaguely Known
Well Known
Unknown components
42
CodeBroker
L4 Entire Information Space
L3 Belief
L2 Vaguely Known
L1 Well Known
Information Use: L1 Use by Memory, L2 Use by
Recall, L3 Use by Anticipation, L4 Use by
Delivery
Already Known Components
Task Relevant Information
Irrelevant Components
43
Program Aspects
  • Concept
  • Formal
  • Informal
  • Indentation, comments, identifier names
    (semantic)
  • Executability
  • Code
  • Constraint environment
  • Signature

44
Information delivery
  • Feedback
  • After execution of the action
  • Feedforward
  • Affects the execution of the action

45
Information delivery
  • Interruptive
  • Noninterruptive

46
Latent Semantic Analysis (LSA)
  • Synonymy
  • Polysemy
  • Text documents and queries are represented as
    vectors in the semantic space, based on the words
    they contain. The similarity between a query and
    a document is determined by the distance between
    their vectors
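
The vector comparison step can be sketched with plain term-count vectors and cosine similarity. This is not LSA itself: LSA first projects the term vectors into a reduced semantic space (via singular value decomposition), which is what lets it cope with synonymy and polysemy, and that step is omitted here. The component names and descriptions are made up for illustration.

```python
import math
from collections import Counter


def term_vector(text):
    """Represent a text as a bag of lower-cased words with counts."""
    return Counter(text.lower().split())


def cosine_similarity(a, b):
    """Cosine of the angle between two term vectors (1.0 means same direction)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical component descriptions, e.g. taken from doc comments.
docs = {
    "Stack.push": "add an element to the top of the stack",
    "Queue.poll": "remove the element at the head of the queue",
}
query = term_vector("add element to stack")
ranked = sorted(
    docs,
    key=lambda name: cosine_similarity(query, term_vector(docs[name])),
    reverse=True,
)
print(ranked)
```

With raw term counts a query for "insert" would not match a description that says "add"; closing that gap is the motivation for moving to a latent semantic space.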

47
(No Transcript)
48
Comments
signature
Discourse model
User model
49
Koders Enterprise (2004)
50
M.A.R.A.C.A.T.U. Modern Architecture for
Retrieving All Components At The Universe (2005)
51
Using Structural Context to Recommend Source
Code Examples
  • Reid Holmes and Gail C. Murphy

University of British Columbia, Software Practices
Lab
52
The Problem: A Concrete Example
  • Frameworks can improve developer productivity.
    But developers can become stuck trying to use the
    APIs
  • Imagine trying to use the Eclipse APIs to place
    text in the status line of the Eclipse IDE
  • Eclipse has 38,000 public methods

53
Using Structural Context to Recommend Source
Code Examples - Reid Holmes and Gail C. Murphy
Project Repository
Development Environment
54
Strathcona: Extract Structural Context
ViewPart
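
A rough way to picture matching on structural context: describe the code being written, and each example in the repository, as a set of structural facts (what it extends, what it calls), then prefer examples that share more facts with the query. This is only an illustration and not Strathcona's actual heuristics; rank_examples, the fact tuples, and the example names are assumptions.

```python
def rank_examples(query_context, examples):
    """Order example names by how many structural facts they share with the query."""
    scores = {name: len(query_context & facts) for name, facts in examples.items()}
    matching = [name for name in scores if scores[name] > 0]
    return sorted(matching, key=lambda name: scores[name], reverse=True)


# Structural context of the developer's code: a view that wants to set a status message.
query = {
    ("extends", "ViewPart"),
    ("calls", "IStatusLineManager.setMessage"),
}

# Hypothetical repository examples with their structural facts.
examples = {
    "ExampleA": {
        ("extends", "ViewPart"),
        ("calls", "IStatusLineManager.setMessage"),
        ("calls", "IActionBars.getStatusLineManager"),
    },
    "ExampleB": {("extends", "ViewPart")},
    "ExampleC": {("extends", "EditorPart")},
}

ranked = rank_examples(query, examples)
print(ranked)
```

ExampleA shares both facts with the query and is listed first; ExampleC shares none and is dropped.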
55
Strathcona: Example Navigation
  • Visual representation
  • Highlights key relationships between example and
    query
  • Multiple examples can be quickly viewed

56
Strathcona: Viewing Example Source
  • Code view
  • Example shows how to get a status line manager
  • Example is not a perfect match, but good enough
    to help

57
Conclusion
  • Information Delivery
  • Similarity Analyser
  • Ranking Metrics
  • Context
  • Automatic Facet Classification
  • Uncontrolled vocabulary additional terms

58
References
  • [McIlroy, 1968] M. D. McIlroy, Mass Produced
    Software Components, NATO Software Engineering
    Conference Report, Garmisch, Germany, October,
    1968, pp. 79-85.
  • [Mili, 1998] A. Mili, R. Mili, R. T. Mittermeir,
    A survey of software reuse libraries, Annals of
    Software Engineering, Vol. 5, 1998, pp. 349-414.
  • [Seacord, 1998] Robert C. Seacord, Scott A.
    Hissam, Kurt C. Wallnau. "Agora: A Search Engine
    for Software Components," IEEE Internet
    Computing, vol. 2, no. 6, pp. 62-70,
    November/December, 1998.
  • [Szyperski, 1999] Szyperski, C., Component
    Software: Beyond Object-Oriented Programming.
    Addison Wesley, 1999.
  • [Dey, 2001] Dey, A. Understanding and Using
    Context. Personal Ubiquitous Comput. 5, 1 (Jan.
    2001).
  • [Greengrass, 2001] Greengrass, Ed. Information
    Retrieval: A Survey. DOD Technical Report
    TR-R52-008-001, 2001.
  • [Ye, 2001] Ye, Y. and Fischer, G. Context-Aware
    Browsing of Large Component Repositories. In
    Proceedings of the 16th IEEE International
    Conference on Automated Software Engineering
    (November 26 - 29, 2001). ASE. IEEE Computer
    Society, Washington, DC, 99.
  • [Ye, 2002] Ye, Y. and Fischer, G. Information
    delivery in support of learning reusable software
    components on demand. In Proceedings of the 7th
    International Conference on Intelligent User
    Interfaces, California, USA.
  • [Ye, 2002] Ye, Y. and Fischer, G. Supporting
    Reuse by Delivering Task Relevant and
    Personalized Information. In Proceedings of the
    24th International Conference on Software
    Engineering, pp. 513-523, Orlando, Florida, May,
    2002.

59
Bibliography
  • [Inoue, 2003] K. Inoue et al. "Component Rank:
    Relative Significance Rank for Software Component
    Search", Proceedings of ICSE 2003.
  • [Maxville, 2003] Valerie Maxville, Chiou Peng
    Lam, Jocelyn Armarego. "Selecting Components: A
    Process for Context-Driven Evaluation," p. 456,
    10th Asia-Pacific Software Engineering
    Conference (APSEC'03), 2003.
  • [Maxville, 2004] Valerie Maxville, Jocelyn
    Armarego, Chiou Peng Lam. "Intelligent Component
    Selection," pp. 244-249, 28th Annual
    International Computer Software and Applications
    Conference (COMPSAC'04), 2004.
  • [Lucrédio, 2004] Lucrédio, D., Almeida, E. S.,
    Prado, A. F. A Survey on Software Components
    Search and Retrieval. In the 30th IEEE EUROMICRO
    Conference, Component-Based Software Engineering
    Track, 2004, Rennes, France. IEEE Press, 2004.
  • [Holmes, 2005] Holmes, R. and Murphy, G. C. 2005.
    Using structural context to recommend source code
    examples. In Proceedings of the 27th
    International Conference on Software Engineering
    (St. Louis, MO, USA, May 15 - 21, 2005). ICSE '05.

60
Imperfect technology in a working market is
sustainable; perfect technology without any
market will vanish (Szyperski, 1999)