1
Bibliographic relationships
Erik Thorlund Jepsen The Danish Library Agency
2
Outline
  • Bibliographic relations
  • Definitions and types
  • Importance of bibliographic relationships
  • Searching (navigating), identifying and
    selecting
  • Typologies
  • Utilizing bibliographic relationships in OPACs
    and other search tools (e.g. integrated search)
  • By cataloguing
  • By automatic means
  • Employing statistical measures
  • Conclusion

3
Definition of Bibliographic relationships
  • "A relationship between information entities
    exists when two entities are somehow
    associated with each other"
    (Vellucci, 1997, p. 105).
  • Though semantically precise, this definition does
    not provide much guidance for the identification
    of relationships, since associations rely on
    subjective judgements, assessing the relevance of
    connecting/relating two or more entities.

4
Bibliographic entities (FRBR)
5
Relations and associations
  • Who identifies a relation (associates)
  • Authors, publishers, indexers/cataloguers, users,
    system/rule based (incl. user tracks)
  • What are the associations based on?
  • Information in entities?
  • About another entity
  • About a relation between two other entities
  • Similarity
  • Use (and users)

6
Examples
  • Book is based on ..... (work, expression,
    manifestation)
  • Book is the 3rd edition of ....
  • In his book, Peter Ingwersen mentions the
    findings of Pia Borlund in .... (work,
    expression, manifestation)
  • A novel is part of an anthology
  • Books X and Y are written by the same author
  • Books X and Y share two descriptors
  • Person A borrows book X and CD Y
  • Article X and article Y are cited in article
    Z
  • Books X and Y share a lot of words
  • Books X and Y share a lot of references

7
Importance of relations
  • A little bit about FRBR and relations
  • Examples
  • Identification and selection
  • Use of linkages
  • Faceted search facilities
  • Importance as stated by FRBR and GFOD

8
FRBR - tasks
  • to find entities that correspond to the user's
    stated search criteria (i.e., to locate either a
    single entity or a set of entities in a file or
    database as the result of a search using an
    attribute or relationship of the entity)
  • to identify an entity (i.e., to confirm that the
    entity described corresponds to the entity
    sought, or to distinguish between two or more
    entities with similar characteristics)
  • to select an entity that is appropriate to the
    user's needs (i.e., to choose an entity that
    meets the user's requirements with respect to
    content, physical format, etc., or to reject an
    entity as being inappropriate to the user's
    needs)
  • to acquire or obtain access to the entity
    described (i.e., to acquire an entity through
    purchase, loan, etc., or to access an entity
    electronically through an online connection to a
    remote computer).
  • (Functional Requirements for Bibliographic
    Records, 1998, p.82)

9
FRBR additional tasks?
  • To relate: a fifth task?
  • Even more, FRBR reminds us of the importance of
    bibliographic relationships, and reminds us that
    we describe things in the bibliographic universe
    in order to meet specific user tasks: find,
    identify, select, obtain, and I add
    relate (Tillett, 2005, p. 198).
  • Yet information about relationships supports the
    three tasks to find, to identify and to select
    (e.g. it supports collocation, which is seen as
    part of to find)
  • In other words, to relate is a subtask of to
    find, to identify and to select.
  • Incorporating to relate as a fifth task could
    cause a breakdown of the model
  • To navigate: a fifth task?
  • Yes, often among groups of entities

10
Importance of relations: FRBR terms
  • Information about bibliographic relations between
    two or more bibliographic entities can support
    the user tasks find, identify and select.
    Relations stated in a bibliographic
    record can potentially:
  • Improve users' understanding of a given entity,
    which potentially strengthens the identification
    and selection/deselection of the entity.
  • Improve users' options for finding relevant
    entities by leading the way from a known (found)
    entity to related entities which are more
    relevant in a given situation.

11
Importance of relations
  • Furthermore, information about bibliographic
    relationships can strengthen users'
    understanding of the system (database) at hand
    and of the knowledge organization in the system, by
  • Creating groups of entities
  • and
  • Facilitating navigation in the bibliographic
    universe (the database/Catalogue)

12
Example: identification and selection
  • Draznin, Sandra Leigh: Børnenes restaurant
  • The book is based on the TV series Børnenes
    restaurant and contains simple recipes that
    8-12-year-olds can make themselves. This covers
    starters, main courses, desserts and snacks.
  • BOOK. 1st edition, 1st printing. TV 2, 2007
  • Recipes by Sandra Leigh Draznin; photos: Jes
    Buusmann; recipes on pages 20, 32, 42, 56 and
    74: Thomas Castberg Larsen; foreword and tips by
    Steffen Bjergved and Thomas Castberg Larsen;
    afterword by Carina Christensen. 1st edition.
    2007. 92 pages, illustrated in colour. Publisher:
    TV 2. Form: cookbooks, recipes. Shelf mark in
    public libraries: 64.1. The library recommends:
    from age 10. ISBN-13: 978-87-92121-13-4. Price at
    publication: DKK 199.00

13
Example: linking
14
(No Transcript)
15
(No Transcript)
16
Example: faceted search
17
(No Transcript)
18
Importance of linking: Danish example
  • Analysis of searches in bibliotek.dk
  • (20,506 searches on 20 December 2004)
  • Free text: 7%
  • Author: 34%
  • Link, author: 5%
  • Title: 20%
  • Descriptor/keyword: 11%
  • Link, descriptor/keyword ("more like this"
    and "Literature about ..."): 15%
  • Other (each at most 2%), total: 8%
  • Source: Kirsten Larsen, Deputy Head, The Danish
    Library Centre (DBC)

19
Importance: GFOD
  • User Principle: General guidelines for good
    practice in display design and criteria for
    effective screen displays as these relate to
    legibility, clarity, understandability and
    navigability
  • Content and Arrangement Principle 7: Support
    navigation from the displayed information to
    related information
  • (This principle is further divided into more
    specific and, I would add, ambitious principles.)

20
Bibliographic relationships: identification and
typologies
  • Associations/relations can be identified by
    analyzing:
  • Sets of documents
  • Existing information systems
  • Standards, rule sets and registration formats
  • Empirical studies of how users identify
    associations among groups of entities and
    assess their importance

21
Typologies
  • Category (holds between)
  • Equivalence relationships (copies, facsimiles,
    microforms and other similar reproductions)
  • Derivative relationships (versions, editions,
    revisions, translations)
  • Descriptive relationships (annotated editions,
    commentaries, reviews)
  • Whole-part relationships (selections from
    anthologies, collections, series, chapters vs.
    books)
  • Accompanying relationships (supplements,
    concordances, indexes)
  • Sequential relationships (sequels of a monograph,
    parts of a series)
  • Shared characteristics relationships (common
    author, publisher, title, subject)
  • Source: Vellucci's and Tillett's categories of
    relations (shortened description from
    Vellucci, 1997)

22
Content relationships (equivalence, derivative
and descriptive) are sometimes hard to
distinguish in practice.
23


24
(No Transcript)
25
(No Transcript)
26
Typology - FRBR
  • Relationships depicted in the high level diagrams
  • Other Relationships between Group 1 entities at
    these levels
  • Work-to-work
  • Expression-to-expression
  • Expression-to-work
  • Manifestation-to-manifestation
  • Manifestation-to-item
  • Item-to-item
  • Whole/Part at work, expression, manifestation and
    item Level
  • Not meant to be exhaustive!
  • Yet relationships are mapped to user tasks
    (alongside attributes)

27
Utilization
  • Three purposes when cataloguing information about
    relations and setting up system rules
  • Identification and understanding of relation
  • Linking from found entity to related entities
  • Displaying meaningful/useful sets of records

28
Relations expressed as links
  • Relations are expressed as implicit or explicit
    links, where explicit links are divided into
    directional and mechanical links (hyperlinks)
  • (Vellucci, 1997)
  • Hyperlinks are constructed by manual or
    computational means
  • Manual links are static and are commonly used to
    structure texts or to connect associatively
    related entities (by topic) (and to connect
    bibliographic families; added by ETJ)
  • Computational links can be created dynamically at
    search time and are primarily used to connect
    similar entities (e.g. based on shared
    characteristics; added by ETJ)
  • (Agosti, 1997)

29
Traditional means vs. new ways
  • Traditionally relations are handled in very
    different ways, caused by
  • Different types of relationships are handled
    differently
  • Non-specific rules
  • Variances between library systems
  • Differences in cataloguing policies
  • .......

30
Traditional means vs. new ways (2)
  • Traditional methods include (very generalized):
  • Work, expression, manifestation and (sometimes)
    item information included in one record,
  • including notes on predecessors;
  • this often applies to part-whole relations
  • Hyperlink structures based on controlled
    information about author, subject and (sometimes)
    title (most commonly links from author(s),
    descriptors and classification numbers)

31
Traditional means vs. new ways (3)
  • Initiatives to strengthen the utilization of
    bibliographic relations could be divided into
  • Initiatives that try to put the display of
    relationships on the agenda (e.g. GFOD)
  • Initiatives that rely on manual work and further
    development of rule sets and registration formats
    (e.g. Reuse)
  • Initiatives that add links to the display of
    bibliographic records by controlling data and
    implementing local rules for displays. (ad hoc,
    local or system-specific initiatives)
  • Initiatives that try to collocate the different
    expressions and manifestations of the work (e.g.
    Bibliotek.dk)
  • Initiatives that try to structure the catalogue
    (existing data) according to FRBR (e.g. FRBR
    Display Tool)
  • Faceted search facilities
  • The list is not exhaustive!

automatic or semi-automatic
32
Utilization and links: examples
  • One record structures (e.g. for accompanying
    relations)
  • Computational links for shared characteristics
    (The KB example renæssance)
  • Rules and codes (e.g. for derived relations)
  • Computational solutions for work display

33
One-record structures
34
Rules and codes example: Reuse
  • Widened use of a specific field in MARC formats to
    handle relations in a uniform way.
  • 787 Non-specific relationship entry
    (repeatable), with two subfields:
  • w Record control number (target to link the
    current record to)
  • g Relationship information (textual, optional)

35
Reuse (2)
  • To distinguish between the various relationships,
    and to make them specific, our simple model
    proposes the use of indicator 2 in 787, as yet
    undefined. This indicator might take on the
    following values (and here, a full-scale model
    would not have to differ); DC Simple terms for
    relations are given in parentheses:
  • 0 Equivalence (facsimile or reproduction)
    (IsFormatOf)
  • 1 Simultaneous edition (IsVersionOf)
  • 2 Successive derivation, edition, version
    (IsVersionOf)
  • 3 Amplification (incl. commentaries,
    illustrations, criticism etc.) (IsBasedOn)
  • 4 Extraction (abridgements, condensations,
    excerpts)
  • 5 Recordings of performances
  • 6 Adaptation, modification (change of genre or
    medium, arrangement) (IsFormatOf)
  • 9 Translations (IsVersionOf)
  • a Accompanying relationship (supplements of any
    kind) (IsRequiredBy)
  • p Part → whole relationship (IsPartOf)
  • r Review or other descriptive relationship
  • s Sequential relationship (like successive title
    of a serial)
  • u Unspecific relationship, based on shared
    characteristics of other kinds
  • (Eversberg, 1998)
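
The proposed 787 coding can be pictured as a small lookup table. The sketch below is only an illustration of the values listed on this slide: the class name `Link787`, the method `describe` and the record number `rec-0001` are hypothetical, and no real MARC library is used.

```python
from dataclasses import dataclass
from typing import Optional

# Indicator 2 values proposed for field 787, paired with the DC Simple
# term given on the slide (None where the slide lists no term).
RELATION_CODES = {
    "0": ("Equivalence", "IsFormatOf"),
    "1": ("Simultaneous edition", "IsVersionOf"),
    "2": ("Successive derivation", "IsVersionOf"),
    "3": ("Amplification", "IsBasedOn"),
    "4": ("Extraction", None),
    "5": ("Recording of performance", None),
    "6": ("Adaptation, modification", "IsFormatOf"),
    "9": ("Translation", "IsVersionOf"),
    "a": ("Accompanying relationship", "IsRequiredBy"),
    "p": ("Part-to-whole relationship", "IsPartOf"),
    "r": ("Review or other descriptive relationship", None),
    "s": ("Sequential relationship", None),
    "u": ("Unspecific relationship", None),
}


@dataclass
class Link787:
    """Hypothetical in-memory form of one 787 field."""
    indicator2: str              # relationship type code
    target_record: str           # subfield w: record control number
    note: Optional[str] = None   # subfield g: optional free text

    def describe(self) -> str:
        label, dc_term = RELATION_CODES[self.indicator2]
        text = f"{label} -> record {self.target_record}"
        if dc_term:
            text += f" ({dc_term})"
        if self.note:
            text += f": {self.note}"
        return text


# Made-up example link: the current record is a translation of rec-0001.
link = Link787(indicator2="9", target_record="rec-0001", note="Danish translation")
print(link.describe())
```

A display layer could use such a table to turn coded links into readable labels next to each hyperlink.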

36
On collocating the work
  • Most users seek particular works, not particular
    editions. Yet works are published in the form of
    editions the fundamental duty of descriptive
    cataloguing is to organize the resulting chaotic
    bibliographic universe to facilitate user access
    to works, and to allow them easily to select the
    edition of the work sought that best meets their
    needs (Yee, 1997, p.64).

37
Computational solutions for work display
  • FRBR Display Tool
  • Library of Congress FRBR Display Tool was
    developed to transform bibliographic data found
    in MARC 21 record files into meaningful displays
    by grouping them into the work, expression and
    manifestation FRBR entities. Based on XML
    technologies, the tool may be altered to meet the
    needs of individual institutions. It also shows
    how the theoretical portion of the FRBR model can
    be used practically to allow librarians to
    evaluate the consistency of their local
    bibliographic data

38
Work display: Bibliotek.dk (The Danish Union
Catalogue)
  • An example of an almost totally automatic
    initiative is the display of editions of a work
    in the Danish Union Catalogue, Bibliotek.dk
  • Attributes like author and title are used in a
    best-match algorithm to identify different
    editions of the work.
  • Due to a high level of authority control and the
    use of original titles, the different expressions
    of a work will normally be collocated in the
    search result.

39
bibliotek.dk - library.dk
  • End-user version of the Danish Union Catalogue
  • Sponsored by the Danish Library Agency, but
    maintained and developed by The Danish Library
    Centre (DBC)
  • Content
  • The Danish national bibliography
  • All titles in public libraries and research
    libraries in Denmark
  • Content is not 100% equivalent to the Union
    Catalogue (availability matters)
  • Works together with a national transportation
    system: users can pick up books from every
    library at their own (chosen) library

40
Adaptation of FRBR in bibliotek.dk
  • The records in bibliotek.dk represent
    manifestations (AACR2/danMARC2).
  • The aim is to present these records grouped
    according to the work they embody
  • At one point our definition differs from FRBR
  • For practical reasons we consider expressions in
    different languages to be different works.
  • You could also say that in this case we prefer
    grouping according to the expression of the work.
  • (Paul B. Jensen, Danish Library Center)

41
Implementing the work concept
  • The work-level display is based on matching and
    collocating manifestation records on the fly
  • This match is based on simple author and title
    data in normalized form
  • From the work level you can expand to the
    manifestations, select one (or more) and make a
    request

(Paul B. Jensen, Danish Library Center)
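
The matching step described above can be sketched as grouping manifestation records under a key built from normalized author and title. This is a minimal illustration, not the bibliotek.dk algorithm: the normalization rules, the record layout and the ids `m1` to `m3` are assumptions, and the rule that expressions in different languages count as different works is not modelled.

```python
import re
import unicodedata
from collections import defaultdict


def normalize(text: str) -> str:
    """Lowercase, drop combining accents and punctuation, collapse spaces."""
    decomposed = unicodedata.normalize("NFKD", text.lower())
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return " ".join(re.sub(r"[^\w\s]", " ", no_marks).split())


def work_key(record: dict) -> tuple:
    """Simple author + title key standing in for the 'work'."""
    return (normalize(record["author"]), normalize(record["title"]))


def group_by_work(records: list) -> dict:
    """Collocate manifestation records that share a work key."""
    works = defaultdict(list)
    for record in records:
        works[work_key(record)].append(record["id"])
    return dict(works)


# Hypothetical manifestation records (ids are made up).
records = [
    {"id": "m1", "author": "Brøgger, Suzanne", "title": "Linda Evangelista Olsen"},
    {"id": "m2", "author": "Brøgger, Suzanne", "title": "Linda Evangelista Olsen."},
    {"id": "m3", "author": "Brøgger, Suzanne", "title": "Jadekatten"},
]

works = group_by_work(records)
for key, ids in works.items():
    print(key, ids)
```

Because the grouping is computed at display time, no work-level records have to be stored; the price is that records bundling several works, as in the challenges noted later, cannot be split by such a key.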
42
Accomplishment
  • A more user-friendly interface (as confirmed by a
    majority of test-users)
  • A reduction of unnecessary inter-library loans,
    because it is easier to locate an edition to your
    local library (or libraries)
  • (Paul B. Jensen, Danish Library Center)

43
Challenges (read: problems)
  • In principle, a traditional AACR2/MARC record does
    not specify which bibliographic information
    refers to the work level and which to the
    expression/manifestation level
  • Many bibliographic items contain more than one
    work
  • Collected plays in one volume (e.g. Shakespeare)
  • 3 novels in one volume
  • 3 symphonies on one cd
  • Etc.
  • (Paul B. Jensen, Danish Library Center)

44
(No Transcript)
45
(No Transcript)
46
Neglect or choose edition
47
Show editions
Show full record
48
Other kinds of linkages
  • Author-pointers
  • citations, references and links
  • semantic equivalence (same as similarity below)
  • Use-determined
  • frequencies
  • Similarity-based
  • Co-occurrence of text-elements
  • e.g. words in text, citations (bibliographic
    coupling)
  • Third-party pointers
  • co-citations
  • articles, books ..
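
Bibliographic coupling, mentioned above as one similarity-based linkage, amounts to counting shared references. The sketch below is illustrative only: the function names are hypothetical, the two reference lists are invented for two imaginary books, and the Jaccard ratio is one possible normalization, not one named in the slides.

```python
def coupling_strength(refs_a, refs_b) -> int:
    """Number of references the two documents have in common."""
    return len(set(refs_a) & set(refs_b))


def jaccard(refs_a, refs_b) -> float:
    """Shared references divided by all distinct references (0.0 if none)."""
    union = set(refs_a) | set(refs_b)
    if not union:
        return 0.0
    return len(set(refs_a) & set(refs_b)) / len(union)


# Invented reference lists for two hypothetical books X and Y.
book_x = ["Vellucci 1997", "Tillett 2005", "Yee 1997"]
book_y = ["Vellucci 1997", "Yee 1997", "Agosti 1997"]

print(coupling_strength(book_x, book_y))  # shared references
print(jaccard(book_x, book_y))            # normalized overlap
```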

49
Types: Author pointers
[Diagram: Author; Entity -> Entity. Labels: "Entities citing, linking to or referring to other entities" and "Entities cited by, referred to by or linked to by other entities"]
50
Types: Use-determined relations
[Diagram: Entity - User - Entity. Label: "Entities bought or borrowed by the same user"]
51
Example: use-determined links
  • Novel. Suzanne Brøgger: Linda Evangelista Olsen /
    Suzanne Brøgger. 4th printing. - Copenhagen:
    Gyldendal, 2002. - 134 pages. The cat Linda is
    presumed to be a reincarnation of the author's
    mother, who was herself a cat. And that fits well
    with the life mother and cat lead, and their way
    of influencing their surroundings .... Previously:
    1st edition, 2001. Original edition: 2001. ISBN
    87-00-48736-8, paperback, DKK 175.00.
  • Others who have borrowed Suzanne Brøgger: Linda
    Evangelista Olsen have also borrowed: Suzanne
    Brøgger: Ja; Suzanne Brøgger: En gris som har
    været oppe at slås kan man ikke stege; Suzanne
    Brøgger: Creme fraiche; Suzanne Brøgger:
    Jadekatten; Suzanne Brøgger: Tone; Jan Lyderik
    Tangs saga. Bind 1-2
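
A use-determined link of the "others who have borrowed this have also borrowed" kind can be derived by counting co-loans. The sketch below is a minimal illustration under assumptions: the loan data and borrower ids are invented, the function name `also_borrowed` is hypothetical, and a real system would also have to deal with privacy and with thresholds for rare titles.

```python
from collections import Counter


def also_borrowed(loans: dict, title: str, top_n: int = 5) -> list:
    """Rank other titles by how many borrowers of `title` also borrowed them.

    loans maps a borrower id to the set of titles that borrower has had on loan.
    """
    counts = Counter()
    for borrowed in loans.values():
        if title in borrowed:
            counts.update(borrowed - {title})
    return counts.most_common(top_n)


# Invented loan histories (borrower ids are made up).
loans = {
    "user1": {"Linda Evangelista Olsen", "Jadekatten", "Tone"},
    "user2": {"Linda Evangelista Olsen", "Jadekatten"},
    "user3": {"Creme fraiche", "Tone"},
}

suggestions = also_borrowed(loans, "Linda Evangelista Olsen")
print(suggestions)
```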

52
Similar entities
  • Statistically based, e.g. the vector space model
    using tf-idf weights

[Diagram: Entity 1 and Entity 2 overlapping in "Shared elements"]
53
Statistically based similarity between documents
  • Similarity measures between documents are used to
    identify relations. Links are established on the
    basis of threshold values.
  • The most widespread technique is tf x idf
    weighting in connection with the vector space
    model.
  • This will typically involve the following:
  • Significant features/words are identified
    (stop word list)
  • The words are weighted based on tf x idf and
    possibly positional parameters
  • The similarity between Doc. 1 and all other
    documents in the database is computed using a
    given algorithm (e.g. as the cosine between two
    vectors (documents))
  • Similarity values exceeding a given threshold
    => a link is established
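
The steps on this slide can be sketched in a few lines of plain Python. This is an illustration under stated assumptions rather than the method of any particular system: the stop word list, the sample texts and the threshold of 0.05 are invented, idf is taken as log(N/n), and positional weighting is left out.

```python
import math
from collections import Counter

STOP_WORDS = {"the", "of", "and", "a", "in", "to"}  # tiny illustrative list


def tokenize(text: str) -> list:
    """Split on whitespace and drop stop words."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def tfidf_vectors(docs: dict) -> dict:
    """Map each doc id to a {term: tf * log(N/n)} vector."""
    tokens = {doc_id: tokenize(text) for doc_id, text in docs.items()}
    n_docs = len(docs)
    df = Counter(term for toks in tokens.values() for term in set(toks))
    vectors = {}
    for doc_id, toks in tokens.items():
        tf = Counter(toks)
        vectors[doc_id] = {t: tf[t] * math.log(n_docs / df[t]) for t in tf}
    return vectors


def cosine(u: dict, v: dict) -> float:
    """Cosine of the angle between two sparse vectors."""
    dot = sum(w * v[t] for t, w in u.items() if t in v)
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_links(docs: dict, threshold: float) -> list:
    """Return (doc_a, doc_b, score) for every pair scoring above threshold."""
    vectors = tfidf_vectors(docs)
    ids = sorted(vectors)
    links = []
    for i, a in enumerate(ids):
        for b in ids[i + 1:]:
            score = cosine(vectors[a], vectors[b])
            if score > threshold:
                links.append((a, b, score))
    return links


# Invented sample texts.
docs = {
    "doc1": "latent semantic indexing of documents",
    "doc2": "latent semantic analysis and document space models",
    "doc3": "recipes for children in the kitchen",
}
links = similarity_links(docs, threshold=0.05)
print(links)
```

With these texts only doc1 and doc2 share weighted terms, so they are the only pair linked; the choice of threshold decides how many such links a record display would show.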

54
Example
  • Term A occurs in doc1, compared with doc2
  • tfA = 5, but the term occurs once as metadata
    and once in a heading => tfA = 7 (3 x 1 + 2 x 2)
    (Croft uses tf relative to the most frequent
    term in the document)
  • idf = 1/1000 (Croft uses log N/n)
  • Term weight for term A between doc1 and doc2:
    7/1000
  • All term weights enter into the algorithm, e.g.
    the computation of the cosine.
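
The arithmetic of this example can be checked with a short sketch. The position names and the doubling of metadata and heading occurrences are read off the slide's own calculation (3 x 1 + 2 x 2); the function name `weighted_tf` is hypothetical.

```python
from fractions import Fraction

# Positional weights: an occurrence in metadata or in a heading counts
# double, matching the slide's calculation.
POSITION_WEIGHTS = {"body": 1, "metadata": 2, "heading": 2}


def weighted_tf(occurrences: list) -> int:
    """Sum the positional weight of every occurrence of the term."""
    return sum(POSITION_WEIGHTS[position] for position in occurrences)


# Five occurrences of term A: three in running text, one as metadata,
# one in a heading.
occurrences = ["body", "body", "body", "metadata", "heading"]

tf_a = weighted_tf(occurrences)   # 3 x 1 + 2 x 2
idf_a = Fraction(1, 1000)         # idf value used on the slide
term_weight = tf_a * idf_a

print(tf_a, term_weight)
```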

55
(No Transcript)
56
Similarity example
Similar pages
57
(No Transcript)
58
Third-party pointers
  • Co-citations
  • Similar to "others who have borrowed this book
    have also borrowed these materials"
  • But from an author/domain perspective
  • Example from CiteSeer ->
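
Co-citation links can be derived by counting how often two documents appear together in the reference lists of later documents, and keeping pairs at or above a threshold. The sketch below is illustrative only: the citation data, the document labels and the threshold of 2 are invented, and the function names are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cocitation_counts(references: dict) -> Counter:
    """Count, for every pair of cited documents, how many documents cite both.

    references maps a citing document to the set of documents it cites.
    """
    counts = Counter()
    for cited in references.values():
        for pair in combinations(sorted(cited), 2):
            counts[pair] += 1
    return counts


def cocitation_links(references: dict, threshold: int) -> dict:
    """Keep only pairs co-cited at least `threshold` times."""
    return {
        pair: n
        for pair, n in cocitation_counts(references).items()
        if n >= threshold
    }


# Invented citation data: Z1..Z3 are citing articles.
references = {
    "Z1": {"X", "Y"},
    "Z2": {"X", "Y", "W"},
    "Z3": {"X", "W"},
}

co_links = cocitation_links(references, threshold=2)
print(co_links)
```

Unlike a co-loan count, the pairs here are created by citing authors, which is why the slide describes the link as coming from an author/domain perspective.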

59
Citeseer example
  • Abstract: Latent Semantic Indexing (LSI) is a
    technique for representing documents, queries,
    and terms as vectors in a multidimensional
    real-valued space. The representations are
    approximations to the original term space
    encoding, and are found using the matrix
    technique of Singular Value Decomposition. In
    comparison, Multidimensional Scaling (MDS) is a
    class of data analysis techniques for
    representing data points as points in a
    multidimensional real-valued space. The objects
    are represented so that...
  • Cited by:
  • Automated Modeling and Nonlinear Axis Scaling -
    Leejay Wu (2005)
  • Similar documents (at the sentence level):
  • 8.5 Optimizing Ranking Functions: A
    Connectionist Approach to.. - Bartell (1994)
  • Active bibliography (related documents):
  • 0.2 A Survey of Information Retrieval and
    Filtering Methods - Faloutsos, Oard (1996)
  • 0.2 Document Space Models Using Latent
    Semantic Analysis - Gotoh, Renals (1997)
  • 0.2 Approximating Matrix Multiplication for
    Pattern Recognition Tasks - Cohen, Lewis (1997)
  • Similar documents based on text:
  • 0.5 Chapter 15: Getting Better Results With
    Latent Semantic Indexing - Nakov (2000)
  • 0.4 Image Retrieval using Latent Semantic
    Indexing - Pecenovic (1997)
  • 0.4 On the Use of Singular Value Decomposition
    for Text Retrieval - Husbands, Simon, Ding (2000)
  • Related documents from co-citation:
Co-citation threshold
60
Conclusion and perspectives: Designing OPACs and
integrated search tools according to relations
  • A lot of possibilities: many types of
    relationships to display and utilize in different
    ways
  • Bibliographic families, shared characteristics,
    whole-part and other bibliographic relations
  • Similarity (statistical), co-citations, user
    defined (co-use)
  • Among others
  • Need for careful design of system features/link
    structures and a lot of testing (measuring not
    only user satisfaction but, essentially,
    improved search results)
  • In other words: pick the functionalities that
    work for the user, not the ones you like or are
    familiar with
  • In general, we lack large-scale user
    investigations