Title: Mission Possible: The University of Hong Kong Libraries strategies in processing hundred thousands o
1Mission Possible The University of Hong Kong
Libraries strategies in processing hundred
thousands of e-Chinese materials
- Connie Lam
- University of Hong Kong Libraries
- OCLC CJK annual meeting April 4, 2008, Atlanta
2Outline
- The E-Chinese Collection in HKU
- Cataloguing approach
- Difficulties encountered
- Our solutions
- Lessons we learnt
- The Challenges
3The e-Chinese Collection
- Major acquisitions
- 1998 Sikuquanshu (四庫全書)
- 1999 China Journal Net (China Academic Journal)
- 2000 China Core Newspapers Database
- China Doctors Masters Dissertations
- 2003 Apabi e-book
- 2005 SuperStar Digital Library
4The e-Chinese Collection
- e-journals 11,967
- e-news 941
- e-books 400,000
- e-theses 624,300
5Cataloguing approach
- Catalog what?
- Purchased e-resources.
- Free e-resources selected by subject librarians
6Cataloguing approach
- Catalog at database level once access is set up
- Catalog at individual title level only if
- Title list is provided by the vendor
- Individual title has full-text access
- Separate record approach: every format for a title has its own record
- Single record approach: the same title from different vendors has only one record
7Cataloguing approach
- Why catalog?
- Provide one-stop search for users
- Assist users in discovery and access
- Increase the visibility of the resources
- Promote access to the resources
8Problems encountered in cataloguing e-journals
- Insufficient information
- When we started our cataloguing project of CJN (CAJ) titles in 2000, we only had the title list in print format provided by the vendor.
- The title list had title and issuing body only: no ISSN, no information on title relationships, no coverage details.
9Problems encountered in cataloguing e-journals
- We started the cataloguing work with titles for which we had print records.
- We relied on reference tools to obtain information on other titles.
- We checked full-text images to verify changes in titles.
- There were only around 3,500 titles in the package.
- We spent around 2 years finishing them.
10Problems encountered in cataloguing e-journals
- We requested the vendor to provide detailed information in an Excel file, including
- Title
- Title in English, if any
- ISSN
- Frequency
- Publisher
- We accessed the OPACs of the National Library of China and CALIS to obtain more information on the titles.
11Problems encountered in cataloguing e-journals
- With more information in hand, we were able to speed up our cataloguing work.
- The latest Excel file we received has information on coverage and former titles, per our request.
12Problems encountered in cataloguing e-journals
- But
- Information accuracy is still a problem.
- Title changes are difficult to trace.
- Coverage information is not clear.
- Verifying the information by checking other catalogs and full-text images online is time-consuming.
- It is difficult to keep the catalog up to date, as no regular updating information is provided.
13Problems encountered in cataloguing e-books
- Vendor-provided MARC records
- Problems with the Chinese scripts.
- Do not follow AACR2.
- Improper tags and incorrect indicators used.
- Do not follow LC Pinyin guidelines.
- Do not follow LC name authorities.
- No subject headings.
- Multiple records for a multi-volume set.
- Multiple records for serial titles.
14Problems encountered in cataloguing e-books
- Our action
- Communicate with the vendor about the desired standard
- Provide advice to enhance the records
- Provide example records to the vendor
- Why the vendor was unable to fulfill our requirements
- Records are created from the metadata in the database
- Problems cannot be solved by using a program
- Problems cannot be tackled in batch
15Problems encountered in cataloguing e-books
- We decided to
- Request the vendor to provide a title list with the following data in an Excel file
- Author
- Title
- Publisher
- ISBN
- Direct URL
- Control number
- Create records ourselves
16Our strategy
- Handle titles for which we have print records first.
- We estimated that 20-30% of the titles have print records in the HKUL catalog.
- Based on the title list, we developed computer programs
- To search for the ISBN or title of print records in the HKUL catalog.
- To generate records based on the print records.
- To add the URLs and the control numbers to the records.
- Use the Global Update function of Innopac to add constant data of the e-version to the records.
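The ISBN lookup step in this strategy can be illustrated with a minimal sketch. It is not the program HKUL wrote. It assumes the vendor title list has been exported from Excel to CSV, and the column names (`control_no`, `title`, `isbn`, `url`), the record numbers, and the `example.org` URLs are all hypothetical. A row is treated as matched only when its ISBN points to exactly one print record; everything else is left for manual searching.

```python
import csv
import io


def build_isbn_index(print_records):
    """Map each ISBN to the list of print records that carry it."""
    index = {}
    for record in print_records:
        for isbn in record.get("isbns", []):
            index.setdefault(isbn, []).append(record)
    return index


def match_title_list(rows, isbn_index):
    """Split vendor rows into (row, print_record) matches and unmatched rows."""
    matched, unmatched = [], []
    for row in rows:
        candidates = isbn_index.get(row["isbn"].strip(), [])
        if len(candidates) == 1:
            matched.append((row, candidates[0]))
        else:
            # No ISBN, no hit, or several hits: leave for manual searching.
            unmatched.append(row)
    return matched, unmatched


# Hypothetical sample data standing in for the vendor file and the catalog.
VENDOR_CSV = """control_no,title,isbn,url
V001,Sample title A,7100000001,http://example.org/book/V001
V002,Sample title B,7100000002,http://example.org/book/V002
V003,Sample title C,,http://example.org/book/V003
"""

print_records = [
    {"record_no": "b1000001", "title": "Sample title A", "isbns": ["7100000001"]},
    {"record_no": "b1000002", "title": "Another title", "isbns": ["7100000009"]},
]

rows = list(csv.DictReader(io.StringIO(VENDOR_CSV)))
matched, unmatched = match_title_list(rows, build_isbn_index(print_records))
```

Keeping the unmatched rows in a separate list mirrors the idea of handling titles with print records first and dealing with the rest later.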
17Problems encountered in cataloguing e-book
- Search titles by program → Fail
- Relying on the title alone is not reliable.
- A lot of titles in the file have errors.
- The same title sometimes has duplicate entries due to
- Different punctuation used
- Simplified/Traditional characters
- Different spacing used
- We tried one batch and found that too much effort was required for the clean-up.
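The three causes of duplicate entries listed above (punctuation, Simplified/Traditional characters, spacing) can be illustrated with a normalized comparison key. This is only a sketch: the function names are hypothetical, and `TRAD_TO_SIMP` is a five-character illustrative table, whereas a real run would need a complete Simplified/Traditional conversion table. It also does nothing about titles that are simply wrong in the file.

```python
import unicodedata

# Tiny illustrative mapping only; a complete Traditional-to-Simplified
# table would be needed for real data.
TRAD_TO_SIMP = str.maketrans({"書": "书", "庫": "库", "記": "记", "學": "学", "國": "国"})


def title_key(title):
    """Build a comparison key that ignores punctuation, spacing and script variant."""
    # NFKC folds full-width punctuation and letters to their ASCII forms.
    text = unicodedata.normalize("NFKC", title).translate(TRAD_TO_SIMP)
    kept = [
        ch for ch in text
        # Drop whitespace, punctuation (P*), symbols (S*) and separators (Z*).
        if not ch.isspace() and unicodedata.category(ch)[0] not in ("P", "S", "Z")
    ]
    return "".join(kept).casefold()


def group_duplicates(titles):
    """Return groups of titles that share the same comparison key."""
    groups = {}
    for title in titles:
        groups.setdefault(title_key(title), []).append(title)
    return [group for group in groups.values() if len(group) > 1]


sample_titles = ["史記（上）", "史记 (上)", "四庫全書", "论语"]
duplicate_groups = group_duplicates(sample_titles)
```

A key like this only proposes candidate duplicates; a cataloger still has to confirm that two entries are the same work.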
18Problems encountered in cataloguing e-book
- Search ISBN by program → Fail
- The matching rate was much lower than our estimate
- Some publishers use the same ISBN
- For different titles.
- For the same title, but different editions
- We used the ISBN program to process 3 batches and had to re-check every title in these batches to ensure the records were correct.
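Two checks could be run on a title list before any ISBN matching: validating the ISBN check digit, and flagging ISBNs that are attached to more than one title. The sketch below uses the standard ISBN-10 and ISBN-13 check-digit rules; the function names and the row layout (`isbn`, `title` keys) are hypothetical. It does not catch the second case above, one ISBN reused across editions of the same title, which would need an edition column to detect.

```python
def clean_isbn(raw):
    """Keep only digits and a possible X check character."""
    return "".join(ch for ch in raw.upper() if ch in "0123456789X")


def is_valid_isbn(raw):
    """Check the ISBN-10 (mod 11) or ISBN-13 (mod 10) check digit."""
    isbn = clean_isbn(raw)
    if len(isbn) == 10:
        if "X" in isbn[:9]:
            return False  # X is only allowed as the final check character
        total = sum(
            (10 - i) * (10 if ch == "X" else int(ch)) for i, ch in enumerate(isbn)
        )
        return total % 11 == 0
    if len(isbn) == 13:
        if "X" in isbn:
            return False
        total = sum(int(ch) * (1 if i % 2 == 0 else 3) for i, ch in enumerate(isbn))
        return total % 10 == 0
    return False


def ambiguous_isbns(rows):
    """Return {isbn: sorted titles} for ISBNs attached to more than one distinct title."""
    seen = {}
    for row in rows:
        isbn = clean_isbn(row["isbn"])
        if isbn:
            seen.setdefault(isbn, set()).add(row["title"])
    return {isbn: sorted(titles) for isbn, titles in seen.items() if len(titles) > 1}


# Hypothetical rows: two different titles sharing one ISBN.
sample_rows = [
    {"isbn": "0-306-40615-2", "title": "Title A"},
    {"isbn": "0306406152", "title": "Title B"},
    {"isbn": "978-0-306-40615-7", "title": "Title C"},
]
shared = ambiguous_isbns(sample_rows)
```

Rows with an invalid or shared ISBN can then be routed straight to manual searching instead of being matched by program.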
19Problems encountered in cataloguing e-book
- We also found that the vendor-provided title list contains
- Incorrect metadata
- ISBN
- Author
- Title
- Edition
- Publisher
- Series
- Duplicated entries
20Our solutions
- We realized that
- Manually searching for matching records is the only solution.
- We can generate some revenue by uploading records to OCLC, which helps to cover some of the costs.
- We expanded the search to other libraries via Z39.50
- We set up a small project team for cataloguing Chinese e-books.
21Our solutions
- Division of labor
- We analyzed the job and divided the task into small pieces.
- Members of the newly set up team are mainly responsible for searching. No editing is required.
- Minimal training is required.
- Efficiency
- High productivity
22Our workflow
- Analyze the Excel file to locate serial titles and collectanea, and exclude them from the searching list.
- Serial titles are cataloged by traditional cataloging.
- Use the searching list to locate e-records in the HKUL catalog.
- If found, add the URL to the existing e-record
23Our workflow
- Use Z39.50 to search selected libraries, including HKUL.
- Identify fully cataloged records → Save the record and suppress it → Put the record no. in the Excel file.
- Use a program to add the URLs and the control numbers to the records.
- Use the Global Update function of Innopac to add constant data of the e-version to the records.
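The step of adding URLs and control numbers by program can be sketched on a simplified, MARC-like record held as plain Python data. The slides do not say which fields HKUL used, so the choices here are assumptions: the URL goes into an 856 field with indicators `40` and subfield `u`, and the vendor control number goes into an 035 field in the form `(source)number`. The record structure, the function names, and the `example.org` URL are hypothetical.

```python
import copy


def make_field(tag, indicators, subfields):
    """Build one MARC-like field as a plain dictionary."""
    return {"tag": tag, "ind": indicators, "subfields": list(subfields)}


def add_e_access(record, url, control_no, source="vendor"):
    """Return a copy of the record with a control number and a URL field added.

    Assumption: 035 carries the vendor control number and 856 the direct URL.
    """
    updated = copy.deepcopy(record)  # leave the saved record untouched
    updated["fields"].append(
        make_field("035", "  ", [("a", f"({source}){control_no}")])
    )
    updated["fields"].append(make_field("856", "40", [("u", url)]))
    # Keep fields in tag order, as catalog records are normally displayed.
    updated["fields"].sort(key=lambda field: field["tag"])
    return updated


# Hypothetical record saved from a Z39.50 search, plus one worksheet row.
saved_record = {
    "record_no": "b1000001",
    "fields": [make_field("245", "10", [("a", "Sample title")])],
}
e_record = add_e_access(saved_record, "http://example.org/book/V001", "V001")
```

Constant data for the e-version is not handled here, since the workflow adds it afterwards with the Global Update function of Innopac.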
24A typical worksheet after searching
25Our workflow
- Create a review file for the records
- Pass the file to experienced catalogers for quick review, mainly to identify out-dated cataloguing practice, human errors that cannot be tackled by batch update, and local notes in the note field.
- Release the records and upload to OCLC
- Create simple records for titles without matching records, i.e. descriptive cataloging and authority control, but no subject heading and no call number.
- Simple records are also reviewed by experienced catalogers and uploaded to OCLC.
26E-Chinese resources we cataloged
- e-journals: 11,967 titles, no backlog
- e-news: 941 titles, no backlog
- e-books: 108,974 titles, backlog in Apabi and SuperStar
- e-theses: 1,025 titles, cataloguing project for CDMD not yet started
27We contributed
28Goal 1 One stop search
29Goal 2 Assist resources discovery
- Titles in collectanea that are not searchable in the database can be located by searching the HKUL catalog.
- Title information in the HKUL catalog is more precise than in the database, which helps users locate the resources easily.
34Goal 3 Resources visibility
- We are told that our students search the web, but not the catalog. By uploading the records to OCLC, and with Open WorldCat, our users can identify the e-books by using Google or Yahoo.
- Our collection becomes more visible.
36Lessons learnt
- Programs can help with some procedures, but manual checking is still very important.
- Cataloging is a labor-intensive job. There is no magic.
- Communication with the vendor is the most important task. Accurate metadata is extremely important for fast processing.
37The challenges
- Quality control: searching and matching is faster than review.
- Searching and matching is a repetitive and boring task. How can we help staff maintain high accuracy and efficiency?
- Training of inexperienced staff
38What can be done?
- Urge vendors to provide a unique control number for each individual title in the database, which is very useful in shared cataloguing.
- Work with vendors for detailed and precise metadata.
39- Thank You!
- csllam_at_hkucc.hku.hk