New Applications of Tagging - PowerPoint PPT Presentation

1
New Applications of Tagging
  • November 17, 2006
  • Jaesun Han (jshan0000_at_gmail.com)
  • Research Fellow / Ph.D
  • ANLAB, Dept. of EECS, KAIST
  • Contact: http://www.web2hub.com

2
Contents
  • GeoTagging
  • Auto-Tagging
  • Auto-Tagging for Text
  • Auto-Tagging for Image
  • Tagging the physical world

3
GeoTagging
4
GeoTagging
  • GeoTagging (also referred to as GeoCoding)
  • Adding geographical identification metadata to
    various media such as websites, RSS feeds, or
    images
  • latitude and longitude coordinates (optionally
    altitude and place names)
  • non-coordinate based geographical identifiers (a
    postal address, etc)
  • Applications
  • GeoTagging-enabled image search engine
  • GeoTagging-enabled information services
  • Examples: TripTracker, Zooomr, Mappr, Platial, etc.

5
GeoTagging techniques
  • Manually inputting
  • Initial GeoTagging convention by GeoBloggers
  • geotagged
  • geo:lat=latitude, e.g. geo:lat=51.4989
  • geo:lon=longitude, e.g. geo:lon=-0.1786
  • Manually positioning on a map
  • geotags are then added automatically
  • Examples: Flickr, Picasa with Google Earth
  • Using a location-aware device such as a GPS
    receiver
  • JPEG and TIFF image file formats can store the
    geographical coordinates in the EXIF header
  • Examples: cameraphone with ZoneTag and a
    Bluetooth GPS
  • Digital camera with Sony GPS-CS1
  • Using a scene recognition program
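The manual convention above (a plain "geotagged" tag plus geo:lat= and geo:lon= tags) is easy to read back by machine. A minimal sketch, assuming the tags arrive as a list of strings; the function name parse_geotags is hypothetical and not part of any photo service's API:

```python
from typing import Iterable, Optional, Tuple


def parse_geotags(tags: Iterable[str]) -> Optional[Tuple[float, float]]:
    """Return (latitude, longitude) from geo:lat= / geo:lon= tags, or None."""
    lat = lon = None
    for tag in tags:
        key, sep, value = tag.strip().lower().partition("=")
        if not sep:
            continue  # plain tags such as "geotagged" carry no value
        try:
            number = float(value)
        except ValueError:
            continue  # ignore tags whose value is not a number
        if key == "geo:lat":
            lat = number
        elif key == "geo:lon":
            lon = number
    if lat is None or lon is None:
        return None
    # Reject coordinates outside the valid latitude/longitude ranges.
    if not (-90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0):
        return None
    return lat, lon


photo_tags = ["london", "geotagged", "geo:lat=51.4989", "geo:lon=-0.1786"]
print(parse_geotags(photo_tags))
```

A photo that carries only one of the two coordinate tags is treated as not geotagged, which is a design choice of this sketch rather than part of the convention.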

6
Representation of GeoTag in HTML
GeoTags
GeoURL
GeoRSS
geo microformats
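As a rough illustration of the four notations listed above, the sketch below renders one coordinate in each of them: the geo.position meta tag (GeoTags), the ICBM meta tag (GeoURL), a GeoRSS-Simple point, and the geo microformat. The function name geo_representations is hypothetical, and only the bare coordinate element of each format is shown.

```python
def geo_representations(lat: float, lon: float) -> dict:
    """Return the same coordinate in four common geotag notations."""
    return {
        # GeoTags meta tag: latitude and longitude separated by a semicolon
        "GeoTags": f'<meta name="geo.position" content="{lat};{lon}">',
        # GeoURL meta tag: latitude and longitude separated by a comma
        "GeoURL": f'<meta name="ICBM" content="{lat}, {lon}">',
        # GeoRSS-Simple: latitude and longitude separated by a space
        "GeoRSS": f"<georss:point>{lat} {lon}</georss:point>",
        # geo microformat: class names mark up visible text
        "geo microformat": (
            f'<span class="geo"><span class="latitude">{lat}</span>, '
            f'<span class="longitude">{lon}</span></span>'
        ),
    }


for name, markup in geo_representations(51.4989, -0.1786).items():
    print(f"{name}: {markup}")
```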
7
James Reserve Observing Systems at CENS
http://dms.jamesreserve.edu/
http://dms.jamesreserve.edu/jrcensweb/GE_GUI_v8.php
8
SenseWeb at MSR
  • Goal: publishing and querying real-time data
    over geo-centric web interfaces
  • Common platform and set of tools
  • For data owners to easily publish their data
  • For users to make useful queries over the live
    data sources
  • Example data
  • temperature, humidity
  • weather data
  • parking space
  • restaurant's wait time
  • traffic camera images
  • all types of image, audio, and video

http://atom.research.microsoft.com/sensormap/
http://research.microsoft.com/nec/senseweb/
9
Auto-Tagging
10
Auto-Tagging for Text
  • Paper: Improved Annotation of the Blogosphere via
    Autotagging and Hierarchical Clustering (WWW
    2006)
  • Two Questions
  • Do tags provide users with the necessary
    descriptive power to successfully group articles
    into sets? (for discovering)
  • Can tags help with the search task? (for searching)
  • Experiments
  • Retrieved the top 350 tags from Technorati, and
    then the 250 most recent articles for each tag
  • All articles that share a tag are assigned to a
    tag cluster
  • Articles are converted into weighted vectors,
    using TFIDF to assign weights to each word
  • Similarity is measured by the average pairwise
    cosine similarity of all articles in each cluster
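The measurement described above can be sketched in a few lines. This is a minimal illustration, not the paper's code: it assumes whitespace tokenization, a plain tf x log(N/df) weighting, and IDF computed over all articles of all clusters together. All function names are hypothetical.

```python
import math
from collections import Counter
from itertools import combinations


def tfidf_vectors(docs):
    """Turn tokenized documents into sparse TFIDF vectors (dicts)."""
    n = len(docs)
    df = Counter(word for doc in docs for word in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {w: (c / len(doc)) * math.log(n / df[w]) for w, c in tf.items()}
        )
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is all zeros."""
    dot = sum(weight * b.get(word, 0.0) for word, weight in a.items())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def average_pairwise_similarity(vectors):
    """Mean cosine similarity over all unordered pairs in one cluster."""
    pairs = list(combinations(vectors, 2))
    if not pairs:
        return 0.0
    return sum(cosine(a, b) for a, b in pairs) / len(pairs)


def cluster_similarities(clusters):
    """Score each tag cluster; `clusters` maps a tag to its article texts."""
    tags = list(clusters)
    docs = [text.lower().split() for tag in tags for text in clusters[tag]]
    vectors = tfidf_vectors(docs)
    scores, start = {}, 0
    for tag in tags:
        size = len(clusters[tag])
        scores[tag] = average_pairwise_similarity(vectors[start:start + size])
        start += size
    return scores


example = {
    "photography": ["camera lens photo exposure", "camera photo tripod lens"],
    "life": ["camera review today", "recipe for dinner today"],
}
print(cluster_similarities(example))
```

A higher score means the articles sharing that tag use more similar vocabulary, which is the sense in which a tag "groups" articles in the experiment.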

11
Auto-Tagging for Text
  • Cluster by random selection: similarity 0.1 to 0.2
  • Cluster by user tag: similarity 0.2 to 0.3
  • Cluster by Google News: similarity 0.4
  • Cluster by auto-tagging: similarity 0.5 to 0.7
12
Auto-Tagging for Text
  • Automated Tagging
  • Automatically assigning tags based on the content
    of an article
  • Auto-Tagging technique of this paper
  • Assign TFIDF scores to all words and extract the
    top three highest-scoring words as tags
  • Experimental results
  • Significantly better similarity scores than
    user tagging
  • Automated tagging produces more focused, topical
    clusters
  • Tags extracted from user text are more helpful in
    creating specific categories than user-selected
    tags are
  • Discussion
  • Can tags help with the search task for text?
  • How different is auto-tagging from indexing for
    search?
  • Can auto-tagging work together with user tagging?
  • Are there other auto-tagging techniques for text?
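The auto-tagging technique on this slide (score every word with TFIDF, keep the top three) can be sketched as follows. The weighting formula, the tokenization, and the alphabetical tie-break are assumptions of this sketch; the function name auto_tags is hypothetical.

```python
import math
from collections import Counter


def auto_tags(article, corpus, k=3):
    """Return the k highest-scoring TFIDF words of `article` as tags.

    `article` is a list of tokens and `corpus` a list of token lists
    (the article itself is normally one of them).
    """
    n = len(corpus)
    df = Counter(word for doc in corpus for word in set(doc))
    tf = Counter(article)
    scores = {
        w: (c / len(article)) * math.log(n / df[w])
        for w, c in tf.items()
        if df[w]  # skip words that never occur in the corpus
    }
    # Highest score first; ties are broken alphabetically for stable output.
    ranked = sorted(scores, key=lambda w: (-scores[w], w))
    return ranked[:k]


corpus = [
    "the camera takes the photo and the camera stores the gps tag".split(),
    "the blog post and the comment".split(),
    "the map and the route".split(),
]
print(auto_tags(corpus[0], corpus))
```

Words that occur in every article ("the", "and") get a score of zero, so the selected tags lean toward terms that are specific to the article. That is one plausible reading of why auto-tagged clusters come out more focused than user-tagged ones.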

13
Auto-Tagging for Image
  • Existing image retrieval
  • Google Image Search
  • Google uses surrounding text and ignores contents
    of the image
  • Tag Search
  • Automatic image annotation
  • Automatically annotating by analyzing contents of
    images
  • Advantages
  • No manual annotation
  • Easy to handle textual queries
  • Does not ignore contents of the images

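[Figure: two example photos with their user tags, shown as results of tag searches]
  • Photo tagged "dahlia, golden, gate, park, flower, fog" (search by "gate")
  • Photo tagged "cameraphone, animal, dog, tyson" (search by "cameraphone")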
14
Automatic image annotation
15
Case Study: ALIPR (Automatic Linguistic Indexing
of Pictures - Real Time)
  • Goal: Can a computer do this?

Example annotation: building, sky, lake, landscape, Europe, tree
  • References
  • Automatic Linguistic Indexing of Pictures by a
    Statistical Modeling Approach (IEEE Transactions
    on Pattern Analysis and Machine Intelligence
    2003)
  • Real-time Computerized Annotation of Pictures
    (ACM Multimedia Conference 2006)
  • Project homepage: http://wang.ist.psu.edu/IMAGE/
  • Online demo: http://alipr.com/

16
ALIPR
  • Image DB construction process
  • Corel image database: 100 images x 600 CD-ROMs
    (by topic)
  • Each concept is manually annotated with a few
    keywords (total of 332 distinct words)
  • Training Process
  • Feature extraction: color and texture
  • Region segmentation: k-means clustering of
    feature vectors
  • Statistical modeling: discrete distribution
    (D2-) clustering
  • Annotation Process
  • Image signature extraction
  • Computing concept likelihood score
  • Computing the probability for each word
  • Selecting top ranked words

17
ALIPR: 600 Categories of Images
  • Image database: Corel Stock Photo Dataset
  • 100 images x 600 CD-ROMs
  • Images on each CD-ROM are categorized under the
    same topic
  • Category descriptions are assigned manually
    (total of 332 distinct words)
18
ALIPR: A Category of Images
  • Concept: Paris/France
  • Annotation: Paris, European, historical building,
    beach, landscape, water
19
ALIPR Training Process
20
ALIPR Automatic Annotation Process
21
Tagging the physical world
22
Olalog at Olaworks
  • Manage digital data from everyday life

Not a plain tag, we use a SPOT!
  • Space (Where)
  • Person (Who)
  • Object (What)
  • Time (When)

[Diagram: olalog connects everyday life and daily digital data (SMS/MMS, MP3, e-mail, web log, ...) with the community, the Web / P2P, and the PC; the remaining labels are unreadable in the source]
23
Olalog Auto-Tagging
  • S (Space): cell- or GPS-based LBS, giving latitude
    and longitude
  • P (Person): face recognition, giving an ID or e-mail
  • O (Object): barcode, character, trademark, ID3 tag,
    RFID, giving ISBN or UPC/EAN
  • T (Time): time stamp (e.g. EXIF), giving
    YYYYMMDDHHMMSS
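One way to picture a SPOT tag is as a small record with one field per dimension. The sketch below is an assumption about the shape of such a record, not Olaworks' actual data model; the names SpotTag and exif_time_to_spot are hypothetical. The time helper relies on the EXIF date-time layout "YYYY:MM:DD HH:MM:SS".

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass
class SpotTag:
    space: Optional[Tuple[float, float]] = None  # S: (latitude, longitude)
    person: Optional[str] = None                 # P: ID or e-mail
    obj: Optional[str] = None                    # O: e.g. ISBN or UPC/EAN
    time: Optional[str] = None                   # T: YYYYMMDDHHMMSS


def exif_time_to_spot(exif_datetime: str) -> str:
    """Convert an EXIF time stamp to the compact YYYYMMDDHHMMSS form."""
    parsed = datetime.strptime(exif_datetime, "%Y:%m:%d %H:%M:%S")
    return parsed.strftime("%Y%m%d%H%M%S")


tag = SpotTag(
    space=(51.4989, -0.1786),
    person="someone@example.com",  # placeholder address
    time=exif_time_to_spot("2006:11:17 09:30:05"),
)
print(tag)
```

Leaving every field optional reflects that each dimension comes from a different sensor or recognizer, so any of them may be missing for a given item.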

24
MyLifeBits at MSR
A lifetime store of everything: articles, books,
cards, CDs, letters, memos, papers, photos,
pictures, presentations, home movies, videotaped
lectures, and voice recordings. It is beginning to
capture phone calls, IM transcripts, television,
and radio.
25
The 1 TB Life
  • 1 TB gives you 65 years of
  • 100 email messages a day (5 KB each)
  • 100 web pages a day (50 KB each)
  • 5 scanned pages a day (100 KB each)
  • 1 book every 10 days (1 MB each)
  • 10 photos per day (400 KB JPEG each)
  • 8 hours per day of sound - e.g. telephone, voice
    annotations, and meeting recordings (8 Kb/s)
  • 1 new music CD every 10 days (45 min each at
    128 Kb/s)
  • It will take you 5 years to fill up your 80 GB
    drive
  • Want video? Buy more cheap drives (1 TB/year lets
    you record 4 hours/day of 1.5 Mb/s video)
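The figures on this slide can be checked with a short calculation. The sketch assumes decimal units (1 KB = 1000 bytes, 1 Kb/s = 1000 bits per second) and 365-day years; the variable names are hypothetical.

```python
KB, MB, GB, TB = 10**3, 10**6, 10**9, 10**12  # decimal units (assumption)
DAYS_PER_YEAR = 365


def stream_bytes(seconds: int, bits_per_second: int) -> int:
    """Size in bytes of a constant-bit-rate recording."""
    return seconds * bits_per_second // 8


daily_bytes = {
    "email": 100 * 5 * KB,
    "web pages": 100 * 50 * KB,
    "scanned pages": 5 * 100 * KB,
    "books": 1 * MB // 10,                              # 1 book every 10 days
    "photos": 10 * 400 * KB,
    "sound": stream_bytes(8 * 3600, 8_000),             # 8 hours at 8 Kb/s
    "music": stream_bytes(45 * 60, 128_000) // 10,      # 1 CD every 10 days
}

per_day = sum(daily_bytes.values())
per_year = per_day * DAYS_PER_YEAR
total_65_years_tb = per_year * 65 / TB
years_to_fill_80gb = 80 * GB / per_year
video_per_year_tb = stream_bytes(4 * 3600, 1_500_000) * DAYS_PER_YEAR / TB

print(f"per day: {per_day / MB:.2f} MB")
print(f"65 years: {total_65_years_tb:.2f} TB")
print(f"80 GB lasts: {years_to_fill_80gb:.1f} years")
print(f"video per year: {video_per_year_tb:.2f} TB")
```

Under these assumptions the daily total is about 43 MB, which gives roughly 1.03 TB over 65 years, about 5.1 years to fill an 80 GB drive, and about 0.99 TB per year for the video case, all in line with the slide.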

26
MyLifeBits System
  • Capture
  • Organization and annotation: no organization,
    automatic annotation
  • Storage and retrieval: SQL
27
MyLifeBits Software
MyLifeBits store
database
28
MyLifeBits Organization
  • No organization, powerful search
  • [Diagram: Personal / Professional folders, each
    split into Archive and Current]
  • Classification and sharing
  • Full-text search is not enough; flat tagging does
    not scale
  • Example: document type has several hundred unique
    entries (article, bill, will, business card,
    report card, greeting card, birth certificate, ...)
  • Different dimensions: size, form, content,
    supplier, etc.
29
MyLifeBits Annotation
  • Manual annotation by user
  • still recommended
  • Automatic annotation by speech analysis
  • increasing audio contents
  • voice annotation, notes, telephone calls,
    meetings, conversation etc
  • using speech-to-text for annotating other types
    of contents
  • example: photos taken when greeting a new person
  • Photo annotation
  • who, what, when, where
  • who and what are difficult and need image
    analysis technology
  • Video annotation
  • same problems as audio and photo

30
Discussion