Title: The Future of Mobile
1 The Future of Mobile Applications
John Canny, UCB EECS
Marc Davis, UCB School of Information and Yahoo! Research Berkeley
2 The Business
- There are 6.5 billion people on earth, only about 1.2 billion of them in developed countries
- They will buy 800 million mobile phones this year: one person in eight on the planet
- That's 4x PC or TV unit sales
- Fraction of smartphones should reach 40% by 2009: the most common computer
3 What kind of computer is it?
- This year's smartphone (free with service contract)
- 150-200 MHz ARM processor
- 32 MB RAM
- 2 GB flash (not included)
- Windows-98 PC that boots quickly!
- Plus
- Camera
- AGPS (Qualcomm/Snaptrack)
- DSP cores, OpenGL GPU
- EV-DO (300 kb/s), Bluetooth
4 What's Coming
- In the past, the platform was driven by voice and messaging
- Now the high end is driven by video, gaming, location, ...
- The result is diversification of the platform, and sudden jumps in performance, e.g. Qualcomm has 4 platforms
- Value platform (voice only)
-
-
- 4. Convergence platform (MP3 player, gamer, camera, ...): several times the performance of today's high-end PC
5 The Inevitable
- In response to MIT's $100 laptop, Microsoft last month proposed the cell phone computer for developing countries
- Bollywood on demand: click here
6 Back to the future, which is
- Using Context
- Location, time, BT neighborhood,
- Community
- User History
- Harnessing Content
- Text, Images, Video Metadata
- Speech Recognition
- Computer Vision
7 What's wrong today
- Did you ever try to find a neighborhood restaurant using a mobile browser
- and find it while you were in the same neighborhood?
- In a car you might end up in the next county
- Luckily a house stopped this driver before they got into serious trouble.
8 Context-Awareness
- Context-awareness is the holy grail for next-generation mobile applications
- Location (e.g., video store) heavily shapes the user's likely actions.
- The system can present streamlined choices: "here are your top-10 video suggestions with clickable previews."
- For users this is very convenient.
- For vendors,
9 Context-Awareness and Pro-Activity
- Knowledge of user background and context provides great opportunities for pro-active services
- "It's 7pm and you're in San Francisco, would you like me to find a nearby restaurant?"
10 Context-Awareness and Pro-Activity
- Knowledge of user background and context provides great opportunities for pro-active services
- "It's 7pm and you're in San Francisco, there is a table available two blocks away at Aqua. Would you like me to book it?"
11 Context-Awareness and Pro-Activity
- Knowledge of user background and context provides great opportunities for pro-active services
- "It's 7pm and you're in San Francisco, there is a table available two blocks away at Aqua, and they have a special on Salmon in parchment for $28. Would you like me to book a table, and order the special?"
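A minimal Python sketch of how such a pro-active prompt might be assembled from context. The `Context` and `Offer` records, the 6pm to 9pm dinner window, and the function names are illustrative assumptions, not part of any system described in this talk.

```python
from dataclasses import dataclass
from datetime import time
from typing import List, Optional


@dataclass
class Context:
    """Snapshot of what the phone knows right now (hypothetical fields)."""
    city: str
    local_time: time


@dataclass
class Offer:
    """A bookable table near the user (hypothetical record)."""
    restaurant: str
    blocks_away: int
    special: str = ""


def is_dinner_time(t: time) -> bool:
    # Assumed dinner window; a real service would learn this from user history.
    return time(18, 0) <= t <= time(21, 0)


def proactive_prompt(ctx: Context, offers: List[Offer]) -> Optional[str]:
    """Return a yes/no question for the user, or None to stay silent."""
    if not is_dinner_time(ctx.local_time):
        return None
    if not offers:
        return f"You're in {ctx.city}. Would you like me to find a nearby restaurant?"
    best = min(offers, key=lambda o: o.blocks_away)  # closest table wins
    prompt = (f"There is a table available {best.blocks_away} blocks away "
              f"at {best.restaurant}")
    if best.special:
        prompt += f", and they have a special on {best.special}"
    return prompt + ". Would you like me to book it?"
```

The more the service knows (time, city, nearby availability, specials), the more specific the question becomes, and the answer can always be a simple yes or no.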
12 Context-Awareness and Recognition
- Consider now a speech-recognizing version of this application
- "It's 7pm and you're in San Francisco, there is a table available two blocks away at Aqua, and they have a special on Salmon in parchment for $28. Would you like me to book a table, and order the special?"
- User: "Yes" or "No"
13 Context-Awareness and Activity
- People's actions are part of larger wholes called activities.
- When you plan an evening out it may include
- Going for coffee
- Seeing a movie
- Eating dinner
- ... planned and coordinated by the phone
- Sharing photos on your cameraphone
- Scoring your date in real-time
- It's a social platform!
14 Activity-Based Design
- Activity-based design creates chains of services (BigTribe) or menus of related actions. More examples:
- Planning a trip: hotel, car, events
- Going back to school: housing, books etc.
- Shopping for holiday gifts
- Moving house
- Hobbies: needlepoint, monster truck racing
- Some of these services exist, but activity analysis supports automatic discovery and customization of them.
15 Sociotechnical Systems
- The problems of the internet are not purely technological
- Need to conduct sociotechnical analysis and design of large scale internet systems and applications at the intersection of media, technology, and people
- Leverage media metadata created by context-aware devices, content analysis, and communities
16 Signal-to-Symbol Problems
- Semantic Gap
- Gap between low-level signal analysis and high-level semantic descriptions
- "Vertical off-white rectangular blob on blue background" does not equal "Campanile at UC Berkeley"
17 Signal-to-Symbol Problems
- Sensory Gap
- Gap between how an object appears and what it is
- Different images of same object can appear dissimilar
- Images of different objects can appear similar
18 Computer Vision and Context
- You go out drinking with your friends
- You get drunk
- Really drunk
- You get hit over the head and pass out
- You are flown to a city in a country you've never been to with a language you don't understand and an alphabet you can't read
- You wake up face down in a gutter with a terrible hangover
- You have no idea where you are or how you got there
- This is what it's like to be most computer vision systems: they have no context
- Context is what enables us to understand what we see
19 Campanile Inspiration
20 MMM: Mobile Media Metadata Idea
- Leverage the spatio-temporal context and social community of media capture in mobile devices
- Gather all automatically available information at the point of capture (time of capture, spatial location, collocated phone users, etc.)
- Analyze contextual metadata and media to find similar media that has been captured before
- Use patterns in previously captured media/metadata to infer the content, context, and community of newly captured media
- Interact with users to augment system-supplied metadata for captured media
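The idea of inferring metadata for a new photo from earlier captures can be sketched as a nearest-neighbour vote over capture context. This is only an illustration under stated assumptions: the record fields (`lat`, `lon`, `nearby`, `label`), the similarity formula, and the value of `k` are hypothetical and are not taken from the MMM system. Time of capture is left out for brevity.

```python
import math
from collections import Counter


def similarity(a, b):
    """Higher means the two captures happened in a more similar context."""
    # Rough planar distance in km, adequate for nearby points (an assumption).
    dx = (a["lon"] - b["lon"]) * 111.0 * math.cos(math.radians(a["lat"]))
    dy = (a["lat"] - b["lat"]) * 111.0
    dist_km = math.hypot(dx, dy)
    # Phones (e.g. Bluetooth IDs) seen near both captures.
    shared = len(a["nearby"] & b["nearby"])
    return shared + 1.0 / (1.0 + dist_km)


def suggest_label(new_capture, history, k=3):
    """Guess a label for a new photo from the k most similar past captures."""
    ranked = sorted(history, key=lambda h: similarity(new_capture, h), reverse=True)
    votes = Counter(h["label"] for h in ranked[:k])
    return votes.most_common(1)[0][0] if votes else None
```

The suggested label would then be shown to the user for confirmation or correction, which matches the last step on the slide: the user augments the system-supplied metadata.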
21 MMM: Mobile Media Metadata Projects
- Mobile Media Metadata
- Davis, Canny, et al.
- UC Berkeley
22 Context-Aware Face Recognition
23 Context-Aware Face Recognition
- Face recognition alone: 43% accurate (state of the art computer vision)
- Context analysis alone: 50% accurate (face prediction from contextual data on the phone)
- Context+Content analysis: 60% accurate
Figure 1. (Top) Subjects with frontal pose, (Bottom) Same
24 Context-Aware Place Recognition
- Image analysis alone: 30% accurate
- Context analysis alone: 55% accurate
- Context+Content analysis: 67% accurate
25 MMM2: Context to Community
26 Photo Share Guesser
27 Photo Level of Interest (LOI) Browser
28 PhotoCat: Context-Aware Photo Browser
29 Technologies
- We have been developing core technologies for context and content mining for the last 5 years
- Accurate, scalable personalization (used in MMM2)
- Algorithms for integration of personal and context information, and for activity discovery
- Methods to preserve privacy while mining user location history and online behavior
- We've also worked with a company (BigTribe) through 3 funded NSF SBIRs, to migrate these ideas into products.
30 Harnessing Large, Mixed Content Sources
- Early access to an XML Content-Base (Mark Logic CIS)
- We built an efficient location and metadata server from diverse well- and poorly-structured data sources.
- Street data comes from XML Census data (Tiger/GML).
- Restaurant data is from the Open Directory, partly structured.
- Addresses are converted to LAT/LONG by the database.
- The map you see is produced entirely using XQuery (SVG).
31 Location Content-Base
- A native XML engine supports efficient tree traversal.
- The location C-B uses R-tree organization as its XML schema
- The result is that our software spatial database has the same efficiency as custom spatial databases.
- i.e. not only are data types extensible, but also the types of query that are efficiently supported.
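To illustrate what R-tree organization as an XML schema can mean, the sketch below nests elements that each carry a bounding box, so a range query can skip whole subtrees. It uses Python's standard `xml.etree.ElementTree` rather than XQuery. The element names (`node`, `place`), the attribute names, and the toy coordinates are assumptions made for this example, not the actual schema.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: every <node> stores the bounding box of its subtree.
DOC = """
<node minx="0" miny="0" maxx="10" maxy="10">
  <node minx="0" miny="0" maxx="5" maxy="5">
    <place name="Cafe A" x="1" y="1"/>
    <place name="Cafe B" x="4" y="4"/>
  </node>
  <node minx="5" miny="5" maxx="10" maxy="10">
    <place name="Diner C" x="8" y="9"/>
  </node>
</node>
""".strip()


def intersects(node, box):
    """True if the node's bounding box overlaps the query box."""
    minx, miny, maxx, maxy = box
    return not (float(node.get("maxx")) < minx or float(node.get("minx")) > maxx or
                float(node.get("maxy")) < miny or float(node.get("miny")) > maxy)


def search(node, box):
    """Return names of places inside box, pruning subtrees that cannot match."""
    if not intersects(node, box):
        return []  # the whole subtree is skipped without visiting its children
    minx, miny, maxx, maxy = box
    hits = [p.get("name") for p in node.findall("place")
            if minx <= float(p.get("x")) <= maxx
            and miny <= float(p.get("y")) <= maxy]
    for child in node.findall("node"):
        hits.extend(search(child, box))
    return hits


root = ET.fromstring(DOC)
near_origin = search(root, (0, 0, 2, 2))
```

Because the spatial index is just the shape of the document, an engine that traverses XML trees efficiently answers spatial range queries efficiently as well, which is the point the slide makes.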
32 Context-Aware Design: Glaze
- Designing new context-aware apps works best "in the wild". We are doing a participatory design experiment with 20 AGPS phones this spring.
- Users will carry the phones with them everywhere, be able to use some seed applications, and otherwise create their own micro-apps through noun-verb composition.
33 Perceptual Interfaces - Vision
- We needed continuous mouse input for map browsing, so we developed TinyMotion, a software mouse for cameraphones.
- By moving the camera against any background, real-time image motion estimation provides mouse coordinates.
- Also great for games: demo in BID lab
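The slide does not describe TinyMotion's actual algorithm. As a generic illustration of image motion estimation, the sketch below uses exhaustive block matching: it tries every small shift and keeps the one with the lowest sum of absolute differences between two consecutive frames. The function names, `max_shift`, and the cursor `gain` are assumptions.

```python
def estimate_shift(prev, curr, max_shift=2):
    """Estimate how far image content moved between two grey-level frames.

    prev and curr are equal-sized 2D lists of pixel intensities. Returns the
    (dx, dy) in pixels that minimises the sum of absolute differences.
    """
    height, width = len(prev), len(prev[0])
    best_shift, best_cost = (0, 0), float("inf")
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            cost = 0
            # Compare only interior pixels so shifted indices stay in bounds.
            for y in range(max_shift, height - max_shift):
                for x in range(max_shift, width - max_shift):
                    cost += abs(prev[y][x] - curr[y + dy][x + dx])
            if cost < best_cost:
                best_shift, best_cost = (dx, dy), cost
    return best_shift


def update_cursor(cursor, shift, gain=4):
    """Move an on-screen cursor by the estimated shift, scaled by an assumed gain."""
    return (cursor[0] + gain * shift[0], cursor[1] + gain * shift[1])
```

Running `estimate_shift` on each pair of consecutive camera frames and feeding the result to `update_cursor` gives a continuous pointer. A phone implementation would need something much cheaper than this brute-force search over full frames.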
34 Perceptual Interfaces - Vision
- Cameraphones are capable of much more. Right now, the vision algorithms available include
- Motion
- Barcodes
- OCR text (business cards etc.)
- Coming soon
- Face recognition
- Building or streetscape recognition
35 Perceptual Interfaces - Speech
- Speech recognition technology has improved steadily in the last ten years, particularly in noisy environments.
- Speech was never a good match for office environments.
- But the mobile playing field is completely different.
- Mobile users often need their eyes and hands free, and the phone will always have a voice channel for telephony.
36 Speech on Mobile Phones
- Restricted speech recognition is available on many phones.
- Large-vocabulary recognition just appeared on cell phones last year (Samsung P207). It's a huge step. It enables the next generation of mobile speech-based apps
- Message dictation
- Web search
- Address/business lookup
- Natural command forms (no need to learn them)
- Most of this technology was developed in the US by VoiceSignal Technologies.
37 Research in Mobile Speech
- We are developing a state-of-the-art (continuous acoustic model) recognizer for SmartPhones.
- The goals are
- To provide an open platform for next-generation, speech-based interfaces on mobile devices.
- To support integration of contextual knowledge in the recognizer.
- To allow efficient exploration of the higher levels of dialog-based interfaces.
38 Speech for Developing Regions
- Speech is an even more important tool in developing regions.
- Literacy is low, and iconic (GUI) interfaces can be hard to use.
- Unfortunately, IT cannot help most of these people because they lack even more basic skills: fluency in a widely-spoken language like English or Mandarin.
- This project focuses on teaching English in an ecologically appropriate way.
- Speech-based phones are ideal for this.
39 Speech for Developing Regions
- Speech (with headset) allows students to learn while working.
- It leaves their eyes and hands free, and engages their minds during tedious, manual work.
- Some game motifs
- Safari: hear a sound, say the name in English
- Karaoke in English
- Listen and summarize: BBC, cricket etc.
- Treasure hunt: leave LB clues in English
- Adventure games: dialog-driven scenarios
40 In Summary, The Future of Mobile is
- Using Context
- Harnessing Content
- Context -> Proactivity
- C/P -> Content sharing
- Perceptual Speech/Vis
- 5 Billion new users
41 Upcoming
- Special issue of ACM Queue magazine on context-aware and perceptual interfaces (summer '06?), JFC guest ed.
42 Workshop on Mobile Applications
- Planning an event on campus later this semester.
- Send mail to jfc_at_cs.berkeley.edu if interested.
43 Class Presentations on Mobile Applications
- Our User Interface class is developing mobile applications on Microsoft Smart Phones (thanks, MS!), demos in May.
- Projects include
- Craigslist/Friendster in your vicinity
- Video store wizard
- Bluetooth E-tickets for BART, parking, ...
- Diet/Nutrition assistant
- Send mail to jfc_at_cs.berkeley.edu if interested.
44 Demos and Posters Today
- You can see the projects discussed here in the BID lab (Berkeley Institute of Design) open house, 2-4pm
- TinyMotion camera mouse
- Glaze location service design
- Speech recognition for cell phones
- English-language learning with cell phones
- a dozen other projects
- The lab is in 354-360 Hearst Mining Bldg.
45 Acknowledgements
- Thanks to the sponsors of this work
- and..