Personalizing Java based Answers for Hundreds of Millions of Users

Prepared for Yahoo! by Polypod through CMG Advisors ... Anurag Gupta, Senior Architect, Yahoo Answers & Groups, anuragg_at_yahoo-inc.com

Slides: 17

1
Personalizing Java based Answers for Hundreds of
Millions of Users

Anurag Gupta, Senior Architect, Yahoo Answers & Groups
anuragg_at_yahoo-inc.com
2
Agenda
  • Industry Gaps
  • Vision
  • Strategy
  • Use Cases
  • Architecture
  • Next Steps

3
2010: Resurgence of Q&A
2010: A year of highlights
[Timeline figure, Jan to Dec. Legend: Launch, Acquisition, Investment, Mobile play]
2011: The story continues: Quora, location-based Q&A apps (Crowd Beacon, Hipster),
Facebook Questions and Mahalo pivoting, Answers.com acquisition
. . . Yahoo! Answers is still #1 (twice the size of its nearest competitor)
4
Why this activity?
Companies entering market to address deficiencies
of Social Media
  • Meeting unmet needs
  • Improving signal to noise ratio
  • Beyond real time: creating user-generated content
    of lasting, evergreen value
  • Organising people's knowledge and opinion for
    mass consumption
  • Allowing people to connect and share based on
    common interests, locations etc.
  • Providing platforms for people to become regarded
    as experts
  • Identifying untapped monetisation opportunities
  • Mining intent and interest and information from
    participating users

5
Industry Gaps
Personal Relevance | User Reputation | Content Quality
No understanding or filtering of content by interest | Lack of understanding of quality contributors / content: poor signals | Spam management
No filtering of content by social circle or user reputation | Persona vs. real identity | No distinction between knowledge vs. conversational Q&A
Almost no ability to post location-specific questions and filter content by location | No topic-specific reputation (PeopleRank) | No memory: hard to surface previous questions around a topic
Limited action, reaction, interaction loops: opportunity to improve engagement through notifications/follows | No community tools for users to engage outside of Q&A
6
  • Yahoo Answers is the place to share opinions,
    experience and knowledge around personal interests

7
Y! Answers: Leading site, with over 2X the next
competitor
Unique Users and Reach - Comscore
Site | Unique Users, Jun-11 | M/M | Y/Y | Reach, Jun-11 | M/M | Y/Y
Reference | 745 M | -2 | 11 | | |
Wikimedia Foundation Sites | 399 M | -3 | 5 | 54 | -1 | -5
Yahoo! Answers | 245 M | -2 | 17 | 33 | 0 | 5
Baidu Answers | 109 M | 4 | 10 | 15 | 6 | -1
eHow | 82 M | -8 | 13 | 11 | -6 | 1
Answers.com Sites | 72 M | -19 | 5 | 10 | -17 | -6
8
Strengthen core and reach out
Monetization
Ecosystem
Distribution
Personalization, User Interest Graph, User Reputation
9
Personalization & Relevance
Connected Devices
Ranked content, video, ads
User clicks
Insights
Users
APIs
Social graph
Ads
Content
Yahoo
User Generated Content, tagging
Partner Data
APIs
Publisher Partners
10
Personalization & Relevance
Users
Finance
Ads
3rd party publisher
Sports
News
Search
Content Ad Server: in-memory user-content-relevance_score
Gaps drive acquisition of new relevant long-tail
content
User clicks Search terms
Ranked content ad
Collaborative Filtering, social, geo, time
Feeds
User Segments
Content- Tags
Tag
User Interest Graph
Ad Content
Advertisers
Social Graph like
Interactions: UGC, tags, Q&A
Publishers
11
Yahoo Answers Personalization Use Cases
  • Learn about new users' interests (cold-start)
  • Show relevant questions to a user who arrives via
    a search engine
  • Show relevant questions to an Answerer on Y!
    Answers or a 3rd-party site
  • Use knowledge of user interests to increase user
    engagement, page views, reach, monetization
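
One way to picture the cold-start case: a user who arrives from a search engine has no profile yet, so the search terms themselves can seed a provisional set of interests. The Java sketch below is only an illustration under that assumption. The class name ColdStartInterests, the tag vocabulary, and the matching rule (case-insensitive exact token match) are hypothetical and are not taken from the slides.

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: seed a new user's interests from the search query that brought them in. */
final class ColdStartInterests {

    /**
     * Returns tag -> weight for every query token that matches a known tag.
     * Weights are normalized to sum to 1.0; an empty map means nothing matched.
     */
    static Map<String, Double> seedFromSearch(String query, Set<String> knownTags) {
        Map<String, Double> counts = new LinkedHashMap<>();
        // Split on runs of non-word characters; matching is exact and case-insensitive (an assumption).
        for (String token : query.toLowerCase(Locale.ROOT).split("\\W+")) {
            if (knownTags.contains(token)) {
                counts.merge(token, 1.0, Double::sum);
            }
        }
        double total = counts.values().stream().mapToDouble(Double::doubleValue).sum();
        // No-op when nothing matched, so there is no division by zero.
        counts.replaceAll((tag, count) -> count / total);
        return counts;
    }
}
```

A real system would likely map query terms to tags with stemming or synonyms rather than exact matches; the point here is only that a first visit already carries a usable interest signal.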

12
Answers Relevance & Content Quality
  • Increase signal to noise ratio
  • Reward content creators with relevant audience
  • Help audience discover relevant high quality content
High quality, high relevance Q&A page
Legend: Green = Y! wide, Yellow = Answers specific
Answerers with High PeopleRank
Viewer's interest
Question Popularity
Quality of Answers
Answerability
User Interest Graph
PeopleRank of Viewer who voted useful
Best Answers Attributed To Answerer
Useful Vote
Like Vote
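
The slide names the signals that feed a high quality, high relevance page but gives no formula for combining them. As a rough sketch, assuming each signal has already been normalized to [0, 1], a weighted sum is one simple way to blend them. The class name RelevanceScore and every weight below are illustrative assumptions, not values from the presentation.

```java
/** Hypothetical sketch: blend the slide's signals into one relevance score in [0, 1]. */
final class RelevanceScore {

    // Illustrative weights only (they sum to 1.0); the slides do not give a formula.
    static final double W_VIEWER_INTEREST = 0.35;
    static final double W_PEOPLE_RANK = 0.20;
    static final double W_POPULARITY = 0.15;
    static final double W_ANSWER_QUALITY = 0.20;
    static final double W_ANSWERABILITY = 0.10;

    /** Every input is expected to be pre-normalized to [0, 1]; out-of-range values are clamped. */
    static double score(double viewerInterest, double answererPeopleRank,
                        double questionPopularity, double answerQuality,
                        double answerability) {
        return W_VIEWER_INTEREST * clamp(viewerInterest)
                + W_PEOPLE_RANK * clamp(answererPeopleRank)
                + W_POPULARITY * clamp(questionPopularity)
                + W_ANSWER_QUALITY * clamp(answerQuality)
                + W_ANSWERABILITY * clamp(answerability);
    }

    private static double clamp(double value) {
        return Math.max(0.0, Math.min(1.0, value));
    }
}
```

In practice the weights would be tuned against engagement data, and the vote-based signals (Useful Vote, Like Vote, Best Answers) would feed the answer-quality and PeopleRank inputs rather than appear as separate terms.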
13
Architecture for Online & Offline Computation
Tags
Relevance computation
Front-End
Fast path
Notification
userId, contentId, relevance_score
Middle-tier
search terms, UGC
User Profile Services
NoSQL Long Tail Cache
Oracle
User interest
Answerability
Collaborative Filtering
Feed Acquisition
PeopleRank
Quality of Answers
Thumbs-up
Content
Tags
Question Popularity
3rd party feeds
New Online serving
Answers serving
New Offline on Hadoop Grid
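
The diagram pairs a fast path with a long tail cache and passes around (userId, contentId, relevance_score) tuples, but it does not say how the tiers interact. The sketch below assumes a read-through arrangement: a small in-memory LRU map answers hot lookups, and misses fall back to a slower store, represented here by a plain function. The class name ScoreLookup, the key format, and the eviction policy are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.function.ToDoubleFunction;

/** Hypothetical sketch: in-memory LRU in front of a slower score store. Not thread-safe. */
final class ScoreLookup {
    private final ToDoubleFunction<String> slowStore;
    private final Map<String, Double> hot;
    int slowReads = 0; // exposed only so the behaviour is easy to observe

    ScoreLookup(final int capacity, ToDoubleFunction<String> slowStore) {
        this.slowStore = slowStore;
        // Access-ordered LinkedHashMap: the eldest entry is the least recently used one.
        this.hot = new LinkedHashMap<String, Double>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, Double> eldest) {
                return size() > capacity;
            }
        };
    }

    static String key(String userId, String contentId) {
        return userId + "|" + contentId;
    }

    /** Returns the relevance_score for (userId, contentId), reading through on a miss. */
    double relevance(String userId, String contentId) {
        String k = key(userId, contentId);
        Double cached = hot.get(k);
        if (cached != null) {
            return cached;
        }
        slowReads++;
        double score = slowStore.applyAsDouble(k);
        hot.put(k, score);
        return score;
    }
}
```

A production version would need concurrency control and an expiry policy so that scores recomputed offline replace stale cached values.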
14
Offline Relevance Computation
Components: User Interest Graph, PeopleRank, Relevance Computation, Answers Data on Grid
Flow (numbered arrows in the diagram):
  1, userID
  2, viewer interests
  3, viewer interests
  3b, viewer interests
  4, top answerers
  4b, popular Qs
  5, top answerers
  6, Qs answered
  7, userID-Q-relevance_score
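
The numbered labels on this slide suggest a batch flow: start from a userID, fetch that viewer's interests, find top answerers, collect the questions they answered, and emit userID-Q-relevance_score rows. The sketch below is one possible reading of that flow, run over in-memory maps instead of a Hadoop grid. The class OfflineRelevance, the AnsweredQuestion shape, and the scoring rule (interest weight times the answerer's PeopleRank, keeping the best answerer per question) are assumptions made for illustration.

```java
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of the offline pass: userID -> question -> relevance_score. */
final class OfflineRelevance {

    /** Assumed shape of one answered question: its ID, its topic tag, and who answered it. */
    static final class AnsweredQuestion {
        final String questionId;
        final String tag;
        final String answererId;

        AnsweredQuestion(String questionId, String tag, String answererId) {
            this.questionId = questionId;
            this.tag = tag;
            this.answererId = answererId;
        }
    }

    static Map<String, Map<String, Double>> compute(
            Map<String, Map<String, Double>> interestsByUser, // userID -> tag -> weight
            Map<String, Double> peopleRank,                   // answererId -> rank in [0, 1]
            List<AnsweredQuestion> answered) {
        Map<String, Map<String, Double>> result = new TreeMap<>();
        for (Map.Entry<String, Map<String, Double>> user : interestsByUser.entrySet()) {
            Map<String, Double> scores = new TreeMap<>();
            for (AnsweredQuestion q : answered) {
                double interest = user.getValue().getOrDefault(q.tag, 0.0);
                double rank = peopleRank.getOrDefault(q.answererId, 0.0);
                double score = interest * rank;
                if (score > 0.0) {
                    // Several answerers may have answered the same question: keep the best score.
                    scores.merge(q.questionId, score, Double::max);
                }
            }
            result.put(user.getKey(), scores);
        }
        return result;
    }
}
```

On a grid this nested loop would become a join keyed by tag, with the per-user rows written out for the serving tier to load.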
15
Incremental Online Relevance Computation
Components: Front End, Middle Tier, UPS (User Profile Services), PeopleRank, Relevance Computation, Answers Oracle Database
Flow (numbered arrows in the diagram):
  1, click, search, UGC
  2
  3, userID, tags
  4, viewer interests
  5, viewer interests
  5b, viewer interests
  6, top answerers
  6b, popular Qs
  7, top answerers
  8, Qs answered
  9, relevant Qs
  10, relevant Qs
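
The online path starts from a single event (a click, a search, or new UGC) and ends with relevant questions being returned, which implies that viewer interests are updated incrementally, not rebuilt from scratch. The slides do not state an update rule. The sketch below swaps in an exponential moving average as a stand-in: each event decays all existing tag weights and boosts the tags attached to the event. The class name IncrementalInterests and the smoothing factor alpha are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch: per-event update of a viewer's tag weights (exponential moving average). */
final class IncrementalInterests {
    private final Map<String, Double> weights = new HashMap<>();
    private final double alpha;

    IncrementalInterests(double alpha) {
        if (alpha <= 0.0 || alpha > 1.0) {
            throw new IllegalArgumentException("alpha must be in (0, 1]");
        }
        this.alpha = alpha;
    }

    /** Apply one event: decay every existing weight, then boost the tags seen in the event. */
    void onEvent(Set<String> eventTags) {
        weights.replaceAll((tag, weight) -> weight * (1.0 - alpha));
        for (String tag : eventTags) {
            weights.merge(tag, alpha, Double::sum);
        }
    }

    /** Current weight for a tag, or 0.0 if the tag has never been seen. */
    double weight(String tag) {
        return weights.getOrDefault(tag, 0.0);
    }
}
```

For example, with alpha = 0.5, two events tagged "java" followed by one tagged "hadoop" leave "java" at 0.375 and "hadoop" at 0.5, so recent activity dominates without erasing older interests.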
16
Next Steps
  • Move Oracle batch processing to Hadoop grid
  • Get Answers data on Hadoop grid
  • Annotation of source property for user interest
  • Detect useful vs. interesting feedback
  • User Interest Graph
  • PeopleRank
  • Tag computation
  • Bucketing infrastructure
  • Notification services