Title: Reality Mining: Sensing Complex Social Systems Nathan Eagle and Alex (Sandy) Pentland MIT Media Laboratory
1. Reality Mining: Sensing Complex Social Systems
Nathan Eagle and Alex (Sandy) Pentland
MIT Media Laboratory
- Presented by Kyle Dorman
- May 12, 2008
2. Motivation
- Current survey methods are limited
- Even surveys that use location-aware devices could not scale to large groups
- Mobile phones can be used to study individuals and organizations
- Goal: Create a predictive classifier that can perceive aspects of a user's life more accurately than a human observer
3. Bluetooth
- Bluetooth MAC address (BTID)
  - Hex number unique to the particular device
- Device name
  - Set by user
- Device type
  - Three integers that correspond to the type of device discovered
4. BlueAware
- Software application that logs BTIDs
- Records and timestamps the BTIDs it encounters in a proximity log
- Runs in the background
Fig 1. BlueAware running in the foreground on a Nokia Series 60 phone
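To make the proximity log concrete, here is a minimal Python sketch of what one timestamped entry might hold, based only on the three fields listed on the Bluetooth slide. The class names, field names, and sample addresses are hypothetical; this is not BlueAware's actual log format.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Tuple


@dataclass(frozen=True)
class Sighting:
    """One discovered Bluetooth device (field names are hypothetical)."""
    timestamp: datetime                 # when the scan saw the device
    btid: str                           # Bluetooth MAC address as hex text
    name: str                           # device name set by its user
    device_type: Tuple[int, int, int]   # the three device-type integers


class ProximityLog:
    """Append-only, in-memory list of timestamped sightings."""

    def __init__(self) -> None:
        self.entries: List[Sighting] = []

    def record_scan(self, when, found):
        """Store every (btid, name, device_type) tuple found in one scan."""
        for btid, name, device_type in found:
            self.entries.append(Sighting(when, btid.lower(), name, device_type))

    def seen_btids(self):
        """Return the distinct BTIDs in the log, in first-seen order."""
        return list(dict.fromkeys(e.btid for e in self.entries))


# Invented sample scans; the addresses and names are not from the study.
proximity_log = ProximityLog()
proximity_log.record_scan(
    datetime(2005, 1, 10, 9, 0),
    [("00:0E:6D:AA:BB:01", "desk-pc", (1, 1, 0))],
)
proximity_log.record_scan(
    datetime(2005, 1, 10, 9, 5),
    [("00:0E:6D:AA:BB:01", "desk-pc", (1, 1, 0)),
     ("00:0E:6D:AA:BB:02", "phone", (2, 3, 0))],
)
print(proximity_log.seen_btids())
```

Keeping one record per device per scan means the same BTID appears many times, which is what later makes encounter counts per device possible.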
5. Dataset
- One hundred Nokia 6600 smartphones
  - 75 students or faculty in the MIT Media Laboratory
  - 25 incoming students at the MIT Sloan business school, adjacent to the lab
- Nine-month study (500,000 hours)
6. User Modeling
- Identifiable routines
  - Daily routines, weekly patterns, yearly patterns
- Model three states: home, work, and elsewhere
  - Data obtained from Bluetooth, cell tower, and temporal information
7. Location
- Detecting location by cell tower is complicated
  - A phone can be within range of over a dozen cell towers
  - A phone can be more than a couple of miles away from a tower
- Detecting location with GPS works well outdoors
  - Indoors, the issue is line-of-sight
- Incorporated static Bluetooth device IDs
  - Such as a desktop computer
8. Location: Bluetooth
Fig 3. The number of Bluetooth encounters for Subject 9 over the month of January
- Shows the 10 most frequently detected Bluetooth devices for one subject
- Provides insight into the times the user is in his office
- Also gives insight into his relationships with other subjects
- He leaves his office at 14:00 and is then consistently near Subject 4
- Strong cutoffs at 9:00 and 17:00 show that the subject has low entropy
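The kind of summary behind Fig 3 can be sketched in a few lines of Python: rank devices by how often they were seen, then count one device's encounters per hour of the day. The function names and the (hour, device) sample pairs are invented for illustration and are not data from the study.

```python
from collections import Counter


def top_devices(sightings, n=10):
    """Return the n most frequently seen devices with their counts."""
    return Counter(device for _, device in sightings).most_common(n)


def hourly_counts(sightings, device):
    """Count sightings of one device for each hour of the day (0-23)."""
    counts = [0] * 24
    for hour, seen in sightings:
        if seen == device:
            counts[hour] += 1
    return counts


# Invented (hour, device) pairs standing in for a proximity log.
sightings = [
    (9, "desk-pc"), (10, "desk-pc"), (11, "desk-pc"),
    (14, "subject-4"), (15, "subject-4"),
    (20, "home-pc"),
]

print(top_devices(sightings, n=2))
print(hourly_counts(sightings, "subject-4"))
```

Sharp edges in the hourly counts for an office device, such as the 9:00 and 17:00 cutoffs mentioned above, are what suggest a regular schedule.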
9. Results: Low Entropy Subject
Fig 4. Subject 9's 'low entropy' daily distribution of home/work transitions and Bluetooth devices. The 'hot spot' in mid-day is when the subject is at the workplace.
10. Results: High Entropy Subject
Fig 5. Subject 4's 'high entropy' daily distribution of home/work transitions and Bluetooth devices.
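The slides do not give a formula for 'low' and 'high' entropy. One plausible reading, sketched below in Python, is the Shannon entropy of the subject's state (home, work, elsewhere) at each hour of the day, averaged over the 24 hours. The function names and the sample schedules are assumptions made for illustration, not the authors' method or data.

```python
import math
from collections import Counter


def shannon_entropy(labels):
    """Shannon entropy in bits of the empirical distribution of labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def routine_entropy(days):
    """Mean per-hour entropy; `days` is a list of equal-length label lists."""
    hours = zip(*days)  # regroup so each tuple holds one hour across all days
    per_hour = [shannon_entropy(hour) for hour in hours]
    return sum(per_hour) / len(per_hour)


# Invented schedules: one label per hour, 24 labels per day.
regular_day = ["home"] * 9 + ["work"] * 8 + ["home"] * 7
regular_subject = [regular_day] * 5          # same schedule every day

states = ["home", "work", "elsewhere"]
irregular_subject = [
    [states[(hour + day) % 3] for hour in range(24)] for day in range(3)
]                                            # state at each hour varies by day

print(routine_entropy(regular_subject))
print(routine_entropy(irregular_subject))
```

Under this reading, a subject who is in the same state at the same hour every day scores zero, while a subject whose state at a given hour is different on each day scores close to the maximum of log2(3) bits.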
11. Cell Phone Applications
Fig 7. Average application usage of the 100 subjects, by location
12. Possible Errors
- Data corruption
  - Data was stored on a flash memory card
  - Because flash memory has a finite number of read-write cycles, a new card failed after a month
  - Changed to store incremental logs in RAM and write complete logs to flash memory
- Bluetooth
  - Can penetrate some types of walls
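The fix for flash wear described above, buffering incremental logs in RAM and writing only complete logs to storage, can be sketched in Python as follows. The class name, batch size, and file layout are hypothetical; the slides do not describe how the phone software actually implemented this.

```python
import os
import tempfile


class BufferedLog:
    """Collect log lines in RAM and append them to a file in batches."""

    def __init__(self, path, flush_every=100):
        self.path = path
        self.flush_every = flush_every  # hypothetical batch size
        self.buffer = []                # lines held in RAM
        self.flush_count = 0            # number of writes to storage so far

    def append(self, line):
        """Buffer one line; write to storage only when the batch is full."""
        self.buffer.append(line)
        if len(self.buffer) >= self.flush_every:
            self.flush()

    def flush(self):
        """Write all buffered lines in a single append, then clear the buffer."""
        if not self.buffer:
            return
        with open(self.path, "a", encoding="utf-8") as f:
            f.write("\n".join(self.buffer) + "\n")
        self.buffer.clear()
        self.flush_count += 1


path = os.path.join(tempfile.mkdtemp(), "proximity.log")
buffered = BufferedLog(path, flush_every=4)
for i in range(10):
    buffered.append(f"scan {i}")
buffered.flush()  # write the remainder, for example at shutdown

print(buffered.flush_count)
```

Ten log lines reach storage in three writes instead of ten. The trade-off is that lines still in the buffer are lost if the phone loses power before the next flush.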
13. Human-Induced Errors
- Phone turned off
- Phone separated from user
- Forgotten-phone classifier
  - Staying in the same location for a period of time
  - Remaining idle
  - Classifier did identify forgotten-phone days correctly but also had false positives
- Missing data: data corruption, powered-off devices
- Positive: Users filled out surveys about their activities and the people they interact with throughout the day
  - Found correlation between survey and proximity data
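The two cues listed for the forgotten-phone classifier, staying in one location and remaining idle, can be combined into a simple rule, sketched here in Python. This is an illustrative heuristic rather than the classifier used in the study; the function names, the hourly sampling, and the 12-hour threshold are all assumptions.

```python
def longest_idle_stay(samples):
    """Length of the longest run of idle samples at a single location.

    `samples` is a sequence of (location, idle) pairs in time order.
    """
    best = run = 0
    previous = None  # location of the preceding idle sample, if any
    for location, idle in samples:
        if idle and location == previous:
            run += 1          # the idle stay at this location continues
        elif idle:
            run = 1           # a new idle stay begins
        else:
            run = 0           # the phone was used, so the stay ends
        previous = location if idle else None
        best = max(best, run)
    return best


def looks_forgotten(samples, min_run=12):
    """Flag a day whose longest idle stay reaches min_run samples."""
    return longest_idle_stay(samples) >= min_run


# Invented hourly samples for two days; not data from the study.
carried = [("home", True)] * 8 + [("work", False)] * 9 + [("home", False)] * 7
left_at_home = [("home", True)] * 24

print(looks_forgotten(carried), looks_forgotten(left_at_home))
```

A rule like this also shows where false positives can come from: a user who stays home all day without touching the phone produces the same signal as a phone left behind.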
14. Another application
- Logging and time-stamping users' activity, location, and proximity to other users
- People whom users see only in a specific context
- By knowing activity, time of day, and even location, one can calculate the probability of a user seeing a specific individual using Bayes' rule
- Predictions can inform the user of the most likely time and place to find specific colleagues
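The Bayes' rule step above can be illustrated with a small Python sketch that estimates P(seen | context) = P(context | seen) P(seen) / P(context) from logged observations. The function name, the (location, time of day) context tuples, and the counts are invented for illustration; the slides do not specify how the probabilities were estimated.

```python
def p_seen_given_context(observations, context):
    """Estimate P(seen | context) with Bayes' rule.

    `observations` is a list of (context, seen) pairs, where `seen` is True
    when the colleague was detected nearby in that context.
    """
    n = len(observations)
    seen_contexts = [c for c, seen in observations if seen]
    n_context = sum(1 for c, _ in observations if c == context)
    if not seen_contexts or n_context == 0:
        return 0.0  # no evidence for this colleague or this context

    p_seen = len(seen_contexts) / n                                        # prior
    p_context = n_context / n                                              # evidence
    p_context_given_seen = seen_contexts.count(context) / len(seen_contexts)  # likelihood
    return p_context_given_seen * p_seen / p_context


# Invented observations of one colleague; not data from the study.
work_morning = ("work", "morning")
home_evening = ("home", "evening")
observations = (
    [(work_morning, True)] * 6 + [(work_morning, False)] * 2
    + [(home_evening, True)] * 1 + [(home_evening, False)] * 11
)

for ctx in (work_morning, home_evening):
    print(ctx, round(p_seen_given_context(observations, ctx), 3))
```

Comparing the estimate across contexts gives the most likely time and place to find the colleague, which is the use the last bullet describes.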
15. Human Landmarks
Fig 11. Proximity frequency data for a friend and a workplace acquaintance of one subject.
16. Relationship Inference
Fig 10. Friendship (left) and daily proximity (right) networks share similar structure. Circles represent incoming Sloan business school students. Triangles, diamonds and squares represent senior students, incoming students, and faculty/staff/freshman at the Media Lab.
17. Organizational Rhythms
- Dynamics of behavior in an organization in response to external and internal stimuli
- Example: In October, the 75 Media Lab subjects were working towards the annual visit of sponsors
- A significant fraction spent much of the night in the lab just before the event
18. Conclusion
- Applications studied
  - Individual behavior modeling
  - Device usage
  - Relationship inference
  - Group behavior analysis
- This kind of data can be applied to many more applications
19. Questions?