Title: I am not here to persuade you about the usefulness o
1. Before We Start
- I am not here to persuade you about the usefulness or limitations of Neogeography or User Generated Content
- I am here to share my views on issues relating to the topic of spatial data quality and neogeography
- Disclaimer
- In general, my observations derive from my familiarity with mapping, navigation and local search
2. My Background
- PhD in Geography, specializing in Cartography
- Attended AutoCarto 1 in 1974 (and gave the keynote in 2008)
- Associate Professor of mapping and geography at SUNY Albany (1972-1985)
- Associate at Spad Systems
- Chief Cartographer, Chief Technologist and VP of BizDev for Rand McNally (1986-1999)
- CTO and EVP of Engineering for go2 Systems (YP over cell phones)
- Now run a consulting business focused on geospatial, especially local search, mapping and navigation applications
3. Data Quality and Neogeography
- Dr. Mike Dobson
- President
- TeleMapics LLC
- mwd_at_telemapics.com
4. Spatial Data Quality?
- Overall concern regarding the fitness of data for a particular use
- Accuracy of position
- Resolution
- Accuracy of Attribution
- Logical Consistency
- Completeness
- Including spatial coverage
- Temporal relevance
- Metadata
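As a rough illustration of the "accuracy of position" element above, positional error can be expressed as the great-circle distance between a reported coordinate and a trusted reference coordinate. This is only a sketch: the function names, the spherical-Earth assumption and the 6,371 km mean radius are assumptions introduced here, not part of the talk.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius in metres (spherical model)


def positional_error_m(reported, reference):
    """Haversine distance in metres between two (lat, lon) pairs given in degrees."""
    lat1, lon1 = map(math.radians, reported)
    lat2, lon2 = map(math.radians, reference)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = (math.sin(dlat / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def rmse_m(pairs):
    """Root-mean-square positional error over (reported, reference) pairs."""
    squared = [positional_error_m(rep, ref) ** 2 for rep, ref in pairs]
    return math.sqrt(sum(squared) / len(squared))
```

A single summary figure such as the RMSE covers only position; attribution, consistency, completeness and temporal relevance each need their own checks.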
5. Spatial Data's Emerging Popularity
- World of spatial data is exploding
- Accessibility to spatial data increasing
- Availability of spatial data increasing
- Today's online environment provides
- Easy-to-use tools for collecting spatial data
- Easy-to-use tools for analyzing spatial data
- Easy-to-use tools for presenting spatial data
6. Why Is This of Concern?
- The quality of spatial data affects the success of communicating spatial concepts
- Could this explosive growth have an influence on the quality of spatial data?
7. Why Data Quality Is Key
8. No Integrity!
9. Neogeography
- Neogeography
- new geography using non-traditional tools
- Neogeographers
- Want to communicate/share their interests in
geography and are willing to do something about it
10. NeoGeos
- What roles do neogeographers play in the process of communicating spatial data?
- Data collectors / database creators
- Data analyzers
- Data presenters
- While all three roles impact or are influenced by data quality, today I will focus on neogeographers and data collection / database creation
11. Spatial Data Quality and Neogeography
- In order to help you understand my position on data quality and neogeography, I would like to explore User Generated Content
- UGC is one of the primary means that neogeographers use to express their interest in Geography
- On this journey we will loop outside of geography and then fall back in through mapping and other uses of spatial data.
12. User Generated Content?
- Content that is produced by users of web sites and digital media
- Contrasted with traditional media producers such as broadcasters, production companies, publishing companies and map database companies
13. So What's Important About UGC?
- Equality of opportunity to publish
- Coupled with one of the most significant demographic trends in the last century
- It's about me (e.g. use of YouTube, MySpace, Facebook)
- Especially in respect to the streets, roads and trails I travel, as well as the POIs I frequent and the spatial topics of interest to me
14. Social Networking
15. How Did This Happen?
- Technology that allows you to be connected, as well as to communicate and collaborate on your own terms
- Internet
- Cellular telephony
- Development of comprehensive spatial databases
- Pushing geospatial into the mainstream
- Neogeography
16. How Did This Happen?
- Networks provide for
- Collective intelligence: the hive mentality, or perhaps the Borg
- Aggregated knowledge from decentralized sources (Wikipedia, Wikinomics)
- Low cost collaboration
17. UGC Potential Benefits
- Linus's law
- With enough eyes all bugs (spatial errors) become trivial
- Contributors exhibit
- Self selection
- Focus
- Self benefit
- Numerousness
- There should be more interested spatial data contributors than professional map editors
- Spatial distribution
- UGCers are more widely distributed than professional map editors.
18. Criticisms Of UGC
- Some error situations are too complex to be understood in real time
- Usability may be low
- May require extensive error checking
- User priorities may lead to unreliability
- Prejudice in responses
19. Lake What Road?
20. Not Enough Contributors - Data Points?
21. User Priorities - Oooops
22. Prejudice in Response?
23. Prejudice in Response
24. UGC And Spatial Databases
25. Spatial Database Creation
26. What's Being Optimized In The Previous Process?
- Spatial data quality
- Accuracy of position
- Resolution
- Accuracy of Attribution
- Logical Consistency
- Completeness
- Including spatial coverage
- Temporal relevance
- Metadata
27. How Optimized?
- Data Quality is an integral part of the process
- Initially
- Data collected according to specifications
- Bad data re-collected or placed in the update queue
- Ongoing
- Every year significant spatial changes are accommodated.
- Areas of high change are identified and updated.
- Other changes are found by systematically working research teams through the entire coverage over time
- The overall assignment is designed to maximize the time value of money, while increasing the integrity of the database.
28. Harmonization
- It is this attempt to actively harmonize all data that distinguishes database building efforts.
- Important Issues
- Who directs crowdsourced data from an editorial perspective?
- Who sets standards for crowdsourced data?
- Who quality-controls crowdsourced data?
- What external guidance exists in crowdsourced systems?
29. Three Categories of Spatial Data
- Controlled data
- OS, Navteq, TeleAtlas, INFOusa
- Hybrid (a mix of controlled and uncontrolled data)
- Google, Yahoo, MSN, TomTom
- Crowdsourced (uncontrolled)
- OSM, Flickr, etc.
30. Issue
- It is possible to manage controlled data quality to meet specific requirements
- It is possible to manage hybrid data quality to meet specific requirements
- But can you manage crowdsourced data quality to meet specific requirements on a reliable basis?
- Let's look at database compilation for some insights
31. Compilation
- Commercial
- Training in compilation
- Specialization
- Staff size limited
- Research limited
- Sweat of the brow
- But salaried sweat of the brow
- Wiki
- Self Selection
- Local experience
- Staff size potentially unlimited
- Research hours potentially unlimited
- Avocation
32. Compare and Contrast
- Commercial
- What are my coverage goals?
- What are my accuracy goals?
- How much can I spend on updating?
- What size of capable staff can I afford?
- How well can I pay them?
- How can I otherwise incent them to create the best database possible?
- Wiki
- How many people will contribute?
- How many are capable?
- Where are they located?
- Does this match areas of weak coverage?
- How long will it take to get good results over large coverages?
- How to motivate these collaborators over long periods?
33. What Are The Potential Weaknesses of WIKI?
- Common issues
- Not enough data gatherers to validate the data
- Or a method to redeploy them
- Not enough coverage to meet the need (the distribution of the UGCers)
- Or a method to redeploy them
- Lack of Standards
- Lack of Quality Control
- But all of these limitations can be accommodated
34. Getting Around Some UGC Issues
35. Are Other Types of Spatial Databases Superior?
- Even with the benefits of moolah, major navigation databases are
- Out of date
- Inaccurate
- Non-comprehensive
- Variable quality
- Too expensive to maintain
- Navteq database extension and update costs in
2007 were over 300,000,000
36. www.refnum.com/osm/gmaps/Haywards Heath
37. And That's Why UGC and Neogeographers
- Will become an integral part of building spatial databases
- Hybrid data collection systems using UGC and controlled data are where geospatial is going
- Let's look
38. Old Information Sharing
39. New Information Sharing
40. What's The New Process
41. Social Networking Tools Of Interest in Compilation
42. Spatial Data Collection
- Some UGC will be active
- User connects to an app and enters relevant spatial data for updating or extending a spatial database
- Some UGC will be passive
- Device tracks and reports (anonymously) user paths, builds database by merging path information over time
- Passive is particularly useful in building navigation databases
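One way to picture the passive case (merging anonymous path reports over time) is the sketch below. It snaps each reported point to a grid cell and keeps only cells that several independent traces passed through. The function names, the 0.001-degree cell size and the `min_traces` threshold are all hypothetical choices made for illustration, not a description of any vendor's pipeline.

```python
import math
from collections import defaultdict


def cell_of(lat, lon, cell_deg=0.001):
    """Snap a coordinate in degrees to a grid cell index (cell size is an assumption)."""
    return (math.floor(lat / cell_deg), math.floor(lon / cell_deg))


def merge_traces(traces, min_traces=2, cell_deg=0.001):
    """Return the grid cells visited by at least `min_traces` distinct traces.

    `traces` is a list of anonymous paths, each a list of (lat, lon) points.
    A cell is counted once per trace, so one device lingering in one place
    cannot confirm a cell on its own.
    """
    support = defaultdict(int)
    for trace in traces:
        visited = {cell_of(lat, lon, cell_deg) for lat, lon in trace}
        for cell in visited:
            support[cell] += 1
    return {cell for cell, count in support.items() if count >= min_traces}
```

Requiring agreement between independent traces is a simple stand-in for the quality control that the earlier slides ask about; a real system would also need to handle GPS noise, road direction and attribution.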
43. Relative Cost
44. Relative Accuracy
45. Summing Up
- Data Collection Systems
- Closed commercial compilation efforts, no UGC
- Open WIKI approaches, no proprietary data
- Hybrid: where geospatial is going
- Benefits spatial data accuracy by combining the best of both approaches.
46. Raises These Questions
- Will the winners be
- Established commercial companies that capitalize on UGC to augment their data?
- New competitors that commercialize UGC and augment these data to compete with established commercial systems?
47. PND Data Flow: A Winner
48. UGC Open Street Data Flow: No Medal
49. Commercializing UGC
50. Relative Benefits Of Types Of UGC By Device
51. Why We Need UGC and Neogeographers
52. Thanks