Title: A Comparison of Geocoding Methodologies for Transportation Planning Applications
- Jennifer Indech Nelson
- Dr. Randall Guensler
- Dr. Hainan Li
- Georgia Institute of Technology
May 9th, 2007
2. Agenda
- Purpose
- Background
- Process
- Acquisition of data
- QA/QC
- Final data set
- Analysis
- Positional Accuracy
- Polygon Assignment
- Discussion
- Assess the accuracy of various geocoding methods
to provide insight on field data collection,
calibration of travel demand model inputs, and
automation of travel behavior analysis
3. Geocoding and How It Is Used in Transportation
Planning
- Geocoding: generation of coordinates within a
spatial geographic framework, where single points
serve as proxies for places
- Used to
- Prepare TAZ data from travel diary studies for
Travel Demand Model development
- Better represent spatial travel patterns
- Verify 4-step model components
- Provide primary input to next-generation
behavior-based micro-simulation Travel Demand
Models
4. Methods of Obtaining Geocoded Coordinate Data
- GPS field surveys (active)
- Aerial image processing
- Address matching
- Road network address interpolation
- GPS tracking (passive)
(Listed in order of increasing automation)
5. Geocoding: Address Matching vs. Interpolation
Linear address interpolation
- Check address existence / integrity from a list
that includes other attributes
- Estimate position from spatial reference (network
link)
- Assign coordinates
[Figure: address interpolation compared with address matching]
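The interpolation steps on this slide can be sketched in a few lines. This is a minimal illustration, not the procedure used in the study: the function name `interpolate_address`, the straight two-point street segment, the planar coordinates, and the optional perpendicular `offset` are all assumptions made for the example.

```python
import math

def interpolate_address(house_number, range_start, range_end,
                        start_xy, end_xy, offset=0.0):
    """Estimate a point for house_number along a straight street segment.

    range_start / range_end: address range stored on the segment (hypothetical).
    start_xy / end_xy: planar coordinates of the segment endpoints.
    offset: perpendicular setback from the centerline, positive to the left
            of the direction of travel from start_xy to end_xy.
    """
    lo, hi = sorted((range_start, range_end))
    if not lo <= house_number <= hi:
        raise ValueError("house number outside the segment's address range")

    # Fraction of the way along the segment, measured from range_start.
    if range_end == range_start:
        t = 0.5
    else:
        t = (house_number - range_start) / (range_end - range_start)

    dx = end_xy[0] - start_xy[0]
    dy = end_xy[1] - start_xy[1]
    x = start_xy[0] + t * dx
    y = start_xy[1] + t * dy

    length = math.hypot(dx, dy)
    if offset and length > 0:
        # Unit normal pointing to the left of the start-to-end direction.
        x += offset * (-dy / length)
        y += offset * (dx / length)
    return (x, y)
```

The estimate is only as good as the address range on the link: if the stored range is wider than the numbers actually in use, every interpolated point on that segment shifts along the street.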
6. GPS and GIS Data Acquisition in Transportation:
Commute Atlanta
- Commute Atlanta study
- GPS-instrumented vehicle tracking
- 3 years, second-by-second
- 487 vehicles, 268 households
- 1.8 million trips
7. Data for Comparative Analysis
- Two days of parallel data in March 2004 from 137
households
- Travel diary self-reported locations
- GPS-recorded trip files
- Parcel-level geographic reference
- GIS shapefiles generated by MPO and individual
counties (Fulton and Gwinnett Counties)
8. Example of GPS Trip Ends
All GPS Trip-Ends in 13-County Region during
travel diary survey period
9. Final Data Format
- Each location record has three associated
coordinates
- GPS trip-end point
- Parcel centroid
- Interpolated location (street network)
- Characteristics
- Unique ID
- Area
- Land use
- TAZ
[Figure: example location showing the parcel centroid, the geocode with offset, and the GPS point]
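One way to hold a record of this shape is sketched below. The class name `LocationRecord`, the field names, the use of acres for area, and the string types for land use and TAZ are assumptions for illustration; the slides do not specify the actual schema.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Point = Tuple[float, float]  # (x, y) in a shared planar coordinate system


@dataclass
class LocationRecord:
    unique_id: str
    gps_trip_end: Optional[Point]     # GPS trip-end point
    parcel_centroid: Optional[Point]  # centroid of the matched parcel
    interpolated: Optional[Point]     # street-network interpolated location
    area_acres: float                 # parcel area (unit is an assumption)
    land_use: str
    taz: str                          # traffic analysis zone identifier

    def is_complete(self) -> bool:
        """True when all three coordinate sources are present."""
        sources = (self.gps_trip_end, self.parcel_centroid, self.interpolated)
        return all(p is not None for p in sources)
```

Filtering on `is_complete()` corresponds to keeping only the 3-source records used in the positional accuracy analysis.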
10. Data Quality Issues: GPS/Diaries
- Travel diaries versus GPS trip-ends
- Under-reporting of visited locations in travel
diaries
- GPS wander
- Dependent on weather, satellite, and hardware
conditions
- Primarily occurs at < 5 mph
- Data point is last GPS coordinate at engine-off
11. Data Quality Issues: Reference
- GIS parcel boundaries and centroids
- Not all parcels have existing or correct address
data
- Topology errors may lead to inaccurate centroid
calculation
- Road network geocoding
- Uses national databases generated by NavTeq and
TeleAtlas, which may not have current/correct address
ranges
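To see why topology errors matter for centroids, consider the standard area-weighted (shoelace) centroid of a simple polygon, sketched here. The function name `polygon_centroid` and the vertex-list input are assumptions for the example; the slides do not say which GIS routine computed the study's centroids.

```python
def polygon_centroid(vertices):
    """Area-weighted centroid of a simple polygon given as [(x, y), ...].

    The ring is closed implicitly (last vertex connects back to the first).
    """
    n = len(vertices)
    if n < 3:
        raise ValueError("a polygon needs at least three vertices")

    twice_area = 0.0
    cx = 0.0
    cy = 0.0
    for i in range(n):
        x0, y0 = vertices[i]
        x1, y1 = vertices[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross

    if twice_area == 0:
        # Self-intersecting or collapsed rings can cancel to zero area.
        raise ValueError("degenerate polygon (zero area); check topology")

    # Centroid = sum / (6 * area), and 6 * area == 3 * twice_area.
    return (cx / (3.0 * twice_area), cy / (3.0 * twice_area))
```

A clean 4 x 2 rectangle gives its geometric center, while a self-intersecting "bow-tie" ring with the same corner points cancels to zero signed area and is rejected. Less extreme topology errors do not fail outright; they shift the centroid without warning.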
12. The Incredible Shrinking Data Set
Metro Atlanta (13 counties)
Two-county subset
- Fulton: 195 locations, 119 unique
- Gwinnett: 129 locations, 75 unique
13. Analysis: Positional Accuracy
- Complete (3-source) data only: 324 points (194
unique)
- 195 Fulton, 129 Gwinnett
- Compare
- GPS trip-end data with parcel centroids
- Interpolated addresses with parcel centroids
- GPS trip-end data with interpolated addresses
- Further comparison according to
- Land use
- Parcel size (e.g. < 5 acres, > 5 acres)
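The three pairwise comparisons listed above reduce to straight-line distances between the coordinate sources of each record. The sketch below assumes planar coordinates in a common projected system and uses hypothetical dictionary keys (`gps`, `centroid`, `interp`); it illustrates the comparison only and does not reproduce the study's analysis.

```python
import math
from statistics import mean


def distance(a, b):
    """Straight-line distance between two planar points."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def compare_sources(record):
    """Pairwise offsets between the three coordinate sources of one record."""
    return {
        "gps_vs_centroid": distance(record["gps"], record["centroid"]),
        "interp_vs_centroid": distance(record["interp"], record["centroid"]),
        "gps_vs_interp": distance(record["gps"], record["interp"]),
    }


def mean_offsets(records):
    """Mean of each pairwise offset over a group of complete records."""
    if not records:
        raise ValueError("no records to summarize")
    rows = [compare_sources(r) for r in records]
    return {key: mean(row[key] for row in rows) for key in rows[0]}
```

Grouping the records by land use or by a parcel-size threshold before calling `mean_offsets` gives the further breakdowns named on the slide.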
14. Positional Accuracy: GPS vs. Geocode
- GPS significantly more accurate than geocoding
- Combined: 273 vs. 402
- (Single-family) residential locations more
accurate than non-residential parcels
- Smaller parcels more likely than larger parcels
to have better positional accuracy for all
methods
15. Positional Accuracy: Land Use / Size
- GPS-to-centroid accuracy has some correlation to
parcel size, but land use and typical parking
location are probably more important
- Within particular land uses, inverse relationship
of accuracy to area
16. Results: Polygon Assignment, Parcel and Blockgroup
- Match rates to potential TDM inputs
- Parcels, Census Blockgroup
17. Results: Polygon Assignment, Land Use and TAZ
- Match rates to potential TDM inputs
- Land Use, Traffic analysis zone (TAZ)
18. Polygon Assignment Rate: TAZ
- Non-residential locations especially prone to
mis-assignment
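Polygon assignment and the resulting match rate can be illustrated with a plain ray-casting point-in-polygon test. The function names, the zone dictionary, and the idea of scoring against a list of reference zone IDs are assumptions for this sketch; the slides do not describe the GIS tooling actually used.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside polygon [(x, y), ...]."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x0, y0 = polygon[i]
        x1, y1 = polygon[(i + 1) % n]
        # Does a horizontal ray cast to the right of the point cross this edge?
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def assign_zone(point, zones):
    """Return the ID of the first zone containing point, or None."""
    for zone_id, polygon in zones.items():
        if point_in_polygon(point, polygon):
            return zone_id
    return None


def match_rate(points, reference_zone_ids, zones):
    """Share of points assigned to the same zone as the reference."""
    if not points:
        raise ValueError("no points to score")
    hits = sum(
        1
        for p, ref in zip(points, reference_zone_ids)
        if assign_zone(p, zones) == ref
    )
    return hits / len(points)
```

A point near a zone boundary can land in the neighboring polygon even when its positional error is small, so the match rate depends on polygon size and shape as well as on coordinate accuracy.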
19. Discussion
- Reference Data
- Must be accurate and standardized!
- Positional Accuracy
- Method of creating geocoded data depends on
degree of accuracy needed
- Most to least accurate (< 10 ft to > 1,000 ft away):
address matching, GPS, interpolation
- Off-site parking creates issues for passive
determination of trip purpose from GPS data
20. Discussion
- Polygon Assignment
- TAZ hit rate lower than expected, particularly
for non-residential locations
- Degree of zoning homogeneity and size of parcels
are directly proportional to chance of matching
correct land use for TDM verification
21. Next Steps
- Assess method of GPS tracking and data gathering
- Quantify error associated with trip-ends
- Determine how to evaluate large parcels /
campuses
- Internal destinations, land uses
22. Any Questions?
Please use the Microphone
23. Appendix: Sources and Additional Figures
- All figures created by Commute Atlanta
researchers, except spatial interpolation
picture (slide 5, from "Three Standard Geocoding
Methods", Dramowicz, 2004) and Google Earth
imagery (slides 10 and 21)
- Right: GPS position off due to urban canyon (tall
buildings in Midtown Atlanta)