Using Value-by-Alpha Maps to Visualize CTPP/ACS Bus Commute
1. Using Value-by-Alpha Maps to
Visualize CTPP/ACS Bus Commute
Estimate Reliability
Nichole Stewart
ESRI Mid-Atlantic User Conference 2014
Transportation Mapping & Analysis Session
2. Today’s Discussion
• CTPP and ACS Data Overview
• Estimate Reliability, Sampling Error and MOE,
and Coefficient of Variation
• CTPP Flow Data and Bus Commute by Census
Tract
• Creating Value-by-Alpha Maps Using ArcMap
3. CTPP Data
• Census Transportation Planning Products Data Tabulations
(CTPP)
• American Association of State Highway and Transportation
Officials (AASHTO)-sponsored and funded by state DOTs and
MPOs
• “Designed to help transportation analysts and planners
understand where people are commuting to and from, and
how they get there.”
4. CTPP Data
• Data Organization
– Part 1: Residence Tabulations
– Part 2: Workplace Tabulations
– Part 3: Commuting Flow to Work
• Commute data with more detailed crosstabs of critical
transportation data than standard ACS
– Means of transportation
– Number of occupants per vehicle
– Travel time to work and time leaving home
– Demographic information: income, age, presence of children, minority
status
Residence
Origin
Workplace
Destination
5. ACS Data and 5 Year Estimates
• American Community Survey
– Previous CTPP data tabulations based on decennial Census long form
but that survey replaced by American Community Survey (ACS)
– Sample of HH, 1 in 40 HH surveyed monthly over 12 months,
accumulated and pooled/reported for 1, 3, and 5 year periods (not
point in time)
• 5 year estimates
– Late 2013 first CTPP data based on 5-year ACS data
– Available from state to neighborhood (census tracts, block groups and
blocks, <20k population)
– 1 in 8 households surveyed over 5 year period (as compared to about
1 in 6 that received the census long form in the 2000 census).
– Trade-off: low reliability for small areas
Why does this matter? Data users want to know the statistical
reliability and usability of estimates
6. Estimate Reliability and Sampling Error
• Sampling error/standard error(SE):
The uncertainty or variability
associated with an estimate that is
based on data gathered from a
sample of the population rather than
the full population
• 90% confidence level for all
published ACS estimates
• Margin of Error (MOE): Measure of
the precision of an estimate (+/-) at
a given level of confidence
Read: A data user can be 90 percent
certain that the sample estimate and
the population value differ by no more
than the value of the MOE
MOE =SE*1.645
7. Coefficient of Variation (CV)
• The relative amount of sampling error associated with a
sample estimate
• Ratio of the standard error of the estimate to the
estimate: CV = SE / Estimate * 100%
MOE =SE*1.645
14. Extracting Data
Disclosure Review Board Threshold rule: No
data provided for O-D pair (origin-destination
worker flow tables) if fewer than three un-
weighted records
When estimates are zero, the Census Bureau
models the MOE calculation by comparing ACS
estimates to the most recent census counts
and deriving average weights for states and
the country. At the state, county, tract, or
block group level, state-specific MOEs for zero
estimates will be the same regardless of the
base of the table
15. Challenges with Data Extraction Tool
for this Analysis
• It can calculate percentages/rates…..
• BUT we have to aggregate the workplace census tracts
for all of Baltimore City first
• It can aggregate data using Totals…….
• BUT it aggregates MOE for Zero estimates and this
misinterprets the data
• It can also suppress empty or zero values……
• BUT not across more than one variable (total and
number using bus)
• SO you must download raw values and manipulate!
16.
17. Manipulating Data to Calculate CV
• Appendix 3
• Margin of Error for Aggregated Count
Data
• Margin of Error for a Proportion
• Coefficient of Variation
WORKPLACE SumOftotal SumOftotmoesq totsqrtmoe SumOfbus SumOfbusmoesq bussqrtmoe proportion propmoe propcoeffvar
Census Tract 1001,
Baltimore city, Maryland 239 18643 136.54 45 1686 41.06 0.19 0.13 0.43
18. Value-by-Alpha Maps
• Bivariate choropleth maps are used to
visualize two measures at the same time.
• Value-by-alpha maps are a type of bivariate
choropleth map
• The intensity of color will represent the
reported estimate of bus commuters and the
transparency of the layer will represent a
measure of the reliability of the estimate, the
CV.
19. VBA in ArcMap
• Map a layer of the estimate of interest (percent
Baltimore City commuters traveling by bus) and
adjust choropleth classifications and color
scheme.
• Make copies of the estimate layer based on the
number of reliability classifications in the
analysis.
• Use definition query to select tracts that fall
within the specified reliability range and adjust
the transparency of the reliability layers so that
more reliable estimates are more visible.
Adapted from: https://andrewpwheeler.wordpress.com/2012/08/24/making-value-by-alpha-
maps-with-arcmap/
20.
21. VBA in ArcMap
• Map a layer of the estimate of interest (percent
Baltimore City commuters traveling by bus) and
adjust choropleth classifications and color
scheme.
• Make copies of the estimate layer based on the
number of reliability classifications.
• Use definition query to select tracts that fall
within the specified reliability range and adjust
the transparency of the reliability layers so that
more reliable estimates are more visible.
Adapted from: https://andrewpwheeler.wordpress.com/2012/08/24/making-value-by-alpha-
maps-with-arcmap/
22. Mapping Coefficient of Variation
• A small CV (less than 15%) indicates that the sampling error
is small relative to the estimate and the estimate is more
reliable.
• A CV greater than 40% generally
signifies an unreliable estimate.
23. VBA in ArcMap
• Map a layer of the estimate of interest (percent
Baltimore City commuters traveling by bus) and
adjust choropleth classifications and color
scheme.
• Make copies of the estimate layer based on the
number of reliability classifications in the
analysis.
• Use definition query to select tracts that fall
within the specified reliability range and adjust
the transparency of the reliability layers so that
more reliable estimates are more visible.
Adapted from: https://andrewpwheeler.wordpress.com/2012/08/24/making-value-by-alpha-
maps-with-arcmap/
24.
25.
26. Making the Legend
• Convert legend to graphics and
then ungroup to arrange
symbology
• Final legend is a bivariate
scheme showing the estimate
of bus commuter percentage
going top to bottom, and the
CV transparency dimension
running from left to right
Adapted from: https://andrewpwheeler.wordpress.com/2012/08/24/making-value-by-alpha-
maps-with-arcmap/
31. Why Do We Care?
• The Maryland Transit Administration (MTA)
recently announced its Baltimore Bus Network
Improvement Project (BNIP). The initiative is
designed to improve Baltimore City’s local bus
system and ultimately connect residents to the
region and employment opportunities.
• Value-by-alpha maps can be used to determine
the usability of estimates of Baltimore City
residents who commute to work locally and
throughout the region using the bus as their
mode of transportation.
32. NICHOLE M. STEWART
/wherejobsare
Data Analyst and Researcher
Public Policy Doctoral Student (UMBC)
nichole@wherethejobsare.com
/wherejobsare
Contact
Editor's Notes
Census Transportation Planning Products
I’m not a transportation planner!
My primary research interests are in workforce development, economic development, and housing
But I’m fascinated by the newly available public data capturing job flows from home to work
Last year presented LEHD and the OTM application
I recently became aware of CTPP
Data sets different, wanted to explore other dimensions of job flow including mode of transportation
Previous CTPP
Will get into why that’s important
But ultimately the data set is …
How?
Who-information about different people
The reason so valuable….What
Data are available for tracts but not all data
Also, crosstabs won’t be available for the three parts
Website
So now we’ll go back to why it’s important that CTPP data is based on ACS and why we care about estimate reliability
ACS based on a sample of households
5 year estimates based on a larger sample that enables data to be reported at small geographies
What do we mean when we say reliability
We expect there to be variability in a sample estimate since we’re taking the sample from a group of observations that make up a population. The standard error is a statistic that quantifies that variability.
We also expect the mean value of the population will fall within a certain range of values, or distance, or a number of standard deviations from the sample estimate. How far, how many standard deviations? This depends on the precision we want or expect for an estimate, which we determine by a confidence interval. The 90% confidence level is used for all published ACS estimates.
The number of standard deviations associated with that confidence interval is 1.645. The MOE we get from ACS and CTPP comes from this equation.
What can we say about the estimate??
Remember, all other forms of error are called nonsampling error
Now we’ve downloaded our data and we have an estimate and a margin of error.
Most of us understand in theory what the MOE is, but probably less so understand what to do with it!
Another measure of sampling error, the CV
It’s a percentage!
But we need to do a few calculations…remember
This is the equation….
So far, our output will be tract to tract for residences and workplaces
As an example it will give you the number of residents in census tract 101 in Baltimore that works in and catches the bus to census tract 200 in Baltimore County
As you can imagine that’s a lot of information and that’s not really what we want
We want to know the total of Baltimore city residents who work and commute to that tract
This is a lot of data! Site has some great resources for you to figure out what’s available at various geographies and for different tabulations.
As an example there is information about residents in an area that you might not be able to get for the flow data
Missing
0s
Many rows, can suppress
Can download
This will be the cut off points of the CV that you want to show with different transparencies
The general idea is that
Census ACS Compass products suggest that users should be cautious about using an estimate if the CV is greater than 15%
Now we’ll make some changes to these layers using some ArcMap functions that we are familiar with
Ultimately is up to the user to access the quality of each ACS estimate and decide whether a particular estimate is suitable to his/her needs
.