This page reflects our submission for the 37 Billion Mile Challenge, awarded the prize for Best Analysis. However, we recommend you use the latest version of this page with corrections and improvements.

Driving and Land Use: An Explanatory Model

From the map below it’s clear that people drive less in urban areas. What explains this difference? Explore the map to test your assumptions, then continue below for an explanatory model.

Using the Map Interactive

We tested a variety of potential explanatory variables related to demographics, income, density, land use, street width, intersections, sidewalks, and transit availability. You can select miles of driving per day per person, CO2 emissions from cars per person, or any of several other potential explanatory variables. You can also select the scale, statewide or by region. Statewide data is shown at the ZIP code level; regional data is shown in 15-acre grid cells. Some scales with more detail, such as MAPC, may take a few moments to load. You can also read a complete description of variables and sources.

A Model Incorporating Multiple Explanatory Factors

From the exploratory mapping it’s clear that urban areas, where there is less driving per person, have a much higher concentration of people, jobs, sidewalks, roads, intersections, renters, and public transit. Is each of these factors in isolation also associated with less driving? The plots below compare each of them to car use per person, showing both the observed averages (grey) and our model predictions when all other factors are held constant (red).

In the plots in the first row, the predicted values are roughly similar to the actual ones. In other cases, the predicted line is almost horizontal (% Children, Floor Area Ratio), showing that what seemed to be a relationship with driving disappears after controlling for other factors. In the case of Building Footprint, Assessed Value, and Sidewalks, the slope of the line changes direction, meaning that these factors, which seem related to lower driving, actually predict higher driving!

Explanatory Factors

The plots show that comparing each factor one at a time to miles per day per person can be misleading, because many of them vary together. For example, frequent transit service is generally only available in higher density areas, but high density may also affect driving via congestion, parking access, and the viability of alternatives. The observed relationship between transit availability and driving may actually be the result of one of these other related variables. We have addressed this problem by estimating a multivariate model that uses all of these variables simultaneously. Multivariate regression models estimate the independent effect of each variable, controlling for all the others in the model.
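The difference between a one-at-a-time comparison and a multivariate estimate can be illustrated with a minimal sketch. The data here are synthetic and the variable names and coefficients are made up for illustration; they are not taken from the Vehicle Census or from our model. In this toy setup density reduces driving, while transit has no effect of its own but is more frequent where density is high.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Synthetic, illustrative data (NOT the Vehicle Census):
# density lowers driving; transit has no independent effect,
# but transit service is more frequent in denser places.
density = rng.normal(size=n)
transit = 0.8 * density + 0.6 * rng.normal(size=n)
miles = 25.0 - 4.0 * density + 0.0 * transit + rng.normal(size=n)


def ols(y, *columns):
    """Return least-squares coefficients, intercept first."""
    X = np.column_stack([np.ones(len(y)), *columns])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef


# Transit compared to driving on its own: picks up the density effect.
transit_alone = ols(miles, transit)[1]

# Transit with density held constant: close to its true effect of zero.
transit_controlled = ols(miles, density, transit)[2]

print(f"transit slope, alone:      {transit_alone:.2f}")
print(f"transit slope, controlled: {transit_controlled:.2f}")
```

In this sketch the slope on transit is strongly negative when it is the only variable, and shrinks toward zero once density is included, which is the same pattern as the nearly horizontal predicted lines in the plots.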

The demographic factors included were the number of children, the number of seniors, the number of households (household size), and household income. In addition to area household income, we have two other variables that are related to income: percent of housing units that are owner-occupied and total assessed value of all property in the grid cell.

We tested explanatory variables related to different aspects of the built environment that affect driving:

Proximity. People living next to more potential opportunities (jobs, shopping, parks, etc.) can reach these places without driving as far as others who are more isolated, or can reach them by other means (walking, biking, or using public transit). However, just because people can go to a nearby job or grocery store does not mean they will necessarily choose that option over a preferable one farther away. Variables tested: measures of population, employment, building footprint, and building bulk as well as an index of the number of nearby schools. Since the grid cells are all the same size, these are measures of the density of population, employment, and buildings.

Cost of Driving. The major variable costs of driving that vary over space are tolls, parking fees, and driver time. The first two only affect a small portion of trips, since roads and parking are generally free. However, traffic congestion and parking scarcity increase the time cost of driving greatly, especially in the older portions of the state that were built with narrower roads and limited off-street parking. Congestion may also be related to the size of the urban area, since in larger areas the flows of traffic may be greater. Variables Tested: the extent of the road network, the amount of parking lots and driveways, and the on-road distance to the nearest highway interchange.

Transit Accessibility. Transit requires both a concentration of residences (for economical walk access) and a concentration of employment or other attractions at the destination end of the trip (since driving egress is generally not feasible). Transit does best in locations that had already developed these patterns prior to the widespread suburbanization of people and jobs after the Second World War. Variable Tested: aggregate frequency of transit service per hour during the evening peak period within 0.25 miles of the boundary of the Census block group in which the grid cell is located.

Walk and Bike Access. Walking and biking are favored not only by proximity, but also by grid networks that minimize travel distance and provide the possibility of walking or biking along low-volume, low-speed streets. A built-out street front also makes walking safer and more pleasant (compared to walking along buildings fronted by parking lots, for example). The presence of sidewalks and bike paths may also increase walking and biking. Variables Tested: the number of intersections, which may differentiate older grid-based street networks from rural areas and suburban areas with cul-de-sac networks, and a measure of the amount of sidewalks within the grid cell.

Model Results

The dataset is large enough (almost 95,000 grid cells) that it is possible to tease out the independent effects of each variable. All were statistically significant except percent children and distance to highway exit. Among the demographic factors, higher income, homeownership, and households were associated with more driving, and a greater share of seniors was associated with less driving. Among the land use factors, more intersections, transit, schools, parking lots, and jobs were associated with less driving, but more roads, sidewalks, buildings (both footprint and FAR), and assessed value of parcels were linked to more driving. Complete model results are available.

One way of comparing the importance of the variables in the model is to look at elasticity, which is defined as the percent change in Miles per Day per Person that results from a 1% change in the independent variable. An elasticity that is larger than another one (in absolute value) means that the driving is more sensitive to changes in that variable. The elasticities for this model are shown below. Three of them stand out as significantly higher than the rest: owner-occupied housing share, population, and households. Increasing population density is strongly related to reducing driving. Increasing the share of homeowners would tend to increase driving: there is both an income effect (since homeowners tend to have higher than average income) and a spatial effect (since owned housing is most likely to be single-family housing). All of the remaining variables but two have a measurable, but small, impact on driving. The most notable one is intersection density. This association could be because areas with a grid of streets are more conducive to walking, biking, and transit, and it could also be because such areas have narrow streets and limited off-street parking, making driving slower and more expensive.
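As a rough sketch of how an elasticity can be derived from a regression coefficient, the function below assumes the dependent variable is the natural log of miles per day per person. That functional form is an assumption made for illustration; it is not stated on this page. Under that assumption, the coefficient on a logged variable is itself the elasticity, while the coefficient on an unlogged variable has to be multiplied by the value of that variable, here its sample mean. The function name and the numbers in the usage lines are hypothetical.

```python
def elasticity(coef, x_mean=None, logged=True):
    """Elasticity of driving with respect to one regressor.

    Assumes (for illustration) that the dependent variable is
    ln(miles per day per person).

    logged=True:  regressor enters as ln(x); elasticity = coef.
    logged=False: regressor enters as x; elasticity = coef * x,
                  evaluated here at the sample mean of x.
    """
    if logged:
        return coef
    if x_mean is None:
        raise ValueError("x_mean is required for a regressor not in log form")
    return coef * x_mean


# Hypothetical coefficients, not taken from our model results.
e_logged = elasticity(-0.5)                             # logged regressor
e_level = elasticity(0.002, x_mean=50.0, logged=False)  # unlogged regressor
print(e_logged, round(e_level, 3))
```

Evaluating at the sample mean is a convention: for an unlogged variable the elasticity differs from place to place, because it depends on the value of the variable itself.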

The overall results show that although many land use variables appear to be connected with the lower amount of driving we observe in urban areas, some are actually unrelated, whereas others have the opposite relationship, leading to more driving. Fewer roads, more intersections, and frequent transit do have an independent effect on reducing driving, but by far the strongest predictors of driving are population density and homeownership. This may be because population density and homeownership are the best proxies, dwarfing all others, to quantify the difference in the time cost of driving and the viability of walking, biking and public transit in urban compared to suburban areas.

Elasticities of Explanatory Variables with Respect to Miles per Day per Person

Name | Description | Elasticity*
ChildPct | Percent of Population ages 5 to 17 | 0.002
SeniorPct | Percent of Population age 65+ | -0.09
OwnPct | Owned housing units as a percent of all housing units | 0.32
l_pop10 | Total population, 2010 | -0.76
l_hh10 | Households, 2010 | 0.43
total_emp | Employment | -0.01
l_pbld_sqm | Building Footprint | 0.14
l_prow_sqm | Right of Way Area | 0.11
l_pttlasval | Total Assessed Value of All Parcels | 0.08
l_ppaved_sqm | Parking Lots and Driveways | -0.03
l_far_agg | Floor Area Ratio | 0.04
l_intsctnden | Intersections | -0.07
sidewlksqm | Sidewalk Density | 0.02
schwlkindx | Schools Within a Mile | -0.01
l_exit_dist | Highway Exit Distance | -0.001
l_HHIncBG | Median household income of block group | 0.03
SLD_D4c | Transit Frequency within 1/4 Mile | -0.01
*Calculated at the sample mean for independent variables not in log form.
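To make the elasticities concrete, here is a small sketch that turns a few of the tabulated values into an approximate percent change in miles per day per person. The function names are ours, chosen for illustration. The first function applies the definition directly (elasticity times percent change), which is only a first-order approximation; the second treats the elasticity as constant over the whole change, which is an additional assumption. Both hold every other variable in the model fixed.

```python
# Elasticities copied from the table above.
ELASTICITY = {
    "l_pop10": -0.76,       # total population
    "l_hh10": 0.43,         # households
    "OwnPct": 0.32,         # owner-occupied share
    "l_intsctnden": -0.07,  # intersections
}


def pct_change_linear(name, pct_change_x):
    """First-order approximation: elasticity * percent change in x."""
    return ELASTICITY[name] * pct_change_x


def pct_change_constant_elasticity(name, pct_change_x):
    """Percent change in driving if the elasticity stays constant."""
    ratio = 1.0 + pct_change_x / 100.0
    return (ratio ** ELASTICITY[name] - 1.0) * 100.0


for name in ELASTICITY:
    approx = pct_change_linear(name, 10.0)
    exact = pct_change_constant_elasticity(name, 10.0)
    print(f"{name:13s} +10% -> {approx:+.2f}% (linear), {exact:+.2f}% (constant elasticity)")
```

For example, a 10% increase in population within a grid cell, with the number of households and everything else unchanged, corresponds to roughly 7% less driving per person under either calculation, while a 10% increase in intersections corresponds to less than 1% less driving.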

We find that fewer roads, more intersections, and frequent transit do have an independent effect on reducing driving. But by far the strongest predictors of driving are population density and owner-occupied housing.

Credits

This visualization and analysis were created by Paul Schimek, Zia Sobhani, and Kim Ducharme.

Paul is a transportation consultant with TranSystems and was the first Bicycle Program Manager for Boston. He matched the MAPC data to Census and EPA datasets, created the regression model, and wrote the documentation.

Zia has worked in alternative transportation for years, building and racing solar cars and writing control software for hybrid vehicles. She now does freelance data visualization and blogs a bit about it. She developed the map interactive and the "small multiple" plots.

Kim previously managed the design group at WGBH Interactive and is currently Director of Design at CAST. She provided information design strategy, visual and editorial design consultation.

Data from the Vehicle Census of Massachusetts, Metropolitan Area Planning Council 2014, documented here: http://www.mapc.org/sites/default/files/VehicleCensusofMA_Documentation_v1.pdf.

The Massachusetts Vehicle Census dataset is licensed by MAPC under a Creative Commons Attribution 4.0 International Public License.

This product includes color specifications and designs developed by Cynthia Brewer (http://colorbrewer.org/).

Map tiles by Stamen Design, under CC BY 3.0. Data by OpenStreetMap, under CC BY SA.