Tuesday, March 19, 2013

Fruit and vegetable markets

College professors walk a fine line when they incorporate their research interests into their classroom teaching.  Focusing too much on our own interests risks shortchanging the students, who miss out on other important areas.  However, not incorporating our research at all makes the class feel far removed from the knowledge creation process that is a major function of universities.

My main research area is currently Urban Economics, though I do not teach it.  This semester I had a clever idea for my Industrial Organization class--let's explore the interaction between industrial and spatial organization!

An excellent data set for exploring spatial and industrial organization questions is the Zip Code Business Patterns data series produced by the Census Bureau, which contains data on the number of establishments in each industry by zip code.  By focusing on industries, it covers the industrial organization aspect of the course.  And by geocoding the zip codes, we can explore some spatial questions that are in my area of research.
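For students who prefer code to spreadsheets, here is a minimal Python sketch of one thing geocoding makes possible: computing the great-circle distance between a zip code's centroid and some reference point.  The function name and the haversine formula are illustrative assumptions, not necessarily how the distances in this post were computed, and the coordinates you pass in would come from whatever geocoding source you use.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, a common approximation


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    phi1 = math.radians(lat1)
    phi2 = math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    # Haversine formula: 'a' is the squared half-chord length between the points.
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))
```

Calling `haversine_km(centroid_lat, centroid_lon, ref_lat, ref_lon)` once per zip code gives a distance-to-downtown variable, with the reference point being whatever location you choose to call downtown.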

To illustrate the types of analysis I will have students in my class carry out, tomorrow I will show the results of my analysis of Fruit and vegetable markets in the Bay Area.  (This industry is given NAICS code 445230.)  There are a lot of different types of retail establishments that would be interesting to explore with this type of analysis, but given the recent interest in "food deserts," this seemed like a good one.

The figure above shows a scatterplot where the number of produce markets by zip code (divided by the land area of the zip code) is on the y-axis and the distance from the center of the zip code to the ferry building (my definition of downtown San Francisco) is on the x-axis.  I have fitted a line through these points and the fit is fairly decent: the R2 is 0.1389, and the linear relationship is statistically significant at the 10% level.
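For anyone who wants to reproduce this kind of fit outside a spreadsheet, the sketch below computes the slope, intercept, R2, and the t-statistic on the slope for a one-variable regression.  It is a hand-rolled ordinary least squares fit written for illustration (the function names are mine, and this is not necessarily the tool used for the figures here).  It also makes one point explicit: in a simple regression, whether a given R2 is statistically significant depends on the number of observations as well as on R2 itself.

```python
import math


def fit_line(x, y):
    """OLS fit of y = intercept + slope * x.

    Returns (slope, intercept, r_squared, t_stat), where t_stat is the
    t-statistic on the slope with len(x) - 2 degrees of freedom.
    Assumes at least three points and a fit that is not perfect.
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    residual_ss = syy * (1 - r_squared)          # unexplained variation
    se_slope = math.sqrt(residual_ss / (n - 2) / sxx)
    return slope, intercept, r_squared, slope / se_slope


def t_from_r_squared(r_squared, n):
    """Absolute t-statistic on the slope implied by R2 and sample size n."""
    return math.sqrt(r_squared * (n - 2) / (1 - r_squared))


# The post does not report how many zip codes are in each sample, so the
# n values below are placeholders chosen only to show the dependence on n.
for n in (10, 30, 100):
    print(n, round(t_from_r_squared(0.1389, n), 3))
```

Compare the resulting t-statistic against the critical value of a t distribution with n - 2 degrees of freedom to decide significance at your chosen level.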

Turning to San Jose, I completed the same analysis as above, but looked at all zip codes in Santa Clara county.  The results are shown in the figure below.

To some extent, the results are similar in that produce shops are clustered in downtown zip codes.  Thus by this standard, downtown San Jose is not a "food desert."  However, while the coefficient on distance is negative, it is tiny; the R2 is only 0.0169, and the linear relationship is not statistically significant at any conventional significance level.  Note also that most zip codes have zero produce shops.

Moreover, note that the absolute number of produce shops in all Santa Clara zip codes is much smaller than in San Francisco.  I know there are lots of reasons for this, including that Santa Clara county shoppers do most of their shopping in grocery stores, not small produce shops.  Things are obviously much different in dense San Francisco neighborhoods.  But the difference is nevertheless striking: no zip code in Santa Clara county has even one produce shop per square km, whereas zip code 94133 (North Beach/Chinatown) has ten produce shops (or 13.64 per sq km).  The second highest density of produce shops is also in Chinatown, followed by zip code 94111 (which includes the financial district and areas along the Embarcadero).

Note to students who are completing similar analyses: you should remove any zip codes that have an area of zero because these are P.O. boxes!  Also, if you create scatter plots you can label your points (e.g., show the zip code represented by each point), but this is not easy to do in Excel and so I have not done it in the figures above.  If you want to use labels, it is easy to do in Open Office Calc or another program!  Finally, some helpful files for carrying out similar analyses that I've created can be found here.
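If you are working in code rather than a spreadsheet, the zero-area cleanup and the density calculation can be sketched as follows.  The field names (`zip`, `establishments`, `area_sq_km`) and the sample records are made up for illustration and are not taken from the Census files; adapt them to however your data is laid out.

```python
def rank_by_density(records):
    """Return (zip, establishments per sq km) pairs, densest first.

    Zip codes with zero land area (P.O. boxes) are dropped before
    dividing, which also avoids a division-by-zero error.
    """
    usable = [r for r in records if r["area_sq_km"] > 0]
    densities = [(r["zip"], r["establishments"] / r["area_sq_km"]) for r in usable]
    return sorted(densities, key=lambda pair: pair[1], reverse=True)


# Made-up records, for illustration only.
sample = [
    {"zip": "A", "establishments": 2, "area_sq_km": 4.0},
    {"zip": "B", "establishments": 3, "area_sq_km": 1.5},
    {"zip": "PO_BOX", "establishments": 0, "area_sq_km": 0.0},
]

for zip_code, density in rank_by_density(sample):
    print(zip_code, density)
```

Filtering before dividing is a deliberate choice here: it keeps P.O. box zip codes out of both the scatterplot and the regression.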