Thursday, September 19, 2013

Data sources for students

In this post I provide a list of web sites that distribute data that students can use in their research projects.  I plan to update this post regularly.  For now I will list a few sources with interesting data that are easy to access.  Links are given to all sources, and I also offer some brief descriptions. 

1.)  International Data
--------------------------------------
World Development Indicators
--------------------------------------
This contains data on education, the environment, the economy, infrastructure, and many other topics, for all countries in the world and for many years.  An easy-to-use database is available here:

http://databank.worldbank.org/data/views/variableSelection/selectvariables.aspx?source=world-development-indicators

This data is distributed by the World Bank.  The World Bank also distributes a wealth of other data.  You can explore their other offerings here: http://data.worldbank.org/
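Data downloaded from a database like this one often arrives in "wide" form, with one column per year, while most statistical work is easier with one row per country and year.  Below is a minimal Python sketch of that reshaping step.  The column names, the ".." marker for missing values, and the sample rows are all assumptions made for illustration (the countries are fictional and the numbers are made up), so check them against the file you actually download.

```python
import csv
import io

# Hypothetical sample that mimics a wide-format CSV export (one column per year).
# The column names and the ".." missing-value marker are assumptions, and the
# numbers are made up; they are NOT real World Development Indicators values.
SAMPLE_CSV = """Country Name,Country Code,Series Name,2010,2011,2012
Exampleland,EXL,Illustrative indicator,1.0,..,3.0
Samplestan,SMP,Illustrative indicator,10.5,11.5,..
"""

ID_COLUMNS = ("Country Name", "Country Code", "Series Name")


def wide_to_long(csv_file, id_columns=ID_COLUMNS, missing=".."):
    """Return one dict per country-year, skipping missing observations."""
    rows = []
    for record in csv.DictReader(csv_file):
        for column, value in record.items():
            # Skip identifier columns and anything that is not a year column.
            if column is None or column in id_columns:
                continue
            if not column[:4].isdigit():
                continue
            value = (value or "").strip()
            if value == "" or value == missing:
                continue
            row = {name: record[name] for name in id_columns}
            row["Year"] = int(column[:4])
            row["Value"] = float(value)
            rows.append(row)
    return rows


long_rows = wide_to_long(io.StringIO(SAMPLE_CSV))
for row in long_rows:
    print(row["Country Code"], row["Year"], row["Value"])
```

To use this on a real download, replace io.StringIO(SAMPLE_CSV) with an opened file, for example open("my_download.csv", newline="") where the file name is whatever you saved the data as, and adjust ID_COLUMNS to match the header row of that file.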

2.)  U.S. Data
---------------------------------
American Fact Finder
---------------------------------
Using this system can be a bit tricky, but once you figure it out you will have access to a wide variety of data.

https://factfinder.census.gov


I suggest starting at the "Download Center".  I usually then select "I know the dataset or table(s) that I want to download."  In my research I have often used the ACS (the American Community Survey), which provides data at various levels, including the state, county, and city levels, the neighborhood level (i.e. tracts and block groups), and even school and congressional districts.  I have also used the Business Patterns data, which provide data on industry (based on the North American Industry Classification System), and the Economic Census.  You can access all of these and more by selecting the one you want under "Program".  You then have to select geographies and finally variables.  Sometimes the Fact Finder does not work when you are trying to download large amounts of data.  It also does not contain much historical data.  However, if you are able to create an account, you can access a lot of historical Census data through http://nhgis.org, and their download tool is very efficient at downloading large data files.

To use the Fact Finder, you select the program you want (say, 2008 ACS 1-year estimates), then the geography you want (say, counties), and finally the tables (or variables) you want.  For tables I suggest DP02, "Selected social characteristics", DP03, "Selected economic characteristics", or another table that contains a large number of variables.  You can also browse the complete set of tables or do a keyword search.

---------------------------------
County and City Databook
---------------------------------
Produced by the U.S. Census Bureau, this contains general demographic and socioeconomic variables for cities and counties, as well as some more interesting variables (including crime rates and data on manufacturing and other employment).  It is very easy to use, as the data are distributed in XLS files.  The most current version can be found here:

http://www.census.gov/statab/www/ccdb.html

Click the links under "Selected features from the 14th edition of the County and City Data Book."  For earlier versions of the County and City Databook, see: http://www2.lib.virginia.edu/ccdb/

*** Update (3/3/2017):  Unfortunately these files are no longer hosted by the Census, but luckily you can still access them through the Internet Archive.  Click here to see archived versions of these files.  Also, see here for a blog post I wrote that uses these data.

-----------------------------------
State of the Cities Database
-----------------------------------
From the Department of Housing and Urban Development.

http://www.huduser.org/portal/datasets/socds.html

This contains data on demographic and economic characteristics, employment rates, city extracts from County Business Patterns (establishments, average pay), crime, building permits, HMDA, and urban public finance.


3.)  California Data
-----------------------------------
Rand California Statistics
-----------------------------------
This database contains a variety of economic and other data relating to California:

http://ca.rand.org/stats/

This database is available by subscription only.  However, San Jose State University has a subscription to it, so SJSU students can access it for free through http://library.sjsu.edu/.  Look for it under "Articles and Databases" (click here).

4.)  Miscellaneous
----------------------------------
ICPSR
----------------------------------
The Inter-university Consortium for Political and Social Research is a unique web site that allows social scientists from a variety of disciplines to deposit data for others to use.  If you browse their collection, you will find an amazing amount of data on topics you're interested in as well as topics you've never thought about before. 

www.icpsr.umich.edu

Definitely worth checking out!  Like the Rand California Statistics database, this is available through the SJSU library.  Click here to find the link.

----------------------------
NBER
----------------------------
The National Bureau of Economic Research hosts the following data page:

http://www.nber.org/data/

I've never used any of these data, but perhaps you will find something interesting there.  NBER's working paper series is also a great place to go if you want to see frontier examples of economic research.  Access it through the SJSU library here.

----------------------------
The AER
----------------------------
The American Economic Review, arguably the top economics journal, has recently adopted a "Data Availability Policy" which states, for papers published in their journal, "the raw data will normally be archived on the AER Web site."  What this means for you is that you can access a wide variety of data sets that leading economists have assembled for their studies.  You can browse the table of contents of the journal issues here:

http://www.aeaweb.org/aer/issues.php

If you click an issue, then under the article titles you will see links to "Download Data Set".  The May issue always contains a large number of short articles, and I recommend looking through these May issues first.


----------------------------
Public Policy Institute of California Surveys
----------------------------
Click the link below to visit their Data Depot:

http://www.ppic.org/main/datadepot.asp

Scroll down to view PPIC Statewide Survey Data.  You are able to access the raw data from these surveys (in SPSS format) and also the corresponding codebooks.

---------------------------------
My Previous Blog Posts
---------------------------------
In addition to this post, I have pointed readers of my blog to interesting data sources in several other posts.  Here I will link to some of the best sources of data that I have linked to in previous posts.  You may have to look carefully, but there are links to good data sources in my previous posts that you can find here, here and here.