Wednesday, April 1, 2020

Using R to select your college major

Many students arrive at college undecided about what they should study. To help students like these, as well as students who wish to learn the R programming language, I created an R program that runs entirely in the cloud. First, register for a free RStudio Cloud account here, and then access the program I wrote here.

I have also created a short video on using the script, which you can find here.



To give you an idea of what to expect, see the table below. There we see that 28.6% of software developers were Computer Science majors. Only 1.2% of developers were Economics majors, but average earnings for the two majors are very similar, at $105,861 and $103,564 for Computer Science and Economics majors, respectively.
 
These are descriptive statistics, of course, and in Chapter 1 of my forthcoming book, I discuss in detail how selection bias means we should be careful about interpreting differences in means as causal effects. Still, recent research has found the causal effect of the economics major on earnings may be quite large.

The program you will find at the link above creates a table like the one shown below, for any occupation the user specifies. This analysis was inspired by and is a generalized version of John Winters' analysis of lawyers.

How to use this program.

  1. Find line 294
  2. Change 229 to a different occupation code (229 is software developers; see Section 4 for a list of all occupation codes)
  3. Find line 344
  4. Change "software_developers" to whichever occupation you chose for line 294
  5. Run line 294 and all lines in Section 3
  6. Select the CSV file you created under Files in the bottom-right pane, then choose More > Export

If you do this, you will have a CSV file that can be opened with a spreadsheet and that shows the most popular majors for the selected occupation, as well as mean and median earnings for workers in that occupation, by major.
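For readers curious about what a calculation like this involves, here is a minimal sketch in base R. It is not the program linked above: the toy data, the column names (occ, major, earnings), the helper majors_by_occupation(), and the output file name are all assumptions made up for illustration. It simply counts majors within one occupation code, computes each major's share along with mean and median earnings, and writes the result to a CSV file.

```r
# Toy data: every value below is invented for illustration only.
workers <- data.frame(
  occ      = c(229, 229, 229, 229, 229, 210, 210),
  major    = c("Computer Science", "Computer Science", "Economics",
               "Mathematics", "Computer Science", "Economics", "Law"),
  earnings = c(100000, 110000, 90000, 95000, 120000, 70000, 80000)
)

# Hypothetical helper (not from the linked program): summarise the majors
# of workers in a single occupation code.
majors_by_occupation <- function(data, occ_code, top_n = 20) {
  in_occ  <- data[data$occ == occ_code, ]
  major   <- as.character(in_occ$major)
  freq    <- table(major)                          # workers per major
  means   <- tapply(in_occ$earnings, major, mean)  # same major order as freq
  medians <- tapply(in_occ$earnings, major, median)
  out <- data.frame(
    major   = names(freq),
    freq    = as.integer(freq),
    percent = round(100 * as.integer(freq) / nrow(in_occ), 1),
    mean    = as.numeric(means),
    median  = as.numeric(medians),
    stringsAsFactors = FALSE
  )
  out <- out[order(-out$freq), ]                   # most popular majors first
  rownames(out) <- NULL
  head(out, top_n)
}

# 229 is the code the post gives for software developers.
result <- majors_by_occupation(workers, occ_code = 229)

# Assumed file name; written to a temporary folder so nothing is overwritten.
out_file <- file.path(tempdir(), "software_developers.csv")
write.csv(result, out_file, row.names = FALSE)
print(result)
```

Changing occ_code and the file name here plays the same role as steps 2 and 4 above, although the actual script may be organized quite differently.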







Table: 20 most popular majors among software developers, and average earnings by major
major                                           freq  cumul_percent       mean     median
Computer Science                               8,600           28.6  105,861.0   97,097.0
Electrical Engineering                         3,064           10.2  109,927.1  100,009.9
Computer Engineering                           2,891            9.6  108,816.7   99,242.0
Computer and Information Systems               1,503            5.0   84,546.2   81,378.4
Mathematics                                      970            3.2  109,934.9   97,097.0
Mechanical Engineering                           747            2.5  106,118.1   97,097.0
General Engineering                              721            2.4   95,933.2   89,457.1
Business Management and Administration           709            2.4   87,225.6   84,355.7
General Business                                 559            1.9   92,180.4   84,355.7
Management Information Systems and Statistics    544            1.8   92,735.3   88,364.5
Physics                                          491            1.6  112,235.7  103,544.9
Information Sciences                             430            1.4   95,000.6   87,387.3
Electrical Engineering Technology                424            1.4   87,871.5   82,520.0
Economics                                        365            1.2  103,564.0   93,213.1
Psychology                                       353            1.2   85,315.3   81,090.2
Biology                                          334            1.1   82,896.4   79,791.2
English Language and Literature                  290            1.0   89,951.6   79,393.6
Finance                                          287            1.0   98,111.1   88,302.8
Accounting                                       264            0.9   91,517.0   85,447.3
Political Science and Government                 253            0.8   91,092.6   80,423.2
