Complete all of the questions and knit your responses into an .html file. Send me the .html file by email (mgredman@uga.edu) by 5pm on Tuesday, May 30.

One reasonable, though not necessarily true, extension of Lijphart’s hypothesis would be that different measures of a regime’s democratic tendencies should be correlated. Your task for this lab is to examine that conjecture using the Polity data. Read in the data as before.

rm(list = ls())
dat <- read.csv('https://www.dropbox.com/s/eclwj2137icxpu7/lab2Data.csv?raw=1')

There are at least 9 variables you can use to assess the correlation between different measures of democracy (see Lab 1, Question 4). To examine these variables, you will need to recode them so that missing values are interpreted properly. I’ll walk you through this process using the xropen variable.

First, let’s examine the values that xropen takes.

sort(unique(dat$xropen))
## [1] -88 -77 -66   0   1   2   3   4
table(dat$xropen)
## 
##  -88  -77  -66    0    1    2    3    4 
##   62   33   31  484  137  151   32 3289

The large negative values are the problem. See the codebook to understand what they represent. The simplest way of handling these values is to replace them with an indicator that the information is missing. R uses the symbol NA to indicate that data are missing. Here’s what the recode process looks like for this variable:

## make a new variable, with all its values set to missing. re is short for recode
dat$xropen.re <- NA
## tell R to update that variable to match the original, but leaving out the large negative values
dat$xropen.re[dat$xropen >= 0] <- dat$xropen[dat$xropen >= 0]
## visualize the result
## note the use of the table function to create a cross tabulation of two variables
table(dat$xropen, dat$xropen.re)
##      
##          0    1    2    3    4
##   -88    0    0    0    0    0
##   -77    0    0    0    0    0
##   -66    0    0    0    0    0
##   0    484    0    0    0    0
##   1      0  137    0    0    0
##   2      0    0  151    0    0
##   3      0    0    0   32    0
##   4      0    0    0    0 3289
## let's tell R to include missing values in the table
table(dat$xropen, dat$xropen.re, useNA = 'ifany')
##      
##          0    1    2    3    4 <NA>
##   -88    0    0    0    0    0   62
##   -77    0    0    0    0    0   33
##   -66    0    0    0    0    0   31
##   0    484    0    0    0    0    0
##   1      0  137    0    0    0    0
##   2      0    0  151    0    0    0
##   3      0    0    0   32    0    0
##   4      0    0    0    0 3289    0
  1. Explain this line of code: dat$xropen.re[dat$xropen >= 0] <- dat$xropen[dat$xropen >= 0]. What does each element represent? Why is there no comma inside the square brackets? Hint: If you don’t know the answer, ask me.

  2. Explain the results of the table command. What does each cell of the table represent?
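If the recode line is hard to read, it can help to run the same pattern on a short vector where every element is visible. The sketch below uses a made-up vector x (not part of the lab data) with the same kind of negative missing-data codes, and a helper name, keep, that is only for illustration.

```r
## toy vector using the same style of missing-data codes as xropen (values are made up)
x <- c(-88, 0, 4, -66, 2, 4)

## the logical test returns one TRUE/FALSE per element of x
keep <- x >= 0
keep
## [1] FALSE  TRUE  TRUE FALSE  TRUE  TRUE

## start with all NA, then copy over only the elements where keep is TRUE
x.re <- rep(NA, length(x))
x.re[keep] <- x[keep]
x.re
## [1] NA  0  4 NA  2  4

## cross tabulate original against recoded, keeping NA as its own column
table(x, x.re, useNA = 'ifany')
```

Because x is a plain vector rather than a data frame, each set of square brackets takes a single index.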

Now let’s recode the executive constraint variable. Use the code above to generate xconst.re and a table of its values. Your table should look like this:

##      
##          1    2    3    4    5    6    7 <NA>
##   -88    0    0    0    0    0    0    0   62
##   -77    0    0    0    0    0    0    0   33
##   -66    0    0    0    0    0    0    0   31
##   1    527    0    0    0    0    0    0    0
##   2      0  435    0    0    0    0    0    0
##   3      0    0  743    0    0    0    0    0
##   4      0    0    0  133    0    0    0    0
##   5      0    0    0    0  464    0    0    0
##   6      0    0    0    0    0  388    0    0
##   7      0    0    0    0    0    0 1403    0

Now we need to determine whether these two variables move together. Let’s begin by looking at data for all of the years together (you know how to restrict the data to a single year). First, we can use the plot function, as before.

plot(dat$xconst.re, dat$xropen.re)

This plot is hard to interpret because so many of the points have the same value for both variables. Using the jitter function, we can put some space in between the points. Here’s what it looks like:

plot(jitter(dat$xconst.re, 2), jitter(dat$xropen.re, 2), pch = 20, col = gray(0.5, alpha = 0.3), xlab = 'Executive constraint', ylab = 'Openness')
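
To see what jitter is doing, and to put a number next to the picture, here is a minimal sketch on two made-up vectors, a and b, that stand in for the recoded variables (they are not the lab data). It assumes a Pearson correlation is an acceptable summary. Since these are ordinal scores, method = 'spearman' may be the better choice.

```r
## made-up stand-ins for two recoded variables; NA marks missing values
a <- c(1, 1, 3, 7, 7, NA, 5, 7)
b <- c(0, 0, 2, 4, 4, 4, NA, 4)

## jitter adds a small amount of random noise so tied points no longer overlap
set.seed(1)
a.jit <- jitter(a, 2)

## the noise is small relative to the gaps between values, and NA stays NA
max(abs(a.jit - a), na.rm = TRUE)
is.na(a.jit)

## compute the correlation on the original values, not the jittered ones,
## using only the rows where both variables are observed
r <- cor(a, b, use = 'complete.obs')
r
```

The jittered values are only for plotting. Any summary statistic should be computed from the recoded variables themselves.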