A brief tutorial on the R programming language.

To make quick changes to one or two cells, or just see the data frame in a graphical editor, use the "edit" function. Below are the results of a call to "educ<-edit(educ)".

In this editor, data can be viewed and changes can be made to cell values and column headings. If changes are made, the altered table is returned by the "edit" function, so the result needs to be saved into a variable with the "<-" operator; otherwise the changes are lost. This can be the same variable that currently holds the data frame, as above, or a different one.
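As a minimal sketch of the two ways to keep the edited table, the code below builds a tiny stand-in for "educ" (the county names and values are made up for illustration) and saves the result of "edit" under a second, hypothetical name, "educ_edited". Because "edit" opens a window, the call is only made in an interactive session.

```r
# Hypothetical stand-in for the tutorial's data frame; all values are made up.
educ <- data.frame(
  county = c("County A", "County B", "County C"),
  bachelors_or_higher = c(21.5, 19.8, 35.2),
  median_income = c(45000, 51000, 55000)
)

# edit() opens a graphical editor, so only call it in an interactive session.
if (interactive()) {
  educ_edited <- edit(educ)   # keep the original and store the edited copy separately
  # educ <- edit(educ)        # or overwrite the original, as in the text
} else {
  educ_edited <- educ         # nothing to edit when run as a script
}

# The original data frame changes only if the result is assigned back to it.
print(identical(educ, educ_edited))
```

Saving to a different variable is the safer choice when experimenting, since the untouched original is still available for comparison.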

The "hist" function draws a histogram. Below are the results of calling "hist(educ$bachelors_or_higher, 25)":

The histogram appears in a new window and shows frequencies for percentages of people with bachelor's degrees. The second argument in the call to "hist" above asks for 25 bins; R treats a single number here as a suggestion and may adjust it so that the bin boundaries fall on round values. If this argument is left out, R chooses the number of bins with Sturges' formula, which is based on the number of observations in the sample.
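To see how many bins R actually uses, the histogram can be computed without being drawn. The sketch below uses simulated percentages in place of "educ$bachelors_or_higher" (the data and the variable name "pct_bachelors" are assumptions for illustration) and compares the default with a request for 25 bins.

```r
# Simulated percentages standing in for educ$bachelors_or_higher (made-up data).
set.seed(1)
pct_bachelors <- runif(102, min = 8, max = 50)

# plot = FALSE returns the histogram object without drawing it.
h_default   <- hist(pct_bachelors, plot = FALSE)               # default: breaks = "Sturges"
h_suggested <- hist(pct_bachelors, breaks = 25, plot = FALSE)  # 25 is only a suggestion

# Number of bins R actually used in each case.
length(h_default$counts)
length(h_suggested$counts)

# Sturges' formula depends only on the number of observations.
nclass.Sturges(pct_bachelors)   # ceiling(log2(102) + 1) = 8
```

If an exact set of bins matters, a vector of break points can be passed as "breaks" instead of a single number.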

R's built-in graphics functions fall into two categories: high-level and low-level. **High-level graphics functions,** like "hist," create a new graph in the graphics window. **Low-level graphics functions** operate on the graph currently in the graphics window. High-level graphics functions share a set of arguments that control how the graph is drawn. For example, "educ$bachelors_or_higher" is not a particularly descriptive label for the chart or the x-axis, but the problem can be remedied by supplying new arguments. The title and x-axis label of any high-level graphics function can be changed with the arguments "main" and "xlab". So calling "hist(educ$bachelors_or_higher, xlab="Percentage of Population with Bachelors Degree or Higher", main="Histogram of Percentage with Bachelors Degree or Higher for Illinois Counties", 25)" results in this graph:
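Because "main" and "xlab" are shared, the same labels can be reused across high-level functions. The following sketch, run on made-up percentages rather than the real "educ" data, passes one label to both "hist" and "boxplot"; the variable "x_label" is a name introduced here for illustration.

```r
# Made-up percentages standing in for educ$bachelors_or_higher.
set.seed(2)
educ <- data.frame(bachelors_or_higher = runif(102, min = 8, max = 50))

x_label <- "Percentage of Population with Bachelors Degree or Higher"

# High-level call 1: a histogram with a title and x-axis label.
h <- hist(educ$bachelors_or_higher, breaks = 25,
          main = "Histogram of Percentage with Bachelors Degree or Higher",
          xlab = x_label)

# High-level call 2: a new graph replaces the histogram; the same arguments apply.
boxplot(educ$bachelors_or_higher, horizontal = TRUE,
        main = "Boxplot of Percentage with Bachelors Degree or Higher",
        xlab = x_label)
```

Naming the "breaks" argument, as done here, makes the call easier to read than leaving the bin count as a bare number.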

Another high-level function that can be called is "boxplot." I'll call boxplot with a formula and some labels. The call "boxplot(educ$median_income ~ educ$greater_than_thirty_percent_have_bachelors, main="Boxplot of Median Income by Greater than Thirty Percent Bachelors Degrees", xlab="Median Household Income", ylab="Greater than Thirty Percent have Bachelors?")" results in:
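A formula of the form "values ~ groups" draws one box per group. The sketch below is one way to write such a call on simulated data (the numbers are made up, and the relationship between income and education is an assumption of the simulation). It uses the "data" argument so the formula does not need "educ$" in front of each name. In the default vertical orientation "xlab" labels the axis showing the groups and "ylab" the axis showing the values, so the labels here are assigned on that basis.

```r
# Made-up data standing in for the tutorial's "educ" data frame.
set.seed(3)
pct <- runif(102, min = 8, max = 50)
educ <- data.frame(
  bachelors_or_higher = pct,
  median_income = 30000 + 600 * pct + rnorm(102, sd = 4000)
)
educ$greater_than_thirty_percent_have_bachelors <- educ$bachelors_or_higher > 30

# values ~ groups: one box of median_income for each level of the logical column.
bp <- boxplot(median_income ~ greater_than_thirty_percent_have_bachelors,
              data = educ,
              main = "Median Income by Greater than Thirty Percent Bachelors Degrees",
              xlab = "Greater than Thirty Percent have Bachelors?",
              ylab = "Median Household Income")

bp$names   # group labels, in plotting order
bp$n       # number of observations in each group
```

The value returned by "boxplot" holds the statistics behind each box, which is handy for checking group sizes before reading too much into the picture.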

For a more comprehensive list of R's high- and low-level graphics functions, see the R-tutorial, Chapter 12.

Scatterplots have a special place in statistics, and to obtain one in R we'll use the high-level "plot" function. The syntax here is "plot(educ$median_income ~ educ$bachelors_or_higher)". If desired, we can use "main," "xlab", and "ylab" to assign labels to our scatterplot.

Since the relationship in this data appears roughly linear, we can use the low-level function "abline" to draw the fitted line of a linear model on the graph. Here I've called "lm" right in the arguments to "abline" to get the model, but I could have saved the model to a variable and used that for the argument instead. (See the Get Statistics tab for a discussion of "lm".)

Below are the results of the call "abline(lm(educ$median_income~educ$bachelors_or_higher))". Remember that, since "abline" is a low-level function, it acts on the graphic that's already being displayed.
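The alternative mentioned above, saving the model first, might look like the sketch below. The data are simulated for illustration, and "income_model" is a name chosen here, not one from the original data set.

```r
# Made-up data standing in for the tutorial's "educ" data frame.
set.seed(4)
pct <- runif(102, min = 8, max = 50)
educ <- data.frame(
  bachelors_or_higher = pct,
  median_income = 30000 + 600 * pct + rnorm(102, sd = 4000)
)

# High-level call: draws a new scatterplot.
plot(median_income ~ bachelors_or_higher, data = educ)

# Fit once, keep the model, then hand it to the low-level abline().
income_model <- lm(median_income ~ bachelors_or_higher, data = educ)
abline(income_model)

coef(income_model)   # intercept and slope of the line just drawn
```

Keeping the model in a variable means it can be reused afterwards, for example with "summary(income_model)", without fitting it a second time.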

Note on this code: the title and axis names in the plot were set with the "main", "xlab", and "ylab" arguments. These were left out of the function call shown above to emphasize the action of "plot."
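For completeness, a labeled version of the call could look something like this. The label text is hypothetical, since the exact strings used for the original figure are not given, and the data are simulated.

```r
# Made-up data standing in for the tutorial's "educ" data frame.
set.seed(5)
pct <- runif(102, min = 8, max = 50)
educ <- data.frame(
  bachelors_or_higher = pct,
  median_income = 30000 + 600 * pct + rnorm(102, sd = 4000)
)

# Hypothetical title and axis labels; substitute whatever wording fits the data.
plot(median_income ~ bachelors_or_higher, data = educ,
     main = "Median Income by Percentage with Bachelors Degree or Higher",
     xlab = "Percentage with Bachelors Degree or Higher",
     ylab = "Median Household Income")

# The low-level call still acts on the plot that is currently displayed.
abline(lm(median_income ~ bachelors_or_higher, data = educ))
```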
