R is an open-source statistical analysis tool that is both a programming language and a command-based application. It is free, yet very powerful, for analyzing and visualizing data.
This guide is a tutorial on applying basic statistical functions to data in R. It covers most of the functions used in an introductory statistics course, along with enough of the underlying programming concepts to give the reader a solid foundation for exploring the software further. The guide provides a straightforward, practical introduction to using functions, manipulating data, managing input and output, and visualizing data from the command line. For a comprehensive treatment of the base R package, the R Project provides an introductory R tutorial that describes all of R's built-in functions and data types in detail. That tutorial also covers subjects such as scripting your own functions, downloading packages to extend R, and the full range of models, tests, and visualizations not covered in this guide. The last section of this guide lists online and library references that can be used to expand on this introduction. Readers of this guide should have a basic knowledge of statistics and a desire to apply computing power to data management and analysis.
If you'd like to follow along with the examples in this guide, download the CSV file below. The data were collected from three publicly available reports provided by the U.S. Census Bureau:
U.S. Census Bureau. (2010). B01003 - Total Population. All Counties within Illinois. 2006-2010 American Community Survey 5-Year Estimates. Retrieved from http://factfinder2.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_10_5YR_B01003&prodType=table
U.S. Census Bureau. (2010). B23006 - Educational Attainment by Employment Status for the Population 25 to 64 Years. Universe: Population 25 to 64 Years. All Counties in Illinois. 2006-2010 American Community Survey 5-Year Estimates. Retrieved from http://factfinder2.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_10_5YR_B23006&prodType=table
U.S. Census Bureau. (2010). S1903 - Median Income in the Past 12 Months (in 2010 Inflation-Adjusted Dollars). All Counties within Illinois. 2006-2010 American Community Survey 5-Year Estimates. Retrieved from http://factfinder2.census.gov/faces/tableservices/jsf/pages/productview.xhtml?pid=ACS_10_5YR_S1903&prodType=table
In the R console, data can be stored in various forms as variables (a fuller discussion of variables can be found in the "Getting Data from a CSV File" tab). The set of variables that can be accessed at any given point is called the "workspace." An R workspace can be saved and loaded from the File menu, and whenever you close the console, R will ask whether you want to save the workspace.
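The workspace can also be inspected, saved, and restored by typing commands instead of using the menu. The lines below are a minimal sketch using base R functions; the variable names, their values, and the temporary file name are illustrative assumptions, not part of this guide's data set.

```r
# Create two variables; both now live in the workspace.
sample_size <- 25            # made-up value for illustration
state_name <- "Illinois"

# List the names of every object currently in the workspace.
ls()

# Save the whole workspace to a file (hypothetical name, in a temp folder).
workspace_file <- file.path(tempdir(), "example_workspace.RData")
save.image(file = workspace_file)

# Remove the two variables, then restore them from the saved file.
rm(sample_size, state_name)
load(workspace_file)

# The variables are available again.
sample_size
state_name
```

Saving to a temporary folder keeps the example from overwriting a real workspace file; in practice you would choose a location inside your own project folder.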
The R console is what you will see when you first open R on your machine. Most interactions with R are managed through the console: you enter a command, R evaluates it, and you then work with the result.
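As a rough illustration of entering commands and working with their results, the short session below could be typed at the console prompt one line at a time. The variable name and the numbers are made up for the example.

```r
# Typing an expression and pressing Enter prints its value.
2 + 3

# Store some values in a variable using the assignment operator.
heights <- c(150, 160, 170)   # made-up values for illustration

# Call a built-in function on the stored data.
mean(heights)

# Keep a result in its own variable so it can be reused later.
average_height <- mean(heights)
average_height
```

Results that are not assigned to a variable are printed and then discarded, so assigning them is the usual way to carry a value forward to a later command.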