R is an open-source statistical analysis tool that is both a programming language and a command-based application. It is free, yet powerful, software for analyzing and visualizing data.
This guide is a tutorial on using basic statistical functions on data in R. It covers most of the functions used in an introductory statistics course, along with enough of the underlying programming concepts to give the reader a solid foundation for continuing to explore the software. The guide provides a straightforward, practical introduction to using functions, manipulating data, managing input and output, and visualizing data from the command line. For a comprehensive treatment of the base R package, the R Project provides an introductory R tutorial that describes R's built-in functions and data types in detail and covers subjects of interest such as writing your own functions, downloading packages to extend R, and the full range of models, tests, and visualizations not covered in this guide. The last section of this guide lists online and library references that expand on this introduction. Readers should have a basic knowledge of statistics and a desire to apply computing power to data management and analysis.
Throughout this guide, code will be completely enclosed in double quotation marks ("").
If you'd like to follow along with the examples in this guide, download the CSV file above. The data were collected from three publicly available reports provided by the U.S. Census Bureau:
In the R console, data can be saved in various forms as variables (a fuller discussion of variables can be found in the Getting Data from a CSV File tab). The set of variables that can be accessed at any point is called the "workspace." An R workspace can be saved and loaded from the File menu, and whenever you close the console, R asks whether you want to save the workspace.
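The workspace can also be inspected, saved, and restored with commands rather than the menu. The following is a minimal sketch, assuming a fresh session; the variable names and the file name are illustrative and are not part of this guide's dataset.

```r
# Create two variables in the workspace (illustrative names and values).
ages <- c(23, 35, 41)
label <- "sample"

# List the names of everything currently in the workspace.
ls()

# Save the whole workspace to a file in a temporary folder.
workspace_file <- file.path(tempdir(), "example_workspace.RData")
save.image(file = workspace_file)

# Remove the two variables, then restore them from the saved file.
rm(ages, label)
load(workspace_file)

# The variables are available again.
ls()
```

In everyday use you would likely save to a file in your own project folder rather than a temporary one.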
The R Console is what you will see when you first open R on your machine. Most interactions with R are managed through the Console by entering commands and manipulating the results.
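As a rough illustration of a first Console session, the sketch below enters a few commands one at a time. The variable name and values are made up for this example.

```r
# Type an expression and press Enter; R prints the result.
2 + 3

# Store values in a variable with the assignment operator `<-`.
heights <- c(150, 160, 170)

# Call a built-in function on the variable.
mean(heights)

# Typing a variable's name prints its contents.
heights
```

Each command is evaluated as soon as it is entered, so results can be checked before moving on to the next step.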
Except where otherwise indicated, original content in this guide is licensed under a Creative Commons Attribution (CC BY) 4.0 license. You are free to share, adopt, or adapt the materials. We encourage broad adoption of these materials for teaching and other professional development purposes, and invite you to customize them for your own needs.