
University Library, University of Illinois at Urbana-Champaign

Basic Data Analysis in Python

This guide covers how to use the Python programming language for basic data analysis.

Getting Started

There are many ways to download and begin using Python. To streamline this tutorial, we will cover the Anaconda distribution and the PyCharm integrated development environment (IDE).

Anaconda


Anaconda is a good starting point for learning Python because it comes with many data science packages pre-installed. It also lets users choose which packages to include in their Python installation. Other distributions, such as ActiveState, allow this as well, but Anaconda is more specifically focused on data science. Let's begin!

To start, download the Anaconda installer from the Anaconda website. It provides download options for Windows, Mac, and Linux. This tutorial is based on Windows.

While not strictly necessary, it is recommended to run the installer as an administrator. To do so, right-click the downloaded file and click "Run as Administrator." This helps avoid permissions issues that could otherwise arise.

Proceed through the installation process, clicking "yes" when asked if the program can make changes to your computer. Once installation finishes, open the "Anaconda Prompt" program. This is a quick, basic way to test code, but we will only be using it to make sure Python is fully updated. When it opens, you should be greeted with a black window with a line of text in the upper left.

In this prompt, we are just going to run a single command. To update Python, type this command exactly as it appears: conda update --all. This command updates every package in the current environment, which can include Python itself, to the latest compatible versions. From here, we are set to move on to the next part: installing PyCharm. There are plenty of things you can do to customize Python, such as creating your own virtual environment (essentially, a customized instance of Python with user-chosen packages). This is not necessary for this tutorial, since everything we need is included in our base Python instance. However, it would be good to test everything before we move on.

Open the Anaconda Navigator (it might take a minute and will open multiple blank black screens - don't worry!), and navigate to the "Environments" tab on the left side. Click on the "play" button symbol next to "base (root)," and click the second option, "Open with Python." This will open a familiar-looking black window. Once it is open, type in print("Hello") and press Enter. If the next line prints back just a simple Hello (without quotes), then Python is operating as expected!
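If you would like to go one step further than the print test, the short sketch below also reports the Python version and checks whether a few data science packages can be found. The package names in EXPECTED_PACKAGES (numpy, pandas, matplotlib) and the helper name check_packages are assumptions made for illustration, not part of the original tutorial; change the list to match whatever you plan to use.

```python
import importlib.util
import sys

# Assumed examples of packages bundled with the base Anaconda environment;
# edit this list to match the packages you actually need.
EXPECTED_PACKAGES = ["numpy", "pandas", "matplotlib"]


def check_packages(names):
    """Return a dict mapping each package name to True if Python can locate it."""
    return {name: importlib.util.find_spec(name) is not None for name in names}


# Same basic test as above, plus the version of the running interpreter.
print("Hello")
print("Python version:", sys.version.split()[0])

# Report each package as found or missing without actually importing it.
for name, found in check_packages(EXPECTED_PACKAGES).items():
    status = "found" if found else "MISSING"
    print(f"{name}: {status}")
```

You can paste this into the same Python window (press Enter once more after the last line so the loop runs) or save it as a file and run it later. If a package is reported as MISSING, it is probably not installed in the environment you opened.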