Intro to Data Analysis in Python

Tentative Schedule

Lesson 0: Intro to Jupyter

What is Jupyter?


Jupyter Notebooks

Example Notebook

Classic Jupyter Notebook vs. JupyterLab

Getting Started

Let's open JupyterLab and create our first Jupyter notebook! Two options:

What if I don’t like where my current working directory is?
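One way to check and change the working directory from inside a notebook is Python's standard `os` module. The sketch below is only an illustration: it uses the parent of the current folder as a placeholder destination, so substitute the path to your own project folder.

```python
import os
from pathlib import Path

# Show where the notebook is currently running.
original_dir = os.getcwd()
print("Current working directory:", original_dir)

# Change to another folder. The parent folder is used here only as a
# placeholder; replace it with the path to your own project folder.
target_dir = Path(original_dir).parent
os.chdir(target_dir)
print("Now working in:", os.getcwd())

# Change back so the rest of the session is unaffected.
os.chdir(original_dir)
```

Relative file paths used later in the notebook are resolved against whatever `os.getcwd()` reports at that point.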

Figure: working directory (illustration by Allison Horst)

Organizing Projects

It's good practice to keep all the files for a project in one folder, and use sub-folders to keep things organized.
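As a rough sketch of this layout, the helper below creates a project folder with a few sub-folders using the standard `pathlib` module. The function name and the sub-folder names (`data`, `notebooks`, `figures`) are assumptions chosen for illustration, not a required convention.

```python
from pathlib import Path


def create_project_folders(project_root, subfolders=("data", "notebooks", "figures")):
    """Create a project folder with one sub-folder per entry in `subfolders`."""
    root = Path(project_root)
    for name in subfolders:
        # parents=True creates the root too; exist_ok=True makes reruns safe.
        (root / name).mkdir(parents=True, exist_ok=True)
    # Return the sub-folder names that now exist, sorted for a stable result.
    return sorted(p.name for p in root.iterdir() if p.is_dir())
```

Calling `create_project_folders("my_analysis")` would create a `my_analysis` folder (a hypothetical name) inside the current working directory, with the three sub-folders inside it.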

Create a New Notebook

Working with Notebooks

A notebook consists of a series of "cells":

By default, a new cell is always a code cell.

Code Cells

To run a code cell, click in it and press Shift-Enter, or click the Run button on the toolbar.
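As a small illustration, the contents of a code cell might look like the sketch below (the variable names and values are made up for this example). Running the cell executes every line, and when the last line is a bare expression, Jupyter displays its value beneath the cell.

```python
# Assign some values; assignments produce no output on their own.
radius = 2.0
pi_approx = 3.14159

# print() writes text to the cell's output area.
area = pi_approx * radius ** 2
print("Area of the circle:", area)

# When a cell ends with a bare expression, Jupyter shows its value
# below the cell without needing print().
round(area, 2)
```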

Some handy features:

Markdown Cells

In Markdown cells, you can write plain text or add formatting and other elements with Markdown. These include headers, bold text, italic text, hyperlinks, equations $A=\pi r^2$, inline code print('Hello world!'), bulleted lists, and more.

Other Notebook Basics

Interactivity vs. Automation

For a great example of how an interactive workflow in a Jupyter notebook can progress into automation with libraries and scripts, check out Jake VanderPlas' blog post Reproducible Data Analysis in Jupyter.
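A minimal sketch of that progression, using made-up temperature readings and a hypothetical function name: code first tried out line by line in cells is wrapped in a function, which could later be moved into a `.py` file and imported. This is an illustration of the general idea, not the workflow from the blog post.

```python
from statistics import mean

# Interactive stage: explore step by step, typically in separate cells.
readings_celsius = [18.5, 21.0, 19.5]  # made-up example data
readings_fahrenheit = [c * 9 / 5 + 32 for c in readings_celsius]
print(mean(readings_fahrenheit))


# Automation stage: the same steps packaged as a reusable function.
def mean_fahrenheit(celsius_values):
    """Convert Celsius readings to Fahrenheit and return their mean."""
    return mean(c * 9 / 5 + 32 for c in celsius_values)


print(mean_fahrenheit(readings_celsius))
```

Once the logic lives in a function, it can be rerun on new data with a single call instead of re-executing a sequence of cells by hand.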

Python Data Science Ecosystem

The Python libraries for data science are developed and maintained by external "3rd party" development teams.

Some of the libraries in the Python data science ecosystem:

Figure: the Python data science ecosystem, from The Unexpected Effectiveness of Python in Science (Jake VanderPlas)

In this workshop, we'll use pandas to work with tabular data and briefly introduce data visualization with the seaborn and plotly libraries.

