Intro to Data Analysis in Python

PyLadies Vancouver Workshop

Skill Level: Beginner

A little bit of previous experience with Python or another coding language would be helpful, but not required!

Description

In this workshop, you will develop skills with powerful data analysis tools from Python’s rich ecosystem of libraries. If you’re wrestling with spreadsheets on a regular basis and want to find better ways to analyze and visualize your data, handle messy and missing data, and automate repetitive tasks, this workshop is for you. If you’re coming from another data analysis environment (R, MATLAB, etc.) or you’re a Python enthusiast who is curious about Python’s data analysis capabilities, this workshop is also for you!

Working with real-world data and the pandas library, you’ll learn how to load data from a comma-separated values (CSV) file, quickly summarize it from many different angles, and visualize it in graphs, all with just a few lines of code. You’ll also learn how to dive into the data for a deeper analysis with techniques such as subsets, filters, text processing, and aggregation.
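To give a flavour of what "a few lines of code" looks like, here is a minimal sketch of the load, summarize, clean, filter, and aggregate steps described above. The column names and values are made up for illustration and are not the workshop's dataset; the data is read from an in-memory string so the example runs on its own, but `pd.read_csv` accepts a file path in the same way.

```python
import io

import pandas as pd

# Hypothetical sample data standing in for a real CSV file on disk.
csv_text = """species,site,weight
finch,A,12.5
finch,B,13.1
sparrow,A,
sparrow,B,24.0
Robin ,A,77.0
"""

# Load the data; pd.read_csv also accepts a path such as "data.csv".
df = pd.read_csv(io.StringIO(csv_text))

# Quick summary of the numeric columns (count, mean, min, max, and so on).
summary = df.describe()

# Text processing: strip stray whitespace and normalize capitalization.
df["species"] = df["species"].str.strip().str.lower()

# Missing data: count the gaps, then drop rows that have no weight.
n_missing = int(df["weight"].isna().sum())
clean = df.dropna(subset=["weight"])

# Filter: keep only the rows recorded at site A.
site_a = clean[clean["site"] == "A"]

# Aggregation: mean weight for each species.
mean_weight = clean.groupby("species")["weight"].mean()

print(summary)
print(mean_weight)
```

If matplotlib is installed, calling `mean_weight.plot(kind="bar")` in a notebook would draw the aggregated values as a bar chart.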

Setup

You’ll want to bring your laptop for lots of hands-on practice as we work through the lessons and exercises. We’ll be using Python 3.6, JupyterLab, and pandas. I highly recommend using Anaconda to set up your environment, especially if you’re new to Python or if data analysis is your main reason for using Python.

Please download and install the required software on your laptop before the workshop (see the setup instructions).
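Once everything is installed, a quick check like the one below can confirm that the pieces are in place. This is only a sketch and not part of the official setup instructions: the helper name `check_environment` is hypothetical, and it assumes the packages are importable under the names `pandas` and `jupyterlab`.

```python
import importlib.util
import sys


def check_environment(packages=("pandas", "jupyterlab"), min_python=(3, 6)):
    """Return a dict mapping each requirement to True (met) or False (not met)."""
    # Compare only the (major, minor) part of the running Python version.
    report = {"python": sys.version_info[:2] >= min_python}
    for name in packages:
        # find_spec returns None when a package is not installed.
        report[name] = importlib.util.find_spec(name) is not None
    return report


for requirement, ok in check_environment().items():
    print(f"{requirement}: {'OK' if ok else 'missing'}")
```

Run it with the same Python interpreter that you plan to use in the workshop, for example from an Anaconda prompt.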

Agenda

Credits: Some portions of this workshop are adapted from or inspired by the fantastic instructional materials in Data Insights with Python for Beginners (Copyright © Ladies Learning Code) and Python for Ecologists (Copyright © Software Carpentry), both made available under the Creative Commons Attribution license.