Getting Started

Before starting this Cookbook, you need to set up your environment. You can choose either the SQL track or the Pandas track, or set up both.

Track Selection Guide

🗄️ When to Choose the SQL Track

If you’re already familiar with SQL
If you work in an environment using BigQuery or other cloud data warehouses
If you need to handle large-scale data (GB to TB scale)
If it’s difficult to install a Python environment locally

🐼 When to Choose the Pandas Track

If you’re familiar with Python or want to learn Python
If you want to connect data preprocessing to machine learning modeling
If you want to experiment freely in a local environment
If you want to create analysis reports with Jupyter Notebook

🔄 When Both Tracks Are Recommended

In practice, SQL and Pandas are often used together:

Data Extraction: Extract necessary data from BigQuery using SQL
Data Analysis: Perform detailed analysis and visualization with Pandas
Save Results: Write the results back to BigQuery or generate reports
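
The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: an in-memory SQLite database stands in for BigQuery so the example runs without a cloud account, and the `orders` table, its columns, and the `customer_totals` output table are hypothetical names, not datasets from this Cookbook.

```python
import sqlite3

import pandas as pd

# Stand-in warehouse: an in-memory SQLite database.
# (Assumption for illustration; the Cookbook itself targets BigQuery.)
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL, status TEXT);
    INSERT INTO orders VALUES
        (1, 'alice', 120.0, 'paid'),
        (2, 'bob',    80.0, 'paid'),
        (3, 'alice',  40.0, 'cancelled'),
        (4, 'bob',    60.0, 'paid');
    """
)

# 1. Data extraction: use SQL to pull only the rows and columns needed.
paid = pd.read_sql_query(
    "SELECT customer, amount FROM orders WHERE status = 'paid'", conn
)

# 2. Data analysis: aggregate the extracted rows with Pandas.
summary = (
    paid.groupby("customer", as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "total_paid"})
)

# 3. Save results: write the summary back to the database as a new table.
summary.to_sql("customer_totals", conn, index=False, if_exists="replace")

print(summary)
```

The split of work is the point of the sketch: filtering happens in SQL, where the data lives, so only the reduced result set is loaded into a DataFrame for further analysis.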

Each recipe in this Cookbook provides both SQL and Pandas versions, so you can learn by comparing how to solve the same problem in two different ways.
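
To give a feel for that side-by-side comparison, here is a minimal sketch that solves one small problem (total units per region) both ways. The `sales` table and its rows are made up for illustration, and SQLite is used in place of BigQuery so the snippet runs locally; the actual recipes may be structured differently.

```python
import sqlite3

import pandas as pd

# Hypothetical sample rows: (region, units sold).
rows = [("north", 10), ("south", 5), ("north", 7), ("south", 3)]

# SQL version: GROUP BY with SUM, run against an in-memory SQLite table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, units INTEGER)")
conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
sql_result = conn.execute(
    "SELECT region, SUM(units) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Pandas version: groupby with sum on the same rows.
df = pd.DataFrame(rows, columns=["region", "units"])
totals = df.groupby("region")["units"].sum().sort_index()
pandas_result = [(region, int(units)) for region, units in totals.items()]

# Both approaches should produce the same per-region totals.
print(sql_result)
print(pandas_result)
```

`GROUP BY region` with `SUM(units)` in SQL corresponds to `groupby("region")["units"].sum()` in Pandas, which is the kind of mapping the paired recipes are meant to make visible.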

Environment Setup

🗄️ BigQuery Setup

Create a Google Cloud project, configure a service account, and connect to BigQuery

🐼 Pandas Setup

Install Python, set up a virtual environment, and download the sample data

Understanding the Data

Once your environment is set up, visit the Data Structure Overview page to learn about the datasets used in this Cookbook.