Getting Started
Before starting this Cookbook, you need to set up your environment. You can choose either the SQL track or the Pandas track, or set up both.
Track Selection Guide
🗄️ When to Choose the SQL Track
- If you’re already familiar with SQL
- If you work in an environment using BigQuery or other cloud data warehouses
- If you need to handle large-scale data (GB to TB scale)
- If it’s difficult to install a Python environment locally
🐼 When to Choose the Pandas Track
- If you’re familiar with Python or want to learn Python
- If you want to connect data preprocessing to machine learning modeling
- If you want to experiment freely in a local environment
- If you want to create analysis reports with Jupyter Notebook
🔄 When Both Tracks Are Recommended
In practice, SQL and Pandas are often used together:
- Data Extraction: Extract necessary data from BigQuery using SQL
- Data Analysis: Perform detailed analysis and visualization with Pandas
- Save Results: Write the results back to BigQuery or generate reports
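The three steps above can be sketched as a small end-to-end script. This is a minimal illustration, not the Cookbook's actual setup: it uses an in-memory SQLite database as a stand-in for BigQuery so that it runs without any cloud credentials, and the `orders` table, its columns, and the `customer_totals` table name are hypothetical.

```python
import sqlite3

import pandas as pd

# Stand-in for the data warehouse: an in-memory SQLite database.
# (Assumption for illustration; the Cookbook itself targets BigQuery.)
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (order_id INTEGER, customer TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'alice', 120.0),
        (2, 'bob',    80.0),
        (3, 'alice',  50.0),
        (4, 'carol', 200.0);
""")

# 1. Data extraction: pull only the rows and columns you need with SQL.
extracted = pd.read_sql_query(
    "SELECT customer, amount FROM orders WHERE amount >= 80", conn
)

# 2. Data analysis: summarize the extracted rows with pandas.
summary = (
    extracted.groupby("customer", as_index=False)["amount"]
    .sum()
    .rename(columns={"amount": "total_amount"})
    .sort_values("total_amount", ascending=False)
)

# 3. Save results: write the summary back to the database as a new table.
summary.to_sql("customer_totals", conn, if_exists="replace", index=False)

saved_rows = conn.execute(
    "SELECT customer, total_amount FROM customer_totals "
    "ORDER BY total_amount DESC"
).fetchall()
print(saved_rows)
```

With a real warehouse, the extraction and save steps would go through that warehouse's client library instead of `sqlite3`; the pandas step in the middle would stay much the same.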
Each recipe in this Cookbook provides both SQL and Pandas versions, so you can learn by comparing how to solve the same problem in two different ways.
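To give a feel for that side-by-side comparison, here is a minimal sketch that computes the same per-category average twice, once in SQL and once in Pandas. The `sales` data is made up for illustration and is not one of the Cookbook's datasets, and the SQL runs on an in-memory SQLite database as a stand-in for BigQuery.

```python
import sqlite3

import pandas as pd

# Hypothetical sample data, not one of the Cookbook's datasets.
sales = pd.DataFrame({
    "category": ["books", "toys", "books", "games", "toys"],
    "price": [12.0, 30.0, 8.0, 45.0, 10.0],
})

# SQL version, run on in-memory SQLite as a stand-in for BigQuery.
conn = sqlite3.connect(":memory:")
sales.to_sql("sales", conn, index=False)
sql_result = pd.read_sql_query(
    """
    SELECT category, AVG(price) AS avg_price
    FROM sales
    GROUP BY category
    ORDER BY category
    """,
    conn,
)

# Pandas version of the same aggregation.
pandas_result = (
    sales.groupby("category", as_index=False)["price"]
    .mean()
    .rename(columns={"price": "avg_price"})
    .sort_values("category")
    .reset_index(drop=True)
)

# Both approaches should produce the same rows.
print(sql_result.to_dict("records") == pandas_result.to_dict("records"))
```

Reading the two versions together shows how `GROUP BY` with `AVG` maps onto `groupby(...).mean()`, which is the kind of correspondence the paired recipes are meant to highlight.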
Environment Setup
🗄️ BigQuery Setup
Create a Google Cloud project, configure a service account, and connect to BigQuery
🐼 Pandas Setup
Install Python, set up a virtual environment, and download the sample data
Understanding the Data
Once your environment is set up, visit the Data Structure Overview page to learn about the datasets used in this Cookbook.