Data · Advanced

Data Engineering with Python + PySpark in the Cloud

Handling larger data volumes requires distributed processing and orchestration. In this course you will learn how to structure a modern lakehouse stack with Python, Spark, Airflow, dbt, and core AWS services.

18 lessons · Certificate included · USD 10 (~ARS 10.000)

Course syllabus

1. Modern data architectures (2 lessons)
  • Data Lakehouse
  • The modern Airflow + dbt stack
2. PySpark from scratch (3 lessons)
  • RDD vs DataFrame
  • Joins and aggregations
  • Parquet/Delta
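As a taste of what this module covers, the sketch below joins two small DataFrames, aggregates spend per customer, and writes the result as Parquet. It assumes a local Spark runtime; the table and column names (`orders`, `customers`, `amount`) are illustrative, not part of the course material.

```python
# Minimal PySpark sketch: join two DataFrames, aggregate,
# and write the result as Parquet. Requires a local Spark install.
from pyspark.sql import SparkSession
from pyspark.sql import functions as F

spark = SparkSession.builder.appName("orders-demo").getOrCreate()

orders = spark.createDataFrame(
    [(1, "A", 10.0), (2, "B", 25.0), (3, "A", 5.0)],
    ["order_id", "customer_id", "amount"],
)
customers = spark.createDataFrame(
    [("A", "Ana"), ("B", "Bruno")],
    ["customer_id", "name"],
)

# Join on customer_id, then total spend per customer
totals = (
    orders.join(customers, on="customer_id", how="inner")
    .groupBy("customer_id", "name")
    .agg(F.sum("amount").alias("total_amount"))
)

totals.show()
totals.write.mode("overwrite").parquet("/tmp/customer_totals")
spark.stop()
```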
3. Apache Airflow (3 lessons)
  • DAGs and operators
  • TaskFlow API
  • Local orchestration
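The TaskFlow API lets you express a DAG as decorated Python functions, with dependencies inferred from the call graph. A minimal sketch, assuming Airflow 2.x; the DAG and task names are illustrative:

```python
# Sketch of a daily DAG using Airflow's TaskFlow API (Airflow 2.x).
from datetime import datetime

from airflow.decorators import dag, task


@dag(schedule="@daily", start_date=datetime(2024, 1, 1), catchup=False)
def daily_sales():
    @task
    def extract():
        # Stand-in for reading from a source system
        return [{"order_id": 1, "amount": 10.0}, {"order_id": 2, "amount": 25.0}]

    @task
    def transform(rows):
        return sum(r["amount"] for r in rows)

    @task
    def load(total):
        print(f"daily total: {total}")

    # Dependencies follow from the data flow: extract -> transform -> load
    load(transform(extract()))


daily_sales()
```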
4. Transformations with dbt (3 lessons)
  • dbt Core
  • Models
  • Tests and macros
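In dbt Core, a model is a SQL `select` plus YAML metadata that can declare built-in tests. A minimal sketch of that pairing; the source, model, and column names are illustrative:

```sql
-- models/stg_orders.sql: illustrative staging model
select
    order_id,
    customer_id,
    amount
from {{ source('shop', 'raw_orders') }}
where amount is not null
```

```yaml
# models/schema.yml: built-in dbt tests on the model above
version: 2
models:
  - name: stg_orders
    columns:
      - name: order_id
        tests: [unique, not_null]
      - name: amount
        tests: [not_null]
```

Running `dbt test` then checks these declared expectations against the built model.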
5. Cloud data engineering with AWS (3 lessons)
  • Glue
  • Redshift
  • S3 data lakes
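S3 data lakes typically organize files under partitioned prefixes (for example, by ingest date). A small boto3 sketch of browsing one such prefix; the bucket and prefix are illustrative, and running it requires AWS credentials:

```python
# Sketch: list Parquet files under a date-partitioned S3 prefix.
import boto3

s3 = boto3.client("s3")
resp = s3.list_objects_v2(
    Bucket="my-data-lake",                       # illustrative bucket name
    Prefix="lake/orders/ingest_date=2024-01-01/",  # illustrative partition
)
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```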
6. Data quality and governance (3 lessons)
  • Great Expectations
  • Data catalog
  • Lineage
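Great Expectations formalizes checks such as "column is never null" or "values fall within a range". To convey the idea, the same style of check can be sketched in plain Python (this is not the Great Expectations API; the function and column names are illustrative):

```python
# Expectation-style data checks sketched in plain Python.

def expect_column_values_not_null(rows, column):
    """True if every row has a non-None value for `column`."""
    return all(row.get(column) is not None for row in rows)


def expect_column_values_between(rows, column, low, high):
    """True if every value of `column` lies in [low, high]."""
    return all(low <= row[column] <= high for row in rows)


orders = [
    {"order_id": 1, "amount": 10.0},
    {"order_id": 2, "amount": 25.0},
]

print(expect_column_values_not_null(orders, "order_id"))       # True
print(expect_column_values_between(orders, "amount", 0, 100))  # True
```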
7. Final project (1 lesson)
  • Batch ingestion and transformation pipeline
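The shape of the final project — ingest raw records, transform them, produce an output — can be miniaturized in plain Python (the data and field names are illustrative; the real project uses the Spark/Airflow/dbt stack from earlier modules):

```python
# Miniature batch pipeline: ingest CSV text, type the fields, aggregate.
import csv
import io

RAW = """order_id,customer_id,amount
1,A,10.0
2,B,25.0
3,A,5.0
"""


def ingest(text):
    """Parse raw CSV text into typed rows."""
    rows = list(csv.DictReader(io.StringIO(text)))
    for row in rows:
        row["amount"] = float(row["amount"])
    return rows


def transform(rows):
    """Aggregate total amount per customer."""
    totals = {}
    for row in rows:
        totals[row["customer_id"]] = totals.get(row["customer_id"], 0.0) + row["amount"]
    return totals


print(transform(ingest(RAW)))  # {'A': 15.0, 'B': 25.0}
```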

What you will learn

PySpark · Apache Airflow · dbt · AWS S3 / Glue · Great Expectations

Certificate

Advanced Data Engineer Certificate - CumbreAcademy

Ready to start?

Investment: USD 10 (~ARS 10.000)

Buy access

Want access to every course?

Total Access gives you this course and all the others for USD 20/month.

This course: USD 10 (~ARS 10.000) · Total Access: USD 20/month (all courses)
See Total Access
