Load data from the Million Song Dataset into a final dimensional model in Redshift using Apache Airflow.
Updated Jun 2, 2020 - Python
Simplified blueprints for building data pipelines with SFTP.
Cassandra ETL Pipeline
This is a project based on the Data Engineering Coding Challenge of Verve Company.
Simplified blueprints for building data pipelines with Power BI.
BigQuery data pipeline with dbt, Spark, Docker, Airflow, Terraform, GCP
Proof of concept to manage data warehouse data transformations
Repo for tracking content related to dbt Cloud
Simplified blueprints for building data pipelines with dbt Cloud.
Simplified blueprints for building data pipelines with Domo.
Data Warehouse with ELT pipelines
This project builds an ELT pipeline that consolidates data from diverse sources into a single source of truth in BigQuery. At its core is an Apache Airflow Directed Acyclic Graph (DAG) that generates its tasks dynamically from predefined file configurations.
Building a Modern Data Stack with Open Source tools