Apache Airflow
For more information regarding Apache Airflow (AA), please visit the AA section. As a reminder:
Apache Airflow is a platform for programmatically authoring, scheduling, and monitoring workflows: it lets you build and run workflows as code.
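As a quick refresher, a DAG is authored in plain Python. Below is a minimal sketch of a daily workflow; the dag_id, schedule, and echo commands are placeholders, not taken from any project listed here.

```python
# A minimal sketch of an Airflow 2.x DAG; all names and commands are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="example_hello_dag",          # hypothetical DAG name
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",          # run once a day
    catchup=False,
) as dag:
    say_hello = BashOperator(task_id="say_hello", bash_command="echo 'hello'")
    say_done = BashOperator(task_id="say_done", bash_command="echo 'done'")

    say_hello >> say_done                # set task order: hello, then done
```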
This section covers ETL pipeline projects set up in Apache Airflow.
Bash
This section uses Bash to set up automated ETL pipelines in Apache Airflow.
Import TXT Server Data for ETL in AA
We’ll Extract server data from an online data source, then Transform and Load it into a local TXT file in a pipeline, using the BashOperator.
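A minimal sketch of how such a pipeline might look, assuming a hypothetical source URL and /tmp file paths; each ETL stage is a single BashOperator:

```python
# Sketch of the TXT server-data ETL; the URL, paths, and fields are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="server_data_txt_etl",        # hypothetical dag_id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Extract: download the raw server log (URL is a placeholder)
    extract = BashOperator(
        task_id="extract",
        bash_command="curl -sSL https://example.com/server-access.log -o /tmp/raw.log",
    )
    # Transform: keep only the first two whitespace-delimited fields
    transform = BashOperator(
        task_id="transform",
        bash_command="cut -d' ' -f1,2 /tmp/raw.log > /tmp/transformed.txt",
    )
    # Load: append the transformed rows to the target TXT file
    load = BashOperator(
        task_id="load",
        bash_command="cat /tmp/transformed.txt >> /tmp/server_data.txt",
    )

    extract >> transform >> load
```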
Import Multiple Formats TGZ Data ETL
This is a typical ETL pipeline, where data is imported from different sources in different formats. Here we set up a pipeline to Extract, Transform, and Load the data, and to monitor the ETL pipeline.
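A sketch of the shape this pipeline could take, assuming a hypothetical /tmp/source.tgz archive holding a CSV and a TSV member; the cut, tr, and paste commands are illustrative, not the project’s actual commands:

```python
# Sketch of a multi-format TGZ ETL; archive path and member names are placeholders.
from datetime import datetime

from airflow import DAG
from airflow.operators.bash import BashOperator

with DAG(
    dag_id="tgz_multi_format_etl",       # hypothetical dag_id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    # Extract: unpack the archive into a staging directory
    unzip = BashOperator(
        task_id="unzip_data",
        bash_command="tar -xzf /tmp/source.tgz -C /tmp/staging",
    )
    # Transform: pull selected columns out of each source format
    from_csv = BashOperator(
        task_id="extract_from_csv",
        bash_command="cut -d',' -f1-3 /tmp/staging/data.csv > /tmp/staging/csv_part.csv",
    )
    from_tsv = BashOperator(
        task_id="extract_from_tsv",
        bash_command="cut -f1-2 /tmp/staging/data.tsv | tr '\\t' ',' > /tmp/staging/tsv_part.csv",
    )
    # Load: combine the per-format column slices side by side into one CSV
    consolidate = BashOperator(
        task_id="consolidate",
        bash_command="paste -d',' /tmp/staging/csv_part.csv /tmp/staging/tsv_part.csv > /tmp/final.csv",
    )

    unzip >> [from_csv, from_tsv] >> consolidate
```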
Multiple Projects using BashOperator for ETL Pipelines
This is a collection of short ETL pipelines demonstrating the use of the BashOperator with Apache Airflow.
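The recurring pattern across these short pipelines might look like the following sketch; the default_args values, URL, and commands are all placeholders:

```python
# Sketch of a short BashOperator pipeline with shared default_args; all placeholders.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash import BashOperator

default_args = {
    "owner": "airflow",
    "retries": 1,                        # retry a failed task once
    "retry_delay": timedelta(minutes=5),
}

with DAG(
    dag_id="short_bash_etl",             # hypothetical dag_id
    default_args=default_args,
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    download = BashOperator(
        task_id="download",
        bash_command="curl -sSL https://example.com/data.csv -o /tmp/data.csv",
    )
    checksum = BashOperator(
        task_id="checksum",
        bash_command="md5sum /tmp/data.csv > /tmp/data.md5",
    )

    download >> checksum
```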
Python
This section uses Python to set up automated ETL pipelines in Apache Airflow.
Remake of Import TXT Server Data for ETL in Python
We’ll Extract server data from an online data source, then Transform and Load it into a local TXT file in a pipeline, this time using the PythonOperator.
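A minimal sketch of the PythonOperator version, assuming the same hypothetical source URL and output paths; the shell steps become Python callables, with requests standing in for curl:

```python
# Sketch of the TXT server-data ETL with PythonOperator; URL and paths are placeholders.
from datetime import datetime

import requests
from airflow import DAG
from airflow.operators.python import PythonOperator

RAW = "/tmp/raw.log"
OUT = "/tmp/server_data.txt"

def extract():
    # Download the raw server log (URL is a placeholder)
    resp = requests.get("https://example.com/server-access.log", timeout=30)
    resp.raise_for_status()
    with open(RAW, "w") as f:
        f.write(resp.text)

def transform_and_load():
    # Keep the first two whitespace-delimited fields of each line
    with open(RAW) as src, open(OUT, "w") as dst:
        for line in src:
            fields = line.split()
            if len(fields) >= 2:
                dst.write(f"{fields[0]} {fields[1]}\n")

with DAG(
    dag_id="server_data_txt_etl_py",     # hypothetical dag_id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    extract_task = PythonOperator(task_id="extract", python_callable=extract)
    tl_task = PythonOperator(task_id="transform_and_load", python_callable=transform_and_load)

    extract_task >> tl_task
```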
Remake of Import Multiple Formats TGZ Data ETL
We’ll Extract data arriving in multiple formats from a TGZ archive, then Transform and Load it in a pipeline, this time using the PythonOperator.
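A sketch of how the TGZ remake could look in pure Python, with tarfile and csv standing in for the earlier shell commands; the archive path and member names are placeholders:

```python
# Sketch of the TGZ ETL with PythonOperator; archive path and members are placeholders.
from datetime import datetime
import csv
import tarfile

from airflow import DAG
from airflow.operators.python import PythonOperator

ARCHIVE = "/tmp/source.tgz"
STAGING = "/tmp/staging"

def unpack():
    # Extract every member of the archive into the staging directory
    with tarfile.open(ARCHIVE, "r:gz") as tar:
        tar.extractall(STAGING)

def transform_csv():
    # Keep the first three columns of the CSV member (file name is a placeholder)
    with open(f"{STAGING}/data.csv") as src, open(f"{STAGING}/csv_part.csv", "w", newline="") as dst:
        writer = csv.writer(dst)
        for row in csv.reader(src):
            writer.writerow(row[:3])

with DAG(
    dag_id="tgz_multi_format_etl_py",    # hypothetical dag_id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    unpack_task = PythonOperator(task_id="unpack", python_callable=unpack)
    csv_task = PythonOperator(task_id="transform_csv", python_callable=transform_csv)

    unpack_task >> csv_task
```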
Import Customer Data, Transform, and Load to CSV File
This ETL pipeline will import customer data, transform the information, then load the results into a CSV file that will be passed to the data analysts.
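A minimal sketch of such a pipeline, assuming a hypothetical JSON source with name and email columns; pandas handles the transform before the CSV handoff:

```python
# Sketch of the customer-data ETL; source path, target path, and columns are placeholders.
from datetime import datetime

import pandas as pd
from airflow import DAG
from airflow.operators.python import PythonOperator

SOURCE = "/tmp/customers.json"           # placeholder input path
TARGET = "/tmp/customers_clean.csv"      # placeholder output for the analysts

def etl():
    df = pd.read_json(SOURCE)
    # Transform: normalize names and drop rows missing an email address
    df["name"] = df["name"].str.strip().str.title()
    df = df.dropna(subset=["email"])
    # Load: write the cleaned records to CSV for the data analysts
    df.to_csv(TARGET, index=False)

with DAG(
    dag_id="customer_data_to_csv",       # hypothetical dag_id
    start_date=datetime(2024, 1, 1),
    schedule_interval="@daily",
    catchup=False,
) as dag:
    PythonOperator(task_id="etl", python_callable=etl)
```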