Overview

An Apache Airflow DAG that automates daily data ingestion from Amazon S3 into Snowflake, with real-time Slack notifications for pipeline monitoring. Built with production-grade patterns including externalized configuration, failure callbacks, and modern Airflow 2.x providers.
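The externalized-configuration pattern mentioned above could be sketched like this. All environment-variable names and default values here are hypothetical, not the project's actual settings:

```python
# Minimal sketch of externalized configuration: settings come from the
# environment rather than being hard-coded in the DAG file.
# All names and defaults below are illustrative assumptions.
import os
from dataclasses import dataclass

@dataclass(frozen=True)
class PipelineConfig:
    s3_bucket: str
    s3_key_pattern: str
    snowflake_table: str
    slack_webhook_conn_id: str

def load_config() -> PipelineConfig:
    """Read pipeline settings from environment variables with safe defaults."""
    return PipelineConfig(
        s3_bucket=os.environ.get("INGEST_S3_BUCKET", "example-data-bucket"),
        s3_key_pattern=os.environ.get("INGEST_S3_KEY", "daily/{{ ds }}/data.csv"),
        snowflake_table=os.environ.get("INGEST_TABLE", "RAW.DAILY_DATA"),
        slack_webhook_conn_id=os.environ.get("SLACK_CONN_ID", "slack_webhook"),
    )
```

Keeping configuration out of the DAG body means the same file can be deployed unchanged across environments; only the environment (or an Airflow Variable/Connection) differs.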

Architecture

The pipeline follows a three-stage pattern:

1. Sense — An S3KeySensor polls the target S3 bucket every 30 minutes, timing out after 12 hours, until the expected data file lands.

2. Load — Executes Snowflake’s COPY INTO command to bulk-load the CSV directly from S3 into the target table.

3. Notify — Posts a structured success message to Slack via webhook. Any task failure triggers an immediate alert with DAG and task context.
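Taken together, the three stages might be wired up roughly as follows. This is a sketch under assumptions, not the project's actual DAG: the connection IDs, bucket, stage, table, and schedule are all hypothetical, and the imports follow the Airflow 2.x provider packages named in the Overview.

```python
# Hypothetical sense -> load -> notify wiring; every ID, path, and SQL
# statement below is an illustrative assumption.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.providers.amazon.aws.sensors.s3 import S3KeySensor
from airflow.providers.snowflake.operators.snowflake import SnowflakeOperator
from airflow.providers.slack.operators.slack_webhook import SlackWebhookOperator

def notify_failure(context):
    """Failure callback: post DAG and task context to Slack on any failed task."""
    SlackWebhookOperator(
        task_id="slack_failure_alert",
        slack_webhook_conn_id="slack_webhook",  # assumed connection ID
        message=(
            ":red_circle: Task failed\n"
            f"DAG: {context['dag'].dag_id}\n"
            f"Task: {context['task_instance'].task_id}\n"
            f"Run date: {context['ds']}"
        ),
    ).execute(context=context)

with DAG(
    dag_id="s3_to_snowflake_daily",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
    default_args={
        "on_failure_callback": notify_failure,
        "retries": 1,
        "retry_delay": timedelta(minutes=5),
    },
) as dag:
    # Sense: poll every 30 minutes, give up after 12 hours.
    wait_for_file = S3KeySensor(
        task_id="wait_for_s3_file",
        bucket_name="example-data-bucket",
        bucket_key="daily/{{ ds }}/data.csv",
        aws_conn_id="aws_default",
        poke_interval=30 * 60,
        timeout=12 * 60 * 60,
    )

    # Load: bulk-load the CSV straight from the S3 stage into the target table.
    load_to_snowflake = SnowflakeOperator(
        task_id="copy_into_snowflake",
        snowflake_conn_id="snowflake_default",
        sql="""
            COPY INTO RAW.DAILY_DATA
            FROM @RAW.S3_STAGE/daily/{{ ds }}/data.csv
            FILE_FORMAT = (TYPE = 'CSV' SKIP_HEADER = 1)
        """,
    )

    # Notify: structured success message via Slack webhook.
    notify_success = SlackWebhookOperator(
        task_id="slack_success_message",
        slack_webhook_conn_id="slack_webhook",
        message=":white_check_mark: Daily load for {{ ds }} completed.",
    )

    wait_for_file >> load_to_snowflake >> notify_success
```

Because the failure callback is set in `default_args`, it applies to every task in the DAG, so any stage that fails produces an alert without per-task wiring.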

Tech Stack

- Apache Airflow 2.x (scheduling and orchestration)
- Amazon S3 (source data landing zone)
- Snowflake (target data warehouse, loaded via COPY INTO)
- Slack incoming webhooks (success and failure notifications)
- Airflow provider packages: amazon, snowflake, slack