Overview

A production-grade serverless ML inference API that classifies text sentiment as POSITIVE or NEGATIVE using DistilBERT. It is deployed on AWS Lambda as a Docker container image behind API Gateway, demonstrating the full path from a trained model to a production-ready cloud endpoint.

Architecture

API Gateway receives HTTP requests and routes them to AWS Lambda, which runs a Docker container housing the FastAPI application. Mangum bridges FastAPI’s ASGI interface to Lambda’s event-based invocation model. The DistilBERT model (fine-tuned on SST-2) is downloaded at Docker build time and baked into the image, so a cold start does not have to fetch the weights over the network.
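To make the event-based invocation model concrete, here is a minimal, dependency-free sketch of the request and response shapes that API Gateway's Lambda proxy integration exchanges with a function. It is not the project's code: in the deployed service Mangum performs this translation for the FastAPI app. The `text` request field, the error payloads, and the keyword-based `stub_classifier` (standing in for the DistilBERT model) are all assumptions made for illustration.

```python
import base64
import json
from typing import Any, Callable, Dict

Classifier = Callable[[str], Dict[str, Any]]


def stub_classifier(text: str) -> Dict[str, Any]:
    """Hypothetical stand-in for the DistilBERT model (keyword match only)."""
    negative_words = {"bad", "terrible", "awful", "hate"}
    tokens = set(text.lower().split())
    label = "NEGATIVE" if tokens & negative_words else "POSITIVE"
    return {"label": label, "score": 1.0}


def _response(status: int, payload: Dict[str, Any]) -> Dict[str, Any]:
    # Proxy integrations expect a status code, headers, and a string body.
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(payload),
    }


def make_handler(classify: Classifier):
    """Build a Lambda-style handler around any text classifier."""

    def handler(event: Dict[str, Any], context: Any) -> Dict[str, Any]:
        raw = event.get("body") or ""
        try:
            # API Gateway may deliver the body base64-encoded.
            if event.get("isBase64Encoded"):
                raw = base64.b64decode(raw).decode("utf-8")
            payload = json.loads(raw)
        except ValueError:
            # Covers invalid base64, invalid UTF-8, and invalid JSON.
            return _response(400, {"detail": "body must be valid JSON"})

        # "text" is an assumed field name, not taken from the project.
        text = payload.get("text") if isinstance(payload, dict) else None
        if not isinstance(text, str) or not text.strip():
            return _response(422, {"detail": "field 'text' is required"})

        return _response(200, classify(text))

    return handler


# Module-level entry point, the shape Lambda invokes as handler(event, context).
handler = make_handler(stub_classifier)
```

Calling `handler({"body": json.dumps({"text": "I hate this"})}, None)` returns a dictionary with `statusCode` 200 and a JSON body carrying the label. Passing the classifier in as an argument keeps the handler testable without loading the real model.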

Key Design Decisions

Tech Stack
