GCSE Maths Exam Builder: Observability & Deployment Case Study


Building a functional backend is only half the battle. The other half is ensuring that the system is observable, automated, and secure. This project is a case study in managing a containerised stack on a DigitalOcean Droplet with a full CI/CD and monitoring suite.

The Architecture

I opted for a modular, containerised approach using Docker Compose. This ensures that the environment is reproducible and isolated.

  • Nginx Reverse Proxy: Handles SSL termination and routes traffic to the Node.js API or the Grafana dashboard (grafana.tacknowledge.co.uk).
  • The Backend: A Node.js service designed with a Prometheus-ready /metrics endpoint.
  • The Monitoring Tier: Uses Node Exporter for system metrics, Prometheus for scraping data every 15s, and Grafana for visualization.
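A Docker Compose file tying these services together might look like the sketch below. Service names, the API image name, and port mappings are assumptions for illustration; only the overall shape reflects the stack described above.

```yaml
# docker-compose.yml — illustrative sketch of the containerised stack
services:
  nginx:
    image: nginx:alpine          # SSL termination and routing
    ports: ["80:80", "443:443"]
    depends_on: [api, grafana]

  api:
    image: myuser/exam-builder-api:latest   # hypothetical Docker Hub image
    expose: ["3000"]                        # exposes /metrics for Prometheus

  node-exporter:
    image: prom/node-exporter:latest        # system metrics

  prometheus:
    image: prom/prometheus:latest
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml

  grafana:
    image: grafana/grafana:latest           # served at grafana.tacknowledge.co.uk
```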

Automated Deployment Flow

I follow a Build-Push-Pull pattern. To maintain uptime and keep the Droplet's resources dedicated to serving users, I offload the heavy lifting to GitHub Actions.

  1. Remote Build: GitHub Actions builds the Docker image in the cloud. This prevents the Droplet from freezing due to CPU exhaustion during a build.
  2. Docker Hub: The tagged image is pushed to a central registry.
  3. The Trigger: GitHub Actions SSHs into the Droplet and runs deploy.sh.
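The three steps above can be sketched as a single workflow. This is an illustrative example, not the project's actual file: the workflow filename, image name, secret names, and action versions are all assumptions.

```yaml
# .github/workflows/deploy.yml — hedged sketch of the Build-Push-Pull flow
name: Build and Deploy
on:
  push:
    branches: [main]

jobs:
  build-push-deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # 1. Remote Build happens on the GitHub runner, not the Droplet
      - uses: docker/login-action@v3
        with:
          username: ${{ secrets.DOCKERHUB_USERNAME }}
          password: ${{ secrets.DOCKERHUB_TOKEN }}

      # 2. Push the tagged image to Docker Hub (image name is hypothetical)
      - uses: docker/build-push-action@v6
        with:
          push: true
          tags: myuser/exam-builder-api:latest

      # 3. SSH into the Droplet and run the deploy script
      - name: Trigger deploy on the Droplet
        uses: appleboy/ssh-action@v1.0.3
        with:
          host: ${{ secrets.DROPLET_HOST }}
          username: ${{ secrets.DROPLET_USER }}
          key: ${{ secrets.SSH_PRIVATE_KEY }}
          script: ./deploy.sh
```

Because the build runs on a GitHub-hosted runner, the Droplet only ever pulls a finished image.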

The Logic: deploy.sh

This script keeps the deployment idempotent and clean. With set -euo pipefail, the entire script stops as soon as any command fails, preventing a "partial" deployment.

```shell
#!/usr/bin/env bash
set -euo pipefail

echo "[deploy] Using compose file docker-compose.yml"

echo "[deploy] Pulling latest images..."
docker-compose pull

echo "[deploy] Starting all services..."
docker-compose up -d

echo "[deploy] Cleaning up old images..."
docker image prune -f

echo "[deploy] Done."
```

Observability Features

I don't believe in "silent failures." Visibility is the antidote to downtime.

  • System Health: I imported the Node Exporter Full dashboard into Grafana to track CPU load, RAM, and Disk I/O.
  • Custom Panels: I monitor the Node.js backend’s memory usage specifically during PDF generation—a resource-heavy task for the exam generator.
  • External Verification: UptimeRobot monitors the /api/health endpoint externally to verify that both Nginx and the container are responding.
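The 15s scrape cadence mentioned earlier lives in the Prometheus config. The following is a minimal sketch under assumed job names and container hostnames; only the node-exporter port (9100) is its documented default.

```yaml
# prometheus.yml — sketch; job names and API port are assumptions
global:
  scrape_interval: 15s   # scrape every target every 15 seconds

scrape_configs:
  - job_name: node                          # host metrics via Node Exporter
    static_configs:
      - targets: ["node-exporter:9100"]

  - job_name: api                           # Node.js backend's /metrics endpoint
    metrics_path: /metrics
    static_configs:
      - targets: ["api:3000"]
```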

Why this matters

My goal is to demonstrate that I don't just "write code"; I manage the entire lifecycle of an application. From the networking logic in the Nginx config to the automated recovery in Docker Compose, this project reflects my T-shaped engineering approach: broad system knowledge with a deep vertical in cloud infrastructure.