Architecture diagram showing ECS Fargate services sending logs via Fluent Bit to Grafana Loki and Grafana

Implementing Centralized Logging with Grafana Loki in ECS Fargate

How we implemented centralized logging for four ECS Fargate microservices using Grafana Loki and Fluent Bit sidecars, cutting debugging time by 90% and reducing logging costs by 60%.

Saurabh Parmar
4 min read


When you run multiple microservices on ECS Fargate, log aggregation quickly becomes a core operational concern. In this guide, we walk through how we implemented centralized logging with Grafana Loki and Fluent Bit sidecars for a travel platform running four microservices.

The Problem

With several ECS services, logs naturally end up scattered across multiple CloudWatch log groups and streams. Developers were:

  • Jumping between log streams and services
  • Manually correlating timestamps and request flows
  • Lacking a single, searchable view of the system

We needed centralized logging with powerful query capabilities and low operational overhead.

Why We Chose Grafana Loki

We evaluated traditional ELK-style stacks and CloudWatch-only approaches, but settled on Grafana Loki because it offered:

  • Cost-effective storage: Loki indexes labels, not full log content, which is typically up to 10x cheaper than Elasticsearch for similar workloads.
  • Native Grafana integration: Metrics and logs in a single Grafana UI, with easy correlation between dashboards and log queries.
  • LogQL: A query language similar to PromQL, making it intuitive for teams already using Prometheus.
  • Horizontal scaling: Loki’s microservices architecture scales with your microservices footprint.

The result is a logging platform that’s both powerful and economical for ECS Fargate workloads.

High-Level Architecture

ECS Service (app logs) → Fluent Bit Sidecar (log processing) → Loki (storage) → Grafana (queries)

Each ECS task runs a Fluent Bit sidecar (via FireLens) that ships logs to Loki. Grafana is used as the query and visualization layer.

Flow

  1. Application containers write logs to stdout/stderr.
  2. FireLens / Fluent Bit sidecar in the same task reads those logs.
  3. Fluent Bit enriches logs with ECS metadata and forwards them to Loki.
  4. Loki stores logs in object storage (S3) with indices based on labels (see the configuration sketch after this list).
  5. Grafana queries Loki using LogQL for troubleshooting, dashboards, and alerting.
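
Step 4 is worth a closer look, since label-based indexing over S3 is where most of the cost savings come from. Below is a minimal sketch of that part of Loki's configuration, assuming a recent Loki release with the TSDB index and a hypothetical bucket name (loki-chunks); the exact deployment details will differ per environment.

# loki-config.yaml (excerpt): chunks and index both live in S3
common:
  storage:
    s3:
      region: us-east-1
      bucketnames: loki-chunks      # hypothetical bucket name
schema_config:
  configs:
    - from: "2024-01-01"
      store: tsdb                   # index holds labels only, not log content
      object_store: s3              # chunks are written to the S3 bucket above
      schema: v13
      index:
        prefix: index_
        period: 24h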

Fluent Bit Sidecar with FireLens

We use AWS FireLens (ECS’s native log router) to run Fluent Bit as a sidecar in each task definition.

Task Definition Snippet

ContainerDefinitions:
  - Name: log-router
    Image: amazon/aws-for-fluent-bit:latest
    Essential: true
    FirelensConfiguration:
      Type: fluentbit
      Options:
        enable-ecs-log-metadata: 'true'
    LogConfiguration:
      LogDriver: awslogs
      Options:
        awslogs-group: /ecs/firelens
        awslogs-region: us-east-1
        awslogs-stream-prefix: firelens

Key points:

  • enable-ecs-log-metadata: 'true' injects ECS metadata (task, cluster, container, etc.) into log records.
  • The log router itself logs to a dedicated CloudWatch group (/ecs/firelens) for troubleshooting the pipeline.

Your application containers then use the awsfirelens log driver so their logs are routed through this sidecar.

Fluent Bit Output Configuration for Loki

We configure Fluent Bit to send all matched logs to Loki:
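With FireLens, one way to express this is to put the Loki output options directly on each application container's log configuration; FireLens turns these options into a Fluent Bit output stanza behind the scenes. The sketch below assumes the loki output plugin shipped with Fluent Bit 1.6+ and a hypothetical internal Loki endpoint (loki.internal); adjust the host, labels, and TLS settings for your environment.

LogConfiguration:
  LogDriver: awsfirelens
  Options:
    Name: loki                           # Fluent Bit's Loki output plugin
    host: loki.internal                  # hypothetical Loki endpoint
    port: '3100'
    labels: job=ecs,service=api-gateway  # static stream labels; keep cardinality low
    label_keys: $container_name          # promote injected ECS metadata to a label
    line_format: json

Each service's task definition sets its own service label value, which is what the LogQL examples below filter on.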

Querying Logs with LogQL

LogQL is Loki's query language, similar to PromQL. Here are some practical queries for ECS workloads:

  • Filter by service: {service="api-gateway"} |= "error"
  • Rate of errors: rate({service=~".+"} |= "error" [5m])
  • JSON parsing: {service="order-service"} | json | level="error"
  • Latency tracking: {service="payment"} | json | duration_ms > 1000
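
Parsed fields can also feed aggregations for dashboards. As a sketch, here is a per-service p95 latency query, assuming the same JSON duration_ms field as in the latency example above:

quantile_over_time(0.95,
  {service=~".+"} | json | unwrap duration_ms [5m]
) by (service)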

Dashboard Setup

Create Grafana dashboards that combine metrics and logs:

  • Service health panel - show error rate with drill-down to logs
  • Request tracing - correlate request IDs across services
  • Log volume trends - monitor logging costs and anomalies
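
As a rough sketch, those panels map to LogQL queries along the following lines; $service and $request_id are assumed Grafana dashboard variables rather than part of the original setup:

# Service health: error rate per service (drill-down filters on {service="$service"} |= "error")
sum by (service) (rate({service=~".+"} |= "error" [5m]))

# Request tracing: follow a single request ID across all services
{service=~".+"} |= "$request_id"

# Log volume trends: bytes ingested per service per hour
sum by (service) (bytes_over_time({service=~".+"} [1h]))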

Alerting on Logs

Set up alerts in Grafana based on log patterns:

  • Error spike detection - alert when error rate exceeds threshold
  • Missing heartbeat - alert when a service stops logging
  • Specific error patterns - alert on "OutOfMemory" or "Connection refused"
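
The same conditions can also be kept as code. Below is a sketch in Loki's ruler format (Prometheus-style rule files evaluated over LogQL); it is an illustration rather than the exact Grafana-managed alerts described here, and the thresholds and service names are placeholders:

groups:
  - name: ecs-log-alerts
    rules:
      - alert: ErrorSpike
        # fire when any service logs errors faster than an example threshold for 5 minutes
        expr: sum by (service) (rate({service=~".+"} |= "error" [5m])) > 1
        for: 5m
        labels:
          severity: warning
      - alert: MissingHeartbeat
        # fire when a service has produced no log lines at all for 10 minutes
        expr: absent_over_time({service="api-gateway"} [10m])
        labels:
          severity: critical
      - alert: OutOfMemory
        # fire on specific error patterns anywhere in the fleet
        expr: sum(count_over_time({service=~".+"} |= "OutOfMemory" [5m])) > 0
        labels:
          severity: critical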

Results

After implementing Grafana Loki for our four ECS microservices:

  • 90% reduction in time to debug production issues
  • 60% cost savings compared to CloudWatch Logs Insights
  • Single pane of glass for metrics and logs in Grafana
  • Proactive alerting catching issues before users report them

Key Takeaways

Grafana Loki with Fluent Bit sidecars is an excellent choice for ECS Fargate logging. The combination of low storage costs, powerful LogQL queries, and native Grafana integration makes it ideal for microservices architectures. Start with basic log aggregation and progressively add dashboards and alerts as your observability needs grow.