ClimsTech

Multi-tenant SaaS · Logging transformation

Following one request across dozens of services

A central logging platform designed around correlation, structured fields, retention, access and actionable investigation.

ELK · Elasticsearch · Kibana

180 GB

of logs processed daily

45 min

per investigation (down from 3 hours)

36

services covered

84%

less time on individual servers

In brief

A multi-tenant SaaS platform generated logs across applications, servers and cloud services, and engineers connected to individual machines during incidents. ClimsTech centralised collection, enriched events with structured context and request correlation, created dashboards and alerts, and set retention and access policies. Investigation moved from machine-by-machine searching to shared, correlated analysis.

Working constraints

  • Multiple applications and formats
  • High daily log volume
  • Sensitive tenant information
  • Different retention requirements
  • Existing server access practices
  • Need for correlation
  • Search performance and cost

The problem

What was actually going wrong

A multi-tenant SaaS platform generated logs across applications, servers, and cloud services. Engineers connected to individual machines during incidents and struggled to trace a request across service boundaries.

What discovery surfaced

  1. Log formats differed significantly.
  2. Timestamps and identifiers were inconsistent.
  3. Some logs contained unnecessary sensitive data.
  4. Important logs could be overwritten locally.
  5. Engineers relied on server access.
  6. Retention was not based on operational need.

The engineering

What we built and changed

1. Source inventory

Applications, servers, databases, and cloud services were mapped.

2. Collection and parsing

Log shippers and processing rules transformed data into consistent fields.
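
As an illustration of what "consistent fields" can mean in practice, here is a minimal Python sketch that maps two differently shaped log lines onto one schema. The input layouts, the field names (`@timestamp`, `level`, `service`, `message`) and the assumption that plain-text timestamps are already in UTC are all hypothetical; the engagement's actual processing rules are not described here.

```python
import json
import re
from datetime import datetime, timezone

# Hypothetical plain-text layout: "2024-05-01 12:00:00 ERROR billing: card declined"
PLAIN = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}) "
    r"(?P<level>[A-Z]+) (?P<service>[\w-]+): (?P<message>.*)$"
)


def to_utc_iso(value):
    """Convert epoch seconds or 'YYYY-MM-DD HH:MM:SS' (assumed UTC) to ISO 8601."""
    if isinstance(value, (int, float)):
        moment = datetime.fromtimestamp(value, tz=timezone.utc)
    else:
        moment = datetime.strptime(value, "%Y-%m-%d %H:%M:%S").replace(tzinfo=timezone.utc)
    return moment.isoformat()


def normalise(line):
    """Return one event dict with the same four fields, whatever the input shape."""
    line = line.strip()
    if line.startswith("{"):
        raw = json.loads(line)
        return {
            "@timestamp": to_utc_iso(raw.get("time", raw.get("ts"))),
            "level": str(raw.get("level", raw.get("severity", "INFO"))).upper(),
            "service": raw.get("service", "unknown"),
            "message": raw.get("msg", raw.get("message", "")),
        }
    match = PLAIN.match(line)
    if match is None:
        # Keep unparseable lines rather than dropping them silently.
        return {"@timestamp": None, "level": "UNKNOWN", "service": "unknown", "message": line}
    fields = match.groupdict()
    fields["@timestamp"] = to_utc_iso(fields.pop("ts"))
    return fields
```

Both a JSON line and a plain-text line come out with the same keys, which is what allows a single search or dashboard to cover every service.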

3. Correlation

Request, tenant, environment, service, and severity identifiers were added where appropriate.
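
One way to attach these identifiers at the source is sketched below using Python's standard `logging` and `contextvars` modules. This is an assumption about how a service might do it, not a description of the tooling used in the engagement; the names `ContextFilter`, `JsonFormatter`, `build_logger` and `handle_request` are hypothetical.

```python
import contextvars
import io
import json
import logging
import uuid

# Per-request context; the default "-" marks events logged outside a request.
request_id_var = contextvars.ContextVar("request_id", default="-")
tenant_var = contextvars.ContextVar("tenant", default="-")


class ContextFilter(logging.Filter):
    """Copy request, tenant, service and environment onto every record."""

    def __init__(self, service, environment):
        super().__init__()
        self.service = service
        self.environment = environment

    def filter(self, record):
        record.request_id = request_id_var.get()
        record.tenant = tenant_var.get()
        record.service = self.service
        record.environment = self.environment
        return True


class JsonFormatter(logging.Formatter):
    """Emit one JSON object per event so downstream parsing stays trivial."""

    def format(self, record):
        return json.dumps({
            "severity": record.levelname,
            "message": record.getMessage(),
            "request_id": record.request_id,
            "tenant": record.tenant,
            "service": record.service,
            "environment": record.environment,
        })


def build_logger(name, service, environment, stream):
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    handler.addFilter(ContextFilter(service, environment))
    logger.handlers = [handler]
    return logger


def handle_request(logger, tenant, incoming_request_id=None):
    """Reuse the caller's request ID if one arrived, otherwise mint a new one."""
    request_id = incoming_request_id or uuid.uuid4().hex
    request_id_var.set(request_id)
    tenant_var.set(tenant)
    logger.info("request received")
    return request_id


if __name__ == "__main__":
    buffer = io.StringIO()
    demo = build_logger("demo.billing", "billing", "staging", buffer)
    handle_request(demo, "tenant-a", "req-123")
    print(buffer.getvalue().strip())
```

The important design choice is that a downstream service reuses the incoming request ID instead of generating its own, so every hop logs the same value.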

4. Retention and access

Different categories received different retention and role-based access.
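
A per-category policy of this kind can be expressed as plain data. The sketch below is illustrative only: the category names, retention periods and roles are invented for the example and are not the values chosen in the engagement.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Policy:
    retention_days: int
    reader_roles: frozenset


# Hypothetical categories, periods and roles, for illustration only.
POLICIES = {
    "application": Policy(30, frozenset({"engineer", "support"})),
    "audit": Policy(365, frozenset({"security"})),
    "debug": Policy(7, frozenset({"engineer"})),
}


def is_expired(category, event_time, now):
    """True once an event is older than its category's retention period."""
    return now - event_time > timedelta(days=POLICIES[category].retention_days)


def can_read(role, category):
    """Role-based access check: only listed roles may read a category."""
    return role in POLICIES[category].reader_roles
```

Keeping the policy in one table makes it easy to review which categories are kept for how long, and who can see them.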

5. Operational use

Dashboards, search patterns, and alerts supported common investigation scenarios.

Incident investigation moved from machine-by-machine searching to shared, structured, and correlated analysis.

The architecture

Before and after

Before
  • Applications and services — inconsistent log formats
  • Local server log storage — overwritable
  • No request correlation identifiers
  • Per-server access for incident investigation
  • No structured retention or access policy
After
  • Applications, servers, databases, cloud services
  • Log shippers
  • Processing and enrichment
  • Elasticsearch
  • Kibana dashboards
  • Alerts

Judgement calls

Decisions that shaped the outcome

Why not retain everything indefinitely?

Retention cost and search performance must align with business and operational need.

Why prioritise request identifiers?

Correlation enables engineers to follow one transaction through multiple services.
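
To make the idea concrete, the sketch below follows one request through a set of already-collected events, and shows a search body in Elasticsearch query DSL that expresses the same lookup. The field names `request_id` and `@timestamp` are assumptions, as is the premise that `request_id` is indexed as an exact-match (keyword) field and that timestamps share one ISO 8601 UTC format.

```python
def trace_request(events, request_id):
    """Return every event for one request, ordered by timestamp.

    Assumes timestamps are ISO 8601 strings in the same UTC format,
    so that string order matches time order.
    """
    matching = [event for event in events if event.get("request_id") == request_id]
    return sorted(matching, key=lambda event: event["@timestamp"])


def build_trace_query(request_id):
    """Search body for the same lookup, assuming an exact-match request_id field."""
    return {
        "query": {"term": {"request_id": request_id}},
        "sort": [{"@timestamp": {"order": "asc"}}],
    }
```

With a shared identifier, the question "what happened to this request?" becomes a single filter rather than a search on each server in turn.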

Why remove secrets and unnecessary personal data?

Logs are operational records, not unrestricted data stores.
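
A redaction step applied before events are stored might look like the following sketch. The list of sensitive keys and the email pattern are hypothetical and deliberately simple; a real rule set would be driven by what discovery finds in each source.

```python
import re

# Hypothetical list of field names whose values should never be stored.
SENSITIVE_KEYS = {"password", "token", "authorization", "api_key", "secret"}

# Deliberately simple email pattern, for illustration only.
EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")


def redact(event):
    """Return a copy of the event with secrets masked and emails replaced."""
    clean = {}
    for key, value in event.items():
        if key.lower() in SENSITIVE_KEYS:
            clean[key] = "[REDACTED]"
        elif isinstance(value, dict):
            clean[key] = redact(value)  # recurse into nested objects such as headers
        elif isinstance(value, str):
            clean[key] = EMAIL.sub("[EMAIL]", value)
        else:
            clean[key] = value
    return clean
```

The function returns a new dict and leaves the input untouched, so the original event never needs to be written anywhere in unredacted form.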

Verified outcomes

What changed for the business

  • 180 GB processed daily
  • Investigation reduced from 3 hours to 45 minutes
  • Correlation improved by 73%
  • Server-access time reduced by 84%
  • Repeat errors identified 60% faster
  • 36 services covered
  • Critical alert response improved by 52%

What this engagement proves

Central logging becomes valuable when data is structured around investigation, not merely collected in one place.
