API: Erroneous Credit Handling

Incident Report for Trustfull

Postmortem

Summary

On September 18, 2025, starting at 09:00 UTC, our monitoring systems detected abnormal behavior in the credit management logic of the API. A subset of customer requests (~7%) incorrectly received an "insufficient credits" error message, even though their balances were valid.

The incident was mitigated by 09:30 UTC and fully resolved by 09:50 UTC.

Impact

  • Duration: 09:00 – 09:50 UTC
  • Affected users: ~7% of customers
  • Impact: Valid requests were rejected with "insufficient credits" errors, leading to degraded service availability.

Root Cause

The issue was caused by an optimization to the credit validation logic. A conditional path added in that update misclassified certain valid balances as insufficient, so the affected requests were erroneously denied.

As a result:

  • Customers with sufficient credits intermittently received errors.
  • Monitoring alerts correctly detected anomalies in the request success rate.
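
This report does not disclose the faulty condition itself. As a purely hypothetical illustration of how a small change to a comparison can misclassify valid balances, the sketch below contrasts an assumed correct check with an assumed regressed one and runs both against a few boundary cases. The function names, the credit amounts, and the strict-comparison bug are all assumptions made for the example, not the actual Trustfull code.

```python
from decimal import Decimal

def has_sufficient_credits(balance: Decimal, cost: Decimal) -> bool:
    """Hypothetical check: allow the request when the balance covers its cost."""
    return balance >= cost

def has_sufficient_credits_regressed(balance: Decimal, cost: Decimal) -> bool:
    """Hypothetical regression: a strict comparison rejects an exact-match balance."""
    return balance > cost

# (balance, cost, expected result) for balances at and around the threshold.
BOUNDARY_CASES = [
    (Decimal("10"), Decimal("10"), True),      # exactly enough credits
    (Decimal("10.01"), Decimal("10"), True),   # just above the threshold
    (Decimal("9.99"), Decimal("10"), False),   # just below the threshold
    (Decimal("0"), Decimal("1"), False),       # empty balance
]

def find_misclassified(check):
    """Return the (balance, cost) pairs for which `check` disagrees with the expectation."""
    return [
        (balance, cost)
        for balance, cost, expected in BOUNDARY_CASES
        if check(balance, cost) != expected
    ]

if __name__ == "__main__":
    print("correct check:", find_misclassified(has_sufficient_credits))
    print("regressed check:", find_misclassified(has_sufficient_credits_regressed))
```

In this sketch only the exact-match case separates the two implementations, which shows how such a defect can affect a minority of requests while most traffic succeeds.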

Resolution

  • 09:15 UTC: On-call engineers were paged after monitoring flagged increased error rates.
  • 09:30 UTC: The team traced the issue to a recent change in the credit management logic and initiated a rollback to the last stable version.
  • 09:50 UTC: All systems were stable, and error rates returned to normal levels.

Preventive Measures

To reduce the likelihood of similar incidents:

  • Stricter Code Reviews: Credit management logic will undergo enhanced peer review and validation before deployment.
  • Pre-Deployment Testing: Test coverage for credit scenarios will be expanded, including edge cases for balances near thresholds.
  • Canary Releases: Future changes to critical billing or credit systems will be deployed using gradual rollout strategies with automatic rollback triggers.
  • Improved Monitoring: Specific metrics and alarms for anomalous "insufficient credits" responses have been added.
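
One way an automatic rollback trigger for a canary release could work is to compare the rate of "insufficient credits" responses in the canary against the stable baseline. The sketch below is a minimal illustration under stated assumptions: the `WindowStats` type, the `should_roll_back` function, and the threshold values are hypothetical and do not describe the actual Trustfull deployment tooling.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WindowStats:
    """Request counts observed over one monitoring window (hypothetical shape)."""
    total_requests: int
    insufficient_credit_errors: int

    @property
    def error_rate(self) -> float:
        # Avoid division by zero when a window saw no traffic.
        if self.total_requests == 0:
            return 0.0
        return self.insufficient_credit_errors / self.total_requests

def should_roll_back(
    baseline: WindowStats,
    canary: WindowStats,
    max_absolute_increase: float = 0.02,  # assumed tolerance: 2 percentage points
    min_requests: int = 200,              # assumed minimum sample size for the canary
) -> bool:
    """Return True when the canary's error rate exceeds the baseline's by more than the tolerance."""
    if canary.total_requests < min_requests:
        # Too little canary traffic to judge; keep observing.
        return False
    return canary.error_rate - baseline.error_rate > max_absolute_increase

if __name__ == "__main__":
    baseline = WindowStats(total_requests=10_000, insufficient_credit_errors=50)
    canary = WindowStats(total_requests=1_000, insufficient_credit_errors=70)
    print("roll back:", should_roll_back(baseline, canary))
```

Comparing against a live baseline rather than a fixed limit keeps the trigger from firing on customers who have genuinely run out of credits, and the minimum sample size guards against rolling back on a handful of early requests.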
Posted Sep 18, 2025 - 12:30 CEST

Resolved

Between 09:00 and 09:50 UTC on Sep 18, 2025, some valid requests were incorrectly rejected with "insufficient credits" errors due to a logic regression. The issue was identified and resolved via rollback.
Posted Sep 18, 2025 - 11:00 CEST