Elevated 5xx API error rates and increased latency for some SparkPost and SparkPost Enterprise customers (EU hosted only)
Incident Report for SparkPost


On May 7 2019, during routine operational maintenance on SparkPost EU, the central API service tier experienced an outage. This prevented many APIs from working for 15 minutes: Transmissions, Tracking Domains, Sending Domains, Templates, Inbound Domains and Relay Webhook APIs. Additionally, the engagement tracking service failed to record opens and clicks and click-through redirects didn’t work -- meaning that links did not work for recipients who clicked a link in an email during this incident. The Status Page incident did not explicitly state that engagement tracking was impacted.

Impact Period: Tuesday 7 May 2019 15:40 - 15:55 UTC

Root Cause:

A key software component that handles incoming HTTP traffic for APIs and engagement tracking failed to restart because of an invalid configuration. This caused the APIs to return 502 errors to any HTTP requests, including link redirects. Additionally, our alerting framework made it difficult to identify all the impacted components - especially engagement tracking.

Corrective Actions:

At SparkPost we take these kinds of incidents very seriously. We know that our email service is critical for our customers and we know that it’s imperative that we give full, accurate and timely information regarding operational incidents. To that end, we have identified a number of corrective actions to address the gaps that surfaced in this incident:

** Automate the relevant deployment process to ensure that a misconfigured node cannot be put into production

** Isolate engagement tracking inhouse alerting services from API alerting services

** Create a separate Status Page component for engagement tracking

Posted May 08, 2019 - 13:00 EDT

This incident has been resolved.
Posted May 07, 2019 - 12:05 EDT
A fix has been implemented and we are monitoring the results.
Posted May 07, 2019 - 11:56 EDT
We are continuing to investigate this issue.
Posted May 07, 2019 - 11:55 EDT
We are experiencing an elevated level of API and UI errors and latency for our APIs for some SparkPost and SparkPost Enterprise customers (including Transmissions API and SMTP injection API). Please retry any 5xx error.
Note: This issue does not impact our SparkPost Enterprise customers hosted in the US.
Posted May 07, 2019 - 11:47 EDT
This incident affected: Sending IPs API (Sending IPs API - EUROPE), Events API (Events API - EUROPE), Transmissions API (Transmissions API - EUROPE), SMTP API (SMTP API - EUROPE), Sending Domains API (Sending Domains API - EUROPE), Tracking Domains API (Tracking Domains API - EUROPE), Relay Webhooks Delivery Service (Relay Webhooks - EUROPE), SparkPost Application (WebUI) (SparkPost Application - EUROPE), IP Pools API (IP Pools API - EUROPE), Metrics API (Metrics API - EUROPE), and Templates API (Templates API - EUROPE).