Resolved -
After a period of extended monitoring, we have confirmed the remediation has completely resolved the issue. Requests and error rates have been normal since 06:27UTC
May 13, 07:29 EDT
Identified -
We have identified the cause as a set of stale IPs for an upstream service. We are now validating the approach for remediation.
May 13, 02:26 EDT
Update -
We are still investigating. During the period, the error rate has remained low as initially reported, approximately ~0.015%.
May 13, 01:52 EDT
Investigating -
We are investigating an unexpected but small increase in the percentage of API calls failing since 01:30UTC. Approximately 0.015% of API calls are failing with a HTTP 504 (Gateway timeout) error. We will update this incident as we have more information.
May 13, 01:22 EDT