Resolved -
This incident has been resolved.
Sep 12, 16:12 PDT
Monitoring -
We've identified and fixed a configuration issue with our backend services. Some services would shut down if they were unable to connect to a node on our log aggregation cluster. Last night, one of our log nodes died, and as traffic (and log volume) peaked this morning, we began to see erratic behavior across our infrastructure. Users experienced intermittent timeouts and degraded performance.
We'll continue to monitor performance and work to fill gaps in our metrics and alerting to mitigate issues like these in the future. We'll follow up with a public postmortem later this week.
Sep 12, 13:09 PDT
Update -
We are continuing to investigate performance issues resulting in 5XX errors across UserVoice.
Sep 12, 09:41 PDT
Investigating -
Our engineering team is investigating performance issues across UserVoice. We're still seeing bursts of timeouts. Users may see Cloudflare-branded 502 or 503 errors.
Sep 12, 06:20 PDT