Loading is fully back to normal. We will do a full post-mortem internally but wanted to share initial information.
Timeline (all times Eastern):
1) ~1:35-1:50 PM: Site was extremely slow and whitescreening.
2) ~1:50-2:15 PM: Site was noticeably slower than normal.
3) ~2:15 PM: Load times fully returned to normal.
Cause (may evolve as we continue to investigate): A commonly called, cache-reliant GraphQL field saw a large increase in average DB execution time (likely due to a corresponding decrease in cache hit rate), which saturated DB resources faster than our autoscaling could add capacity. Once that was resolved, additional autoscaling was required to handle the flood of requests that followed. This completed at 2:15 PM Eastern and brought things fully back to normal.
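As a rough illustration of why a modest drop in cache hit rate can saturate a database so quickly, the sketch below computes how many requests fall through to the DB at a given hit rate. The function name and all numbers are hypothetical and chosen only for illustration; they are not measurements from this incident.

```python
def db_queries_per_second(requests_per_second: float, cache_hit_rate: float) -> float:
    """Return the number of requests per second that miss the cache and reach the DB.

    Assumes (for illustration) that every cache miss results in exactly one DB query.
    """
    if not 0.0 <= cache_hit_rate <= 1.0:
        raise ValueError("cache_hit_rate must be between 0 and 1")
    return requests_per_second * (1.0 - cache_hit_rate)


# Hypothetical traffic level and hit rates, not real incident data.
normal_load = db_queries_per_second(1000, 0.95)    # about 50 DB queries/s
degraded_load = db_queries_per_second(1000, 0.80)  # about 200 DB queries/s

# A 15-point drop in hit rate roughly quadruples DB load in this example.
load_multiplier = degraded_load / normal_load
```

Under these assumed numbers, DB load grows by about 4x even though traffic is unchanged, which is the kind of jump that can outpace autoscaling.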
Please reach out to hello@gethealthie.com with any questions. We apologize for any trouble this caused.
Posted Jan 02, 2025 - 14:47 EST
Monitoring
We are seeing response times fully back to normal. Our team continues to monitor.
Posted Jan 02, 2025 - 14:19 EST
Update
Our team has traced the issue to a non-performant database query and is working to resolve it now.
Posted Jan 02, 2025 - 13:46 EST
Investigating
Our team is investigating slow loading on the platform and high queue times at our load balancer.