Fix cron bottleneck in Labdoo server



Website bug


Submitted by jordi on Sun, 11/04/2018 - 09:35

This is now resolved, see commit:

The issue here was:

- As Labdoo grows, it takes longer to recompute the global dashboards (we recompute/refresh the dashboards once every hour).
- Prior to this fix, the computation of the dashboards was done as part of the main system cron housekeeping job.
- As the computation of the dashboards started taking progressively longer, it prevented some other critical system housekeeping tasks from completing properly.
- The solution was to separate the computation of the dashboards from the general housekeeping cron job.
- The general housekeeping cron job continues to run once every hour at the turn of the hour (0:00, 1:00, 2:00, etc.).
- The computation of the dashboards now runs once every hour at 40 minutes past the hour (0:40, 1:40, 2:40, etc.).
- Attached is a screenshot of the performance of the Labdoo server as we monitored it over the past weeks. Notice that about once a week the system became fully overloaded and was unable to recover automatically (it was still running OK, but very slowly). With this fix, I think we should no longer see this issue. We will continue to monitor it over the coming days/weeks.
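
For anyone who wants to reason about the new timing, here is a minimal Python sketch of the two hourly schedules described above. It is only an illustration, not the code from the commit: the names `HOUSEKEEPING_MINUTE`, `DASHBOARD_MINUTE`, `next_run` and `head_start_minutes` are made up for this example, and the cron expressions in the comments are the standard way to write such schedules, not necessarily what the server's crontab contains.

```python
from datetime import datetime, timedelta

# Hypothetical constants mirroring the schedule described in this post.
# In standard cron syntax these would be "0 * * * *" and "40 * * * *";
# the actual entries on the Labdoo server are an assumption here.
HOUSEKEEPING_MINUTE = 0   # general housekeeping: at the turn of the hour
DASHBOARD_MINUTE = 40     # dashboard recomputation: 40 minutes past the hour


def next_run(now: datetime, minute: int) -> datetime:
    """Return the first time strictly after `now` that is `minute` past an hour."""
    candidate = now.replace(minute=minute, second=0, microsecond=0)
    if candidate <= now:
        # This hour's slot has already passed (or is right now): use the next hour.
        candidate += timedelta(hours=1)
    return candidate


def head_start_minutes(first: int, second: int) -> int:
    """Minutes the job at `first` has before the job at `second` starts."""
    return (second - first) % 60


if __name__ == "__main__":
    now = datetime(2018, 11, 4, 9, 35)
    print("next housekeeping run:", next_run(now, HOUSEKEEPING_MINUTE))
    print("next dashboard run:   ", next_run(now, DASHBOARD_MINUTE))
    print("housekeeping head start (min):",
          head_start_minutes(HOUSEKEEPING_MINUTE, DASHBOARD_MINUTE))
    print("dashboard head start (min):   ",
          head_start_minutes(DASHBOARD_MINUTE, HOUSEKEEPING_MINUTE))
```

With this offset, housekeeping gets 40 minutes before the dashboards start, and the dashboards get 20 minutes before the next housekeeping run. If the dashboard computation ever grows beyond that, the two jobs would overlap again, which is one more reason to keep monitoring.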