Fix cron bottleneck in Labdoo server

Type: 
Website bug
Status: 
Resolved
Priority: 
High

Comments

Submitted by jordi on Sun, 11/04/2018 - 09:35

This is now resolved, see commit: https://github.com/Labdoo/Labdoo/commit/7f4229cc523834e880abc7db71a65618...

The issue here was:

- As Labdoo grows, it takes longer to recompute the global dashboards (we recompute/refresh the dashboards once every hour).
- Prior to this fix, the computation of the dashboards was done as part of the main system cron housekeeping job.
- As the computation of the dashboards took progressively longer, it prevented some other critical system housekeeping tasks from completing properly.
- The solution was to separate the computation of the dashboards from the general housekeeping cron job.
- The general housekeeping cron job continues to run once every hour at the turn of the hour (0:00, 1:00, 2:00, etc.).
- The computation of the dashboards runs once every hour at 40 minutes past the hour (0:40, 1:40, 2:40, etc.).
- Attached is a screenshot of the performance of the Labdoo server as we monitored it over the past weeks. Notice that about once a week the system became fully overloaded and was unable to recover automatically (the system was still running OK, but very slowly). With this fix, I think we should no longer see this issue. We will continue to monitor it over the coming days/weeks.
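
For illustration only, the split schedule described above could look roughly like the following crontab fragment. This is a sketch, not the content of the commit: the two script paths are hypothetical placeholders, and the actual change may trigger the jobs in a different way.

```crontab
# m   h  dom mon dow  command

# General housekeeping: every hour at the turn of the hour (0:00, 1:00, 2:00, ...).
# The script path is a hypothetical placeholder.
0     *  *   *   *    /path/to/run-housekeeping-cron.sh

# Dashboard recomputation: every hour at 40 minutes past the hour (0:40, 1:40, 2:40, ...).
# The script path is a hypothetical placeholder.
40    *  *   *   *    /path/to/refresh-dashboards.sh
```

With this layout, a slow dashboard run can no longer block the housekeeping tasks, since they are separate jobs. The 40-minute offset gives housekeeping 40 minutes before the dashboard run starts, and the dashboard run 20 minutes before the next housekeeping run.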