#1388 epoll: massive performance hit on 0.11

Reporter ge0rg
Owner Zash
Created
Updated
Stars ★ (1)
Tags
  • Status-Fixed
  • Milestone-0.11
  • Performance
  • Priority-Medium
  • Type-Defect
  1. ge0rg on

    Use Prosody 0.11 with the new epoll backend on Linux and have thousands of connections open (c2s, s2s). The CPU load rises significantly (roughly 6x) compared to lua-socket with libevent, and there are strong peaks in the daily CPU graphs, probably at times of high stanza throughput. Switching back to lua-socket immediately brings the CPU load back down to small values.
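
    For context, a minimal sketch of how the backend is selected in prosody.cfg.lua, assuming the 0.11 option name network_backend and these values (check the documentation for your version):

      -- prosody.cfg.lua (assumed option name and values for 0.11)
      network_backend = "epoll"    -- the new backend that showed the regression
      -- network_backend = "event" -- lua-socket with libevent, used for comparison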

  2. Zash on

    An inefficiency in timer handling was identified and fixed in trunk in https://hg.prosody.im/trunk/rev/6c2370f17027. I don't know if there is anything else, but I will backport that fix.

    Changes
    • tags Milestone-0.11 Status-Accepted
    • owner Zash
  3. Zash on

    Backported in https://hg.prosody.im/trunk/rev/c8c3f2eba898. A brief test showed a significant improvement in CPU usage with lots of timers. Since each connection has at least one read timeout active at any time, lots of connections means lots of timers.

    When I found this, I believe the time was going into sorting the timers, since they move around each time incoming data pushes a connection's read timeout further into the future. The sorting compares fields in tables, which would explain the time spent in table indexing that you saw. (A sketch illustrating this pattern follows below.)

    On my not-too-bad laptop, with 20000 x 60s timers, CPU usage hovered around 50% before this change and under 1% after. For comparison, server_select seemed to hover around 7%, and server_event seemed to handle many times as many no-op timers without touching the CPU.

    Changes
    • tags Status-Fixed
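
    A minimal sketch of the pattern described above, with hypothetical names rather than the actual net.server_epoll code: every connection keeps a read-timeout entry in a flat list, and each piece of incoming data reschedules that entry and re-sorts the whole list with a comparator that indexes table fields.

      -- Hypothetical illustration only; not the real net.server_epoll code.
      local timers = {}

      local function add_timer(fire_at, callback)
          local t = { time = fire_at, callback = callback }
          timers[#timers + 1] = t
          return t
      end

      -- Each time data arrives, the connection's read timeout is pushed
      -- further into the future and the whole list is re-sorted. The
      -- comparator indexes two table fields per comparison, which is
      -- where the time spent in table indexing shows up under load.
      local function reschedule(t, new_fire_at)
          t.time = new_fire_at
          table.sort(timers, function(a, b) return a.time < b.time end)
      end

    With thousands of connections this means an O(n log n) sort, with two table lookups per comparison, on essentially every incoming stanza, which matches the table-indexing time and the CPU figures reported above.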
  4. Zash on

    Yes, testing would be appreciated.

    Changes
    • tags Performance
