I was recently working on a good-sized client site and checking on the health of its application servers. I noticed that each app server was running a few of its cores much harder than the others. This was in the evening, and the site gets most of its traffic during the day. The site runs Django under mod_wsgi in daemon mode, with 8 processes and 25 threads per process. The boxes were not VPSs/VMs but dedicated multicore machines. So they had multicore hardware, and the web server was running in a multicore-friendly way.
At the time, the load on each box was around 0.5. Various process IDs rotated as the top CPU users, so no single process was bound to the busy core. The utilization across the cores (ignoring the actual core numbers and ranking cores by how busy they were, since each box was different) was something like:
Core # : Core Utilization
1 : 15%
2 : 2%
3 : 1%
all others : 0%
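For anyone who wants to reproduce numbers like these without eyeballing top, here is a minimal sketch that computes per-core utilization from two snapshots of /proc/stat. It assumes a Linux box; the function names (`parse_cpu_times`, `utilization`, `sample`) are my own for illustration and not part of any tool mentioned above.

```python
import time


def parse_cpu_times(stat_text):
    """Return {cpu_name: (busy_jiffies, total_jiffies)} for each per-core line."""
    times = {}
    for line in stat_text.splitlines():
        fields = line.split()
        # Per-core lines look like "cpu0 ..."; the bare "cpu" line is the aggregate.
        if not fields or not fields[0].startswith("cpu") or fields[0] == "cpu":
            continue
        values = [int(v) for v in fields[1:]]
        # Columns: user nice system idle iowait irq softirq steal guest guest_nice
        idle = values[3] + (values[4] if len(values) > 4 else 0)
        total = sum(values[:8])  # guest time is already included in user/nice
        times[fields[0]] = (total - idle, total)
    return times


def utilization(before, after):
    """Percent of non-idle time per core between two parsed snapshots."""
    result = {}
    for cpu, (busy1, total1) in after.items():
        if cpu not in before:
            continue
        busy0, total0 = before[cpu]
        delta_total = total1 - total0
        result[cpu] = 100.0 * (busy1 - busy0) / delta_total if delta_total > 0 else 0.0
    return result


def sample(interval=1.0, path="/proc/stat"):
    """Take two snapshots `interval` seconds apart and return per-core utilization."""
    with open(path) as f:
        before = parse_cpu_times(f.read())
    time.sleep(interval)
    with open(path) as f:
        after = parse_cpu_times(f.read())
    return utilization(before, after)
```

Calling `sample()` on a Linux machine returns a dict such as `{"cpu0": ..., "cpu1": ...}`. Sorting the values in descending order gives the same "ranked by utilization" view as the table above.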
So why would one core bear most of the load? I googled and googled and found little useful information. I hammered one server with Apache’s benchmarking tool (“ab”) while watching core utilization and, sure enough, all cores shared the load equally. So what was going on?
I’m not sure whether it’s the Linux kernel or a natural outcome of CPU caches, but the simplest explanation is that, in low-load situations, similar processes will flock to the same core because of cache locality. Rather than spreading across a set of cores that don’t necessarily share the same cache, processes naturally gravitate to the cores where they experience the fewest cache misses.
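One way to watch this flocking directly is to ask the kernel which core each daemon process last ran on. The sketch below assumes Linux and reads the "processor" field (field 39) of /proc/<pid>/stat; the helper names are hypothetical, and as far as I understand it the value reflects the process's main thread, not each of its worker threads.

```python
import os


def last_cpu_from_stat(stat_line):
    """Extract the 'processor' field (field 39) from a /proc/<pid>/stat line."""
    # The command name (field 2) is wrapped in parentheses and may itself
    # contain spaces or parentheses, so split on the LAST ')'.
    rest = stat_line.rsplit(")", 1)[1].split()
    # rest[0] is field 3 (state), so field 39 sits at index 36.
    return int(rest[36])


def last_cpu(pid):
    """Return the CPU number the given process last executed on (Linux only)."""
    with open(f"/proc/{pid}/stat") as f:
        return last_cpu_from_stat(f.read())


def allowed_cpus(pid):
    """Return the CPUs the process is allowed to run on (Linux only)."""
    return sorted(os.sched_getaffinity(pid))
```

If `allowed_cpus(pid)` lists every core but repeated calls to `last_cpu(pid)` for the daemon processes keep returning the same one or two numbers, the processes are not pinned; they are simply being scheduled onto the same cores again and again.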
Upshot: it’s rational for the system to schedule most operations of a process, or of a group of similar processes, on one core when the system is relatively lightly loaded. This is especially true if the “cores” are Hyper-Threading siblings that share resources (read: their caches)!
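To check whether the busy "cores" on a box are really Hyper-Threading siblings, Linux exposes the sibling groups in sysfs. This is a rough sketch assuming the usual /sys/devices/system/cpu/cpuN/topology/thread_siblings_list files are present; `parse_cpu_list` and `sibling_groups` are names I made up for the example.

```python
import glob


def parse_cpu_list(text):
    """Parse a kernel CPU list such as "0,4" or "0-3,8" into a sorted list of ints."""
    cpus = set()
    for part in text.strip().split(","):
        if not part:
            continue
        if "-" in part:
            low, high = part.split("-")
            cpus.update(range(int(low), int(high) + 1))
        else:
            cpus.add(int(part))
    return sorted(cpus)


def sibling_groups(base="/sys/devices/system/cpu"):
    """Return the distinct groups of logical CPUs that share one physical core."""
    groups = set()
    for path in glob.glob(f"{base}/cpu[0-9]*/topology/thread_siblings_list"):
        with open(path) as f:
            groups.add(tuple(parse_cpu_list(f.read())))
    return sorted(groups)
```

If the logical CPUs that show most of the utilization fall into the same group returned by `sibling_groups()`, they are two hardware threads of one physical core, which fits the shared-cache explanation above.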