As the name suggests, per_cpu is used for data structures that need one element per available processor.
This avoids contention and gives faster access, because the element of the per_cpu data structure that is accessed corresponds "only" to the processor on which the kernel thread is running, so the data tends to stay in that processor's cache instead of bouncing between processors. This obviously means preemption must be disabled before accessing per_cpu variables; otherwise the thread could be moved to another processor in the middle of the access.
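As a rough userspace illustration of the idea (not kernel code), the sketch below models a per-CPU counter as an array with one cache-line-aligned slot per processor. NR_CPUS_MODEL, CACHE_LINE, percpu_inc and percpu_sum are hypothetical names chosen for this example, and the cpu index is passed in explicitly because the model has no real notion of "the current CPU".

```c
#include <stddef.h>

#define NR_CPUS_MODEL 4   /* assumed CPU count for this model */
#define CACHE_LINE    64  /* assumed cache line size in bytes */

/* One slot per CPU. The alignment pads each slot to a full cache line,
 * so two CPUs never write to the same line. */
struct percpu_slot {
    _Alignas(CACHE_LINE) long value;
};

static struct percpu_slot counters[NR_CPUS_MODEL];

/* Update only the slot that belongs to 'cpu'. No lock is taken; this is
 * safe only if the caller cannot move to another CPU during the update. */
void percpu_inc(int cpu)
{
    counters[cpu].value++;
}

/* A reader that wants the global total walks every slot. */
long percpu_sum(void)
{
    long total = 0;
    for (size_t i = 0; i < NR_CPUS_MODEL; i++)
        total += counters[i].value;
    return total;
}
```

The lock-free update in percpu_inc is exactly the part that relies on the "no preemption during access" assumption.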
Now, in the realtime Linux kernel (PREEMPT_RT patchset) the aim is to be as preemptible as possible, so that high priority tasks can preempt anyone and everyone. Hence the above assumption, that preemption is disabled prior to accessing per_cpu variables, breaks.
This happens because spinlocks are usually what disables preemption, but in realtime Linux almost all of these spinlocks are converted to rt-mutexes. An rt-mutex does not disable preemption, and it puts the process to sleep instead of spinning.
A task that is put to sleep does not know on which processor it will wake up. Hence, a task can be preempted while accessing a per_cpu variable and then be scheduled on another processor, while it still refers to the first processor's element. The value eventually read can be corrupted or illegal.
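To make the failure concrete, here is a small single-threaded C sketch that replays one possible interleaving by hand. It is only a model: the function name stale_slot_demo and the two "tasks" A and B are made up for the example, and the preemption and migration are simulated by the order of the statements.

```c
#define DEMO_CPUS 2   /* assumed CPU count for this model */

/* Replays a lost update on CPU 0's slot and returns the final value. */
long stale_slot_demo(void)
{
    long slot[DEMO_CPUS] = {0, 0};

    /* Task A runs on CPU 0 and starts a read-modify-write of its slot. */
    long *a_ptr = &slot[0];
    long a_tmp = *a_ptr;      /* A reads 0 */

    /* A is preempted here and will resume on CPU 1.
     * Meanwhile task B runs on CPU 0 and increments the same slot. */
    slot[0] += 1;             /* B: slot[0] is now 1 */

    /* A resumes on CPU 1 but still holds the pointer to CPU 0's slot. */
    *a_ptr = a_tmp + 1;       /* A writes 1 and overwrites B's update */

    return slot[0];           /* 1, although two increments happened */
}
```

Two increments were issued but only one survives, because task A kept working on CPU 0's element after it was no longer the only user of it.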
The solution is to declare variables as PER_CPU_LOCKED (DEFINE_PER_CPU_LOCKED, DECLARE_PER_CPU_LOCKED) instead of just PER_CPU (DEFINE_PER_CPU, DECLARE_PER_CPU).
This new macro associates a per-cpu sleeping lock (rt-mutex) with the per-cpu variable. So, even if a kernel thread accessing a per-cpu variable is scheduled on another cpu, this lock ensures that the data read is correct.
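The pairing of a lock with each per-cpu element can be sketched in userspace C as follows. This is an analogy under stated assumptions, not the PREEMPT_RT implementation: a pthread mutex stands in for the rt-mutex, and locked_percpu_init, locked_percpu_get, locked_percpu_put and locked_percpu_read are hypothetical helper names. The one design point it tries to show is that the caller remembers which element it locked, so that it can unlock the same element even if it has since moved to another cpu.

```c
#include <pthread.h>
#include <stddef.h>

#define NR_CPUS_MODEL 4   /* assumed CPU count for this model */

struct locked_slot {
    pthread_mutex_t lock; /* stands in for the per-cpu rt-mutex */
    long value;
};

static struct locked_slot slots[NR_CPUS_MODEL];

void locked_percpu_init(void)
{
    for (int i = 0; i < NR_CPUS_MODEL; i++) {
        pthread_mutex_init(&slots[i].lock, NULL);
        slots[i].value = 0;
    }
}

/* Lock the element of 'cpu' and record in *cpu_out which one was locked. */
long *locked_percpu_get(int cpu, int *cpu_out)
{
    *cpu_out = cpu;
    pthread_mutex_lock(&slots[cpu].lock);
    return &slots[cpu].value;
}

/* Unlock by the remembered index, not by the cpu we happen to run on now. */
void locked_percpu_put(int cpu_locked)
{
    pthread_mutex_unlock(&slots[cpu_locked].lock);
}

/* Read an element under its lock, from whatever cpu the caller is on. */
long locked_percpu_read(int cpu)
{
    pthread_mutex_lock(&slots[cpu].lock);
    long v = slots[cpu].value;
    pthread_mutex_unlock(&slots[cpu].lock);
    return v;
}
```

A caller would do something like `long *v = locked_percpu_get(cpu, &idx); (*v)++; locked_percpu_put(idx);`. The holder may sleep or migrate between get and put, and any other task touching the same element simply waits on the lock.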
The implication of this new macro is a performance hit: every access now takes a lock, and the per-cpu variable being read on one processor could well belong to some other processor, which gives up the cache locality that per_cpu was meant to provide.
This performance hit is acceptable, as in "realtime" we care more about "latency" and "determinism" than "overall system performance".