Let’s talk locks!





Linux futex 使用方式大致如下:





虽然futex不会白白地耗费CPU,但是因为有了context switch, 所以多个线程冲突的时候延迟会非常高:12个冲突线程从130ns -> 900ns(0.9us)


go sync.Mutex在上面做了一些优化:



  1. 如何使用profiling工具定位竞争冲突 https://github.com/iovisor/bcc/issues/892 http://brendangregg.com/offcpuanalysis.html
  2. 避免/减少竞争冲突:lock-free, granular lock, 缩小关键区域 http://www.1024cores.net/ http://www.ece.ubc.ca/~sasha/papers/eurosys16-final29.pdf
  3. content-aware scheduler https://web.eecs.umich.edu/~mozafari/php/data/uploads/lock-schd-report.pdf



Spinning is two-level: (1) an idle M with an associated P spins looking for new G’s, (2) an M w/o an associated P spins waiting for available P’s. There are at most GOMAXPROCS spinning M’s (both (1) and (2)). Idle M’s of type (1) do not block while there are idle M’s of type (2). When a new G is spawned, or M enters syscall, or M transitions from idle to busy, it ensures that there is at least 1 spinning M (or all P’s are busy). This ensures that there are no runnable G’s that can be otherwise running; and avoids excessive M blocking/unblocking at the same time. Spinning is mostly passive (yield to OS, sched_yield()), but may include a little bit of active spinning (loop burnging CPU) (requires investigation and tuning).


  1. Try out LIFO scheduling, this will improve locality. However, it still must provide some degree of fairness and gracefully handle yielding goroutines.
  2. Better locality of G-to-P. Try to enqueue an unblocked G to a P on which it was last running.
  3. Throttling of M creation. The scheduler can be easily forced to create thousands of M's per second until OS refuses to create more threads. M’s must be created promptly up to k*GOMAXPROCS, after that new M’s may added by a timer.
  4. Do not allocate G and stack until the goroutine first runs. For a newly created goroutine we need just callerpc, fn, narg, nret and args, that is, about 6 words. This will allow to create a lot of running-to-completion goroutines with significantly lower memory overhead.
  5. Better locality of P-to-M. Try to execute P on the same M it was last running.