folly::EventBase: wrap libevent calls to prevent race-condition
Summary:
Patch D1585087 exposes two flaws in EventBase(). It introduces IO worker threads to the ThriftServer which are constructed/destructed in parallel. Within the construction phase, a new EventBase() is instantiated for each thread and unwound in destruction. When using the BaseControllerTask (in Python), the following sequence is observed:

  a = event_init()  [ThriftServer]
  b = event_init()  [IO worker 1]
  c = event_init()  [IO worker 2]
  ...
  event_base_free(c)
  event_base_free(b)
  event_base_free(a) -> segfault

1. event_init() should only ever be called once. It internally modifies a global variable in libevent, current_base, to match the return value. event_base_free() will set current_base back to NULL if the passed-in arg matches current_base. Therefore subsequent calls must use event_base_new().

2. Since current_base is a global and EventBase() is called by multiple threads, it is important to guard with a mutex. The guard itself also exposed the bug because:

  a = event_init()  [current_base = a]
  b = event_init()  [current_base = b]
  ...
  event_base_free(b)  [b == current_base -> current_base = NULL]

So current_base ends up prematurely set to NULL.

Test Plan: Run dba/core/daemons/dbstatus/dbstatus_tests.lpar, which no longer segfaults

Reviewed By: jsedgwick@fb.com, davejwatson@fb.com

Subscribers: dihde, evanelias, trunkagent, njormrod, ncoffield, lachlan, folly-diffs@

FB internal diff: D1663654

Tasks: 5545819

Signature: t1:1663654:1415732265:d51c4c4cae99c1ac371460bf18d26d4f917a3c52

Blame Revision: D1585087
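
Illustration (not the code in this diff): a minimal sketch of the two points in the summary, written against stand-ins rather than libevent itself. FakeBase, fake_event_init(), fake_event_base_new(), fake_event_base_free(), make_base(), free_base() and check_teardown_order() are hypothetical names that model only the current_base behavior described above. The real EventBase constructor may detect the "first call" differently.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical stand-ins for libevent. They model only the current_base
// behavior described in the summary and are NOT the real libevent calls.
struct FakeBase {};
static FakeBase* current_base = nullptr;

// Models event_init(): allocates a base and records it in the global.
FakeBase* fake_event_init() {
  current_base = new FakeBase();
  return current_base;
}

// Models event_base_new(): allocates a base without touching the global.
FakeBase* fake_event_base_new() { return new FakeBase(); }

// Models event_base_free(): resets the global if it matches the argument.
void fake_event_base_free(FakeBase* base) {
  if (base == current_base) current_base = nullptr;
  delete base;
}

// One mutex guards every access to the global (point 2 of the summary).
static std::mutex libevent_mutex;

// Only the first caller goes through the init path; later callers use the
// "new" path so the global is not overwritten (point 1 of the summary).
FakeBase* make_base() {
  std::lock_guard<std::mutex> lock(libevent_mutex);
  if (current_base == nullptr) return fake_event_init();
  return fake_event_base_new();
}

void free_base(FakeBase* base) {
  std::lock_guard<std::mutex> lock(libevent_mutex);
  fake_event_base_free(base);
}

// Replays the construct/destruct order from the summary.
void check_teardown_order() {
  FakeBase* a = make_base();  // [ThriftServer]
  FakeBase* b = make_base();  // [IO worker 1]
  FakeBase* c = make_base();  // [IO worker 2]
  assert(current_base == a);  // the global still refers to the first base
  free_base(c);
  free_base(b);
  assert(current_base == a);  // freeing workers no longer resets the global
  free_base(a);
  assert(current_base == nullptr);
}
```

In this model, freeing a worker's base leaves current_base untouched because only the first base was ever stored there, so the global is reset only when the first base itself is freed.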