folly::EventBase: wrap libevent calls to prevent race conditions
Summary:
Patch D1585087 exposes two flaws in EventBase(). That patch introduces
IO worker threads to the ThriftServer, which are constructed and
destroyed in parallel.
During construction, a new EventBase() is instantiated for each thread;
it is torn down again during destruction.
When using the BaseControllerTask (in Python), the following sequence
is observed:
    a = event_init()    [ThriftServer]
    b = event_init()    [IO worker 1]
    c = event_init()    [IO worker 2]
    ...
    event_base_free(c)
    event_base_free(b)
    event_base_free(a)  -> segfault
1. event_init() should only ever be called once. Internally it sets a
global variable in libevent, current_base, to its return value.
event_base_free() resets current_base to NULL if the passed-in argument
matches current_base. Therefore every EventBase after the first must be
created with event_base_new() instead.
2. Because current_base is a global and EventBase() is constructed from
multiple threads, the libevent calls must be guarded with a mutex.
Adding the guard by itself also exposed the bug described in (1),
because:
    a = event_init()    [current_base = a]
    b = event_init()    [current_base = b]
    ...
    event_base_free(b)  [b == current_base -> current_base = NULL]
So current_base is set to NULL prematurely, while a is still in use.
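The two points above can be illustrated with a small model. This is only
a sketch, not the code in this diff: FakeBase, the fake_* functions,
makeBase() and freeBase() are hypothetical stand-ins that mimic just the
current_base bookkeeping described in this summary, so the example
builds without libevent.

```cpp
#include <cassert>
#include <mutex>

// Toy stand-ins for libevent (NOT the real library). They model only the
// current_base behaviour described in the summary.
struct FakeBase {};
static FakeBase* current_base = nullptr;

// Like event_init(): allocates a base and overwrites the global.
FakeBase* fake_event_init() {
  current_base = new FakeBase();
  return current_base;
}

// Like event_base_new(): allocates a base, leaves the global alone.
FakeBase* fake_event_base_new() { return new FakeBase(); }

// Like event_base_free(): clears the global if it points at this base.
void fake_event_base_free(FakeBase* b) {
  if (b == current_base) {
    current_base = nullptr;
  }
  delete b;
}

// Hypothetical wrapper: one mutex serializes every access to the global,
// and only the first live base is created through the init path.
static std::mutex libeventMutex;

FakeBase* makeBase() {
  std::lock_guard<std::mutex> guard(libeventMutex);
  return current_base ? fake_event_base_new() : fake_event_init();
}

void freeBase(FakeBase* b) {
  std::lock_guard<std::mutex> guard(libeventMutex);
  fake_event_base_free(b);
}
```

With the naive sequence (two fake_event_init() calls, then freeing the
second base), current_base is NULL while the first base is still alive.
With makeBase()/freeBase(), freeing a worker's base leaves current_base
pointing at the first base until that base itself is freed.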
Test Plan:
Run dba/core/daemons/dbstatus/dbstatus_tests.lpar, which no longer
segfaults.
Reviewed By: jsedgwick@fb.com, davejwatson@fb.com
Subscribers: dihde, evanelias, trunkagent, njormrod, ncoffield, lachlan, folly-diffs@
FB internal diff: D1663654
Tasks: 5545819
Signature: t1:1663654:1415732265:d51c4c4cae99c1ac371460bf18d26d4f917a3c52
Blame Revision: D1585087