From 3c34f066eafa40417cfd756efd87116e4b65f294 Mon Sep 17 00:00:00 2001 From: Nick Burrett Date: Wed, 12 Nov 2014 05:30:43 -0800 Subject: [PATCH] folly::EventBase: wrap libevent calls to prevent race-condition Summary: Patch D1585087 exposes two flaws in EventBase(). It introduces IO worker threads to the ThriftServer which are constructed/destructed in parallel. Within the construction phase, a new EventBase() is instantiated for each thread and unwound in destruction. When using the BaseControllerTask (in Python), the following sequence is observed: a = event_init() [ThriftServer] b = event_init() [IO worker 1] c = event_init() [IO worker 2] ... event_base_free(c) event_base_free(b) event_base_free(a) -> segfault 1. event_init() should only ever be called once. It internally modifies a global variable in libevent, current_base to match the return value. event_base_free() will set current_base back to NULL if the passed in arg matches current_base. Therefore subsequent calls must use event_base_new(). 2. Since current_base is a global and EventBase() is called by multiple threads, it is important to guard with a mutex. The guard itself also exposed the bug because: a = event_init() [current_base = a] b = event_init() [current_base = b] ... event_base_free(b) [b == current_base -> current_base = NULL] So current_base ends up prematurely set to NULL. Test Plan: Run dba/core/daemons/dbstatus/dbstatus_tests.lpar, which no longer segfaults Reviewed By: jsedgwick@fb.com, davejwatson@fb.com Subscribers: dihde, evanelias, trunkagent, njormrod, ncoffield, lachlan, folly-diffs@ FB internal diff: D1663654 Tasks: 5545819 Signature: t1:1663654:1415732265:d51c4c4cae99c1ac371460bf18d26d4f917a3c52 Blame Revision: D1585087 --- folly/io/async/EventBase.cpp | 28 ++++++++++++++++++++++++++-- 1 file changed, 26 insertions(+), 2 deletions(-) diff --git a/folly/io/async/EventBase.cpp b/folly/io/async/EventBase.cpp index aa79aba3..0e34c5e2 100644 --- a/folly/io/async/EventBase.cpp +++ b/folly/io/async/EventBase.cpp @@ -125,6 +125,16 @@ void EventBase::CobTimeout::timeoutExpired() noexcept { delete this; } + +// The interface used to libevent is not thread-safe. Calls to +// event_init() and event_base_free() directly modify an internal +// global 'current_base', so a mutex is required to protect this. +// +// event_init() should only ever be called once. Subsequent calls +// should be made to event_base_new(). We can recognise that +// event_init() has already been called by simply inspecting current_base. +static std::mutex libevent_mutex_; + /* * EventBase methods */ @@ -133,7 +143,6 @@ EventBase::EventBase() : runOnceCallbacks_(nullptr) , stop_(false) , loopThread_(0) - , evb_(static_cast(event_init())) , queue_(nullptr) , fnRunner_(nullptr) , maxLatency_(0) @@ -144,6 +153,18 @@ EventBase::EventBase() , startWork_(0) , observer_(nullptr) , observerSampleCount_(0) { + { + std::lock_guard lock(libevent_mutex_); + + // The value 'current_base' (libevent 1) or + // 'event_global_current_base_' (libevent 2) is filled in by event_set(), + // allowing examination of its value without an explicit reference here. + // If ev.ev_base is NULL, then event_init() must be called, otherwise + // call event_base_new(). + struct event ev; + event_set(&ev, 0, 0, nullptr, nullptr); + evb_ = (ev.ev_base) ? event_base_new() : event_init(); + } if (UNLIKELY(evb_ == nullptr)) { LOG(ERROR) << "EventBase(): Failed to init event base."; folly::throwSystemError("error in EventBase::EventBase()"); @@ -202,7 +223,10 @@ EventBase::~EventBase() { // Stop consumer before deleting NotificationQueue fnRunner_->stopConsuming(); - event_base_free(evb_); + { + std::lock_guard lock(libevent_mutex_); + event_base_free(evb_); + } VLOG(5) << "EventBase(): Destroyed."; } -- 2.34.1