Commit 9232fb7a authored by Aaryaman Sagar, committed by Facebook Github Bot

folly::lock() - a deadlock safe way to lock folly::Synchronized

Summary:
`folly::lock()` is a deadlock-safe way to acquire locks (read, write or
upgrade) on many lockables or `folly::Synchronized` objects:

```
lock(folly::wlock(one), folly::rlock(two), folly::wlock(three));
```

This runs the deadlock avoidance algorithm, acquiring write locks on `one` and
`three` and a read lock on `two`.  The unqualified `lock()` call is found
through ADL.
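
As a rough illustration of what the `wlock()` / `rlock()` helpers hand to
`lock()`, here is a minimal standard-library sketch.  `WriteLocker` and
`ReadLocker` are hypothetical names built on `std::shared_mutex`, not the folly
types.  The only thing taken from the diff is the shape: each helper exposes
`lock()` (blocking) and `tryLock()` (non-blocking), and both return an RAII
handle that tests false when the lock was not acquired.

```cpp
#include <cassert>
#include <mutex>
#include <shared_mutex>
#include <thread>

// Hypothetical adapter: exclusive (write) access to a std::shared_mutex.
struct WriteLocker {
  std::shared_mutex& mutex;
  using LockedPtr = std::unique_lock<std::shared_mutex>;
  // Blocks until the exclusive lock is held.
  LockedPtr lock() { return LockedPtr{mutex}; }
  // Never blocks; the returned handle is empty if the attempt failed.
  LockedPtr tryLock() { return LockedPtr{mutex, std::try_to_lock}; }
};

// Hypothetical adapter: shared (read) access to the same kind of mutex.
struct ReadLocker {
  std::shared_mutex& mutex;
  using LockedPtr = std::shared_lock<std::shared_mutex>;
  LockedPtr lock() { return LockedPtr{mutex}; }
  LockedPtr tryLock() { return LockedPtr{mutex, std::try_to_lock}; }
};
```

Because both adapters share the same two-method interface, a generic locking
routine can mix read and write acquisitions without knowing which is which.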

It also works on arbitrary lockables and, in the benchmarks below, generally
performs better than both `std::lock()` and acquiring the mutexes in order:

```
folly::lock(one, two, three);
```
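
Like `std::lock()`, the lockable overload returns `void` and leaves the
mutexes locked, so the caller still has to arrange for them to be unlocked.
A minimal sketch of that calling pattern, written against `std::lock()` so it
is self-contained; `Account` and `transfer` are hypothetical names, and
swapping in `folly::lock()` is assumed to work only on the basis of the
"usage is the same as std::lock()" statement in this commit.

```cpp
#include <cassert>
#include <mutex>

// Hypothetical type used only for this illustration.
struct Account {
  std::mutex mutex;
  long balance{0};
};

// Moves `amount` from one account to another while holding both mutexes.
inline void transfer(Account& from, Account& to, long amount) {
  if (&from == &to) {
    return; // locking the same std::mutex twice is undefined behaviour
  }
  // Deadlock-safe acquisition of both mutexes; both are locked on return.
  std::lock(from.mutex, to.mutex);
  // Adopt the already-held locks so they are released on scope exit.
  std::lock_guard<std::mutex> holdFrom{from.mutex, std::adopt_lock};
  std::lock_guard<std::mutex> holdTo{to.mutex, std::adopt_lock};
  from.balance -= amount;
  to.balance += amount;
}
```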

Under contention there is a big performance improvement over simply acquiring
the locks in a fixed order.  The backoff algorithm adapts to contention by
blocking on the mutex it last failed to acquire, which leaves the other
mutexes free for other threads.
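
The back-off described above can be sketched for the two-lockable case using
only the standard library.  This is an illustration of the "smart and polite"
idea, not the folly implementation; `lockBoth` is a hypothetical name.

```cpp
#include <cassert>
#include <mutex>
#include <thread>

// Hypothetical helper: lock two lockables without risking deadlock.
// On return both `a` and `b` are locked and owned by the calling thread.
template <typename LockableA, typename LockableB>
void lockBoth(LockableA& a, LockableB& b) {
  while (true) {
    {
      std::unique_lock<LockableA> held{a}; // block on a
      if (b.try_lock()) {                  // try b without blocking
        held.release();                    // keep a locked, drop RAII ownership
        return;
      }
    } // a is unlocked here because we could not get b
    std::this_thread::yield(); // be polite: let the owner of b make progress
    {
      std::unique_lock<LockableB> held{b}; // now block on the one we missed
      if (a.try_lock()) {
        held.release();
        return;
      }
    } // b is unlocked here because we could not get a
    std::this_thread::yield();
  }
}
```

A caller unlocks both mutexes itself afterwards, for example with
`a.unlock(); b.unlock();`, or adopts them into RAII guards.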

Benchmarks look promising:

```
============================================================================
folly/test/SynchronizedBenchmark.cpp            relative  time/iter  iters/s
============================================================================
ThreeThreadsPathologicalFollyLock                            3.81us  262.24K
ThreeThreadsPathologicalStdLock                              5.34us  187.28K
ThreeThreadsPathologicalOrdered                              6.36us  157.28K
ThreeThreadsPathologicalCarefullyOrdered                     4.21us  237.29K
----------------------------------------------------------------------------
TwoThreadsTwoMutexesOrdered                                260.87ns    3.83M
TwoThreadsTwoMutexesSmart                                  161.28ns    6.20M
TwoThreadsTwoMutexesPersistent                             226.25ns    4.42M
----------------------------------------------------------------------------
TwoThreadsFourMutexesOrdered                               196.01ns    5.10M
TwoThreadsFourMutexesSmart                                 196.73ns    5.08M
TwoThreadsFourMutexesPersistent                            201.70ns    4.96M
----------------------------------------------------------------------------
TwoThreadsEightMutexesOrdered                              195.76ns    5.11M
TwoThreadsEightMutexesSmart                                187.90ns    5.32M
TwoThreadsEightMutexesPersistent                           199.21ns    5.02M
----------------------------------------------------------------------------
TwoThreadsSixteenMutexesOrdered                            203.91ns    4.90M
TwoThreadsSixteenMutexesSmart                              196.30ns    5.09M
TwoThreadsSixteenMutexesPersistent                         230.64ns    4.34M
----------------------------------------------------------------------------
FourThreadsTwoMutexesOrdered                               814.98ns    1.23M
FourThreadsTwoMutexesSmart                                 559.79ns    1.79M
FourThreadsTwoMutexesPersistent                            520.90ns    1.92M
----------------------------------------------------------------------------
FourThreadsFourMutexesOrdered                              456.04ns    2.19M
FourThreadsFourMutexesSmart                                391.69ns    2.55M
FourThreadsFourMutexesPersistent                           414.56ns    2.41M
----------------------------------------------------------------------------
FourThreadsEightMutexesOrdered                             392.20ns    2.55M
FourThreadsEightMutexesSmart                               277.89ns    3.60M
FourThreadsEightMutexesPersistent                          301.98ns    3.31M
----------------------------------------------------------------------------
FourThreadsSixteenMutexesOrdered                           356.36ns    2.81M
FourThreadsSixteenMutexesSmart                             291.40ns    3.43M
FourThreadsSixteenMutexesPersistent                        292.23ns    3.42M
----------------------------------------------------------------------------
EightThreadsTwoMutexesOrdered                                1.58us  634.85K
EightThreadsTwoMutexesSmart                                  1.58us  634.85K
EightThreadsTwoMutexesPersistent                             1.56us  639.93K
----------------------------------------------------------------------------
EightThreadsFourMutexesOrdered                               1.33us  753.45K
EightThreadsFourMutexesSmart                               794.36ns  936.34K
EightThreadsFourMutexesPersistent                          831.68ns    1.21M
----------------------------------------------------------------------------
EightThreadsEightMutexesOrdered                            791.52ns    1.26M
EightThreadsEightMutexesSmart                              548.05ns    1.51M
EightThreadsEightMutexesPersistent                         563.14ns    2.78M
----------------------------------------------------------------------------
EightThreadsSixteenMutexesOrdered                          785.40ns    2.11M
EightThreadsSixteenMutexesSmart                            541.27ns    1.60M
EightThreadsSixteenMutexesPersistent                       673.49ns    1.79M
----------------------------------------------------------------------------
SixteenThreadsTwoMutexesOrdered                              1.98us  505.83K
SixteenThreadsTwoMutexesSmart                                1.85us  541.06K
SixteenThreadsTwoMutexesPersistent                           3.13us  319.53K
----------------------------------------------------------------------------
SixteenThreadsFourMutexesOrdered                             2.46us  407.07K
SixteenThreadsFourMutexesSmart                               1.68us  594.47K
SixteenThreadsFourMutexesPersistent                          1.62us  617.22K
----------------------------------------------------------------------------
SixteenThreadsEightMutexesOrdered                            1.67us  597.45K
SixteenThreadsEightMutexesSmart                              1.62us  616.83K
SixteenThreadsEightMutexesPersistent                         1.57us  637.50K
----------------------------------------------------------------------------
SixteenThreadsSixteenMutexesOrdered                          1.20us  829.93K
SixteenThreadsSixteenMutexesSmart                            1.32us  757.03K
SixteenThreadsSixteenMutexesPersistent                       1.38us  726.75K
============================================================================
```

Reviewed By: djwatson

Differential Revision: D6673876

fbshipit-source-id: b57fdafb8fc2a42c74dc0279c051cc62976a4e07
parent 5604a147
@@ -13,7 +13,6 @@
* See the License for the specific language governing permissions and
* limitations under the License.
*/
/**
* This module implements a Synchronized abstraction useful in
* mutex-based concurrency.
@@ -25,14 +24,19 @@
#pragma once
#include <folly/Function.h>
#include <folly/Likely.h>
#include <folly/LockTraits.h>
#include <folly/Preprocessor.h>
#include <folly/SharedMutex.h>
#include <folly/Traits.h>
#include <folly/Utility.h>
#include <folly/container/Foreach.h>
#include <glog/logging.h>
#include <array>
#include <mutex>
#include <tuple>
#include <type_traits>
#include <utility>
@@ -866,6 +870,156 @@ using LockedPtrType = typename std::conditional<
std::is_const<SynchronizedType>::value,
typename SynchronizedType::ConstLockedPtr,
typename SynchronizedType::LockedPtr>::type;
template <typename Synchronized>
class SynchronizedLockerBase {
public:
explicit SynchronizedLockerBase(Synchronized& sync) : synchronized{sync} {}
protected:
Synchronized& synchronized;
};
template <typename Synchronized>
class SynchronizedWLocker : public SynchronizedLockerBase<Synchronized> {
public:
using SynchronizedLockerBase<Synchronized>::SynchronizedLockerBase;
using LockedPtr = typename Synchronized::LockedPtr;
auto lock() {
return this->synchronized.wlock();
}
auto tryLock() {
return this->synchronized.tryWLock();
}
};
template <typename Synchronized>
class SynchronizedRLocker : public SynchronizedLockerBase<Synchronized> {
public:
using SynchronizedLockerBase<Synchronized>::SynchronizedLockerBase;
using LockedPtr = typename Synchronized::ConstLockedPtr;
auto lock() {
return this->synchronized.rlock();
}
auto tryLock() {
return this->synchronized.tryRLock();
}
};
template <typename Synchronized>
class SynchronizedULocker : public SynchronizedLockerBase<Synchronized> {
public:
using SynchronizedLockerBase<Synchronized>::SynchronizedLockerBase;
using LockedPtr = typename Synchronized::UpgradeLockedPtr;
auto lock() {
return this->synchronized.ulock();
}
auto tryLock() {
return this->synchronized.tryULock();
}
};
template <typename Synchronized>
class SynchronizedLocker : public SynchronizedLockerBase<Synchronized> {
public:
using SynchronizedLockerBase<Synchronized>::SynchronizedLockerBase;
using LockedPtr = typename Synchronized::LockedPtr;
auto lock() {
return this->synchronized.lock();
}
auto tryLock() {
return this->synchronized.tryLock();
}
};
template <typename Lockable>
class LockableLocker {
public:
explicit LockableLocker(Lockable& lockableIn) : lockable{lockableIn} {}
using LockedPtr = std::unique_lock<Lockable>;
auto lock() {
return std::unique_lock<Lockable>{lockable};
}
auto tryLock() {
auto lock = std::unique_lock<Lockable>{lockable, std::defer_lock};
lock.try_lock();
return lock;
}
private:
Lockable& lockable;
};
/**
* Acquire locks for multiple Synchronized<T> objects, in a deadlock-safe
* manner.
*
* The function uses the "smart and polite" algorithm from this link
* http://howardhinnant.github.io/dining_philosophers.html#Polite
*
* The gist of the algorithm is that it locks a mutex, then tries to lock the
* other mutexes in a non-blocking manner. If all the locks succeed, we are
* done. If not, we release the locks we hold, yield to allow other threads
* to continue, and then block on the mutex that we failed to acquire.
*
* This allows dynamically yielding ownership of all the mutexes but one, so
* that other threads can continue doing work and locking the other mutexes.
* See the benchmarks in folly/test/SynchronizedBenchmark.cpp for more.
*/
template <typename... SynchronizedLocker>
auto /* std::tuple<LockedPtr...> */ lock(SynchronizedLocker... lockersIn) {
// capture the list of lockers as a tuple
auto lockers = std::forward_as_tuple(lockersIn...);
// make a list of null LockedPtr instances that we will return to the caller
auto lockedPtrs = std::tuple<typename SynchronizedLocker::LockedPtr...>{};
// start by locking the first thing in the list
std::get<0>(lockedPtrs) = std::get<0>(lockers).lock();
auto indexLocked = 0;
while (true) {
auto couldLockAll = true;
folly::for_each(lockers, [&](auto& locker, auto index) {
// if we should try_lock on the current locker then do so
if (index != indexLocked) {
auto lockedPtr = locker.tryLock();
// if we were unable to lock this mutex,
//
// 1. release all the locks,
// 2. yield control to another thread to be nice
// 3. block on the mutex we failed to lock, acquire the lock
// 4. break out and set the index of the current mutex to indicate
// which mutex we have locked
if (!lockedPtr) {
// writing lockedPtrs = decltype(lockedPtrs){} does not compile on
// gcc, I believe this is a bug D7676798
lockedPtrs = std::tuple<typename SynchronizedLocker::LockedPtr...>{};
std::this_thread::yield();
folly::fetch(lockedPtrs, index) = locker.lock();
indexLocked = index;
couldLockAll = false;
return folly::loop_break;
}
// else store the locked mutex in the list we return
folly::fetch(lockedPtrs, index) = std::move(lockedPtr);
}
return folly::loop_continue;
});
if (couldLockAll) {
return lockedPtrs;
}
}
}
} // namespace detail
/**
@@ -1440,20 +1594,89 @@ class LockedGuardPtr {
SynchronizedType* const parent_{nullptr};
};
/**
* Acquire locks on many lockables or synchronized instances in such a way
* that the sequence of calls within the function does not cause deadlocks.
*
* This can often result in a performance boost compared to simply
* acquiring your locks in a fixed order, even for very simple cases.
* The algorithm tries to adjust to contention by blocking on the mutex it
* thinks is the best fit, leaving all other mutexes open to be locked by
* other threads. See the benchmarks in folly/test/SynchronizedBenchmark.cpp
* for more.
*
* This works differently from the locking algorithm in libstdc++ and is the
* recommended way to acquire mutexes in a generic, order-safe manner.
* Performance benchmarks show that this does better than the one in
* libstdc++ even for the simple cases.
*
* Usage is the same as std::lock() for arbitrary lockables
*
* folly::lock(one, two, three);
*
* To make it work with folly::Synchronized you have to specify how you want
* the locks to be acquired. Use the folly::wlock(), folly::rlock(),
* folly::ulock() and folly::lock() helpers defined below:
*
* auto [one, two] = lock(folly::wlock(a), folly::rlock(b));
*
* Note that when you use the helpers, lock() must be called without the
* folly:: namespace prefix; the function is found through ADL.
*
* This will execute the deadlock avoidance algorithm and acquire a write lock
* for a and a read lock for b
*/
template <typename LockableOne, typename LockableTwo, typename... Lockables>
void lock(LockableOne& one, LockableTwo& two, Lockables&... lockables) {
auto locks = lock(
detail::LockableLocker<LockableOne>{one},
detail::LockableLocker<LockableTwo>{two},
detail::LockableLocker<Lockables>{lockables}...);
// release ownership of the locks from the RAII lock wrapper returned by the
// function above
folly::for_each(locks, [&](auto& lock) { lock.release(); });
}
/**
* Helper functions whose results should be passed to a lock() invocation.
* They return implementation-defined structs that lock() uses to lock the
* synchronized instance appropriately.
*
* lock(folly::wlock(one), folly::rlock(two), folly::wlock(three));
*
* In the example above, rlock() produces an implementation-defined
* read-locking helper instance and wlock() a write-locking helper.
*/
template <typename Data, typename Mutex>
auto wlock(Synchronized<Data, Mutex>& synchronized) {
return detail::SynchronizedWLocker<Synchronized<Data, Mutex>>{synchronized};
}
template <typename Data, typename Mutex>
auto rlock(Synchronized<Data, Mutex>& synchronized) {
return detail::SynchronizedRLocker<Synchronized<Data, Mutex>>{synchronized};
}
template <typename Data, typename Mutex>
auto ulock(Synchronized<Data, Mutex>& synchronized) {
return detail::SynchronizedULocker<Synchronized<Data, Mutex>>{synchronized};
}
template <typename Data, typename Mutex>
auto lock(Synchronized<Data, Mutex>& synchronized) {
return detail::SynchronizedLocker<Synchronized<Data, Mutex>>{synchronized};
}
/**
* Acquire locks for multiple Synchronized<T> objects, in a deadlock-safe
* manner.
*
* The locks are acquired in order from lowest address to highest address.
* (Note that this is not necessarily the same algorithm used by std::lock().)
*
* For parameters that are const and support shared locks, a read lock is
* acquired. Otherwise an exclusive lock is acquired.
*
* TODO: Extend acquireLocked() with variadic template versions that
* allow for more than 2 Synchronized arguments. (I haven't given too much
* thought about how to implement this. It seems like it would be rather
* complicated, but I think it should be possible.)
* Use lock() with folly::wlock(), folly::rlock() and folly::ulock() for
* arbitrary locking without causing a deadlock (as far as possible), with the
* same effect as std::lock().
*/
template <class Sync1, class Sync2>
std::tuple<detail::LockedPtrType<Sync1>, detail::LockedPtrType<Sync2>>
......
This diff is collapsed.
@@ -21,6 +21,7 @@
#include <folly/Function.h>
#include <folly/LockTraitsBoost.h>
#include <folly/Portability.h>
#include <folly/ScopeGuard.h>
#include <folly/SharedMutex.h>
#include <folly/SpinLock.h>
#include <folly/portability/GTest.h>
@@ -666,6 +667,27 @@ void testTryLock(Func func) {
EXPECT_EQ(unlocked, 0);
}
}
class MutexTrack {
public:
static int gId;
static int gOrder;
void lock_shared() {}
void unlock_shared() {}
void lock() {
order = MutexTrack::gOrder++;
}
void unlock() {
order = -1;
--gOrder;
}
int current{gId++};
int order{-1};
};
int MutexTrack::gId{0};
int MutexTrack::gOrder{0};
} // namespace
TEST_F(SynchronizedLockTest, TestTryLock) {
@@ -797,4 +819,130 @@ TEST_F(SynchronizedLockTest, TestConvertTryLockToLock) {
EXPECT_EQ(value, 0);
}
TEST(FollyLockTest, TestVariadicLockWithSynchronized) {
{
auto syncs = std::array<folly::Synchronized<int>, 3>{};
auto& one = syncs[0];
auto& two = syncs[1];
auto& three = syncs[2];
auto locks =
lock(folly::wlock(one), folly::rlock(two), folly::wlock(three));
EXPECT_TRUE(std::get<0>(locks));
EXPECT_TRUE(std::get<1>(locks));
EXPECT_TRUE(std::get<2>(locks));
}
{
auto syncs = std::array<folly::Synchronized<int, std::mutex>, 2>{};
auto locks = lock(folly::lock(syncs[0]), folly::lock(syncs[1]));
EXPECT_TRUE(std::get<0>(locks));
EXPECT_TRUE(std::get<1>(locks));
}
}
TEST(FollyLockTest, TestVariadicLockWithArbitraryLockables) {
auto&& one = std::mutex{};
auto&& two = std::mutex{};
auto lckOne = std::unique_lock<std::mutex>{one, std::defer_lock};
auto lckTwo = std::unique_lock<std::mutex>{two, std::defer_lock};
folly::lock(lckOne, lckTwo);
EXPECT_TRUE(lckOne);
EXPECT_TRUE(lckTwo);
}
namespace {
struct TestLock {
public:
void lock() {
onLock();
++numTimesLocked;
}
bool try_lock() {
if (shouldTryLockSucceed) {
lock();
return true;
}
return false;
}
void unlock() {
onUnlock();
++numTimesUnlocked;
}
int numTimesLocked{0};
int numTimesUnlocked{0};
bool shouldTryLockSucceed{true};
std::function<void()> onLock{[] {}};
std::function<void()> onUnlock{[] {}};
};
} // namespace
TEST(FollyLockTest, TestVariadicLockSmartAndPoliteAlgorithm) {
auto one = TestLock{};
auto two = TestLock{};
auto three = TestLock{};
auto makeReset = [&] {
return folly::makeGuard([&] {
one = TestLock{};
two = TestLock{};
three = TestLock{};
});
};
{
auto reset = makeReset();
folly::lock(one, two, three);
EXPECT_EQ(one.numTimesLocked, 1);
EXPECT_EQ(one.numTimesUnlocked, 0);
EXPECT_EQ(two.numTimesLocked, 1);
EXPECT_EQ(two.numTimesUnlocked, 0);
EXPECT_EQ(three.numTimesLocked, 1);
EXPECT_EQ(three.numTimesUnlocked, 0);
}
{
auto reset = makeReset();
two.shouldTryLockSucceed = false;
folly::lock(one, two, three);
EXPECT_EQ(one.numTimesLocked, 2);
EXPECT_EQ(one.numTimesUnlocked, 1);
EXPECT_EQ(two.numTimesLocked, 1);
EXPECT_EQ(two.numTimesUnlocked, 0);
EXPECT_EQ(three.numTimesLocked, 1);
EXPECT_EQ(three.numTimesUnlocked, 0);
}
{
auto reset = makeReset();
three.shouldTryLockSucceed = false;
folly::lock(one, two, three);
EXPECT_EQ(one.numTimesLocked, 2);
EXPECT_EQ(one.numTimesUnlocked, 1);
EXPECT_EQ(two.numTimesLocked, 2);
EXPECT_EQ(two.numTimesUnlocked, 1);
EXPECT_EQ(three.numTimesLocked, 1);
EXPECT_EQ(three.numTimesUnlocked, 0);
}
{
auto reset = makeReset();
three.shouldTryLockSucceed = false;
three.onLock = [&] {
// when three gets locked make one fail
one.shouldTryLockSucceed = false;
// then when one gets locked make three succeed to finish the test
one.onLock = [&] { three.shouldTryLockSucceed = true; };
};
folly::lock(one, two, three);
EXPECT_EQ(one.numTimesLocked, 2);
EXPECT_EQ(one.numTimesUnlocked, 1);
EXPECT_EQ(two.numTimesLocked, 2);
EXPECT_EQ(two.numTimesUnlocked, 1);
EXPECT_EQ(three.numTimesLocked, 2);
EXPECT_EQ(three.numTimesUnlocked, 1);
}
}
} // namespace folly