folly::lock() - a deadlock safe way to lock folly::Synchronized

Summary:
`folly::lock()` is a deadlock safe way to acquire write locks on many lockables or `folly::Synchronized` objects

```
lock(folly::wlock(one), folly::rlock(two), folly::wlock(three));
```

This executes the deadlock avoidance algorithm on a write lock for `one` and `three` and a read lock for `two`. ADL lookup is done for the `lock()` function.

It can also work on arbitrary lockables and performs better than both `std::lock()` and acquiring the mutexes in order

```
folly::lock(one, two, three);
```

There is a big performance improvement compared to simply acquiring locks in the same order in the presence of contention. The backoff algorithm tries to adjust to contention and block on the mutex that it thinks is the best fit.

Benchmarks look promising

```
============================================================================
folly/test/SynchronizedBenchmark.cpp            relative  time/iter  iters/s
============================================================================
ThreeThreadsPathologicalFollyLock                           3.81us   262.24K
ThreeThreadsPathologicalStdLock                             5.34us   187.28K
ThreeThreadsPathologicalOrdered                             6.36us   157.28K
ThreeThreadsPathologicalCarefullyOrdered                    4.21us   237.29K
----------------------------------------------------------------------------
TwoThreadsTwoMutexesOrdered                               260.87ns     3.83M
TwoThreadsTwoMutexesSmart                                 161.28ns     6.20M
TwoThreadsTwoMutexesPersistent                            226.25ns     4.42M
----------------------------------------------------------------------------
TwoThreadsFourMutexesOrdered                              196.01ns     5.10M
TwoThreadsFourMutexesSmart                                196.73ns     5.08M
TwoThreadsFourMutexesPersistent                           201.70ns     4.96M
----------------------------------------------------------------------------
TwoThreadsEightMutexesOrdered                             195.76ns     5.11M
TwoThreadsEightMutexesSmart                               187.90ns     5.32M
TwoThreadsEightMutexesPersistent                          199.21ns     5.02M
----------------------------------------------------------------------------
TwoThreadsSixteenMutexesOrdered                           203.91ns     4.90M
TwoThreadsSixteenMutexesSmart                             196.30ns     5.09M
TwoThreadsSixteenMutexesPersistent                        230.64ns     4.34M
----------------------------------------------------------------------------
FourThreadsTwoMutexesOrdered                              814.98ns     1.23M
FourThreadsTwoMutexesSmart                                559.79ns     1.79M
FourThreadsTwoMutexesPersistent                           520.90ns     1.92M
----------------------------------------------------------------------------
FourThreadsFourMutexesOrdered                             456.04ns     2.19M
FourThreadsFourMutexesSmart                               391.69ns     2.55M
FourThreadsFourMutexesPersistent                          414.56ns     2.41M
----------------------------------------------------------------------------
FourThreadsEightMutexesOrdered                            392.20ns     2.55M
FourThreadsEightMutexesSmart                              277.89ns     3.60M
FourThreadsEightMutexesPersistent                         301.98ns     3.31M
----------------------------------------------------------------------------
FourThreadsSixteenMutexesOrdered                          356.36ns     2.81M
FourThreadsSixteenMutexesSmart                            291.40ns     3.43M
FourThreadsSixteenMutexesPersistent                       292.23ns     3.42M
----------------------------------------------------------------------------
EightThreadsTwoMutexesOrdered                               1.58us   634.85K
EightThreadsTwoMutexesSmart                                 1.58us   634.85K
EightThreadsTwoMutexesPersistent                            1.56us   639.93K
----------------------------------------------------------------------------
EightThreadsFourMutexesOrdered                              1.33us   753.45K
EightThreadsFourMutexesSmart                              794.36ns   936.34K
EightThreadsFourMutexesPersistent                         831.68ns     1.21M
----------------------------------------------------------------------------
EightThreadsEightMutexesOrdered                           791.52ns     1.26M
EightThreadsEightMutexesSmart                             548.05ns     1.51M
EightThreadsEightMutexesPersistent                        563.14ns     2.78M
----------------------------------------------------------------------------
EightThreadsSixteenMutexesOrdered                         785.40ns     2.11M
EightThreadsSixteenMutexesSmart                           541.27ns     1.60M
EightThreadsSixteenMutexesPersistent                      673.49ns     1.79M
----------------------------------------------------------------------------
SixteenThreadsTwoMutexesOrdered                             1.98us   505.83K
SixteenThreadsTwoMutexesSmart                               1.85us   541.06K
SixteenThreadsTwoMutexesPersistent                          3.13us   319.53K
----------------------------------------------------------------------------
SixteenThreadsFourMutexesOrdered                            2.46us   407.07K
SixteenThreadsFourMutexesSmart                              1.68us   594.47K
SixteenThreadsFourMutexesPersistent                         1.62us   617.22K
----------------------------------------------------------------------------
SixteenThreadsEightMutexesOrdered                           1.67us   597.45K
SixteenThreadsEightMutexesSmart                             1.62us   616.83K
SixteenThreadsEightMutexesPersistent                        1.57us   637.50K
----------------------------------------------------------------------------
SixteenThreadsSixteenMutexesOrdered                         1.20us   829.93K
SixteenThreadsSixteenMutexesSmart                           1.32us   757.03K
SixteenThreadsSixteenMutexesPersistent                      1.38us   726.75K
============================================================================
```

Reviewed By: djwatson

Differential Revision: D6673876

fbshipit-source-id: b57fdafb8fc2a42c74dc0279c051cc62976a4e07
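
For readers without a folly checkout, the backoff strategy mentioned above (the header comment in `folly/Synchronized.h` calls it the "smart and polite" algorithm) can be sketched over plain standard lockables. This is a minimal illustration, not the code added by this commit: `politeLockAll` and `withAllLocked` are hypothetical names, and the vector-of-pointers interface is an assumption made to keep the sketch short.

```cpp
#include <cstddef>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical helper (not part of folly): lock every mutex in `mutexes`
// without relying on a global lock order.
template <typename Mutex>
void politeLockAll(const std::vector<Mutex*>& mutexes) {
  if (mutexes.empty()) {
    return;
  }

  // Block on one mutex; every other mutex is only ever try_lock()ed.
  std::size_t held = 0;
  mutexes[held]->lock();

  while (true) {
    // Index of the first mutex whose try_lock() failed, or size() if none did.
    std::size_t failed = mutexes.size();
    for (std::size_t i = 0; i < mutexes.size(); ++i) {
      if (i != held && !mutexes[i]->try_lock()) {
        failed = i;
        break;
      }
    }
    if (failed == mutexes.size()) {
      return; // every mutex is now owned by the caller
    }

    // Back off: release the mutexes acquired via try_lock() in this round
    // (indices below `failed`), then the one we had blocked on.
    for (std::size_t i = 0; i < failed; ++i) {
      if (i != held) {
        mutexes[i]->unlock();
      }
    }
    mutexes[held]->unlock();

    // Be polite, then block on the contended mutex and retry from there.
    std::this_thread::yield();
    mutexes[failed]->lock();
    held = failed;
  }
}

// Usage sketch: acquire three mutexes, then hand ownership to RAII guards
// so that all three are released when the function returns.
inline void withAllLocked(std::mutex& a, std::mutex& b, std::mutex& c) {
  politeLockAll(std::vector<std::mutex*>{&a, &b, &c});
  std::lock_guard<std::mutex> guardA{a, std::adopt_lock};
  std::lock_guard<std::mutex> guardB{b, std::adopt_lock};
  std::lock_guard<std::mutex> guardC{c, std::adopt_lock};
  // ... critical section touching state protected by a, b and c ...
}
```

Because the thread only ever blocks while holding nothing else, no cycle of waiting threads can form. The price is that a contended round releases and re-acquires locks, which is the behaviour the `Smart` rows of the benchmark measure against fixed-order locking.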

Commit 9232fb7a authored Apr 22, 2018 by Aaryaman Sagar Committed by Facebook Github Bot Apr 22, 2018
Showing with 1156 additions and 6 deletions

folly/Synchronized.h +229 -6

folly/test/SynchronizedBenchmark.cpp +779 -0

folly/test/SynchronizedTest.cpp +148 -0

--- a/folly/Synchronized.h
+++ b/folly/Synchronized.h
@@ -13,7 +13,6 @@
 * See the License for the specific language governing permissions and
 * limitations under the License.
 */
-
 /**
 * This module implements a Synchronized abstraction useful in
 * mutex-based concurrency.
@@ -25,14 +24,19 @@

 #pragma once

+#include <folly/Function.h>
 #include <folly/Likely.h>
 #include <folly/LockTraits.h>
 #include <folly/Preprocessor.h>
 #include <folly/SharedMutex.h>
 #include <folly/Traits.h>
 #include <folly/Utility.h>
+#include <folly/container/Foreach.h>
 #include <glog/logging.h>
+
+#include <array>
 #include <mutex>
+#include <tuple>
 #include <type_traits>
 #include <utility>

@@ -866,6 +870,156 @@ using LockedPtrType = typename std::conditional<
    std::is_const<SynchronizedType>::value,
    typename SynchronizedType::ConstLockedPtr,
    typename SynchronizedType::LockedPtr>::type;
+
+template <typename Synchronized>
+class SynchronizedLockerBase {
+ public:
+  explicit SynchronizedLockerBase(Synchronized& sync) : synchronized{sync} {}
+
+ protected:
+  Synchronized& synchronized;
+};
+
+template <typename Synchronized>
+class SynchronizedWLocker : public SynchronizedLockerBase<Synchronized> {
+ public:
+  using SynchronizedLockerBase<Synchronized>::SynchronizedLockerBase;
+  using LockedPtr = typename Synchronized::LockedPtr;
+
+  auto lock() {
+    return this->synchronized.wlock();
+  }
+  auto tryLock() {
+    return this->synchronized.tryWLock();
+  }
+};
+template <typename Synchronized>
+class SynchronizedRLocker : public SynchronizedLockerBase<Synchronized> {
+ public:
+  using SynchronizedLockerBase<Synchronized>::SynchronizedLockerBase;
+  using LockedPtr = typename Synchronized::ConstLockedPtr;
+
+  auto lock() {
+    return this->synchronized.rlock();
+  }
+  auto tryLock() {
+    return this->synchronized.tryRLock();
+  }
+};
+template <typename Synchronized>
+class SynchronizedULocker : public SynchronizedLockerBase<Synchronized> {
+ public:
+  using SynchronizedLockerBase<Synchronized>::SynchronizedLockerBase;
+  using LockedPtr = typename Synchronized::UpgradeLockedPtr;
+
+  auto lock() {
+    return this->synchronized.ulock();
+  }
+  auto tryLock() {
+    return this->synchronized.tryULock();
+  }
+};
+template <typename Synchronized>
+class SynchronizedLocker : public SynchronizedLockerBase<Synchronized> {
+ public:
+  using SynchronizedLockerBase<Synchronized>::SynchronizedLockerBase;
+  using LockedPtr = typename Synchronized::LockedPtr;
+
+  auto lock() {
+    return this->synchronized.lock();
+  }
+  auto tryLock() {
+    return this->synchronized.tryLock();
+  }
+};
+template <typename Lockable>
+class LockableLocker {
+ public:
+  explicit LockableLocker(Lockable& lockableIn) : lockable{lockableIn} {}
+  using LockedPtr = std::unique_lock<Lockable>;
+
+  auto lock() {
+    return std::unique_lock<Lockable>{lockable};
+  }
+  auto tryLock() {
+    auto lock = std::unique_lock<Lockable>{lockable, std::defer_lock};
+    lock.try_lock();
+    return lock;
+  }
+
+ private:
+  Lockable& lockable;
+};
+
+/**
+ * Acquire locks for multiple Synchronized<T> objects, in a deadlock-safe
+ * manner.
+ *
+ * The function uses the "smart and polite" algorithm from this link
+ * http://howardhinnant.github.io/dining_philosophers.html#Polite
+ *
+ * The gist of the algorithm is that it locks a mutex, then tries to lock the
+ * other mutexes in a non-blocking manner.  If all the locks succeed, we are
+ * done, if not, we release the locks we have held, yield to allow other
+ * threads to continue and then block on the mutex that we failed to acquire.
+ *
+ * This allows dynamically yielding ownership of all the mutexes but one, so
+ * that other threads can continue doing work and locking the other mutexes.
+ * See the benchmarks in folly/test/SynchronizedBenchmark.cpp for more.
+ */
+template <typename... SynchronizedLocker>
+auto /* std::tuple<LockedPtr...> */ lock(SynchronizedLocker... lockersIn) {
+  // capture the list of lockers as a tuple
+  auto lockers = std::forward_as_tuple(lockersIn...);
+
+  // make a list of null LockedPtr instances that we will return to the caller
+  auto lockedPtrs = std::tuple<typename SynchronizedLocker::LockedPtr...>{};
+
+  // start by locking the first thing in the list
+  std::get<0>(lockedPtrs) = std::get<0>(lockers).lock();
+  auto indexLocked = 0;
+
+  while (true) {
+    auto couldLockAll = true;
+
+    folly::for_each(lockers, [&](auto& locker, auto index) {
+      // if we should try_lock on the current locker then do so
+      if (index != indexLocked) {
+        auto lockedPtr = locker.tryLock();
+
+        // if we were unable to lock this mutex,
+        //
+        // 1. release all the locks,
+        // 2. yield control to another thread to be nice
+        // 3. block on the mutex we failed to lock, acquire the lock
+        // 4. break out and set the index of the current mutex to indicate
+        //    which mutex we have locked
+        if (!lockedPtr) {
+          // writing lockedPtrs = decltype(lockedPtrs){} does not compile on
+          // gcc, I believe this is a bug D7676798
+          lockedPtrs = std::tuple<typename SynchronizedLocker::LockedPtr...>{};
+
+          std::this_thread::yield();
+          folly::fetch(lockedPtrs, index) = locker.lock();
+          indexLocked = index;
+          couldLockAll = false;
+
+          return folly::loop_break;
+        }
+
+        // else store the locked mutex in the list we return
+        folly::fetch(lockedPtrs, index) = std::move(lockedPtr);
+      }
+
+      return folly::loop_continue;
+    });
+
+    if (couldLockAll) {
+      return lockedPtrs;
+    }
+  }
+}
+
 } // namespace detail

 /**
@@ -1440,20 +1594,89 @@ class LockedGuardPtr {
  SynchronizedType* const parent_{nullptr};
 };

+/**
+ * Acquire locks on many lockables or synchronized instances in such a way
+ * that the sequence of calls within the function does not cause deadlocks.
+ *
+ * This can often result in a performance boost as compared to simply
+ * acquiring your locks in an ordered manner, even for very simple cases.
+ * The algorithm tries to adjust to contention by blocking on the mutex it
+ * thinks is the best fit, leaving all other mutexes open to be locked by
+ * other threads.  See the benchmarks in folly/test/SynchronizedBenchmark.cpp
+ * for more.
+ *
+ * This works differently as compared to the locking algorithm in libstdc++
+ * and is the recommended way to acquire mutexes in a generic order safe
+ * manner.  Performance benchmarks show that this does better than the one in
+ * libstdc++ even for the simple cases
+ *
+ * Usage is the same as std::lock() for arbitrary lockables
+ *
+ *    folly::lock(one, two, three);
+ *
+ * To make it work with folly::Synchronized you have to specify how you want
+ * the locks to be acquired, use the folly::wlock(), folly::rlock(),
+ * folly::ulock() and folly::lock() helpers defined below
+ *
+ *    auto [one, two] = lock(folly::wlock(a), folly::rlock(b));
+ *
+ * Note that you can/must avoid the folly:: namespace prefix on the lock()
+ * function if you use the helpers; ADL is used to find the lock function.
+ *
+ * This will execute the deadlock avoidance algorithm and acquire a write lock
+ * for a and a read lock for b
+ */
+template <typename LockableOne, typename LockableTwo, typename... Lockables>
+void lock(LockableOne& one, LockableTwo& two, Lockables&... lockables) {
+  auto locks = lock(
+      detail::LockableLocker<LockableOne>{one},
+      detail::LockableLocker<LockableTwo>{two},
+      detail::LockableLocker<Lockables>{lockables}...);
+
+  // release ownership of the locks from the RAII lock wrapper returned by the
+  // function above
+  folly::for_each(locks, [&](auto& lock) { lock.release(); });
+}
+
+/**
+ * Helper functions that should be passed to a lock() invocation, these return
+ * implementation defined structs that lock() will use to lock the
+ * synchronized instance appropriately.
+ *
+ *    lock(folly::wlock(one), folly::rlock(two), folly::wlock(three));
+ *
+ * For example in the above rlock() produces an implementation defined read
+ * locking helper instance and wlock() a write locking helper
+ */
+template <typename Data, typename Mutex>
+auto wlock(Synchronized<Data, Mutex>& synchronized) {
+  return detail::SynchronizedWLocker<Synchronized<Data, Mutex>>{synchronized};
+}
+template <typename Data, typename Mutex>
+auto rlock(Synchronized<Data, Mutex>& synchronized) {
+  return detail::SynchronizedRLocker<Synchronized<Data, Mutex>>{synchronized};
+}
+template <typename Data, typename Mutex>
+auto ulock(Synchronized<Data, Mutex>& synchronized) {
+  return detail::SynchronizedULocker<Synchronized<Data, Mutex>>{synchronized};
+}
+template <typename Data, typename Mutex>
+auto lock(Synchronized<Data, Mutex>& synchronized) {
+  return detail::SynchronizedLocker<Synchronized<Data, Mutex>>{synchronized};
+}
+
 /**
 * Acquire locks for multiple Synchronized<T> objects, in a deadlock-safe
 * manner.
 *
 * The locks are acquired in order from lowest address to highest address.
 * (Note that this is not necessarily the same algorithm used by std::lock().)
- *
 * For parameters that are const and support shared locks, a read lock is
 * acquired.  Otherwise an exclusive lock is acquired.
 *
- * TODO: Extend acquireLocked() with variadic template versions that
- * allow for more than 2 Synchronized arguments.  (I haven't given too much
- * thought about how to implement this.  It seems like it would be rather
- * complicated, but I think it should be possible.)
+ * Use lock() with folly::wlock(), folly::rlock() and folly::ulock() for
+ * arbitrary locking without causing a deadlock (as much as possible), with the
+ * same effects as std::lock().
 */
 template <class Sync1, class Sync2>
 std::tuple<detail::LockedPtrType<Sync1>, detail::LockedPtrType<Sync2>>

--- a/folly/test/SynchronizedBenchmark.cpp
+++ b/folly/test/SynchronizedBenchmark.cpp
--- a/folly/test/SynchronizedTest.cpp
+++ b/folly/test/SynchronizedTest.cpp
@@ -21,6 +21,7 @@
 #include <folly/Function.h>
 #include <folly/LockTraitsBoost.h>
 #include <folly/Portability.h>
+#include <folly/ScopeGuard.h>
 #include <folly/SharedMutex.h>
 #include <folly/SpinLock.h>
 #include <folly/portability/GTest.h>
@@ -666,6 +667,27 @@ void testTryLock(Func func) {
    EXPECT_EQ(unlocked, 0);
  }
 }
+
+class MutexTrack {
+ public:
+  static int gId;
+  static int gOrder;
+
+  void lock_shared() {}
+  void unlock_shared() {}
+  void lock() {
+    order = MutexTrack::gOrder++;
+  }
+  void unlock() {
+    order = -1;
+    --gOrder;
+  }
+
+  int current{gId++};
+  int order{-1};
+};
+int MutexTrack::gId{0};
+int MutexTrack::gOrder{0};
 } // namespace

 TEST_F(SynchronizedLockTest, TestTryLock) {
@@ -797,4 +819,130 @@ TEST_F(SynchronizedLockTest, TestConvertTryLockToLock) {
  EXPECT_EQ(value, 0);
 }

+TEST(FollyLockTest, TestVariadicLockWithSynchronized) {
+  {
+    auto syncs = std::array<folly::Synchronized<int>, 3>{};
+    auto& one = syncs[0];
+    auto& two = syncs[1];
+    auto& three = syncs[2];
+    auto locks =
+        lock(folly::wlock(one), folly::rlock(two), folly::wlock(three));
+    EXPECT_TRUE(std::get<0>(locks));
+    EXPECT_TRUE(std::get<1>(locks));
+    EXPECT_TRUE(std::get<2>(locks));
+  }
+  {
+    auto syncs = std::array<folly::Synchronized<int, std::mutex>, 2>{};
+    auto locks = lock(folly::lock(syncs[0]), folly::lock(syncs[1]));
+    EXPECT_TRUE(std::get<0>(locks));
+    EXPECT_TRUE(std::get<1>(locks));
+  }
+}
+
+TEST(FollyLockTest, TestVariadicLockWithArbitraryLockables) {
+  auto&& one = std::mutex{};
+  auto&& two = std::mutex{};
+
+  auto lckOne = std::unique_lock<std::mutex>{one, std::defer_lock};
+  auto lckTwo = std::unique_lock<std::mutex>{two, std::defer_lock};
+  folly::lock(lckOne, lckTwo);
+  EXPECT_TRUE(lckOne);
+  EXPECT_TRUE(lckTwo);
+}
+
+namespace {
+struct TestLock {
+ public:
+  void lock() {
+    onLock();
+    ++numTimesLocked;
+  }
+  bool try_lock() {
+    if (shouldTryLockSucceed) {
+      lock();
+      return true;
+    }
+    return false;
+  }
+  void unlock() {
+    onUnlock();
+    ++numTimesUnlocked;
+  }
+
+  int numTimesLocked{0};
+  int numTimesUnlocked{0};
+  bool shouldTryLockSucceed{true};
+  std::function<void()> onLock{[] {}};
+  std::function<void()> onUnlock{[] {}};
+};
+} // namespace
+
+TEST(FollyLockTest, TestVariadicLockSmartAndPoliteAlgorithm) {
+  auto one = TestLock{};
+  auto two = TestLock{};
+  auto three = TestLock{};
+  auto makeReset = [&] {
+    return folly::makeGuard([&] {
+      one = TestLock{};
+      two = TestLock{};
+      three = TestLock{};
+    });
+  };
+
+  {
+    auto reset = makeReset();
+    folly::lock(one, two, three);
+    EXPECT_EQ(one.numTimesLocked, 1);
+    EXPECT_EQ(one.numTimesUnlocked, 0);
+    EXPECT_EQ(two.numTimesLocked, 1);
+    EXPECT_EQ(two.numTimesUnlocked, 0);
+    EXPECT_EQ(three.numTimesLocked, 1);
+    EXPECT_EQ(three.numTimesUnlocked, 0);
+  }
+
+  {
+    auto reset = makeReset();
+    two.shouldTryLockSucceed = false;
+    folly::lock(one, two, three);
+    EXPECT_EQ(one.numTimesLocked, 2);
+    EXPECT_EQ(one.numTimesUnlocked, 1);
+    EXPECT_EQ(two.numTimesLocked, 1);
+    EXPECT_EQ(two.numTimesUnlocked, 0);
+    EXPECT_EQ(three.numTimesLocked, 1);
+    EXPECT_EQ(three.numTimesUnlocked, 0);
+  }
+
+  {
+    auto reset = makeReset();
+    three.shouldTryLockSucceed = false;
+    folly::lock(one, two, three);
+    EXPECT_EQ(one.numTimesLocked, 2);
+    EXPECT_EQ(one.numTimesUnlocked, 1);
+    EXPECT_EQ(two.numTimesLocked, 2);
+    EXPECT_EQ(two.numTimesUnlocked, 1);
+    EXPECT_EQ(three.numTimesLocked, 1);
+    EXPECT_EQ(three.numTimesUnlocked, 0);
+  }
+
+  {
+    auto reset = makeReset();
+    three.shouldTryLockSucceed = false;
+
+    three.onLock = [&] {
+      // when three gets locked make one fail
+      one.shouldTryLockSucceed = false;
+      // then when one gets locked make three succeed to finish the test
+      one.onLock = [&] { three.shouldTryLockSucceed = true; };
+    };
+
+    folly::lock(one, two, three);
+    EXPECT_EQ(one.numTimesLocked, 2);
+    EXPECT_EQ(one.numTimesUnlocked, 1);
+    EXPECT_EQ(two.numTimesLocked, 2);
+    EXPECT_EQ(two.numTimesUnlocked, 1);
+    EXPECT_EQ(three.numTimesLocked, 2);
+    EXPECT_EQ(three.numTimesUnlocked, 1);
+  }
+}
+
 } // namespace folly
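
The header comment in `folly/Synchronized.h` states that usage for arbitrary lockables is the same as `std::lock()`. As a rough illustration of the problem both functions address, the sketch below has two threads request the same pair of mutexes in opposite argument order. It uses `std::lock()` as a stand-in so that it builds without folly. `runOppositeOrderDemo` is a hypothetical name, and the assumption that `folly::lock(first, second)` could be swapped in on the marked line is not verified here.

```cpp
#include <functional>
#include <mutex>
#include <thread>

// Hypothetical demo (not from the commit): two threads take the same two
// mutexes in opposite argument order and bump a shared counter.
inline long runOppositeOrderDemo(int iterations) {
  std::mutex a;
  std::mutex b;
  long counter = 0; // protected by holding both a and b

  auto worker = [&](std::mutex& first, std::mutex& second) {
    for (int i = 0; i < iterations; ++i) {
      // Stand-in for folly::lock(first, second): acquires both mutexes
      // without deadlocking, whatever order the arguments are given in.
      std::lock(first, second);
      // Adopt the already-held mutexes so they unlock at end of scope.
      std::lock_guard<std::mutex> guardFirst{first, std::adopt_lock};
      std::lock_guard<std::mutex> guardSecond{second, std::adopt_lock};
      ++counter;
    }
  };

  std::thread t1{worker, std::ref(a), std::ref(b)};
  std::thread t2{worker, std::ref(b), std::ref(a)};
  t1.join();
  t2.join();
  return counter;
}
```

Replacing the `std::lock()` call with `first.lock(); second.lock();` would let the two threads each hold one mutex and wait forever for the other, which is the situation a deadlock-avoiding multi-lock is meant to rule out.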