- 12 Oct, 2018 3 commits
-
-
Nathan Bronson authored
Summary: Make StringPiece and MutableStringPiece implicitly convertible to std::string_view. This diff also adds FOLLY_HAS_STRING_VIEW to Portability.h, which will be defined to be 0 or 1. Reviewed By: vitaut Differential Revision: D10187660 fbshipit-source-id: 90e6f3ac82e47eae423dab8f1de1d39e5ac997c4
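A minimal sketch of the new conversion, assuming a build where `FOLLY_HAS_STRING_VIEW` is defined to 1 (the exact header providing the macro is an assumption):
```cpp
#include <iostream>
#include <string_view>

#include <folly/Range.h>

void takesView(std::string_view sv) { std::cout << sv << '\n'; }

int main() {
  folly::StringPiece piece{"hello"};
#if FOLLY_HAS_STRING_VIEW
  takesView(piece); // implicit StringPiece -> std::string_view conversion
#endif
}
```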
-
Yedidya Feldblum authored
Summary: [Folly] `exchange` in `DelayedDestruction` move-ctor. Reviewed By: knekritz Differential Revision: D10320776 fbshipit-source-id: b6b1f910b927098e61f6f6c8e414fdfcca843033
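The general pattern, sketched with a hypothetical type rather than `DelayedDestruction` itself (`std::exchange` leaves the moved-from object in a well-defined state in a single expression):
```cpp
#include <utility>

struct Handle {
  int fd{-1};

  Handle() = default;
  // Move constructor: take other's fd and reset it to the empty state.
  Handle(Handle&& other) noexcept : fd(std::exchange(other.fd, -1)) {}
};
```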
-
Dan Melnic authored
Summary: Clear the zerocopy maps in AsyncSocket::failAllWrites if the fd_ has been closed Reviewed By: djwatson Differential Revision: D10347608 fbshipit-source-id: a60d0450b13d4989d6849ca4cbe1057ecb6b92dd
-
- 11 Oct, 2018 5 commits
-
-
Lee Howes authored
Summary: Convert Future::then to Future::thenValue to remove calls to deprecated functions. Reviewed By: Orvid Differential Revision: D10346695 fbshipit-source-id: 7f9641ef5b7f93ac4376a52724d42502eb1703b9
-
Orvid King authored
Summary: MSVC doesn't support inline assembly. Reviewed By: yfeldblum, aary Differential Revision: D10345334 fbshipit-source-id: 5b95d9f6bd8158fe426e01fa176ecfa732cf5bc3
-
Aaryaman Sagar authored
Summary: DistributedMutex is a small, exclusive-only mutex that distributes the bookkeeping required for mutual exclusion in the stacks of threads that are contending for it. It aims for a lower space cost than std::mutex while maintaining the fairness benefits that come from using std::mutex. DistributedMutex provides the entire API included in std::mutex, and more, with slight modifications. DistributedMutex is the same width as a single pointer (8 bytes on most platforms), whereas std::mutex and pthread_mutex_t are both 40 bytes. It is larger than some of the other smaller locks, but the vast majority of cases using the small locks waste the difference in alignment padding anyway.

Benchmark results are good - at the time of writing, in the common uncontended case it is 30% faster than some of the other small mutexes in folly and as fast as std::mutex, which recently optimized its uncontended path. In the contended case, it is about 4-5x faster than some of the smaller locks in folly, ~2x faster than std::mutex in clang, and ~1.8x faster in gcc. DistributedMutex is also resistant to tail-latency pathologies, unlike many of the other small mutexes, which sleep for large time quanta to reduce spin churn; this causes elevated latencies for threads that enter the sleep cycle. The tail latency of lock acquisition is on average up to 10x better with DistributedMutex.

DistributedMutex reduces cache-line contention by making each thread wait on a thread-local spinlock and futex. This allows threads to keep working only on their own cache lines without requiring cache-coherence operations when a mutex is under heavy contention. This strategy does not require sequential ordering on the centralized atomic storage for wakeup operations, as each thread is assigned its own wait state.

Non-timed mutex acquisitions are scheduled through intrusive LIFO contention chains. Each thread starts by spinning for a short quantum and falls back to two-phase sleeping. Enqueue operations are lock-free and are piggybacked off mutex acquisition attempts. The LIFO behavior of a contention chain is good in the case where the mutex is held for a short amount of time, as the head of the chain is likely not to have slept on futex() after exhausting its spin quantum. This allows us to avoid unnecessary traversal and syscalls in the fast path with a higher probability.

Even though the contention chains are LIFO, the mutex itself does not adhere to that scheduling policy globally. During contention, threads that fail to lock the mutex form a LIFO chain on the central mutex state; this chain is broken when a wakeup is scheduled, and future enqueue operations form a new chain. This makes the chains themselves LIFO, but preserves global fairness through a constant factor which is limited to the number of concurrent failed mutex acquisition attempts. This binds the last-in-first-out behavior to the number of contending threads and helps prevent starvation and latency outliers.

This strategy of waking up waiters one by one in a queue does not scale well when the number of threads goes past the number of cores, at which point preemption causes elevated lock acquisition latencies. DistributedMutex implements a hardware timestamp publishing heuristic to detect and adapt to preemption.

DistributedMutex does not have the typical mutex API - it does not satisfy the Lockable concept. It requires the user to maintain ephemeral bookkeeping and pass that bookkeeping around to unlock() calls.
The API overhead, however, comes for free when you wrap this mutex for use with folly::Synchronized or std::unique_lock, which is the recommended usage (std::lock_guard, in optimized mode, has no performance benefit over std::unique_lock, so it has been omitted). A benefit of this API is that it disallows incorrect usage where a thread unlocks a mutex that it does not own, thinking a mutex is functionally identical to a binary semaphore, which, unlike a mutex, is a suitable primitive for that usage. Timed locking through DistributedMutex is implemented through a centralized algorithm - all waiters wait on the central mutex state, setting and resetting bits within the pointer-length word. Since pointer-length atomic integers are incompatible with futex(FUTEX_WAIT) on most systems, a non-standard implementation of futex() is used, where wait queues are managed in user space. See p1135r0 and folly::ParkingLot. Reviewed By: djwatson Differential Revision: D8949918 fbshipit-source-id: a772a70114e943ff68525c990da45e32ad1a5077
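A minimal usage sketch of the raw API described above; the header path and the exact proxy/unlock signature are assumptions, and wrapping in folly::Synchronized or std::unique_lock remains the recommended usage:
```cpp
#include <utility>

#include <folly/synchronization/DistributedMutex.h>

folly::DistributedMutex mutex;
int counter = 0;

void increment() {
  auto proxy = mutex.lock();      // returns the ephemeral bookkeeping
  ++counter;
  mutex.unlock(std::move(proxy)); // bookkeeping must be handed back
}
```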
-
Yedidya Feldblum authored
Summary: [Folly] Reimplement `SCOPE_SUCCESS` and `SCOPE_FAIL` on top of `ScopeGuardImpl`. Allows the underlying type to be safely move-constructed, in case the compiler does not do RVO copy elision in pre-C++17 builds. Fixes #940. Reviewed By: ericniebler Differential Revision: D10120199 fbshipit-source-id: 889b867041338d6b76a0af5f04076b16e3120f63
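A short usage sketch of the two macros being reimplemented:
```cpp
#include <cstdio>

#include <folly/ScopeGuard.h>

void doWork() {
  // Runs only if the scope exits normally.
  SCOPE_SUCCESS { std::puts("scope exited without an exception"); };
  // Runs only if the scope exits because an exception is unwinding.
  SCOPE_FAIL { std::puts("scope exited during exception unwinding"); };
  // ... work that may throw ...
}
```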
-
Yedidya Feldblum authored
Summary: [Folly] Disable `-Wdeprecated-declarations` in cmake builds. To hide deprecation spam in unit-tests for uses of deprecated declarations which cannot yet be removed. Reviewed By: Orvid Differential Revision: D10287167 fbshipit-source-id: 2dddd9f69ca1dcced61e4a2d6cd16bad54b2a2ad
-
- 10 Oct, 2018 8 commits
-
-
Lee Howes authored
Summary: Remove, from elsewhere in Future.h, uses of the form of Future::then that takes a continuation with no parameters, as it is deprecated and pending deletion. In particular, this required splitting the multi forms of the functions which still allow flexible typing at this point in time. Reviewed By: yfeldblum Differential Revision: D10206118 fbshipit-source-id: bb2ae241d3bfd5b89cae077e667cb39c4cc62797
-
Dan Melnic authored
Summary: Do not check zeroCopyEnabled_ in AsyncSocket::isZeroCopyMsg to avoid an IOBuf leak when zeroCopyEnabled_ changes from true to false Reviewed By: yfeldblum Differential Revision: D10285664 fbshipit-source-id: 5f13426f6140668d029a89786eb9a92466d4b1f4
-
Michael Bolin authored
Summary: I was reading about the various approaches to small-string optimization: https://shaharmike.com/cpp/std-string/. I noticed that Clang can store up to 22 bytes inline but FBString claims it can do 23, so I read through the code to figure out where the extra byte came from. Specifically, I wasn't sure how it could have space to store the size as well as ensure the buffer was null-terminated. After playing with the code for a bit (this was further complicated because running the code under ASAN changes the behavior, which I didn't realize before I started this exploration), I saw that we don't store the size, but `maxSmallSize - size`, so that null-termination works out even when `size==23`. This updates the docs to hopefully save someone else this same exploration. (Note: this ignores all push blocking failures!) Reviewed By: ot Differential Revision: D10258831 fbshipit-source-id: bfc0dd7ae55518af4173625bd719cfd4778180cc
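A toy illustration of the trick described above (not FBString's actual layout or member names):
```cpp
#include <cassert>
#include <cstddef>

// The last byte of the inline buffer stores maxSmallSize - size, so when
// all 23 bytes are used that byte is 0 and doubles as the null terminator.
constexpr std::size_t maxSmallSize = 23;

struct SmallString {
  char bytes_[maxSmallSize + 1]{};

  void setSmallSize(std::size_t s) {
    assert(s <= maxSmallSize);
    bytes_[maxSmallSize] = static_cast<char>(maxSmallSize - s);
    bytes_[s] = '\0'; // when s == 23, this rewrites the same 0 byte
  }
  std::size_t smallSize() const {
    return maxSmallSize - static_cast<std::size_t>(bytes_[maxSmallSize]);
  }
};
```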
-
Yedidya Feldblum authored
Summary: [Folly] Fix cmake tests build after {D10244149}, which added `instructions::dispatch`. Reviewed By: ot Differential Revision: D10287193 fbshipit-source-id: f555229c8fcebccf61cb5577f5bf3edee9c46163
-
Yedidya Feldblum authored
Summary: [Folly] DRY template args in `Future::thenImplementation`. Explicitly passing the template arguments which are also the types of non-template arguments is weird. Either limit the explicit template arguments to those which are not also the types of non-template arguments, or pass everything as non-template arguments. Reviewed By: andriigrynenko Differential Revision: D10247910 fbshipit-source-id: 64a56c5ea2072803781143b5247145ec2ee09ae0
-
Hunter Zhang authored
Summary: As described in the title. Reviewed By: Orvid Differential Revision: D10279251 fbshipit-source-id: 0ab66033193026ccb0b6a06e57557a822ccb8a70
-
Joe Loser authored
Summary:
- A couple of tests use pragma push/pop to avoid the `-Wself-move` warning in code that does a move-assignment with itself.
- Use `static_cast<T&&>` instead of `std::move`, which still suppresses the warning without the need for pragma push/pop.
Pull Request resolved: https://github.com/facebook/folly/pull/946 Reviewed By: Orvid Differential Revision: D10253159 Pulled By: yfeldblum fbshipit-source-id: 9d75db95018e6115a195f246dbd93b0a5935d80d
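The replacement pattern, sketched on a hypothetical test snippet:
```cpp
#include <vector>

// A deliberate self-move-assignment, written with static_cast so that
// clang's -Wself-move does not fire the way it would on std::move(v).
void selfMoveAssign() {
  std::vector<int> v{1, 2, 3};
  v = static_cast<std::vector<int>&&>(v); // equivalent to v = std::move(v)
}
```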
-
Yedidya Feldblum authored
Summary: [Folly] Enable `Fingerprint` test in CMake builds. Reviewed By: Orvid Differential Revision: D10081558 fbshipit-source-id: e4f753feed5b2698dd485cf32403c1c4d15d0c25
-
- 09 Oct, 2018 5 commits
-
-
Bill Earl authored
Summary: TSAN complains about a lock inversion between folly/io/async/Request.cpp lines 242 and 277. The code at line 242 was changed to use folly::acquireLocked() in D9797715, but the code at line 277 needs to be changed in the same way (except for requesting a write lock on the child). Reviewed By: kennyyu Differential Revision: D10251470 fbshipit-source-id: c476a3664bfe83edbddbd402cf568eac885b9123
-
Wez Furlong authored
Summary: These were failing in the watchman build
```
$ cd build && python fbcode_builder/shell_builder.py > ~/run.sh && bash ~/run.sh
Traceback (most recent call last):
  File "fbcode_builder/shell_builder.py", line 110, in <module>
    steps = make_steps(builder)
  File "/home/travis/build/facebook/watchman/build/fbcode_builder/utils.py", line 94, in <lambda>
    steps_for_spec(builder, config['fbcode_builder_spec'](builder))
  File "/home/travis/build/facebook/watchman/build/fbcode_builder/fbcode_builder.py", line 144, in build
    return [self.setup(), self.diagnostics()] + steps
  File "fbcode_builder/shell_builder.py", line 61, in setup
    ).format(ccache_dir=ccache_dir)
  File "/home/travis/build/facebook/watchman/build/fbcode_builder/shell_quoting.py", line 64, in format
    (k, shell_quote(v).do_not_use_raw_str) for k, v in kwargs.items()
KeyError: u'CC'
```
Reviewed By: snarkmaster fbshipit-source-id: 614723e631a82f277739765a920731c872700c45
-
Giuseppe Ottaviano authored
Summary: Add a simple function to dispatch the `instructions` types based on current CPU (overridable in tests). Reviewed By: luciang Differential Revision: D10244149 fbshipit-source-id: ac0c1ff0d14402fa77c939361398db6122dea728
-
Greg McGary authored
Summary: For determining proper use of `strerror_r` based on available decl, move messy `#if` logic out of `String.cpp` and into buck config. Reviewed By: yfeldblum Differential Revision: D10243924 fbshipit-source-id: 6c10fc43562765c158d13ed74961277b633bb606
-
Joe Loser authored
Summary: - LLVM 7 emits a warning about self-assignment. - Silence the warning by using a static_cast. Pull Request resolved: https://github.com/facebook/folly/pull/945 Reviewed By: Orvid Differential Revision: D10236656 Pulled By: yfeldblum fbshipit-source-id: 43b46563357d89db9e0911b894f8fead0a14d660
-
- 08 Oct, 2018 6 commits
-
-
Greg McGary authored
Summary: Apply fixes that work with new unified headers and remain compatible with old per-API+arch headers. I do this to unclutter the diff that enables unified headers, so that it only contains things that are incompatible with old per-API+arch headers. Reviewed By: yfeldblum Differential Revision: D10225372 fbshipit-source-id: b92cb3ef45cf23f2233c39520236c4531dba715b
-
Dan Melnic authored
Summary: Use `FOLLY_ATTR_WEAK` for the annotate functions. (Note: this ignores all push blocking failures!) Differential Revision: D10234953 fbshipit-source-id: 534cb1d2d7d6053bc000e4c7ae3c6df16fc7a958
-
Wez Furlong authored
Summary: `shell_builder.py` allows running the fbcode_builder logic on the host rather than in a container. It emits a bash script with `set -exo pipefail` configured such that any failing step will cause the script to exit with failure. Refs: https://github.com/facebook/watchman/pull/639 Reviewed By: simpkins Differential Revision: D9552411 fbshipit-source-id: c7835deedf07ea342dcb3de61d576a4fb5439985
-
Petr Lapukhov authored
Summary: As titled, purely for uniformity. Reviewed By: yfeldblum Differential Revision: D10098066 fbshipit-source-id: 12d78fcc60805284d3d85f405475e63a85799041
-
Dmitry Koterov authored
Summary: This adds a missing feature. Nested exceptions are cool; see the examples in https://en.cppreference.com/w/cpp/error/rethrow_if_nested (bottom of the page). In short, nested exceptions allow people to add "contextual" information to exceptions. Previously we had to write the `try { ew.throw_exception(); } catch (...) { std::throw_with_nested(xxx); }` boilerplate if we used an exception_wrapper (e.g. in futures-oriented code); with this diff, we're able to just run `ew.throw_with_nested(xxx)`. Reviewed By: yfeldblum Differential Revision: D10219994 fbshipit-source-id: cdf0fbe8e4608f6c87d326631e8ffb4919c20711
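A sketch of the before/after from the summary; the exception type and message are made up for illustration:
```cpp
#include <stdexcept>

#include <folly/ExceptionWrapper.h>

void rethrowWithContext(folly::exception_wrapper ew) {
  // Before: manual boilerplate.
  // try {
  //   ew.throw_exception();
  // } catch (...) {
  //   std::throw_with_nested(std::runtime_error("while handling request"));
  // }

  // After this diff:
  ew.throw_with_nested(std::runtime_error("while handling request"));
}
```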
-
Joe Loser authored
Summary: Pull Request resolved: https://github.com/facebook/folly/pull/944 Reviewed By: Orvid Differential Revision: D10231693 Pulled By: yfeldblum fbshipit-source-id: 9c0c5d7c4c1e4d9d9860c71e608e47d661a41fdd
-
- 07 Oct, 2018 2 commits
-
-
Mark Williams authored
Summary: icc accepts the syntax, but doesn't emit the definition into the object file. Reviewed By: ricklavoie, yfeldblum Differential Revision: D10228495 fbshipit-source-id: 73c1676deb4963a9c30886414cc1722d44a79214
-
Lee Howes authored
Summary: Remove calls to Future::then that take a nullary lambda. This is deprecated and will be deleted. This change replaces them with thenValue([](auto&&){...}) or similar. This does not replace all calls to then with thenValue - other forms will be replaced when those forms of then are due for deletion. Reviewed By: yfeldblum Differential Revision: D10219070 fbshipit-source-id: 13b6f64ee2d7c8741537fe1131bdf3b251361413
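A minimal sketch of the migration pattern described here (the concrete types and return value are illustrative):
```cpp
#include <utility>

#include <folly/futures/Future.h>

folly::Future<int> migrated(folly::Future<folly::Unit> f) {
  // Before (deprecated): return std::move(f).then([] { return 42; });
  // After: the nullary continuation becomes one that ignores its argument.
  return std::move(f).thenValue([](auto&&) { return 42; });
}
```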
-
- 06 Oct, 2018 1 commit
-
-
Philip Pronin authored
Summary: Similar to `--dynamic_cputhreadpoolexecutor`, this is a temporary kill switch until all outstanding issues are addressed that would allow us to use dynamic thread pools unconditionally. Reviewed By: djwatson Differential Revision: D10231230 fbshipit-source-id: b0abd77ff2a6a7ce31727e14eca799a6f3b5535a
-
- 05 Oct, 2018 1 commit
-
-
Aaryaman Sagar authored
Summary: There were some read-modify-write operations that were behaving as a read and then a modify. This is incorrect when you use DeterministicAtomic without a backing schedule. This is okay if the test is operating in the presence of a DeterministicSchedule, but is probably worth fixing anyway. If using DeterministicAtomic without a backing schedule is considered a bug, we can add an assertion in beforeSharedAccess() and afterSharedAccess() so users know not to try that. Reviewed By: djwatson, nbronson Differential Revision: D10190682 fbshipit-source-id: 6a3173d904a40bbc5c96b934a76aff90a48c8608
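For illustration, the distinction at issue, shown with std::atomic rather than DeterministicAtomic:
```cpp
#include <atomic>

void rmwVsReadThenModify(std::atomic<int>& counter) {
  // Single atomic read-modify-write: no updates can be lost.
  counter.fetch_add(1, std::memory_order_relaxed);

  // Read, then a separate write: another thread can interleave between
  // the load and the store, losing an update.
  counter.store(counter.load() + 1);
}
```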
-
- 04 Oct, 2018 9 commits
-
-
Mark Williams authored
Summary: icc gets confused on the parameter unpacking otherwise. https://godbolt.org/z/rSRZYn Reviewed By: yfeldblum Differential Revision: D10147211 fbshipit-source-id: 85de581ffa4c487523a5e9f173636f74863a82ac
-
Mark Williams authored
Summary: When they come before the explicit instantiations of Rob, icc fails to compile certain combinations of calls. Reviewed By: alexeyt Differential Revision: D10144943 fbshipit-source-id: ce63a4d289f64d1d644b116ab99b45b9da013162
-
Kenny Yu authored
Summary: `folly::SharedMutex` allows `unlock_shared()` to be called in a different thread from the one that locked it. There are some places in folly that use this property. Example: In `ObserverManager`, the calling thread
1. acquires the read lock
2. then schedules some async work and gives ownership of the lock guard to the lambda
3. the thread that performs the async work will unlock the mutex
However, this causes false positives for TSAN. It looks like this breaks TSAN's assumption that a read-write mutex is always read-locked and unlocked in the same thread. In the example above, TSAN thinks the calling thread in step (1) still owns the mutex, even though semantically the mutex is now owned by the thread in step (3). This results in false positives when running unit tests with an annotated version of `folly::SharedMutex`. TSAN reports a lock inversion between the `folly::Singleton value_.state_` mutex and the read lock referenced in `ObserverManager`. To suppress this error, this change annotates ObserverManager to let TSAN know that the thread in step 1 has dropped the lock, and the thread in step 3 now owns the lock. Reviewed By: andriigrynenko Differential Revision: D9797937 fbshipit-source-id: c0e5277bb9cb2f404df8abd3c3d6ea4a16460c78
-
Kenny Yu authored
Summary: TSAN flagged the lock inversion below. The reason for this lock inversion is the following example usage:
1. set request context B, save old one A (lock B, lock A)
2. later on, reset request context back to A, get the old one B back (lock A, lock B)
In the two calls, we always locked the new context first, resulting in a lock inversion. The fix is to always acquire them in a consistent order using `folly::acquireLocked`. TSAN report:
```
WARNING: ThreadSanitizer: lock-order-inversion (potential deadlock) (pid=1574307)
Cycle in lock order graph: M159 (0x7b1400000138) => M157 (0x7b14000000e8) => M159

Mutex M157 acquired here while holding mutex M159 in main thread:
  #0 0x48f18c in AnnotateRWLockAcquired ??:?
  #1 0x315495 in folly::SharedMutexImpl<false, void, std::atomic, false>::lock_shared() ./folly/SharedMutex.h:385
  #2 0x1b0eb8 in folly::detail::LockTraitsImpl<folly::SharedMutexImpl<false, void, std::atomic, false>, (folly::detail::MutexLevel)1, false>::lock_shared(folly::SharedMutexImpl<false, void, std::atomic, false>&) ./folly/LockTraits.h:157
  #3 0x1b0e68 in std::integral_constant<bool, true> folly::LockPolicyShared::lock<folly::SharedMutexImpl<false, void, std::atomic, false> >(folly::SharedMutexImpl<false, void, std::atomic, false>&) ./folly/LockTraits.h:493
  #4 0x1d63a9 in LockedPtrBase ./folly/Synchronized.h:1026
  #5 0x1d6258 in LockedPtr ./folly/Synchronized.h:1326
  #6 0x1c9a17 in folly::SynchronizedBase<folly::Synchronized<folly::RequestContext::State, folly::SharedMutexImpl<false, void, std::atomic, false> >, (folly::detail::MutexLevel)1>::rlock() const ./folly/Synchronized.h:123
  #7 0x1c7530 in folly::RequestContext::setContext(std::shared_ptr<folly::RequestContext>) ./folly/io/async/Request.cpp:243
  #8 0x36a7b in folly::RequestContext::create() ./folly/io/async/Request.h:119
  #9 0x34916 in RequestContextScopeGuard ./folly/io/async/Request.h:269

Mutex M159 previously acquired by the same thread here:
  #0 0x48f18c in AnnotateRWLockAcquired ??:?
  #1 0x315495 in folly::SharedMutexImpl<false, void, std::atomic, false>::lock_shared() ./folly/SharedMutex.h:385
  #2 0x1b0eb8 in folly::detail::LockTraitsImpl<folly::SharedMutexImpl<false, void, std::atomic, false>, (folly::detail::MutexLevel)1, false>::lock_shared(folly::SharedMutexImpl<false, void, std::atomic, false>&) ./folly/LockTraits.h:157
  #3 0x1b0e68 in std::integral_constant<bool, true> folly::LockPolicyShared::lock<folly::SharedMutexImpl<false, void, std::atomic, false> >(folly::SharedMutexImpl<false, void, std::atomic, false>&) ./folly/LockTraits.h:493
  #4 0x1d63a9 in LockedPtrBase ./folly/Synchronized.h:1026
  #5 0x1d6258 in LockedPtr ./folly/Synchronized.h:1326
  #6 0x1c9a17 in folly::SynchronizedBase<folly::Synchronized<folly::RequestContext::State, folly::SharedMutexImpl<false, void, std::atomic, false> >, (folly::detail::MutexLevel)1>::rlock() const ./folly/Synchronized.h:123
  #7 0x1c7516 in folly::RequestContext::setContext(std::shared_ptr<folly::RequestContext>) ./folly/io/async/Request.cpp:242
  #8 0x36a7b in folly::RequestContext::create() ./folly/io/async/Request.h:119
  #9 0x34916 in RequestContextScopeGuard ./folly/io/async/Request.h:269

Mutex M159 acquired here while holding mutex M157 in main thread:
  #0 0x48f18c in AnnotateRWLockAcquired ??:?
  #1 0x315495 in folly::SharedMutexImpl<false, void, std::atomic, false>::lock_shared() ./folly/SharedMutex.h:385
  #2 0x1b0eb8 in folly::detail::LockTraitsImpl<folly::SharedMutexImpl<false, void, std::atomic, false>, (folly::detail::MutexLevel)1, false>::lock_shared(folly::SharedMutexImpl<false, void, std::atomic, false>&) ./folly/LockTraits.h:157
  #3 0x1b0e68 in std::integral_constant<bool, true> folly::LockPolicyShared::lock<folly::SharedMutexImpl<false, void, std::atomic, false> >(folly::SharedMutexImpl<false, void, std::atomic, false>&) ./folly/LockTraits.h:493
  #4 0x1d63a9 in LockedPtrBase ./folly/Synchronized.h:1026
  #5 0x1d6258 in LockedPtr ./folly/Synchronized.h:1326
  #6 0x1c9a17 in folly::SynchronizedBase<folly::Synchronized<folly::RequestContext::State, folly::SharedMutexImpl<false, void, std::atomic, false> >, (folly::detail::MutexLevel)1>::rlock() const ./folly/Synchronized.h:123
  #7 0x1c7530 in folly::RequestContext::setContext(std::shared_ptr<folly::RequestContext>) ./folly/io/async/Request.cpp:243
  #8 0x34b99 in ~RequestContextScopeGuard ./folly/io/async/Request.h:278

Mutex M157 previously acquired by the same thread here:
  #0 0x48f18c in AnnotateRWLockAcquired ??:?
  #1 0x315495 in folly::SharedMutexImpl<false, void, std::atomic, false>::lock_shared() ./folly/SharedMutex.h:385
  #2 0x1b0eb8 in folly::detail::LockTraitsImpl<folly::SharedMutexImpl<false, void, std::atomic, false>, (folly::detail::MutexLevel)1, false>::lock_shared(folly::SharedMutexImpl<false, void, std::atomic, false>&) ./folly/LockTraits.h:157
  #3 0x1b0e68 in std::integral_constant<bool, true> folly::LockPolicyShared::lock<folly::SharedMutexImpl<false, void, std::atomic, false> >(folly::SharedMutexImpl<false, void, std::atomic, false>&) ./folly/LockTraits.h:493
  #4 0x1d63a9 in LockedPtrBase ./folly/Synchronized.h:1026
  #5 0x1d6258 in LockedPtr ./folly/Synchronized.h:1326
  #6 0x1c9a17 in folly::SynchronizedBase<folly::Synchronized<folly::RequestContext::State, folly::SharedMutexImpl<false, void, std::atomic, false> >, (folly::detail::MutexLevel)1>::rlock() const ./folly/Synchronized.h:123
  #7 0x1c7516 in folly::RequestContext::setContext(std::shared_ptr<folly::RequestContext>) ./folly/io/async/Request.cpp:242
  #8 0x34b99 in ~RequestContextScopeGuard ./folly/io/async/Request.h:278
```
Reviewed By: igorsugak Differential Revision: D9797715 fbshipit-source-id: a0f991ae0ec8f3e9d33af6e05db49eceaf3a5174
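A minimal sketch of the `folly::acquireLocked` pattern used for the fix; `State` and `swapStates` are hypothetical, not the actual RequestContext code:
```cpp
#include <utility>

#include <folly/Synchronized.h>

struct State {
  int value = 0;
};

void swapStates(folly::Synchronized<State>& a, folly::Synchronized<State>& b) {
  // acquireLocked takes both locks in a consistent (address-based) order,
  // so callers cannot create an inversion by passing them in either order.
  auto [lockedA, lockedB] = folly::acquireLocked(a, b);
  std::swap(lockedA->value, lockedB->value);
}
```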
-
Kenny Yu authored
Summary: There is a lock inversion between VirtualEventBase and the fibers library:
1. During program shutdown, `GlobalCache::getImpl` acquires its `mutex_`, then it calls `VirtualEventBase::runOnDestruction`, which acquires the SharedMutex on `onDestructionCallbacks_`.
2. During normal program execution, `VirtualEventBase::destroyImpl()` acquires the `onDestructionCallbacks_` lock and then invokes the user-supplied callbacks, which will invoke `GlobalCache::eraseImpl` and acquire `mutex_`.
The reason for this lock inversion is that in step 2 we are holding a mutex while invoking user-supplied callbacks. The fix is to NOT hold the lock while we invoke these callbacks. Reviewed By: andriigrynenko Differential Revision: D9797816 fbshipit-source-id: 861a915f22138f0c8d3f8bca4d7bf4b9a0aa3d26
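A generic sketch of the fix pattern (not VirtualEventBase's actual code): move the callbacks out while holding the lock, then run them after releasing it:
```cpp
#include <functional>
#include <mutex>
#include <utility>
#include <vector>

class DestructionCallbacks {
 public:
  void add(std::function<void()> cb) {
    std::lock_guard<std::mutex> g(mutex_);
    callbacks_.push_back(std::move(cb));
  }

  void runAll() {
    std::vector<std::function<void()>> local;
    {
      std::lock_guard<std::mutex> g(mutex_);
      local.swap(callbacks_); // take ownership under the lock
    }
    for (auto& cb : local) {
      cb(); // invoked without holding mutex_, so user code may take locks
    }
  }

 private:
  std::mutex mutex_;
  std::vector<std::function<void()>> callbacks_;
};
```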
-
Kenny Yu authored
Summary: When running tests with a TSAN-annotated version of folly::SharedMutex, I discovered at least 3 types of lock inversions in folly::Singleton code. These inversions are probably benign and deadlock is not possible, but we need to suppress them to prevent too much noise from TSAN. Lock inversions:
1. Users can supply arbitrary create() functions for Singletons. If the create() function happens to acquire a mutex, then this introduces a lock ordering from SingletonVault -> mutex. Later on, when the user code runs in normal operations, we might hold the mutex and try to create a singleton, resulting in the mutex -> SingletonVault ordering.
2. SingletonVault::destroyInstances() holds a mutex and then invokes the user-supplied destroy() functions. In the user-supplied destroy functions, we might attempt to create a singleton and acquire the lock in SingletonHolder, resulting in SingletonVault -> SingletonHolder lock ordering. In normal operations, whenever a singleton is created, we have the SingletonHolder -> SingletonVault lock ordering in createInstance().
3. doEagerInit() holds the eagerInitSingleton mutex and then acquires the SingletonVault lock during createInstance(). addEagerInitSingleton() holds the SingletonVault lock and then acquires the eagerInitSingleton lock.
The source of these errors is that we are invoking user-supplied callbacks while holding mutexes. However, these lock inversions cannot actually deadlock based on higher-level knowledge of folly::Singleton state transitions. To suppress these errors, use the new `folly::SharedMutexSuppressTSAN` mutex instead, which will not be annotated by TSAN. If we are not building with TSAN, then this mutex is equivalent to a normal `folly::SharedMutex`. Reviewed By: andriigrynenko Differential Revision: D9797988 fbshipit-source-id: 82b5850fe189f7d1dcaca0e3562fabcfafd0cef5
-
Kenny Yu authored
Summary: This diff adds TSAN annotations to `folly::SharedMutex` to allow TSAN to find lock inversions with this type of mutex. Note that only the WritePriority version is annotated, and the ReadPriority version is not. See the comments in the source code for an explanation of this. Some notes about how the annotation was done:
- We always call the RELEASED annotations at the beginning of unlock(), and the ACQUIRED annotations at the end of lock(). This prevents a double-locking interleaving where thread 1 has unlocked it and, before RELEASED is called, another thread has locked it and calls ACQUIRED first.
- These annotations treat upgrade locks as equivalent to a shared lock. This prevents any false positives and keeps the annotations simple.
- We need to keep the constructor for SharedMutex as `constexpr` to avoid static initialization bugs. As a result, we need to lazily annotate creation of the mutex. To do this, this diff adds an extra bit to the `SharedMutex` state to keep track of whether initialization has been called. In TSAN builds, we have an array of mutexes to enforce that initialization is called at most once.
- This diff introduces a new template param AnnotateForThreadSanitizer (default is false). This allows users to use a new folly::SharedMutexSuppressTSAN version to avoid lock inversion detection if there are noisy lock inversions. However, this should be the exception and not the norm. In normal build modes, this is equivalent to a normal SharedMutex.
Reviewed By: nbronson Differential Revision: D9677058 fbshipit-source-id: b0f5719a75024937fb81672435fb1c9802f255d7
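An illustrative sketch of the ordering described in the first bullet; `annotateAcquired`/`annotateReleased` are no-op stand-ins for the real TSAN runtime annotation calls, and the wrapper type is hypothetical:
```cpp
#include <mutex>

inline void annotateAcquired(void*) {} // stand-in for the TSAN annotation
inline void annotateReleased(void*) {} // stand-in for the TSAN annotation

struct AnnotatedMutex {
  std::mutex inner;

  void lock() {
    inner.lock();
    annotateAcquired(this); // ACQUIRED only after the lock has been taken
  }
  void unlock() {
    annotateReleased(this); // RELEASED before the unlock begins
    inner.unlock();
  }
};
```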
-
Kenny Yu authored
Summary: This change is part of a bigger effort to make `folly::SharedMutex` compatible with TSAN so that TSAN can find lock inversions with these types of mutexes. In order to do this, we need to annotate the mutex function calls accordingly. The annotation functions are provided by the TSAN runtime library. Upcoming changes to `folly::SharedMutex`:
1. provide annotation helper macros (this change)
2. fix or suppress known issues in existing libraries (these macros will allow us to suppress issues)
3. once all the issues are fixed, annotate `SharedMutex` accordingly
Reviewed By: nbronson Differential Revision: D9797582 fbshipit-source-id: 12306db98505fe31a8d8590113d10585840fbe6d
-
Nick Terrell authored
Summary: `AccessSpreader::cachedCurrent()` caches the result of `AccessSpreader::getcpuFunc()` for 32 calls in a thread-local. The cached function takes 2 ns, whereas the current call takes 12 ns. This comes at the cost of being imprecise when threads migrate to a new CPU. I chose 32 as the number of calls because it only has a 10% overhead over never refreshing (2.05 ns vs 1.83 ns), where 16 has a 30% overhead, and it performs just as well as 64. Reviewed By: ot Differential Revision: D10151009 fbshipit-source-id: 07ed292dfdcdcedcb74c24279f7773a80ad09348
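A hypothetical sketch of this caching scheme; the names are illustrative, and the Linux-specific `sched_getcpu()` stands in for the configurable `AccessSpreader::getcpuFunc()`:
```cpp
#include <sched.h>

#include <cstddef>

std::size_t cachedCurrentCpu() {
  constexpr unsigned kRefreshInterval = 32; // same cadence as the diff
  static thread_local unsigned callsLeft = 0;
  static thread_local std::size_t cachedCpu = 0;
  if (callsLeft == 0) {
    // Refresh the expensive, precise lookup only every 32 calls.
    cachedCpu = static_cast<std::size_t>(sched_getcpu());
    callsLeft = kRefreshInterval;
  }
  --callsLeft;
  return cachedCpu;
}
```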
-