Commit b693ea20 authored by Maged Michael, committed by Facebook Github Bot

hazptr: Avoid deadlock on tagged list

Summary:
Prevent deadlock on tagged retired lists within calls to `do_reclamation()` that call `cleanup_batch_tag()`.
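As a simplified sketch of the scenario (illustrative stand-in types and names, not folly code): a Type A reclamation pass (`do_reclamation()`) holds the tagged-list lock while it reclaims objects, and reclaiming one of those objects can invoke `cleanup_batch_tag()` (a Type B pass) on the same list from the same thread. Without a re-entrant lock, the inner call would block on a lock its own thread already holds:

```cpp
// Sketch of re-entrant lock ownership (stand-in types, not folly code).
// Only one level of nesting is needed: an inner Type B call inside an
// outer Type A call on the same thread.
#include <atomic>
#include <thread>

class ReentrantSpinLock {
  std::atomic<std::thread::id> owner_{std::thread::id()};
  int depth_{0}; // modified only by the owning thread

 public:
  void lock() {
    auto me = std::this_thread::get_id();
    if (owner_.load(std::memory_order_relaxed) == me) {
      ++depth_; // re-entrant acquisition by the same thread
      return;
    }
    auto unowned = std::thread::id();
    while (!owner_.compare_exchange_weak(
        unowned, me, std::memory_order_acquire, std::memory_order_relaxed)) {
      unowned = std::thread::id();
      std::this_thread::yield();
    }
  }

  void unlock() {
    if (depth_ > 0) {
      --depth_; // keep the lock held for the outer operation
      return;
    }
    owner_.store(std::thread::id(), std::memory_order_release);
  }
};

ReentrantSpinLock tagged_lock; // stands in for the lock bit on tagged_

void type_b_cleanup() {   // like cleanup_batch_tag(); must not self-deadlock
  tagged_lock.lock();     // inner acquisition succeeds: this thread owns it
  // ... reclaim all objects with the matching tag ...
  tagged_lock.unlock();
}

void type_a_reclamation() { // like do_reclamation() on the tagged list
  tagged_lock.lock();
  // Reclaiming an unprotected object may call back into Type B cleanup
  // on the same list from this same thread:
  type_b_cleanup();
  tagged_lock.unlock();
}
```

The actual change (see `shared_head_only_list` in the diff below) tracks the owning thread id and a one-level re-entrance count rather than using a general recursive lock.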

Changes:
- Make locking the tagged list reentrant.
- Eliminate sharding of the tagged list, to prevent deadlock between concurrent calls to `do_reclamation()` on different shards that each call `cleanup_batch_tag()` on the other shard.
- Expose the list of unprotected objects being reclaimed in `do_reclamation()` of the tagged list, so that calls to `cleanup_batch_tag()` do not miss objects with a matching tag.
- Refactor the common logic of the `retire()` calls in `hazptr_obj_base` and `hazptr_obj_base_linked` into `hazptr_obj::push_obj()`.
- Fix the release of the tagged list lock to use a CAS in a loop instead of a plain store, since concurrent lock-free pushes are possible (see the sketch below).
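For the last point, a minimal sketch of why the release needs a CAS loop (again with stand-in types, not the actual `shared_head_only_list` code): the list head word carries the lock bit, but `push()` stays lock-free and may change the word even while it is locked, so a plain store on release could overwrite and lose a concurrent push.

```cpp
// Sketch (stand-in types): releasing a lock bit that shares a word with a
// lock-free list head. push() can change the word at any time, so the
// release must CAS and retry rather than blindly store a new value.
#include <atomic>
#include <cstdint>

struct Node {
  Node* next{nullptr};
};

class LockableHeadList {
  std::atomic<uintptr_t> head_{0}; // lowest bit is the pop_all lock
  static constexpr uintptr_t kLockBit = 1u;

 public:
  // Lock-free push; allowed even while another thread holds the lock bit.
  void push(Node* n) noexcept {
    auto oldval = head_.load(std::memory_order_acquire);
    do {
      n->next = reinterpret_cast<Node*>(oldval & ~kLockBit);
    } while (!head_.compare_exchange_weak(
        oldval,
        reinterpret_cast<uintptr_t>(n) | (oldval & kLockBit),
        std::memory_order_acq_rel,
        std::memory_order_acquire));
  }

  // Push back one node (may be nullptr) and clear the lock bit, assuming
  // the caller previously locked the list. A plain head_.store() here
  // would drop any node pushed after oldval was read; hence the CAS loop.
  void push_unlock(Node* n) noexcept {
    auto oldval = head_.load(std::memory_order_acquire);
    while (true) {
      auto* pushed_meanwhile = reinterpret_cast<Node*>(oldval & ~kLockBit);
      uintptr_t newval;
      if (n) {
        n->next = pushed_meanwhile; // splice in front of concurrent pushes
        newval = reinterpret_cast<uintptr_t>(n);
      } else {
        newval = reinterpret_cast<uintptr_t>(pushed_meanwhile);
      }
      if (head_.compare_exchange_weak(
              oldval, newval, std::memory_order_acq_rel,
              std::memory_order_acquire)) {
        return; // lock released, concurrent pushes preserved
      }
    }
  }
};
```

The diff below applies the same idea to `shared_head_only_list::push_unlock()`, retrying `cas_head()` until it succeeds.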

Reviewed By: davidtgoldblatt

Differential Revision: D13439983

fbshipit-source-id: 5cea585c577a64ea8b43a1827522335a18b9a933
parent d7d5fe11
@@ -25,7 +25,6 @@
 #include <folly/synchronization/AsymmetricMemoryBarrier.h>

 #include <atomic>
-#include <functional>
 #include <unordered_set> // for hash set in bulk_reclaim

 ///
@@ -48,6 +47,59 @@ constexpr int hazptr_domain_rcount_threshold() {
 * A domain manages a set of hazard pointers and a set of retired objects.
 *
 * Most user code need not specify any domains.
+*
+* Notes on destruction order, tagged objects, locking and deadlock
+* avoidance:
+* - Tagged objects support reclamation order guarantees. A call to
+*   cleanup_batch_tag(tag) guarantees that all objects with the
+*   specified tag are reclaimed before the function returns.
+* - Due to the strict order, access to the set of tagged objects
+*   needs synchronization and care must be taken to avoid deadlock.
+* - There are two types of reclamation operations to consider:
+*  - Type A: A Type A reclamation operation is triggered by meeting
+*    some threshold. Reclaimed objects may have different
+*    tags. Hazard pointers are checked and only unprotected objects
+*    are reclaimed. This type is expected to be expensive but
+*    infrequent and the cost is amortized over a large number of
+*    reclaimed objects. This type is needed to guarantee an upper
+*    bound on unreclaimed reclaimable objects.
+*  - Type B: A Type B reclamation operation is triggered by a call
+*    to the function cleanup_batch_tag for a specific tag. All
+*    objects with the specified tag must be reclaimed
+*    unconditionally before returning from such a function
+*    call. Hazard pointers are not checked. This type of reclamation
+*    operation is expected to be inexpensive and may be invoked more
+*    frequently than Type A.
+* - Tagged retired objects are kept in a single list in the domain
+*   structure, named tagged_.
+* - Both Type A and Type B of reclamation pop all the objects in
+*   tagged_ and sort them into two sets of reclaimable and
+*   unreclaimable objects. The objects in the reclaimable set are
+*   reclaimed and the objects in the unreclaimable set are pushed
+*   back in tagged_.
+* - The tagged_ list is locked between popping all objects and
+*   pushing back unreclaimable objects, in order to guarantee that
+*   Type B operations do not miss any objects that match the
+*   specified tag.
+* - A Type A operation cannot release the lock on the tagged_ list
+*   before reclaiming reclaimable objects, to prevent concurrent
+*   Type B operations from returning before the reclamation of
+*   objects with matching tags.
+* - A Type B operation can release the lock on tagged_ before
+*   reclaiming objects because the sets of objects reclaimable by
+*   Type B operations are disjoint.
+* - The lock on the tagged_ list is re-entrant, to prevent deadlock
+*   when reclamation in a Type A operation requires a Type B
+*   reclamation operation to complete.
+* - The implementation allows only one pattern of re-entrance: An
+*   inner Type B inside an outer Type A.
+* - An inner Type B operation must have access and ability to modify
+*   the outer Type A operation's set of reclaimable objects and
+*   their children objects in order not to miss objects that match
+*   the specified tag. Hence, Type A operations use data members,
+*   unprotected_ and children_, to keep track of these objects
+*   between reclamation steps and to provide inner Type B operations
+*   access to these objects.
 */
 template <template <typename> class Atom>
 class hazptr_domain {
@@ -59,13 +111,8 @@ class hazptr_domain {
   static constexpr int kThreshold = detail::hazptr_domain_rcount_threshold();
   static constexpr int kMultiplier = 2;
   static constexpr uint64_t kSyncTimePeriod{2000000000}; // nanoseconds
-  static constexpr uint8_t kLogNumTaggedLists = 6;
-  static constexpr uint16_t kNumTaggedLists = 1 << kLogNumTaggedLists;
-  static constexpr uint16_t kTaggedListIDMask = kNumTaggedLists - 1;
   static constexpr uintptr_t kTagBit = hazptr_obj<Atom>::kTagBit;
-  static_assert(kNumTaggedLists <= 1024, "Too many tagged lists.");

   Atom<hazptr_rec<Atom>*> hazptrs_{nullptr};
   Atom<hazptr_obj<Atom>*> retired_{nullptr};
   Atom<uint64_t> sync_time_{0};
@@ -78,7 +125,9 @@ class hazptr_domain {
   bool shutdown_{false};
   RetiredList untagged_;
-  RetiredList tagged_[kNumTaggedLists];
+  RetiredList tagged_;
+  Obj* unprotected_; // List of unprotected objects being reclaimed
+  ObjList children_; // Children of unprotected objects being reclaimed

  public:
   /** Constructor */
@@ -89,9 +138,7 @@ class hazptr_domain {
     shutdown_ = true;
     reclaim_all_objects();
     free_hazptr_recs();
-    for (uint16_t i = 0; i < kNumTaggedLists; ++i) {
-      DCHECK(tagged_[i].empty());
-    }
+    DCHECK(tagged_.empty());
   }

   hazptr_domain(const hazptr_domain&) = delete;
@@ -125,14 +172,35 @@ class hazptr_domain {
   /** cleanup_batch_tag */
   void cleanup_batch_tag(const hazptr_obj_batch<Atom>* batch) noexcept {
     auto tag = reinterpret_cast<uintptr_t>(batch) + kTagBit;
-    RetiredList& rlist = tagged_[hash_tag(tag)];
+    auto obj = tagged_.pop_all(RetiredList::kAlsoLock);
     ObjList match, nomatch;
-    auto obj = rlist.pop_all(RetiredList::kAlsoLock);
-    list_match_condition(
-        obj, match, nomatch, [tag](Obj* o) { return o->batch_tag() == tag; });
-    rlist.push_unlock(nomatch);
+    list_match_tag(tag, obj, match, nomatch);
+    if (unprotected_) { // There must be ongoing do_reclamation
+      ObjList match2, nomatch2;
+      list_match_tag(tag, unprotected_, match2, nomatch2);
+      match.splice(match2);
+      unprotected_ = nomatch2.head();
+    }
+    if (children_.head()) {
+      ObjList match2, nomatch2;
+      list_match_tag(tag, children_.head(), match2, nomatch2);
+      match.splice(match2);
+      children_ = std::move(nomatch2);
+    }
+    auto count = nomatch.count();
+    nomatch.set_count(0);
+    tagged_.push_unlock(nomatch);
     obj = match.head();
     reclaim_list_transitive(obj);
+    if (count >= threshold()) {
+      check_threshold_and_reclaim(tagged_, RetiredList::kAlsoLock);
+    }
+  }
+
+  void
+  list_match_tag(uintptr_t tag, Obj* obj, ObjList& match, ObjList& nomatch) {
+    list_match_condition(
+        obj, match, nomatch, [tag](Obj* o) { return o->batch_tag() == tag; });
   }

  private:
@@ -188,7 +256,7 @@ class hazptr_domain {
     }
     uintptr_t btag = l.head()->batch_tag();
     bool tagged = ((btag & kTagBit) == kTagBit);
-    RetiredList& rlist = tagged ? tagged_[hash_tag(btag)] : untagged_;
+    RetiredList& rlist = tagged ? tagged_ : untagged_;
     /*** Full fence ***/ asymmetricLightBarrier();
     /* Only tagged lists need to be locked because tagging is used to
      * guarantee the identification of all objects with a specific
@@ -200,11 +268,6 @@ class hazptr_domain {
     check_threshold_and_reclaim(rlist, lock);
   }

-  uint16_t hash_tag(uintptr_t tag) {
-    size_t h = std::hash<uintptr_t>{}(tag);
-    return h & kTaggedListIDMask;
-  }
-
   /** threshold */
   int threshold() {
     auto thresh = kThreshold;
@@ -234,14 +297,18 @@ class hazptr_domain {
     list_match_condition(obj, match, nomatch, [&](Obj* o) {
       return hs.count(o->raw_ptr()) > 0;
     });
-    /* Reclaim unmatched objects */
-    hazptr_obj_list<Atom> children;
-    reclaim_list(nomatch.head(), children);
-    match.splice(children);
-    /* Push back matched and children of unmatched objects */
+    /* Reclaim unprotected objects and push back protected objects and
+       children of reclaimed objects */
     if (lock) {
+      unprotected_ = nomatch.head();
+      DCHECK(children_.empty());
+      reclaim_unprotected_safe();
+      match.splice(children_);
       rlist.push_unlock(match);
     } else {
+      ObjList children;
+      reclaim_unprotected_unsafe(nomatch.head(), children);
+      match.splice(children);
       rlist.push(match, false);
     }
   }
@@ -279,8 +346,26 @@
     }
   }

-  /** reclaim_list */
-  void reclaim_list(Obj* head, ObjList& children) {
+  /** reclaim_unprotected_safe */
+  void reclaim_unprotected_safe() {
+    while (unprotected_) {
+      auto obj = unprotected_;
+      unprotected_ = obj->next();
+      (*(obj->reclaim()))(obj, children_);
+    }
+  }
+
+  /** reclaim_unprotected_unsafe */
+  void reclaim_unprotected_unsafe(Obj* obj, ObjList& children) {
+    while (obj) {
+      auto next = obj->next();
+      (*(obj->reclaim()))(obj, children);
+      obj = next;
+    }
+  }
+
+  /** reclaim_unconditional */
+  void reclaim_unconditional(Obj* head, ObjList& children) {
     while (head) {
       auto next = head->next();
       (*(head->reclaim()))(head, children);
@@ -318,7 +403,7 @@
   void reclaim_list_transitive(Obj* head) {
     while (head) {
       ObjList children;
-      reclaim_list(head, children);
+      reclaim_unconditional(head, children);
       head = children.head();
     }
   }
...
@@ -181,6 +181,15 @@ class hazptr_obj {
     }
   }

+  void push_obj(hazptr_domain<Atom>& domain) {
+    auto b = batch();
+    if (b) {
+      b->push_obj(this, domain);
+    } else {
+      push_to_retired(domain);
+    }
+  }
+
   void push_to_retired(hazptr_domain<Atom>& domain) {
 #if FOLLY_HAZPTR_THR_LOCAL
     if (&domain == &default_hazptr_domain<Atom>() && !domain.shutdown_) {
@@ -232,6 +241,10 @@ class hazptr_obj_list {
     return count_;
   }

+  void set_count(int val) {
+    count_ = val;
+  }
+
   bool empty() const noexcept {
     return head() == nullptr;
   }
@@ -307,10 +320,7 @@ class hazptr_obj_batch {
   }

  private:
-  template <typename, template <typename> class, typename>
-  friend class hazptr_obj_base;
-  template <typename, template <typename> class, typename>
-  friend class hazptr_obj_base_linked;
+  friend class hazptr_obj<Atom>;

   int count() const noexcept {
     return count_.load(std::memory_order_acquire);
@@ -411,7 +421,10 @@ class hazptr_obj_retired_list {
   void push_unlock(hazptr_obj_list<Atom>& l) noexcept {
     List ll(l.head(), l.tail());
     retired_.push_unlock(ll);
-    add_count(l.count());
+    auto count = l.count();
+    if (count) {
+      add_count(count);
+    }
   }

   int count() const noexcept {
@@ -495,12 +508,7 @@ class hazptr_obj_base : public hazptr_obj<Atom>, public hazptr_deleter<T, D> {
       hazptr_domain<Atom>& domain = default_hazptr_domain<Atom>()) {
     pre_retire(std::move(deleter));
     set_reclaim();
-    auto batch = this->batch();
-    if (batch) {
-      batch->push_obj(this, domain);
-    } else {
-      this->push_to_retired(domain); // defined in hazptr_obj
-    }
+    this->push_obj(domain); // defined in hazptr_obj
   }

   void retire(hazptr_domain<Atom>& domain) {
...
@@ -240,14 +240,7 @@ class hazptr_obj_base_linked : public hazptr_obj_linked<Atom>,
     this->pre_retire_check(); // defined in hazptr_obj
     set_reclaim();
     auto& domain = default_hazptr_domain<Atom>();
-    auto btag = this->batch_tag();
-    if (btag == 0u) {
-      this->push_to_retired(domain); // defined in hazptr_obj
-    } else {
-      btag -= btag & 1u;
-      auto batch = reinterpret_cast<hazptr_obj_batch<Atom>*>(btag);
-      batch->push_obj(this, domain);
-    }
+    this->push_obj(domain); // defined in hazptr_obj
   }

   /* unlink: Retire object if last link is released. */
...
@@ -21,6 +21,7 @@
 #include <glog/logging.h>

 #include <atomic>
+#include <thread>

 /// Linked list class templates used in the hazard pointer library:
 /// - linked_list: Sequential linked list that uses a pre-existing
@@ -209,10 +210,14 @@ class shared_head_tail_list {
  * following are valid combinations:
  * - push(kMayBeLocked), pop_all(kAlsoLock), push_unlock
  * - push(kMayNotBeLocked), pop_all(kDontLock)
+ *
+ * Locking is reentrant to prevent self deadlock.
  */
 template <typename Node, template <typename> class Atom = std::atomic>
 class shared_head_only_list {
   Atom<uintptr_t> head_{0}; // lowest bit is a lock for pop all
+  Atom<std::thread::id> owner_{std::thread::id()};
+  int reentrance_{0};
   static constexpr uintptr_t kLockBit = 1u;
   static constexpr uintptr_t kUnlocked = 0u;
@@ -252,6 +257,18 @@ class shared_head_only_list {
   }

   void push_unlock(linked_list<Node>& l) noexcept {
+    DCHECK_EQ(owner(), std::this_thread::get_id());
+    uintptr_t lockbit;
+    if (reentrance_ > 0) {
+      DCHECK_EQ(reentrance_, 1);
+      --reentrance_;
+      lockbit = kLockBit;
+    } else {
+      clear_owner();
+      lockbit = kUnlocked;
+    }
+    DCHECK_EQ(reentrance_, 0);
+    while (true) {
       auto oldval = head();
       DCHECK_EQ(oldval & kLockBit, kLockBit); // Should be already locked
       auto ptrval = oldval - kLockBit;
@@ -262,7 +279,11 @@ class shared_head_only_list {
       }
       auto newval =
           (t == nullptr) ? ptrval : reinterpret_cast<uintptr_t>(l.head());
-    set_head(newval);
+      newval += lockbit;
+      if (cas_head(oldval, newval)) {
+        break;
+      }
+    }
   }

   bool check_lock() const noexcept {
@@ -278,10 +299,6 @@ class shared_head_only_list {
     return head_.load(std::memory_order_acquire);
   }

-  void set_head(uintptr_t val) noexcept {
-    head_.store(val, std::memory_order_release);
-  }
-
   uintptr_t exchange_head() noexcept {
     auto newval = reinterpret_cast<uintptr_t>(nullptr);
     auto oldval = head_.exchange(newval, std::memory_order_acq_rel);
@@ -293,6 +310,19 @@ class shared_head_only_list {
         oldval, newval, std::memory_order_acq_rel, std::memory_order_acquire);
   }

+  std::thread::id owner() {
+    return owner_.load(std::memory_order_relaxed);
+  }
+
+  void set_owner() {
+    DCHECK(owner() == std::thread::id());
+    owner_.store(std::this_thread::get_id(), std::memory_order_relaxed);
+  }
+
+  void clear_owner() {
+    owner_.store(std::thread::id(), std::memory_order_relaxed);
+  }
+
   Node* pop_all_no_lock() noexcept {
     auto oldval = exchange_head();
     DCHECK_EQ(oldval & kLockBit, kUnlocked);
@@ -304,10 +334,18 @@ class shared_head_only_list {
     while (true) {
       auto oldval = head();
       auto lockbit = oldval & kLockBit;
-      if (lockbit == kUnlocked) {
+      std::thread::id tid = std::this_thread::get_id();
+      if (lockbit == kUnlocked || owner() == tid) {
         auto newval = reinterpret_cast<uintptr_t>(nullptr) + kLockBit;
         if (cas_head(oldval, newval)) {
-          return reinterpret_cast<Node*>(oldval);
+          DCHECK_EQ(reentrance_, 0);
+          if (lockbit == kUnlocked) {
+            set_owner();
+          } else {
+            ++reentrance_;
+          }
+          auto ptrval = oldval - lockbit;
+          return reinterpret_cast<Node*>(ptrval);
         }
       }
       s.sleep();
...