Commit e8751dea authored by Stepan Palamarchuk, committed by Facebook Github Bot

Avoid atomic operation on the last refcount

Summary:
There's no need for an atomic refcount decrement from 1->0: a count of 1 means we hold the last live reference, and anyone else using that reference to bump the count 1->2 is already undefined behavior.
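A minimal standalone sketch of the same pattern, for illustration only. `SharedBlock`, `addRef` and `releaseRef` are hypothetical names and not folly's API; the memory orders mirror the ones used in this change.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a shared control block; not folly's SharedInfo.
struct SharedBlock {
  std::atomic<uint32_t> refcount{1};
};

// Take an additional reference. The caller must already hold a live one.
inline void addRef(SharedBlock& block) {
  block.refcount.fetch_add(1, std::memory_order_acq_rel);
}

// Drop one reference. Returns true if the caller held the last reference
// and is now responsible for freeing the block.
inline bool releaseRef(SharedBlock& block) {
  // Fast path: a count of 1 means we are the sole owner, so no other thread
  // can legally touch the count and the atomic read-modify-write is skipped.
  if (block.refcount.load(std::memory_order_acquire) > 1) {
    // fetch_sub() returns the value before the decrement.
    uint32_t before = block.refcount.fetch_sub(1, std::memory_order_acq_rel);
    if (before > 1) {
      return false;  // other references are still alive
    }
  }
  return true;
}
```

Note that on the fast path the stored count stays at 1 rather than dropping to 0, so code that inspects the count after the final release should not assume it reads 0.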

Added benchmark:

Before:
```
stepan@devvm329:~/fbsource/fbcode$ buck-out/opt/gen/folly/io/test/iobuf_benchmark --bm_min_usec=1000000
============================================================================
folly/io/test/IOBufBenchmark.cpp                relative  time/iter  iters/s
============================================================================
createAndDestroy                                            36.35ns   27.51M
cloneOneBenchmark                                           40.40ns   24.75M
cloneOneIntoBenchmark                                       27.03ns   37.00M
cloneBenchmark                                              42.23ns   23.68M
cloneIntoBenchmark                                          28.35ns   35.27M
moveBenchmark                                               15.20ns   65.79M
copyBenchmark                                               35.00ns   28.57M
cloneCoalescedBaseline                                     334.59ns    2.99M
cloneCoalescedBenchmark                          660.01%    50.69ns   19.73M
takeOwnershipBenchmark                                      50.57ns   19.78M
============================================================================
```

After:
```
stepan@devvm329:~/fbsource/fbcode$ buck-out/opt/gen/folly/io/test/iobuf_benchmark --bm_min_usec=1000000
============================================================================
folly/io/test/IOBufBenchmark.cpp                relative  time/iter  iters/s
============================================================================
createAndDestroy                                            30.04ns   33.29M
cloneOneBenchmark                                           41.27ns   24.23M
cloneOneIntoBenchmark                                       26.37ns   37.92M
cloneBenchmark                                              43.91ns   22.77M
cloneIntoBenchmark                                          28.49ns   35.10M
moveBenchmark                                               15.50ns   64.52M
copyBenchmark                                               35.85ns   27.89M
cloneCoalescedBaseline                                     318.49ns    3.14M
cloneCoalescedBenchmark                          643.69%    49.48ns   20.21M
takeOwnershipBenchmark                                      45.15ns   22.15M
============================================================================
```
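For reading the two tables side by side, the relative change in time per iteration can be computed as below. `percentChange` is a hypothetical helper, not part of folly's benchmark tooling. With the numbers above, createAndDestroy takes about 17% less time per iteration and takeOwnershipBenchmark about 11% less; every other benchmark moves by under 5% in either direction, which may be run-to-run noise (an assumption, since only one run is shown for each side).

```cpp
#include <cassert>
#include <cmath>

// Relative change in time per iteration, in percent.
// Negative values mean the "after" build is faster.
inline double percentChange(double beforeNs, double afterNs) {
  return (afterNs - beforeNs) / beforeNs * 100.0;
}
```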

Reviewed By: yfeldblum, davidtgoldblatt

Differential Revision: D14715579

fbshipit-source-id: 3c0373b8423dda680920860979cfa240bf3d8d7a
parent 5c133249
@@ -799,6 +799,10 @@ void IOBuf::decrementRefcount() {
     return;
   }
+  // Avoid doing atomic decrement if the refcount is 1.
+  // This is safe, because it means that we're the last reference and destroying
+  // the object. Anything trying to copy it is already undefined behavior.
+  if (info->refcount.load(std::memory_order_acquire) > 1) {
     // Decrement the refcount
     uint32_t newcnt = info->refcount.fetch_sub(1, std::memory_order_acq_rel);
     // Note that fetch_sub() returns the value before we decremented.
@@ -807,6 +811,7 @@ void IOBuf::decrementRefcount() {
     if (newcnt > 1) {
       return;
     }
+  }
   // save the useHeapFullStorage flag here since
   // freeExtBuffer can delete the sharedInfo()
......
@@ -19,6 +19,13 @@
 using folly::IOBuf;
+BENCHMARK(createAndDestroy, iters) {
+  while (iters--) {
+    IOBuf buf(IOBuf::CREATE, 10);
+    folly::doNotOptimizeAway(buf.capacity());
+  }
+}
 BENCHMARK(cloneOneBenchmark, iters) {
   IOBuf buf(IOBuf::CREATE, 10);
   while (iters--) {
......