1. 05 Aug, 2021 1 commit
    • folly | Fix hard-coded assumption in folly/Portability that MSVC builds always have SSE4.2. · f55bad22
      Nikita Lutsenko authored
      Summary:
      Windows always supports SSE4.2, right? What could go wrong?
      Well, we want to support UWP targeting ARM, where SSE4.2 is not available, and the assumption causes all sorts of fun failures.
      One of them: Boost screaming in agony about both NEON SIMD and SSE instructions being available.
      Fix it by only ever declaring SSE support when we are not building for ARM.
      
      Reviewed By: rudybear
      
      Differential Revision: D30095193
      
      fbshipit-source-id: e303f90348116cd3b5ea618dc737a647d7aa2cd6
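The guard described in this commit can be sketched roughly as follows. This is a minimal illustration, not the code from folly/Portability: the `EXAMPLE_*` macro names and the `kExample*` constants are hypothetical, and only the compiler-provided macros (`_MSC_VER`, `__SSE4_2__`, `_M_ARM`, `_M_ARM64`, `__arm__`, `__aarch64__`) are real.

```cpp
#include <cassert>

// Hypothetical macro names for illustration only; folly's real macros differ.
#if defined(__arm__) || defined(__aarch64__) || defined(_M_ARM) || defined(_M_ARM64)
#define EXAMPLE_IS_ARM 1
#else
#define EXAMPLE_IS_ARM 0
#endif

// Idea from the commit summary: an MSVC build may only be assumed to have
// SSE4.2 when the target is not ARM. Other compilers advertise SSE4.2
// explicitly through __SSE4_2__.
#if !EXAMPLE_IS_ARM && (defined(__SSE4_2__) || defined(_MSC_VER))
#define EXAMPLE_HAS_SSE42 1
#else
#define EXAMPLE_HAS_SSE42 0
#endif

constexpr bool kExampleIsArm = EXAMPLE_IS_ARM != 0;
constexpr bool kExampleHasSse42 = EXAMPLE_HAS_SSE42 != 0;

// ARM and SSE4.2 must never be reported together; that combination is what
// upset Boost according to the summary.
static_assert(!(kExampleIsArm && kExampleHasSse42),
              "SSE4.2 must not be declared on ARM targets");
```

Downstream code can then branch on `kExampleHasSse42` (or the macro) instead of on `_MSC_VER` alone.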
  2. 04 Aug, 2021 2 commits
    • folly/Benchmark.cpp: right align user metrics · 37e6bbcc
      Lucian Grijincu authored
      Summary:
      Right align so number units are easier to compare.
      
      ```name=before
      ======================================================================================================
      some bench                             relative  time/iter  iters/s  cpu-cycles  instructions
      ======================================================================================================
      A                                                  33.19us   30.13K  100.61K     217.49K
      B                                        93.71%    35.41us   28.24K  106.19K     224.50K
      C                                                  24.37us   41.03K  71.86K      92.74K
      D                                       163.37%    14.92us   67.02K  45.04K      77.66K
      ======================================================================================================
      ```
      
      ```name=after
      ======================================================================================================
      some bench                             relative  time/iter  iters/s  cpu-cycles  instructions
      ======================================================================================================
      A                                                  33.92us   29.48K     102.56K       217.49K
      B                                        94.27%    35.99us   27.79K     113.75K       224.50K
      C                                                  23.53us   42.51K      72.80K        92.73K
      D                                       154.72%    15.20us   65.77K      47.21K        77.66K
      ======================================================================================================
      ```
      
      Reviewed By: yfeldblum, ot, philippv
      
      Differential Revision: D30081837
      
      fbshipit-source-id: cbbfb7e381910c102eefca13475840c58a8b3361
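The alignment change shown in the before/after tables can be sketched with standard stream manipulators. This is an assumption-laden illustration, not the code in folly/Benchmark.cpp: `formatMetricsRow` is a hypothetical helper, and the column width and two-space separator are arbitrary.

```cpp
#include <cassert>
#include <iomanip>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper: formats pre-rendered metric strings into fixed-width
// columns, either left- or right-aligned.
inline std::string formatMetricsRow(const std::vector<std::string>& metrics,
                                    int width,
                                    bool rightAlign) {
  std::ostringstream os;
  for (const auto& m : metrics) {
    os << "  ";  // column separator
    if (rightAlign) {
      os << std::right;
    } else {
      os << std::left;
    }
    os << std::setw(width) << m;  // setw applies to the next insertion only
  }
  return os.str();
}
```

With right alignment, values such as `72.80K` and `102.56K` end at the same column, so their magnitudes and unit suffixes line up vertically.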
    • Back out "Fix concurrency issues in ConcurrentSkipList." · d9abea1b
      Yiding Jia authored
      Summary: This is causing a memory leak
      
      Reviewed By: yfeldblum
      
      Differential Revision: D30082875
      
      fbshipit-source-id: 6b6ba9963cc45d71cb98b88b1db9a304cb7b7bda
  3. 03 Aug, 2021 2 commits
    • switch to tp2 CLI11 · c9875b79
      Peyman Gardideh authored
      Summary: Migrating to the tp2 version of CLI11. Will delete our local copy in the next diff to keep this one smaller.
      
      Reviewed By: shri-khare
      
      Differential Revision: D29808579
      
      fbshipit-source-id: c7b4cf40a64c9e8804f0eb6c749f36fabdbc79e1
    • Add CLI11 manifest · 951a343b
      Peyman Gardideh authored
      Summary: Adding an OSS manifest for CLI11 so that it is fetched from GitHub.
      
      Reviewed By: shri-khare
      
      Differential Revision: D29833128
      
      fbshipit-source-id: 39cae08f9a15b87da0fa6e26c7b9e0387a7cec50
  4. 02 Aug, 2021 3 commits
    • tweaks to futures interrupts · 451819f6
      Yedidya Feldblum authored
      Summary:
      * Always copy or move the interrupt handler precisely once.
      * Mark interrupt-handler heap classes as final to help the compiler devirtualize, even though it should already be able to.
      
      Reviewed By: davidtgoldblatt
      
      Differential Revision: D29974764
      
      fbshipit-source-id: d39008fa4d1a37fb448d42063da907edc08aa328
    • adding copyright header · 3e84bbd6
      Jan Mazur authored
      Summary: as in the title
      
      Reviewed By: Croohand
      
      Differential Revision: D30041822
      
      fbshipit-source-id: 923158fcba241f5cd2ace8f87fa12083fd22356c
    • fix race handling bug in futures interrupt-handler · f4c8e833
      Yedidya Feldblum authored
      Summary:
      Setting the interrupt-handler races with setting the interrupt-exception. This is expected to occur on occasion and should be handled in both `setInterruptHandler` and `raise`. The latter is correct, but the former has a bug: if the race happens, a moved-from interrupt-handler is invoked. The fix is to be sure to invoke a non-moved-from handler.
      
      For consistency, always perform precisely one copy or move of the handler and always invoke the handler as an lvalue-ref-to-non-const.
      
      Reviewed By: iahs
      
      Differential Revision: D29974165
      
      fbshipit-source-id: faf083e97c042fda0622801fc94bbc6fbca910fd
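The shape of the fix can be sketched as below. This is a deliberately simplified, mutex-based illustration and not folly's lock-free futures core: `InterruptCoreSketch` and its members are hypothetical, `std::exception_ptr` stands in for folly's exception type, and ignoring all but the first `raise` is an assumption of the sketch. The point it shows is that the handler is moved exactly once into storage and that the stored object, never the moved-from parameter, is what gets invoked in either ordering.

```cpp
#include <cassert>
#include <exception>
#include <functional>
#include <memory>
#include <mutex>
#include <stdexcept>
#include <utility>

class InterruptCoreSketch {
 public:
  using Handler = std::function<void(const std::exception_ptr&)>;

  void setInterruptHandler(Handler fn) {
    std::shared_ptr<Handler> stored;
    std::exception_ptr pending;
    {
      std::lock_guard<std::mutex> guard(mutex_);
      // The one and only move of the handler.
      handler_ = std::make_shared<Handler>(std::move(fn));
      stored = handler_;
      pending = interrupt_;
    }
    // `fn` is moved-from here. Invoking it would be the bug described in the
    // summary, so the stored handler is invoked instead, as a non-const lvalue.
    if (pending) {
      (*stored)(pending);
    }
  }

  void raise(std::exception_ptr e) {
    std::shared_ptr<Handler> stored;
    {
      std::lock_guard<std::mutex> guard(mutex_);
      if (interrupt_) {
        return;  // sketch assumption: only the first interrupt is delivered
      }
      interrupt_ = e;
      stored = handler_;
    }
    if (stored) {
      (*stored)(e);
    }
  }

 private:
  std::mutex mutex_;
  std::shared_ptr<Handler> handler_;
  std::exception_ptr interrupt_;
};
```

Whichever of `setInterruptHandler` and `raise` runs second observes the other's state under the lock and performs the single invocation.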
  5. 31 Jul, 2021 1 commit
    • fix observed double-deletes of futures interrupt handlers · d0edf4c6
      Yedidya Feldblum authored
      Summary:
      The state machine for the interrupt has a has-handler state which holds the handler pointer and a terminal state which holds nothing. When a handler has been stored and an exception is raised, the state machine exchanges the handler and transitions the state from has-handler to terminal, and then invokes the handler and decrements its refcount, possibly deleting it. A concurrent continuation of a future can load the interrupt and, if it is a handler, increment its refcount.
      
      There is a possible race: the handler is loaded in both threads, its refcount decremented in the first thread and incremented in the second thread, and then the first thread observes a historical refcount of zero and deletes the handler. Fix this race by holding onto the handler until the core is destroyed.
      
      Reviewed By: davidtgoldblatt
      
      Differential Revision: D30017199
      
      fbshipit-source-id: b6c474c630719ef8fd1cb88fa982045a523590dd
  6. 30 Jul, 2021 5 commits
    • avoid generating internal dependencies for public CI · 9781d415
      Zeyi (Rice) Fan authored
      Reviewed By: wez
      
      Differential Revision: D30017863
      
      fbshipit-source-id: fb94a7c36e05d874fc3a6ce568a7b757c1863ffa
    • include rust-shed in edenscm builds · 0d35ac1d
      Jun Wu authored
      Summary: They will be used by upcoming changes.
      
      Reviewed By: DurhamG
      
      Differential Revision: D30005548
      
      fbshipit-source-id: 6154069f7dfd9b3c9b1b1a7ea3552081c7d1641e
    • Cast an unused variable to void during NDEBUG builds · 14933d5f
      Nathan Lanza authored
      Summary:
      The only usage of `remaining` here is in an assert, and thus clang13
      errors out on NDEBUG builds. Give it a trivial "usage" here.
      
      Reviewed By: smeenai
      
      Differential Revision: D29719883
      
      fbshipit-source-id: 8141c3ee8ed1b852ea04ec03954c8fe1ac83c12b
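The pattern is small enough to show in full. The function below is a hypothetical stand-in (the commit does not show the surrounding code); only the variable name `remaining` comes from the summary. Under `NDEBUG`, `assert` expands to nothing, so without the cast the variable would be unused and the compiler would diagnose it.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical function for illustration: copies n bytes and checks, in debug
// builds only, that nothing is left over.
inline std::size_t copyBytes(const char* src, char* dst, std::size_t n) {
  std::size_t copied = 0;
  for (; copied < n; ++copied) {
    dst[copied] = src[copied];
  }
  const std::size_t remaining = n - copied;
  assert(remaining == 0);  // compiled out when NDEBUG is defined
  (void)remaining;         // trivial "usage" so NDEBUG builds stay clean
  return copied;
}
```

In C++17 and later, declaring the variable `[[maybe_unused]]` is an alternative to the cast.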
    • Rename alpn option in folly/openssl · a87dba71
      Fred Qiu authored
      Summary:
      Renamed the ALPN option from `requireAlpnIfClientSupports` to
      `alpnAllowMismatch`.
      
      Reviewed By: knekritz
      
      Differential Revision: D29968118
      
      fbshipit-source-id: bb515efec22520e52444530a2de2b3835691c26c
    • fbstring: switch FOLLY_NOINLINE inline to FOLLY_NOINLINE · 24821079
      Dan Melnic authored
      Summary: fbstring: switch FOLLY_NOINLINE inline to FOLLY_NOINLINE
      
      Reviewed By: yfeldblum, ot
      
      Differential Revision: D29978049
      
      fbshipit-source-id: ecacb51d58c0c8f10ce6f67ca1d51e9de60bff55
  7. 29 Jul, 2021 3 commits
    • workaround LLVM-12 coro bug · 57f9d2cd
      Igor Sugak authored
      Reviewed By: yfeldblum
      
      Differential Revision: D29979792
      
      fbshipit-source-id: c18155884f1ac737b7b222b9804398b6bb31049e
    • Mock Cpp2ConnContext · 8adc1e1f
      Fred Qiu authored
      Summary:
      Mock Cpp2ConnContext so that the cert supplied to the constructor is available to the
      Cpp2::ConnContext->getTransport()->getPeerCertificate() call. This change is needed by other changes.
      
      Reviewed By: edenzik
      
      Differential Revision: D29791757
      
      fbshipit-source-id: 42b8bb6024c880b038925a3f95b855de0488033b
    • fix nullptr-with-nonzero-offset UB in CacheLocality.h · 3e9865fc
      Igor Sugak authored
      Reviewed By: yfeldblum, andriigrynenko
      
      Differential Revision: D29914176
      
      fbshipit-source-id: 981b75acec24391822ace833d4265fed0c5d0cb3
  8. 28 Jul, 2021 9 commits
  9. 27 Jul, 2021 3 commits
    • spell small-vector uses of the trait as is_trivially_copyable_v · c3c6b788
      Yedidya Feldblum authored
      Summary: The `folly::` prefix is not necessary from within folly's namespace and the `_v` is available on the folly trait even in C++14 builds when `std::is_trivially_copyable_v` from C++17 is unavailable.
      
      Reviewed By: ot
      
      Differential Revision: D29925029
      
      fbshipit-source-id: 570d17c57ca68bea1c7c8b80ce59d8560b1aba2b
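A `_v` shorthand of this kind can be provided in C++14 because variable templates are a C++14 feature, while the `std::..._v` shorthands only arrived in C++17. The sketch below uses a hypothetical `example` namespace and is not folly's actual definition.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

namespace example {

// C++14 variable template wrapping the C++11 trait, usable where
// std::is_trivially_copyable_v (C++17) is not yet available.
template <class T>
constexpr bool is_trivially_copyable_v = std::is_trivially_copyable<T>::value;

struct Pod {
  int a;
  double b;
};

// Inside the namespace the trait can be spelled without any prefix.
static_assert(is_trivially_copyable_v<Pod>, "plain aggregates are trivially copyable");
static_assert(!is_trivially_copyable_v<std::string>, "std::string is not");

} // namespace example
```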
    • Allow using CancellableAsyncScope with external cancellation token · 2f5a71dd
      Shai Szulanski authored
      Summary: Right now the internal cancellation signal will be silently ignored, which hurts usability. Add a mechanism for merging with an external source and a comment describing the pitfall.
      
      Reviewed By: capickett
      
      Differential Revision: D29935016
      
      fbshipit-source-id: 9311930ff9fbfc0470fdcfb5d425f36e7f0aff06
    • SharedMutex: Change SharedMutexPolicyDefault and change default spin and yield counts · c17ed205
      Maged Michael authored
      Summary:
      Change SharedMutexPolicyDefault to include max_spin_count and max_soft_yield_count instead of bool block_immediately.
      
      Change the default spin and yield counts from 1000 and 1000 to 2 and 1, respectively.
      
      Reviewed By: ot
      
      Differential Revision: D29559594
      
      fbshipit-source-id: 93a3bdf43c20f456031265daf7b76ab40e3dcbdf
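To illustrate what a policy carrying `max_spin_count` and `max_soft_yield_count` controls, here is a toy exclusive lock that spins, then yields, then backs off. It is only a sketch under stated assumptions: `PolicySketch` and `SpinYieldLockSketch` are hypothetical names, folly's `SharedMutex` is far more involved (shared and upgrade modes, futex-based blocking), and the sleep loop here merely stands in for real blocking. The member names and the default values 2 and 1 are taken from the summary above.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <cstdint>
#include <thread>

// Policy with the two knobs named in the commit summary.
struct PolicySketch {
  static constexpr std::uint32_t max_spin_count = 2;
  static constexpr std::uint32_t max_soft_yield_count = 1;
};

template <class Policy = PolicySketch>
class SpinYieldLockSketch {
 public:
  bool try_lock() {
    // exchange returns the previous value; false means we took the lock.
    return !locked_.exchange(true, std::memory_order_acquire);
  }

  void lock() {
    // Phase 1: a bounded number of immediate attempts.
    for (std::uint32_t i = 0; i < Policy::max_spin_count; ++i) {
      if (try_lock()) {
        return;
      }
    }
    // Phase 2: a bounded number of attempts with a yield before each.
    for (std::uint32_t i = 0; i < Policy::max_soft_yield_count; ++i) {
      std::this_thread::yield();
      if (try_lock()) {
        return;
      }
    }
    // Phase 3: stand-in for blocking; a real mutex would park the thread.
    while (!try_lock()) {
      std::this_thread::sleep_for(std::chrono::microseconds(50));
    }
  }

  void unlock() { locked_.store(false, std::memory_order_release); }

 private:
  std::atomic<bool> locked_{false};
};
```

Lower counts mean a contended thread reaches the blocking phase sooner and burns less CPU while waiting, which is the trade-off the new defaults move toward.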
  10. 26 Jul, 2021 3 commits
  11. 23 Jul, 2021 6 commits
    • Do not leak GFlags.h in widely included headers · ff841baa
      Giuseppe Ottaviano authored
      Summary: It defines several generically named macros, so it is better to avoid including it everywhere.
      
      Reviewed By: aary
      
      Differential Revision: D29870524
      
      fbshipit-source-id: b8703a737a6dc53e00c13daebc445855bfbadd1f
    • Fix SSL exception slicing · 5e2ab64f
      Alan Frindell authored
      Summary:
      SSLException derives from AsyncSocketException, so we need to construct the exception_wrapper differently to prevent slicing it.
      
      I wish there were a more future-proof way to do this
      
      Reviewed By: yangchi
      
      Differential Revision: D29836520
      
      fbshipit-source-id: df4222d94952c66b4c86f12861b3792babdce3c6
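The slicing hazard can be reproduced with the standard library alone. In this sketch `std::exception_ptr` stands in for `folly::exception_wrapper`, and `SocketErrorSketch` / `SslErrorSketch` are hypothetical stand-ins for `AsyncSocketException` / `SSLException`; it does not show the actual fix in folly, only why naming the base type when wrapping loses the derived type.

```cpp
#include <cassert>
#include <exception>
#include <stdexcept>
#include <string>

struct SocketErrorSketch : std::runtime_error {
  explicit SocketErrorSketch(const std::string& m) : std::runtime_error(m) {}
};

struct SslErrorSketch : SocketErrorSketch {
  explicit SslErrorSketch(const std::string& m) : SocketErrorSketch(m) {}
};

// Wrapping with the base type named explicitly copies only the base
// subobject: the stored exception is a SocketErrorSketch (sliced).
inline std::exception_ptr wrapSliced(const SslErrorSketch& e) {
  return std::make_exception_ptr<SocketErrorSketch>(e);
}

// Letting the type be deduced stores a full SslErrorSketch.
inline std::exception_ptr wrapExact(const SslErrorSketch& e) {
  return std::make_exception_ptr(e);
}

// Reports whether the wrapped exception can still be caught as the derived type.
inline bool holdsSslError(const std::exception_ptr& p) {
  try {
    std::rethrow_exception(p);
  } catch (const SslErrorSketch&) {
    return true;
  } catch (...) {
    return false;
  }
}
```

A handler that catches the derived type will silently miss the sliced exception, which is why the construction site has to know the dynamic type.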
    • Make co_awaitTry(AsyncGenerator) return Try<NextResult<T>> · 832f135a
      Shai Szulanski authored
      Summary:
      There are two problems with the current approach of returning Try<T>:
      - It is impossible to write generic algorithms like coro::timeout that convert any awaitable into a Task of its await result without throwing exceptions because there's no way to reconstruct the expected return type. More generally, we want the property that the await_try_result_t::element_type matches the await_result_t so we can make drop-in replacements by wrapping in functions like timeout.
      - There's no way to both avoid moving yielded values and avoid throwing exceptions because Try doesn't support references (and an earlier diff adding this support was rejected), which means the two performance optimizations available to users of AsyncGenerator are mutually exclusive.
      
      We fix this to restore the aforementioned invariant by wrapping the existing result type. This is a marginal inefficiency, so if we notice regressions as a result we can specialize these Try instantiations to consolidate the storage. For now we do not expect this to matter.
      
      Reviewed By: andriigrynenko
      
      Differential Revision: D29680441
      
      fbshipit-source-id: 4ef74f4645d990b623bb95a297718fb576a9b977
    • RequestContext::StaticContextAccessor · 7fc541e8
      Pranjal Raihan authored
      Summary: `RequestContext::StaticContextAccessor` acts as a guard, preventing all threads with a `StaticContext` from being destroyed (or created).
      
      Reviewed By: dtolnay
      
      Differential Revision: D29684337
      
      fbshipit-source-id: 2b785b9293dd0b9c190512363afddaff50ec1f01
    • Don't use typeid without RTTI in UniqueInstance · 44683993
      Pranjal Raihan authored
      Summary:
      The class depends on RTTI. It's a sanity check that crashes if two instances of a singleton are created. So doing nothing in `-fno-rtti` code is fine.
      
      Redo of D29630207 (https://github.com/facebook/folly/commit/160eb4d284eb67cc2641b6718c964dab8fc6486b)
      
      Reviewed By: dtolnay
      
      Differential Revision: D29684338
      
      fbshipit-source-id: 38355df5297681329f227fd10570a816f4672b9b
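A guard of this kind might look like the sketch below. It is not the code from `UniqueInstance`: `EXAMPLE_HAS_RTTI` and `sanityCheckSameType` are hypothetical names, and the detection relies on the `__cpp_rtti` feature-test macro plus the compiler-specific `__GXX_RTTI` (GCC/Clang) and `_CPPRTTI` (MSVC) macros. When RTTI is disabled, the check degrades to doing nothing, as the summary says is acceptable for a sanity check.

```cpp
#include <cassert>

// Detect RTTI via compiler-provided macros.
#if defined(__cpp_rtti) || defined(__GXX_RTTI) || defined(_CPPRTTI)
#define EXAMPLE_HAS_RTTI 1
#else
#define EXAMPLE_HAS_RTTI 0
#endif

#if EXAMPLE_HAS_RTTI
#include <typeinfo>
#endif

// Hypothetical sanity check: compares two types with typeid when RTTI is
// available and reports success unconditionally when it is not.
template <class Expected, class Actual>
bool sanityCheckSameType() {
#if EXAMPLE_HAS_RTTI
  return typeid(Expected) == typeid(Actual);
#else
  return true;  // -fno-rtti build: skip the check instead of failing to compile
#endif
}
```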
    • makeUnorderedAsyncGeneratorFromAwaitableRange -> makeUnorderedAsyncGenerator · 3275d892
      Shai Szulanski authored
      Summary: We use the FromBla to distinguish collect-range (type fixed, count varies) algorithms from collect-tuple (types vary, count fixed) algorithms. But in this case there is no sensible translation from collect-tuple to an async-generator so the from-range bit is not necessary.
      
      Reviewed By: vitaut
      
      Differential Revision: D29877021
      
      fbshipit-source-id: 69dfa764fca880bd3770a4d57ff0d60fe500a206
  12. 22 Jul, 2021 2 commits
    • Use CoreCachedSharedPtr in Singleton · 937fc980
      Giuseppe Ottaviano authored
      Summary: `CoreCachedSharedPtr` is almost as fast as `ReadMostlySharedPtr`, so we can use it to have a better default that does not have pathological behavior under heavy contention. `try_get_fast()` is still useful if we need to squeeze out the last cycle.
      
      Reviewed By: philippv, luciang
      
      Differential Revision: D29812053
      
      fbshipit-source-id: 49e9e53444f8099dbfe13e36c3c07c1b57bb89fb
    • thrift: varint: BMI2 (pdep) based varint encoding: branchless 2-5x faster than loop unrolled · 4baba282
      Lucian Grijincu authored
      Summary:
      BMI2 (`pdep`) varint encoding that's mostly branchless. It's 2-5x faster than the current loop-unrolled version.
      
      Being mostly branchless there's less variability in micro-benchmark runtime compared to the loop-unrolled version:
      - the loop-unrolled versions are slowest when encoding random numbers across the entire 64-bit range (some likely large) and branch prediction has most failures.
      
      Kept the fast path for values <127 (encoded in 1 byte) which are likely to be frequent. I couldn't find a fully branchless version that performed better anyway.
      
      TLDR:
      - `u8`: unroll the two possible values (1B and 2B encoding). Faster in micro-benchmarks than branchless versions I tried (needed more instructions to produce the same value without branches).
      - `u16` & `u32`:
      -- u16 encodes in up to 3B, u32 in up to 5B.
      -- Use `pdep` to encode into a u64 (8 bytes). Write 8 bytes to `QueueAppender`, but keep track of only the bytes that had to be written. This is faster than appending a buffer of bytes using &u64 and size.
      -- u16 could be written by encoding using `_pdep_u32` (3 bytes max fit in u32) and using smaller 16B lookup tables. In micro-benchmarks that's not faster than using the same code as the one to encode u32 using `_pdep_u64`. In prod it will perform better due to sharing the same lookup tables with the u32 and u64 versions (less d-cache pressure).
      - `u64`: needs up to 10B. `pdep` to encode first 8B and unconditionally write last 2B too (but keep track of `QueueAppender` size properly).
      
      Reviewed By: vitaut
      
      Differential Revision: D29250074
      
      fbshipit-source-id: 1f6a266f45248fcbea30a62ed347564589cb3348
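The core bit trick behind the u32 case can be sketched portably. This is not the code from the diff: `pdepSoft` is a software stand-in for the BMI2 `pdep` instruction (the `_pdep_u64` intrinsic) so the example builds without BMI2, the function names are hypothetical, and the byte count is computed with a small loop where the summary mentions lookup tables. The sketch also writes only the needed bytes rather than a full 8-byte word.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reference encoder: the classic loop, 7 payload bits per byte, with the top
// bit set on every byte except the last.
inline std::size_t encodeVarintLoop(std::uint64_t v, std::uint8_t* out) {
  std::size_t n = 0;
  while (v >= 0x80) {
    out[n++] = static_cast<std::uint8_t>((v & 0x7f) | 0x80);
    v >>= 7;
  }
  out[n++] = static_cast<std::uint8_t>(v);
  return n;
}

// Software emulation of pdep: deposits the low bits of src into the
// positions of the set bits of mask, lowest first.
inline std::uint64_t pdepSoft(std::uint64_t src, std::uint64_t mask) {
  std::uint64_t out = 0;
  for (std::uint64_t bit = 1; mask != 0; bit <<= 1) {
    const std::uint64_t lowest = mask & (~mask + 1);  // lowest set bit of mask
    if (src & bit) {
      out |= lowest;
    }
    mask &= mask - 1;  // clear that bit
  }
  return out;
}

// pdep-style encoder for u32: scatter the value into 7-bit groups, one per
// byte of a u64, then OR in continuation bits for all but the last byte.
inline std::size_t encodeVarint32Pdep(std::uint32_t v, std::uint8_t* out) {
  std::size_t n = 1;  // number of output bytes, 1..5
  for (std::uint32_t t = v >> 7; t != 0; t >>= 7) {
    ++n;
  }
  const std::uint64_t spread = pdepSoft(v, 0x0000007f7f7f7f7fULL);
  const std::uint64_t cont =
      0x8080808080808080ULL & ((1ULL << (8 * (n - 1))) - 1);
  const std::uint64_t word = spread | cont;
  for (std::size_t i = 0; i < n; ++i) {
    out[i] = static_cast<std::uint8_t>(word >> (8 * i));
  }
  return n;
}
```

For example, 300 scatters to `0x022C`, the continuation mask for two bytes is `0x80`, and the encoded bytes are `0xAC 0x02`, matching the loop encoder. On hardware with BMI2, `pdepSoft` would be replaced by the intrinsic.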