Philip Pronin authored
Summary:
I just found that gcc (4.8.2) failed to unroll the loop in `pullAtMost()`, so it didn't replace `memcpy` with a simple load for small `len`.

Test Plan:
fbconfig -r folly/io/test thrift/lib/cpp2/test && fbmake runtests_opt -j32

Ran unicorn-specific thrift deserialization benchmark from D1724070, verified 50% improvement in `SearchRequest` deserialization performance.

`thrift/lib/cpp2/test/ProtocolBench` results:

```
                                                 |---- before -----|   |---- after -----|
=========================================================================================
thrift/lib/cpp2/test/ProtocolBench.cpp  relative  time/iter  iters/s   time/iter  iters/s
=========================================================================================
BinaryProtocol_read_Empty                           21.72ns   46.04M     17.58ns   56.89M
BinaryProtocol_read_SmallInt                        43.03ns   23.24M     23.64ns   42.30M
BinaryProtocol_read_BigInt                          43.72ns   22.87M     22.03ns   45.38M
BinaryProtocol_read_SmallString                     88.57ns   11.29M     47.01ns   21.27M
BinaryProtocol_read_BigString                      365.76ns    2.73M    323.58ns    3.09M
BinaryProtocol_read_BigBinary                      207.78ns    4.81M    169.09ns    5.91M
BinaryProtocol_read_LargeBinary                    187.81ns    5.32M    172.09ns    5.81M
BinaryProtocol_read_Mixed                          161.18ns    6.20M     68.41ns   14.62M
BinaryProtocol_read_SmallListInt                   177.32ns    5.64M     96.91ns   10.32M
BinaryProtocol_read_BigListInt                      77.03us   12.98K     15.88us   62.97K
BinaryProtocol_read_BigListMixed                     1.79ms   557.79    923.99us    1.08K
BinaryProtocol_read_LargeListMixed                 195.01ms     5.13    103.78ms     9.64
=========================================================================================
```

Reviewed By: soren@fb.com

Subscribers: alandau, bmatheny, mshneer, trunkagent, njormrod, folly-diffs@

FB internal diff: D1724111

Tasks: 5770136

Signature: t1:1724111:1417977810:b7d643d0c819a0bbac77fa0048206153929e50a8
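Illustration (not the code from this diff): the effect described in the Summary can be sketched with a simplified reader over a list of chunks. `Chunk`, `ChunkReader`, and `read<T>()` are hypothetical names made up for this sketch; only `pullAtMost()` borrows its name from the Summary, and its body here is an assumption, not folly's implementation. The point is that a `memcpy` whose size is a compile-time constant such as `sizeof(T)` is something optimizing compilers can usually lower to a plain load, while a `memcpy` inside a loop with a runtime length generally stays a real call unless the loop is unrolled and the length propagated.

```cpp
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <utility>
#include <vector>

// Hypothetical stand-in for one buffer in a chain (not folly::IOBuf).
struct Chunk {
  const std::uint8_t* data;
  std::size_t len;
};

// Hypothetical reader; a simplified sketch, not folly's Cursor.
class ChunkReader {
 public:
  explicit ChunkReader(std::vector<Chunk> chunks)
      : chunks_(std::move(chunks)) {}

  // General path: `len` is only known at run time and the copy may span
  // several chunks, so each memcpy below has a non-constant size.
  std::size_t pullAtMost(void* out, std::size_t len) {
    auto* dst = static_cast<std::uint8_t*>(out);
    std::size_t copied = 0;
    while (copied < len && idx_ < chunks_.size()) {
      const Chunk& c = chunks_[idx_];
      const std::size_t avail = c.len - off_;
      const std::size_t want = len - copied;
      const std::size_t n = avail < want ? avail : want;
      std::memcpy(dst + copied, c.data + off_, n);
      copied += n;
      off_ += n;
      if (off_ == c.len) {  // current chunk exhausted, move to the next
        ++idx_;
        off_ = 0;
      }
    }
    return copied;
  }

  // Fast path: when the current chunk holds all sizeof(T) bytes, copy with
  // a constant size; otherwise fall back to the general loop.
  template <class T>
  bool read(T& out) {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable");
    if (idx_ < chunks_.size() && chunks_[idx_].len - off_ >= sizeof(T)) {
      std::memcpy(&out, chunks_[idx_].data + off_, sizeof(T));
      off_ += sizeof(T);
      if (off_ == chunks_[idx_].len) {
        ++idx_;
        off_ = 0;
      }
      return true;
    }
    return pullAtMost(&out, sizeof(T)) == sizeof(T);
  }

 private:
  std::vector<Chunk> chunks_;
  std::size_t idx_ = 0;  // index of the current chunk
  std::size_t off_ = 0;  // read offset inside the current chunk
};
```

Whether a given compiler actually emits a single load for the fast path depends on its version and optimization flags, so the generated assembly is worth checking before relying on it.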
173356a3