Optimize Function::operator() codegen
Summary:
[Folly] Optimize `Function::operator()` codegen for size and speed.
* Avoid translating between values and references for trivially-copyable values.
* Avoid shifting all arguments to make room for the function object address.

In the optimal case, the codegen for calling a `Function` with many arguments translates into just a `jmp`.

See:
* https://github.com/thecppzoo/zoo/commits/master/inc/zoo/AnyCallable.h
* https://github.com/bloomberg/bde/blob/3.38.0.1/groups/bsl/bslstl/bslstl_function.h
* https://github.com/bloomberg/bde/blob/3.38.0.1/groups/bsl/bslmf/bslmf_forwardingtype.h

Given this example code:
```lang=c++,name=check.cpp
extern "C" void check_0(folly::Function<void()>& f) { f(); }
extern "C" void check_1(int i, folly::Function<void(int)>& f) { f(i); }
extern "C" void check_2(int i, int j, folly::Function<void(int, int)>& f) { f(i, j); }
extern "C" void check_3(int i, int j, int k, folly::Function<void(int, int, int)>& f) { f(i, j, k); }
extern "C" void check_4(int i, int j, int k, int l, folly::Function<void(int, int, int, int)>& f) { f(i, j, k, l); }
```

Before:
```name=check.o
0000000000000000 <check_0>:
   0: ff 67 30       jmp    QWORD PTR [rdi+0x30]

0000000000000000 <check_1>:
   0: 55             push   rbp
   1: 48 89 f0       mov    rax,rsi
   4: 48 89 e5       mov    rbp,rsp
   7: 48 83 ec 10    sub    rsp,0x10
   b: 89 7d fc       mov    DWORD PTR [rbp-0x4],edi
   e: 48 8d 75 fc    lea    rsi,[rbp-0x4]
  12: 48 89 c7       mov    rdi,rax
  15: ff 50 30       call   QWORD PTR [rax+0x30]
  18: c9             leave
  19: c3             ret

0000000000000000 <check_2>:
   0: 55             push   rbp
   1: 48 89 d0       mov    rax,rdx
   4: 48 89 e5       mov    rbp,rsp
   7: 48 83 ec 10    sub    rsp,0x10
   b: 89 7d f8       mov    DWORD PTR [rbp-0x8],edi
   e: 89 75 fc       mov    DWORD PTR [rbp-0x4],esi
  11: 48 8d 55 fc    lea    rdx,[rbp-0x4]
  15: 48 8d 75 f8    lea    rsi,[rbp-0x8]
  19: 48 89 c7       mov    rdi,rax
  1c: ff 50 30       call   QWORD PTR [rax+0x30]
  1f: c9             leave
  20: c3             ret

0000000000000000 <check_3>:
   0: 55             push   rbp
   1: 48 89 c8       mov    rax,rcx
   4: 48 89 e5       mov    rbp,rsp
   7: 48 83 ec 10    sub    rsp,0x10
   b: 89 7d f4       mov    DWORD PTR [rbp-0xc],edi
   e: 89 75 f8       mov    DWORD PTR [rbp-0x8],esi
  11: 89 55 fc       mov    DWORD PTR [rbp-0x4],edx
  14: 48 8d 4d fc    lea    rcx,[rbp-0x4]
  18: 48 8d 55 f8    lea    rdx,[rbp-0x8]
  1c: 48 8d 75 f4    lea    rsi,[rbp-0xc]
  20: 48 89 c7       mov    rdi,rax
  23: ff 50 30       call   QWORD PTR [rax+0x30]
  26: c9             leave
  27: c3             ret

0000000000000000 <check_4>:
   0: 55             push   rbp
   1: 4c 89 c0       mov    rax,r8
   4: 48 89 e5       mov    rbp,rsp
   7: 48 83 ec 10    sub    rsp,0x10
   b: 89 7d f0       mov    DWORD PTR [rbp-0x10],edi
   e: 89 75 f4       mov    DWORD PTR [rbp-0xc],esi
  11: 89 55 f8       mov    DWORD PTR [rbp-0x8],edx
  14: 89 4d fc       mov    DWORD PTR [rbp-0x4],ecx
  17: 4c 8d 45 fc    lea    r8,[rbp-0x4]
  1b: 48 8d 4d f8    lea    rcx,[rbp-0x8]
  1f: 48 8d 55 f4    lea    rdx,[rbp-0xc]
  23: 48 8d 75 f0    lea    rsi,[rbp-0x10]
  27: 48 89 c7       mov    rdi,rax
  2a: ff 50 30       call   QWORD PTR [rax+0x30]
  2d: c9             leave
  2e: c3             ret
```

After:
```name=check.o
0000000000000000 <check_0>:
   0: ff 67 30       jmp    QWORD PTR [rdi+0x30]

0000000000000000 <check_1>:
   0: ff 66 30       jmp    QWORD PTR [rsi+0x30]

0000000000000000 <check_2>:
   0: ff 62 30       jmp    QWORD PTR [rdx+0x30]

0000000000000000 <check_3>:
   0: ff 61 30       jmp    QWORD PTR [rcx+0x30]

0000000000000000 <check_4>:
   0: 41 ff 60 30    jmp    QWORD PTR [r8+0x30]
```

Reviewed By: luciang

Differential Revision: D17523239

fbshipit-source-id: beed0bae827aad8290e807374e8596f71f98ce99
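
Note: the two bullets in the summary (pass trivially-copyable arguments by value, and avoid shifting arguments to make room for the function object address) can be illustrated with a minimal sketch. This is not the code from the diff. The names `ThunkArg`, `TinyFunctionRef`, and `demo` are hypothetical, and the sketch assumes a non-owning wrapper rather than folly's owning `Function`. It only shows the shape of the idea: the type-erased thunk takes the erased object pointer as its last parameter, so the caller's arguments are already in the right registers.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical trait: trivially-copyable argument types travel through the
// thunk by value; everything else (including references) travels as T&&.
template <typename T>
using ThunkArg =
    std::conditional_t<std::is_trivially_copyable<T>::value, T, T&&>;

static_assert(std::is_same<ThunkArg<int>, int>::value, "by value");
static_assert(std::is_same<ThunkArg<std::string>, std::string&&>::value,
              "by rvalue reference");
static_assert(std::is_same<ThunkArg<int&>, int&>::value,
              "lvalue references collapse to themselves");

template <typename Sig>
class TinyFunctionRef;

// Hypothetical non-owning callable wrapper, for illustration only.
template <typename R, typename... Args>
class TinyFunctionRef<R(Args...)> {
  // The erased object pointer goes LAST, so no argument has to be moved to
  // a different register to make room for it.
  using Thunk = R (*)(ThunkArg<Args>..., void*);

  template <typename F>
  static R invoke(ThunkArg<Args>... args, void* obj) {
    return (*static_cast<F*>(obj))(static_cast<Args&&>(args)...);
  }

  void* obj_;
  Thunk thunk_;

 public:
  template <typename F>
  explicit TinyFunctionRef(F& f) : obj_(&f), thunk_(&invoke<F>) {}

  R operator()(Args... args) const {
    // Arguments keep their order; only the object pointer is appended.
    return thunk_(static_cast<Args&&>(args)..., obj_);
  }
};

// Small usage example: calls a lambda through the wrapper.
inline int demo() {
  int sum = 0;
  auto add = [&sum](int a, int b) { sum += a + b; };
  TinyFunctionRef<void(int, int)> f(add);
  f(2, 3);
  assert(sum == 5);
  return sum;
}
```

Whether a compiler turns `operator()` here into a single tail-call `jmp`, as in the "After" listing, depends on the target ABI and optimization level. The sketch is not a measurement.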