better code for aarch64 F14 maps and sets
Summary: vshrn_n_u16 can be used to efficiently move a bit of information from every byte of a 16-byte vector into an 8-byte vector, which is better than the previous NEON sequence used during tag matching. The resulting code is faster and smaller on aarch64. The x86_64 code is refactored but should compile to the same assembly.

Reviewed By: shixiao

Differential Revision: D8420917

fbshipit-source-id: 21a9f920f55ffc479b20fee6882a5987b626c89a
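For illustration only, the narrowing-shift idea from the summary can be sketched as below. This is not the code from the diff: the function name `matchMask`, the constant `kEveryFourthBit`, and the scalar fallback are assumptions made for this example. The sketch compares 16 tag bytes against a needle, then uses `vshrn_n_u16` with a shift of 4 to squeeze each 0xFF/0x00 comparison byte down to a nibble, so the whole result fits in one 64-bit integer with one bit kept per input byte.

```cpp
#include <cassert>
#include <cstdint>
#if defined(__aarch64__)
#include <arm_neon.h>
#endif

// Hypothetical constant for this sketch: keeps bit 0 of every nibble.
constexpr std::uint64_t kEveryFourthBit = 0x1111111111111111ULL;

// Hypothetical helper (not folly's API). Returns a 64-bit mask in which
// bit 4*i is set iff tags[i] == needle, for i in [0, 16).
inline std::uint64_t matchMask(const std::uint8_t* tags, std::uint8_t needle) {
  assert(tags != nullptr);
#if defined(__aarch64__)
  uint8x16_t tagV = vld1q_u8(tags);
  // Each byte becomes 0xFF on a match and 0x00 otherwise.
  uint8x16_t eqV = vceqq_u8(tagV, vdupq_n_u8(needle));
  // View the vector as eight 16-bit lanes, shift each right by 4, and
  // narrow to 8 bits. Every input byte contributes one nibble, so the
  // 16 comparison bytes collapse into 8 bytes.
  uint8x8_t nibbles = vshrn_n_u16(vreinterpretq_u16_u8(eqV), 4);
  std::uint64_t mask = vget_lane_u64(vreinterpret_u64_u8(nibbles), 0);
  return mask & kEveryFourthBit;
#else
  // Portable fallback so the sketch also builds on other targets.
  std::uint64_t mask = 0;
  for (unsigned i = 0; i < 16; ++i) {
    if (tags[i] == needle) {
      mask |= std::uint64_t{1} << (4 * i);
    }
  }
  return mask;
#endif
}
```

With this layout, the index of a matching byte is the position of a set bit divided by 4, so iterating over matches amounts to repeatedly finding and clearing the lowest set bit of the mask.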