Detect popcnt instruction at runtime, use it if available.
Summary:
If compiled for a popcnt-supporting target (-march=corei7, for example),
use __builtin_popcount, which the compiler presumably inlines. Otherwise,
detect support at startup, the same way glibc dispatches to one of its
many flavors of memcpy. GCC lets us attach a resolver function (the
ifunc attribute) that the dynamic loader calls at startup to bind a
function to one of several alternatives. Our resolver uses the cpuid
instruction to check whether popcnt is supported and selects the
popcnt-based implementation if it is.
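
For illustration, the dispatch described above could be sketched roughly
as follows. This is not the code from this diff: the names popcount64,
popcount_generic, popcount_hw, cpu_has_popcnt and resolve_popcount64 are
hypothetical, and the sketch assumes GCC or Clang on x86-64 with glibc
(other targets simply use the generic loop).

```c
#include <stdint.h>

/* Portable fallback: clear the lowest set bit until none remain. */
static int popcount_generic(uint64_t x) {
  int n = 0;
  while (x) {
    x &= x - 1;
    ++n;
  }
  return n;
}

#if defined(__POPCNT__)

/* Built for a popcnt-capable target: the builtin can be inlined. */
int popcount64(uint64_t x) { return __builtin_popcountll(x); }

#elif defined(__GNUC__) && defined(__x86_64__) && defined(__GLIBC__)

#include <cpuid.h>

/* Compiled with popcnt enabled for this one function only, so it
 * must be called only after the CPU check below has passed. */
__attribute__((target("popcnt")))
static int popcount_hw(uint64_t x) {
  return __builtin_popcountll(x);
}

/* CPUID leaf 1 reports popcnt support in bit 23 of ECX. */
static int cpu_has_popcnt(void) {
  unsigned eax, ebx, ecx, edx;
  if (!__get_cpuid(1, &eax, &ebx, &ecx, &edx)) return 0;
  return (ecx >> 23) & 1;
}

typedef int (*popcount_fn)(uint64_t);

/* Resolver: run once by the dynamic loader, which binds popcount64
 * to whichever implementation is returned here. */
static popcount_fn resolve_popcount64(void) {
  return cpu_has_popcnt() ? popcount_hw : popcount_generic;
}

int popcount64(uint64_t x) __attribute__((ifunc("resolve_popcount64")));

#else

/* No ifunc support assumed on this target: always use the fallback. */
int popcount64(uint64_t x) { return popcount_generic(x); }

#endif
```

Callers just call popcount64(); with the ifunc path the choice is made
once at load time, so there is no per-call feature check.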
Test Plan: tests added
Reviewed By: soren@fb.com
FB internal diff: D542977