Make FBVector faster
Summary:Per https://github.com/facebook/folly/issues/379, github user pmalek determined that fbvector's insert function was a lot slower than std::vector's. While insert is often imagined to be a second-class vector function (if you're inserting in the middle, don't use a vector!), it is the correct function to use when inserting multiple elements at the back of an array (since the standard lacks an ##append(multiple-elements)## function). The code, therefore, required optimization.
There are three things that were slowing down fbvector:
- Excessive asserts. The fbvector code contains internal asserts that ensure proper calling of private methods. These are mostly unnecessary, given the extensive test suite that batters fbvector, in addition to fbvector's battle tested history. I have removed these.
- Internal data checks. When inserting into a vector, the elements being inserted may not reference the contents of the vector in question. If the client supplies internal references, then the standard allows for undefined behavior. An old design decision for fbvector was to be forgiving of this error; we check for internal data before performing internal data moves. This is expensive. Further, it is unnecessary when the insertion point is at the end of the vector (which is the valid case we are optimizing) or if the vector is being reallocated. I've rearchitected the insert macros to only perform the internal-data-check when it is absolutely necessary. While rearchitecting the checks, I also reorganized the n==0 base case checking.
- The 'window' function is pretty expensive when called with the end() iterator. It is effectively a no-op, but the function (and some of its called functions) are too big to inline, so the call overhead is non-trivial. I've put window calls behind a conditional to save this overhead.
I added a benchmark test to FBVectorTestBenchmark.cpp.h, but the particular pattern in that file caused the results to be completely optimized away, voiding the test. Oh well.
Running the github benchmarks show a 4x speedup for inserting a single element on the back, bringing it quite close to that of std::vector. The multi-insert function got a bit faster, though the exact number of iterations seems to be having an effect on the numbers (running the tests 10,000 times, intead of only 1,000 times, yields results that are 90% as fast as std::vector, instead of 75%). Both of these cases are now looking quite a bit better.
I resurrected the old StlVectorTest suite, since I was mucking with the complicated insert code. The suite had one latent compilation error (caught with newer compiler warnings - fortunately this particular case was benign). The test suite caught an error in shrink_to_fit, where a clean-slate optimization for empty vectors (see
D2696314) that trampled the vector's allocator. I have fixed that small bug in this diff, too.
Reviewed By: ot
Differential Revision:
D3105962
fb-gh-sync-id:
ac9fa9d7ca79335cf0ff6bb9010af1ed8bd7916b
fbshipit-source-id:
ac9fa9d7ca79335cf0ff6bb9010af1ed8bd7916b