various perf improvements
author James Sedgwick <jsedgwick@fb.com>
Tue, 16 Jun 2015 17:30:18 +0000 (10:30 -0700)
committer Sara Golemon <sgolemon@fb.com>
Wed, 17 Jun 2015 17:26:16 +0000 (10:26 -0700)
commit ca1e87ed0a85daf3abaf2ac423266e869f3961ec
tree 12b7af73b49e4a1860aa2e6aebd27fb7a5c41c57
parent 16b7f86299f935fc16d38d72a8b497e2ab4a1d4f

Summary: Three strategies
1. Optimistic locking
2. Acquire-release memory ordering instead of full sequential consistency
3. Some low-hanging branch misprediction optimizations
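
For illustration, the first two strategies might look roughly like the sketch below. This is not the code in this diff: `TinyFSM`, `State`, and `tryUpdate` are hypothetical names, and the real state machine in folly/futures/detail/FSM.h may differ. The idea is to check the state with a cheap acquire load before taking the lock, re-check under the lock, and publish the new state with a release store instead of the default sequentially consistent one.

```cpp
#include <atomic>
#include <mutex>

// Hypothetical two-state machine, for illustration only.
enum class State { Start, Done };

class TinyFSM {
 public:
  // Try to move from `expected` to `next`, running `action` under the lock.
  // Returns false if the state is not `expected`.
  template <class F>
  bool tryUpdate(State expected, State next, F&& action) {
    // Optimistic check: acquire load, no lock taken. If the state has
    // already moved on, return without touching the mutex.
    if (state_.load(std::memory_order_acquire) != expected) {
      return false;
    }
    std::lock_guard<std::mutex> guard(mutex_);
    // Re-check under the lock: another thread may have won the race
    // between the optimistic check and the lock acquisition.
    if (state_.load(std::memory_order_acquire) != expected) {
      return false;
    }
    action();
    // Release store: writes made by `action` become visible to any thread
    // whose acquire load observes `next`.
    state_.store(next, std::memory_order_release);
    return true;
  }

  State state() const {
    return state_.load(std::memory_order_acquire);
  }

 private:
  std::mutex mutex_;
  std::atomic<State> state_{State::Start};
};
```

A caller would write something like `fsm.tryUpdate(State::Start, State::Done, [&] { /* set result */ })`. A second attempt at the same transition fails on the optimistic check and never contends for the mutex.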

Please review carefully; the dogscience is strong with this one

```
Before:

============================================================================
folly/futures/test/Benchmark.cpp                relative  time/iter  iters/s
============================================================================
constantFuture                                             127.99ns    7.81M
promiseAndFuture                                  94.89%   134.89ns    7.41M
withThen                                          28.40%   450.63ns    2.22M
----------------------------------------------------------------------------
oneThen                                                    446.68ns    2.24M
twoThens                                          58.35%   765.55ns    1.31M
fourThens                                         31.87%     1.40us  713.41K
hundredThens                                       1.61%    27.78us   35.99K
----------------------------------------------------------------------------
no_contention                                                4.63ms   216.00
contention                                        80.79%     5.73ms   174.52
----------------------------------------------------------------------------
throwAndCatch                                               10.91us   91.64K
throwAndCatchWrapped                             127.14%     8.58us  116.51K
throwWrappedAndCatch                             178.22%     6.12us  163.32K
throwWrappedAndCatchWrapped                      793.75%     1.37us  727.38K
----------------------------------------------------------------------------
throwAndCatchContended                                        1.35s  741.33m
throwAndCatchWrappedContended                    139.18%   969.23ms     1.03
throwWrappedAndCatchContended                    169.51%   795.76ms     1.26
throwWrappedAndCatchWrappedContended            17742.23%     7.60ms   131.53
----------------------------------------------------------------------------
complexUnit                                                127.50us    7.84K
complexBlob4                                     100.14%   127.32us    7.85K
complexBlob8                                     100.16%   127.30us    7.86K
complexBlob64                                     96.45%   132.19us    7.57K
complexBlob128                                    92.83%   137.35us    7.28K
complexBlob256                                    87.79%   145.23us    6.89K
complexBlob512                                    81.64%   156.18us    6.40K
complexBlob1024                                   72.54%   175.76us    5.69K
complexBlob2048                                   58.52%   217.89us    4.59K
complexBlob4096                                   32.54%   391.78us    2.55K
============================================================================

After:
============================================================================
folly/futures/test/Benchmark.cpp                relative  time/iter  iters/s
============================================================================
constantFuture                                              85.28ns   11.73M
promiseAndFuture                                  88.63%    96.22ns   10.39M
withThen                                          30.46%   279.99ns    3.57M
----------------------------------------------------------------------------
oneThen                                                    231.18ns    4.33M
twoThens                                          60.57%   381.70ns    2.62M
fourThens                                         33.52%   689.71ns    1.45M
hundredThens                                       1.49%    15.48us   64.58K
----------------------------------------------------------------------------
no_contention                                                3.84ms   260.19
contention                                        88.29%     4.35ms   229.73
----------------------------------------------------------------------------
throwAndCatch                                               10.63us   94.06K
throwAndCatchWrapped                             127.17%     8.36us  119.61K
throwWrappedAndCatch                             179.83%     5.91us  169.15K
throwWrappedAndCatchWrapped                     1014.48%     1.05us  954.19K
----------------------------------------------------------------------------
throwAndCatchContended                                        1.34s  749.03m
throwAndCatchWrappedContended                    140.66%   949.16ms     1.05
throwWrappedAndCatchContended                    164.87%   809.77ms     1.23
throwWrappedAndCatchWrappedContended            49406.39%     2.70ms   370.07
----------------------------------------------------------------------------
complexUnit                                                 86.83us   11.52K
complexBlob4                                      97.42%    89.12us   11.22K
complexBlob8                                      96.63%    89.85us   11.13K
complexBlob64                                     92.53%    93.84us   10.66K
complexBlob128                                    90.85%    95.57us   10.46K
complexBlob256                                    82.56%   105.17us    9.51K
complexBlob512                                    74.13%   117.12us    8.54K
complexBlob1024                                   63.67%   136.37us    7.33K
complexBlob2048                                   50.25%   172.79us    5.79K
complexBlob4096                                   26.63%   326.05us    3.07K
============================================================================
```

Reviewed By: @djwatson

Differential Revision: D2139822
folly/futures/Future-inl.h
folly/futures/Promise-inl.h
folly/futures/Promise.h
folly/futures/Try-inl.h
folly/futures/detail/Core.h
folly/futures/detail/FSM.h
folly/futures/test/Benchmark.cpp