Summary:
@lucian's
D1373308 set fbstring to conservative by default.
This breaks, eg, ti/proxygen/httpclient tests, by failing an assertion
inside of c_str(). Specifically, the terminator is not '\0'.
The fbstring_core move constructor, when sourced by a MediumLarge
string, steals the source's internal data, then resets the source by
calling ##setSmallSize(0)##. That function sets the in-situ size of the
fbstring to zero, thus requalifying the string as a small string;
however, it does nothing to the data - the previous 23 bytes now contain
garbage.
Sources of a move must be in a consistent state after the move is
complete. The source, once a MediumLarge string whose first eight bytes
were a pointer, is now a small string of size zero whose first byte is
not necessarily '\0'. This breaks the FBSTRING_CONSERVATIVE invariant.
This can be fixed by writing a terminator after the setSmallSize call.
I have fixed all setSmallSize locations that do not writeTerminator.
fbstring_core's move constructor is called exclusively from
basic_fbstring's move assignment operator, hence the odd format of the
test case.
== TMI ==
Interestingly, the source will almost certainly* contain a '\0', which
prevents this simple ##str.size() != strlen(str.c_str())## bug from
turning into a memory-trampling monster. The new size of zero is not
what saves us - the 'size' byte of a small fbstring, through a very
clever trick to allow 23-byte in-situ strings, is actually 23 minus the
actual size (now 0), so is 23! Where, then, does the '\0' byte come? A
MediumLarge string's data layout is [pointer, size, capacity]. The
pointer is not guaranteed to contain a '\0', and neither are size or
capacity. However, the size of the string needs to be very large in
order to force the top byte of the size member to be non-zero; in that
case, the string is so large that malloc is returning memory by the
page. Since page sizes are a multiple of 2^8 (almost always, and if not
then I don't think your fbstring can support large enough sizes
anyways), and we use goodMallocSize, the capacity pointer would have a
least signfigicant byte of zero.
Why the (*)? Well, when reserving extra space on a non-refcounted Large
string, the reallocation does not yield its extra goodMallocSize space.
This could be fixed, though probably isn't worth the trouble. Anyways,
since we aren't goodMallocSize-ing the user-supplied requested capacity,
it probably won't contain a '\0'.
Test Plan:
fbconfig -r folly && fbmake runtests
Modify folly/test/FBStringTest.cpp to define FBSTRING_CONSERVATIVE, then
fbconfig folly/test/:fbstring_test_using_jemalloc && fbmake runtests
Note that this fails before the patch is applied.
Note that it is possible for the tests to pass even when this bug is
present, since the top byte of the heap pointer must be non-0 in order
for this error to be triggered.
Reviewed By: lucian@fb.com
Subscribers: folly@lists, njormrod, lucian, markisaa, robbert, sdwilsh, tudorb, jdelong
FB internal diff:
D1376517
#include "folly/Traits.h"
#include "folly/Malloc.h"
#include "folly/Hash.h"
+#include "folly/ScopeGuard.h"
#if FOLLY_HAVE_DEPRECATED_ASSOC
#ifdef _GLIBCXX_SYMVER
// so just disable it on this function.
fbstring_core(const Char *const data, const size_t size)
FBSTRING_DISABLE_ADDRESS_SANITIZER {
+#ifndef NDEBUG
+#ifndef _LIBSTDCXX_FBSTRING
+ SCOPE_EXIT {
+ assert(this->size() == size);
+ assert(memcmp(this->data(), data, size * sizeof(Char)) == 0);
+ };
+#endif
+#endif
+
// Simplest case first: small strings are bitblitted
if (size <= maxSmallSize) {
// Layout is: Char* data_, size_t size_, size_t capacity_
}
}
setSmallSize(size);
+ return;
} else if (size <= maxMediumSize) {
// Medium strings are allocated normally. Don't forget to
// allocate one extra Char for the terminating null.
ml_.capacity_ = effectiveCapacity | isLarge;
}
writeTerminator();
- assert(this->size() == size);
- assert(memcmp(this->data(), data, size * sizeof(Char)) == 0);
}
~fbstring_core() noexcept {
// handling.
assert(ml_.size_ >= delta);
ml_.size_ -= delta;
+ writeTerminator();
} else {
assert(ml_.size_ >= delta);
// Shared large string, must make unique. This is because of the
fbstring_core(ml_.data_, ml_.size_ - delta).swap(*this);
}
// No need to write the terminator.
- return;
}
- writeTerminator();
}
void reserve(size_t minCapacity) {
newSz = sz + delta;
if (newSz <= maxSmallSize) {
setSmallSize(newSz);
- writeTerminator();
return small_ + sz;
}
reserve(newSz);
if (category() == isSmall) {
sz = smallSize();
if (sz < maxSmallSize) {
- setSmallSize(sz + 1);
small_[sz] = c;
- writeTerminator();
+ setSmallSize(sz + 1);
return;
}
reserve(maxSmallSize * 2);
// small_[maxSmallSize].
assert(s <= maxSmallSize);
small_[maxSmallSize] = maxSmallSize - s;
+ writeTerminator();
}
};
EXPECT_EQ("123123abc", b); // if things go wrong, you'd get "123123123"
}
+TEST(FBString, moveTerminator) {
+ // The source of a move must remain in a valid state
+ fbstring s(100, 'x'); // too big to be in-situ
+ fbstring k;
+ k = std::move(s);
+
+ EXPECT_EQ(0, s.size());
+ EXPECT_EQ('\0', *s.c_str());
+}
+
int main(int argc, char** argv) {
testing::InitGoogleTest(&argc, argv);
google::ParseCommandLineFlags(&argc, &argv, true);