From: Nicholas Ormrod <njormrod@fb.com>
Date: Wed, 26 Nov 2014 00:36:20 +0000 (-0800)
Subject: zlib compression fails on large IOBufs
X-Git-Tag: v0.22.0~144
X-Git-Url: http://demsky.eecs.uci.edu/git/?a=commitdiff_plain;h=eb9d7cd7754c3f21ec767e261dff0906c35d0cf6;p=folly.git

zlib compression fails on large IOBufs

Summary:
If a single IOBuf is 2^32 bytes or larger, our zlib
compression algorithm fails. Specifically, zlib's z_stream.avail_in is
only 32 bits wide (it is declared as uInt, an unsigned int; see
http://www.gzip.org/zlib/zlib_faq.html#faq32), so a too-big IOBuf
overflows avail_in and causes data loss.

This diff breaks up large IOBufs into smaller chunks.
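
The chunking idea can be sketched on its own, without zlib. This is a
minimal illustration, not the code in this diff: splitIntoSteps is a
hypothetical helper, and maxStep stands in for maxSingleStepLength, whose
value this message does not state.

```cpp
#include <algorithm>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical helper (not part of folly): splits a buffer of `size` bytes
// into (offset, length) steps, each no longer than `maxStep`, so that every
// length fits in a 32-bit field such as zlib's z_stream.avail_in.
std::vector<std::pair<uint64_t, uint32_t>> splitIntoSteps(
    uint64_t size, uint32_t maxStep) {
  std::vector<std::pair<uint64_t, uint32_t>> steps;
  if (maxStep == 0) {
    return steps;  // a zero-length step would never make progress
  }
  uint64_t offset = 0;       // bytes already handed out
  uint64_t remaining = size; // bytes still to hand out
  while (remaining != 0) {
    // The min is taken in 64 bits, so the narrowing cast cannot truncate.
    uint32_t step =
        static_cast<uint32_t>(std::min<uint64_t>(remaining, maxStep));
    steps.emplace_back(offset, step);
    offset += step;
    remaining -= step;
  }
  return steps;
}
```

For a 10-byte buffer and a maximum step of 4, this yields the steps
(0, 4), (4, 4), (8, 2). In the compression loop, each step would set
next_in to data + offset and avail_in to length before calling deflate.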

Test Plan:
fbconfig -r folly && fbmake runtests

Also compressed biggrep's configerator blob, which is how this bug was
caught. It now works. See the associated task.

Reviewed By: robbert@fb.com

Subscribers: trunkagent, sdwilsh, njormrod, folly-diffs@

FB internal diff: D1702925

Tasks: 5648445

Signature: t1:1702925:1416958232:459d498ff1db13e1a20766855e6f2f97da8cde8c
---

diff --git a/folly/io/Compression.cpp b/folly/io/Compression.cpp
index 8c8fe61f..d7a0544d 100644
--- a/folly/io/Compression.cpp
+++ b/folly/io/Compression.cpp
@@ -553,21 +553,25 @@ std::unique_ptr<IOBuf> ZlibCodec::doCompress(const IOBuf* data) {
        defaultBufferLength));
 
   for (auto& range : *data) {
-    if (range.empty()) {
-      continue;
-    }
-
-    stream.next_in = const_cast<uint8_t*>(range.data());
-    stream.avail_in = range.size();
-
-    while (stream.avail_in != 0) {
-      if (stream.avail_out == 0) {
-        out->prependChain(addOutputBuffer(&stream, defaultBufferLength));
+    uint64_t remaining = range.size();
+    uint64_t written = 0;
+    while (remaining) {
+      uint32_t step = (remaining > maxSingleStepLength ?
+                       maxSingleStepLength : remaining);
+      stream.next_in = const_cast<uint8_t*>(range.data() + written);
+      stream.avail_in = step;
+      remaining -= step;
+      written += step;
+
+      while (stream.avail_in != 0) {
+        if (stream.avail_out == 0) {
+          out->prependChain(addOutputBuffer(&stream, defaultBufferLength));
+        }
+
+        rc = deflate(&stream, Z_NO_FLUSH);
+
+        CHECK_EQ(rc, Z_OK) << stream.msg;
       }
-
-      rc = deflate(&stream, Z_NO_FLUSH);
-
-      CHECK_EQ(rc, Z_OK) << stream.msg;
     }
   }