DecompressedHash should fail on trailing input
author Helmut Grohne <helmut@subdivi.de>
Wed, 19 Feb 2014 06:54:21 +0000 (07:54 +0100)
committer Helmut Grohne <helmut@subdivi.de>
Wed, 19 Feb 2014 06:54:21 +0000 (07:54 +0100)
Otherwise, when using the GzipDecompressor, every file smaller than 10
bytes is hashed without error to the hash of the empty input.

Reported-By: Olly Betts
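
The failure mode described above can be sketched as follows. This is only an illustration under assumptions: `BufferingDecompressor` and `unchecked_hexdigest` are hypothetical stand-ins, not the project's GzipDecompressor, and the sketch assumes the 10 byte threshold corresponds to the fixed part of a gzip member header, so shorter input is buffered and never yields output.

```python
import hashlib

# Fixed part of a gzip member header (RFC 1952). Assumed to be the
# reason for the 10 byte threshold mentioned in the commit message.
GZIP_HEADER_LEN = 10


class BufferingDecompressor:
    """Hypothetical stand-in that buffers input until a header is complete."""

    def __init__(self):
        self.inbuffer = b""

    def decompress(self, data):
        self.inbuffer += data
        if len(self.inbuffer) < GZIP_HEADER_LEN:
            # Not enough bytes to parse a header yet: emit nothing.
            return b""
        raise NotImplementedError("real decompression is out of scope here")

    @property
    def unused_data(self):
        # In this sketch nothing is ever consumed, so all buffered
        # input counts as unused.
        return self.inbuffer


def unchecked_hexdigest(blob):
    """Hash the decompressed form of blob without checking unused_data."""
    decomp = BufferingDecompressor()
    hashobj = hashlib.sha256()
    hashobj.update(decomp.decompress(blob))
    return hashobj.hexdigest()


# A 4 byte "file" is indistinguishable from empty input.
print(unchecked_hexdigest(b"tiny") == hashlib.sha256(b"").hexdigest())
```

Inspecting `unused_data` before returning the digest is what lets the caller tell "empty stream" apart from "input that was never decoded".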
diff --git a/dedup/hashing.py b/dedup/hashing.py
index 002eda8..5f015b2 100644
--- a/dedup/hashing.py
+++ b/dedup/hashing.py
@@ -49,9 +49,13 @@ class DecompressedHash(object):
 
     def hexdigest(self):
         if not hasattr(self.decompressor, "flush"):
+            if self.decompressor.unused_data:
+                raise ValueError("decompressor did not consume all data")
             return self.hashobj.hexdigest()
         tmpdecomp = self.decompressor.copy()
         data = tmpdecomp.flush()
+        if tmpdecomp.unused_data:
+            raise ValueError("decompressor did not consume all data")
         tmphash = self.hashobj.copy()
         tmphash.update(data)
         return tmphash.hexdigest()
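
To see the new check in action, here is a minimal sketch that reproduces the `flush` branch of `hexdigest` with Python's standard `zlib.decompressobj()`. `DecompressedHashSketch` and `hash_stream` are hypothetical names introduced for illustration; the `__init__` and `update` methods are assumptions about the surrounding class, which the diff does not show.

```python
import hashlib
import zlib


class DecompressedHashSketch:
    """Simplified stand-in for DecompressedHash (not the project's class)."""

    def __init__(self, decompressor, hashobj):
        self.decompressor = decompressor
        self.hashobj = hashobj

    def update(self, data):
        # Assumed behaviour: feed decompressed bytes into the hash.
        self.hashobj.update(self.decompressor.decompress(data))

    def hexdigest(self):
        # Work on copies so that the object stays usable afterwards.
        tmpdecomp = self.decompressor.copy()
        data = tmpdecomp.flush()
        # Bytes after the end of the compressed stream end up in
        # unused_data; treat them as an error instead of ignoring them.
        if tmpdecomp.unused_data:
            raise ValueError("decompressor did not consume all data")
        tmphash = self.hashobj.copy()
        tmphash.update(data)
        return tmphash.hexdigest()


def hash_stream(blob):
    """Hash the zlib-decompressed content of blob with SHA-256."""
    h = DecompressedHashSketch(zlib.decompressobj(), hashlib.sha256())
    h.update(blob)
    return h.hexdigest()


clean = zlib.compress(b"hello")
print(hash_stream(clean) == hashlib.sha256(b"hello").hexdigest())

try:
    hash_stream(clean + b"junk")
except ValueError as exc:
    print("rejected:", exc)
```

The sketch uses a zlib stream rather than gzip only because `zlib.decompressobj()` already exposes `copy()`, `flush()` and `unused_data` in the standard library.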