Python 处理gzip的网页压缩时,出现incorrect header check的解决方法

You have this error: zlib.error: Error -3 while decompressing: incorrect header check Which is most likely because you are trying to check headers that are not there, e.g. your...

You have this error:

zlib.error: Error -3 while decompressing: incorrect header check

Which is most likely because you are trying to check headers that are not there, e.g. your data follows RFC 1951 (deflate compressed format) rather than RFC 1950 (zlib compressed format) or RFC 1952 (gzip compressed format).

choosing windowBits

But zlib can decompress all those formats:

  • to (de-)compress deflate format, use wbits = -zlib.MAX_WBITS
  • to (de-)compress zlib format, use wbits = zlib.MAX_WBITS
  • to (de-)compress gzip format, use wbits = zlib.MAX_WBITS | 16

See documentation in http://www.zlib.net/manual.html#Advanced (section inflateInit2)

examples

test data:

>>> deflate_compress = zlib.compressobj(9, zlib.DEFLATED, -zlib.MAX_WBITS)
>>> zlib_compress = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS)
>>> gzip_compress = zlib.compressobj(9, zlib.DEFLATED, zlib.MAX_WBITS | 16)
>>> 
>>> text = '''test'''
>>> deflate_data = deflate_compress.compress(text) + deflate_compress.flush()
>>> zlib_data = zlib_compress.compress(text) + zlib_compress.flush()
>>> gzip_data = gzip_compress.compress(text) + gzip_compress.flush()
>>> 

obvious test for zlib:

>>> zlib.decompress(zlib_data)
'test'

test for deflate:

>>> zlib.decompress(deflate_data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zlib.error: Error -3 while decompressing data: incorrect header check
>>> zlib.decompress(deflate_data, -zlib.MAX_WBITS)
'test'

test for gzip:

>>> zlib.decompress(gzip_data)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
zlib.error: Error -3 while decompressing data: incorrect header check
>>> zlib.decompress(gzip_data, zlib.MAX_WBITS|16)
'test'

the data is also compatible with gzip module:

>>> import gzip
>>> import StringIO
>>> fio = StringIO.StringIO(gzip_data)
>>> f = gzip.GzipFile(fileobj=fio)
>>> f.read()
'test'
>>> f.close()

automatic header detection (zlib or gzip)

adding 32 to windowBits will trigger header detection

>>> zlib.decompress(gzip_data, zlib.MAX_WBITS|32)
'test'
>>> zlib.decompress(zlib_data, zlib.MAX_WBITS|32)
'test'

using gzip instead

or you can ignore zlib and use gzip module directly; but please remember that under the hood, gzip uses zlib.

fh = gzip.open('abc.gz', 'rb')
cdata = fh.read()
fh.close()
  • 发表于 2017-10-30 10:39
  • 阅读 ( 45 )

你可能感兴趣的文章

相关问题

0 条评论

请先 登录 后评论
石天
石天

437 篇文章

作家榜 »

  1. shitian 662 文章
  2. 石天 437 文章
  3. 每天惠23 33 文章
  4. 小A 29 文章