May 2016 Zip corrupt

Accounts_Monthly_Data-May2016.zip (866 MB)

Hi,

I’ve tried three times with different zip software to unzip the May 2016 zip file, but each time it throws invalid file errors. Can you check and replace it if applicable?

Thanks

Adrian

Hello Adrian,

I see no-one has come back to you as yet but just to let you know I am having the same problem with recent daily and monthly files.

Sometimes they work, but most of the time they fail, so it is interesting to see that I am not the only one affected.

I tried downloading the same file tonight and saw the same problem: 134,106 errors out of 149,981 files, mostly ‘Header errors’. That’s a lot… :-/

For what it’s worth, here is my checksum analysis (obtained from the ‘CRC SHA’ context sub-menu option when right-clicking the file in Windows; use the asterisk option to see all checksums at once):

Name: Accounts_Monthly_Data-May2016.zip
Size: 908661744 bytes (866 MiB)
CRC32: EC3AE94B
CRC64: 0F7F7937328FE2F7
SHA256: C4B54BD7E93752C8B68CAD033FFD50EC2C37A63914AEA313A854E21B284CA977
SHA1: E0D09540437175D011716AC0EF49CE7FEFCD69D0
BLAKE2sp: 90398C5FB7F41192E8097D531643C7777A5AA39D88C75E7082FF3133CCFBAEB4

I appreciate that CH do not support these files as per their guidance notes. However, it would be helpful of them to publish their own checksums on the page - ideally SHA256 - for quick verification after a large download. This would save a lot of time in unzipping the files and only then finding errors!
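For anyone without the ‘CRC SHA’ menu, a SHA256 value like the ones above can also be computed with a few lines of Python. This is only a minimal sketch using the standard hashlib module; the function name `sha256_of_file` and the chunk size are my own choices, not anything CH provides.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 of a file as an uppercase hex string."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so a multi-gigabyte zip never has to fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    # Uppercase to match the style of the checksums quoted in this thread.
    return digest.hexdigest().upper()
```

Calling `sha256_of_file("Accounts_Monthly_Data-May2016.zip")` on two separate downloads and comparing the results shows whether they are byte-for-byte identical. Without official checksums it cannot show which copy, if either, matches what CH published.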

There are Zip repair tools out there (both free and paid) but in my experience they are not great.

If anyone else reading this can say if they have had issues (or not) please do chip in.

Thanks all,
Martin

Martin,

Nice to hear I am not alone!

I found a solution earlier: I FTP’d the file to my webserver and unzipped it via SSH using the command-line unzip. That worked OK, I think because it does not perform any of the tests that Windows products such as WinZip do.

Hope that helps

Adrian

Many thanks Adrian for the tip!

I had not considered using FTP > SSH > Linux, so I checked it out myself today.

That said, Windows users can skip the first two steps with Cygwin, a collection of open source command-line tools that gives a Linux-like environment.

You just need to download it, remember to install the zip and unzip packages (they are not installed by default) and set your path variables; the installer guides you through that.

I ran a test on the most recent monthly file, Accounts_Monthly_Data-October2020.zip. For the record, here are my checksums for that file:

Name: Accounts_Monthly_Data-October2020.zip
Size: 1876773957 bytes (1789 MiB)
CRC32: 6327FA2A
CRC64: 81C2868A41B4C7C4
SHA256: 4EA3415041556D5B30F2F1B2B03BE4250D9E053E6C3F481915DBB55EA9362956
SHA1: B70BBECA25B62974889CBA23FA5DE5D3BD7C8547
BLAKE2sp: B85B7911DF78B622359990E25D2A3995531B40DC71A1CA24CD2C9DC3E11C9CA0

Of all the files processed in my regular Zip software, there were 227,536 errors out of 233,107 files, leaving only 5,571 to work with.

I then processed the same file in Cygwin, first repairing it with the following command:

zip -FF Accounts_Monthly_Data-October2020.zip --out recovered_Accounts_Monthly_Data-October2020.zip

There is also a ‘-F’ switch, apparently intended for zip files with only minor damage.

The ‘-FF’ repair returned 231,182 files, a considerable improvement but still 1,925 short. Unzipping the recovered zip with the command below returned 229,759 files, a further shortfall of 1,423.

unzip recovered_Accounts_Monthly_Data-October2020.zip

I am still investigating the reasons for the shortfall on the recovered file, but apparently others have tried rerunning this command to create a second recovered zip file and extracting from that!

The clear message is that this is a considerable improvement but it is still not 100% perfect. Perhaps mitigation by downloading both the daily and monthly files and cross-referencing them against those files already processed is the way to go, I don’t know - that would depend on the use case.
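To see which files make up a shortfall like the one described above, the names listed in a zip can be compared with what actually landed on disk. A minimal sketch, assuming the recovered zip's directory listing is readable by Python's standard zipfile module and that the archive was extracted into a single folder; `missing_entries` and both of its arguments are names made up for illustration.

```python
import os
import zipfile


def missing_entries(zip_path, extracted_dir):
    """List archive members that have no matching file under extracted_dir."""
    with zipfile.ZipFile(zip_path) as archive:
        # Skip directory entries; only real files are expected on disk.
        expected = [
            info.filename for info in archive.infolist() if not info.is_dir()
        ]
    return sorted(
        name
        for name in expected
        if not os.path.isfile(os.path.join(extracted_dir, name))
    )
```

The resulting list could then be checked against the daily files, if that suits the use case. Note that this only covers files the zip still lists; anything lost during the repair itself would not appear.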

And if anyone knows of any ‘better’ Linux commands to rival Cygwin unzip (or other installations for that matter), I would be interested to hear about them.

Thanks again.

I have just downloaded the May 2016 file to my Mac and ran ‘unzip’ on it: no errors and 149,981 files.
I also tried ‘unzip -t’, which produced:
No errors detected in compressed data of Accounts_Monthly_Data-May2016.zip

I will try Accounts_Monthly_Data-October2020.zip once it has downloaded…
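For anyone who wants to script the same kind of check as ‘unzip -t’, here is a rough Python equivalent. It is a sketch only, built on the standard zipfile module; the function name `check_zip` is my own, and its entry count may include folder entries, so it need not match the file count another tool reports.

```python
import zipfile


def check_zip(path):
    """Return (entry_count, first_bad_name); first_bad_name is None if all pass."""
    with zipfile.ZipFile(path) as archive:
        entry_count = len(archive.namelist())
        # testzip() reads every member and checks its CRC and file header,
        # returning the name of the first bad entry, or None if none is found.
        first_bad = archive.testzip()
    return entry_count, first_bad
```

If the archive is too damaged for its directory to be read at all, `zipfile.ZipFile` raises `zipfile.BadZipFile` instead of returning a result.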

Hello Mark,

Thank you for taking the time to look at this.

Just to give a quick progress update on this end. We have a result!

Disabling the anti-virus (Avast) and anti-malware (Malwarebytes) did the trick after downloading with curl. Tests report all OK, so the appropriate exceptions will be added. One of them seems to have started taking exception to these files only recently; it certainly worked in the past.

To update on my earlier posts regarding checksums, the ‘CRC SHA’ context menu is installed with 7-Zip so it will not be available generally - I was under the impression that it was.

Many thanks,
Martin

Thanks for the update.
So just to double check, there is no issue and you can both unzip the files?

Confirmed that both files are successful, thanks for the assistance.


Yes thanks, I found a workaround using my webserver and its Linux unzip feature.
