Data corruption inside May 2023 and Feb 2023 bulk accounts data zip files

Hi @MArkWilliams /CH team

Two of the bulk accounts data zip files appear to be corrupted: February 2023 and May 2023 (links below).

https://download.companieshouse.gov.uk/Accounts_Monthly_Data-February2023.zip
The above gives a ‘Headers Error - There are some data after the end of the payload data’
https://download.companieshouse.gov.uk/Accounts_Monthly_Data-May2023.zip
The above also has multiple errors. Two examples (many more files have this issue):
Data error : Prod224_0013_05287944_20221231.html
Headers Error : Prod224_0013_05290029_20221130.html

To confirm the error is genuine, I downloaded them multiple times and tried to unzip them with multiple tools, including PowerShell and 7zip, with no success.
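One way to rule out a transfer problem when downloading repeatedly is to hash each copy and compare. The following is a minimal Python sketch, not something that was run as part of this report; the function names are made up for illustration, and the paths passed in would be whatever local filenames the downloads were saved under.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the SHA-256 hex digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (end of file).
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def downloads_match(paths):
    """Return True when every file in `paths` has the same SHA-256 digest."""
    return len({sha256_of_file(p) for p in paths}) == 1
```

If every copy hashes the same, the bytes are arriving consistently, so any extraction error is more likely to come from the archive as published (or from the tool reading it) than from the download.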

I have uploaded an example screenshot for one of the files showing a failing 7zip extraction.

Could these be uploaded again?
[Screenshot 2023-09-11 072028]
Thank you

Thank you for taking the time to report this. I will get this investigated.

I have been able to unzip this successfully on Linux and on a Mac.
Presumably you are able to unzip other files.
I will arrange to have these files re-zipped and re-uploaded to the site.
I will notify you when that is complete.

Hi Mark, thank you for your swift response, as always.
Yes. While I was able to unzip the files using 7zip on a Windows machine, 122 files were reported as corrupt in one archive and 1 file in the other (not the exact messages, but along those lines). I don’t have access to the machine right now to say which was which, but both archives had issues.
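For anyone wanting a per-archive count like this without 7zip, here is a minimal Python sketch using the standard `zipfile` module. It is an illustration only: `find_bad_members` is a made-up name, and the check covers decompression and CRC failures, so it may not flag the same conditions that 7zip reports as a "Headers Error".

```python
import zipfile
import zlib


def find_bad_members(zip_path):
    """Return a list of (member_name, error_text) for members that fail to read."""
    bad = []
    with zipfile.ZipFile(zip_path) as archive:
        for info in archive.infolist():
            try:
                with archive.open(info) as member:
                    # Reading to the end forces decompression and the CRC-32 check.
                    while member.read(1024 * 1024):
                        pass
            except (zipfile.BadZipFile, zlib.error, EOFError) as exc:
                bad.append((info.filename, str(exc)))
    return bad
```

Calling `len(find_bad_members(path))` on a locally saved copy of each archive gives the number of members that could not be read cleanly. The standard library's `ZipFile.testzip()` does a similar check but returns only the first bad member, which is why the sketch loops over every entry instead.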
Thank you
:slight_smile:

Thanks. I have unzipped and re-zipped Accounts_Monthly_Data-May2023.zip and Accounts_Monthly_Data-February2023.zip.
Please could you check whether you are able to unzip these versions?


Both files unzipped OK this time. Thank you @MArkWilliams, you are a star.
