Using Google Apps, I can query the filing history of a company and get the document metadata, but when I then try to fetch the document I get “Exception: Request failed for https://document-api.companieshouse.gov.uk returned code 500”.
The document is available if I go through the CH website, and I have tried this on several different companies and filings.
I don’t know the answer but there are some quirks to fetching the document data:
a) When you request the content you’ll get at least one redirect. According to the UrlFetchApp documentation the default is to follow redirects. However it’s possible that this is causing you problems because …
b) You are currently redirected to AWS where the documents are hosted. AWS have their own authorization scheme and so if the GScript is passing the Companies House authentication header to them that will cause issues.
c) … so you may have to tell GScript NOT to follow redirects, read the Location header of the redirect response to find out where it’s going, and then make a separate request to that URL without the Companies House HTTP Basic Authorization header.
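Steps (a) to (c) could look roughly like this in Apps Script. Treat it as an untested sketch rather than a known fix: `fetchDocumentContent` is a made-up name, `contentUrl` is whatever content link you got from the document metadata, and I'm assuming the usual Companies House Basic auth (API key as the username, empty password). The `fetchFn` and `encodeFn` parameters are only there so the logic can be exercised with stubs outside Apps Script; leave them out in a real script.

```javascript
// Hypothetical helper: fetch document content without forwarding the
// Companies House Authorization header to the redirect target.
function fetchDocumentContent(contentUrl, apiKey, fetchFn, encodeFn) {
  // Default to the real Apps Script services when no stubs are supplied.
  fetchFn = fetchFn || function (url, params) { return UrlFetchApp.fetch(url, params); };
  encodeFn = encodeFn || function (s) { return Utilities.base64Encode(s); };

  // Step 1: authenticated request, with redirect following switched off.
  var first = fetchFn(contentUrl, {
    method: 'get',
    headers: {
      Authorization: 'Basic ' + encodeFn(apiKey + ':'),
      Accept: 'application/pdf'
    },
    followRedirects: false,
    muteHttpExceptions: true // return 3xx/4xx/5xx responses instead of throwing
  });

  var code = first.getResponseCode();
  if (code < 300 || code >= 400) {
    return first; // not a redirect, so hand back whatever we got
  }

  // Step 2: find the redirect target (header names are matched case-insensitively).
  var headers = first.getHeaders();
  var location = null;
  Object.keys(headers).forEach(function (name) {
    if (name.toLowerCase() === 'location') {
      location = headers[name];
    }
  });
  if (!location) {
    throw new Error('Redirect response had no Location header');
  }

  // Step 3: separate request to the redirect target, with no Authorization header.
  return fetchFn(location, { method: 'get', muteHttpExceptions: true });
}
```

In a script you would call `fetchDocumentContent(url, key)` and then check `getResponseCode()` before using `getBlob()` on the result.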
It does appear that you’re correct in requesting application/pdf, since most documents are PDFs. Even so, it’s always a good idea to request the document metadata first (if you’re not already doing so), check which formats are available, and use whatever link to the data it provides.
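A rough sketch of that metadata check, in Apps Script-style JavaScript. The field names are assumptions on my part: I'm assuming the parsed metadata JSON has a `resources` object keyed by MIME type and a `links.document` URL, so compare against a real response before relying on it. `pickDocumentFormat` is just an illustrative name.

```javascript
// Hypothetical helper: choose a content type and content URL from parsed
// document metadata. Assumes metadata.resources is keyed by MIME type and
// metadata.links.document holds the content link.
function pickDocumentFormat(metadata, preferred) {
  var resources = (metadata && metadata.resources) || {};
  var available = Object.keys(resources);
  var wanted = preferred || ['application/pdf'];

  // Take the first preferred type that the metadata actually lists.
  var chosen = null;
  for (var i = 0; i < wanted.length; i++) {
    if (available.indexOf(wanted[i]) !== -1) {
      chosen = wanted[i];
      break;
    }
  }
  // Otherwise fall back to whatever format is on offer, if any.
  if (chosen === null && available.length > 0) {
    chosen = available[0];
  }

  var links = (metadata && metadata.links) || {};
  return {
    contentType: chosen,
    contentUrl: links.document || null,
    available: available
  };
}
```

You would pass it `JSON.parse(response.getContentText())` from the metadata request and use the returned `contentType` as the Accept header for the content request.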
For more detail see e.g. my reply posted in this thread (they also got a 500):
Did you try my example in the thread? (It’s old but probably still works)
I would definitely try to step through this manually using e.g. curl, so you can see exactly what URL and authentication data you are supplying and exactly what responses you get back. Once you can do this manually, it’s a matter of working out what’s happening differently in your language or library.
Also it looks like there’s a way of examining what Google apps is actually sending which may help - see here:
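Independently of that link, one built-in option I know of is `UrlFetchApp.getRequest(url, params)`, which returns the request that would be made without actually sending it. A minimal sketch follows; `describeRequest` is a made-up name and the `fetchApp` parameter exists only so it can be tried with a stub outside Apps Script.

```javascript
// Hypothetical helper: show what UrlFetchApp would send for these params,
// without issuing the request.
function describeRequest(url, params, fetchApp) {
  fetchApp = fetchApp || UrlFetchApp; // real service unless a stub is passed
  var request = fetchApp.getRequest(url, params);
  return JSON.stringify(request, null, 2);
}

// In Apps Script, something like:
//   Logger.log(describeRequest(contentUrl, { headers: { Accept: 'application/pdf' } }));
```

Bear in mind this only shows the first request, not what gets sent after a redirect is followed.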
Hi @voracityemail, sorry for the delay in replying. I have been diligently going through your example. I did not know curl, so I had to learn it first, which I think was both good and bad.