Getting an HTTP/1.1 500 Internal Server Error when requesting documents through the API

Hi,

I’m having a problem getting documents using the API.

I can get the filing history just fine using a url like this…

 https://api.companieshouse.gov.uk/company/06539163/filing-history

That will give me a response like this…

{"items":[
{"category":"incorporation","date":"2008-03-19","description":"incorporation-company","links":{"self":"/company/06539163/filing-history/MjAwMTY2MjMyOWFkaXF6a2N4","document_metadata":"https://frontend-doc-api.company-information.service.gov.uk/document/XJ5LJZHPw969QN4B3www_BHeVLETO1rg0CmSzVkDKiA"},"type":"NEWINC","pages":12,"barcode":"XODHPY46","transaction_id":"MjAwMTY2MjMyOWFkaXF6a2N4"}
],
"start_index":0,"items_per_page":25,"total_count":4,"filing_history_status":"filing-history-available"}

However, when I try to get a “document_metadata” link I always get a 500 Internal Server Error…

https://frontend-doc-api.company-information.service.gov.uk/document/XJ5LJZHPw969QN4B3www_BHeVLETO1rg0CmSzVkDKiA

HTTP/1.1 500 Internal Server Error
Date=Fri, 17 May 2024 13:23:30 GMT
Server=nginx/1.22.1
Content-Length=0
Connection=keep-alive

I get this for any document for any company. I’ve tried putting in an “Accept” header but that doesn’t make a difference.

Would you be able to help me figure out what it is I am doing wrong?

Thanks

dan

Short:

That Header - does that calculate anything for you or is it sending literally what you’ve put in there?

If the latter, then you will need to ensure you’ve got the correct format - which is:

Basic {credentials}

… where {credentials} is your API key, plus a single colon (":"), all base-64 encoded. (See guide e.g. here).

(The whole thing will end up as:

Authorization: Basic {credentials}

)
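
As a rough illustration of that format, here is a minimal Python sketch that builds the header value by hand. The function name `basic_auth_header` and the placeholder key are made up for this example; only the layout (the word Basic, then the base-64 of the API key plus a colon) comes from the description above.

```python
import base64


def basic_auth_header(api_key: str) -> str:
    """Return the value for an HTTP Basic Authorization header.

    The API key acts as the username and the password is left empty,
    which is why a single colon is appended before encoding.
    """
    credentials = f"{api_key}:".encode("ascii")
    return "Basic " + base64.b64encode(credentials).decode("ascii")


# Placeholder key, not a real one.
print("Authorization:", basic_auth_header("MY_API_KEY_HERE"))
```

This is the same encoding that curl's `-u KEY:` option does for you, so if your tool has a built-in "Basic auth" setting it is usually easier to use that.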

If that isn’t the issue… read on…

Other than a bad header or a bad URL (e.g. parameters), I’m not sure why you’d get a 500 here. Is something else getting added to the URI / headers that Companies House isn’t recognising?

This works fine when I try it - either by following the given link or just using the server specified in the docs. I’m using curl (which I find helpful as you can set it to be very verbose / understand exactly what’s sent / received - useful for debugging):

curl -u MY_API_KEY_HERE: https://frontend-doc-api.company-information.service.gov.uk/document/XJ5LJZHPw969QN4B3www_BHeVLETO1rg0CmSzVkDKiA
{"company_number":"06539163","barcode":"N0653916.3K","significant_date":null,"significant_date_type":"","category":"new-companies","pages":12,"filename":"","created_at":"2015-04-09T03:42:44.186559621Z","etag":"","links":{"self":"https://document-api.companieshouse.gov.uk/document/XJ5LJZHPw969QN4B3www_BHeVLETO1rg0CmSzVkDKiA","document":"https://document-api.companieshouse.gov.uk/document/XJ5LJZHPw969QN4B3www_BHeVLETO1rg0CmSzVkDKiA/content"},"resources":{"application/pdf":{"content_length":341641}}}

curl -u MY_API_KEY_HERE:  https://document-api.company-information.service.gov.uk/document/XJ5LJZHPw969QN4B3www_BHeVLETO1rg0CmSzVkDKiA
{"company_number":"06539163","barcode":"N0653916.3K","significant_date":null,"significant_date_type":"","category":"new-companies","pages":12,"filename":"","created_at":"2015-04-09T03:42:44.186559621Z","etag":"","links":{"self":"https://document-api.company-information.service.gov.uk/document/XJ5LJZHPw969QN4B3www_BHeVLETO1rg0CmSzVkDKiA","document":"https://document-api.company-information.service.gov.uk/document/XJ5LJZHPw969QN4B3www_BHeVLETO1rg0CmSzVkDKiA/content"},"resources":{"application/pdf":{"content_length":341641}}}

(You don’t need an “Accept” header here - you’re just getting the metadata and that is always JSON. That header is however used when getting the document itself - currently that should be the same URI as the metadata link but with /content on the end. I think it’s recommended that you follow the link you get back from the document metadata call however, in case CH redirect this / change something!)
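
To illustrate that two-step flow, here is a minimal Python sketch, assuming the metadata response has the shape shown in the curl output above. `build_content_request` is a hypothetical helper, and "application/pdf" is simply the one content type listed under "resources" in that output. The code only prepares the request and does not send anything.

```python
import base64
import json
import urllib.request


def build_content_request(metadata: dict, api_key: str,
                          accept: str = "application/pdf") -> urllib.request.Request:
    """Prepare (but do not send) a request for the document itself."""
    # Follow the link from the metadata rather than appending /content by hand.
    url = metadata["links"]["document"]
    token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
    return urllib.request.Request(
        url,
        headers={"Authorization": f"Basic {token}", "Accept": accept},
        method="GET",
    )


# Trimmed copy of the metadata returned by the curl call above.
sample_metadata = json.loads("""
{"links": {"document": "https://document-api.company-information.service.gov.uk/document/XJ5LJZHPw969QN4B3www_BHeVLETO1rg0CmSzVkDKiA/content"},
 "resources": {"application/pdf": {"content_length": 341641}}}
""")

request = build_content_request(sample_metadata, "MY_API_KEY_HERE")
print(request.full_url)
print(request.get_header("Accept"))
```

Passing the prepared request to `urllib.request.urlopen` would actually send it; that step is left out here so the sketch runs without a real key.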

I’m not sure what tool you’re using there - does it allow you to see exactly what you’re sending (including headers)? That is worth doing if you can!

Good luck.

Hi,

Thanks for getting back to me. I’m using SOAP UI just to test the requests. The request to get the filing history works perfectly with the authorisation key as shown below…

The only thing I’m changing then is the url in the Endpoint field which I’m changing to…

https://frontend-doc-api.company-information.service.gov.uk/document/XJ5LJZHPw969QN4B3www_BHeVLETO1rg0CmSzVkDKiA

…and that always brings back the 500 error for any document link.

dan

…and here is the same issue in PowerAutomate (which is where I actually want to use the requests in the end).

I have two HTTP requests configured exactly the same apart from the endpoint (left-hand side of the image).

When the flow runs, the first HTTP request returns the filing history OK. The second HTTP request, to get the document metadata, fails repeatedly with the 500 error.

dan

…don’t know if this helps, the inputs and outputs for the file history request that works are:

(INPUTS)…

{
"uri": "https://api.companieshouse.gov.uk/company/06539163/filing-history",
"method": "GET",
"headers": {
"Authorization": "sanitized"
}
}

(OUTPUTS)…

{
"statusCode": 200,
"headers": {
"Date": "Fri, 17 May 2024 14:40:00 GMT",
"Connection": "keep-alive",
"Access-Control-Allow-Credentials": "true",
"Access-Control-Allow-Headers": "X-RateLimit-Query, origin, content-type, content-length, user-agent, host, accept, authorization",
"Access-Control-Expose-Headers": "X-RateLimit-Window, X-RateLimit-Limit, X-RateLimit-Remain, X-RateLimit-Reset, Location, www-authenticate, cache-control, pragma, content-type, expires, last-modified,Location,www-authenticate,cache-control,pragma,content-type,expires,last-modified",
"Access-Control-Max-Age": "3600",
"Cache-Control": "no-store, must-revalidate, no-cache, post-check=0, pre-check=0",
"Pragma": "no-cache",
"X-Ratelimit-Limit": "600",
"X-Ratelimit-Remain": "599",
"X-Ratelimit-Reset": "1715957100",
"X-Ratelimit-Window": "5m",
"Server": "CompaniesHouse",
"Content-Type": "application/json",
"Content-Length": "1767"
},
"body": {
"items": [
{
"category": "gazette",
"date": "2010-08-03",
"description": "gazette-dissolved-compulsory",
"links": {
"self": "/company/06539163/filing-history/MzAyMDI2MDM0OWFkaXF6a2N4",
"document_metadata": "https://frontend-doc-api.company-information.service.gov.uk/document/gHRkUFGdKerwDd7eFPhnTauHBqh3bgRLzwq5znC8ANk"
},
etc etc etc.

The inputs and outputs for the file metadata request which does not work are…

(INPUTS)…

{
"uri": "https://frontend-doc-api.company-information.service.gov.uk/document/XJ5LJZHPw969QN4B3www_BHeVLETO1rg0CmSzVkDKiA",
"method": "GET",
"headers": {
"Authorization": "sanitized"
}
}

(OUTPUTS)…

{
"statusCode": 500,
"headers": {
"Date": "Fri, 17 May 2024 15:05:21 GMT",
"Server": "nginx/1.22.1",
"Connection": "keep-alive",
"Content-Length": "0"
}
}

dan

Aha - well there’s another oddity of the API!

Looks like CH not only accept the correct / expected http Basic authorization header format BUT also accept one in the format below - but only for the “Public REST API” (e.g. api.companieshouse.gov.uk).

So BOTH these work for api.companieshouse.gov.uk:

Authorization: Basic {credentials - which is Base64(API KEY plus ":") }

Authorization: YOURAPIKEYHERE

However the second form does not work for server document-api.company-information.service.gov.uk.

The existence of the second type of header (i.e. not normal HTTP Basic) is a surprise to me - I’ve not seen that documented anywhere. That is possibly why it doesn’t work for e.g. the documents API…

I suggest trying the “standard” form of this header for both. Or even better getting your tool - whether API explorer or PowerAutomate - to do this for you so you only have to put in the API key, not worry about encoding it yourself. Easier!
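
If you want to check which of the two forms a tool is actually sending, one option is to inspect the header value it produces. The sketch below is a hypothetical Python helper (the name `looks_like_basic_auth` is not from any library) that reports whether a value follows the standard Basic scheme with an API key and an empty password.

```python
import base64
import binascii


def looks_like_basic_auth(header_value: str) -> bool:
    """Return True if the value has the form 'Basic <base64 of "key:">'."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic" or not token:
        return False
    try:
        decoded = base64.b64decode(token.strip(), validate=True)
    except (binascii.Error, ValueError):
        return False
    # API key as username, empty password: a non-empty key ending in a colon.
    return len(decoded) > 1 and decoded.endswith(b":")


print(looks_like_basic_auth("YOURAPIKEYHERE"))   # bare key, non-standard form
print(looks_like_basic_auth("Basic YWJjOg=="))   # base-64 of "abc:"
```

A bare key fails this check, which matches the case that only the Public REST API seemed to tolerate.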

Let us know…

BTW thanks for making the effort to document this thoroughly - without being able to see the start of what you’d got in the Header box I would probably never have suspected this as a cause (nor found out about Companies House’s other Authorization format).

I’d stick to the standard way unless they’ve documented the other one - I’d suspect that might not be stable. Plus you have to do it the “normal” way for documents anyway so just have one method.

Brilliant! That’s fixed it, thank you very much. Nice end to the week!

Have a great weekend

dan