Accessing Accounts data via API

Tara · February 18, 2024, 6:57pm

Hi everyone,

There have been a couple of threads on here querying how one can access accounts data via REST API.

I was wondering whether I need to apply for additional authorisation to do this, i.e. in addition to having a live API key?

The reason I’m asking is that I have successfully extracted accounts information for a selected company, which includes URLs to access the documents (https://frontend-doc-api.company-information.service.gov.uk/document/9hklhWtgRaiagwTMkXlTveMu80jyVd-1HMAeRnfgTvc). However, when I click on the URL I am greeted with a 401 error on the webpage:

"This page isn’t working

If the problem continues, contact the site owner.

HTTP ERROR 401"

I have triple-checked that my API key is correct and I have removed any spaces.
I have triple-checked that the company number exists and that there are indeed accounts available for this business.
This is my Python GET request:

response = requests.get(f"https://api.company-information.service.gov.uk/company/{company_number}/filing-history", auth=(api_key, ''))

Any help is much appreciated, as I have been stuck on this for a while!

mh.hunt · February 21, 2024, 3:31pm

Hi there @Tara

Here are a few thoughts that may help you

If you have not done so already, I believe you need to encode your password in Base64.
I used this example to explore what I believe you are trying to achieve:

From the Filing History endpoint I extracted this

Then, using Postman with the highlighted link, I got this:

Finally, using the highlighted link from step 2, again in Postman with a Base64-encoded password, I got this:

In short, the base HTTPS URL changed between steps 2 and 3, and a Base64-encoded password was required. The header format is quite specific: “Authorization”, then “Basic” followed by the encoded API key.

This link may help: Document API: Fetch a document's metadata

This may not be what you are looking for, but there again it may just help you.

voracityemail · February 21, 2024, 5:54pm

To add to what @mh.hunt has written (which should help):

If you have a live application you should be able to do this (the sandbox / test ones seem to have issues - ignore them).

It is slightly unclear to me what you’re doing. From the second part of your post it sounds like you have managed to access the API and return filing history data - is that correct?

I’m not sure what you mean when you write “when I click on the URL” … is that within your own app? As @mh.hunt says, you will need to send the header with the API key encoded per HTTP Basic Authentication, but it seems that Python is handling that for you. So you won’t need to do your own Base64 encoding or anything like that if you are requesting via requests.get.
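For anyone curious what that header actually looks like, here is a minimal sketch using only the standard library. It assumes the usual Basic Authentication arrangement matching the auth=(api_key, '') call above, i.e. the API key as the username and an empty password. The function name basic_auth_header is made up for illustration.

```python
import base64


def basic_auth_header(api_key: str) -> dict:
    """Build the HTTP Basic Authentication header by hand.

    Assumption: the API key is the username and the password is empty,
    so the string that gets Base64-encoded is "<api_key>:".
    """
    credentials = f"{api_key}:".encode("ascii")
    token = base64.b64encode(credentials).decode("ascii")
    return {"Authorization": f"Basic {token}"}
```

A plain browser click on a document URL sends no such header, which would be consistent with the 401 you are seeing. In Postman or another client you would pass the dictionary returned above as the request headers.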

If you are, then perhaps you’re actually going a step further and requesting the actual document data (the same URL with /content at the end). Some people have had problems here, because what actually happens is that you get redirected to the AWS content server (and then sometimes redirected around there). If you are not careful, you may find your system (Python or whatever) is effectively sending your API key there. First, this is a different server, so you don’t want to send them your password! Second, AWS have their own authentication scheme, so if you send them something in a different one their servers will reject it. (Full details are in the post linked at the end of this.)
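One way to keep the API key away from the content server is to follow the redirects yourself and only attach the Authorization header when the host is a Companies House one. The sketch below uses the standard library rather than requests so that each hop is explicit. TRUSTED_SUFFIX, is_trusted, NoRedirect and fetch_content are all names I have made up, and the host suffix is an assumption based on the hostnames mentioned in this thread.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin, urlsplit

# Assumption: hosts under this suffix are Companies House API hosts.
TRUSTED_SUFFIX = ".company-information.service.gov.uk"


def is_trusted(url: str) -> bool:
    """True if the URL's host is one we are happy to send the API key to."""
    host = urlsplit(url).hostname or ""
    return host.endswith(TRUSTED_SUFFIX)


class NoRedirect(urllib.request.HTTPRedirectHandler):
    """Stop urllib following redirects so each hop can be inspected."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        return None  # the 3xx response is then raised as an HTTPError


def fetch_content(content_url: str, auth_headers: dict, max_hops: int = 5) -> bytes:
    """Fetch a document, sending auth_headers only to trusted hosts."""
    opener = urllib.request.build_opener(NoRedirect)
    url = content_url
    for _ in range(max_hops):
        headers = dict(auth_headers) if is_trusted(url) else {}
        request = urllib.request.Request(url, headers=headers)
        try:
            with opener.open(request) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            location = err.headers.get("Location")
            if err.code in (301, 302, 303, 307, 308) and location:
                url = urljoin(url, location)  # next hop, re-checked above
                continue
            raise
    raise RuntimeError("too many redirects")
```

You would call fetch_content with the metadata URL plus /content and your Basic Authentication header. Whether your own HTTP client already drops the header on a cross-host redirect depends on the client and version, so it is worth checking rather than assuming.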

So - back to your Python request.

If you are able to access the Filing History list from Python, then how you’re requesting things is fine and should work for the Document API. Essentially the main Public Data API (at api.company-information.service.gov.uk) and the Document API (at document-api.company-information.service.gov.uk - though you may see e.g. frontend-doc-api.company-information.service.gov.uk in the data responses) operate together, using the same API key and general approach.

The way the system is designed, you conceptually “follow links”, and it sounds like you’re doing that: e.g. you retrieve the link in the Filing History list / object from links.document_metadata and use that to request the metadata (that’s the link you posted). You could in theory skip that step, but getting the metadata allows you to check what data formats (PDF, XBRL etc.) are available (almost all are just PDF, however). It’s also possible Companies House might decide to change the link to the actual file data, so if you get the metadata you should always have the current one.
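As a rough sketch of the “follow links” idea, the two helpers below pull the metadata links out of a filing history response that has already been parsed from JSON, and list the formats named in a metadata response. The field names items, category and resources are my assumptions about the response shape (links.document_metadata is the one mentioned above), and both function names are made up for illustration.

```python
def metadata_links(filing_history: dict, category: str = "accounts") -> list:
    """Collect links.document_metadata for filings in the given category.

    Filings with no document_metadata link are skipped.
    """
    return [
        item["links"]["document_metadata"]
        for item in filing_history.get("items", [])
        if item.get("category") == category
        and "document_metadata" in item.get("links", {})
    ]


def available_formats(metadata: dict) -> list:
    """List the content types a document is offered in, sorted by name.

    Assumption: the metadata has a "resources" mapping keyed by content type.
    """
    return sorted(metadata.get("resources", {}))
```

Each URL returned by metadata_links can then be requested with the same API key to get the metadata, and available_formats tells you whether anything other than PDF is on offer before you ask for the content.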

Further details on the process (with Python example) in the post below:

See also here on interacting with the Document API:

And here - although I did not find things quite the same as the poster who suggested a solution did:

Hope this helps.