A client should follow the links.document_metadata URL in the filing history to get to a document, and should not have to construct URLs on the fly. If we change the ID encoding, clients that construct their own URLs will break.
So, in the short term, take the links.document_metadata URL from the filing history and append /content to the end. This will return the default PDF, but if a PDF is not available, you’ll have to deal with the failure. Calling the document_metadata endpoint first allows you to query which document types are available before you request a particular form (possible types will be PDF, XBRL, iXBRL (coming soon), plus others in the future).
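A minimal Python sketch of the approach described above: read links.document_metadata from a filing history item, append /content, and check the metadata for available types before asking for one. The sample item, the example.invalid URL and the helper names are hypothetical, and the idea that the metadata lists formats under a "resources" key keyed by MIME type is an assumption rather than something stated in this thread.

```python
from typing import Optional


def document_urls(filing_item: dict) -> Optional[tuple]:
    """Return (metadata_url, content_url) for a filing history item, or None.

    The metadata URL is used exactly as given; only "/content" is appended.
    """
    metadata_url = filing_item.get("links", {}).get("document_metadata")
    if not metadata_url:
        return None  # the link may be absent for this item
    return metadata_url, metadata_url.rstrip("/") + "/content"


def pick_type(metadata: dict, preferred: list) -> Optional[str]:
    """Return the first preferred MIME type that the metadata lists.

    Assumption: available formats sit under "resources", keyed by MIME type.
    """
    available = metadata.get("resources", {})
    for mime in preferred:
        if mime in available:
            return mime
    return None


# Hypothetical data, for illustration only.
item = {"links": {"document_metadata": "https://example.invalid/document/abc123"}}
meta = {"resources": {"application/pdf": {"content_length": 1024}}}

urls = document_urls(item)
chosen = pick_type(meta, ["application/xhtml+xml", "application/pdf"])
```

Checking the metadata first means a missing format shows up as a None here instead of as a failed content request.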
If a direct link to the default document is required, then we will look at adding this to the filing history links: {} sub-document. Client-side (re)manipulation of URLs is not desirable.
Thanks for the information. Quick question: what do you mean by “your_encoded_key_goes_here”? Do you mean our API key, or does it need to be encoded in some way?
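As far as I understand it, the API uses HTTP Basic authentication with the API key as the username and an empty password, so "encoded" would mean the Base64 form of the key followed by a colon. A small sketch under that assumption (the key is a made-up placeholder and basic_auth_header is a hypothetical helper, not part of any official client):

```python
import base64


def basic_auth_header(api_key: str) -> dict:
    """Build an Authorization header, assuming key-as-username and an empty password."""
    # Basic auth encodes "username:password"; the password part is left empty here.
    token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
    return {"Authorization": f"Basic {token}"}


headers = basic_auth_header("my_api_key")  # placeholder, not a real key
```

Most HTTP libraries will do this encoding for you if you pass the key as the username with a blank password.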
I’ve been trying to connect to the get document API in Java and failing with a 500 Internal Server Error. Surprisingly, I could connect to the search companies and Filing History APIs successfully. Please see the analysis below.
Is it possible to retrieve a document with company information / the officers list through the API?
I found information about the document API on the website (https://developer.companieshouse.gov.uk/document/docs/), however the document I received doesn’t have sufficient information (please refer to the ‘response.pdf’ file in the attachment above).
I would be very grateful for any advice, and for an answer on whether it is possible to receive other documents through the API.
For your example company look back in time and find the first example “with updates” - e.g. “Confirmation statement made on 30 August 2018 with updates”. You might expect to find a note of any updates during the reporting period. However rather unhelpfully it says “all information…either has been delivered or is being delivered”. So you won’t even see updates over the period gathered in one document. You need to use the API.
The system now works like this (caveat - this is just my understanding and I’m neither a lawyer nor a member of Companies House):
The Companies House dataset - e.g. what you see via the API / the CH Beta website - is the main “record”.
Companies must inform Companies House if there are certain changes to their status.
This is normally done on separate filings through the reporting period.
If there are changes during the reporting period the company submits a confirmation statement “with updates”. (There are a couple of changes which can be reported on the confirmation statement I think).
If there are no changes, the company submits a confirmation statement like the one you found.
The overall issue here is that “ease of access to the data via the API” is not the same as “it’s easy to answer (some question) about a company / officer”. For many questions you need to understand something of the law as it relates to companies, reporting requirements and the role of Companies House. Given that this is a free service it may not be so surprising that the onus is on users to work out some of this information for themselves, including dealing with issues in the data itself…
I’ve been trying unsuccessfully to get the document metadata in order to pull up various accounts.
I can successfully run other get requests in python with API authorization.
I can get the filing history for a company and find in there the relevant “transaction_id” and also the links.document_metadata (the metadata link, I assume).
However, I am not able to access the metadata link!
Yet if I put the transaction_id into a link for an online search at the Companies House website, it shows the PDF without a problem (I can’t paste a third URL in here as a new user, but it’s on the main search page and starts find-and-update.company-information.service.gov.uk).
Any advice welcome to get the metadata.
I would ideally like to get to the XBRL data, rather than PDFs.
See this article I put together on accessing XBRL documents through the document API: Filings Document API | CH Guide.
I believe your issue may be attempting to use the transaction ID of the filing in the URL for accessing metadata or content. As you can see in the following example, there is a different ID for the filing history endpoint and the metadata/content:
I can get a table of the metadata with links. It seems a bit strange that I can’t yet find a company with XBRL data. Only PDFs, it seems. I thought XBRL was a mandatory requirement to be filed with Companies House?
The document metadata should show you if / when a filing is available in XBRL. You can get that from the AWS servers by setting the appropriate MIME type (I think it’s usually "application/xhtml+xml") in the HTTP “Accept” header that you send when making the document request. There is some info on that and an example filing with XBRL data in the following post:
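To make the “Accept” header point concrete, here is a rough Python sketch that only prepares the request and does not send it. The URL and the Authorization value are placeholders, build_content_request is a hypothetical helper, and "application/xhtml+xml" is the MIME type guessed at above, so check the document metadata for the exact value.

```python
import urllib.request


def build_content_request(content_url: str, auth_header: str,
                          mime_type: str = "application/xhtml+xml"):
    """Prepare (but do not send) a GET request for one document format."""
    return urllib.request.Request(
        content_url,
        headers={
            "Accept": mime_type,           # the format being asked for
            "Authorization": auth_header,  # whatever auth value you already use
        },
        method="GET",
    )


req = build_content_request(
    "https://example.invalid/document/abc123/content",  # hypothetical URL
    "Basic <encoded key>",                               # placeholder value
)
```

How you send it is up to you. Since the content appears to be served from a different host, it is worth checking how your HTTP client handles redirects and which headers it forwards.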