Fetch Document 500ing

I get the document link by calling ‘https://api.companieshouse.gov.uk/company/Compa/filing-history?category=accounts’ with curl, passing my valid token in the header as “Authorization: {token}”.

In the list I can see the link to the document; I bet it is something like https://frontend-doc-api.company-information.service.gov.uk/document/{DocumentID}.

I then tried the following options:

curl https://frontend-doc-api.company-information.service.gov.uk/document/{DocumentID} -H "Authorization: <token>" -k -vvvvvvvvvvvvv

curl https://document-api.company-information.service.gov.uk/document/{DocumentID} -H "Authorization: <token>" -k -vvvvvvvvvvvvv

both responding with 500.

curl https://api.companieshouse.gov.uk/document/{DocumentID} -H "Authorization: <token>" -k -vvvvvvvvvvvvv

responds with 404

Please help

Hi. I haven’t retrieved documents yet, but are you base64-encoding the token and sending it as Basic authentication? I’m only asking because you just mentioned “Authorization: {token}”.

The same token works for

https://api.companieshouse.gov.uk/company/{CompanyID}/filing-history?category=accounts

Welcome!

I’m assuming here you just want help getting the document metadata - i.e. not downloading the filing document itself. For information on the whole process, please see other answers on this forum.

I’m not sure what you mean by the “token” part. All the endpoints simply take the API key (technically, HTTP Basic authentication where the username is the API key and the password is blank). Since you’re already calling curl to get the filing history for a company, you’ll have an API key, yes?

I find the simplest way to do this with curl is using the -u argument to pass the username and password. That means you can simply put your unmodified API key from CH, followed by a colon (the password part is blank), then a space and the rest of your curl statement.
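
For what it’s worth, here is a minimal sketch of what -u does under the hood. MY_API_KEY is a placeholder (not a real key) and basic_auth_header is a hypothetical helper name I made up for illustration. It only builds the header value locally and makes no request; the two curl forms in the trailing comments should be equivalent:

```shell
#!/bin/sh
# Placeholder, not a real key: substitute your own API key from CH.
API_KEY="MY_API_KEY"

# basic_auth_header is a hypothetical helper name for this sketch.
# HTTP Basic auth sends base64("username:password"); here the username is
# the API key and the password is blank, so the string to encode is "KEY:".
basic_auth_header() {
    printf 'Authorization: Basic %s' "$(printf '%s:' "$1" | base64 | tr -d '\n')"
}

AUTH_HEADER="$(basic_auth_header "$API_KEY")"
echo "$AUTH_HEADER"

# These two requests should then be equivalent (not run here):
#   curl -u "$API_KEY:" "https://api.company-information.service.gov.uk/company/04253605/filing-history"
#   curl -H "$AUTH_HEADER" "https://api.company-information.service.gov.uk/company/04253605/filing-history"
```

Note that the colon is part of what gets encoded. Encoding the bare key without the trailing colon is a common slip when building the header by hand.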

Rolling your own Authorization header is not difficult, but it seems to cause people a lot of confusion, so I’d go with the simplest method first. (The only reason to work directly with HTTP headers in the CH API is if you haven’t got a library to do the basics for you; I’d recommend using one to save labor. Without one you may need to manage the rate-limiting system and/or manually follow the redirects in the document API to download document content without passing CH authorisation to Amazon. Those topics are covered elsewhere, e.g. on this forum, and probably won’t concern you if you’re manually accessing the system using curl.)
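
On the redirect point above, here is a rough sketch of following the document API redirect by hand, so that the CH credentials only ever go to CH. The helper names extract_location and fetch_document are hypothetical, the Accept: application/pdf header is my assumption about what the content endpoint wants, and the document ID argument stands for whatever ID your metadata link contains. Nothing is requested until you call fetch_document yourself:

```shell
#!/bin/sh
# extract_location (hypothetical helper): print the first Location header
# value from a raw HTTP header dump read on stdin.
extract_location() {
    tr -d '\r' | awk 'tolower($1) == "location:" { print $2; exit }'
}

# fetch_document (hypothetical helper): API_KEY DOCUMENT_ID OUTPUT_FILE
fetch_document() {
    # Step 1: ask CH for the content WITH the API key, but do not follow
    # the redirect (-L is deliberately omitted); dump only the headers.
    headers="$(curl -s -o /dev/null -D - -u "$1:" \
        -H 'Accept: application/pdf' \
        "https://document-api.company-information.service.gov.uk/document/$2/content")"
    # Step 2: fetch the redirect target WITHOUT any CH credentials.
    location="$(printf '%s\n' "$headers" | extract_location)"
    [ -n "$location" ] || return 1
    curl -s -o "$3" "$location"
}
```

Usage would be something like fetch_document MY_API_KEY {DocumentID} accounts.pdf. As far as I know, curl -L combined with -u does not forward the credentials to a different host unless you pass --location-trusted, so the by-hand version mostly matters when you set the Authorization header yourself or use an HTTP client that forwards it.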

Another aside - I’d try to avoid using the curl -k argument - I’d certainly avoid this in a production environment. If curl can’t verify certificates this is really a prompt to update your certificate store.

…I bet it is something like…

No need to guess, it’s all (reasonably) well documented here (the links below work, not sure why the previews don’t) - first filing history. Note you can get either a single item or list of all items, I’m just listing the single item endpoint below:
https://developer-specs.company-information.service.gov.uk/companies-house-public-data-api/reference/filing-history/filinghistoryitem-resource
…and the format of a single entry:
https://developer-specs.company-information.service.gov.uk/companies-house-public-data-api/resources/filinghistoryitem?v=latest
…and how to request document metadata information:
https://developer-specs.company-information.service.gov.uk/document-api/reference/document-metadata/fetch-a-documents-metadata
…and what it returns:
https://developer-specs.company-information.service.gov.uk/document-api/resources/documentmetadata?v=latest

Using an example company here (04253605) and my API key I got the following to work fine just now. I’ve left out your -k and -v options (disable certificate check and verbose) for clarity:

Examining a filing history entry:
curl -uMY_API_KEY: "https://api.company-information.service.gov.uk/company/04253605/filing-history/MzI4MDk0OTUwM2FkaXF6a2N4"

{
    "action_date": "2020-02-29",
    "category": "accounts",
    "date": "2020-10-19",
    "description": "accounts-with-accounts-type-dormant",
    "description_values": {
        "made_up_date": "2020-02-29"
    },
    "links": {
        "self": "/company/04253605/filing-history/MzI4MDk0OTUwM2FkaXF6a2N4",
        "document_metadata": "https://frontend-doc-api.company-information.service.gov.uk/document/bFofQLDBGWrTBK02r1myESnrGJi0Uf7v1OTfQE7cbvc"
    },
    "paper_filed": true,
    "type": "AA",
    "pages": 3,
    "barcode": "A9FKE6FU",
    "transaction_id": "MzI4MDk0OTUwM2FkaXF6a2N4"
}

Using the document metadata link to get document info:
curl -uMY_API_KEY_HERE: "https://frontend-doc-api.company-information.service.gov.uk/document/bFofQLDBGWrTBK02r1myESnrGJi0Uf7v1OTfQE7cbvc"

{
    "company_number": "04253605",
    "barcode": "A9FKE6FU",
    "significant_date": "2020-02-29T00:00:00Z",
    "significant_date_type": "made-up-date",
    "category": "accounts",
    "pages": 3,
    "created_at": "2020-10-21T04:46:11.10712573Z",
    "etag": "",
    "links": {
        "self": "https://document-api.companieshouse.gov.uk/document/bFofQLDBGWrTBK02r1myESnrGJi0Uf7v1OTfQE7cbvc",
        "document": "https://document-api.companieshouse.gov.uk/document/bFofQLDBGWrTBK02r1myESnrGJi0Uf7v1OTfQE7cbvc/content"
    },
    "resources": {
        "application/pdf": {
            "content_length": 45593
        }
    }
}

Using the alternative form of the document API endpoint (instead of “frontend…”):
curl -uMY_API_KEY_HERE: "https://document-api.company-information.service.gov.uk/document/bFofQLDBGWrTBK02r1myESnrGJi0Uf7v1OTfQE7cbvc"

{
    "company_number": "04253605",
    "barcode": "A9FKE6FU",
    "significant_date": "2020-02-29T00:00:00Z",
    "significant_date_type": "made-up-date",
    "category": "accounts",
    "pages": 3,
    "created_at": "2020-10-21T04:46:11.10712573Z",
    "etag": "",
    "links": {
        "self": "https://document-api.companieshouse.gov.uk/document/bFofQLDBGWrTBK02r1myESnrGJi0Uf7v1OTfQE7cbvc",
        "document": "https://document-api.companieshouse.gov.uk/document/bFofQLDBGWrTBK02r1myESnrGJi0Uf7v1OTfQE7cbvc/content"
    },
    "resources": {
        "application/pdf": {
            "content_length": 45593
        }
    }
}

Of course you can combine this with other things in curl e.g. curl -I to get the headers etc.

Thanks very much, curl with -u works. I will figure out how to send the same request in Postman. Thanks a lot for your help!

So my problem now is getting the document content: I get exactly the same response as in https://forum.aws.chdev.org/t/how-to-download-a-document-from-companieshouse-api-through-postman/1809

In the end I solved it this way (thanks both for your help):

curl https://document-api.company-information.service.gov.uk/document/{DocumentID} -H "Authorization: Basic {base64 encoded token}" -k -vvvvvvvvvvvvv

curl https://document-api.company-information.service.gov.uk/document/{DocumentID}/content -H "Authorization: Basic {base64 encoded token}" -k -vvvvvvvvvvvvv

Thanks!