Stream API Design

I have a slightly different use case for the streaming service: rather than keeping a dataset up to date, I am using it to notify users that a certain type of change has happened, so they can decide whether they want to act on it.

Where I struggle is that the companies stream appears to return a changed event with no fields when a document has been uploaded. On its own that is zero help: something changed, but we don’t know what. If accounts become overdue, it does emit a change and names the field (there is no document for that, as nothing has been filed).

I then thought that Companies House will, by definition, have a document for any change that happens, so the companies endpoint together with the filings endpoint should work perfectly. Watching with curl, I saw a confirmation statement on the filings stream and a change on the companies stream, so that held true. But after letting them run, I saw a load of dissolution filings which did not appear on the companies feed (very important not to miss).

Also, how do these feeds join together? The obvious key is the company number, yet the filing stream doesn’t have a dedicated field for it; you have to extract it from the URI. I don’t think I need to consume the other streams, as any change to them would require a filing, so can I derive from a filing that an officer has changed, for instance? Is there a list of Meta Data types, so descriptions of filings for instance.

I am clearly missing fundamentals in how the designer intended me to use this.

I don’t yet use the streaming API myself, but the company profile stream resource object appears to have (as most of the stream objects do) an event member:

    "event": {
        "fields_changed": [
            "string"
        ],
        "published_at": "date-time",
        "timepoint": "integer",
        "type": "string"
    },

However, if this is like the rest of the API, there is no guarantee that this is consistently returned or contains anything useful. I can imagine that their code helpfully triggers an update on this stream (“something happened for this company”) but, because none of the fields in the company profile itself changed (e.g. a filing or officer change, as you point out), that member could be empty…
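To make that concrete, here is a minimal Python sketch of reading the event member defensively. It assumes each stream message has already been parsed from JSON into a dict shaped like the fragment above; the function names are my own and not part of the API.

```python
from typing import Any


def changed_fields(message: dict[str, Any]) -> list[str]:
    """Return the reported changed fields, or [] if none were reported."""
    # Treat a missing or null "event" / "fields_changed" the same as an empty list.
    event = message.get("event") or {}
    fields = event.get("fields_changed") or []
    return [f for f in fields if isinstance(f, str)]


def describe_change(message: dict[str, Any]) -> str:
    """Produce a short notification text for one stream message."""
    fields = changed_fields(message)
    if fields:
        return "fields changed: " + ", ".join(fields)
    # Something fired on the stream, but we were not told what changed.
    return "changed, but no fields were reported"
```

With this, a message that has no fields_changed at all is still reported to the user, just with less detail, rather than being dropped or raising an error.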

I think you’ve answered your own question: the key field (as for most of Companies House) is the company number. (If you’re not sure about that, search the forum for the full story on all the variations. It should always be 8 characters, as returned in Companies House URIs, but there are some details e.g. here.) It appears that you can indeed extract this from most streaming resources. I’ve only looked at the ones you mentioned:

  • Filing history - the links.self member - in the standard API this takes the form e.g. “/company/SC046419/filing-history/MDE3NzkwMzY0N2FkaXF6a2N4” (where the second item is the company number)

  • Officers - also has a links.self member - if analogous to the main API will be like “/company/OC307601/officers”

  • Insolvencies - not sure about this. If there is a links.charge member then that will help you but not all have this if I recall.
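As a rough illustration of pulling the company number out of links.self, here is a small Python sketch. It assumes the URI starts with /company/ followed by an 8-character company number, as in the two examples above; the function names are hypothetical, and where exactly the resource object sits inside a stream message is something you would need to check against real output.

```python
import re
from typing import Any, Optional

# Assumption: the company number is the 8-character segment right after /company/.
_COMPANY_IN_URI = re.compile(r"^/company/([A-Za-z0-9]{8})(?:/|$)")


def company_number_from_uri(uri: str) -> Optional[str]:
    """Extract the company number from a URI like /company/SC046419/..."""
    match = _COMPANY_IN_URI.match(uri)
    return match.group(1).upper() if match else None


def company_number_from_resource(resource: dict[str, Any]) -> Optional[str]:
    """Look up links.self on a resource object and extract the company number."""
    links = resource.get("links") or {}
    self_uri = links.get("self")
    if not isinstance(self_uri, str):
        return None  # no usable links.self, so the caller needs another route
    return company_number_from_uri(self_uri)
```

Returning None rather than raising keeps the consumer running when a resource turns up without the link you expected.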

Is there a list of Meta Data types, so descriptions of filings for instance.

For working out what a filing will change, the category field in the Filing History stream resource is probably what you want. It can be:

accounts, address, annual-return, capital, change-of-name, incorporation, liquidation, miscellaneous, mortgage, officers, resolution

Which filings affect which fields?
The following is speculation but I’d look to see if the following are affected by different types of filings:

Company Profile - when the following categories of filings are made I think certain fields should change:
accounts (the accounts member)
address (registered_office_address member)
annual-return (no longer the annual_return member - these will be confirmation statements so update the confirmation_statement member)
change-of-name (company_name and previous_company_names)
incorporation (starts the whole thing off, all the company type / status fields)
liquidation (has_been_liquidated, has_insolvency_history - I think these are “sticky” - see more about these here)
mortgage (has_charges)

The “wild cards” are the following - more later on these:
miscellaneous
resolution

Officers list - when the following categories of filings are made I think certain fields should change:

officers
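If you wanted to encode that guesswork, it could be a simple lookup table. The Python sketch below is only as reliable as the speculation above: the member names come from the list in this post, except has_charges for mortgage filings, which is my assumption, and the function name is made up.

```python
# Speculative mapping from filing category to company profile members.
# "incorporation" is left out because it creates the whole profile.
PROFILE_MEMBERS_BY_CATEGORY: dict[str, list[str]] = {
    "accounts": ["accounts"],
    "address": ["registered_office_address"],
    "annual-return": ["confirmation_statement"],
    "change-of-name": ["company_name", "previous_company_names"],
    "liquidation": ["has_been_liquidated", "has_insolvency_history"],
    "mortgage": ["has_charges"],  # assumption, not confirmed by this post
}

# Categories whose effect cannot be guessed from the category alone.
WILD_CARDS = {"miscellaneous", "resolution"}


def likely_affected(category: str) -> list[str]:
    """Guess what a filing of this category touches; [] means 'inspect the filing'."""
    if category == "officers":
        return ["officers list"]
    return PROFILE_MEMBERS_BY_CATEGORY.get(category, [])
```

An empty result covers both the wild cards and any category not in the table, so the caller falls back to looking at the filing itself.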

About: resolutions and miscellaneous
For the resolution filings you get an additional member in the filing object: resolutions. This functions as a kind of “sub-filing”.

There’s also an associated_filings member which acts as a link to related ones. Example - if there was a resolution to change the company name I think you might see a link to an additional change-of-name filing.

If you wanted more detail on a particular filing you’ve two options - modern or traditional.

The description member (which can appear in the resolutions and associated_filings members too) contains a string which details what type of filing this is - see the “enum constants” for a list of these.

The type member gives the actual “form type”. You can find these codes with the collection of Companies House forms below. Note that this is not done for the benefit of users of this API. You’d need to wade through this yourself. However if you were stuck on the “meaning” or “where certain things come from” this could be (occasionally) useful.
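For the “modern” route, a sketch like the following would gather every description attached to a filing, including those nested under resolutions and associated_filings. It assumes the filing has been parsed into a dict and that the nested members are lists of objects; the function name is hypothetical.

```python
from typing import Any


def filing_descriptions(filing: dict[str, Any]) -> list[tuple[str, str]]:
    """Return (where_found, description) pairs for a filing and its nested items."""
    found: list[tuple[str, str]] = []
    top = filing.get("description")
    if isinstance(top, str):
        found.append(("filing", top))
    # Assumption: both members, when present, are lists of objects.
    for member in ("resolutions", "associated_filings"):
        for item in filing.get(member) or []:
            if isinstance(item, dict) and isinstance(item.get("description"), str):
                found.append((member, item["description"]))
    return found
```

Each description string could then be looked up in the “enum constants” to get readable text for a notification.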

Finally, as always, I wouldn’t assume anything. The documentation has improved and the system “works”, but even assuming the code had no inconsistencies, this is a massive and very old public dataset and (by fiat and indeed law) has almost no “validation”. So all bets are off. I wouldn’t even assume that if one stream fires an alert that you’ll immediately get associated data on the others.
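If you do consume more than one stream, one way to live with that is to remember what you have seen per company number for a while, rather than expecting the streams to fire together. A rough Python sketch follows; the class name, the stream labels and the 300-second window are all arbitrary choices of mine, and timestamps are passed in as plain seconds so the logic is easy to test.

```python
from collections import defaultdict


class CompanyEventBuffer:
    """Remember which streams reported a company recently (in-memory only)."""

    def __init__(self, window_seconds: float = 300.0) -> None:
        self.window = window_seconds
        # company number -> {stream label: time last seen, in seconds}
        self._seen: dict[str, dict[str, float]] = defaultdict(dict)

    def record(self, company_number: str, stream: str, at: float) -> None:
        """Note that `stream` reported `company_number` at time `at`."""
        self._seen[company_number][stream] = at

    def streams_seen(self, company_number: str, now: float) -> set[str]:
        """Return the streams that reported this company within the window."""
        seen = self._seen.get(company_number, {})
        return {s for s, t in seen.items() if now - t <= self.window}
```

For example, after recording a filings event you could hold the notification briefly, check whether the companies stream has also reported that company number, and send what you have once the window runs out either way.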

You asked about “how you are supposed to use it”. It may be helpful to consider this API as one originally built for internal use, to power some very specific tools, which has since been improved so the public can use it. This is certainly true of the main API. Essentially that was for “specific searches to quickly find a narrow range of data”. It wasn’t designed for the uses which people then made of it, e.g. allowing their own customers to search for anything company-related, large scale research over the entire dataset etc.

Enjoy!