Missing date_of_creation

The documentation seems to indicate that date_of_creation is not an optional field in the API response.
However, I have found many responses where this field isn’t present, e.g. this result:

Regardless of whether or not this field is tagged as optional in the documentation, I was wondering if I can retrieve companies that do not have a date_of_creation using the advanced search API, similarly to how you can search for companies created between two dates?

Welcome to the wide world of entries in Companies House data! As well as the more “usual” companies you’ll also find some other types of entities. Many of these do not adhere to the “standard specification” exactly - or rather you may find things vary here. For starters, the example you have is of type charitable-incorporated-organisation (company numbers start with “CE”). These are charities, and this is an example of “data from elsewhere” that Companies House provides. If you look one of these up (via website or API) you’ll see some indicators that these are “registered elsewhere” and/or that only “partial data is available”.

(If you query the Company Profile endpoint for one of these you currently appear to get a date_of_creation, but it will be an empty string - so the Company Profile API docs, if not the Advanced Search ones, are “correct” about there being a field, just not about the contents…).

More detail on this below. Before that - to answer your second question - this depends on exactly what you want to find out. I don’t know that you can directly query “no date of creation” via Advanced Search. However, you can do so indirectly by listing all the company types which don’t have one! That’s why I asked what you wanted to find out. This doesn’t guarantee that you will find every entry without a date, or that every item returned will lack one. However, if your purpose was just to investigate this set further then requesting something like the following should do it:

https://api.company-information.service.gov.uk/advanced-search/companies?company_type=assurance-company&company_type=charitable-incorporated-organisation&company_type=industrial-and-provident-society&company_type=investment-company-with-variable-capital&company_type=icvc-securities&company_type=icvc-warrant&company_type=icvc-umbrella&company_type=registered-society-non-jurisdictional&company_type=royal-charter&company_type=scottish-charitable-incorporated-organisation&company_type=unregistered-company

(Note that there’s no shortcut for ICVC companies; you have to list each type. The website allows you to group these!)

(You can do this via the website also)
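The query above can also be assembled programmatically rather than typed by hand. This is only a minimal sketch: the names BASE_URL, NO_DATE_TYPES and build_search_url are made up for illustration, and it only builds the URL. Sending the request (including API authentication) is not shown.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.company-information.service.gov.uk/advanced-search/companies"

# Company types copied from the example URL above (illustrative constant name).
NO_DATE_TYPES = [
    "assurance-company",
    "charitable-incorporated-organisation",
    "industrial-and-provident-society",
    "investment-company-with-variable-capital",
    "icvc-securities",
    "icvc-warrant",
    "icvc-umbrella",
    "registered-society-non-jurisdictional",
    "royal-charter",
    "scottish-charitable-incorporated-organisation",
    "unregistered-company",
]


def build_search_url(company_types):
    """Repeat the company_type parameter once per requested type."""
    # doseq=True expands the list into company_type=a&company_type=b&...
    query = urlencode({"company_type": company_types}, doseq=True)
    return f"{BASE_URL}?{query}"


url = build_search_url(NO_DATE_TYPES)
print(url)
```

Keeping the types in a list makes it easy to add or drop a type (for example each ICVC variant) without editing a long URL.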

So for different types you’ll find slightly different sets of data may be available. In general, but especially here - it is sensible to treat everything from Companies House with caution and parse / validate. (It’s good practice to do this with data from ANY external source of course!) So I would not depend on fields being returned, or the type of data within them, or particular values - without further checking.
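As an illustration of the “parse / validate” advice, here is a small sketch that reads date_of_creation defensively from a response that has already been decoded into a dictionary. It assumes dates arrive as YYYY-MM-DD strings; the function name is hypothetical, and a missing field, an empty string and a malformed value are all treated the same way.

```python
from datetime import date


def parse_date_of_creation(profile):
    """Return a date, or None if the field is absent, empty or malformed."""
    raw = profile.get("date_of_creation")
    # The field may be missing entirely, or present as an empty string.
    if not isinstance(raw, str) or not raw.strip():
        return None
    try:
        # Assumption: dates are ISO formatted, e.g. "2015-03-01".
        return date.fromisoformat(raw.strip())
    except ValueError:
        return None


print(parse_date_of_creation({"date_of_creation": "2015-03-01"}))
print(parse_date_of_creation({"date_of_creation": ""}))
print(parse_date_of_creation({}))
```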

Below is a (partial, as I’ve not got all data to hand) list of the types which don’t appear to have date_of_creation, with the initial letters of their company numbers. Many of these do not have much other data either, since Companies House isn’t the main registrar. For some you may find a field in the Company Profile resource - partial_data_available - whose value refers to text suggesting where more information is held. (You’ll need to look that text up in the “enum constants” mentioned next.) You can see a list of the type constants in the Companies House “enum constants” collection - the main “constants” file.

Note also that some entry types which might be expected to have less data, e.g. Protected Cell companies and Registered Overseas Entities, do seem to have date_of_creation.

Charities (note - do not depend on every charity being listed here either - refer to their respective registers!)
CE - charitable-incorporated-organisation
CS - scottish-charitable-incorporated-organisation

Bodies regulated by the Financial Conduct Authority:

AC - assurance-company
SA - assurance-company (Scotland)
IP - industrial-and-provident-society
SP - industrial-and-provident-society (Scotland)
NP - industrial-and-provident-society (Northern Ireland)
NO - industrial-and-provident-society
NI - industrial-and-provident-society

Royal Charter Companies and Registered Societies
RC - royal-charter
SR - royal-charter (Scotland)
RS - registered-society-non-jurisdictional

ICVCs - there are several types of these, e.g. investment-company-with-variable-capital / icvc-umbrella / icvc-securities / icvc-warrant
These have starting letters IC, or SI for Scottish examples.

Unregistered Companies
ZC - unregistered-company
SZ - unregistered-company (Scotland)
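
The prefixes listed above can be turned into a rough lookup for flagging company numbers that probably won’t carry a date_of_creation. This is a sketch only: the mapping is copied from the partial list as written, the names are hypothetical, and the result is a hint to be confirmed against the actual response, not a guarantee.

```python
# Prefix -> company type, copied from the (partial) list above.
PREFIX_TO_TYPE = {
    "CE": "charitable-incorporated-organisation",
    "CS": "scottish-charitable-incorporated-organisation",
    "AC": "assurance-company",
    "SA": "assurance-company",
    "IP": "industrial-and-provident-society",
    "SP": "industrial-and-provident-society",
    "NP": "industrial-and-provident-society",
    "NO": "industrial-and-provident-society",
    "NI": "industrial-and-provident-society",
    "RC": "royal-charter",
    "SR": "royal-charter",
    "RS": "registered-society-non-jurisdictional",
    "IC": "icvc (one of several ICVC types)",
    "SI": "icvc (one of several ICVC types)",
    "ZC": "unregistered-company",
    "SZ": "unregistered-company",
}


def likely_lacks_date_of_creation(company_number):
    """Heuristic: True if the two-letter prefix appears in the list above."""
    prefix = company_number.strip().upper()[:2]
    return prefix in PREFIX_TO_TYPE


print(likely_lacks_date_of_creation("CE012345"))
print(likely_lacks_date_of_creation("12345678"))
```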

Thank you very much for your detailed answer! I’ve spent a while experimenting with the company_type field and, as you have identified, certain company types do not have a date_of_creation for any entries in CH. However, as you can see in the table below, other company types such as ltd also lack a date_of_creation for a very small minority of companies.

The reason I was asking in the first place is that I would like to retrieve the details of all companies currently registered on Companies House. I am aware of the free bulk data available; however, this only contains data on live companies (~5.3m), whereas I would like to retrieve the details of all companies (~11.2m).

My original plan was to iterate over date periods, where each date period contains <5k hits. However, my assumption that all entries would have a date_of_creation turned out to be false and even though this method allows the retrieval of 99% of companies on CH, I would ideally like to be able to retrieve all companies.
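For reference, the date-period idea can be sketched as a recursive split: keep halving a date range until each window holds fewer than 5k hits. Here count_hits is a hypothetical callable standing in for an Advanced Search request that returns the number of hits between two creation dates; the fake counter below just pretends there are 100 companies per day so the sketch runs on its own.

```python
from datetime import date, timedelta

HIT_LIMIT = 5000  # each window should contain fewer hits than this


def split_windows(start, end, count_hits, limit=HIT_LIMIT):
    """Yield inclusive (start, end) date windows with fewer than `limit` hits."""
    hits = count_hits(start, end)
    if hits == 0:
        return  # nothing to fetch in this range
    if hits < limit or start == end:
        # A single day cannot be split further, so it is yielded even if over the limit.
        yield (start, end)
        return
    mid = start + (end - start) // 2
    yield from split_windows(start, mid, count_hits, limit)
    yield from split_windows(mid + timedelta(days=1), end, count_hits, limit)


def fake_count(start, end):
    """Stand-in for a real query: pretend 100 companies are created each day."""
    return ((end - start).days + 1) * 100


windows = list(split_windows(date(2020, 1, 1), date(2020, 12, 31), fake_count))
print(len(windows), windows[0], windows[-1])
```

As noted above, this only reaches entries that have a date_of_creation, so anything without one has to be collected some other way.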

Do you have any suggestions on how else it might be possible to achieve this?

| type | hits | hits with date_of_creation |
|---|---|---|
| private-unlimited | 11191 | 11191 |
| ltd | 10538875 | 10538872 |
| plc | 13190 | 13190 |
| old-public-company | 31 | 31 |
| private-limited-guarant-nsc-limited-exemption | 67842 | 67842 |
| limited-partnership | 57568 | 57568 |
| private-limited-guarant-nsc | 233719 | 233718 |
| converted-or-closed | 2687 | 2381 |
| private-unlimited-nsc | 414 | 414 |
| private-limited-shares-section-30-exemption | 23 | 23 |
| protected-cell-company | 6 | 6 |
| assurance-company | 933 | 0 |
| oversea-company | 21281 | 21281 |
| eeig-establishment | 0 | 0 |
| icvc-securities | 10 | 0 |
| icvc-warrant | 0 | 0 |
| icvc-umbrella | 71 | 0 |
| registered-society-non-jurisdictional | 11374 | 0 |
| industrial-and-provident-society | 11046 | 0 |
| northern-ireland | 1 | 1 |
| northern-ireland-other | 0 | 0 |
| llp | 141693 | 141693 |
| royal-charter | 916 | 0 |
| investment-company-with-variable-capital | 946 | 0 |
| unregistered-company | 47 | 0 |
| other | 3 | 3 |
| european-public-limited-liability-company-se | 110 | 110 |
| united-kingdom-societas | 24 | 24 |
| uk-establishment | 19785 | 19785 |
| scottish-partnership | 683 | 683 |
| charitable-incorporated-organisation | 31798 | 0 |
| scottish-charitable-incorporated-organisation | 5929 | 0 |
| further-education-or-sixth-form-college-corporation | 3 | 3 |
| eeig | 21 | 21 |
| ukeig | 265 | 265 |
| registered-overseas-entity | 27845 | 27845 |
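
To pick out the types where only some entries lack a date, the counts above can be compared directly. A small sketch using a handful of rows copied from the table (the variable and function names are made up for illustration):

```python
# (type, hits, hits with date_of_creation) - a subset of rows from the table above.
ROWS = [
    ("ltd", 10538875, 10538872),
    ("plc", 13190, 13190),
    ("private-limited-guarant-nsc", 233719, 233718),
    ("converted-or-closed", 2687, 2381),
    ("assurance-company", 933, 0),
    ("llp", 141693, 141693),
]


def partial_gaps(rows):
    """Return {type: missing count} for types where some, but not all, entries have a date."""
    return {name: hits - dated for name, hits, dated in rows if 0 < dated < hits}


gaps = partial_gaps(ROWS)
print(gaps)
```

Types with zero dated entries (such as assurance-company) are left out here because they can already be requested in full by company_type.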

No, although presumably you could come up with some other recipes which might flush out your missing few, for example polling over SIC codes or company names (there are some odd ones…), or some other feature you can search on.

If you “gotta catch them all” you might want to ask Companies House directly. They may have a contact somewhere (email) or you could create a new thread referencing them and hope someone sees and responds.
Previously they’ve been pretty clear that the purpose of the API is not to facilitate an export of all their data / scraping the system to get all companies. However they do seem to engage with people who approach them with specific use cases, so maybe worth a try?

I forgot - there are both Company Alphabetic Search and the similar Dissolved Company Search. Both take strings to search between so presumably you could step through all likely companies (again - there are some odd names…).

I have not used either myself though so have no further info on these.