Best way to get officers data (2)

Hello, is it possible to get the data shown on this page (BUSINESSWOMAN SPECIAL PRODUCTIONS LIMITED people - Find and update company information - GOV.UK) via the API?

I want to collect it for each company, so I guess I would just change the company ID and call the API once per company.

Thanks :)

The page you’re linking to looks like the company’s Officers list.

That can be obtained via the API, but what is the task / goal you’re trying to achieve?

If you want certain information (what kind are you hoping for?) on Company Officers for all companies, the way to do that is to post a request to Companies House for the bulk data, e.g. on the following thread:

That’s the recommended way, since Companies House specify that the purpose of the API is not to crawl all the companies and download all the data. There are rate limits, and if you exceed them trying to do this they may block you.
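
If you do end up calling the API in a loop, a client-side throttle is one way to stay under the limit. This is only a minimal sketch: the figure of 600 requests per five-minute window is my understanding of the limit, so treat it as an assumption and check the developer guidelines. The class name and its parameters are made up for illustration.

```python
import time
from collections import deque


class SlidingWindowLimiter:
    """Allow at most `max_requests` calls in any `window_seconds` window."""

    def __init__(self, max_requests=600, window_seconds=300.0,
                 clock=time.monotonic, sleep=time.sleep):
        # 600 per 300 s is an ASSUMED limit; confirm it before relying on it.
        self.max_requests = max_requests
        self.window = window_seconds
        self.clock = clock
        self.sleep = sleep
        self.sent = deque()  # timestamps of recent requests, oldest first

    def _purge(self, now):
        # Forget requests that have fallen out of the window.
        while self.sent and now - self.sent[0] >= self.window:
            self.sent.popleft()

    def wait_time(self):
        """Seconds to wait before the next request is allowed."""
        now = self.clock()
        self._purge(now)
        if len(self.sent) < self.max_requests:
            return 0.0
        return self.window - (now - self.sent[0])

    def acquire(self):
        """Block until a request slot is free, then record the request."""
        delay = self.wait_time()
        if delay > 0:
            self.sleep(delay)
        now = self.clock()
        self._purge(now)
        self.sent.append(now)
```

Call `limiter.acquire()` immediately before each API request. The `clock` and `sleep` arguments exist only so the limiter can be exercised without real waiting.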

If you only want data for a few / specific companies then yes, the (public data) API can give you that data (in JSON format). The documentation on that particular part of the API is here:

https://developer-specs.company-information.service.gov.uk/companies-house-public-data-api/reference/officers/list
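
As a rough illustration of what a call to that endpoint might look like, here is a sketch using only the Python standard library. It assumes the officer list lives at `/company/{company_number}/officers` on `api.company-information.service.gov.uk`, that the API key is sent as the HTTP Basic username with an empty password, and that `start_index` / `items_per_page` are the paging parameters. Please verify all of that against the reference above. The function names and the company number in the usage note are placeholders.

```python
import base64
import json
import urllib.parse
import urllib.request

# Assumed base URL of the public data API.
API_BASE = "https://api.company-information.service.gov.uk"


def officers_url(company_number, start_index=0, items_per_page=35):
    """Build the officer-list URL for one company (paging values are assumptions)."""
    query = urllib.parse.urlencode(
        {"start_index": start_index, "items_per_page": items_per_page}
    )
    number = urllib.parse.quote(company_number)
    return f"{API_BASE}/company/{number}/officers?{query}"


def basic_auth_header(api_key):
    """Encode the API key as a Basic-auth username with an empty password."""
    token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
    return f"Basic {token}"


def fetch_officers(company_number, api_key):
    """Fetch the first page of officers as a dict (makes a network call)."""
    request = urllib.request.Request(
        officers_url(company_number),
        headers={"Authorization": basic_auth_header(api_key)},
    )
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)
```

Usage would be something like `fetch_officers("00000006", "your-api-key")`, where both values are placeholders for your own company number and key.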

There’s also an unofficial guide (for officer info) here:

https://chguide.co.uk/bulk-data/officers/

General API usage documentation is here:

https://developer.company-information.service.gov.uk/

Hope this helps.

Hello, thanks for your answer.

I want to update my data automatically from time to time, so the bulk download is not such a good option for me. It’s for my diploma thesis: I’m building a tool similar to https://opencorporates.com/ and I want to visualize the relationships between people through their companies, to see whether there is any connection between them.
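
To make the idea concrete, this is roughly the structure I have in mind: two people are connected if they are officers of the same company. It is a minimal sketch, and it assumes I can get a stable identifier per officer. `officer_id`, `company_number` and the function name are placeholders of mine, not fields from any particular data source.

```python
from collections import defaultdict
from itertools import combinations


def link_officers(appointments):
    """Map each pair of officers to the set of companies they share.

    `appointments` is an iterable of (officer_id, company_number) pairs.
    """
    # Group officers by the company they are appointed to.
    officers_by_company = defaultdict(set)
    for officer_id, company_number in appointments:
        officers_by_company[company_number].add(officer_id)

    # Every pair of officers within one company becomes an edge.
    shared = defaultdict(set)
    for company_number, officers in officers_by_company.items():
        for a, b in combinations(sorted(officers), 2):
            shared[(a, b)].add(company_number)
    return dict(shared)
```

The resulting pairs are the edges of the graph I would like to visualize, labelled with the companies that create each connection.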

If you are only interested in a very small subset of officers, then just using the API is probably the way to go.

If you want the full data but only want to intermittently check for (a few) changes (e.g. not much data changing) then the suggested way seems to be to get a bulk download, then use the API to check for updates to specific companies.

Otherwise, due to data volumes and Companies House rate limits, I’d imagine you’d have to get the bulk data and then continually use the Streaming API (i.e. monitor for any changes and apply them).

I suspect those three options are also listed in rough order of effort / complexity.
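
For the streaming option, the "monitor and apply" part could look something like the sketch below. It assumes the stream delivers one JSON object per line with blank keep-alive lines in between, and that each event carries `resource_uri`, `data` and an `event` object with `type` and `timepoint`. Those field names are from memory, so check them against the Streaming API documentation. Connecting and authenticating are left out, and `apply_stream_lines` is a made-up name.

```python
import json


def apply_stream_lines(lines, store):
    """Apply newline-delimited JSON events to `store` (a dict keyed by resource URI).

    Returns the last timepoint seen, or None if no event carried one.
    """
    last_timepoint = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # blank keep-alive line, nothing to apply
        event = json.loads(line)
        meta = event.get("event", {})          # assumed field name
        key = event.get("resource_uri")        # assumed field name
        if meta.get("type") == "deleted":
            store.pop(key, None)               # drop records that were removed
        else:
            store[key] = event.get("data")     # insert or overwrite on change
        last_timepoint = meta.get("timepoint", last_timepoint)
    return last_timepoint
```

Persisting the returned timepoint would, in principle, let you resume from where you left off after a disconnect rather than starting from scratch.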

Another possible option:
If you ingest the bulk file, there are daily “update” files to the bulk file, which contain only the records that changed on each particular day (about 15,000 daily, IIRC). If you have access to the bulk SFTP server, you can access historical daily update files and apply them in order. This would allow you to process the bulk file once and then, whenever you want to update your database, process only the daily files that have been produced since the main bulk file.
More info: Officers bulk updates file | CH Guide
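
The "apply them in order" step might be sketched like this. It says nothing about the real file layout: the `(date, name)` tuples, the `(action, key, payload)` change records and both function names are hypothetical stand-ins for whatever your own parser produces.

```python
from datetime import date


def pending_updates(daily_files, snapshot_date):
    """Return the daily files dated after the bulk snapshot, oldest first.

    `daily_files` is an iterable of (file_date, file_name) tuples.
    """
    newer = (item for item in daily_files if item[0] > snapshot_date)
    return sorted(newer, key=lambda item: item[0])


def apply_day(records, changes):
    """Apply one day's parsed changes to `records` (a dict keyed by officer record).

    `changes` is an iterable of hypothetical (action, key, payload) tuples.
    """
    for action, key, payload in changes:
        if action == "remove":
            records.pop(key, None)   # record no longer exists
        else:
            records[key] = payload   # new or changed record
```

Applying the files oldest first matters because a later day's file may change or remove a record that an earlier one introduced.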

This sounds good, so they give me access to the SFTP server and then I can access different files?

First I will process all records and then I will only be working with the daily bulk file?

Yes, correct.
The initial file has well over 10 million records. Prod216 is the product you want.
The daily files are much smaller and only contain the officers that were added, removed or changed that day, so processing even a whole week’s or month’s worth of daily files is a light operation in comparison with the initial bulk file.
Once you get access to the SFTP server you can browse all the files on there, organised by product and date. You’ll need to provide Companies House with your SSH public key.