Hello Companies House community,
I’m working on a project that processes both the snapshot (Prod195) and streaming data feeds for company officers (directors/secretaries). My understanding is that the Person_Number field is split such that the first 8 characters identify the person, and the subsequent 4 characters represent an increment of changes.
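For reference, this is roughly how I split the field at the moment. It is only a sketch of my current understanding (8 characters for the person, 4 for the increment, 12 in total), not a confirmed specification, and `split_person_number` and `PersonNumber` are my own names:

```python
from typing import NamedTuple


class PersonNumber(NamedTuple):
    person_id: str  # first 8 characters (assumed to identify the person)
    increment: str  # last 4 characters (assumed to be a change counter)


def split_person_number(raw: str) -> PersonNumber:
    """Split a 12-character Person_Number under the assumed 8 + 4 layout."""
    value = raw.strip()
    if len(value) != 12:
        raise ValueError(f"expected 12 characters, got {len(value)}: {raw!r}")
    return PersonNumber(person_id=value[:8], increment=value[8:])
```

If the 8 + 4 layout is wrong, or the length varies between the snapshot and the stream, that alone could explain some of the mismatches I describe below.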
In practice, though, I’m seeing multiple distinct 8-character “person IDs” for what appears to be the same individual. Matching on name, partial date of birth, nationality, address, and appointment dates (concatenated into a composite key), we often find multiple Person_Number values that almost certainly point to one real person, across different companies and sometimes even within the same company.
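To make the matching concrete, here is a simplified sketch of how I flag suspected duplicates. The dictionary keys (`name`, `dob_year`, `dob_month`, `nationality`, `postcode`, `person_number`) are hypothetical names from my own pipeline, not the actual Prod195 or streaming field names, and the real matching also uses appointment dates and fuller address data:

```python
from collections import defaultdict


def match_key(record: dict) -> tuple:
    """Build a normalised composite key from (hypothetical) officer fields."""
    return (
        " ".join(record["name"].upper().split()),      # collapse whitespace, ignore case
        record["dob_year"],
        record["dob_month"],
        record["nationality"].strip().upper(),
        record["postcode"].replace(" ", "").upper(),   # ignore spacing in postcodes
    )


def suspected_duplicates(records: list[dict]) -> dict[tuple, set[str]]:
    """Return composite keys that map to more than one 8-character person ID."""
    ids_by_key: dict[tuple, set[str]] = defaultdict(set)
    for record in records:
        person_id = record["person_number"][:8]  # assumed 8-character person part
        ids_by_key[match_key(record)].add(person_id)
    return {key: ids for key, ids in ids_by_key.items() if len(ids) > 1}
```

Any key returned by `suspected_duplicates` is a case where the same apparent individual carries two or more different person IDs. This is a heuristic and can produce false positives, which is part of why I’m asking about a more reliable identifier.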
Additionally, the same individual can have multiple records purely because they hold multiple appointment types (e.g., director and secretary). It’s not always clear from the data whether a new record in the streaming updates corresponds to an existing individual with a new appointment type or represents an entirely different officer.
All of this makes it very difficult to maintain a deduplicated internal record of officers or to confidently update snapshot data with the streaming data, because there isn’t a stable, unique identifier that reliably ties the same person’s records together over time. Officers can also be listed with multiple occupations, which makes matching harder still.
My questions:
- Is there an official or recommended way to definitively link a streaming update record to a snapshot record for the same individual, especially if multiple appointment types are involved?
- Are there any best practices or data points we can rely on (beyond name, partial DOB, nationality, etc.) to consolidate records for a single person?
- Has there been any discussion or plan to provide a truly unique officer identifier in the future?
Any guidance or references from Companies House or other developers who’ve handled this would be greatly appreciated.
Thank you!