I know that officers are not truly uniquely identified, and that the same real-life entity may be linked to several different appointments_id values.
The preamble to my questions:
We query the API with a certain appointment_link containing an officer_id, say:
Let's store this JSON object in a variable called r.
r["name"] is COMWOOD SECRETARIAL LIMITED.
r["items"] contains all the actual appointments of COMWOOD SECRETARIAL LIMITED.
The r["items"]["address"] values in the object r are not unique, which is perfectly normal, as the same entity can register at different addresses.
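To make the observation concrete, here is a minimal sketch of how the addresses inside r could be compared. The sample object is invented for illustration, the field names inside "address" (address_line_1, locality, postal_code) are assumptions about the response shape, and address_key is a hypothetical helper. Since r["items"] is a list, the address is read per item.

```python
# Hypothetical sample shaped like the response described above; the
# field names inside "address" are assumptions made for illustration.
r = {
    "name": "COMWOOD SECRETARIAL LIMITED",
    "items": [
        {"address": {"address_line_1": "1 Example Street",
                     "locality": "London", "postal_code": "AB1 2CD"}},
        {"address": {"postal_code": "ab1  2cd", "locality": "london",
                     "address_line_1": "1 Example Street"}},
        {"address": {"address_line_1": "2 Sample Road",
                     "locality": "Leeds", "postal_code": "EF3 4GH"}},
    ],
}


def address_key(address: dict) -> str:
    """Merge an address dictionary into one normalised string."""
    parts = []
    for field in sorted(address):  # fixed order, independent of JSON key order
        # Collapse repeated whitespace and ignore case differences.
        value = " ".join(str(address[field]).split()).upper()
        if value:
            parts.append(value)
    return "|".join(parts)


# r["items"] is a list, so the address is read from each appointment in turn.
distinct_addresses = {address_key(item["address"]) for item in r["items"]}
print(len(r["items"]), "appointments,", len(distinct_addresses), "distinct addresses")
```

Sorting the field names and normalising case and whitespace means two addresses that differ only in key order or formatting collapse to the same string, which is what an equality comparison on a merged address would need.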
My questions are:
1) Just to confirm: even if the same entity has different addresses, since it is linked to the same officer_id, all appointments contained in r["items"] relate to the same officer (officer_id), right?
2) In the case where we have a set built by querying the API with 1000 appointment_links, if 1) is true, then we could apply the following logic¹:
if officer_id is not the same
and name is the same
and address² is the same
overwrite officer_id with the officer_id of the first record where this logic returns true
This would give us some form of deduplication, with the caveat that two different officers with exactly the same name could of course be registered at exactly the same address. That would be a false positive, which I am willing to accept.
¹ This would be done in a simple SQL UPDATE statement.
² address would be the r["items"]["address"] dictionary merged into one string.
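The logic in question 2 could be sketched roughly as follows, using an in-memory SQLite database. The table name appointments, its three columns, and the sample rows are all hypothetical, and "first record" is interpreted here as the row with the lowest rowid. The address column is assumed to already hold the merged address string. The statement rewrites every row in a group, including rows whose officer_id already matches, which gives the same end result as the conditional version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: one row per appointment, address already merged to a string.
conn.execute("CREATE TABLE appointments (officer_id TEXT, name TEXT, address TEXT)")
conn.executemany(
    "INSERT INTO appointments VALUES (?, ?, ?)",
    [
        ("id-A", "COMWOOD SECRETARIAL LIMITED", "1 EXAMPLE STREET|LONDON"),
        ("id-B", "COMWOOD SECRETARIAL LIMITED", "1 EXAMPLE STREET|LONDON"),
        ("id-C", "COMWOOD SECRETARIAL LIMITED", "2 SAMPLE ROAD|LEEDS"),
        ("id-D", "OTHER NAME LIMITED", "1 EXAMPLE STREET|LONDON"),
    ],
)

# Overwrite officer_id with that of the first record (lowest rowid)
# that shares the same name and the same merged address.
conn.execute(
    """
    UPDATE appointments
    SET officer_id = (
        SELECT canon.officer_id
        FROM appointments AS canon
        WHERE canon.name = appointments.name
          AND canon.address = appointments.address
        ORDER BY canon.rowid
        LIMIT 1
    )
    """
)
conn.commit()

deduplicated = conn.execute(
    "SELECT officer_id FROM appointments ORDER BY rowid"
).fetchall()
print(deduplicated)
```

In this sample, id-B is folded into id-A because name and address both match, while id-C (different address) and id-D (different name) keep their own identifiers. Two genuinely different officers with identical name and address would be merged in the same way, which is the accepted false positive.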