Hi -
I know that officers are not truly uniquely identified, and that the same real-life entity might be linked to different appointments_id values.
Preamble to the questions:
We query the API with a certain appointment_link containing an officer_id, say umVXYzu2PmpPehTY22bsCgQdmHA. Let's store the returned JSON object in a variable called r.
r["name"] is COMWOOD SECRETARIAL LIMITED.
r["items"] contains all the actual appointments of COMWOOD SECRETARIAL LIMITED. The addresses in r["items"] (the "address" of each item) are not unique, which can be perfectly normal, as the same entity can register at different addresses.
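To make the shape of r concrete, here is a minimal sketch that counts the distinct addresses across r["items"]. The response below is a trimmed-down, made-up stand-in (the address values are placeholders, and the address field names are assumptions about what the API returns), not real data.

```python
# Hypothetical, trimmed-down response: placeholder addresses, assumed field names.
r = {
    "name": "COMWOOD SECRETARIAL LIMITED",
    "items": [
        {"address": {"address_line_1": "1 Example Street", "postal_code": "AA1 1AA"}},
        {"address": {"address_line_1": "2 Sample Road", "postal_code": "BB2 2BB"}},
        {"address": {"address_line_1": "1 Example Street", "postal_code": "AA1 1AA"}},
    ],
}


def distinct_addresses(response):
    """Return the distinct address dicts found across response["items"]."""
    seen = set()
    unique = []
    for item in response.get("items", []):
        address = item.get("address", {})
        # Dicts are unhashable, so compare addresses via a sorted tuple of their fields.
        key = tuple(sorted(address.items()))
        if key not in seen:
            seen.add(key)
            unique.append(address)
    return unique


print(len(r["items"]), "appointments,", len(distinct_addresses(r)), "distinct addresses")
```

This assumes every address value is a plain string; nested values would need a different comparison key.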
My questions are:
1) Just to confirm: even if the same entity has different addresses, since they are all linked to the same officer_id, all appointments contained in r["items"] relate to the same officer (officer_id), right?
2) In the case where we have a data set built by querying the API with 1000 appointment_links, if 1) is true, then we could apply the following logic¹:
if officer_id is not the same
and name is the same
and address² is the same
overwrite officer_id with the officer_id of the first record where this logic returns true
to get some sort of deduplication, with the caveat that, of course, two different officers with the exact same name could be registered at the exact same address. That would be a false positive, which I am willing to accept.
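A rough sketch of that rule as a single SQL UPDATE, run here through Python's sqlite3 module so it is self-contained. The table name appointments, its columns, and the sample rows are all hypothetical; "first record" is taken to mean the row with the lowest id, which is an assumption about how the data set is ordered.

```python
import sqlite3

# For every row, take the officer_id of the earliest row (lowest id)
# that has the same name and the same merged address string.
UPDATE_SQL = """
UPDATE appointments
SET officer_id = (
    SELECT earliest.officer_id
    FROM appointments AS earliest
    WHERE earliest.name = appointments.name
      AND earliest.address = appointments.address
    ORDER BY earliest.id
    LIMIT 1
)
"""


def dedupe_officer_ids(rows):
    """Load (officer_id, name, address) rows, apply the rule, return the new officer_ids."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE appointments ("
        "id INTEGER PRIMARY KEY, officer_id TEXT, name TEXT, address TEXT)"
    )
    conn.executemany(
        "INSERT INTO appointments (officer_id, name, address) VALUES (?, ?, ?)", rows
    )
    conn.execute(UPDATE_SQL)
    result = [row[0] for row in conn.execute("SELECT officer_id FROM appointments ORDER BY id")]
    conn.close()
    return result


# Made-up rows: id_A and id_B share name and address, so id_B is overwritten.
sample = [
    ("id_A", "EXAMPLE SECRETARIAL LIMITED", "1 example street aa1 1aa"),
    ("id_B", "EXAMPLE SECRETARIAL LIMITED", "1 example street aa1 1aa"),
    ("id_C", "EXAMPLE SECRETARIAL LIMITED", "2 sample road bb2 2bb"),
    ("id_D", "OTHER NOMINEES LIMITED", "1 example street aa1 1aa"),
]
print(dedupe_officer_ids(sample))
```

Note that the rule works row by row: if one original officer_id appears with several addresses, its rows can end up with different replacement ids, so a transitive merge would be a separate step.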
Endnotes:
¹ This would be done in a simple SQL UPDATE statement.
² address would be the r["items"]["address"] dictionary merged into one string.
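For endnote ², one possible way to merge an address dictionary into a single comparable string is sketched below. The field names and their order in ADDRESS_FIELDS are assumptions about the address object, and the lower-casing and whitespace collapsing are optional extras, added only so that trivial formatting differences do not block a match.

```python
# Assumed address fields, in the order they are concatenated (hypothetical list).
ADDRESS_FIELDS = (
    "care_of", "po_box", "premises", "address_line_1", "address_line_2",
    "locality", "region", "postal_code", "country",
)


def address_to_string(address):
    """Merge an address dict into one lower-cased, whitespace-normalized string."""
    # A fixed field order makes the result independent of the dict's key order.
    parts = [str(address[field]) for field in ADDRESS_FIELDS if address.get(field)]
    return " ".join(" ".join(parts).lower().split())


print(address_to_string({"postal_code": "AA1  1AA", "address_line_1": "1 Example Street "}))
```

Fields not listed in ADDRESS_FIELDS are ignored by this sketch, so the tuple would need to match whatever keys actually appear in the data.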