I would like to seamlessly integrate data from the PSC bulk file with data from the PSC streaming API.
The only timestamp information on the PSC bulk file comes in the last line. For example, for the file available a few days ago this was (note I have adjusted formatting slightly for reading clarity):
{"data":
  {
    "kind": "totals#persons-of-significant-control-snapshot",
    "persons_of_significant_control_count": 9810297,
    "statements_count": 628256,
    "exemptions_count": 61,
    "generated_at": "2022-07-03T03:41:30+01:00"
  }
}
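Since the bulk file is large, one way to pull out that last line without reading the whole file is to scan backwards from the end. This is only a rough sketch: `read_last_line` and its `chunk_size` parameter are names I made up for illustration, and it assumes the file is newline-delimited UTF-8 text.

```python
import os


def read_last_line(path, chunk_size=4096):
    """Return the last non-empty line of a text file without reading it all.

    Hypothetical helper: assumes newline-delimited UTF-8 content.
    """
    with open(path, "rb") as f:
        f.seek(0, os.SEEK_END)
        position = f.tell()
        buffer = b""
        # Walk backwards in chunks until a full final line has been collected.
        while position > 0:
            step = min(chunk_size, position)
            position -= step
            f.seek(position)
            buffer = f.read(step) + buffer
            # Ignore trailing line breaks, then look for the newline that
            # precedes the final line.
            stripped = buffer.rstrip(b"\r\n")
            if b"\n" in stripped:
                return stripped.rsplit(b"\n", 1)[1].decode("utf-8")
        # The file has only one line (or is empty).
        return buffer.rstrip(b"\r\n").decode("utf-8")
```

The returned string can then be passed to `json.loads` to get at `data.generated_at`.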
My question is: can I rely on generated_at for comparison with the event.published_at data element from the streaming API? More specifically, is it guaranteed that the bulk file accounts for all changes and deletions made up to the generated_at moment, and nothing after it? If so, I could start consuming the API at the first timepoint whose event.published_at falls immediately after that timestamp and have a seamless integration with no data loss.
I realize there may be a timezone issue given how the two timestamps are formatted, but I should be able to figure that out.
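For concreteness, here is a rough sketch of the comparison I have in mind. The function names are my own, and it rests on assumptions I have not confirmed: that event.published_at is an ISO 8601 string, and that a timestamp with no offset should be read as UTC. Whether the comparison is actually valid is exactly what I am asking.

```python
import json
from datetime import datetime, timezone


def parse_timestamp(text, assume_tz=timezone.utc):
    """Parse an ISO 8601 timestamp and return it as an aware UTC datetime."""
    # Older Python versions reject a trailing "Z", so normalise it first.
    if text.endswith("Z"):
        text = text[:-1] + "+00:00"
    parsed = datetime.fromisoformat(text)
    if parsed.tzinfo is None:
        # ASSUMPTION: a timestamp without an offset is in assume_tz.
        parsed = parsed.replace(tzinfo=assume_tz)
    return parsed.astimezone(timezone.utc)


def snapshot_cutoff(last_line):
    """Extract generated_at from the bulk file's final line, in UTC."""
    record = json.loads(last_line)
    return parse_timestamp(record["data"]["generated_at"])


def is_after_snapshot(event, cutoff):
    """True if a streaming event was published strictly after the cutoff."""
    return parse_timestamp(event["event"]["published_at"]) > cutoff


last_line = '{"data": {"generated_at": "2022-07-03T03:41:30+01:00"}}'
cutoff = snapshot_cutoff(last_line)
print(cutoff.isoformat())  # 2022-07-03T02:41:30+00:00
```

Converting both sides to UTC before comparing avoids depending on how either source writes its offset.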
Thanks