I’ve been building something along these lines. My SQL table has the Timepoint from the change stream as a column. Any record coming from the genuine change stream will have a non-zero Timepoint.
After syncing regularly from the change stream for a while, a new monthly drop eventually appears. I have a separate command that imports from the .csv, and it assumes Timepoint zero for those records. My update code requires an update to have a Timepoint greater than or equal to the Timepoint of the existing record, so snapshot records (Timepoint zero) cannot overwrite change stream records (Timepoint non-zero).
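To make the rule concrete, here is a minimal sketch of that guarded update using Python's built-in sqlite3 module. The table name, column names, and the `apply_record` helper are all made up for illustration; my real schema has many more columns, and any SQL database would work the same way.

```python
import sqlite3

# Hypothetical schema: one row per company, with the Timepoint stored alongside.
SCHEMA = """
CREATE TABLE IF NOT EXISTS company (
    company_number TEXT PRIMARY KEY,
    name           TEXT,
    timepoint      INTEGER NOT NULL DEFAULT 0
)
"""

def apply_record(conn, company_number, name, timepoint=0):
    """Write a record unless the stored row has a newer Timepoint.

    Snapshot imports use the default timepoint of 0.
    Returns True if the row was written, False if it was rejected.
    """
    # Update only when the incoming Timepoint is >= the stored one.
    cur = conn.execute(
        "UPDATE company SET name = ?, timepoint = ? "
        "WHERE company_number = ? AND timepoint <= ?",
        (name, timepoint, company_number, timepoint),
    )
    if cur.rowcount > 0:
        return True
    # Nothing updated: either the row does not exist yet (the insert succeeds)
    # or it exists with a newer Timepoint (the insert is ignored).
    cur = conn.execute(
        "INSERT OR IGNORE INTO company (company_number, name, timepoint) "
        "VALUES (?, ?, ?)",
        (company_number, name, timepoint),
    )
    return cur.rowcount > 0

conn = sqlite3.connect(":memory:")
conn.execute(SCHEMA)
apply_record(conn, "00000001", "Name from stream", timepoint=500)
apply_record(conn, "00000001", "Name from snapshot")  # rejected: 0 < 500
```

Note that a later snapshot can still overwrite an earlier snapshot row, since zero is greater than or equal to zero. That is what I want: a company that has never appeared in the change stream just takes the latest monthly values.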
One problem is that the snapshot CSV and the change stream JSON only partially overlap in their schemas. Some fields appear only in the CSV, others only in the change stream. Because I want to keep the data up to date via the change stream, I’m ignoring anything in the CSV that isn’t also in the change stream.
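Dropping the CSV-only fields can be sketched roughly like this. The column names and the `CSV_TO_STREAM` mapping are placeholders, not the real headers; the point is just that the import keeps an explicit list of fields shared by both sources and discards the rest.

```python
import csv
import io

# Hypothetical mapping from snapshot CSV column names to change stream field names.
# Only fields present in both sources are listed; other CSV columns are dropped.
CSV_TO_STREAM = {
    "CompanyNumber": "company_number",
    "CompanyName": "company_name",
}

def snapshot_records(csv_file):
    """Yield one dict per CSV row, restricted to the shared fields, with Timepoint zero."""
    for row in csv.DictReader(csv_file):
        # row[csv_key] raises KeyError if an expected column is missing,
        # so a change in the snapshot layout fails loudly.
        record = {
            stream_key: row[csv_key]
            for csv_key, stream_key in CSV_TO_STREAM.items()
        }
        record["timepoint"] = 0  # snapshots carry no Timepoint
        yield record

sample = io.StringIO(
    "CompanyNumber,CompanyName,CsvOnlyColumn\n"
    "00000001,EXAMPLE LTD,ignored\n"
)
records = list(snapshot_records(sample))
```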
The schema mismatch appears to be less of a problem with Persons With Significant Control, because that snapshot is in almost the same schema as the change stream. It has no Timepoints either, so I take the same approach of setting them to zero.