Streaming API Snapshot Timepoints

Hi,

I have been trying to get timepoints to work, but without success: I get “Response status code 416 (Requested Range Not Satisfiable).” Has this been implemented yet? If so, what is the correct timepoint format, and how far back does the snapshot go?

Example using Unix epoch time?: https://stream.companieshouse.gov.uk/companies?timepoint=1616067600

Thanks,

Matt

I think this has been a common misunderstanding. Timepoint is not actually related to time. It seems to be a simple numbering of events coming from the stream, so the first ever event in the stream would be 1, the next one 2, and so on.

Just connect to the stream without supplying a timepoint. You can then note the timepoint from the responses and use it to reconnect if your connection breaks. For example, if the latest number I’ve got is 123, I reconnect using timepoint=124.
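
To make the reconnect step concrete, here is a minimal Python sketch. It assumes each non-blank line of the stream is a JSON object carrying the number under event.timepoint, and that blank lines are keep-alives; both are assumptions about the payload, not something confirmed in this thread. The function names are made up for illustration.

```python
import json
from typing import Optional
from urllib.parse import urlencode

STREAM_URL = "https://stream.companieshouse.gov.uk/companies"


def extract_timepoint(line: str) -> Optional[int]:
    """Return the timepoint from one line of stream output.

    Assumption: the event JSON looks like {"event": {"timepoint": 123, ...}, ...}.
    Blank lines (assumed keep-alives) yield None.
    """
    line = line.strip()
    if not line:
        return None
    payload = json.loads(line)
    return payload.get("event", {}).get("timepoint")


def reconnect_url(last_timepoint: Optional[int]) -> str:
    """Build the URL to reconnect with, following the 123 -> 124 example above."""
    if last_timepoint is None:
        # Nothing recorded yet: connect without a timepoint.
        return STREAM_URL
    return STREAM_URL + "?" + urlencode({"timepoint": last_timepoint + 1})
```

With the last seen timepoint of 123, reconnect_url(123) gives https://stream.companieshouse.gov.uk/companies?timepoint=124.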

Having said that, it would be handy to get some clarity on this and on exactly how much buffer there is. Is it a fixed number of messages, e.g. only the last 100,000 events? An idea of the current timepoint would also make connecting easier, especially for streams with few events.


Thanks for your message, that has cleared up most of my questions. Also, reading: Streaming API: helped. I’m with you on that one; I agree that would be useful.

One point I will add, from the post “Stream returning 416 when entering any timepoint”: there you can see @MArkWilliams showing a user how to pre-generate a timepoint. I would like to know how he’s doing this and what format the timepoints use. There must be a way to pre-generate a timepoint outside of the stream using some kind of algorithm or format, so what is the timepoint composed of, and how is it calculated or generated?
This would be useful if my stream application died and I lost the last timepoint from the stream: I could regenerate a timepoint by hand and restart the stream where it crashed.

I’ve just started getting into the streaming system and agree: the timepoint seems to be incremented per event, not based on “seconds since epoch” as I first thought.

Am I right in thinking that we can’t currently get a snapshot up to a specific day? We receive the daily bulk data files but the “format” is different. I don’t mean that this is JSON and bulk is fixed-length fields, but the company types, active status, the way companies link to officers, etc. I’m going to investigate whether we can persuade the system to produce the files, but if not it will be a slow process of hitting the API to pull all the data in over time.

Was reading elsewhere on this forum that snapshots are coming but with no firm date.

You can get a full list of companies as a CSV each month. I’m just taking that and enumerating my way through all the companies slowly. :disappointed:

UPDATE - I managed to figure out that if I save the timepoint from the stream when it crashes, I can restart from that timepoint. As in xiaozhouwang85’s answer, the timepoint is not related to time at all; it’s more like an ID that references an event in the stream, in my case the last one received before I restart. I will add that timepoints expire: they only go back approx. 10 days, going by other posts from @MArkWilliams, for example: Access to historical timepoints for streaming API - #2 by MArkWilliams. But for a quick restart of the stream it works.
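
Since saved timepoints can expire, a restart probably wants a fallback. Below is a rough Python sketch of that decision, assuming that a 416 response to a saved timepoint means it is no longer available (that reading of the 416 is my assumption). The connect_status callable is a hypothetical stand-in for whatever actually opens the stream and reports the HTTP status.

```python
from typing import Callable, Optional, Tuple


def choose_start(
    connect_status: Callable[[Optional[int]], int],
    saved_timepoint: Optional[int],
) -> Tuple[Optional[int], bool]:
    """Decide which timepoint to start from after a crash.

    Returns (timepoint_to_use, gap_possible). A timepoint of None means
    "connect without a timepoint". gap_possible is True when the saved
    timepoint was rejected, so events may have been missed.
    """
    if saved_timepoint is None:
        return None, False
    status = connect_status(saved_timepoint)
    if status == 416:
        # Assumed meaning: the saved timepoint is outside the available range.
        return None, True
    return saved_timepoint, False
```

If gap_possible comes back True, the missed updates would have to be recovered some other way, for example from the bulk files or the regular API mentioned earlier in the thread.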

Happy that you are sorted.
You should always record the timepoint you are up to, to ensure that you do not miss any updates.
The stream will disconnect you every 24 hrs anyway, but good practice is to connect at regular intervals with your last timepoint.
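
One way to record the timepoint you are up to is to write it to a small file after each processed event. This is only a sketch: the file-based storage and the function names are my own choices, and a database row would do just as well. It writes to a temporary file first and then swaps it in, so a crash mid-write should not leave a half-written value.

```python
import os
import tempfile
from typing import Optional


def save_timepoint(path: str, timepoint: int) -> None:
    """Persist the last processed timepoint, replacing the file in one step."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as handle:
        handle.write(str(timepoint))
    # os.replace swaps the new file in place of the old one.
    os.replace(tmp_path, path)


def load_timepoint(path: str) -> Optional[int]:
    """Return the saved timepoint, or None if nothing usable is stored."""
    try:
        with open(path) as handle:
            return int(handle.read().strip())
    except (FileNotFoundError, ValueError):
        return None
```

On startup, load_timepoint tells you where to resume from; a result of None means connecting without a timepoint.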


Thanks @MArkWilliams that’s interesting to know and makes more sense, Cheers.

Hi Mark,
Roughly how long should I expect to wait for results when connecting without a timepoint?
My connection speed should be good enough at 60Mbps.
https://stream.companieshouse.gov.uk/companies?