I'm dabbling in historical data for the first time and have been trying to make sense of the data format. Having downloaded all of the historical golf data and extracted some of it, I can see that the data files are arranged in directories by date and then by event, and that each event folder contains one file per market plus an overall event file. According to the Historical data FAQ section: "File are provided by both marketId and eventId. Event files contain all marketId’s within an event and these are interleaved with each other based on ascending published time order." Can I assume, then, that the event file is simply the union of all of the market files?
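In case it helps frame the question, here is a rough sketch of how I would test the union idea myself. It assumes each extracted file is newline-delimited JSON in which every message carries a published time in `pt` and a list of market changes in `mc`, each with the market id in `id`. Those field names are my assumption about the format, and the function names are made up for illustration.

```python
import json
from collections import Counter


def message_keys(lines):
    """Count (published time, market id) pairs in an iterable of JSON lines."""
    keys = Counter()
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        # Assumed field names: "pt" = published time, "mc" = market changes,
        # each change holding its marketId under "id".
        for change in msg.get("mc", []):
            keys[(msg.get("pt"), change.get("id"))] += 1
    return keys


def event_is_union(event_lines, market_lines_by_file):
    """True if the event file has exactly the messages of all market files."""
    combined = Counter()
    for lines in market_lines_by_file:
        combined += message_keys(lines)
    return message_keys(event_lines) == combined
```

Each argument can be an open file handle on an extracted file, since iterating a text file yields its lines. This only compares which (time, market) pairs occur and how often, not the full message contents, so a match would be suggestive rather than conclusive.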
Also, can anyone confirm whether the runner id for a particular golfer (or any other runner) is the same across markets, events and dates? This holds for the limited number of events that I've looked at, but there is nothing in the specification or elsewhere that I've read to say it is always the case. I should probably use the runner id as given in the market definition, but it would be a nice shortcut if it was always the same for each golfer.
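A sketch of the check I have been doing by hand, written so it could be run over many files. It assumes the market definition appears inside each market change under `marketDefinition`, with a `runners` list whose entries have `id` and `name`. Those keys are my reading of the files and may not match the specification exactly, and the helper names are hypothetical.

```python
import json
from collections import defaultdict


def collect_runners(lines, ids_by_name=None):
    """Map each runner name to the set of runner ids seen for it."""
    if ids_by_name is None:
        ids_by_name = defaultdict(set)
    for line in lines:
        line = line.strip()
        if not line:
            continue
        msg = json.loads(line)
        for change in msg.get("mc", []):
            # Assumed keys: "marketDefinition" -> "runners" -> "id" / "name".
            definition = change.get("marketDefinition") or {}
            for runner in definition.get("runners", []):
                ids_by_name[runner.get("name")].add(runner.get("id"))
    return ids_by_name


def inconsistent_names(ids_by_name):
    """Return only the names that appeared with more than one runner id."""
    return {name: ids for name, ids in ids_by_name.items() if len(ids) > 1}
```

Passing the same `ids_by_name` dictionary into `collect_runners` for every file accumulates results across markets, events and dates. An empty result from `inconsistent_names` would only show consistency for the files checked, not a guarantee.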
Lastly, should I be able to spot a pattern in the dates given by the path of the data files in relation to the dates the events took place? Or is the date given in the file path more dependent on when the data was processed? For example, for a particular golf event that I've been looking at, the data is found in a ../2019/Aug folder and then within 4 separate subfolders: 15, 16, 17 & 18. This corresponds to the standard Thu - Sun golf event schedule, but this particular event took place on the 11th - 14th July, exactly 5 weeks earlier than the dates implied by the data file path. This might suggest that the betting data for each day is processed 5 weeks later, but the line "All data will be made available 5 days after the event settlement time" from the FAQ contradicts this.
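To double-check the gap, here is the date arithmetic as a small sketch. It assumes the event was in July 2019, the same year as the folder, and the helper name is just for illustration.

```python
from datetime import date


def folder_offset_days(event_day, folder_day):
    """Number of days between an event day and the day implied by its folder."""
    return (folder_day - event_day).days


# First day of the event (year assumed to be 2019) against the first subfolder.
offset = folder_offset_days(date(2019, 7, 11), date(2019, 8, 15))
weeks, remainder = divmod(offset, 7)
```

Both dates fall on a Thursday, and the offset comes out as 35 days, i.e. 5 weeks with no remainder.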
I'd be grateful for any info on any of this.

