The purpose of my previous post on the TransXChange timetable data was to make it possible to track National Rail trains in real time. Because the network is made up of so many stations, and because you can’t obtain information for a whole line in one go, the only viable option is to use timetable data. The other limiting factor is the lack of any kind of unique train identifier on the National Rail website (see: http://ojp.nationalrail.co.uk/service/ldbboard/dep/EUS).
Trains going into or out of Waterloo for 16:42 on a weekday
The technique is quite simple and involves making requests to the National Rail website to probe the current positions of trains. We first ask where the trains should be at the current point in time based on the timetable. Then, we probe the running information for the stations just ahead of where the trains should be. In a test on the whole of the Greater London network, only 319 unique station requests, out of a total of 2,575 stations, were required to determine train positions. This number can be reduced even further, as we only need to hit a single station ahead of each train to find out whether it’s on time. The train's position can then always be worked out from the timetable by asking where it should be on its route at the current time minus the number of minutes it is running late.
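The "current time minus the late minutes" lookup can be sketched in a few lines of Python. This is only an illustration, not the code used for the tracker: the function name `scheduled_position`, the `(station, time)` layout of the stops and the linear interpolation between stops are all assumptions, and the station codes and times are made up for the example.

```python
from datetime import datetime, timedelta

def scheduled_position(stops, now, late_minutes=0):
    """Estimate where a train is on its route (hypothetical helper).

    stops: list of (station_code, scheduled_time) in route order.
    Returns (previous_station, next_station, fraction), where fraction is
    how far along the link between the two stations the train should be.
    """
    # A train running N minutes late is where the timetable says it
    # should have been N minutes ago.
    effective = now - timedelta(minutes=late_minutes)
    if effective <= stops[0][1]:
        return stops[0][0], stops[0][0], 0.0  # not yet departed
    for (a, time_a), (b, time_b) in zip(stops, stops[1:]):
        if time_a <= effective <= time_b:
            span = (time_b - time_a).total_seconds()
            done = (effective - time_a).total_seconds()
            return a, b, (done / span if span else 0.0)
    return stops[-1][0], stops[-1][0], 0.0  # already at the end of the route

# Illustrative timetable for one service (times are invented).
stops = [
    ("WAT", datetime(2013, 1, 7, 16, 30)),
    ("VXH", datetime(2013, 1, 7, 16, 34)),
    ("CLJ", datetime(2013, 1, 7, 16, 40)),
]
position = scheduled_position(stops, datetime(2013, 1, 7, 16, 42), late_minutes=5)
```

With the train 5 minutes late at 16:42, the timetable is read at 16:37, which puts the train halfway between the second and third stops.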
Once all the data for the departure boards has been collected, the next stage is to match up trains to departure details for stations based on the passing points extracted from the TransXChange timetable. This links a train to the running service, which tells us all the stopping points and times on its route, along with a unique route code. The route code is used to identify the same train on different departure boards so that we can use the best position information available, namely the departure board of the station that the train is approaching next.
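As a rough sketch of that last step, assume each departure board entry has already been matched to a route code and reduced to a `(route_code, station, expected_time)` tuple. That layout, the function name `next_approaching` and the sample values are all hypothetical. Picking the best board for each train then amounts to keeping the earliest expected time that is still in the future:

```python
from datetime import datetime

def next_approaching(observations, now):
    """For each route code, keep the board entry for the station the
    train will reach next (hypothetical helper).

    observations: iterable of (route_code, station, expected_time).
    Returns {route_code: (station, expected_time)}.
    """
    best = {}
    for route_code, station, expected in observations:
        if expected < now:
            continue  # the train has already passed this station
        current = best.get(route_code)
        if current is None or expected < current[1]:
            best[route_code] = (station, expected)
    return best

# Invented observations: route R1 appears on three boards, R2 on one.
now = datetime(2013, 1, 7, 16, 42)
observations = [
    ("R1", "WAT", datetime(2013, 1, 7, 16, 40)),
    ("R1", "CLJ", datetime(2013, 1, 7, 16, 51)),
    ("R1", "VXH", datetime(2013, 1, 7, 16, 45)),
    ("R2", "CLJ", datetime(2013, 1, 7, 16, 44)),
]
best = next_approaching(observations, now)
```

Here R1 is taken from the VXH board, since that is the next station it will reach, and the later CLJ entry for the same train is ignored.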
An interesting question is what happens if services are disrupted badly enough to make the timetable useless. In this situation, the concept of whether a train is late is meaningless, but we still have a system which can probe the departure boards and match trains using the runtimes between stations. Certain network geometries make it impossible to match trains accurately without timetable data if the destination is shared between two routes. One example is a “Y” section, where two trains with the same destination code merge onto one line. Another complicating factor is the circular route, where trains all start at Waterloo and end up at Waterloo again.
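To make the ambiguity concrete, here is a minimal sketch of runtime-based matching between the boards of two consecutive stations. It assumes each board entry is a `(destination, expected_time)` pair and that the runtime between the two stations is known. The function name `match_by_runtime`, the two minute tolerance and the sample boards are illustrative assumptions, not part of the original system.

```python
from datetime import datetime, timedelta

def match_by_runtime(board_a, board_b, runtime, tolerance=timedelta(minutes=2)):
    """Pair trains on two consecutive departure boards (hypothetical helper).

    board_a, board_b: lists of (destination, expected_time).
    A train on board A is a candidate for an entry on board B when the
    destinations agree and the time gap is close to the runtime.
    Returns a list of (index_in_a, [candidate indices in b]).
    """
    matches = []
    for i, (dest_a, time_a) in enumerate(board_a):
        candidates = [
            j for j, (dest_b, time_b) in enumerate(board_b)
            if dest_b == dest_a and abs((time_b - time_a) - runtime) <= tolerance
        ]
        matches.append((i, candidates))
    return matches

# Invented boards: two Waterloo trains arrive at the junction a minute apart.
board_a = [("Waterloo", datetime(2013, 1, 7, 16, 40))]
board_b = [
    ("Waterloo", datetime(2013, 1, 7, 16, 46)),
    ("Waterloo", datetime(2013, 1, 7, 16, 47)),
    ("Guildford", datetime(2013, 1, 7, 16, 46)),
]
matches = match_by_runtime(board_a, board_b, runtime=timedelta(minutes=6))
```

The single train on the first board comes back with two candidates on the second. This is the “Y” section problem: destination and runtime alone cannot tell the two trains apart.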