Rules are declared in the ValidationRules
class. Below are details of currently implemented rules.
Error ID | Error Title |
---|---|
E001 | Not in POSIX time |
E002 | stop_time_updates not strictly sorted |
E003 | GTFS-rt trip_id does not exist in GTFS data |
E004 | GTFS-rt route_id does not exist in GTFS data |
E006 | Missing required trip field for frequency-based exact_times = 0 |
E009 | GTFS-rt stop_sequence isn't provided for trip that visits same stop_id more than once |
E010 | location_type not 0 in stops.txt (Note that this is implemented but not executed because it's specific to GTFS - see issue #126) |
E011 | GTFS-rt stop_id does not exist in GTFS data |
E012 | Header timestamp should be greater than or equal to all other timestamps |
E013 | Frequency type 0 trip schedule_relationship should be UNSCHEDULED or empty |
E015 | All stop_ids referenced in GTFS-rt TripUpdates and VehiclePositions feeds must have the location_type = 0 |
E016 | trip_ids with schedule_relationship ADDED must not be in GTFS data |
E017 | GTFS-rt content changed but has the same header timestamp |
E018 | GTFS-rt header timestamp decreased between two sequential iterations |
E019 | GTFS-rt frequency type 1 trip start_time must be a multiple of GTFS headway_secs later than GTFS start_time |
E020 | Invalid start_time format |
E021 | Invalid start_date format |
E022 | Sequential stop_time_update times are not increasing |
E023 | trip start_time does not match first GTFS arrival_time |
E024 | trip direction_id does not match GTFS data |
E025 | stop_time_update departure time is before arrival time |
E026 | Invalid vehicle position |
E027 | Invalid vehicle bearing |
E028 | Vehicle position outside agency coverage area |
E029 | Vehicle position far from trip shape |
E030 | GTFS-rt alert trip_id does not belong to GTFS-rt alert route_id in GTFS trips.txt |
E031 | Alert informed_entity.route_id does not match informed_entity.trip.route_id |
E032 | Alert does not have an informed_entity |
E033 | Alert informed_entity does not have any specifiers |
E034 | GTFS-rt agency_id does not exist in GTFS data |
E035 | GTFS-rt trip.trip_id does not belong to GTFS-rt trip.route_id in GTFS trips.txt |
E036 | Sequential stop_time_updates have the same stop_sequence |
E037 | Sequential stop_time_updates have the same stop_id |
E038 | Invalid header.gtfs_realtime_version |
E039 | FULL_DATASET feeds should not include entity.is_deleted |
E040 | stop_time_update doesn't contain stop_id or stop_sequence |
E041 | trip doesn't have any stop_time_updates |
E042 | arrival or departure provided for NO_DATA stop_time_update |
E043 | stop_time_update doesn't have arrival or departure |
E044 | stop_time_update arrival/departure doesn't have delay or time |
E045 | GTFS-rt stop_time_update stop_sequence and stop_id do not match GTFS |
E046 | GTFS-rt stop_time_update without time doesn't have arrival/departure time in GTFS |
E047 | VehiclePosition and TripUpdate ID pairing mismatch |
E048 | header timestamp not populated (GTFS-rt v2.0 and higher) |
E049 | header incrementality not populated (GTFS-rt v2.0 and higher) |
E050 | timestamp is in the future |
E051 | GTFS-rt stop_sequence not found in GTFS data |
E052 | vehicle.id is not unique |
Warning ID | Warning Title |
---|---|
W001 | timestamps not populated |
W002 | vehicle_id not populated |
W003 | ID in one feed missing from the other |
W004 | vehicle speed is unrealistic |
W005 | Missing vehicle_id in trip_update for frequency-based exact_times = 0 |
W006 | trip_update missing trip_id |
W007 | Refresh interval is more than 35 seconds |
W008 | Header timestamp is older than 65 seconds |
W009 | schedule_relationship not populated |
All times and timestamps must be in POSIX time (i.e., number of seconds since January 1st 1970 00:00:00 UTC).
Common mistakes - Accidentally using Java's System.currentTimeMillis()
, which is the number of milliseconds since January 1st 1970 00:00:00 UTC.
Possible solution - Use TimeUnit.MILLISECONDS.toSeconds(System.currentTimeMillis())
to convert from milliseconds to seconds.
header.timestamp
trip_update.timestamp
vehicle_position.timestamp
stop_time_update.arrival/departure.time
alert.active_period.start
and alert.active_period.end
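As a minimal sketch of the conversion described above, the following Java snippet turns a millisecond clock value into POSIX seconds. The class name `PosixTimeExample` and the `looksLikeMillis` heuristic are illustrative assumptions and are not part of the validator.

```java
import java.util.concurrent.TimeUnit;

final class PosixTimeExample {
    /** Converts a millisecond epoch value (e.g., System.currentTimeMillis()) to POSIX seconds. */
    static long toPosixSeconds(long epochMillis) {
        return TimeUnit.MILLISECONDS.toSeconds(epochMillis);
    }

    /**
     * Hypothetical heuristic: a value of 10^11 or more would be roughly the year 5138
     * if read as seconds, so it is far more likely to be a millisecond value.
     */
    static boolean looksLikeMillis(long timestamp) {
        return timestamp >= 100_000_000_000L;
    }
}
```

A producer could call `PosixTimeExample.toPosixSeconds(System.currentTimeMillis())` when populating any of the fields listed above.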
stop_time_updates
for a given trip_id
must be strictly ordered by increasing stop_sequence
- this also means that no stop_sequence
should be repeated.
From Stop Time Updates description:
Updates should be sorted by stop_sequence (or stop_ids in the order they occur in the trip).
From GTFS stop_times.txt
:
The values for stop_sequence must be non-negative integers, and they must increase along the trip.
This validation rule is implemented for both when stop_sequence
is provided in the GTFS-rt feed, and when stop_sequence
is omitted from the GTFS-rt feed.
Common mistakes - Assuming that the GTFS stop_times.txt
file will be grouped by trip_id
and sorted by stop_sequence
- while sorting the data is a good practice, it's not strictly required by the spec.
Possible solution - Group the GTFS stop_times.txt
records by trip_id
and sort by stop_sequence
. Also, make sure that no stop_sequence
is repeated in GTFS stop_times.txt
.
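The ordering requirement can be illustrated with a short check. This is only a sketch: the class and method names are hypothetical, and it assumes the `stop_sequence` values of one trip's `stop_time_updates` have already been collected in feed order.

```java
import java.util.List;

final class StopSequenceOrderExample {
    /** Returns true only if every stop_sequence is greater than the one before it. */
    static boolean isStrictlyIncreasing(List<Integer> stopSequences) {
        for (int i = 1; i < stopSequences.size(); i++) {
            // A repeated value (==) or a decrease (<) both break strict ordering.
            if (stopSequences.get(i) <= stopSequences.get(i - 1)) {
                return false;
            }
        }
        return true;
    }
}
```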
All trip_ids
provided in the GTFS-rt feed must exist in the GTFS data, unless their schedule_relationship
is set to ADDED
.
trip
says:
trip_id
- The trip_id from the GTFS feed that this selector refers to.
schedule_relationship
says:
If a trip is done in accordance with temporary schedule, not reflected in GTFS, then it shouldn't be marked as SCHEDULED, but marked as ADDED...
ADDED
- An extra trip that was added in addition to a running schedule, for example, to replace a broken vehicle or to respond to sudden passenger load.
All route_ids
provided in the GTFS-rt feed must exist in the GTFS data.
trip
says:
route_id
- The route_id from the GTFS that this selector refers to.
Frequency-based exact_times
= 0 trip_updates
must contain trip_id
, start_time
, and start_date
.
If a GTFS trip
contains multiple references to the same stop_id
(i.e., the vehicle visits the same stop_id
more than once in the same trip), then GTFS-rt stop_time_updates
for this trip must include stop_sequence
.
From stop_time_update
:
If the same stop_id is visited more than once in a trip, then stop_sequence should be provided in all StopTimeUpdates for that stop_id on that trip.
(Note that this is implemented but not executed because it's specific to GTFS - see issue #126)
If location_type is used in stops.txt
, all stops referenced in stop_times.txt
must have location_type
of 0
All stop_ids
referenced in GTFS-rt feeds must exist in the GTFS data in stops.txt
.
From stop_time_update
:
stop_id
- Must be the same as in stops.txt in the corresponding GTFS feed.
From position
:
stop_id
- Identifies the current stop. The value must be the same as in stops.txt in the corresponding GTFS feed.
No timestamps for individual entities (TripUpdate, VehiclePosition, Alerts) in the feeds should be greater than the header timestamp.
From header
:
timestamp
- This timestamp identifies the moment when the content of this feed has been created (in server time). In POSIX time (i.e., number of seconds since January 1st 1970 00:00:00 UTC). To avoid time skew between systems producing and consuming realtime information it is strongly advised to derive timestamp from a time server. It is completely acceptable to use Stratum 3 or even lower strata servers since time differences up to a couple of seconds are tolerable.
For frequency-based exact_times=0 trips, schedule_relationship should be UNSCHEDULED
or empty.
From Trip Updates -> Trip Descriptor description:
UNSCHEDULED
- This trip is running and is never associated with a schedule. For example, if there is no schedule and the buses run on a shuttle service.
From trip_update.trip.schedule_relationship
:
UNSCHEDULED
- A trip that is running with no schedule associated to it, for example, if there is no schedule at all.
E015 - All stop_ids
referenced in GTFS-rt TripUpdates and VehiclePositions feeds must have the location_type
= 0
All stop_ids
referenced in GTFS-rt TripUpdates and VehiclePositions feeds must have the location_type
= 0 in GTFS stops.txt
.
Alerts may reference stops with location_type
other than 0 (e.g., for pathway nodes of 2-4).
From GTFS stop_times.txt
:
stop_id
- ...The stop_id is referenced from the stops.txt file. If location_type is used in stops.txt, all stops referenced in stop_times.txt must have location_type of 0.
- GTFS
stop_times.txt
- GTFS
stops.txt
stop_time_update.stop_id
position.stop_id
informed_entity.stop_id
Trips that have a schedule_relationship
of ADDED
must NOT be included in the GTFS data.
From trip.schedule_relationship
:
ADDED
- An extra trip that was added in addition to a running schedule, for example, to replace a broken vehicle or to respond to sudden passenger load.
From Trip Updates -> Trip Descriptor description:
Added - This trip was not scheduled and has been added. For example, to cope with demand, or replace a broken down vehicle.
The GTFS-rt header timestamp
value should always change if the feed contents change - the feed contents must not change without updating the header timestamp
.
Common mistakes - If there are multiple instances of GTFS-realtime feed behind a load balancer, each instance may be pulling information from the real-time data source and publishing it to consumers slightly out of sync. If a GTFS-rt consumer makes two back-to-back requests, and each request is served by a different GTFS-rt feed instance, the same feed contents could potentially be returned to the consumer with different timestamps.
Possible solution - Configure the load balancer for "sticky routes", so that the consumer always receives the GTFS-rt feed contents from the same GTFS-rt instance.
The GTFS-rt header timestamp
should be monotonically increasing - it should always be the same value or greater than previous feed iterations if the feed contents are different.
Common mistakes - If there are multiple instances of GTFS-realtime feed behind a load balancer, each instance may be pulling information from the real-time data source and publishing it to consumers slightly out of sync. If a GTFS-rt consumer makes two back-to-back requests, and each request is served by a different GTFS-rt feed instance, the same feed contents could potentially be returned to the consumer with the most recent feed response having a timestamp that is less than the previous feed response.
Possible solution - Configure the load balancer for "sticky routes", so that the GTFS-rt consumer always receives the GTFS-rt feed contents from the same GTFS-rt instance.
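The two header timestamp rules above (E017 and E018) can be sketched as one comparison between consecutive feed iterations. This is an illustrative reading of the rule titles, not the validator's implementation; the class, enum, and parameter names are assumptions, and the caller is assumed to know whether the feed contents changed.

```java
final class HeaderTimestampSequenceExample {
    enum Result { OK, E017_CONTENT_CHANGED_SAME_TIMESTAMP, E018_TIMESTAMP_DECREASED }

    /** Compares the header timestamps (POSIX seconds) of two sequential feed iterations. */
    static Result compare(long previousTimestamp, long currentTimestamp, boolean contentChanged) {
        if (currentTimestamp < previousTimestamp) {
            return Result.E018_TIMESTAMP_DECREASED;
        }
        if (contentChanged && currentTimestamp == previousTimestamp) {
            return Result.E017_CONTENT_CHANGED_SAME_TIMESTAMP;
        }
        return Result.OK;
    }
}
```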
For frequency-based trips defined in frequencies.txt
with exact_times
= 1, the GTFS-rt trip start_time
must be some multiple (including zero) of headway_secs
later than the start_time
in file frequencies.txt
for the corresponding time period. Note that this does not apply to frequency-based trips defined in frequencies.txt
with exact_times
= 0.
From trip.start_time
:
start_time
- ...If the trip corresponds to exact_times=1 GTFS record, then start_time must be some multiple (including zero) of headway_secs later than frequencies.txt start_time for the corresponding time period.
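The headway check amounts to simple modular arithmetic on seconds since the start of the service day. The sketch below assumes well-formed `H:MM:SS` or `HH:MM:SS` inputs; the class and method names are hypothetical.

```java
final class HeadwayMultipleExample {
    /** Converts H:MM:SS or HH:MM:SS (hours may exceed 24) to seconds since the start of the service day. */
    static int toSeconds(String time) {
        String[] parts = time.split(":");
        return Integer.parseInt(parts[0]) * 3600
                + Integer.parseInt(parts[1]) * 60
                + Integer.parseInt(parts[2]);
    }

    /** True if rtStartTime is zero or more whole headways later than the frequencies.txt start_time. */
    static boolean isMultipleOfHeadway(String rtStartTime, String gtfsStartTime, int headwaySecs) {
        int diff = toSeconds(rtStartTime) - toSeconds(gtfsStartTime);
        return headwaySecs > 0 && diff >= 0 && diff % headwaySecs == 0;
    }
}
```

For example, with a frequencies.txt `start_time` of 06:00:00 and `headway_secs` of 600, a GTFS-rt `start_time` of 06:10:00 passes, while 06:05:00 does not.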
start_time
must be in the format HH:MM:SS
or H:MM:SS
. Note that times can exceed 24 hrs if service goes into the next service day.
From trip.start_time
:
start_time
- ...Format and semantics of the field is same as that of GTFS/frequencies.txt/start_time, e.g., 1:15:35 or 25:15:35.
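One way to test the format is a regular expression. The pattern below is an assumption for illustration: it accepts one or two hour digits, so values past 24 are allowed, and restricts minutes and seconds to 00-59. The validator's own check may differ.

```java
import java.util.regex.Pattern;

final class StartTimeFormatExample {
    // H:MM:SS or HH:MM:SS; hours are not capped at 23 because service can run past midnight.
    private static final Pattern START_TIME = Pattern.compile("\\d{1,2}:[0-5]\\d:[0-5]\\d");

    static boolean isValidStartTime(String startTime) {
        return startTime != null && START_TIME.matcher(startTime).matches();
    }
}
```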
start_date
must be in the YYYYMMDD
format.
From trip.start_date
:
start_date
- The scheduled start date of this trip instance...In YYYYMMDD format.
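A sketch of a `YYYYMMDD` check using `java.time` follows. Rejecting impossible calendar dates (such as February 31) goes beyond a pure format check and is an assumption of this example, as are the class and method names.

```java
import java.time.LocalDate;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;
import java.time.format.ResolverStyle;

final class StartDateFormatExample {
    // STRICT resolution rejects dates that do not exist on the calendar.
    private static final DateTimeFormatter YYYYMMDD =
            DateTimeFormatter.ofPattern("uuuuMMdd").withResolverStyle(ResolverStyle.STRICT);

    static boolean isValidStartDate(String startDate) {
        // Require exactly eight digits before attempting to parse.
        if (startDate == null || !startDate.matches("\\d{8}")) {
            return false;
        }
        try {
            LocalDate.parse(startDate, YYYYMMDD);
            return true;
        } catch (DateTimeParseException e) {
            return false;
        }
    }
}
```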
stop_time_update
arrival/departure times between sequential stops should always increase - they should never be the same or decrease.
For normal scheduled trips (i.e., not defined in frequencies.txt
), the GTFS-realtime trip start_time
must match the first GTFS arrival_time
in stop_times.txt
for this trip.
From trip.start_time
:
start_time
- The initially scheduled start time of this trip instance. When the trip_id corresponds to a non-frequency-based trip, this field should either be omitted or be equal to the value in the GTFS feed.
Common mistakes - Accidentally providing a GTFS-realtime time that is modulo 24hr, such as 00:02:00
, when that trip start time in GTFS stop_times.txt
is after midnight of the service day, such as 24:02:00.
Possible solution - Make sure that any start_times
in GTFS-realtime match that same trip start time in GTFS stop_times.txt
, especially if the trip starts after midnight of the service day.
GTFS-rt trip direction_id
must match the direction_id
in GTFS trips.txt
.
From trip.direction_id
:
direction_id
- The direction_id from the GTFS feed trips.txt file, indicating the direction of travel for trips this selector refers to.
Within the same stop_time_update
, arrival and departure times can be the same, or the departure time can be later than the arrival time - the departure time should never come before the arrival time.
Vehicle position must be valid WGS84 coordinates - latitude must be between -90 and 90 (inclusive), and vehicle longitude must be between -180 and 180 (inclusive).
From vehicle.position
:
latitude
- Degrees North, in the WGS-84 coordinate system.
longitude
- Degrees East, in the WGS-84 coordinate system.
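The coordinate bounds translate directly into a range check. This is a minimal sketch with hypothetical names; it treats both limits as inclusive, as the rule states.

```java
final class VehiclePositionExample {
    /** True if the coordinates fall within the WGS84 bounds (inclusive). NaN values fail the check. */
    static boolean isValidPosition(double latitude, double longitude) {
        return latitude >= -90.0 && latitude <= 90.0
                && longitude >= -180.0 && longitude <= 180.0;
    }
}
```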
Vehicle bearing must be between 0 and 360 degrees (inclusive). The GTFS-rt spec says bearing is:
...in degrees, clockwise from True North, i.e., 0 is North and 90 is East. This can be the compass bearing, or the direction towards the next stop or intermediate location. This should not be deduced from the sequence of previous positions, which clients can compute from previous data.
The vehicle position
should be inside the agency coverage area. Coverage area is defined by a buffer surrounding the GTFS shapes.txt
data, or stops.txt
locations if the GTFS feed doesn't include shapes.txt
.
Buffer distance is defined by GtfsMetadata.REGION_BUFFER_METERS
, and is currently 1609 meters (roughly 1 mile).
The vehicle position
should be within a buffer surrounding the GTFS shapes.txt
data for the current trip unless there is an alert
with the effect
of DETOUR
for this trip_id
.
Buffer distance is defined by GtfsMetadata.TRIP_BUFFER_METERS
, and is currently 200 meters (roughly 1/8 of a mile).
The GTFS-rt alert.informed_entity.trip.trip_id
should belong to the specified GTFS-rt alert.informed_entity.route_id
in GTFS trips.txt
.
The alert.informed_entity.trip.route_id
should be the same as the specified alert.informed_entity.route_id
.
All alerts must have at least one informed_entity
.
From alert.informed_entity
:
The values of the fields should correspond to the appropriate fields in the GTFS feed. At least one specifier must be given. If several are given, then the matching has to apply to all the given specifiers.
Alert informed_entity
should have at least one specified value (route_id
, trip_id
, stop_id
, etc.) to which the alert applies.
All agency_ids
provided in the GTFS-rt alert.informed_entity.agency_id
should also exist in GTFS agency.txt
.
The GTFS-rt trip.trip_id
should belong to the specified trip.route_id
in GTFS trips.txt
.
trip
says:
If route_id is also set, then it should be same as one that the given trip corresponds to.
Sequential GTFS-rt trip stop_time_updates
should never have the same stop_sequence
- stop_sequence
must increase for each stop_time_update
.
From GTFS stop_times.txt
:
The values for stop_sequence must be non-negative integers, and they must increase along the trip.
Common mistakes - Repeated records in the GTFS stop_times.txt
file
Possible solution - Make sure that no stop_sequence
is repeated in GTFS stop_times.txt
.
Sequential GTFS-rt trip stop_time_updates
shouldn't have the same stop_id
- sequential stop_ids
should be different. If a stop_id
is visited more than once in a trip (i.e., a loop), and if no stop_time_updates
in the loop are provided in the feed, and if the stop_sequence
field of the stop where the loop starts/stops is provided in the GTFS-rt feed for the given stop_id
, then this may not be an error.
header.gtfs_realtime_version
is required and must be a valid value. Currently, the only valid values are 1.0
and 2.0
.
The entity.is_deleted
field should only be included in GTFS-rt feeds with header.incrementality
of DIFFERENTIAL
.
All stop_time_updates
must contain stop_id
or stop_sequence
- both fields cannot be left blank.
From trip.stop_time_update
:
The update is linked to a specific stop either through stop_sequence or stop_id, so one of these fields must necessarily be set.
Unless a trip's
schedule_relationship
is CANCELED
, a trip
must have at least one stop_time_update.
If a stop_time_update
has a schedule_relationship
of NO_DATA
, then neither arrival
nor departure
should be provided.
From stop_time_update.schedule_relationship
:
NO_DATA
-> No data is given for this stop. It indicates that there is no realtime information available. When set NO_DATA is propagated through subsequent stops so this is the recommended way of specifying from which stop you do not have realtime information. When NO_DATA is set neither arrival nor departure should be supplied.
If a stop_time_update
doesn't have a schedule_relationship
of SKIPPED
or NO_DATA
, then either arrival
or departure
must be provided.
From stop_time_update.schedule_relationship
:
SCHEDULED
-> The vehicle is proceeding in accordance with its static schedule of stops, although not necessarily according to the times of the schedule. This is the default behavior. At least one of arrival and departure must be provided. If the schedule for this stop contains both arrival and departure times then so must this update.
If the stop_time_update.schedule_relationship
is not SKIPPED
, stop_time_update.arrival
and stop_time_update.departure
must have either delay
or time
- both fields cannot be missing.
Stop Time Updates description says:
The update can provide a exact timing for arrival and/or departure at a stop in StopTimeUpdates using StopTimeEvent. This should contain either an absolute time or a delay (i.e. an offset from the scheduled time in seconds).
stop_time_update.schedule_relationship
says:
SKIPPED
- The stop is skipped, i.e., the vehicle will not stop at this stop. Arrival and departure are optional.
- Stop Time Updates description
stop_time_update reference
stop_time_update.arrival and stop_time_update.departure (StopTimeEvent)
stop_time_update.schedule_relationship
If GTFS-rt stop_time_update contains both stop_sequence and stop_id, the values must match the GTFS data in stop_times.txt.
If only delay
is provided in a stop_time_update
arrival
or departure
(and not a time
), then the GTFS stop_times.txt
must contain arrival_times and/or departure_times for these corresponding stops. A delay
value in the real-time feed is meaningless unless you have a clock time to add it to in the GTFS stop_times.txt
file.
Common mistakes - Providing an arrival/departure.delay
value, but not providing an arrival/departure.time
value for non-timepoint stops that do not have an arrival_time
or departure_time
in GTFS stop_times.txt
.
Possible solution - Add a time
value to the GTFS-rt feed for the arrival
and departure
, or add an arrival_time
and departure_time
in GTFS stop_times.txt
.
stop_time_update
stop_time_update.arrival and stop_time_update.departure (StopTimeEvent)
- GTFS
stop_times.txt
If separate VehiclePositions
and TripUpdates
feeds are provided, VehicleDescriptor
or TripDescriptor
ID value pairing should match between the two feeds.
In other words, if the VehiclePosition
has a vehicle_id
A that is assigned to trip_id
4, then the TripUpdate
feed should have a prediction for trip_id
4 that includes a reference to vehicle_id
A. If the trip_id
of 4 is paired with a different vehicle_id
B in one of the two feeds, this is an error.
Note that this is different from W003, which simply checks to see if an ID that is provided in one feed is provided in the other - that is a warning.
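The pairing check can be sketched by comparing two maps of `trip_id` to `vehicle_id`, one built from each feed. The class and method names are hypothetical, and the sketch assumes each `trip_id` is paired with at most one `vehicle_id` per feed. A `trip_id` that is absent from the other feed is deliberately not reported here, since that case falls under W003.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

final class IdPairingExample {
    /** Returns trip_ids present in both feeds but paired with different vehicle_ids. */
    static List<String> mismatchedTrips(Map<String, String> vehiclePositionsTripToVehicle,
                                        Map<String, String> tripUpdatesTripToVehicle) {
        List<String> mismatches = new ArrayList<>();
        for (Map.Entry<String, String> entry : vehiclePositionsTripToVehicle.entrySet()) {
            String otherVehicleId = tripUpdatesTripToVehicle.get(entry.getKey());
            // Only flag trips that exist in both feeds with conflicting vehicle_ids.
            if (otherVehicleId != null && !otherVehicleId.equals(entry.getValue())) {
                mismatches.add(entry.getKey());
            }
        }
        return mismatches;
    }
}
```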
timestamp
must be populated in FeedHeader
for gtfs_realtime_version
v2.0 and higher.
incrementality
must be populated in FeedHeader
for gtfs_realtime_version
v2.0 and higher.
All timestamps should be less than the current time.
header.timestamp
says:
This timestamp identifies the moment when the content of this feed has been created (in server time). In POSIX time (i.e., number of seconds since January 1st 1970 00:00:00 UTC). To avoid time skew between systems producing and consuming realtime information it is strongly advised to derive timestamp from a time server. It is completely acceptable to use Stratum 3 or even lower strata servers since time differences up to a couple of seconds are tolerable.
Timestamps are flagged as being in the future if they are greater than the current time plus TimestampValidator.MAX_IN_FUTURE_SECONDS
, which is currently set to 60 seconds.
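The tolerance can be expressed as a single comparison. In this sketch the constant mirrors the 60-second value quoted above, while the class and method names are illustrative only. Passing the current time in as a parameter keeps the check easy to test.

```java
final class FutureTimestampExample {
    // Mirrors the 60-second tolerance described above.
    static final long MAX_IN_FUTURE_SECONDS = 60;

    /** Both arguments are POSIX seconds; true if the timestamp exceeds now plus the tolerance. */
    static boolean isInFuture(long timestampSecs, long nowSecs) {
        return timestampSecs > nowSecs + MAX_IN_FUTURE_SECONDS;
    }
}
```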
All stop_time_update
stop_sequences
in GTFS-realtime data must appear in GTFS stop_times.txt
for that trip.
To keep GTFS-rt validator runtime performance at O(n) for GTFS stop_times.txt (i.e., so we don't have to loop through the entire GTFS stop_times.txt for each GTFS-rt stop_time_update, which would be O(n*m)), if E051 is logged for a stop_time_update
, subsequent stop_time_updates
in that same GTFS-rt trip will not be checked for other errors or warnings (e.g., E046 - GTFS-rt stop_time_update
without time
doesn't have arrival/departure_time in GTFS).
See this issue for details.
Each vehicle should have a unique ID.
From VehiclePosition.VehicleDescriptor for vehicle.id
:
Internal system identification of the vehicle. Should be unique per vehicle, and is used for tracking the vehicle as it proceeds through the system. This id should not be made visible to the end-user; for that purpose use the label field
timestamps
should be populated for FeedHeader
, TripUpdates
, VehiclePositions
, and Alerts
.
Including timestamps
for each entity type enhances the transit rider experience, as consumers can show timestamp
information to end users to give them an idea of how old certain information is.
For example, when a vehicle position is shown on a map, the marker may say "Data updated 17 sec ago" (see screenshot below). If vehicle position timestamps
aren't included, then the consumer must use the GTFS-rt header timestamp
, which may be much more recent than the actual vehicle position, resulting in misleading information being shown to end users.
vehicle_id
should be populated for TripUpdates and VehiclePositions.
Populating vehicle_ids
in TripUpdates is important so consumers can relate a given arrival/departure prediction to a particular vehicle.
If separate VehiclePositions
and TripUpdates
feeds are provided, a trip_id
that is provided in the VehiclePositions
feed should be provided in the TripUpdates
feed, and a vehicle_id that is provided in the TripUpdates
feed should be provided in the VehiclePositions
feed.
In other words, if the VehiclePosition
has a vehicle that is assigned to trip_id
4, then the TripUpdate
feed should have a prediction for trip_id
4.
Note that when a vehicle is serving more than one trip in a block, it is recommended to include not only a TripUpdate for the currently served trip, but also a TripUpdate for the next trip to be served. In this case, there will not yet be a VehiclePosition for the next TripUpdate, and the W003 warning can be ignored.
Note that this is different from E047, which checks for a mismatch of IDs between the feeds - that is an error.
vehicle.position.speed
has an unrealistic value that may be incorrect.
Speeds are flagged as unrealistic if they are greater than VehicleValidator.MAX_REALISTIC_SPEED_METERS_PER_SECOND
, which is currently set to 26 meters per second (approx. 60 miles per hour).
Common mistakes - Accidentally setting the speed value in miles per hour, instead of meters per second.
Possible solution - Check to make sure the speed units are meters per second.
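To illustrate the unit mix-up, the sketch below converts miles per hour to meters per second and applies the 26 m/s threshold quoted above. The class and helper names are hypothetical. The conversion factor of 1609.344 meters per mile is the standard definition of the international mile.

```java
final class SpeedUnitsExample {
    // Mirrors the 26 m/s threshold described above.
    static final double MAX_REALISTIC_SPEED_METERS_PER_SECOND = 26.0;
    static final double METERS_PER_MILE = 1609.344;

    static double mphToMetersPerSecond(double milesPerHour) {
        return milesPerHour * METERS_PER_MILE / 3600.0;
    }

    static boolean isUnrealistic(double speedMetersPerSecond) {
        return speedMetersPerSecond > MAX_REALISTIC_SPEED_METERS_PER_SECOND;
    }
}
```

A bus traveling at 45 mph is flagged if the raw value 45 is written to the feed, but not after conversion to about 20.1 m/s.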
Frequency-based exact_times = 0 trip_updates should contain vehicle_id
. This helps disambiguate predictions in situations where more than one vehicle is running the same trip instance simultaneously.
trips
should include a trip_id
. A missing trip_id
is usually an error in the feed (especially for frequency-based exact_times
= 0 trips - see E006), although the section on "Alternative trip matching" includes one exception:
Trips which are not frequency based may also be uniquely identified by a TripDescriptor including the combination of:
route_id
direction_id
start_time
start_date
...where
start_time
is the scheduled start time as defined in the static schedule, as long as the combination of ids provided resolves to a unique trip.
GTFS-realtime feeds should be refreshed at least every 35 seconds.
The data in a GTFS-realtime feed should always be less than one minute old.
trip.schedule_relationship
and stop_time_update.schedule_relationship
should be populated.