Skip to content

Instantly share code, notes, and snippets.

@dsamojlenko
Last active February 13, 2017 10:29
MySQL Stored Procedure to get upcoming bus times for a stop from GTFS data. This query includes buses that started running before midnight on the night before but continue through the current day. It also includes service additions and removals from calendar_dates. WARNING: without the proper indices, this query can be VERY slow. See my other Gi…
DELIMITER $$
DROP PROCEDURE IF EXISTS `ocdata`.`getNextTripsForStop`$$
CREATE PROCEDURE `ocdata`.`getNextTripsForStop` (
IN SelectedStop VARCHAR(6),
IN SearchDateTime DATETIME
)
BEGIN
SET @StopID = SelectedStop;
SET @DateToday = SearchDateTime;
SET @DateYesterday = DATE_SUB(@DateToday, INTERVAL 1 DAY);
SET @DateTomorrow = DATE_ADD(@DateToday, INTERVAL 1 DAY);
SET @monday = IF(LCASE(DATE_FORMAT(@DateToday,'%W')) = 'monday', 1, NULL);
SET @tuesday = IF(LCASE(DATE_FORMAT(@DateToday,'%W')) = 'tuesday', 1, NULL);
SET @wednesday = IF(LCASE(DATE_FORMAT(@DateToday,'%W')) = 'wednesday', 1, NULL);
SET @thursday = IF(LCASE(DATE_FORMAT(@DateToday,'%W')) = 'thursday', 1, NULL);
SET @friday = IF(LCASE(DATE_FORMAT(@DateToday,'%W')) = 'friday', 1, NULL);
SET @saturday = IF(LCASE(DATE_FORMAT(@DateToday,'%W')) = 'saturday', 1, NULL);
SET @sunday = IF(LCASE(DATE_FORMAT(@DateToday,'%W')) = 'sunday', 1, NULL);
SET @ymonday = IF(LCASE(DATE_FORMAT(@DateYesterday,'%W')) = 'monday', 1, NULL);
SET @ytuesday = IF(LCASE(DATE_FORMAT(@DateYesterday,'%W')) = 'tuesday', 1, NULL);
SET @ywednesday = IF(LCASE(DATE_FORMAT(@DateYesterday,'%W')) = 'wednesday', 1, NULL);
SET @ythursday = IF(LCASE(DATE_FORMAT(@DateYesterday,'%W')) = 'thursday', 1, NULL);
SET @yfriday = IF(LCASE(DATE_FORMAT(@DateYesterday,'%W')) = 'friday', 1, NULL);
SET @ysaturday = IF(LCASE(DATE_FORMAT(@DateYesterday,'%W')) = 'saturday', 1, NULL);
SET @ysunday = IF(LCASE(DATE_FORMAT(@DateYesterday,'%W')) = 'sunday', 1, NULL);
SELECT stop_name,
route_short_name,
trip_headsign,
IF(arrival_time >= '24:00:00', SUBTIME(arrival_time,'24:00:00'),arrival_time) AS ArrivalTime,
ArrivalDate,
CONCAT(ArrivalDate, ' ', IF(arrival_time >= '24:00:00', SUBTIME(arrival_time,'24:00:00'),arrival_time)) AS ArrivalDateTime
FROM (
SELECT s.stop_name,
r.route_short_name,
t.trip_headsign,
st.arrival_time,
DATE(IF(st.arrival_time >= '24:00:00', DATE_ADD(@DateYesterday, INTERVAL 1 DAY), @DateYesterday)) AS ArrivalDate
FROM stops s
INNER JOIN stop_times st ON s.stop_id = st.stop_id
INNER JOIN trips t ON st.trip_id = t.trip_id
INNER JOIN routes r ON t.route_id = r.route_id
INNER JOIN calendar c ON t.service_id = c.service_id
WHERE c.service_id IN (
SELECT service_id
FROM calendar c
WHERE DATE_FORMAT(@DateYesterday, '%Y%m%d') BETWEEN c.start_date AND c.end_date
AND (
c.monday = @ymonday
OR c.tuesday = @ytuesday
OR c.wednesday = @ywednesday
OR c.thursday = @ythursday
OR c.friday = @yfriday
OR c.saturday = @ysaturday
OR c.sunday = @ysunday
)
AND service_id NOT IN (
SELECT service_id FROM calendar_dates WHERE date = @DateYesterday AND exception_type = 2
)
UNION
SELECT c.service_id
FROM calendar c
INNER JOIN calendar_dates cd ON c.service_id = cd.service_id
WHERE cd.exception_type = 1
AND cd.date = @DateYesterday
)
AND s.stop_code = @StopID
HAVING CONCAT(ArrivalDate, ' ', arrival_time) > @DateToday
UNION
SELECT s.stop_name,
r.route_short_name,
t.trip_headsign,
st.arrival_time,
DATE(IF(st.arrival_time >= '24:00:00', DATE_ADD(@DateToday, INTERVAL 1 DAY), @DateToday)) AS ArrivalDate
FROM stops s
INNER JOIN stop_times st ON s.stop_id = st.stop_id
INNER JOIN trips t ON st.trip_id = t.trip_id
INNER JOIN routes r ON t.route_id = r.route_id
INNER JOIN calendar c ON t.service_id = c.service_id
WHERE c.service_id IN (
SELECT service_id
FROM calendar c
WHERE DATE_FORMAT(@DateToday, '%Y%m%d') BETWEEN c.start_date AND c.end_date
AND (
c.monday = @monday
OR c.tuesday = @tuesday
OR c.wednesday = @wednesday
OR c.thursday = @thursday
OR c.friday = @friday
OR c.saturday = @saturday
OR c.sunday = @sunday
)
AND service_id NOT IN (
SELECT service_id FROM calendar_dates WHERE date = @DateToday AND exception_type = 2
)
UNION
SELECT c.service_id
FROM calendar c
INNER JOIN calendar_dates cd ON c.service_id = cd.service_id
WHERE cd.exception_type = 1
AND cd.date = @DateToday
)
AND s.stop_code = @StopID
HAVING CONCAT(ArrivalDate, ' ', arrival_time) > @DateToday
) a
HAVING ArrivalDateTime BETWEEN @DateToday AND DATE_ADD(@DateToday, INTERVAL 1 HOUR)
ORDER BY ArrivalDateTime;
END;
@carlospuk
Copy link

Hi Dave,

This is a great piece of SQL, thanks for sharing.

I may be wrong, but I think there's a problem with your @dayoftheweek matching on line 37. As these trips left the preceding day, should you not be checking the calendar day match for the preceding day too.

In other words, perhaps add this at the top:
SET @DayOfTheWeekYesterday = LCASE(DATE_FORMAT(DATE_SUB(NOW(), INTERVAL 1 DAY),'%W'));

then change line 37 to:
AND ', @DayOfTheWeekYesterday , ' = 1

@dsamojlenko
Copy link
Author

I didn't catch your comment until now... I haven't been doing any work with this in a while - but I think you're right. I see you added the changes to your fork - I'm going to try that out. Thanks!

@dsamojlenko
Copy link
Author

Revamped the code completely - I think this way of doing the c.{day of the week} clause is a bit more verbose, but it allows us to have a straight-up stored procedure instead of all that string concatenation nonsense from the previous version. sure wish we could have variable column names in mysql...

This takes care of routes that start the previous day before midnight (yesterday) ending after midnight (today). But I'm wondering if we need to do the same thing for the following day - if it's 11:55pm, there may be arrivals in the next hour that started after midnight (ie, tomorrow). This is already running slow-ish... hate to add a whole other UNION inside the sub-query...

@stebazzi
Copy link

stebazzi commented Jun 9, 2016

Hi Dave,
I tried to solve the problem between 11.00pm and 11.59.59pm without success.
The only working solution that I found is to change the line WHERE arrival_time BETWEEN TIME(NOW()) AND TIME(DATE_ADD(NOW(), INTERVAL 1 HOUR))
with simply
where arrival_time >= NOW()
but I don't like it because of the uncorrect order arrival time
Do you have any other possible solution?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment