Skip to content

Instantly share code, notes, and snippets.

@joseph-higaki
Last active August 2, 2021 10:26
Show Gist options
  • Save joseph-higaki/81880dc4ecb1878ff26a969f8e5ef2d6 to your computer and use it in GitHub Desktop.
Save joseph-higaki/81880dc4ecb1878ff26a969f8e5ef2d6 to your computer and use it in GitHub Desktop.
Back 2 Back Project Tasks

Back 2 Back Project Tasks

Having a simple task (called projects) table with task_id, start and end dates, identify begin and end date for groups of tasks that are back 2 back. Back 2 back (b2b), means that a task starts on the same date where another ended.

2 solutions:

1. Using CTEs

I approached the solution as one where you needed to group the b2b tasks by its root or beggining task. As this is a tree, I used recursion through common table expressions (CTEs).

I have a thing for recursion, as you can see here, where I built a date dimension using CTEs and the perks of cartesian product.

This query will identify the root tasks and then group by them. Once grouped, min and max dates are selected

WITH recursive CTE_Projects AS
(
    SELECT
         0 previous_task_id, task_id root_task_id, task_id, Start_Date, End_Date
    FROM Projects
    where Start_Date not IN (SELECT End_Date FROM Projects)
    union all
    select
           previous.task_id parent_task_id, previous.root_task_id root_task_id,  p.task_id, p.start_date, p.end_date
    from Projects p, CTE_Projects previous
    where p.start_date = previous.end_date
)
SELECT
    MIN(Start_Date), MAX(End_Date),
    DATEDIFF(MAX(End_Date), MIN(Start_Date))
from CTE_Projects
group by root_task_id
order by DATEDIFF(MAX(End_Date), MIN(Start_Date)), MIN(Start_Date)

2. Using joins and aggregates

This was probably the easier more obvious solution. Identify both start and end tasks and just identify the minimum end task that corresponds to each start task. This way you get start and end date of b2b tasks.

This solution takes the assumption that there are NO OVERLAPPING GROUP OF B2B TASKS

SELECT
    start_task.Start_Date,
    MIN(end_task.End_Date),
    DATEDIFF(MIN(end_task.End_Date), start_task.Start_Date)
FROM
    (SELECT Start_Date FROM Projects WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)) start_task,
    (SELECT End_Date FROM Projects WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)) end_task
WHERE start_task.Start_Date <= end_task.End_Date
GROUP BY start_task.Start_Date
ORDER BY DATEDIFF(MIN(end_task.End_Date), start_task.Start_Date), start_task.Start_Date

This problem/question was taken from here You can use the MySQL script below to populate test data and test your solution against the actual hackerrank question.

/* Using CTEs */
WITH recursive CTE_Projects AS
(
SELECT
0 previous_task_id, task_id root_task_id, task_id, Start_Date, End_Date
FROM Projects
where Start_Date not IN (SELECT End_Date FROM Projects)
union all
select
previous.task_id parent_task_id, previous.root_task_id root_task_id, p.task_id, p.start_date, p.end_date
from Projects p, CTE_Projects previous
where p.start_date = previous.end_date
)
SELECT
MIN(Start_Date), MAX(End_Date),
DATEDIFF(MAX(End_Date), MIN(Start_Date))
from CTE_Projects
group by root_task_id
order by DATEDIFF(MAX(End_Date), MIN(Start_Date)), MIN(Start_Date)
/* Using joins and aggregates */
SELECT
start_task.Start_Date,
MIN(end_task.End_Date),
DATEDIFF(MIN(end_task.End_Date), start_task.Start_Date)
FROM
(SELECT Start_Date FROM Projects WHERE Start_Date NOT IN (SELECT End_Date FROM Projects)) start_task,
(SELECT End_Date FROM Projects WHERE End_Date NOT IN (SELECT Start_Date FROM Projects)) end_task
WHERE start_task.Start_Date <= end_task.End_Date
GROUP BY start_task.Start_Date
ORDER BY DATEDIFF(MIN(end_task.End_Date), start_task.Start_Date), start_task.Start_Date
create schema if not exists playground_project_management;
use playground_project_management;
drop table if exists projects;
create table projects
(
task_id int not null,
start_date date,
end_date date,
PRIMARY KEY (task_id)
);
/**** Template to fill in google sheets ****/
/*
insert into projects (task_id, start_date, end_date) values ({$task_id}, '{$start_date}','{$end_date}');
*/
insert into projects (task_id, start_date, end_date) values (1, '2015-10-01','2015-10-02');
insert into projects (task_id, start_date, end_date) values (24, '2015-10-02','2015-10-03');
insert into projects (task_id, start_date, end_date) values (2, '2015-10-03','2015-10-04');
insert into projects (task_id, start_date, end_date) values (23, '2015-10-04','2015-10-05');
insert into projects (task_id, start_date, end_date) values (3, '2015-10-11','2015-10-12');
insert into projects (task_id, start_date, end_date) values (22, '2015-10-12','2015-10-13');
insert into projects (task_id, start_date, end_date) values (4, '2015-10-15','2015-10-16');
insert into projects (task_id, start_date, end_date) values (21, '2015-10-17','2015-10-18');
insert into projects (task_id, start_date, end_date) values (5, '2015-10-19','2015-10-20');
insert into projects (task_id, start_date, end_date) values (20, '2015-10-21','2015-10-22');
insert into projects (task_id, start_date, end_date) values (6, '2015-10-25','2015-10-26');
insert into projects (task_id, start_date, end_date) values (19, '2015-10-26','2015-10-27');
insert into projects (task_id, start_date, end_date) values (7, '2015-10-27','2015-10-28');
insert into projects (task_id, start_date, end_date) values (18, '2015-10-28','2015-10-29');
insert into projects (task_id, start_date, end_date) values (8, '2015-10-29','2015-10-30');
insert into projects (task_id, start_date, end_date) values (17, '2015-10-30','2015-10-31');
insert into projects (task_id, start_date, end_date) values (9, '2015-11-01','2015-11-02');
insert into projects (task_id, start_date, end_date) values (16, '2015-11-04','2015-11-05');
insert into projects (task_id, start_date, end_date) values (11, '2015-11-05','2015-11-06');
insert into projects (task_id, start_date, end_date) values (15, '2015-11-06','2015-11-07');
insert into projects (task_id, start_date, end_date) values (10, '2015-11-07','2015-11-08');
insert into projects (task_id, start_date, end_date) values (14, '2015-11-11','2015-11-12');
insert into projects (task_id, start_date, end_date) values (12, '2015-11-12','2015-11-13');
insert into projects (task_id, start_date, end_date) values (13, '2015-11-17','2015-11-18');
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment