@joel-bernstein
Last active September 10, 2018 15:51
Temporal Filter
complement(on="Product_s, Actual_Start_Date_dt",
  sort(by="Product_s asc, Actual_Start_Date_dt asc",
    having(
      select(
        innerJoin(
          search(cp, q="*:*", fl="id, Product_s, Reported_Date_dt", sort="Product_s asc", qt="/export"),
          search(cp-changes, q="*:*", fq="Actual_Start_Date_dt.epoch:[1 TO *]", fl="id, Product_Name_s, Actual_Start_Date_dt", sort="Product_Name_s asc", qt="/export"),
          on="Product_s=Product_Name_s"),
        id, Product_s, Product_Name_s, Reported_Date_dt, Actual_Start_Date_dt,
        sub(epoch(Reported_Date_dt), epoch(Actual_Start_Date_dt)) as diff),
      and(gt(diff, 0), lt(diff, 86400000)))),
  sort(by="Product_s asc, Actual_Start_Date_dt asc",
    having(
      select(
        innerJoin(
          search(cp, q="*:*", fl="id, Product_s, Reported_Date_dt", sort="Product_s asc", qt="/export"),
          search(cp-changes, q="*:*", fq="Actual_Start_Date_dt.epoch:[1 TO *]", fl="id, Product_Name_s, Actual_Start_Date_dt", sort="Product_Name_s asc", qt="/export"),
          on="Product_s=Product_Name_s"),
        id, Product_s, Product_Name_s, Reported_Date_dt, Actual_Start_Date_dt,
        sub(epoch(Reported_Date_dt), epoch(Actual_Start_Date_dt)) as diff),
      and(lt(diff, 0), gt(diff, -86400000)))))

joel-bernstein commented Sep 9, 2018

The Streaming Expression above is designed to find records that match the following temporal pattern:

Find incident reports for a product that occur within the 24-hour window following a change request, with the further stipulation that the change request was not preceded by an incident report in the previous 24 hours.

The expression accomplishes this by computing the complement of two joins with temporal filters:

Join 1: Joins all the incidents and changes by product. The joined records are then filtered to include only records where the incident was reported within the 24-hour window after the change (0 < diff < 86400000 ms, where diff is the incident report time minus the change start time).

Join 2: Joins all the incidents and changes by product. The joined records are then filtered to include only records where the incident was reported within the 24-hour window before the change (-86400000 ms < diff < 0).
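
The same logic can be sketched outside Solr to make the pattern easier to follow. This is a minimal Python illustration, not the Streaming Expression implementation: the record layout (`product`, `reported`, `start`, `id`) and the function names are hypothetical, timestamps are assumed to be epoch milliseconds, and a nested loop stands in for the sorted merge join.

```python
DAY_MS = 24 * 60 * 60 * 1000  # 86400000 ms, the bound used in the expression


def temporal_join(incidents, changes, keep):
    """Join incidents to changes by product, keeping pairs whose diff passes keep()."""
    joined = []
    for inc in incidents:
        for chg in changes:
            if inc["product"] != chg["product"]:
                continue
            # diff > 0 means the incident was reported after the change started.
            diff = inc["reported"] - chg["start"]
            if keep(diff):
                joined.append({
                    "product": chg["product"],
                    "start": chg["start"],
                    "incident": inc["id"],
                    "diff": diff,
                })
    return joined


def complement(a, b, key):
    """Return the records of a whose key does not appear in b."""
    seen = {key(t) for t in b}
    return [t for t in a if key(t) not in seen]


def incidents_after_quiet_changes(incidents, changes):
    # Join 1: incidents reported within 24 hours after the change.
    after = temporal_join(incidents, changes, lambda d: 0 < d < DAY_MS)
    # Join 2: incidents reported within 24 hours before the change.
    before = temporal_join(incidents, changes, lambda d: -DAY_MS < d < 0)
    # Keep Join 1 records whose (product, change start) is absent from Join 2.
    return complement(after, before, key=lambda t: (t["product"], t["start"]))
```

Keying the complement on product and change start time mirrors `on="Product_s, Actual_Start_Date_dt"`: a change is dropped entirely as soon as any incident falls in the 24 hours before it.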


joel-bernstein commented Sep 9, 2018

Visual representation of the DAG (directed acyclic graph) created by the Streaming Expression:
[Image: screenshot taken 2018-09-09 at 1:20:15 PM]


joel-bernstein commented Sep 9, 2018

The time series plot below is an example of the pattern we are looking for. It shows the changes and incidents for the SAP S4 HANA product on 2018-07-19. Change requests are in blue and incident reports are in red. Notice that the incident reports follow the change request, and that the change request is not preceded by an incident.
[Image: screenshot taken 2018-09-09 at 1:28:38 PM]

@dennisgove

I don't see any reason why evaluators can't be used as part of any of the Join* types to avoid a full Cartesian product during the join followed by a filter. It'd really just be a matter of wrapping the evaluators in an Equalitor and using that Equalitor for the join.

@dennisgove

It'd probably be a nice approach to make all equalitors used in joins based on evaluators, actually. The current equality joins can be represented as evaluators with little effort. And using the full suite of evaluators in Joins opens up the ability to join on more than equality.

@joel-bernstein

Yeah, I was thinking the same thing. This would be a generalized approach for adding complex join constraints.
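
To make the idea concrete, here is a rough Python sketch of a join driven by an arbitrary predicate instead of plain field equality. It is only an illustration of the concept discussed above, not Solr's Equalitor or evaluator API: `predicate_join`, `equals`, `within_after`, and `all_of` are hypothetical names, the `incident_id` and `change_id` fields are made up to avoid a key clash, and dates are assumed to be epoch milliseconds.

```python
def equals(left_field, right_field):
    """Predicate for a plain equality join condition."""
    return lambda l, r: l[left_field] == r[right_field]


def within_after(left_field, right_field, window):
    """Predicate: the left value falls strictly within `window` after the right value."""
    return lambda l, r: 0 < l[left_field] - r[right_field] < window


def all_of(*predicates):
    """Combine several predicates into one that requires all of them to hold."""
    return lambda l, r: all(p(l, r) for p in predicates)


def predicate_join(left, right, matches):
    """Yield merged records only for pairs that satisfy the join predicate."""
    for l in left:
        for r in right:
            if matches(l, r):
                yield {**l, **r}


# An equality condition and a temporal condition expressed as one join condition.
condition = all_of(
    equals("Product_s", "Product_Name_s"),
    within_after("Reported_Date_dt", "Actual_Start_Date_dt", 86400000),
)
```

Because the temporal test is part of the join condition, pairs outside the window are never emitted and no separate filtering pass is needed. This sketch still compares every pair in a nested loop; a merge join over sorted streams would avoid that, which is beyond what the sketch tries to show.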
