@jamescdavis
Last active March 19, 2024 17:57
Experiences SDK Error Handling & Logging

Currently, we have no error handling or logging in the Experiences SDK. This has been mostly fine so far because of these assumptions:

  1. When EC/index.html (or any file) is not found, we handle this in CloudFront by catching 403 and instead returning 200
  2. Any other error from CloudFront is likely quite rare (assuming the current configuration & permissions remain in place)
  3. We just accept and use whatever is in index.html without any kind of validation

Here are some considerations for each of these assumptions:

With the introduction of reading a configuration file from the CDN we are introducing validation of said file, which invalidates assumption 3 for config.json. Config files that do not validate will be an error state (it will be handled and revert to default behavior of serving index.html, but still an error state since we cannot interpret the intended behavior from the config file). While we can and certainly will validate config.json on our side when it's written, we will still want/need to validate within the SDK and preferably report invalid files.
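As a rough sketch of what "validate in the SDK, fall back to the default, and report" could look like: the `ExperienceConfig` shape and its `entryPoint` field are made up for illustration (this note does not define the contents of config.json), and `parseConfig` is a hypothetical helper, not existing SDK code.

```typescript
// Hypothetical config shape: the real fields of config.json are not
// described in this note, so `entryPoint` is only a stand-in.
interface ExperienceConfig {
  entryPoint: string;
}

// Either a validated config, or the default plus a reason we can report.
type ConfigResult =
  | { ok: true; config: ExperienceConfig }
  | { ok: false; config: ExperienceConfig; reason: string };

// Default behavior described above: serve index.html.
const DEFAULT_CONFIG: ExperienceConfig = { entryPoint: "index.html" };

function parseConfig(raw: string): ConfigResult {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return { ok: false, config: DEFAULT_CONFIG, reason: "config.json is not valid JSON" };
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return { ok: false, config: DEFAULT_CONFIG, reason: "config.json is not a JSON object" };
  }
  const entryPoint = (parsed as Record<string, unknown>).entryPoint;
  if (typeof entryPoint !== "string" || entryPoint.length === 0) {
    return { ok: false, config: DEFAULT_CONFIG, reason: "entryPoint must be a non-empty string" };
  }
  return { ok: true, config: { entryPoint } };
}
```

The caller always gets a usable config back, so rendering can proceed with the default, while the `reason` on the failure branch is what we would hand to whatever logging mechanism we settle on.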

While assumption 2 may be mostly valid, right now the SDK will throw an error (technically, reject a promise) when any error is encountered making a request. It seems prudent that we either catch and handle the promise rejection (and probably log it), or instruct retailers to do so. Since there's not much a retailer could do with the error (besides inform us), it seems like the SDK handling and logging the error would be more useful.
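For illustration, here is one way the SDK could catch the rejection itself and turn it into a value the caller can inspect. `safeRequest`, `RequestResult`, and `ErrorReporter` are hypothetical names, and the actual request util may look quite different; this is a sketch of the idea, not a proposal for the final API.

```typescript
// Result type so callers never have to deal with a rejected promise.
type RequestResult<T> =
  | { ok: true; value: T }
  | { ok: false; error: Error };

// Hypothetical reporting hook; the real mechanism is still an open question.
type ErrorReporter = (error: Error, context: string) => void;

async function safeRequest<T>(
  doRequest: () => Promise<T>,
  context: string,
  report: ErrorReporter
): Promise<RequestResult<T>> {
  try {
    return { ok: true, value: await doRequest() };
  } catch (caught) {
    // Rejections are not guaranteed to be Error instances, so normalize.
    const error = caught instanceof Error ? caught : new Error(String(caught));
    try {
      report(error, context);
    } catch {
      // A failing reporter must not turn a handled error into an unhandled one.
    }
    return { ok: false, error };
  }
}
```

With something like this in the request util, every request is handled by construction, and consumers can still branch on `ok` if they want more specific handling.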

I'm not sure of (or can't remember) the history behind assumption 1, but it seems like there could be a future where we don't need this and 403s are just handled properly by the SDK. The exists() check would then be required to avoid an error state (attempting to render EC when none exists and getting a 403 instead of a 200). This seems fine for retailers using the SDK (assuming they are all upgraded to a version that supports 403 handling), but are there any other integrations still in use that rely on this 403 handling in CloudFront?
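If the SDK did take over 403 handling, the exists() check would need to tell "nothing published" apart from a real failure. A minimal sketch of that classification follows; `classifyStatus` is a hypothetical helper, and treating 404 the same as 403 is my assumption rather than something the current CloudFront setup is known to return.

```typescript
type ExistsOutcome = "exists" | "missing" | "error";

// Map an HTTP status code to what it would mean for an exists() check.
function classifyStatus(status: number): ExistsOutcome {
  if (status >= 200 && status < 300) {
    return "exists";
  }
  // 403 is what we get for a missing file per assumption 1;
  // 404 is included as an assumption in case the origin config changes.
  if (status === 403 || status === 404) {
    return "missing";
  }
  // Anything else (5xx, unexpected 4xx) is an error worth logging.
  return "error";
}
```

Only the `"error"` outcome would be a candidate for logging; `"missing"` is an expected state and should simply result in nothing being rendered.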

Some open questions:

  1. Should we be handling request errors in the SDK?
  2. If so, at what level? (e.g. in the request util, guaranteeing all requests are handled, or in consumers of the request util, so that more specific handling/logging can be done)
  3. Should we log errors?
  4. If so, what mechanism should we use? (log error codes to retail-client-events-service? somewhere else?)
  5. Should errors end up in BugSnag? (I don't think we can or should do this directly, but perhaps indirectly through retail-client-events-service?)
  6. If we log, what should we do when logging fails? (related to log queueing/retry)
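
To make question 6 a bit more concrete, here is one possible shape for log queueing/retry: a small bounded queue that keeps undelivered items and drops the oldest when full. `RetryQueue` and the boolean-returning `send` callback are hypothetical, and whether we want queueing at all is exactly what the question asks.

```typescript
// Bounded FIFO queue: failed sends stay queued for a later flush,
// and the oldest item is dropped once maxSize is reached.
class RetryQueue<T> {
  private readonly pending: T[] = [];
  private readonly send: (item: T) => boolean;
  private readonly maxSize: number;

  // `send` returns true when the item was delivered, false otherwise.
  constructor(send: (item: T) => boolean, maxSize: number) {
    this.send = send;
    this.maxSize = maxSize;
  }

  get size(): number {
    return this.pending.length;
  }

  enqueue(item: T): void {
    if (this.pending.length >= this.maxSize) {
      this.pending.shift(); // drop the oldest so memory stays bounded
    }
    this.pending.push(item);
  }

  // Try to deliver queued items in order; stop at the first failure.
  // Returns the number of items delivered by this call.
  flush(): number {
    let sent = 0;
    while (this.pending.length > 0) {
      const next = this.pending[0] as T;
      if (!this.send(next)) {
        break;
      }
      this.pending.shift();
      sent += 1;
    }
    return sent;
  }
}
```

Dropping the oldest entries is a deliberate trade-off in this sketch: it bounds memory on the retailer's page at the cost of losing some error reports during a long outage.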
@izgeri
izgeri commented Mar 19, 2024

Just noting here for anyone else reading -

Data Dogs discussed this a bit and we thought a nice interim solution could be to add an error logging utility to the SDK and do a near-term release that just logs an sdk_error event, so that we can start to collect data from the retailers that will auto-upgrade to the latest SDK. That will give us some insight into how frequently errors are happening in practice today and how critical/urgent this problem is.
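
A minimal sketch of what such a utility might look like, purely as an illustration: only the `sdk_error` event name comes from the discussion above. The other fields, `createErrorLogger`, and the injected `Transport` are assumptions, and how events actually reach retail-client-events-service is not shown.

```typescript
// Hypothetical payload: everything except the "sdk_error" name is assumed.
interface SdkErrorEvent {
  event: "sdk_error";
  code: string;
  message: string;
  sdkVersion: string;
  timestamp: string;
}

// Injected so the utility does not depend on a specific events client.
type Transport = (event: SdkErrorEvent) => void;

function createErrorLogger(sdkVersion: string, send: Transport) {
  return function logError(code: string, error: unknown): void {
    const message = error instanceof Error ? error.message : String(error);
    const event: SdkErrorEvent = {
      event: "sdk_error",
      code,
      message,
      sdkVersion,
      timestamp: new Date().toISOString(),
    };
    try {
      send(event);
    } catch {
      // Logging is best-effort: never let it break the retailer's page.
    }
  };
}
```

Usage would be along the lines of `const logError = createErrorLogger("1.2.3", send)` followed by `logError("config_invalid", err)`, where the version string and the error code are placeholders.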
