@profjsb
Last active October 18, 2016 17:07

Time-series BoF (Moore-Sloan 2016 retreat)

Organizers: Brett Naul, Josh Bloom, Stéfan van der Walt

Traditional inference techniques and infrastructure tools can be ill-suited to time-series data, which may be noisy, streaming, multispectral, irregularly sampled, and/or extremely large. This BoF aims to identify promising approaches to time-series analysis across a diverse set of use cases and to find projects of common interest for potential future collaborations.

Approach:

  • Share use cases, tooling & pain points around time-series analysis and inference
  • Identify common tools & difficulties across use cases
  • Discussion of open issues
  • Find potential cross-domain/cross-methodology areas for future collaboration

Areas of interest:

  • How to deal with "too much data" -- sensors generating more data than can be sent to the analysis pipeline (e.g., radio astronomy, high-energy physics)
  • Rapid/real-time inference with limited data (seismology)
  • Dealing with concept drift; online/incremental models
  • Databases/query mechanisms for time-series data (e.g., InfluxDB)
  • Inference influencing outcomes influencing inference... (e.g., reinforcement learning)
  • Anomaly detection
  • Handling noisy, uncertain, irregularly sampled data

bnaul commented Oct 18, 2016

Looks good 👍. Just a couple of minor style things:

  • "time series" (below Approach) vs. "time-series"
  • I'd make "... (Reinforcement learning)" consistent with the style of the line above it: "(e.g., reinforcement learning)"

@choldgraf

Another thought or potential topic of conversation: challenges in dealing with scale. E.g., as a neuroscientist, when I think "time series" I think of sampling rates on the order of >= 2 Hz, and in my specific case of electrophysiology, more like >= 100 Hz. That seems to be way faster than most other kinds of data out there, and as such many packages that are designed for TSA (e.g., pandas' time-series functionality) aren't really useful for neuro analysis. I wonder if that's a fundamental problem that can't be resolved, or if there are more clever package structures that could handle it.

stefanv commented Oct 18, 2016

"sensors generating more data than can bear sent to analysis pipeline" -- maybe "than the analysis pipeline can bear" or "than can be consumed by the analysis pipeline"?
