@h3ct0r
Created July 27, 2017 17:05
Compare time series using SSE (sum of squared errors)
import pandas as pd
data_1 = {
    'date': [
        '2014-05-01 18:47:05.069722',
        '2014-05-01 18:47:05.119994',
        '2014-05-02 18:47:05.178768',
        '2014-05-02 18:47:05.230071',
        '2014-05-02 18:47:05.230071',
        '2014-05-02 18:47:05.280592',
        '2014-05-03 18:47:05.332662',
        '2014-05-03 18:47:05.385109',
        '2014-05-04 18:47:05.436523',
        '2014-05-04 18:47:05.486877'
    ],
    'data': [34, 25, 26, 15, 15, 14, 26, 25, 62, 41]
}
data_2 = {
    'date': [
        '2014-05-01 18:47:05.069722',
        '2014-05-01 18:47:05.119994',
        '2014-05-02 18:47:05.178768',
        '2014-05-02 18:47:05.230071',
        '2014-05-02 18:47:05.230071',
        '2014-05-02 18:47:05.280592',
        '2014-05-03 18:47:05.332662',
        '2014-05-03 18:47:05.385109',
        '2014-05-04 18:47:05.436523',
        '2014-05-04 18:47:05.486877'
    ],
    'data': [1, 2, 3, 4, 5, 6, 7, 8, 9, 111]
}
data_3 = {
    'date': [
        '2014-05-01 18:47:05.069722',
        '2014-05-01 18:47:05.119994',
        '2014-05-02 18:47:05.178768',
        '2014-05-02 18:47:05.230071',
        '2014-05-02 18:47:05.230071',
        '2014-05-02 18:47:05.280592',
        '2014-05-03 18:47:05.332662',
        '2014-05-03 18:47:05.385109',
        '2014-05-04 18:47:05.436523',
        '2014-05-04 18:47:05.486877'
    ],
    'data': [30, 20, 20, 16, 15, 13, 22, 21, 50, 31]
}
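# Build one DataFrame per series; rows are compared by position (default index).
df1 = pd.DataFrame(data_1, columns=['date', 'data'])
df2 = pd.DataFrame(data_2, columns=['date', 'data'])
df3 = pd.DataFrame(data_3, columns=['date', 'data'])
# SSE: square the row-wise differences and sum them; lower means more similar.
print('Comparison df1 vs df2', ((df2['data'] - df1['data']) ** 2).sum())
print('Comparison df1 vs df3', ((df3['data'] - df1['data']) ** 2).sum())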
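The comparison above is positional: the value in row i of one series is subtracted from the value in row i of the other, and the squared differences are summed. A minimal sketch of the same calculation as a reusable helper follows, assuming both inputs have the same length. The name `sum_squared_error` is hypothetical; it is not part of the original gist or of pandas.

```python
def sum_squared_error(a, b):
    """Return the sum of squared element-wise differences between a and b.

    Accepts any two equal-length sequences of numbers, for example lists
    or a pandas column such as df1['data']. Hypothetical helper name.
    """
    if len(a) != len(b):
        raise ValueError("inputs must have the same length")
    # Pair values by position, square each difference, and add them up.
    return sum((x - y) ** 2 for x, y in zip(a, b))


# Same values as the 'data' columns of df1, df2 and df3.
series_1 = [34, 25, 26, 15, 15, 14, 26, 25, 62, 41]
series_2 = [1, 2, 3, 4, 5, 6, 7, 8, 9, 111]
series_3 = [30, 20, 20, 16, 15, 13, 22, 21, 50, 31]

print('SSE series_1 vs series_2:', sum_squared_error(series_1, series_2))
print('SSE series_1 vs series_3:', sum_squared_error(series_1, series_3))
```

With these values the helper gives 10791 for the first pair and 355 for the second, so the third series is much closer to the first than the second one is. The length check is a design choice for this sketch: `zip` would otherwise silently truncate to the shorter input and understate the error.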