@krzysztof-slowinski
Last active March 2, 2018 16:39
Creating a new column based on index shifts of existing column and interpolated missing values (https://stackoverflow.com/questions/48563101/creating-a-new-column-based-on-index-shifts-of-existing-column-and-interpolated)
import numpy as np
import pandas as pd


def create_shift(df, column, shift_value, method, name):
    """
    Create a new column based on an existing column with a given shift value. The shifted column is aligned to the
    existing index, with the missing values interpolated using the given method.
    :param df: DataFrame to create the shift in (modified in place).
    :param column: The column name.
    :param shift_value: The value to shift the existing column's index by.
    :param method: The interpolation method, passed to Series.interpolate. Note that 'linear' treats the points as
        equally spaced, while 'index' uses the numeric index values.
    :param name: The name used for the newly created column.
    """
    if column in df.columns:
        current_index = df.index
        # creating the shifted index with 2 decimal places of precision
        shift_index = [round(i + shift_value, 2) for i in current_index.values]
        shift_data = pd.Series(data=df[column].tolist(), index=shift_index)
        # removing possible duplicates (keeping the first value for each index label)
        shift_data = shift_data[~shift_data.index.duplicated(keep='first')]
        shift_index = shift_data.index
        # original index labels that have no value after the shift
        missing_index = current_index.difference(shift_index)
        combined_index = pd.Index(np.append(shift_index, missing_index)).sort_values()
        # reindexing leaves NaN at the missing labels, which are then interpolated
        combined_data = shift_data.reindex(combined_index)
        combined_data = combined_data.interpolate(method=method)
        # the assignment aligns on df's index, so only the original labels are kept
        df[name] = combined_data
    else:
        print("[Warning] Cannot create shift {} for missing {} column...".format(name, column))


d1 = {'a': [4.0, 5.5, 5.5, 6.0, 8.5], 'b': [1.0, 2.5, 2.5, 3.0, 5.5]}
df1 = pd.DataFrame(data=d1, index=[0, 1.5, 1.5, 2, 4.5])
create_shift(df1, 'a', 0.5, 'linear', 'c')
print(df1)
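
The `method` argument is handed to `Series.interpolate`, and the choice matters when the index is unevenly spaced, as it is in the example above. The snippet below is a minimal sketch on a made-up three-point series (the values are illustrative and not taken from the gist) showing how `'linear'` and `'index'` fill the same gap differently:

```python
import numpy as np
import pandas as pd

# Hypothetical series with one gap on an unevenly spaced index.
s = pd.Series([0.0, np.nan, 10.0], index=[0.0, 1.0, 4.0])

# 'linear' ignores the index and treats the points as equally spaced,
# so the gap is filled with the midpoint of its neighbours.
linear = s.interpolate(method='linear')

# 'index' uses the numeric index values, so the fill is weighted by
# how far label 1.0 sits between labels 0.0 and 4.0.
by_index = s.interpolate(method='index')

print(linear.loc[1.0])    # 5.0
print(by_index.loc[1.0])  # 2.5
```

If the shifted column is meant to follow the index spacing, passing `'index'` instead of `'linear'` to `create_shift` may be the closer fit; which one is appropriate depends on the data.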