import os
import pandas as pd

# Build the path to the CSV file in the current working directory
filepath = os.path.join(os.getcwd(), 'sample_wind_data.csv')

# Read the data; read_csv also lets you control delimiters, rows and column names
df = pd.read_csv(filepath)

# Work with the first 48 rows only
df1 = df.head(48)
speeds = df1['AVGspeed']

# Collect every window of 3 consecutive rows whose AVGspeed values are all > 8
result = [speeds[i:i+3] for i in range(len(speeds) - 2) if all(v > 8 for v in speeds[i:i+3])]
result
result is a list of pandas Series, each holding the row indexes and AVGspeed values of 3 consecutive rows whose values are all > 8. The list is not deduplicated, so overlapping windows repeat rows.
For example, in the sample data there are four 3-hour intervals in which every value is greater than 8. Row 2 appears in result[0], result[1] and result[2], row 3 appears in result[1] and result[2], and so on:
[0 12.960
1 11.180
2 9.835
Name: AVGspeed, dtype: float64,
1 11.180
2 9.835
3 8.047
Name: AVGspeed, dtype: float64,
2 9.835
3 8.047
4 9.388
Name: AVGspeed, dtype: float64,
6 10.28
7 10.28
8 10.28
Name: AVGspeed, dtype: float64]
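If the repeated rows are a problem, one possible way to get each qualifying row only once is a rolling minimum. This is only a sketch: the values for rows 0-4 and 6-8 are copied from the output above, while the values for rows 5 and 9 are made-up fillers (all the output tells us is that row 5 is not above 8).

```python
import pandas as pd

# Rows 0-4 and 6-8 come from the sample output; rows 5 and 9 are hypothetical fillers
speeds = pd.Series(
    [12.960, 11.180, 9.835, 8.047, 9.388, 5.0, 10.28, 10.28, 10.28, 5.0],
    name="AVGspeed",
)

# A row ends a qualifying window when the minimum of it and the 2 rows before it is > 8
window_end = speeds.rolling(3).min() > 8

# A row is covered if it, the next row, or the row after that ends a qualifying window
covered = (
    window_end
    | window_end.shift(-1, fill_value=False)
    | window_end.shift(-2, fill_value=False)
)

# Each qualifying row appears exactly once
deduped = speeds[covered]
print(deduped)
```

With these values, window_end is True four times (matching the four windows above) and deduped keeps rows 0-4 and 6-8.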
The overlapping windows above may not be what you want, so here are two other output formats that might make sense. If you can post the output format you want (as markdown), that would help me finish up the script. :)
A list of entries that hold the start time and end time of each period in which the hourly values are all > 8, plus a boolean that is always True:
result = [[start_time, end_time, True], [start_time, end_time, True]]
A list of entries that hold the start time and end time of each period in which the hourly values are all > 8, plus the average AVGspeed during that period:
result = [[start_time, end_time, avg(AVGspeed over the period)], [start_time, end_time, avg(AVGspeed over the period)]]
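Here is a rough sketch of how the second format could be built. I don't know what the time column in your file is called, so the 'timestamp' column and its dates are assumptions, and the AVGspeed values for rows 5 and 9 are made-up fillers. I also assumed that a period means a run of at least 3 consecutive hours and that the end time is the timestamp of the last row in the run.

```python
import pandas as pd

# Hypothetical data: 'timestamp' and its dates are assumptions;
# AVGspeed rows 0-4 and 6-8 are from the sample output, rows 5 and 9 are fillers
df = pd.DataFrame({
    "timestamp": pd.date_range("2020-01-01 00:00", periods=10, freq=pd.Timedelta(hours=1)),
    "AVGspeed": [12.960, 11.180, 9.835, 8.047, 9.388, 5.0, 10.28, 10.28, 10.28, 5.0],
})

above = df["AVGspeed"] > 8

# Every time `above` flips, a new run starts; cumsum gives each run its own id
run_id = (above != above.shift()).cumsum()

result = []
for _, run in df[above].groupby(run_id[above]):
    if len(run) >= 3:  # keep only runs of at least 3 consecutive hours
        result.append([
            run["timestamp"].iloc[0],   # start time
            run["timestamp"].iloc[-1],  # end time (timestamp of the last row in the run)
            run["AVGspeed"].mean(),     # average AVGspeed over the run
        ])

print(result)
```

For the first format, append True instead of the mean. Unlike the sliding windows, this merges overlapping windows, so rows 0-4 come out as one period rather than three.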