This file will include details on attempting to run an apache beam pipeline on a windows spark setup. Unlike many other techies out there, i am a firm Microsoft fan. I do believe they build quality products and over the last few years since Satya Nadella was handed the position of CEO, we're seeing massive strides on Microsoft products.
To that point, all my machines at home are windows based including my processing machine which i mosty use to run models or occasionally play a few games. A couple of weeks ago i decided to build a simple Apache Beam pipeline to process forex tick data into windowed candle stick form. Now this can be easily done in pandas if the data was smaller, having a 7GB tick data file, in order to do it using pandas we'd have to split the file into day or maybe week segments and process them one by one. In order to obviously avoid the simple way out, i decided to attempt to build a beam pipeline as it is promised to pr