Alenka Frim AlenkaF

@AlenkaF
AlenkaF / arrow_build.bash
Created October 13, 2021 11:51
How I tried to build PyArrow
# --- Versions ---
# Xcode: Xcode 13.0, Build version 13A233
# Anaconda: conda 4.10.1
# Python: 3.9
# --- Install XCode ---
# Maybe there should be a check if it is installed correctly?
# Haven't found any help online. It was strange that it wasn't able to open the
# Arrow folder at startup.
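# One possible way to do the check mentioned above, as a minimal sketch. The
# function name check_xcode is hypothetical and not part of the original
# script; it only relies on the standard macOS tools xcode-select and
# xcodebuild, and it reports a problem instead of aborting the build.

```shell
# Hypothetical helper (assumption, not from the original gist): report whether
# the Xcode developer tools look usable before starting the build.
check_xcode() {
  # xcode-select ships with macOS; if it is missing we are likely not on a Mac.
  if ! command -v xcode-select >/dev/null 2>&1; then
    echo "xcode-select not found (not macOS, or tools missing)"
    return 1
  fi
  # `xcode-select -p` prints the active developer directory and exits
  # non-zero when no developer tools are installed.
  if ! dev_dir="$(xcode-select -p 2>/dev/null)"; then
    echo "no active developer directory; try: xcode-select --install"
    return 1
  fi
  echo "developer directory: ${dev_dir}"
  # `xcodebuild -version` prints the Xcode version and build number when a
  # full Xcode installation is selected; fall back to a note otherwise.
  xcodebuild -version 2>/dev/null \
    || echo "xcodebuild unavailable (command line tools only?)"
}

check_xcode || echo "Xcode check failed; the build may not work"
```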
pushd arrow
git submodule init
git submodule update
export PARQUET_TEST_DATA="${PWD}/cpp/submodules/parquet-testing/data"
export ARROW_TEST_DATA="${PWD}/testing/data"
popd
conda create -y -n pyarrow-dev-no-gandiva -c conda-forge \
--file arrow/ci/conda_env_unix.txt \
--file arrow/ci/conda_env_cpp.txt \
AlenkaF / Benchmark_refactoring_PR.py
Last active August 30, 2022 15:19
Refactoring PR benchmark results, 10 iterations
##########################################
# dataframe-to-table
(qa) (base) alenkafrim@Alenkas-MacBook-Pro benchmarks % conbench dataframe-to-table chi_traffic_2020_Q1 --iterations 5
Time to POST http://localhost:5000/api/login/ 0.05665302276611328
POST http://localhost:5000/api/login/ failed
Time to POST http://localhost:5000/api/benchmarks/ 0.005268096923828125
POST http://localhost:5000/api/benchmarks/ failed
AlenkaF / Benchmark_baseline_PR.py
Last active August 30, 2022 13:31
Baseline PR benchmark results, 10 iterations
##########################################
# dataframe-to-table
(qa) (base) alenkafrim@Alenkas-MacBook-Pro benchmarks % conbench dataframe-to-table chi_traffic_2020_Q1 --iterations 5
Time to POST http://localhost:5000/api/login/ 0.0827488899230957
POST http://localhost:5000/api/login/ failed
Time to POST http://localhost:5000/api/benchmarks/ 0.004584789276123047
POST http://localhost:5000/api/benchmarks/ failed
(pyarrow-dev) C:\Users\Alenka\repos\arrow\cpp\build>cmake -G "Ninja" ^
More? -DARROW_DEPENDENCY_SOURCE=CONDA ^
More? -DCMAKE_INSTALL_PREFIX=%ARROW_HOME% ^
More? -DCMAKE_PREFIX_PATH=%ARROW_HOME% ^
More? -DARROW_BUILD_STATIC=OFF ^
More? -DARROW_CXXFLAGS="/WX /MP" ^
More? -DARROW_WITH_LZ4=ON ^
More? -DARROW_WITH_SNAPPY=ON ^
More? -DARROW_WITH_ZLIB=ON ^
More? -DARROW_WITH_ZSTD=ON ^
AlenkaF / Steps_build_Windows.txt
Last active October 4, 2022 13:09
Steps to build PyArrow on Windows
"C:\Program Files (x86)\Microsoft Visual Studio\2017\Community\Common7\Tools\VsDevCmd.bat" -arch=amd64
set CC=cl.exe
set CXX=cl.exe
conda create -y -n pyarrow-dev -c conda-forge ^
--file arrow\ci\conda_env_cpp.txt ^
--file arrow\ci\conda_env_python.txt ^
--file arrow\ci\conda_env_gandiva.txt ^
python=3.9
conda activate pyarrow-dev
AlenkaF / pyarrow_build_output.txt
Last active September 12, 2022 04:48
The output of the PyArrow build
(pyarrow-dev) C:\Users\Alenka\repos\arrow\python>python setup.py build_ext --inplace
running build_ext
creating C:\Users\Alenka\repos\arrow\python\build
creating C:\Users\Alenka\repos\arrow\python\build\cpp
-- Running CMake for PyArrow C++
cmake -DARROW_BUILD_DIR=build -DCMAKE_BUILD_TYPE=release -DCMAKE_INSTALL_LIBDIR=lib -DCMAKE_INSTALL_PREFIX=C:\Users\Alenka\repos\arrow\python\build\dist -DPYTHON_EXECUTABLE=C:\Users\Alenka\anaconda3\envs\pyarrow-dev\python.exe -DPython3_EXECUTABLE=C:\Users\Alenka\anaconda3\envs\pyarrow-dev\python.exe -DPYARROW_WITH_DATASET=on -DPYARROW_WITH_PARQUET_ENCRYPTION=on -DPYARROW_WITH_HDFS=off -G Ninja C:\Users\Alenka\repos\arrow\python\pyarrow/src
-- The C compiler identification is MSVC 19.16.27048.0
-- The CXX compiler identification is MSVC 19.16.27048.0
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
AlenkaF / inspect_files.txt
Created September 12, 2022 04:49
Loading PyArrow
(pyarrow-dev) C:\Users\Alenka\repos\arrow\python>cd pyarrow
(pyarrow-dev) C:\Users\Alenka\repos\arrow\python\pyarrow>ls
__init__.pxd _fs.pxd benchmark.py lib.cp39-win_amd64.pyd
__init__.py _fs.pyx builder.pxi lib.pxd
__pycache__ _gcsfs.pyx cffi.py lib.pyx
_compute.cp39-win_amd64.pyd _generated_version.py compat.pxi lib_api.h
_compute.pxd _hdfs.pyx compute.py memory.pxi
_compute.pyx _hdfsio.cp39-win_amd64.pyd config.pxi orc.py
_compute_docstrings.py _hdfsio.pyx conftest.py pandas-shim.pxi
AlenkaF / PyArrow_install_build.txt
Last active September 27, 2022 11:18
Installing PyArrow without an in-place build and without setting CONDA_DLL_SEARCH_MODIFICATION_ENABLE=1
(pyarrow-dev38) C:\Users\Alenka\repos\arrow\python>pip install -e .
Obtaining file:///C:/Users/Alenka/repos/arrow/python
Installing build dependencies ... done
Checking if build backend supports build_editable ... done
Getting requirements to build editable ... done
Preparing editable metadata (pyproject.toml) ... done
Requirement already satisfied: numpy>=1.16.6 in c:\users\alenka\anaconda3\envs\pyarrow-dev38\lib\site-packages (from pyarrow==10.0.0.dev169+gcbf0ec0d0.d20220927) (1.23.3)
Building wheels for collected packages: pyarrow
Building editable for pyarrow (pyproject.toml) ... done
Created wheel for pyarrow: filename=pyarrow-10.0.0.dev169+gcbf0ec0d0.d20220927-0.editable-cp38-cp38-win_amd64.whl size=26397 sha256=317b71a1b46c66253926260a1fc9c01919ff82fa3326a264956d187276c84b77
(pyarrow-dev-9) (base) alenkafrim@Alenkas-MacBook-Pro arrow % archery lint --clang-format
INFO:archery:Running C++ linters
-- Building using CMake version: 3.24.2
-- The C compiler identification is AppleClang 13.1.6.13160021
-- The CXX compiler identification is AppleClang 13.1.6.13160021
-- Detecting C compiler ABI info
-- Detecting C compiler ABI info - done
-- Check for working C compiler: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/cc - skipped
-- Detecting C compile features
-- Detecting C compile features - done