Jeremy Dyer jdye64

  • Nvidia
  • Atlanta, GA
jdye64 / Dask Dataframe Assign
Last active July 11, 2022 13:57
Dask DataFrame assign on empty DataFrame against non-empty DataFrame produces an empty DataFrame
import pandas as pd
import dask.dataframe as dd
# Create an empty dask.dataframe
df = dd.from_pandas(pd.DataFrame(), npartitions=1)
mappings = {'a': 1}
# Assign the new columns and data
df = df.assign(**mappings)
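The zero-row result described above can be illustrated with plain pandas, which is a reasonable stand-in for the single-partition case but is not the Dask code path itself. This is a minimal sketch: the variable names are made up for illustration, and Dask's behaviour may differ in detail.

```python
import pandas as pd

# A scalar passed to assign() is broadcast along the frame's index.
# An empty DataFrame has an empty index, so the new column has zero rows.
empty = pd.DataFrame()
still_empty = empty.assign(a=1)

# Giving the frame an index first leaves something to broadcast over.
one_row = pd.DataFrame(index=[0]).assign(a=1)

print(len(still_empty), list(still_empty.columns))
print(len(one_row), one_row["a"].tolist())
```

The column is created in both cases; only the number of rows differs, because the row count comes from the existing index rather than from the assigned value.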
git clone
cd dask-sql
git checkout datafusion-sql-planner
conda env create -n dask-datafusion -f ./continuous_integration/environment-3.9-dev.yaml
conda activate dask-datafusion
python ./ install

Dask-SQL SegFault notes and observations


  • Occurs regardless of the LocalCUDACluster transport specified (e.g. UCX, TCP)
  • Only occurs when ucx-py is installed in the Anaconda environment AND LocalCUDACluster is used instead of the standard Distributed.Client
  • In any environment without UCX, the issue cannot be reproduced


I have provided two test cases, one with UCX and one without. The tests are kept as close to each other as possible (some imports had to be removed) to demonstrate the failures.

from custreamz import kafka
# How to connect to Kafka, brokers, partitions, security, etc ...
# Full list of configurations can be found at:
kafka_configs = {
"": "localhost:9092",
"": "custreamz-client",
# Global Arguments
echo "======== RapidsAI Xavier Installation Script ========"
1) Cloned the cudf repo from my fork, created a cudf_xavier branch, and added rapidsai/cudf as upstream, expecting that I might need to make code changes to cudf and could capture them in this branch.
2) Installed cmake via sudo apt-get install cmake, since the build wouldn't work without cmake installed.
3) That caused problems because the installed cmake version was 3.10.2 and cudf needs >= 3.12 ... let's try something else. SKIP THIS STEP!
4) I wanted to do this without conda, but I'm going to install conda and use the cmake that it installs.
5) Of course, Anaconda does not seem to officially support ARM64, so that route is not going to work ... on to something else.
6) I could build the latest version of cmake from source ... let's try that. CMake does not offer prebuilt binaries for ARM64, so it has to be built from source.
7) cd /tmp && wget && tar -xzvf ./cmake-3.16.2.tar.gz && cd cmake-3.16.2 && ./bootstrap && make && make install
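The version mismatch in step 3 (3.10.2 installed, >= 3.12 required) is the kind of check that can be scripted before starting a long build. The following is a minimal sketch, not part of the original notes; `parse_version` and `meets_minimum` are hypothetical helper names, and it assumes purely numeric, dot-separated version strings.

```python
def parse_version(text):
    """Turn a dotted version string such as '3.10.2' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().split("."))


def meets_minimum(installed, required):
    """Return True if the installed version is at least the required one."""
    a = parse_version(installed)
    b = parse_version(required)
    # Pad the shorter tuple with zeros so '3.12' compares like '3.12.0'.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


# Versions mentioned in the notes above.
apt_cmake_ok = meets_minimum("3.10.2", "3.12")     # apt-installed cmake
source_cmake_ok = meets_minimum("3.16.2", "3.12")  # cmake built from source
print(apt_cmake_ok, source_cmake_ok)
```

Comparing integer tuples rather than raw strings avoids ordering mistakes such as treating "3.9" as newer than "3.12".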
import codecs
def onTrigger(context, session):
flow_file = session.get()
if flow_file is not None:'got flow file: %s' % flow_file.getAttribute('filename'))
filename = flow_file.getAttribute('filename')
counter = filename.split('_')[1].split('.')[0]'counter is: %s' % counter)
weight_file = open('/tmp/weight/weight_' + counter + '.txt')
import picamera
import usb.core
import usb.util
import time
VENDOR_ID = 0x0922
PRODUCT_ID = 0x8003
# Find the Dymo USB scale.
device = usb.core.find(idVendor=VENDOR_ID, idProduct=PRODUCT_ID)
echo "Downloading OS X MiNiFi-CPP Garcon Binary from S3 ...."
cd /tmp
sudo yum install -y epel-release
sudo yum install -y leveldb boost
sudo yum install -y wget unzip
wget -O /tmp/
unzip /tmp/
/tmp/nifi-minifi-cpp-0.3.0/bin/ start
sudo su && yum install -y git wget
cd /opt
tar -xzvf ./nifi-1.1.2-bin.tar.gz
rm -f nifi-1.1.2-bin.tar.gz
nifi-1.1.2/bin/ install
# Warning: for now this starts NiFi as a root process, which I need for some JNI and USB operations I'm testing.
service nifi start