- If values are integers in [0, 255], Parquet will automatically compress them to 1-byte unsigned integers, decreasing the size of the saved DataFrame by a factor of 8 relative to 8-byte integers.
- Partition DataFrames to have evenly-distributed, ~128MB partition sizes (empirical finding). Always err on the higher side w.r.t. number of partitions.
- Pay particular attention to the number of partitions when using `flatMap`, especially if the following operation will result in high memory usage. The `flatMap` op usually results in a DataFrame with a [much] larger number of rows, yet the number of partitions will remain the same. Thus, if a subsequent op causes a large expansion of memory usage (e.g. converting a DataFrame of indices to a DataFrame of large Vectors), the memory usage per partition may become too high. In this case, it is beneficial to repartition the output of `flatMap` to a number of partitions that will safely allow for appropriate partition memory sizes, based upon the
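
As a rough illustration of the sizing advice above, the following sketch estimates a partition count from a row count and a per-row size. The function name `suggest_num_partitions`, the per-row size estimate, and the example numbers are all assumptions made for illustration; they are not part of Spark, and a real job would need its own measurement of row size after the memory-expanding op.

```python
import math

# ~128MB per partition, the empirical target mentioned in the list above.
TARGET_PARTITION_BYTES = 128 * 1024 * 1024


def suggest_num_partitions(num_rows, est_bytes_per_row,
                           target_bytes=TARGET_PARTITION_BYTES):
    """Return a partition count that keeps each partition near or under target_bytes.

    Hypothetical helper: est_bytes_per_row is the caller's own estimate of the
    row size AFTER the memory-expanding op that follows flatMap.
    """
    total_bytes = num_rows * est_bytes_per_row
    # Round up so that we err on the side of more, smaller partitions.
    return max(1, math.ceil(total_bytes / target_bytes))


# Hypothetical numbers: 1,000,000 rows out of flatMap, each expanding to ~1 KiB.
num_partitions = suggest_num_partitions(1_000_000, 1024)
print(num_partitions)
```

The resulting count could then be passed to `repartition` on the `flatMap` output before the expensive op runs. Rounding up follows the advice to err on the higher side for the number of partitions.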
###
###
### UPDATE: For Win 11, I recommend using this tool in place of this script:
### https://christitus.com/windows-tool/
### https://github.com/ChrisTitusTech/winutil
### https://www.youtube.com/watch?v=6UQZ5oQg8XA
### iwr -useb https://christitus.com/win | iex
###
###
import zmq
# ZeroMQ Context
context = zmq.Context()
# Define the socket using the "Context"
sock = context.socket(zmq.SUB)
# Define subscription and messages with prefix to accept.
# setsockopt_string encodes the prefix to bytes, which Python 3 requires.
sock.setsockopt_string(zmq.SUBSCRIBE, "misp_json")
import ruamel.yaml
yaml = ruamel.yaml.YAML()
# Load the conda environment file, closing the handle afterwards
with open('environment.yml') as f:
    data = yaml.load(f)
requirements = []
for dep in data['dependencies']:
    # Plain string entries look like "package=version=build"
    if isinstance(dep, str):
        package, package_version, python_version = dep.split('=')
        if python_version == '0':
HackerNews discussed this with many alternative solutions: https://news.ycombinator.com/item?id=24893615
I already have my own domain name: `mydomain.com`. I wanted to be able to run some webapps on my Raspberry Pi 4B running perpetually at home in headless mode (it just needs 5W of power and wireless internet). I wanted to be able to access these apps from the public Internet. Dynamic DNS wasn't an option because my ISP blocks all incoming traffic. `ngrok` would work, but the free plan is too restrictive.
I bought a cheap 2GB RAM, 20GB disk VM + a 25GB volume on Hetzner for about 4 EUR/month. Hetzner gave me a static IP for it. I haven't purchased a floating IP yet.
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1" />
<title>*scratch*</title>
<style>
body {
font-family: Hack, Menlo, Monaco, 'Droid Sans Mono', 'Courier New', monospace;
white-space: pre;
# API Documentation: https://api.riskiq.net/api/articles/
# Command to pull IoCs for "Magecart Group 12: End of Life Magento Sites Infested with Ants and Cockroaches"
curl -u <RiskIQ_Email>:<APIKey> 'https://api.riskiq.net/pt/v2/articles/indicators?articleGuid=fda1f967' | jq '.indicators[].value'
This is inspired by A half-hour to learn Rust and Zig in 30 minutes.
Your first Go program as a classical "Hello World" is pretty simple:
First we create a workspace for our project:
VMWare Fusion 13 is now released. Read Vagrant and VMWare Fusion 13 Player on Apple M1 Pro for the latest.
This document summarizes notes taken while making the VMWare Tech preview work on Apple M1 Pro, it originated