Christopher Bradford bradfordcp
@jamesc127
jamesc127 / cassandra_tombstones.md
Last active September 7, 2022 01:02
A Brief Explanation On Cassandra Tombstones

Here is an example of several different kinds of C* tombstones. For reference, they are:

  • Partition
  • Row
  • Cell
  • Range

Intro

The obvious culprit of tombstone creation is DELETE, but there are other, less obvious sources of tombstones. Let’s see exactly what happens on disk when a tombstone is created. It's funny to say that a tombstone is created... aren't we deleting things? Remember, everything in C* is a write! A DELETE operation writes another sstable entry with a newer timestamp than all the other entries... and the most recent timestamp wins!
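The "most recent timestamp wins" rule can be sketched in a few lines of Python. This is a simplified illustration and not Cassandra's actual code: the `Cell` class and `reconcile` function are hypothetical names, a tombstone is modeled as a cell whose value is `None`, and timestamp ties are ignored.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Cell:
    """One written version of a cell (hypothetical model, not a C* class)."""
    value: Optional[str]  # None stands in for a tombstone marker
    timestamp: int        # write timestamp


def reconcile(versions: List[Cell]) -> Optional[str]:
    """Return the value of the newest version, or None if a tombstone wins."""
    winner = max(versions, key=lambda cell: cell.timestamp)
    return winner.value


# Two ordinary writes followed by a DELETE, which is just another write.
writes = [Cell("alice", 100), Cell("alicia", 200)]
delete = Cell(None, 300)

print(reconcile(writes))             # newest live value
print(reconcile(writes + [delete]))  # tombstone has the newest timestamp
```

The older entries are still present in the list after the delete; the tombstone only shadows them because it carries the newest timestamp.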

I’ve created a DataStax Studio notebook that complements this gist. You should be able to download it and run it for yourself :-) nodetool and sstabledump commands need to be run from a terminal on the node(s) you're working with.

DataStax Enterprise can be used to create a highly available, distributed document management system.

Using the project https://github.com/PatrickCallaghan/datastax-document-management, we can create a scalable system which can accommodate billions of files while making their metadata and content searchable.

The premise of the demo project is to allow directories of files to be processed into the system which will extract their content and useful searchable metadata such as 'created' and 'author'.

Figure 1. Document Deployment Diagram.

@cliffrowley
cliffrowley / STREAMDECK_HID.md
Last active May 7, 2024 00:09
Notes on the Stream Deck HID protocol

Stream Deck Protocol

How to interface with a Stream Deck device.

Synopsis

The device uses the HID protocol to communicate with its software.

Configuration

@ibspoof
ibspoof / restore_node_from_opscenter_backups.ini
Last active February 22, 2019 16:04
Restore a single node's SSTables from OpsCenter's S3 backup location using multi-threaded downloads
[s3]
#s3 bucket name
bucket_name = my_backups
download_threads = 6
# other s3 access is defined in the default aws cli settings file
[opscenter]
backup_job_uuid = # get this from s3 bucket
[node]
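As a rough sketch of how a script might read the settings shown above, the snippet below parses them with Python's standard `configparser`. The restore script itself is not shown in this preview, so the loading code is an assumption; only the section and key names come from the file. The `[node]` section is left empty because its contents are truncated.

```python
import configparser

# The visible part of the config, inlined so the example is self-contained.
CONFIG_TEXT = """\
[s3]
#s3 bucket name
bucket_name = my_backups
download_threads = 6
# other s3 access is defined in the default aws cli settings file
[opscenter]
backup_job_uuid = # get this from s3 bucket
[node]
"""

# inline_comment_prefixes makes the trailing "# get this ..." a comment,
# so backup_job_uuid parses as an empty string until it is filled in.
config = configparser.ConfigParser(inline_comment_prefixes=("#",))
config.read_string(CONFIG_TEXT)

bucket = config.get("s3", "bucket_name")
threads = config.getint("s3", "download_threads")
job_uuid = config.get("opscenter", "backup_job_uuid")

if not job_uuid:
    print("backup_job_uuid is not set yet")
print(bucket, threads)
```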
@ibspoof
ibspoof / nodetool_diff.py
Last active December 28, 2019 00:22
nodetool tablestats difference tool
#!/usr/bin/env python
import re
import sys
REGEX_KEYSPACE = re.compile(r'^Keyspace: (.*)|^Keyspace : (.*)')
REGEX_TABLE = re.compile(r'^\t\tTable: (.*)')
REGEX_READ_CNT = re.compile(r'^\t\tLocal read count: (.*)')
REGEX_WRITE_CNT = re.compile(r'^\t\tLocal write count: (.*)')
##
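The rest of the script is cut off here. As a hedged sketch of how these patterns could be applied, the snippet below parses a few sample lines and subtracts two snapshots. The `parse_tablestats` and `diff_counts` functions and the sample input are assumptions made for illustration, not the gist's actual code.

```python
import re

REGEX_KEYSPACE = re.compile(r'^Keyspace: (.*)|^Keyspace : (.*)')
REGEX_TABLE = re.compile(r'^\t\tTable: (.*)')
REGEX_READ_CNT = re.compile(r'^\t\tLocal read count: (.*)')
REGEX_WRITE_CNT = re.compile(r'^\t\tLocal write count: (.*)')


def parse_tablestats(lines):
    """Map (keyspace, table) to its local read and write counts."""
    stats = {}
    keyspace = table = None
    for line in lines:
        match = REGEX_KEYSPACE.match(line)
        if match:
            # Only one of the two alternatives captures a group.
            keyspace = (match.group(1) or match.group(2)).strip()
            table = None
            continue
        match = REGEX_TABLE.match(line)
        if match:
            table = match.group(1).strip()
            stats[(keyspace, table)] = {}
            continue
        if table is None:
            continue  # keyspace-level lines are ignored in this sketch
        match = REGEX_READ_CNT.match(line)
        if match:
            stats[(keyspace, table)]["reads"] = int(match.group(1))
            continue
        match = REGEX_WRITE_CNT.match(line)
        if match:
            stats[(keyspace, table)]["writes"] = int(match.group(1))
    return stats


def diff_counts(before, after):
    """Subtract counters for tables present in both snapshots."""
    return {
        key: {name: after[key][name] - before[key].get(name, 0)
              for name in after[key]}
        for key in after if key in before
    }


# Made-up sample input shaped to match the patterns above.
first = ["Keyspace : demo", "\t\tTable: users",
         "\t\tLocal read count: 12", "\t\tLocal write count: 40"]
second = ["Keyspace : demo", "\t\tTable: users",
          "\t\tLocal read count: 20", "\t\tLocal write count: 45"]

print(diff_counts(parse_tablestats(first), parse_tablestats(second)))
```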
@harshavardhana
harshavardhana / nginx-minio-static.md
Last active April 19, 2024 09:33 — forked from koolhead17/gist:4b8dd8d95ec86368634693cf9ad9391c
How to configure a static website using Nginx with MinIO?

How to configure a static website using Nginx with MinIO?

1. Install nginx

2. Install minio

3. Install mc client

4. Create a bucket:

$ mc mb myminio/static
Bucket created successfully ‘myminio/static’.
@stefanfoulis
stefanfoulis / docker_for_mac_disk_default_size.md
Last active June 29, 2023 12:02
How to resize Docker for Mac Disk image and set the default size for new images

Set the default size for new Docker for Mac disk images

UPDATE: The instructions here are no longer necessary! Resizing the disk image is now possible right from the UI since Docker for Mac Version 17.12.0-ce-mac49 (21995).

If you are getting the error: No space left on device

Configuring the qcow2 size cap is possible in the current versions:

# my disk is currently 64GiB
@devdazed
devdazed / tc.py
Created December 2, 2015 21:29
SSTable Tombstone Counter
import fileinput, re, operator
from collections import Counter
def sizeof_fmt(num, suffix='B'):
    for unit in ['', 'Ki', 'Mi', 'Gi', 'Ti', 'Pi', 'Ei', 'Zi']:
        if abs(num) < 1024.0:
            return "%3.1f%s%s" % (num, unit, suffix)
        num /= 1024.0
    return "%.1f%s%s" % (num, 'Yi', suffix)
@rbranson
rbranson / gist:038afa9ad7af3693efd0
Last active September 29, 2016 17:44
Disaggregated Proxy & Storage Nodes

The point of this is to use cheap machines with small/slow storage to coordinate client requests while dedicating the machines with the big and fast storage to doing what they do best. I found that request coordination was contributing to about half the CPU usage on our Cassandra nodes, on average. Solid state storage is quite expensive, nearly doubling the cost of typical hardware. This split also means that if people have control over hardware placement within the network, they can place proxy nodes closer to the client without impacting their storage footprint or fault tolerance characteristics.

This is accomplished in Cassandra by passing the -Dcassandra.join_ring=false option when the process is started. These nodes will connect to the seeds, cache the gossip data, load the schema, and begin listening for client requests. Messages like "/x.x.x.x is now UP!" will appear on the other nodes.

There are also some more practical benefits to this. Handling client requests caused us to push the NewSize of the heap up

@lenards
lenards / just_enough_scala.md
Last active October 1, 2022 23:32
A short introduction to Scala syntax and operations reworked and heavily borrowed from Holden Karau's "Scala Crash Course"

Just Enough Scala

(a moderately, well, shameless rework of Holden Karau's "Scala - Crash Course")

Scala is a multi-paradigm high-level language for the JVM.

It offers the ability to use both Object-oriented & Functional approaches.

Scala is statically typed. Type inference eliminates the need for more explicit type declarations.