Cosmo cosmojg

## fast_scansum.cu
// Fast block-local prefix-sum on CUDA, using warp-syncs.
// The input is an array of u32. It is mutated in place. Example:
// arr = [1,1,1,1,...]
// Becomes:
// arr = [1,2,3,4,...]
// The number of elements must be equal to threads per block (TPB).

#include <stdio.h>
#include <cuda_runtime.h>
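
The header comment above describes an in-place inclusive prefix sum. As a rough illustration only (this is not the gist's CUDA kernel, whose body is not shown here), the same result can be reproduced sequentially with a step-doubling scan. The function name `inclusive_scan` and the explicit u32 mask are assumptions made for this sketch.

```python
U32_MASK = 0xFFFFFFFF  # assumption: mimic u32 wraparound from the comment above


def inclusive_scan(values):
    """Return the inclusive prefix sum of values using step-doubling."""
    out = list(values)
    n = len(out)
    offset = 1
    while offset < n:
        # The snapshot stands in for the synchronization a parallel
        # version would need between steps.
        prev = list(out)
        for i in range(offset, n):
            out[i] = (prev[i] + prev[i - offset]) & U32_MASK
        offset *= 2
    return out


print(inclusive_scan([1, 1, 1, 1, 1, 1, 1, 1]))  # [1, 2, 3, 4, 5, 6, 7, 8]
```

Each pass doubles the distance a partial sum has travelled, so an array of n elements needs about log2(n) passes. A block-local GPU version would presumably assign one element per thread, which fits the stated requirement that the element count equal TPB.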

## a_b_challenge.md

      
VictorTaelin / a_b_challenge.md
Last active May 2, 2024 07:48 (1 file, 0 forks, 119 comments, 44 stars)

A::B Prompting Challenge: $10k to prove me wrong!

CHALLENGE

Develop an AI prompt that solves random 12-token instances of the A::B problem (defined here), with 90%+ success rate.

RULES

1. The AI will be given a <problem/> to solve.

We'll use your prompt as the SYSTEM PROMPT, and a specific instance of the problem as the PROMPT, inside XML tags. Example:

  
## README.md

      
Artefact2 / README.md
Last active July 13, 2024 04:58 (1 file, 8 forks, 13 comments, 92 stars)

GGUF quantizations overview

Which GGUF is right for me? (Opinionated)

Good question! I am collecting human data on how quantization affects outputs. See here for more information: ggerganov/llama.cpp#5962
In the meantime, use the largest that fully fits in your GPU. If you can comfortably fit Q4_K_S, try using a model with more parameters.

llama.cpp feature matrix

See the wiki upstream: https://github.com/ggerganov/llama.cpp/wiki/Feature-matrix

  
## llm_samplers_explained.md

      
kalomaze / llm_samplers_explained.md
Last active June 16, 2024 18:19 (1 file, 2 forks, 0 comments, 13 stars)

LLM Samplers Explained

Every time a large language model makes a prediction, every one of the thousands of tokens in its vocabulary is assigned a probability, from almost 0% to almost 100%. There are different ways to choose from those predictions.
This process is known as "sampling", and there are various strategies you can use, which I will cover here.

OpenAI Samplers

Temperature


Temperature is a way to control the overall confidence of the model's scores (the logits). What this means is that, if you use a lower value than 1.0, the relative distance between the tokens will become larger (more deterministic), and if you use a larger value than 1.0, the relative distance between the tokens becomes smaller (less deterministic).
1.0 Temperature is the original distribution that the model was trained to optimize for, since the scores remain the same.
Graph demonstration with voiceover: https://files.catbox.moe/6ht56x.mp4
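
To make the effect concrete, here is a minimal sketch of temperature applied to a softmax. The function name `softmax_with_temperature` and the three example logits are made up for illustration and do not come from any particular inference library; real samplers may apply temperature at a different point in the pipeline.

```python
import math


def softmax_with_temperature(logits, temperature=1.0):
    """Divide the logits by the temperature, then apply a softmax."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    scaled = [x / temperature for x in logits]
    # Subtract the maximum before exponentiating for numerical stability.
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.0]  # hypothetical scores for three tokens
for t in (0.5, 1.0, 2.0):
    probs = softmax_with_temperature(logits, t)
    print(t, [round(p, 3) for p in probs])
```

With these inputs, the top token's probability is highest at temperature 0.5 and lowest at 2.0, while 1.0 reproduces the plain softmax of the unscaled logits.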


## worldspider_poem_prompt.txt
MiniModel
minimodel

A self-contained, hyper-short post (I limit myself to 1024 characters, 2048 if I absolutely need it) which is intended to transmit a complete but not necessarily comprehensive model of some phenomenon, skill, etc.

The MiniModel format fell out of three things:

1. My dissatisfaction with essays and blog posts.
2. My experimentation with microblogging as a way of getting my ideas out faster and more incrementally.
3. [Maia Pasek's published notes page](https://web.archive.org/web/20170821010721/https://squirrelinhell.github.io/).

## aliases
# ==============================================================================
# ShellGPT
# ==============================================================================

# ------------------------------------------------------------------------------
if which sgpt >/dev/null 2>&1; then
# ------------------------------------------------------------------------------

alias sgpt-chat="sgpt --repl chat"
alias sgpt-code="sgpt --repl code --code"

## Hacking the LG Monitor's EDID.md

      
kj800x / Hacking the LG Monitor's EDID.md
Last active May 3, 2024 20:14 (1 file, 0 forks, 8 comments, 80 stars)

Hacking the LG Monitor's EDID

Preface: Posting these online since it sounds like these notes are somewhat interesting based on a few folks I've shared with. These are semi-rough notes that I basically wrote for myself in case I ever needed to revisit this fix, so keep that in mind.
I recently bought an LG ULTRAGEAR monitor secondhand off of a coworker. I really love it and it's been great so far, but I ran into some minor issues with it in Linux. It works great on both Mac and Windows, but on Linux it displays just a black panel until I use the second monitor to go in and reduce the refresh rate down to 60 Hz.
This has worked decently so far, but there are some issues:

- It doesn't work while Linux is booting up. The motherboard's boot sequence is visible just fine, but as soon as control is handed over to Linux, where I'd normally see a splash screen while waiting for my login window, I see nothing.
- It doesn't work on the login screen. This would be fine if login consistently worked on my second screen, but I need to manually switch


## blog.md

      
Hellisotherpeople / blog.md
Last active July 14, 2024 18:44 (4 files, 13 forks, 5 comments, 178 stars)

You probably don't know how to do Prompt Engineering, let me educate you.

You probably don't know how to do Prompt Engineering

(This post could also be titled "Features missing from most LLM front-ends that should exist")

Apologies for the snarky title, but there has been a huge amount of discussion around so-called "Prompt Engineering" these past few months on all kinds of platforms. Much of it is coming from individuals who are peddling an awful lot of "Prompting" and very little "Engineering".
Most of these discussions are little more than users finding that writing more creative and complicated prompts can help them solve a task that a simpler prompt could not. I claim this is not Prompt Engineering. This is not to say that crafting good prompts is not a difficult task, but it does not involve making any kind of sophisticated modification to the general "template" of a prompt.
Others, who I think do deserve to call themselves "Prompt Engineers" (and an awful lot more than that), have been writing about and utilizing the rich new ecosystem

  
## us_rmv_appointment_finder.py
# %% [markdown]
# RMV Appointment Finder v1.0

# %%
# Install Required Libraries
# !python3 -m pip install selenium pandas tqdm webdriver-manager -q
#
# Execution:
# python3 us_rmv_appointment_finder.py
#

## PROMPT.txt
Here is OCR'd text from a receipt:

"Dine-In

#15

<restaurant name redacted>

<address redacted> Phone <phone redacted>