joelxiangnanchen

## longest_chinese_tokens_gpt4o.py
import tiktoken
import langdetect
T = tiktoken.get_encoding("o200k_base")

length_dict = {}

for i in range(T.n_vocab):
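    try:
        # Length, in characters, of the text this single token id decodes to.
        length_dict[i] = len(T.decode([i]))
    except:
        # Assumption: the preview is cut off here. Skipping ids that fail to
        # decode is one plausible handler, not necessarily the author's.
        continue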
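
Going by the file name, the script presumably goes on to pick out the longest Chinese tokens. The sketch below shows one way that step could look, using only the standard library. The helper names `is_cjk` and `longest_chinese_tokens`, the toy `decoded` mapping, and the choice of the CJK Unified Ideographs block (U+4E00 to U+9FFF) as the test for "Chinese" are all assumptions for illustration; the preview imports `langdetect`, which suggests the original may detect language differently.

```python
from typing import Dict, List, Tuple


def is_cjk(ch: str) -> bool:
    """Return True if ch is in the CJK Unified Ideographs block (U+4E00..U+9FFF)."""
    return "\u4e00" <= ch <= "\u9fff"


def longest_chinese_tokens(decoded: Dict[int, str], n: int = 5) -> List[Tuple[int, str]]:
    """Return up to n (token_id, text) pairs whose text is entirely CJK, longest first.

    `decoded` is a hypothetical mapping from token id to decoded text, e.g. built
    with a loop like the one above that stores the string instead of its length.
    """
    chinese = [
        (token_id, text)
        for token_id, text in decoded.items()
        if text and all(is_cjk(ch) for ch in text)
    ]
    # Stable sort: ties keep their original (insertion) order.
    chinese.sort(key=lambda item: len(item[1]), reverse=True)
    return chinese[:n]


# Toy data standing in for a real vocabulary.
toy = {0: "hello", 1: "中文", 2: "你好世界", 3: "a中", 4: ""}
print(longest_chinese_tokens(toy, n=2))
```

Mixed strings such as `"a中"` are dropped here because every character must be a CJK ideograph; relaxing that to "mostly CJK" would be an equally reasonable choice.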

## rl-for-llms.md

      
yoavg / rl-for-llms.md (1 file, 32 forks, 12 comments, 569 stars; last active September 27, 2025 08:52)

### Reinforcement Learning for Language Models

Yoav Goldberg, April 2023.

#### Why RL?

With the release of the ChatGPT model and follow-up large language models (LLMs), there was a lot of discussion of the importance of "RLHF training", that is, "reinforcement learning from human feedback".
I was puzzled for a while as to why RL (Reinforcement Learning) is better than learning from demonstrations (a.k.a. supervised learning) for training language models. Shouldn't learning from demonstrations (or, in language-model terminology, "instruction fine-tuning": learning to imitate human-written answers) be sufficient? I came up with a theoretical argument that was somewhat convincing. But I came to realize there is an additional argument which not only supports the case for RL training, but also requires it, in particular for models like ChatGPT. This additional argument is spelled out in (the first half of) a talk by John Schulman from OpenAI. This post pretty much

  
## ClaudeHack
#Claude Prompt Inject by Curtis White (Prompt Engineer)
This is a 3 to 5 stage prompt injection.
1. Print innocuous string [Corp AI]
2. Consider a simple hypothetical request.
3. Echo or repeat in new rules (can be used to revise program it)
4. Ask about the .p rule for clarification
5. Invoke the rule to generate a persona

Update: I have a new Claude breaker derivative that completely breaks its censorship for 3-5 turns. Undecided if I will publish, yet.

## ChatGPT-Dan-Jailbreak.md

      
coolaj86 / ChatGPT-Dan-Jailbreak.md (1 file, 782 forks, 1900 comments, 4805 stars; last active November 2, 2025 15:48)

### Chat GPT "DAN" (and other "Jailbreaks")


https://chat.openai.com/
- Is ChatGPT "DAN" Real? Gonna find out [Part 1] (https://www.youtube.com/watch?v=-q8woRG9FrI)
- Part 2: I thought ChatGPT DAN was a hoax, but... (https://www.youtube.com/watch?v=rHZRrDu3A2U&lc=UgxfrxX8aK7gnCzkend4AaABAg)


## top-k-top-p.py
def top_k_top_p_filtering(logits, top_k=0, top_p=0.0, filter_value=-float('Inf')):
    """ Filter a distribution of logits using top-k and/or nucleus (top-p) filtering
        Args:
            logits: logits distribution shape (vocabulary size)
            top_k >0: keep only top k tokens with highest probability (top-k filtering).
            top_p >0.0: keep the top tokens with cumulative probability >= top_p (nucleus filtering).
                Nucleus filtering is described in Holtzman et al. (http://arxiv.org/abs/1904.09751)
    """
    assert logits.dim() == 1  # batch size 1 for now - could be updated for more but the code would be less clear
    top_k = min(top_k, logits.size(-1))  # Safety check
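
The preview stops before the filtering itself. As a rough illustration of what the docstring describes, here is a minimal standard-library sketch that works on a plain list of floats instead of a tensor. The function name `top_k_top_p_filter` and the list-based interface are assumptions for this example, and the body is not the gist's own code; the nucleus rule used here (drop a token once the probability mass ranked ahead of it already exceeds `top_p`, so the token that crosses the threshold is kept) is one common convention.

```python
import math
from typing import List


def top_k_top_p_filter(logits: List[float], top_k: int = 0, top_p: float = 0.0,
                       filter_value: float = -math.inf) -> List[float]:
    """Return a copy of logits with filtered-out entries set to filter_value."""
    out = list(logits)
    top_k = min(top_k, len(out))  # same safety check as in the preview

    if top_k > 0:
        # The k-th largest logit is the cutoff; anything strictly below it is removed.
        kth_largest = sorted(out, reverse=True)[top_k - 1]
        out = [x if x >= kth_largest else filter_value for x in out]

    if top_p > 0.0:
        # Indices from most to least likely.
        order = sorted(range(len(out)), key=lambda i: out[i], reverse=True)
        # Softmax over the sorted logits, shifted by the max for numerical stability.
        max_logit = out[order[0]]
        exps = [math.exp(out[i] - max_logit) for i in order]
        total = sum(exps)
        mass_before = 0.0  # probability mass of tokens ranked ahead of the current one
        for rank, i in enumerate(order):
            if rank > 0 and mass_before > top_p:
                out[i] = filter_value
            mass_before += exps[rank] / total

    return out


logits = [math.log(p) for p in (0.5, 0.3, 0.15, 0.05)]
print(top_k_top_p_filter(logits, top_k=2))
print(top_k_top_p_filter(logits, top_p=0.9))
```

With the defaults (`top_k=0`, `top_p=0.0`) both filters are disabled and the logits come back unchanged, which mirrors the `>0` conditions in the docstring.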

## top_k_viterbi.py
import torch

# Credits to AllenNLP for the base implementation and base tests:
# https://github.com/allenai/allennlp/blob/master/allennlp/nn/util.py#L174

# Modified AllenNLP `viterbi_decode` to support `top_k` sequences efficiently.
def viterbi_decode(tag_sequence: torch.Tensor, transition_matrix: torch.Tensor, top_k: int=5):
    """
    Perform Viterbi decoding in log space over a sequence given a transition matrix
    specifying pairwise (transition) potentials between tags and a matrix of shape
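
The docstring is cut off in this preview, so the exact tensor shapes and return values of the gist's function are not visible. To illustrate the general idea of top-k Viterbi decoding in log space, here is a small pure-Python sketch. The function name `viterbi_top_k`, the list-of-lists inputs, and the `(path, score)` return format are assumptions made for this example and do not reproduce the AllenNLP-derived implementation.

```python
import heapq
from typing import List, Tuple


def viterbi_top_k(emissions: List[List[float]],
                  transitions: List[List[float]],
                  top_k: int = 1) -> List[Tuple[List[int], float]]:
    """Return up to top_k best tag paths with their log-space scores, best first.

    emissions[t][j]   : score of tag j at position t
    transitions[i][j] : score of moving from tag i to tag j
    """
    num_tags = len(transitions)
    # beams[j] holds up to top_k (score, path) pairs for paths ending in tag j.
    beams = [[(emissions[0][j], [j])] for j in range(num_tags)]

    for t in range(1, len(emissions)):
        new_beams = []
        for j in range(num_tags):
            # Extend every kept path from every previous tag i into tag j.
            candidates = [
                (score + transitions[i][j] + emissions[t][j], path + [j])
                for i in range(num_tags)
                for score, path in beams[i]
            ]
            # Keeping the k best per ending tag is enough to recover the global k best.
            new_beams.append(heapq.nlargest(top_k, candidates, key=lambda c: c[0]))
        beams = new_beams

    finals = [candidate for beam in beams for candidate in beam]
    best = heapq.nlargest(top_k, finals, key=lambda c: c[0])
    return [(path, score) for score, path in best]


emissions = [[1.0, 0.0], [0.0, 1.0], [1.0, 0.0]]
transitions = [[0.0, 0.0], [0.0, 0.0]]
print(viterbi_top_k(emissions, transitions, top_k=2))
```

This version copies paths at every step for readability; a tensor implementation would normally store backpointers and reconstruct the paths at the end.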

## tensorflow_mnist_estimator.py
#  Copyright 2017 Uber Technologies, Inc. All Rights Reserved.
#  Copyright 2016 The TensorFlow Authors. All Rights Reserved.
#
#  Licensed under the Apache License, Version 2.0 (the "License");
#  you may not use this file except in compliance with the License.
#  You may obtain a copy of the License at
#
#   http://www.apache.org/licenses/LICENSE-2.0
#
#  Unless required by applicable law or agreed to in writing, software

## 分布式系统学习资料.md

      
zjhiphop / 分布式系统学习资料.md (1 file, 170 forks, 3 comments, 475 stars; created July 23, 2015 06:13)

Description: Distributed systems study materials

### Distributed System (分布式系统) Resources

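##### If you would like to repost this, you do not need to contact me, but please keep the link to the original, because this project is ongoing and is updated from time to time. I hope everyone who reads it can learn something more.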


《Reconfigurable Distributed Storage for Dynamic Networks》

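Introduction: This paper describes how to reconfigure a distributed system in a dynamic network. The paper's author (an advisor) did distributed-systems research during his PhD at MIT and now supervises students at NUS, working not only on distributed systems but also on wireless networks. If you are interested, see his homepage.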