Radi Cho (radi-cho)

// Push button on pin 4; SPI lines on the Arduino Uno hardware SPI pins (10-13).
const int buttonPin = 4;
const int ss = 10;    // SPI slave select
const int mosi = 11;  // SPI MOSI
const int miso = 12;  // SPI MISO
const int sclk = 13;  // SPI clock
int lastState = LOW;  // previous button reading
int ledState = LOW;   // toggled output state

// The gist preview is truncated below; minimal bodies are filled in so the sketch compiles.
void setup() {
  pinMode(buttonPin, INPUT);
}
void loop() {}
radi-cho / gpt4-SHAP.txt
Created March 28, 2023 08:35
Prompt: "Explain in detail what the SHAP Gradient Explainer is and how it works. Use LaTeX math. Provide intuition."
The SHAP (SHapley Additive exPlanations) Gradient Explainer is a method for interpreting the output of machine learning models by assigning feature importance values to input features. The idea behind SHAP values is to fairly distribute the contribution of each feature to the prediction for a specific instance by considering all possible feature combinations. SHAP values are based on the concept of Shapley values from cooperative game theory, which ensures that the contributions are fairly distributed among the features.
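As a sketch of the math referenced above, the Shapley value assigned to feature $i$ for a model $f$ explained at an input $x$ is

\phi_i(f, x) = \sum_{S \subseteq F \setminus \{i\}} \frac{|S|!\,(|F| - |S| - 1)!}{|F|!} \left[ f_x(S \cup \{i\}) - f_x(S) \right],

where $F$ is the set of all features and $f_x(S)$ denotes the expected model output when only the features in $S$ are fixed to their values in $x$. The combinatorial weight is the fraction of feature orderings in which exactly the features in $S$ precede $i$, which is what makes the attribution fair in the Shapley sense.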
The Gradient Explainer specifically targets Deep Learning models, such as neural networks, and uses the gradients of the model's output with respect to its input features to approximate SHAP values. The Gradient Explainer is computationally efficient and can handle large neural networks with high-dimensional inputs. It works by combining the gradients with the expected values of the model output to provide an explanation for the local behavior of the model around a specific input instance.
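In practice this method is exposed by the shap library as shap.GradientExplainer; the tiny model and random data below are placeholders to make the call pattern concrete, not taken from the gist.

import numpy as np
import shap
from tensorflow import keras

# Placeholder model and data; any differentiable TensorFlow/Keras or PyTorch model works.
model = keras.Sequential(
    [keras.Input(shape=(4,)), keras.layers.Dense(8, activation="relu"), keras.layers.Dense(1)]
)
background = np.random.randn(100, 4).astype("float32")  # samples used to estimate expected gradients
X = np.random.randn(10, 4).astype("float32")            # instances to explain

explainer = shap.GradientExplainer(model, background)
shap_values = explainer.shap_values(X)  # attributions with the same shape as X, per model output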
import tensorflow as tf
from datasets import load_dataset
from transformers import TFMT5ForConditionalGeneration, MT5Tokenizer, DataCollatorForSeq2Seq
from tensorflow.keras.optimizers import Adam

tokenizer = MT5Tokenizer.from_pretrained("google/mt5-small")
model = TFMT5ForConditionalGeneration.from_pretrained("google/mt5-small")

dataset = load_dataset("csv", data_files="train.csv")
dataset = dataset["train"].shuffle(seed=42)

def preprocess_function(examples):
    padding = "max_length"
    max_length = 200
    inputs = [ex for ex in examples["Text"]]
    targets = [ex for ex in examples["Expected"]]
    model_inputs = tokenizer(inputs, max_length=max_length, padding=padding, truncation=True)
    # The gist preview cuts off here; tokenizing the targets as labels is a reconstruction.
    labels = tokenizer(text_target=targets, max_length=max_length, padding=padding, truncation=True)
    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

train_dataset = dataset.map(preprocess_function, batched=True, desc="Running tokenizer")

data_collator = DataCollatorForSeq2Seq(
    tokenizer,
    model=model,
    label_pad_token_id=tokenizer.pad_token_id,
    pad_to_multiple_of=64,
    return_tensors="np")

# The prepare_tf_dataset call is truncated in the preview; these arguments are assumptions.
tf_train_dataset = model.prepare_tf_dataset(
    train_dataset,
    batch_size=8,
    shuffle=True,
    collate_fn=data_collator)

model.compile(optimizer=Adam(3e-5))
early_stopping = tf.keras.callbacks.EarlyStopping(monitor='loss', patience=3)
model.fit(tf_train_dataset, epochs=10, callbacks=[early_stopping])
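After fine-tuning, a minimal generation sketch (the input string is a placeholder and the generation settings are assumptions, not from the gist):

sample = tokenizer(["example input text"], return_tensors="tf", padding=True)
generated = model.generate(**sample, max_length=200)
print(tokenizer.decode(generated[0], skip_special_tokens=True))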
from tensorflow import keras
from tensorflow.keras import layers

class gMLPLayer(layers.Layer):
    def __init__(self, num_patches, embedding_dim, dropout_rate, *args, **kwargs):
        super(gMLPLayer, self).__init__(*args, **kwargs)

        self.channel_projection1 = keras.Sequential(
            [
                layers.Dense(units=embedding_dim * 2),
                layers.ReLU(),
                layers.Dropout(rate=dropout_rate),
            ]
        )
        # The preview is truncated here; the full gist also defines channel_projection2,
        # spatial_gating_unit, and normalize1, which call() below relies on.

    def call(self, inputs):
        x = self.normalize1(inputs)
        x_projected = self.channel_projection1(x)
        # x_spatial shape: [batch_size, num_patches, embedding_dim].
        x_spatial = self.spatial_gating_unit(x_projected)
        # x_projected shape: [batch_size, num_patches, embedding_dim].
        x_projected = self.channel_projection2(x_spatial)
        # Skip connection, reconstructed from the standard gMLP block (not in the preview).
        return inputs + x_projected

# num_patches, embedding_dim, dropout_rate, and num_blocks are defined elsewhere in the gist.
gmlp_blocks = keras.Sequential(
    [gMLPLayer(num_patches, embedding_dim, dropout_rate) for _ in range(num_blocks)]
)