@baberabb
baberabb / spo_loss.py
Created June 10, 2024 15:38 — forked from crowsonkb/spo_loss.py
Scalar Preference Optimization
"""Scalar Preference Optimization."""
import torch
from torch.nn import functional as F
def logp_completion(logits, tokens, mask):
    """Compute the log probabilities of completions given their prompts.

    Args:
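The preview of spo_loss.py is cut off mid-docstring above. For orientation, here is a minimal sketch of how a completion log-probability helper with this signature is commonly implemented: take the log-softmax of the logits, gather the log probability of each actual next token, and sum over the positions selected by the mask. The body below is an assumption based on that standard pattern, not necessarily the actual code in crowsonkb's gist.

import torch
from torch.nn import functional as F


def logp_completion_sketch(logits, tokens, mask):
    """Assumed reconstruction: sum of next-token log probs over masked (completion) positions."""
    # Log probabilities over the vocabulary at each position.
    logp = F.log_softmax(logits, dim=-1)
    # Log probability assigned to each actual next token (shifted by one position).
    logp_tokens = logp[:, :-1].gather(-1, tokens[:, 1:, None])[..., 0]
    # The mask is expected to be 1 on completion tokens and 0 on prompt/padding tokens.
    return torch.sum(logp_tokens * mask[:, 1:], dim=-1)
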
import asyncio
import os
import random
from collections import deque
import ssl
from urllib.parse import urlparse
import aiohttp
import polars as pl
import aiofiles
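These imports appear to belong to a separate gist whose filename header did not survive in this capture; the combination of asyncio, aiohttp, aiofiles, and polars suggests an asynchronous download script. As a hedged illustration of the core fetch-to-disk pattern only (every function and variable name below is hypothetical and not taken from the gist, and the deque/random/ssl/polars pieces are omitted):

import asyncio
import os
from urllib.parse import urlparse

import aiofiles
import aiohttp


async def fetch_to_disk(session: aiohttp.ClientSession, url: str, out_dir: str) -> str:
    """Download one URL and write the body to disk without blocking the event loop."""
    name = os.path.basename(urlparse(url).path) or "index.html"
    path = os.path.join(out_dir, name)
    async with session.get(url) as resp:
        resp.raise_for_status()
        data = await resp.read()
    async with aiofiles.open(path, "wb") as f:
        await f.write(data)
    return path


async def main(urls: list[str], out_dir: str = "downloads") -> None:
    os.makedirs(out_dir, exist_ok=True)
    async with aiohttp.ClientSession() as session:
        await asyncio.gather(*(fetch_to_disk(session, u, out_dir) for u in urls))
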
@baberabb
baberabb / grouped_results.md
Last active February 12, 2024 23:23
Mistral MMLU

hf (pretrained=mistralai/Mistral-7B-v0.1), gen_kwargs: (None), limit: None, num_fewshot: None, batch_size: 8

| Tasks | Version | Filter | n-shot | Metric | Value | Stderr |
|---|---|---|---|---|---:|---:|
| mmlu | N/A | none | 0 | acc | 0.6273 | ± 0.0294 |
| - humanities | N/A | none | None | acc | 0.6520 | ± 0.0233 |
| - formal_logic | 0 | none | None | acc | 0.3571 | ± 0.0429 |
| - high_school_european_history | 0 | none | None | acc | 0.7455 | ± 0.0340 |
| - high_school_us_history | 0 | none | None | acc | 0.7549 | ± 0.0302 |
| - high_school_world_history | 0 | none | None | acc | 0.7764 | ± 0.0271 |
| - international_law | 0 | none | None | acc | 0.7521 | ± 0.0394 |
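The header line and table above match the output format of EleutherAI's lm-evaluation-harness with the hf backend. A hedged sketch of how a comparable run can be launched from Python follows; the argument names track the harness's simple_evaluate API in recent versions and may differ from whatever command actually produced this table.

import lm_eval

results = lm_eval.simple_evaluate(
    model="hf",
    model_args="pretrained=mistralai/Mistral-7B-v0.1",
    tasks=["mmlu"],
    num_fewshot=None,   # mirrors "num_fewshot: None" in the header line
    batch_size=8,       # mirrors "batch_size: 8"
    limit=None,         # mirrors "limit: None"
)
# Aggregate MMLU accuracy, comparable to the 0.6273 row above.
# Depending on the harness version, group aggregates may instead appear under a "groups" key.
print(results["results"]["mmlu"])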