@stefan-it
from transformers import ElectraForMaskedLM, ElectraForPreTraining
from transformers.models.electra.modeling_electra import ElectraForPreTrainingOutput
from torch import Tensor
import torch.nn.functional as F
import torch.nn as nn
import torch


class ElectraModel(nn.Module):
    # Hypothetical completion: the gist preview truncates mid-signature here.
    def __init__(self,
                 generator: ElectraForMaskedLM,
                 discriminator: ElectraForPreTraining):
        super().__init__()
        self.generator = generator
        self.discriminator = discriminator
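As a hedged usage sketch only (the constructor arguments follow the hypothetical completion above, not necessarily the gist's real signature), such a wrapper could be built from the pretrained ELECTRA checkpoints on the Hub:

# Hypothetical usage, not the gist's own code.
generator = ElectraForMaskedLM.from_pretrained("google/electra-small-generator")
discriminator = ElectraForPreTraining.from_pretrained("google/electra-small-discriminator")
model = ElectraModel(generator, discriminator)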
@BramVanroy
BramVanroy / run.py
Last active July 19, 2023 09:22
Overwrite HfArgumentParser config options with CLI arguments
import sys

from transformers import HfArgumentParser, TrainingArguments

# See https://gist.github.com/BramVanroy/f78530673b1437ed0d6be7c61cdbdd7c
# ModelArguments, DataTrainingArguments, and HyperOptArguments are the gist's
# own dataclasses, defined elsewhere in the script.
parser = HfArgumentParser((ModelArguments, DataTrainingArguments, TrainingArguments, HyperOptArguments))

try:
    # Assumes that the first .json file on the command line is the config file (if any)
    config_file = next(iter(arg for arg in sys.argv if arg.endswith(".json")))
except StopIteration:
    config_file = None

run_name_specified = False
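The preview stops here; the following is a hedged sketch of my own (the real gist does more bookkeeping) of the overwrite idea the description names: values from the JSON config are loaded first, then any option explicitly passed on the command line wins.

if config_file is not None:
    # Load defaults from the JSON config first...
    model_args, data_args, training_args, hyperopt_args = parser.parse_json_file(json_file=config_file)
    # ...then let explicitly passed CLI flags override the JSON values.
    explicit = {arg.lstrip("-").split("=")[0] for arg in sys.argv[1:] if arg.startswith("--")}
    cli_parsed = parser.parse_args_into_dataclasses(args=[a for a in sys.argv[1:] if a != config_file])
    for json_ns, cli_ns in zip((model_args, data_args, training_args, hyperopt_args), cli_parsed):
        for name in explicit:
            if hasattr(cli_ns, name):
                setattr(json_ns, name, getattr(cli_ns, name))
else:
    model_args, data_args, training_args, hyperopt_args = parser.parse_args_into_dataclasses()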
@MidSpike
MidSpike / readme.md
Last active February 5, 2024 18:09
CVE-2022-23812 | RIAEvangelist/node-ipc is malware / protest-ware
@LilithWittmann
LilithWittmann / Berechtigungszertifikat für NPA.md
Last active February 23, 2022 07:49
Authorization certificate for the NPA (the new German ID card)

Subject: Authorization certificate for the NPA To: VERTRIEB@bdr.de

Text: Dear Sir or Madam, I am interested in purchasing an authorization certificate for the new German ID card (neuer Personalausweis). Could you please send me your current price list?

Many thanks & best regards,

[NAME]

; <<>> DiG 9.16.21 <<>> axfr 219.185.195.in-addr.arpa @91.189.192.6
;; global options: +cmd
; Transfer failed.
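For context, the dig line above attempts a DNS zone transfer (AXFR) of a reverse zone, which the server at 91.189.192.6 refuses. A rough Python equivalent, assuming the dnspython package is available, would be:

import dns.query
import dns.zone

# Equivalent of: dig axfr 219.185.195.in-addr.arpa @91.189.192.6
try:
    zone = dns.zone.from_xfr(dns.query.xfr("91.189.192.6", "219.185.195.in-addr.arpa"))
    for name in sorted(zone.nodes):
        print(name)
except Exception as exc:
    # Most name servers, like the one above, refuse zone transfers.
    print(f"Transfer failed: {exc}")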
@yoavg
yoavg / stochastic-critique.md
Last active November 9, 2023 04:32
A criticism of Stochastic Parrots

A criticism of "On the Dangers of Stochastic Parrots: Can Languae Models be Too Big"

Yoav Goldberg, Jan 23, 2021.

The FAccT paper "On the Dangers of Stochastic Parrots: Can Language Models Be Too Big?" by Bender, Gebru, McMillan-Major and Shmitchell has been the center of a controversy recently. The final version is now out and, owing a lot to this controversy, will undoubtedly become very widely read. I read an earlier draft of the paper, and I think that the new and updated final version is much improved in many ways: kudos to the authors for this upgrade. I also agree with and endorse most of the content. This is important stuff; you should read it.

However, I do find some aspects of the paper (and the resulting discourse around it and around the technology) to be problematic. These weren't clear to me when I initially read the first draft several months ago, but they have become very clear to me now. These points are for the most part

@stephenroller
stephenroller / mixout.py
Last active February 10, 2023 23:49
Example of mixout on generic modules.
#!/usr/bin/env python3
"""
Example of a generic Mixout implementation (Lee et al., 2019).
https://arxiv.org/abs/1909.11299

Implementation by Stephen Roller (https://stephenroller.com).

Updated 2020-02-10 to include the 1/(1 - p) correction term. Thanks to
Cheolhyoung Lee for making this correction.
"""
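The preview ends before the actual code. Purely as a hedged illustration of the operation the docstring describes (the function name and structure are mine, not Roller's), a minimal functional form of Mixout looks like this:

import torch

def mixout(weight: torch.Tensor, pretrained: torch.Tensor, p: float = 0.5) -> torch.Tensor:
    # Swap each parameter for its pretrained value with probability p,
    # then rescale so the expectation stays equal to `weight`
    # (the 1/(1 - p) correction term mentioned above).
    mask = torch.bernoulli(torch.full_like(weight, 1.0 - p))  # 1 = keep current weight
    mixed = mask * weight + (1.0 - mask) * pretrained
    return (mixed - p * pretrained) / (1.0 - p)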
@thomwolf
thomwolf / gpt-2-wikitext-103.py
Last active April 16, 2024 19:27
A very small and self-contained gist to train a GPT-2 transformer model on wikitext-103
# Copyright (c) 2019-present, Thomas Wolf.
# All rights reserved. This source code is licensed under the MIT-style license.
""" A very small and self-contained gist to train a GPT-2 transformer model on wikitext-103 """
import os
from collections import namedtuple
from tqdm import tqdm
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from ignite.engine import Engine, Events
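The gist continues beyond this preview. As a hedged, self-contained sketch of the pytorch-ignite training pattern it builds on (with a toy model and random data standing in for GPT-2 and wikitext-103), an Engine-driven loop looks like this:

import torch
import torch.nn as nn
from torch.utils.data import DataLoader, TensorDataset
from ignite.engine import Engine, Events

model = nn.Linear(16, 16)  # toy stand-in for the GPT-2 model
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
criterion = nn.MSELoss()
loader = DataLoader(TensorDataset(torch.randn(64, 16), torch.randn(64, 16)), batch_size=8)

def update(engine, batch):
    # One optimization step per batch: this is the function an Engine wraps.
    model.train()
    optimizer.zero_grad()
    x, y = batch
    loss = criterion(model(x), y)
    loss.backward()
    optimizer.step()
    return loss.item()

trainer = Engine(update)

@trainer.on(Events.EPOCH_COMPLETED)
def log_epoch(engine):
    print(f"epoch {engine.state.epoch}: last loss {engine.state.output:.4f}")

trainer.run(loader, max_epochs=2)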