@muellerzr
Created June 10, 2022 18:40

Colab memory issue
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "view-in-github",
"colab_type": "text"
},
"source": [
"<a href=\"https://colab.research.google.com/gist/muellerzr/763feb654fc0446ed4ebf1813e0cb05e/scratchpad.ipynb\" target=\"_parent\"><img src=\"https://colab.research.google.com/assets/colab-badge.svg\" alt=\"Open In Colab\"/></a>"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {
"id": "lIYdn1woOS1n"
},
"outputs": [],
"source": [
"# !pip install transformers >> /dev/null"
]
},
{
"cell_type": "code",
"source": [
"from transformers import AutoTokenizer, AutoModelForSequenceClassification, AutoConfig\n",
"import gc, os, psutil, math"
],
"metadata": {
"id": "8nVOzPoJwhgC"
},
"execution_count": 1,
"outputs": []
},
{
"cell_type": "code",
"source": [
"def convert_size(size_bytes):\n",
" if size_bytes == 0:\n",
" return \"0B\"\n",
" size_name = (\"B\", \"KB\", \"MB\", \"GB\", \"TB\", \"PB\", \"EB\", \"ZB\", \"YB\")\n",
" i = int(math.floor(math.log(size_bytes, 1024)))\n",
" p = math.pow(1024, i)\n",
" s = round(size_bytes / p, 2)\n",
" return \"%s %s\" % (s, size_name[i])"
],
"metadata": {
"id": "Pspz1It0xgEw"
},
"execution_count": 2,
"outputs": []
},
{
"cell_type": "code",
"source": [
"def get_used_memory():\n",
" process = psutil.Process(os.getpid())\n",
" return convert_size(process.memory_info().rss)"
],
"metadata": {
"id": "aBtFUM4oyoFI"
},
"execution_count": 3,
"outputs": []
},
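{
"cell_type": "markdown",
"source": [
"A quick sanity check of the two helpers (illustrative values):"
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# convert_size renders raw byte counts in human-readable units,\n",
"# e.g. convert_size(1536) returns '1.5 KB'.\n",
"print(convert_size(1536))\n",
"# get_used_memory reports this process's current resident set size (RSS).\n",
"print(get_used_memory())"
],
"metadata": {},
"execution_count": null,
"outputs": []
},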
{
"cell_type": "markdown",
"source": [
"First, without pretrained model:"
],
"metadata": {
"id": "_uFEXrlJyqQt"
}
},
{
"cell_type": "code",
"source": [
"from transformers import AutoConfig\n",
"print(f'Memory pre config: {get_used_memory()}')\n",
"config = AutoConfig.from_pretrained(\"bert-base-cased\")\n",
"print(f'Memory post config: {get_used_memory()}')\n",
"model = AutoModelForSequenceClassification.from_config(config)\n",
"print(f'Memory post model load from config: {get_used_memory()}')"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "vGhqHW8Dyssw",
"outputId": "01b066ef-dac4-46b4-fe33-4b2399e23d47"
},
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Memory pre config: 757.55 MB\n",
"Memory post config: 759.86 MB\n",
"Memory post model load from config: 1.15 GB\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"We should expect that the RAM used post loading weights is ~1.15gb, a bit more at first but back to that post gc"
],
"metadata": {
"id": "bXiEpNJy6pDE"
}
},
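{
"cell_type": "markdown",
"source": [
"A rough way to check that expectation (a sketch using the `model` built above): sum the raw storage of the parameter tensors. bert-base-cased holds roughly 108M float32 parameters."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"# Sum the storage of every parameter tensor in the model built above.\n",
"# ~108M float32 params should print a bit over 400 MB, consistent with\n",
"# the ~760 MB baseline growing to ~1.15 GB after the load.\n",
"param_bytes = sum(p.numel() * p.element_size() for p in model.parameters())\n",
"print(convert_size(param_bytes))"
],
"metadata": {},
"execution_count": null,
"outputs": []
},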
{
"cell_type": "markdown",
"source": [
"Now with:"
],
"metadata": {
"id": "oit5g-5VzBLT"
}
},
{
"cell_type": "code",
"source": [
"from transformers import AutoConfig\n",
"print(f'Memory pre from pretrained: {get_used_memory()}')\n",
"model = AutoModelForSequenceClassification.from_pretrained(\"bert-base-cased\")\n",
"print(f'Memory post model load with weights: {get_used_memory()}')\n",
"gc.collect()\n",
"print(f'Memory post garbage collection: {get_used_memory()}')"
],
"metadata": {
"colab": {
"base_uri": "https://localhost:8080/"
},
"id": "zd7l2MbZzCZa",
"outputId": "54907577-6fa2-48dc-8a8e-3d8671595811"
},
"execution_count": 4,
"outputs": [
{
"output_type": "stream",
"name": "stdout",
"text": [
"Memory pre from pretrained: 760.72 MB\n"
]
},
{
"output_type": "stream",
"name": "stderr",
"text": [
"Some weights of the model checkpoint at bert-base-cased were not used when initializing BertForSequenceClassification: ['cls.predictions.transform.dense.weight', 'cls.predictions.transform.LayerNorm.bias', 'cls.predictions.transform.dense.bias', 'cls.seq_relationship.bias', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.decoder.weight', 'cls.seq_relationship.weight']\n",
"- This IS expected if you are initializing BertForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).\n",
"- This IS NOT expected if you are initializing BertForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).\n",
"Some weights of BertForSequenceClassification were not initialized from the model checkpoint at bert-base-cased and are newly initialized: ['classifier.weight', 'classifier.bias']\n",
"You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.\n"
]
},
{
"output_type": "stream",
"name": "stdout",
"text": [
"Memory post model load with weights: 1.56 GB\n",
"Memory post garbage collection: 1.56 GB\n"
]
}
]
},
{
"cell_type": "markdown",
"source": [
"This should be ~1.15gb, and recreating this locally shows that the memory reduction is happening."
],
"metadata": {
"id": "8njZPBdp6P8T"
}
}
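,
{
"cell_type": "markdown",
"source": [
"One way to probe whether the extra memory is reclaimable (a sketch, assuming a glibc-based Linux environment such as Colab): drop the reference, garbage collect, then ask the allocator to hand freed pages back to the OS with malloc_trim."
],
"metadata": {}
},
{
"cell_type": "code",
"source": [
"import ctypes\n",
"\n",
"print(f'Before cleanup: {get_used_memory()}')\n",
"del model  # drop the only remaining reference to the model\n",
"gc.collect()  # reclaim the objects on the Python side\n",
"# On glibc, freed heap memory is not always returned to the OS, so RSS\n",
"# can stay flat; malloc_trim(0) asks the allocator to release what it can.\n",
"ctypes.CDLL(\"libc.so.6\").malloc_trim(0)\n",
"print(f'After cleanup: {get_used_memory()}')"
],
"metadata": {},
"execution_count": null,
"outputs": []
}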
],
"metadata": {
"colab": {
"name": "scratchpad",
"provenance": [],
"include_colab_link": true
},
"kernelspec": {
"display_name": "Python 3",
"name": "python3"
}
},
"nbformat": 4,
"nbformat_minor": 0
}