Created September 23, 2021 08:26
{
"nbformat": 4,
"nbformat_minor": 0,
"metadata": {
"colab": {
"name": "tflite_slow_inference_2.4.0_conversion_bug.ipynb",
"provenance": [],
"collapsed_sections": []
},
"kernelspec": {
"name": "python3",
"display_name": "Python 3"
},
"language_info": {
"name": "python"
}
},
"cells": [
{
"cell_type": "markdown",
"metadata": {
"id": "Jzl22GNGbDt7"
},
"source": [
"# Slow inference on TF Lite>=2.4.0 converted model"
]
},
{
"cell_type": "markdown",
"metadata": {
"id": "L4RoUpQkjuk5"
},
"source": [
"Conversion of `Conv1D` layer seems to have changed in TF Lite `2.4.0`. Minimal example below demonstrates that inference on model converted by `tensoflow>=2.4.0` 27x slower compared to one converted by `tensoflow==2.4.0`.\n", | |
"\n", | |
"Steps to reproduce:\n", | |
"0. If you have models in the github issue, upload them to root and jump to step 4.\n", | |
"1. Run `Prepare` section and install tensorflow `2.3.0`. Restart kernel.\n", | |
"2. Run `Create Model` and `Conversion` section.\n", | |
"3. Repeat step 1 and 2 for tensorflow `2.6.0`\n", | |
"4. Run `Inference Test`\n", | |
"\n", | |
"### Previous Results:\n", | |
"\n", | |
"```\n", | |
"TF Runtime Version: 2.6.0\n", | |
"Model path: model_2.3.0.tflite\n", | |
"Test Duration: 0.5242369174957275\n", | |
"= = = = = = = = = = = = = = = = = = = = \n", | |
"TF Runtime Version: 2.6.0\n", | |
"Model path: model_2.3.0_quant.tflite\n", | |
"Test Duration: 0.7532312870025635\n", | |
"= = = = = = = = = = = = = = = = = = = = \n", | |
"TF Runtime Version: 2.6.0\n", | |
"Model path: model_2.6.0.tflite\n", | |
"Test Duration: 0.532163143157959\n", | |
"= = = = = = = = = = = = = = = = = = = = \n", | |
"TF Runtime Version: 2.6.0\n", | |
"Model path: model_2.6.0_quant.tflite\n", | |
"Test Duration: 18.914307594299316\n", | |
"= = = = = = = = = = = = = = = = = = = = \n", | |
"= = = = = = = = = = = = = = = = = = = = \n", | |
"TF Runtime Version: 2.3.0\n", | |
"Model path: model_2.3.0.tflite\n", | |
"Test Duration: 0.5360305309295654\n", | |
"= = = = = = = = = = = = = = = = = = = = \n", | |
"TF Runtime Version: 2.3.0\n", | |
"Model path: model_2.3.0_quant.tflite\n", | |
"Test Duration: 0.5518338680267334\n", | |
"= = = = = = = = = = = = = = = = = = = = \n", | |
"TF Runtime Version: 2.3.0\n", | |
"Model path: model_2.6.0.tflite\n", | |
"Test Duration: 0.5399584770202637\n", | |
"= = = = = = = = = = = = = = = = = = = = \n", | |
"TF Runtime Version: 2.3.0\n", | |
"Model path: model_2.6.0_quant.tflite\n", | |
"Cannot create Interpreter! Exception:\n", | |
"Didn't find op for builtin opcode 'CONV_2D' version '5'\n", | |
"Registration failed.\n", | |
"```" | |
] | |
}, | |
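{
"cell_type": "markdown",
"metadata": {},
"source": [
"A quick sanity check of the slowdown, computed directly from the quantized-model durations recorded above on the TF `2.6.0` runtime:\n",
"\n",
"```python\n",
"# Durations copied from the 'Previous Results' log above.\n",
"dur_quant_2_3 = 0.7532312870025635   # model_2.3.0_quant.tflite\n",
"dur_quant_2_6 = 18.914307594299316   # model_2.6.0_quant.tflite\n",
"print(f\"slowdown: {dur_quant_2_6 / dur_quant_2_3:.1f}x\")  # slowdown: 25.1x\n",
"```"
]
},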
{
"cell_type": "markdown",
"metadata": {
"id": "PwQ5iEETanLq"
},
"source": [
"## Prepare"
]
},
{
"cell_type": "code",
"metadata": {
"id": "Dh5dvbl6arr5"
},
"source": [
"# Using TF Lite 2.3.0 for conversion creates much faster model\n", | |
"use_tf_2_3 = False \n", | |
"\n", | |
"if use_tf_2_3:\n", | |
" !pip install tensorflow==2.3.0\n", | |
"else:\n", | |
" # 2.5.0 and 2.4.0 performs same\n", | |
" !pip install tensorflow==2.6.0" | |
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "l9vk-50PZkSj"
},
"source": [
"import tensorflow as tf\n",
"print(tf.__version__)"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "code",
"metadata": {
"id": "XVyhL41jaIFs"
},
"source": [
"def generate_model(num_hidden_units: int = 500, num_layers: int = 1) -> tf.keras.models.Sequential:\n",
"    model = tf.keras.models.Sequential()\n",
"    for _ in range(num_layers):\n",
"        model.add(tf.keras.layers.Conv1D(filters=num_hidden_units, kernel_size=3, strides=1, padding='SAME', activation='relu'))\n",
"    optimizer = tf.keras.optimizers.Adam()\n",
"    model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])\n",
"    return model"
],
"execution_count": null,
"outputs": []
},
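{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a standalone sketch (not part of the original gist, but using the same layer settings as `generate_model` above): `Conv1D` with `padding='SAME'` and `strides=1` preserves the time dimension, so an input of shape `(1, 50, 80)` maps to `(1, 50, 500)`:\n",
"\n",
"```python\n",
"import tensorflow as tf\n",
"\n",
"# Single Conv1D layer with the default generate_model() settings.\n",
"model = tf.keras.models.Sequential([\n",
"    tf.keras.layers.Conv1D(filters=500, kernel_size=3, strides=1,\n",
"                           padding='SAME', activation='relu'),\n",
"])\n",
"model.build(input_shape=(None, None, 80))\n",
"out = model(tf.random.normal((1, 50, 80)))\n",
"print(tuple(out.shape))  # (1, 50, 500)\n",
"```"
]
},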
{
"cell_type": "markdown",
"metadata": {
"id": "7WdyoJQ6aj3Q"
},
"source": [
"## Create model"
]
},
{
"cell_type": "code",
"metadata": {
"id": "dh-uhJw6aKIU"
},
"source": [
"model = generate_model()\n",
"model.build(input_shape=(None, None, 80))"
],
"execution_count": null,
"outputs": []
},
{
"cell_type": "markdown",
"metadata": {
"id": "pCZubKlIafJb"
},
"source": [
"## Conversion"
]
},
{
"cell_type": "code",
"metadata": {
"id": "9sIGNMdraXW6"
},
"source": [
"for quantize in (True, False):\n",
"    converter = tf.lite.TFLiteConverter.from_keras_model(model)\n",
"\n",
"    if quantize:\n",
"        converter.optimizations = [tf.lite.Optimize.DEFAULT]\n",
"\n",
"    model_path = f\"model_{tf.__version__}{'_quant' if quantize else ''}.tflite\"\n",
"    with open(model_path, \"wb\") as f:\n",
"        f.write(converter.convert())"
],
"execution_count": null,
"outputs": []
},
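{
"cell_type": "markdown",
"metadata": {},
"source": [
"An optional, lightweight sanity check on the converted files (not in the original gist; no interpreter needed). It assumes the standard TFLite flatbuffer layout, whose 4-byte file identifier `TFL3` sits at byte offset 4:\n",
"\n",
"```python\n",
"import glob\n",
"\n",
"def tflite_identifier(path: str) -> str:\n",
"    # Read the 4-byte flatbuffer file identifier at offset 4.\n",
"    with open(path, \"rb\") as f:\n",
"        return f.read(8)[4:8].decode(\"ascii\", \"replace\")\n",
"\n",
"for path in sorted(glob.glob(\"model_*.tflite\")):\n",
"    print(path, \"identifier:\", tflite_identifier(path))\n",
"```"
]
},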
{
"cell_type": "markdown",
"metadata": {
"id": "LdqKlzo0acqt"
},
"source": [
"## Inference test"
]
},
{
"cell_type": "code",
"metadata": {
"id": "wjkVCwgYoz4u"
},
"source": [
"import time, glob\n",
"data = tf.random.normal((1, 50, 80))\n",
"\n",
"for inference_model_path in sorted(glob.glob(\"model_*.tflite\")):\n",
"    print(f\"TF Runtime Version: {tf.__version__}\")\n",
"    print(f\"Model path: {inference_model_path}\")\n",
"    try:\n",
"        interpreter = tf.lite.Interpreter(inference_model_path)\n",
"    except Exception as e:\n",
"        print(\"Cannot create Interpreter! Exception:\")\n",
"        print(e)\n",
"        continue\n",
"    interpreter.resize_tensor_input(0, [1, 50, 80])\n",
"    interpreter.allocate_tensors()\n",
"\n",
"    start = time.time()\n",
"    for _ in range(1000):\n",
"        interpreter.set_tensor(0, data)\n",
"        interpreter.invoke()\n",
"    print(\"Test Duration:\", time.time() - start)\n",
"    print(\"= \" * 20)"
],
"execution_count": null,
"outputs": []
},
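{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `Test Duration` values above each cover 1000 invocations; a tiny helper (not part of the original gist) converts them to per-invoke milliseconds:\n",
"\n",
"```python\n",
"def per_invoke_ms(total_seconds: float, num_invocations: int = 1000) -> float:\n",
"    # The loop above calls invoke() 1000 times; report ms per call.\n",
"    return total_seconds / num_invocations * 1000.0\n",
"\n",
"print(per_invoke_ms(18.914307594299316))  # ~18.9 ms per invoke\n",
"print(per_invoke_ms(0.7532312870025635))  # ~0.75 ms per invoke\n",
"```"
]
},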
{
"cell_type": "code",
"metadata": {
"id": "RomoJSTYfem2"
},
"source": [
""
],
"execution_count": null,
"outputs": []
}
]
}