{
"cells": [
{
"metadata": {},
"cell_type": "markdown",
"source": "# Problem Statement\n\nThe `fast.ai` library has a callback that tracks the history of training metrics. However, that history is only reported to the console or to a Jupyter widget, and there is no callback that stores the results in CSV format. This notebook proposes a callback, similar to [CSVLogger from the Keras library](https://github.com/keras-team/keras/blob/master/keras/callbacks.py#L1135), that saves the tracked metrics to a persistent file."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "%reload_ext autoreload",
"execution_count": 1,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "%autoreload 2",
"execution_count": 2,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "from fastai import *\nfrom fastai.torch_core import *\nfrom fastai.vision import *\nfrom fastai.metrics import *\nfrom torchvision.models import resnet18",
"execution_count": 3,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "@dataclass\nclass CSVLogger(LearnerCallback):\n    \"A `LearnerCallback` that saves history of training metrics into CSV file.\"\n    filename: str = 'history'\n\n    def __post_init__(self):\n        # The file is only opened once training starts.\n        self.path = self.learn.path/f'{self.filename}.csv'\n        self.file = None\n\n    @property\n    def header(self):\n        # Column names come from the learner's recorder.\n        return self.learn.recorder.names\n\n    def read_logged_file(self):\n        return pd.read_csv(self.path)\n\n    def on_train_begin(self, metrics_names: StrList, **kwargs: Any) -> None:\n        # Create the target folder if needed and write the header row.\n        self.path.parent.mkdir(parents=True, exist_ok=True)\n        self.file = self.path.open('w')\n        self.file.write(','.join(self.header) + '\\n')\n\n    def on_epoch_end(self, epoch: int, smooth_loss: Tensor, last_metrics: MetricsList, **kwargs: Any) -> bool:\n        self.write_stats([epoch, smooth_loss] + last_metrics)\n\n    def on_train_end(self, **kwargs: Any) -> None:\n        self.file.flush()\n        self.file.close()\n\n    def write_stats(self, stats: TensorOrNumList) -> None:\n        # Integers (the epoch number) are written as-is, everything else with six decimals.\n        stats = [str(stat) if isinstance(stat, int) else f'{stat:.6f}'\n                 for _, stat in zip(self.header, stats)]\n        str_stats = ','.join(stats)\n        self.file.write(str_stats + '\\n')",
"execution_count": 4,
"outputs": []
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Example"
},
{
"metadata": {},
"cell_type": "markdown",
"source": "Let's train an MNIST classifier and track its metrics. Every metric listed in the `metrics` array, along with the epoch number and the training and validation losses, should be saved to the file. We can then read the file back and process it as needed."
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "path = untar_data(URLs.MNIST_TINY)",
"execution_count": 5,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "data = ImageDataBunch.from_folder(path)",
"execution_count": 6,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "learn = Learner(data, simple_cnn((3, 10, 10)), metrics=[accuracy, error_rate])",
"execution_count": 7,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "cb = CSVLogger(learn)",
"execution_count": 8,
"outputs": []
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "learn.fit(3, callbacks=[cb])",
"execution_count": 9,
"outputs": [
{
"output_type": "stream",
"text": "Total time: 00:02\nepoch train_loss valid_loss accuracy error_rate\n1 2.249624 2.172361 0.505007 0.494993 (00:00)\n2 2.118121 1.730644 0.505007 0.494993 (00:00)\n3 1.858596 1.108214 0.505007 0.494993 (00:00)\n\n",
"name": "stdout"
}
]
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "log_df = cb.read_logged_file()\nlog_df",
"execution_count": 10,
"outputs": [
{
"output_type": "execute_result",
"execution_count": 10,
"data": {
"text/plain": " epoch train_loss valid_loss accuracy error_rate\n0 1 2.249624 2.172361 0.505007 0.494993\n1 2 2.118121 1.730644 0.505007 0.494993\n2 3 1.858596 1.108214 0.505007 0.494993",
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>epoch</th>\n <th>train_loss</th>\n <th>valid_loss</th>\n <th>accuracy</th>\n <th>error_rate</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>1</td>\n <td>2.249624</td>\n <td>2.172361</td>\n <td>0.505007</td>\n <td>0.494993</td>\n </tr>\n <tr>\n <th>1</th>\n <td>2</td>\n <td>2.118121</td>\n <td>1.730644</td>\n <td>0.505007</td>\n <td>0.494993</td>\n </tr>\n <tr>\n <th>2</th>\n <td>3</td>\n <td>1.858596</td>\n <td>1.108214</td>\n <td>0.505007</td>\n <td>0.494993</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {}
}
]
},
{
"metadata": {},
"cell_type": "markdown",
"source": "## Tests\n\nThe tests live in the [test_logger.py](./test_logger.py) file and can be run with the command:\n```bash\n$ python -m pytest test_logger.py\n```\n\nTo keep all of the PR's code in a single place, here is the content of that file:"
},
{
"metadata": {
"trusted": true
},
"cell_type": "code",
"source": "from io import StringIO\nfrom contextlib import redirect_stdout\n\nimport pytest\nfrom fastai import *\nfrom fastai.vision import *\nfrom fastai.metrics import *\nfrom fastprogress import fastprogress\n\nfrom logger import CSVLogger\n\n\ndef test_callback_has_required_properties_after_init(classifier):\n    cb = CSVLogger(classifier)\n\n    assert cb.filename\n    assert not cb.path.exists()\n    assert cb.learn is classifier\n    assert cb.file is None\n\n\ndef test_callback_writes_learn_metrics_during_training(classifier_and_logger):\n    n_epochs = 3\n    classifier, cb = classifier_and_logger\n\n    classifier.fit(n_epochs, callbacks=[cb])\n\n    log_df = cb.read_logged_file()\n    assert cb.path.exists()\n    assert cb.file.closed\n    assert not log_df.empty\n    assert len(log_df) == n_epochs\n    assert classifier.recorder.names == log_df.columns.tolist()\n\n\n# We can drop this test if you think it doesn't make too much sense testing equality of\n# stdout progress output with CSV content.\ndef test_callback_written_metrics_are_equal_to_reported_via_stdout(classifier_and_logger, no_bar):\n    n_epochs = 3\n    classifier, cb = classifier_and_logger\n\n    buffer = StringIO()\n    with redirect_stdout(buffer):\n        classifier.fit(n_epochs, callbacks=[cb])\n\n    csv_df = cb.read_logged_file()\n    stdout_df = convert_into_dataframe(buffer)\n    pd.testing.assert_frame_equal(csv_df, stdout_df)\n\n\ndef test_callback_written_metrics_are_equal_to_values_stored_in_reporter(classifier_and_logger):\n    n_epochs = 3\n    classifier, cb = classifier_and_logger\n\n    classifier.fit(n_epochs, callbacks=[cb])\n\n    csv_df = cb.read_logged_file()\n    recorder_df = create_metrics_dataframe(classifier)\n    pd.testing.assert_frame_equal(csv_df, recorder_df)\n\n\n@pytest.fixture\ndef classifier(tmpdir):\n    path = untar_data(URLs.MNIST_TINY)\n    bunch = ImageDataBunch.from_folder(path)\n    model_path = str(tmpdir.join('classifier'))\n    learn = Learner(bunch, simple_cnn((3, 10, 10)), path=model_path)\n    return learn\n\n\n@pytest.fixture\ndef classifier_and_logger(classifier):\n    classifier.metrics = [accuracy, error_rate]\n    cb = CSVLogger(classifier)\n    return classifier, cb\n\n\n@pytest.fixture\ndef no_bar():\n    # Disable the progress bar for the duration of a test, then restore it.\n    fastprogress.NO_BAR = True\n    yield\n    fastprogress.NO_BAR = False\n\n\ndef convert_into_dataframe(buffer):\n    \"Converts data captured from `fastprogress.ConsoleProgressBar` into dataframe.\"\n    lines = buffer.getvalue().split('\\n')\n    header, *lines = [l.strip() for l in lines if l]\n    header = header.split()\n    floats = [[float(x) for x in line.split()] for line in lines]\n    records = [dict(zip(header, metrics_list)) for metrics_list in floats]\n    df = pd.DataFrame(records, columns=header)\n    df['epoch'] = df['epoch'].astype(int)\n    return df\n\n\ndef create_metrics_dataframe(learn):\n    \"Converts metrics stored in `Recorder` into dataframe.\"\n    records = [\n        [i, loss, val_loss, *epoch_metrics]\n        for i, (loss, val_loss, epoch_metrics)\n        in enumerate(zip(\n            get_train_losses(learn),\n            learn.recorder.val_losses,\n            learn.recorder.metrics), 1)]\n    return pd.DataFrame(records, columns=learn.recorder.names)\n\n\ndef get_train_losses(learn):\n    \"Returns list of training losses at the end of each training epoch.\"\n    np_losses = [to_np(l).item() for l in learn.recorder.losses]\n    # `len` of a data loader is the number of batches in one epoch.\n    n_batches = len(learn.data.train_dl)\n    return [epoch_losses[-1] for epoch_losses in partition(np_losses, n_batches)]",
"execution_count": 11,
"outputs": []
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3",
"language": "python"
},
"language_info": {
"name": "python",
"version": "3.7.0",
"mimetype": "text/x-python",
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"pygments_lexer": "ipython3",
"nbconvert_exporter": "python",
"file_extension": ".py"
}
},
"nbformat": 4,
"nbformat_minor": 2
}