1 Physical GPUs, 1 Logical GPU
> Preprocessing Spotify dataset. This process may take several hours.
>> Checking if the Spotify MPD is already preprocessed.
>> Preprocessed version was found. Skipping stage.
> Loading Spotify MPD dataset.
>> Dataset loaded.
> Splitting dataset
>> Sampling 10% of dataset
>> Splitting sample with 20% for testing
>> Creating Vocab.
> Building tensorflow datasets.
>> Building training and validation dataset. This process may take some time.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 72415/72415 [00:09<00:00, 7815.94it/s]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 72415/72415 [00:09<00:00, 7921.91it/s]
>> Building testing dataset. This process may take a long time.
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18104/18104 [00:02<00:00, 8216.06it/s]
Creating and compiling model
> Retrieving candidates. This process takes more time as the number of unique items increases.
Fitting model
Epoch 1/10
WARNING:tensorflow:The dtype of the source tensor must be floating (e.g. tf.float32) when calling GradientTape.gradient, got tf.int32
WARNING:tensorflow:Gradients do not exist for variables ['counter:0'] when minimizing the loss.
WARNING:tensorflow:The dtype of the source tensor must be floating (e.g. tf.float32) when calling GradientTape.gradient, got tf.int32
WARNING:tensorflow:Gradients do not exist for variables ['counter:0'] when minimizing the loss.
1/2263 [..............................] - ETA: 0s - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.9073 - regularization_loss: 0.0000e+00 - total_loss: 110.9073WARNING:tensorflow:From C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\ops\summary_ops_v2.py:1277: stop (from tensorflow.python.eager.profiler) is deprecated and will be removed after 2020-07-01.
Instructions for updating:
use `tf.profiler.experimental.stop` instead.
2/2263 [..............................] - ETA: 2:45:22 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.8951 - regularization_loss: 0.0000e+00 - total_loss: 110.8951WARNING:tensorflow:Callbacks method `on_train_batch_end` is slow compared to the batch time (batch time: 1.7244s vs `on_train_batch_end` time: 7.0526s). Check your callbacks.
3/2263 [..............................] - ETA: 2:08:43 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.8939 - regularization_loss: 0.0000e+00 - total_loss: 110.893
4/2263 [..............................] - ETA: 1:46:16 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.8943 - regularization_loss: 0.0000e+00 - total_loss: 110.894
5/2263 [..............................] - ETA: 1:32:50 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.9021 - regularization_loss: 0.0000e+00 - total_loss: 110.902
6/2263 [..............................] - ETA: 1:23:52 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.9047 - regularization_loss: 0.0000e+00 - total_loss: 110.904
7/2263 [..............................] - ETA: 1:17:32 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.9015 - regularization_loss: 0.0000e+00 - total_loss: 110.901
8/2263 [..............................] - ETA: 1:12:43 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.9008 - regularization_loss: 0.0000e+00 - total_loss: 110.900
9/2263 [..............................] - ETA: 1:09:00 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.9005 - regularization_loss: 0.0000e+00 - total_loss: 110.900
10/2263 [..............................] - ETA: 1:05:59 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.8975 - regularization_loss: 0.0000e+00 - total_loss: 110.897
11/2263 [..............................] - ETA: 1:03:32 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.8981 - regularization_loss: 0.0000e+00 - total_loss: 110.898
12/2263 [..............................] - ETA: 1:01:30 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.8984 - regularization_loss: 0.0000e+00 - total_loss: 110.898
13/2263 [..............................] - ETA: 59:45 - top_k_categorical_accuracy: 0.0000e+00 - loss: 110.8984 - regularization_loss: 0.0000e+00 - total_loss: 110.8984
2263/2263 [==============================] - 3782s 2s/step - top_k_categorical_accuracy: 4.0047e-04 - loss: 108.2779 - regularization_loss: 0.0000e+00 - total_loss: 108.2779 - val_top_k_categorical_accuracy: 2.4857e-04 - val_loss: 94.5910 - val_regularization_loss: 0.0000e+00 - val_total_loss: 94.5910
Epoch 2/10
2263/2263 [==============================] - 3780s 2s/step - top_k_categorical_accuracy: 0.0015 - loss: 92.2767 - regularization_loss: 0.0000e+00 - total_loss: 92.2767 - val_top_k_categorical_accuracy: 9.1141e-04 - val_loss: 89.1521 - val_regularization_loss: 0.0000e+00 - val_total_loss: 89.1521
Epoch 3/10
2263/2263 [==============================] - 3791s 2s/step - top_k_categorical_accuracy: 0.0096 - loss: 59.0481 - regularization_loss: 0.0000e+00 - total_loss: 59.0481 - val_top_k_categorical_accuracy: 0.0014 - val_loss: 93.6735 - val_regularization_loss: 0.0000e+00 - val_total_loss: 93.6735
Epoch 4/10
2263/2263 [==============================] - 12196s 5s/step - top_k_categorical_accuracy: 0.0445 - loss: 29.7369 - regularization_loss: 0.0000e+00 - total_loss: 29.7369 - val_top_k_categorical_accuracy: 0.0020 - val_loss: 103.1409 - val_regularization_loss: 0.0000e+00 - val_total_loss: 103.1409
Epoch 5/10
2263/2263 [==============================] - 3757s 2s/step - top_k_categorical_accuracy: 0.1330 - loss: 13.9861 - regularization_loss: 0.0000e+00 - total_loss: 13.9861 - val_top_k_categorical_accuracy: 0.0024 - val_loss: 106.7716 - val_regularization_loss: 0.0000e+00 - val_total_loss: 106.7716
Epoch 6/10
2263/2263 [==============================] - 3762s 2s/step - top_k_categorical_accuracy: 0.2286 - loss: 7.2852 - regularization_loss: 0.0000e+00 - total_loss: 7.2852 - val_top_k_categorical_accuracy: 0.0027 - val_loss: 117.1357 - val_regularization_loss: 0.0000e+00 - val_total_loss: 117.1357
Epoch 7/10
2263/2263 [==============================] - 3758s 2s/step - top_k_categorical_accuracy: 0.2817 - loss: 4.2511 - regularization_loss: 0.0000e+00 - total_loss: 4.2511 - val_top_k_categorical_accuracy: 0.0030 - val_loss: 119.9953 - val_regularization_loss: 0.0000e+00 - val_total_loss: 119.9953
Epoch 8/10
2263/2263 [==============================] - 6142s 3s/step - top_k_categorical_accuracy: 0.3183 - loss: 2.7132 - regularization_loss: 0.0000e+00 - total_loss: 2.7132 - val_top_k_categorical_accuracy: 0.0028 - val_loss: 123.1363 - val_regularization_loss: 0.0000e+00 - val_total_loss: 123.1363
Epoch 9/10
2263/2263 [==============================] - 3781s 2s/step - top_k_categorical_accuracy: 0.3406 - loss: 1.8537 - regularization_loss: 0.0000e+00 - total_loss: 1.8537 - val_top_k_categorical_accuracy: 0.0029 - val_loss: 127.1309 - val_regularization_loss: 0.0000e+00 - val_total_loss: 127.1309
Epoch 10/10
2263/2263 [==============================] - 3799s 2s/step - top_k_categorical_accuracy: 0.3582 - loss: 1.3512 - regularization_loss: 0.0000e+00 - total_loss: 1.3512 - val_top_k_categorical_accuracy: 0.0031 - val_loss: 131.3763 - val_regularization_loss: 0.0000e+00 - val_total_loss: 131.3763
Starting Evaluation
566/566 [==============================] - 357s 631ms/step - top_k_categorical_accuracy: 0.0018 - loss: 131.8116 - regularization_loss: 0.0000e+00 - total_loss: 131.8116
2020-11-03 21:40:28.665119: W tensorflow/core/common_runtime/bfc_allocator.cc:431] Allocator (GPU_0_bfc) ran out of memory trying to allocate 8.09GiB (rounded to 8689920000)requested by op CudnnRNN
Current allocation summary follows.
2020-11-03 21:40:28.671396: W tensorflow/core/common_runtime/bfc_allocator.cc:439] *_*********************************************************************_____________________________
2020-11-03 21:40:28.674661: E tensorflow/stream_executor/dnn.cc:616] OOM when allocating tensor with shape[2172480000] and type float on /job:localhost/replica:0/task:0/device:GPU:0 by allocator GPU_0_bfc
2020-11-03 21:40:28.677792: W tensorflow/core/framework/op_kernel.cc:1767] OP_REQUIRES failed at cudnn_rnn_ops.cc:1517 : Internal: Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 3, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 100, 100, 1, 200, 18104, 0]
Traceback (most recent call last):
  File "run.py", line 86, in <module>
    pred = model.predict_batch(x_test, n=100)
  File "../src\models\gru4rec.py", line 60, in predict_batch
    _, recommended_items = index(user_history, k=n)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 985, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow_recommenders\layers\factorized_top_k.py", line 371, in call
    queries = self.query_model(queries)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 985, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\keras\engine\sequential.py", line 372, in call
    return super(Sequential, self).call(inputs, training=training, mask=mask)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 386, in call
    inputs, training=training, mask=mask)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\keras\engine\functional.py", line 508, in _run_internal_graph
    outputs = node.layer(*args, **kwargs)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\keras\layers\recurrent.py", line 663, in __call__
    return super(RNN, self).__call__(inputs, **kwargs)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\keras\engine\base_layer.py", line 985, in __call__
    outputs = call_fn(inputs, *args, **kwargs)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\keras\layers\recurrent_v2.py", line 441, in call
    inputs, initial_state, training, mask, row_lengths)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\keras\layers\recurrent_v2.py", line 496, in _defun_gru_call
    last_output, outputs, new_h, runtime = gpu_gru(**gpu_gru_kwargs)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\keras\layers\recurrent_v2.py", line 656, in gpu_gru
    rnn_mode='gru')
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\ops\gen_cudnn_rnn_ops.py", line 103, in cudnn_rnn
    ctx=_ctx)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\ops\gen_cudnn_rnn_ops.py", line 180, in cudnn_rnn_eager_fallback
    attrs=_attrs, ctx=ctx, name=name)
  File "C:\Users\lmd-pc-03\anaconda3\lib\site-packages\tensorflow\python\eager\execute.py", line 60, in quick_execute
    inputs, attrs, num_outputs)
tensorflow.python.framework.errors_impl.InternalError: Failed to call ThenRnnForward with model config: [rnn_mode, rnn_input_mode, rnn_direction_mode]: 3, 0, 0 , [num_layers, input_size, num_units, dir_count, max_seq_length, batch_size, cell_num_units]: [1, 100, 100, 1, 200, 18104, 0] [Op:CudnnRNN]
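
The cuDNN config above reports batch_size 18104, which matches the size of the test split, so the whole test set appears to have gone through the GRU in a single call. The sketch below first checks the allocator's numbers using only figures from the log, then shows one possible workaround: running prediction in smaller chunks. `predict_in_chunks` and its `chunk_size` parameter are hypothetical names introduced here for illustration; they are not part of `run.py`, `gru4rec.py`, TensorFlow, or TensorFlow Recommenders.

```python
from typing import Callable, List, Sequence

# Figures copied from the allocator and cuDNN config lines in the log.
BATCH_SIZE = 18104            # batch_size in the cuDNN config
MAX_SEQ_LENGTH = 200          # max_seq_length in the cuDNN config
REQUESTED_FLOATS = 2172480000 # shape of the tensor that failed to allocate
BYTES_PER_FLOAT32 = 4


def requested_gib(n_floats: int) -> float:
    """Size in GiB of a float32 tensor holding n_floats elements."""
    return n_floats * BYTES_PER_FLOAT32 / 2**30


def floats_per_step(n_floats: int, batch_size: int, seq_length: int) -> float:
    """Floats requested per sequence per timestep."""
    return n_floats / (batch_size * seq_length)


def predict_in_chunks(
    predict_fn: Callable[[Sequence], Sequence],
    inputs: Sequence,
    chunk_size: int,
) -> List:
    """Apply predict_fn to consecutive slices of inputs and join the results.

    Hypothetical helper: assumes inputs supports len() and slicing, and that
    predict_fn returns one result per input row in the same order.
    """
    if chunk_size < 1:
        raise ValueError("chunk_size must be at least 1")
    results: List = []
    for start in range(0, len(inputs), chunk_size):
        results.extend(predict_fn(inputs[start:start + chunk_size]))
    return results


if __name__ == "__main__":
    print(round(requested_gib(REQUESTED_FLOATS), 2))
    print(floats_per_step(REQUESTED_FLOATS, BATCH_SIZE, MAX_SEQ_LENGTH))
```

The requested tensor works out to 600 floats per sequence per timestep (18104 x 200 x 600 = 2172480000), so the allocation scales linearly with the number of rows passed in one call. Under the assumptions above, a call such as `predict_in_chunks(lambda chunk: model.predict_batch(chunk, n=100), x_test, chunk_size=256)` would keep each GRU forward pass small; the value 256 is an arbitrary starting point, not a tuned setting.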