PhilGlau filmo

## gist:a99db13793397c8364a3be42c711db9b
Q-learning walkthrough for GATech OMSCS. (The school's honor policy prohibits us from publishing our code.)

## Double DQN
Used a 2-layer fully connected network with hidden sizes H1=100 and H2=60 and ReLU activations.

He initialization of weights.
Adam optimizer. Initial learning rate = 0.001, multiplied by a decay factor (gamma) of 0.50 every 350 episodes.

Discount factor (gamma) = 0.99
Initial eps = 1.00
Eps decay = 0.98
Eps is decayed once per new episode (not each step).
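
The two schedules above can be sketched in plain Python. This is only an illustration of the listed hyperparameters, not the course code: the function names are hypothetical, episodes are assumed to be counted from 0, the eps decay is assumed to be multiplicative, and no lower bound on eps is applied because none is stated.

```python
# Hyperparameters as listed in the notes above.
INITIAL_LR = 0.001
LR_GAMMA = 0.50          # learning-rate decay factor
LR_STEP_EPISODES = 350   # episodes between learning-rate reductions
INITIAL_EPS = 1.00
EPS_DECAY = 0.98         # applied once per episode


def learning_rate(episode: int) -> float:
    """Step decay: multiply the rate by LR_GAMMA after every 350 episodes."""
    return INITIAL_LR * LR_GAMMA ** (episode // LR_STEP_EPISODES)


def epsilon(episode: int) -> float:
    """Exploration rate, assuming eps is multiplied by EPS_DECAY each episode."""
    return INITIAL_EPS * EPS_DECAY ** episode
```

Under these assumptions eps falls to roughly 0.13 after 100 episodes, so most exploration happens well before the first learning-rate reduction at episode 350.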

## binned_throttle.py
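import numpy as np

# Compact array printing: 4 decimals, no scientific notation, wide lines.
np.set_printoptions(precision=4, suppress=True, linewidth=180)

def clamp(n, lo, hi):
    """Return n limited to the closed range [lo, hi]."""
    # Parameters are named lo/hi so the built-in min() and max() are not shadowed.
    if n < lo:
        return lo
    if n > hi:
        return hi
    return n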

## gist:2c1ac7467c26f62f589422aa55206a2e
On line 35 shuffle is imported from sklearn

from sklearn.utils import shuffle

and then later called inside the data generator with:

shuffle(keys)

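However, shuffle from sklearn does *not* shuffle in place. It returns a shuffled copy and leaves its input untouched, so the call above discards the result and keys keeps its original order. Assigning the return value, keys = shuffle(keys), fixes this.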
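
If the generator really does need the list reordered in place, the standard library's `random.shuffle` behaves that way. A minimal sketch, where `keys` is a hypothetical stand-in for the real list used by the data generator:

```python
import random

# Hypothetical stand-in for the keys list used by the data generator.
keys = ["img_0", "img_1", "img_2", "img_3", "img_4"]
original = list(keys)

# random.shuffle reorders the list in place and returns None,
# so there is no return value to capture.
result = random.shuffle(keys)

# The elements are the same; only their order may have changed.
same_elements = sorted(keys) == sorted(original)

# Alternative that keeps sklearn: capture the shuffled copy it returns.
#   from sklearn.utils import shuffle
#   keys = shuffle(keys)
```

Either approach works; the in-place version avoids allocating a new list on every pass of the generator.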

## gist:6720092a1ceac129f52402dc61af0f5a
  0%|          | 0/1000 [00:00<?, ?it/s]You're using a PreTrainedTokenizerFast tokenizer. Please note that with a fast tokenizer, using the `__call__` method is faster than using a method to encode the text followed by a call to the `pad` method to get a padded encoding.
Traceback (most recent call last):
  File "/home/philglau/PycharmProjects/tokenizersLLM/medium_article_falcon7b.py", line 87, in <module>
    trainer.train()
  File "/home/philglau/anaconda3/envs/pytorch_hug_llm_203/lib/python3.11/site-packages/transformers/trainer.py", line 1645, in train
    return inner_training_loop(
           ^^^^^^^^^^^^^^^^^^^^
  File "/home/philglau/anaconda3/envs/pytorch_hug_llm_203/lib/python3.11/site-packages/transformers/trainer.py", line 1938, in _inner_training_loop
    tr_loss_step = self.training_step(model, inputs)
                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^