I hereby claim:
- I am omdv on github.
- I am omdv (https://keybase.io/omdv) on keybase.
- I have a public key whose fingerprint is C318 B859 0B6A 8C63 50E6 0456 CCD7 3B79 529F FD75
To claim this, I am signing this object:
#!/bin/bash
DOMAIN=${HOMELAB_CLOUDFLARE_DOMAIN:-"example.com"}
TOKEN=${HOMELAB_CLOUDFLARE_API_TOKEN:-"123-456-789-abc-def"}
# api url
URL="https://api.cloudflare.com/client/v4/zones"
# get zone id
REC=$(curl -s -X GET "${URL}?name=${DOMAIN}&status=active&page=1&per_page=20&order=status&direction=desc&match=all" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json")
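The response stored in `REC` is the full JSON zones listing, not the zone id itself. A minimal sketch of extracting the id with standard tools follows; the sample response is an illustrative assumption so the snippet is self-contained (in the real script you would pipe `${REC}` from the curl call instead):

```shell
# sample Cloudflare-style response (illustrative, not real data)
REC='{"result":[{"id":"abc123","name":"example.com","status":"active"}]}'
# pull the first "id" value out of the JSON with grep/cut
ZONE_ID=$(echo "${REC}" | grep -o '"id":"[^"]*"' | head -1 | cut -d'"' -f4)
echo "Zone ID: ${ZONE_ID}"
```

A JSON-aware tool such as `jq` (`.result[0].id`) would be more robust if it is available on the host.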
- name: get latest docker-compose
  shell: |
    curl -s https://api.github.com/repos/docker/compose/releases/latest \
      | grep "tag_name" \
      | cut -d \" -f 4
  register: docker_compose_version
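The registered variable can then be used in a follow-up task. A hedged sketch, assuming the older `docker-compose-Linux-x86_64` release-asset naming (the asset name and destination path are illustrative assumptions):

```yaml
# hypothetical follow-up task: fetch the binary for the registered version
- name: download docker-compose
  get_url:
    url: "https://github.com/docker/compose/releases/download/{{ docker_compose_version.stdout }}/docker-compose-Linux-x86_64"
    dest: /usr/local/bin/docker-compose
    mode: "0755"
```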
This is an attempt to create a generic agent (hence the relatively long code) for different OpenAI Gym environments. The model is based on Q-learning with experience replay. Collected Q-values are approximated by a neural network (TensorFlow), and the action with the maximum Q-value for the given state is selected. The exploration rate starts at 0.6 and is quickly annealed to the standard 0.1. The network used for the CartPole environment is quite simple, with one ReLU hidden layer and a linear activation on the output layer. The model is loosely based on an excellent tutorial written by Tambet Matiisen on his blog.
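The epsilon-greedy selection with annealing described above can be sketched as follows; the function names and the linear annealing schedule over 1000 steps are illustrative assumptions, not the original implementation:

```python
import random

EPS_START, EPS_END, ANNEAL_STEPS = 0.6, 0.1, 1000  # schedule length is assumed

def epsilon(step):
    """Linearly anneal the exploration rate from 0.6 down to 0.1."""
    frac = min(step / ANNEAL_STEPS, 1.0)
    return EPS_START + frac * (EPS_END - EPS_START)

def select_action(q_values, step):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if random.random() < epsilon(step):
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=lambda a: q_values[a])
```

In the full agent, `q_values` would be the network's output for the current state.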
The main challenge I experienced when adapting this agent to the CartPole environment was selecting a proper reward model. The default reward of +1 for every step the pole stayed upright was not very successful. Instead, I assigned a reward of 0.0 for every step the pole was upright and penalized the terminal failure state with -1.0.
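A minimal sketch of this reward shaping, assuming the standard Gym step loop where `done` signals episode termination (the function name and the 200-step episode cap are illustrative assumptions):

```python
def shaped_reward(done, steps, max_steps=200):
    """Return 0.0 while the pole stays up, -1.0 if the episode ends early.

    An episode that reaches max_steps ended by the time limit, not by the
    pole falling, so it is not penalized.
    """
    if done and steps < max_steps:
        return -1.0
    return 0.0
```

This replaces the environment's default +1-per-step reward before the transition is stored in the replay buffer.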