Oleg Medvedyev (omdv)
  • the final frontier
omdv / ddns-cloudflare.sh
Created July 26, 2022 03:23
Cloudflare dynamic DNS updater
#!/bin/bash
DOMAIN=${HOMELAB_CLOUDFLARE_DOMAIN:-"example.com"}
TOKEN=${HOMELAB_CLOUDFLARE_API_TOKEN:-"123-456-789-abc-def"}
# Cloudflare v4 API base URL
URL="https://api.cloudflare.com/client/v4/zones"
# Look up the zone record for the domain (the request was truncated in the
# preview; closed here with the usual JSON content-type header)
REC=$(curl -s -X GET "${URL}?name=${DOMAIN}&status=active&page=1&per_page=20&order=status&direction=desc&match=all" \
  -H "Authorization: Bearer ${TOKEN}" \
  -H "Content-Type: application/json")
omdv / gist:a764cfc1a1202b413a15b57cb406e9f4
Created October 6, 2021 02:44
Get latest github release version in ansible variable
- name: get latest docker-compose
  # Folded scalar (>) joins lines with spaces, so no trailing backslashes
  # are needed; with backslashes the joined command would break.
  shell: >
    curl -s https://api.github.com/repos/docker/compose/releases/latest
    | grep "tag_name"
    | cut -d \" -f 4
  register: docker_compose_version
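Downstream, the registered variable's stdout holds the release tag. A hedged usage sketch; the release-asset URL pattern is an assumption for illustration (it matches the docker-compose v1 naming), not from the gist:

- name: download docker-compose at the detected version
  get_url:
    # Asset naming (docker-compose-Linux-x86_64) follows the v1 releases;
    # treat the URL pattern as an assumption.
    url: "https://github.com/docker/compose/releases/download/{{ docker_compose_version.stdout }}/docker-compose-{{ ansible_system }}-{{ ansible_architecture }}"
    dest: /usr/local/bin/docker-compose
    mode: "0755"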
omdv / keybase.md
Created June 9, 2017 04:33
keybase.io

Keybase proof

I hereby claim:

  • I am omdv on github.
  • I am omdv (https://keybase.io/omdv) on keybase.
  • I have a public key whose fingerprint is C318 B859 0B6A 8C63 50E6 0456 CCD7 3B79 529F FD75

To claim this, I am signing this object:

omdv / README.md
Last active December 7, 2016 18:05
OpenAI gym

This is an attempt to create a generic (hence the relatively long code) agent for different OpenAI gym environments. The model is based on Q-learning with experience replay. Q-values are approximated by a neural network (TensorFlow), and the action with the maximum Q-value for the given state is selected. The exploration rate starts at 0.6 and is quickly annealed to the standard 0.1. The network used for the cartpole environment is quite simple: one ReLU hidden layer and a linear output layer. The model is loosely based on the excellent tutorial by Tambet Matiisen on his blog.
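A minimal sketch of the loop described above, using the older gym API (pre-0.26) and a small NumPy network standing in for the TensorFlow one; layer sizes, buffer and batch sizes, and the learning rate are illustrative assumptions, not the gist's values:

# Sketch of Q-learning with experience replay on CartPole (not the gist's code).
import random
from collections import deque

import gym
import numpy as np

env = gym.make("CartPole-v1")
n_in, n_hidden, n_out = env.observation_space.shape[0], 32, env.action_space.n

# One ReLU hidden layer, linear output, as in the description.
rng = np.random.default_rng(0)
W1 = rng.normal(0, 0.1, (n_in, n_hidden)); b1 = np.zeros(n_hidden)
W2 = rng.normal(0, 0.1, (n_hidden, n_out)); b2 = np.zeros(n_out)

def q_values(s):
    h = np.maximum(s @ W1 + b1, 0.0)
    return h, h @ W2 + b2

buffer = deque(maxlen=10000)
epsilon, eps_min, gamma, lr = 0.6, 0.1, 0.99, 1e-3

for episode in range(300):
    s = env.reset()
    done = False
    while not done:
        # Epsilon-greedy action selection on the approximated Q-values.
        if random.random() < epsilon:
            a = env.action_space.sample()
        else:
            a = int(np.argmax(q_values(s)[1]))
        s2, r, done, _ = env.step(a)
        buffer.append((s, a, r, s2, done))
        s = s2
        # Experience replay: one SGD pass over a random minibatch.
        if len(buffer) >= 64:
            for bs, ba, br, bs2, bdone in random.sample(buffer, 64):
                h, q = q_values(bs)
                target = br if bdone else br + gamma * np.max(q_values(bs2)[1])
                err = q[ba] - target            # TD error on the taken action
                # Manual gradient step for the two-layer network.
                onehot = np.eye(n_out)[ba]
                gh = err * W2[:, ba]
                gh[h <= 0] = 0.0                # ReLU mask
                W2 -= lr * np.outer(h, onehot * err); b2 -= lr * err * onehot
                W1 -= lr * np.outer(bs, gh);          b1 -= lr * gh
    # Quickly anneal exploration from 0.6 toward 0.1.
    epsilon = max(eps_min, epsilon * 0.95)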

The main challenge in adapting this agent to the cartpole environment was selecting a proper reward model. The default reward of +1 for every step the pole stayed upright was not very successful. Instead, I assigned a reward of 0.0 for every step the pole was upright and penalized (-1.0) the terminal state when the episode ended in failure.
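Plugged into the replay buffer from the sketch above, the shaping is a one-line substitution; the terminal-failure reading of the description is an assumption:

# Reward shaping sketch: 0.0 while the pole stays up, -1.0 on terminal failure
# (assumed interpretation of the description above).
shaped = -1.0 if done else 0.0
buffer.append((s, a, shaped, s2, done))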