Mohammed Shameer Abubucker amdshameer

  • Bielefeld, Germany
@amdshameer
amdshameer / cem.md
Created October 11, 2022 09:08 — forked from kashif/cem.md
Cross Entropy Method

How do we solve the policy optimization problem, that is, how do we maximize the total reward under some parametrized policy?

Discounted future reward

To begin with, the total reward for an episode is the sum of all the rewards collected during it. If the environment is stochastic, we can never be sure of getting the same rewards the next time we perform the same actions, so the further we look into the future, the more the total future reward may diverge. For that reason it is common to use the discounted future reward instead, in which each later reward is weighted by a higher power of the parameter discount. This parameter is called the discount factor and lies between 0 and 1.
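As a small illustration of the discounted future reward described above, the sketch below sums the rewards of one episode with each reward weighted by a power of the discount factor. The function name `discounted_return` and the sample reward list are assumptions made for the example, not part of the original gist.

```python
def discounted_return(rewards, discount=0.99):
    """Return sum_t discount**t * rewards[t] for a single episode.

    `rewards` is a list of per-step rewards (hypothetical example input).
    """
    if not 0.0 <= discount <= 1.0:
        raise ValueError("discount must lie in [0, 1]")
    total = 0.0
    # Walk backwards so each step is one multiply-add:
    # G_t = r_t + discount * G_{t+1}
    for reward in reversed(rewards):
        total = reward + discount * total
    return total


episode_rewards = [1.0, 1.0, 1.0]
undiscounted = discounted_return(episode_rewards, discount=1.0)
discounted = discounted_return(episode_rewards, discount=0.5)
```

With a discount of 1 the result is the plain sum of rewards; smaller values shrink the contribution of rewards that arrive later in the episode.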

A good strategy for an agent would be to always choose the action that maximizes the (discounted) future reward. In other words, we want to maximize the expected reward per episode.
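One common way to attack this maximization is the cross-entropy method that the gist is named after. The following is a minimal sketch under stated assumptions, not the gist's own code: it samples parameter vectors from a diagonal Gaussian, keeps the best-scoring fraction, and refits the Gaussian to them. All names (`cross_entropy_method`, `toy_score`, `elite_frac`, and so on) are hypothetical, and `toy_score` is a stand-in for the expected reward per episode of a policy with the given parameters.

```python
import random
import statistics


def cross_entropy_method(score, dim, iterations=30, population=50,
                         elite_frac=0.2, init_std=2.0, seed=0):
    """Maximize `score(params)` by repeatedly refitting a diagonal Gaussian."""
    rng = random.Random(seed)
    mean = [0.0] * dim
    std = [init_std] * dim
    n_elite = max(2, int(population * elite_frac))
    for _ in range(iterations):
        # Sample candidate parameter vectors around the current mean.
        candidates = [
            [rng.gauss(m, s) for m, s in zip(mean, std)]
            for _ in range(population)
        ]
        # Keep the best-scoring fraction (the "elite" set).
        elite = sorted(candidates, key=score, reverse=True)[:n_elite]
        # Refit the Gaussian to the elite set, one dimension at a time.
        columns = list(zip(*elite))
        mean = [statistics.fmean(col) for col in columns]
        # The small constant keeps the spread from collapsing to exactly zero.
        std = [statistics.pstdev(col) + 1e-3 for col in columns]
    return mean


def toy_score(params):
    """Toy objective with its maximum at params == (1.0, -0.5)."""
    return -((params[0] - 1.0) ** 2 + (params[1] + 0.5) ** 2)


best = cross_entropy_method(toy_score, dim=2)
```

In a reinforcement learning setting, `score` would typically run one or more episodes with the candidate policy parameters and return the average (discounted) reward, which makes each evaluation noisy and usually calls for larger populations.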

amdshameer / Image_Moments.ipynb
Created October 4, 2022 12:57 — forked from xvdp/Image_Moments.ipynb
Image Moments, test and Implementation.
amdshameer / inertia_matrix.py
Created May 10, 2022 13:09 — forked from awesomebytes/inertia_matrix.py
Compute inertia matrix for simple solids: cube, sphere and cylinder
#!/usr/bin/env python
# Based on:
# http://mathworld.wolfram.com/MomentofInertia.html
def get_cube_inertia_matrix(mass, x, y, z):
    """Given the mass and dimensions of a cube, return its inertia matrix.
    :return: ixx, ixy, ixz, ixy, iyy, iyz, ixz, iyz, izz
    From https://www.wolframalpha.com/input/?i=moment+of+inertia+cube"""
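The listing stops at the docstring. As a minimal sketch of what such a function could compute, assuming a uniform solid cuboid with axes through its centre of mass and aligned with its edges, the standard formulas give the nine values in the order the `:return:` line names. The function name `solid_cuboid_inertia` is hypothetical, and this is not necessarily the original author's implementation.

```python
def solid_cuboid_inertia(mass, x, y, z):
    """Inertia terms of a uniform solid cuboid about its centre of mass.

    x, y and z are the full edge lengths along the respective axes.
    Returns (ixx, ixy, ixz, ixy, iyy, iyz, ixz, iyz, izz).
    """
    ixx = mass * (y ** 2 + z ** 2) / 12.0
    iyy = mass * (x ** 2 + z ** 2) / 12.0
    izz = mass * (x ** 2 + y ** 2) / 12.0
    # Products of inertia vanish for principal axes through the centroid.
    ixy = ixz = iyz = 0.0
    return (ixx, ixy, ixz, ixy, iyy, iyz, ixz, iyz, izz)


unit_cube = solid_cuboid_inertia(mass=12.0, x=1.0, y=1.0, z=1.0)
slab = solid_cuboid_inertia(mass=12.0, x=1.0, y=2.0, z=3.0)
```

For a true cube the three diagonal terms are equal; for unequal edge lengths the largest moment belongs to the axis along the shortest edge.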