@jmbarr, created May 5, 2021
import itertools
from math import inf

import numpy as np


def choose_actions(sensors, predicted_tracks, reward_function, smapping=[0, 2], tmapping=[0, 2]):
    """Work out which action each sensor should take. Sensors must be able to return actions as a
    collection which is either iterable or testable for membership. Tracks are assumed to have been
    predicted to the time of interest in advance (this logic could exist inside choose_actions with
    the addition of a :class:`Predictor`). A reward function is passed, or could be accessed from
    the parent state. Mappings to the x, y position of sensor and track are provided; it's assumed
    that the angle to target lies in the x, y plane on a flat earth."""
    # Firstly, which targets can each sensor see? Build a dictionary mapping
    # each sensor to the set of tracks it could point at.
    sensor_target_pair = dict()
    for sensor in sensors:
        sensor_target_pair[sensor] = set()
        for track in predicted_tracks:
            relative_position = track.state_vector[tmapping] - sensor.position[smapping]
            angle_to_target = np.arctan2(relative_position[1], relative_position[0])
            # Pointing at the track is feasible if the required angle is among
            # the actions the sensor offers at the track's timestamp.
            if angle_to_target in sensor.actions(track.timestamp):
                sensor_target_pair[sensor].add(track)
    # From this, lazily generate every possible configuration: one candidate
    # track (i.e. one pointing action) per sensor.
    configs = ({sensor: action
                for sensor, action in zip(sensor_target_pair.keys(), config)}
               for config in itertools.product(*sensor_target_pair.values()))
    # Exhaustively score each configuration and keep the best one.
    best_reward = -inf
    selected_config = None
    for config in configs:
        reward = reward_function(config)
        if reward > best_reward:
            selected_config = config
            best_reward = reward
    return selected_config, best_reward
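
A minimal sketch of how this might be driven, assuming hypothetical Sensor and Track stand-ins invented for illustration (the :class:`Predictor` reference suggests the real classes would come from a tracking framework such as Stone Soup); the toy reward function simply counts how many sensors get tasked:

import datetime

import numpy as np


class Sensor:
    """Hypothetical stand-in: a fixed sensor with an angular field of view."""

    def __init__(self, position, field_of_view):
        self.position = np.array(position)
        self.field_of_view = field_of_view  # (min_angle, max_angle) in radians

    def actions(self, timestamp):
        # Return an object testable for membership: any pointing angle
        # within the field of view counts as an available action.
        lo, hi = self.field_of_view

        class _Arc:
            def __contains__(self, angle):
                return lo <= angle <= hi

        return _Arc()


class Track:
    """Hypothetical stand-in: a predicted track state at a timestamp."""

    def __init__(self, state_vector, timestamp):
        self.state_vector = np.array(state_vector)
        self.timestamp = timestamp


now = datetime.datetime.now()
sensors = [Sensor(position=[0., 0., 0.], field_of_view=(-np.pi / 4, np.pi / 4))]
tracks = [Track(state_vector=[10., 1., 5., 0.], timestamp=now)]

# Toy reward: prefer configurations that task more sensors.
config, reward = choose_actions(sensors, tracks, reward_function=len)

Note that itertools.product enumerates every combination of visible targets, so the number of configurations scored grows exponentially with the number of sensors; and if any sensor can see no target, its candidate set is empty, no configuration is generated, and selected_config is returned as None.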