YAML

Finally, all of the custom grid-objects and functions defined in the previous pages can be used together with the YAML configuration files, which makes creating custom environments extremely simple. All that is required is that a custom module contains all the custom code (grid-objects and functions), that all custom components are appropriately registered, and that the module name is used in the YAML file when indicating the grid-objects and functions to be used.

Practical Example

Note

The examples shown here can be found in the examples/ folder.

We are going to create a custom environment where the agent needs to collect all the coins scattered around the grid. For this purpose, we first define a new custom module coin_env.py which contains all the necessary components: a new Coin grid-object, and appropriate reset, transition, reward, and terminating functions.

from typing import Optional

import numpy.random as rnd

from gym_gridverse.action import Action
from gym_gridverse.agent import Agent
from gym_gridverse.design import draw_room, draw_wall_boundary
from gym_gridverse.envs.reset_functions import reset_function_registry
from gym_gridverse.envs.reward_functions import reward_function_registry
from gym_gridverse.envs.terminating_functions import (
    terminating_function_registry,
)
from gym_gridverse.envs.transition_functions import transition_function_registry
from gym_gridverse.geometry import Area, Orientation
from gym_gridverse.grid import Grid
from gym_gridverse.grid_object import Color, Floor, GridObject, Wall
from gym_gridverse.rng import choice, get_gv_rng_if_none
from gym_gridverse.state import State


class Coin(GridObject):
    state_index = 0
    color = Color.NONE
    blocks_movement = False
    blocks_vision = False
    holdable = False

    @classmethod
    def can_be_represented_in_state(cls) -> bool:
        return True

    @classmethod
    def num_states(cls) -> int:
        return 1

    def __repr__(self):
        return f'{self.__class__.__name__}()'


@reset_function_registry.register
def coin_maze(*, rng: Optional[rnd.Generator] = None) -> State:
    """creates a maze with collectible coins"""

    # must call this to include reproducible stochasticity
    rng = get_gv_rng_if_none(rng)

    # initializes grid with Coin
    grid = Grid.from_shape((7, 9), factory=Coin)
    # assigns Wall to the border
    draw_wall_boundary(grid)
    # draw other walls
    draw_room(grid, Area((2, 4), (2, 6)), Wall)
    # re-assign openings
    grid[2, 3] = Coin()
    grid[4, 5] = Coin()

    # final result (#=boundary Wall, W=inner Wall, .=Coin):

    # #########
    # #.......#
    # #.W.WWW.#
    # #.W...W.#
    # #.WWW.W.#
    # #.......#
    # #########

    # randomized agent position and orientation
    agent_position = choice(
        rng,
        [
            position
            for position in grid.area.positions()
            if isinstance(grid[position], Coin)
        ],
    )
    agent_orientation = choice(rng, list(Orientation))
    agent = Agent(agent_position, agent_orientation)

    # remove coin from agent initial position
    grid[agent.position] = Floor()

    return State(grid, agent)


@transition_function_registry.register
def collect_coin_transition(
    state: State,
    action: Action,
    *,
    rng: Optional[rnd.Generator] = None,
):
    """collects and removes coins"""
    if isinstance(state.grid[state.agent.position], Coin):
        state.grid[state.agent.position] = Floor()


@reward_function_registry.register
def collect_coin_reward(
    state: State,
    action: Action,
    next_state: State,
    *,
    reward: float = 1.0,
    rng: Optional[rnd.Generator] = None,
):
    """gives reward if a coin was collected"""
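    # inspect the previous state's grid: the transition function has already
    # removed the collected coin from next_state
    return (
        reward
        if isinstance(state.grid[next_state.agent.position], Coin)
        else 0.0
    )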


@terminating_function_registry.register
def no_more_coins(
    state: State,
    action: Action,
    next_state: State,
    *,
    rng: Optional[rnd.Generator] = None,
):
    """terminates episodes if all coins are collected"""
    return not any(
        isinstance(next_state.grid[position], Coin)
        for position in next_state.grid.area.positions()
    )
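
The reward function above looks up the coin in the previous state's grid rather than in next_state, because the transition function has already replaced the collected coin with a Floor. The following is a minimal, library-free sketch of that timing. The names ToyGrid, toy_collect, toy_reward and toy_done are hypothetical stand-ins invented for this illustration; they are not part of gym_gridverse.

```python
from typing import Dict, Tuple

# Hypothetical toy types: a grid maps (y, x) positions to 'coin' or 'floor'.
Position = Tuple[int, int]
ToyGrid = Dict[Position, str]


def toy_collect(grid: ToyGrid, agent: Position) -> ToyGrid:
    """returns a copy of the grid with the coin under the agent removed"""
    next_grid = dict(grid)
    if next_grid[agent] == 'coin':
        next_grid[agent] = 'floor'
    return next_grid


def toy_reward(grid: ToyGrid, next_agent: Position, reward: float = 1.0) -> float:
    """rewards a coin found in the given grid at the agent's new position"""
    return reward if grid[next_agent] == 'coin' else 0.0


def toy_done(next_grid: ToyGrid) -> bool:
    """true once no coin is left in the grid"""
    return 'coin' not in next_grid.values()


grid = {(0, 0): 'floor', (0, 1): 'coin'}
next_agent = (0, 1)  # the agent has just stepped onto the coin
next_grid = toy_collect(grid, next_agent)

print(toy_reward(grid, next_agent))       # previous grid still holds the coin
print(toy_reward(next_grid, next_agent))  # next grid no longer does
print(toy_done(next_grid))                # no coins remain after collection
```

Checking the next grid instead would never yield a reward, since the coin is gone by the time the reward is computed.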

Next, we are going to create a YAML configuration file coin_env.yaml which combines these new custom components with some predefined ones. To use the custom components, we just need to prefix their names with the name of the module in which they are defined, separated by a colon (e.g. coin_env:Coin).

state_space:
  objects: [ Wall, Floor, coin_env:Coin ]
  colors: [ NONE ]

action_space:
  - MOVE_FORWARD
  - MOVE_BACKWARD
  - MOVE_LEFT
  - MOVE_RIGHT
  - TURN_LEFT
  - TURN_RIGHT

observation_space:
  objects: [ Wall, Floor, coin_env:Coin ]
  colors: [ NONE ]

reset_function:
  name: coin_env:coin_maze

transition_functions:
  - name: move_agent
  - name: turn_agent
  - name: coin_env:collect_coin_transition

reward_functions:
  - name: living_reward
    reward: -0.1
  - name: coin_env:collect_coin_reward

observation_function:
  name: partially_occluded
  area: [ [ -6, 0 ], [ -3, 3 ] ]

terminating_function:
  name: coin_env:no_more_coins
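
To make the module:name convention used in the configuration above more concrete, here is a minimal sketch of how such a reference could be split and resolved with the standard library. This is an illustration only and is not gym_gridverse's actual loading code, which may work differently (for instance through its registries); resolve_reference and its default_module parameter are hypothetical names.

```python
import importlib
from typing import Any


def resolve_reference(reference: str, default_module: str = 'builtins') -> Any:
    """resolves 'module:name' (or a bare 'name') to a Python object"""
    # 'coin_env:Coin' -> ('coin_env', ':', 'Coin'); 'Wall' -> ('', '', 'Wall')
    module_name, separator, attribute = reference.rpartition(':')
    if not separator:
        # no module prefix given: fall back to the (assumed) default module
        module_name = default_module

    # importing the module runs its top-level code, including any
    # registration decorators it contains
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# demonstration with standard-library names only
sqrt = resolve_reference('math:sqrt')
print(sqrt(9.0))
```

With coin_env.py importable, a call such as resolve_reference('coin_env:Coin') would return the Coin class in this sketch.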

Important

For the custom module to be usable in the YAML configuration, its directory must be on the PYTHONPATH so that the Python interpreter can find it.
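
If the module cannot be found, a quick check along the following lines may help with diagnosis. It is a small sketch using only the standard library; is_importable is a hypothetical helper name, and coin_env is the module from this example.

```python
import importlib.util
import sys


def is_importable(module_name: str) -> bool:
    """true if the interpreter can locate a top-level module by name"""
    return importlib.util.find_spec(module_name) is not None


if not is_importable('coin_env'):
    # list the directories that were searched, one per line
    print('coin_env not found; searched:', *sys.path, sep='\n  ')
```

Running this from the same shell and working directory used to launch the environment shows whether the PYTHONPATH setting took effect.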

Note

This example can be run with the gv_viewer.py script:

cd <path/to/examples/folder>
PYTHONPATH="$PYTHONPATH:$PWD" gv_viewer.py coin_env.yaml