WiseMove is a safe reinforcement learning framework that combines hierarchical reinforcement learning with model checking using temporal logic constraints.

Requirements

  • Python 3.6
  • Sphinx
  • See requirements.txt for the full list of required Python packages.

Installation

  • Run the dependency installation script ./scripts/install_dependencies.sh to install pip3 and the required Python packages.

Note: The script checks whether a dependencies folder exists in the project root. If it does, packages are installed from that local folder; otherwise they are downloaded from the internet. If you have no internet connection and the dependencies folder does not exist, first run ./scripts/download_dependencies.sh on a machine with internet access, then transfer the resulting folder.
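
As a rough illustration of that fallback, the selection logic can be sketched in Python as follows. The dependencies folder name comes from this README, but the exact pip3 flags here are assumptions, not taken from the actual script:

```python
import os


def pip_install_command(project_root="."):
    """Return the pip3 command an installer script could run.

    If a local ``dependencies`` folder exists under ``project_root``,
    install offline from it; otherwise fetch packages from the internet.
    """
    deps = os.path.join(project_root, "dependencies")
    if os.path.isdir(deps):
        # Offline: use only the locally downloaded packages.
        return ["pip3", "install", "--no-index", "--find-links", deps,
                "-r", "requirements.txt"]
    # Online: resolve packages from PyPI as usual.
    return ["pip3", "install", "-r", "requirements.txt"]
```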

Documentation

  • Open ./documentation/index.html to view the documentation.
  • If the file does not exist, first run ./scripts/generate_doc.sh build to generate the documentation. Note that this requires Sphinx to be installed.

Replicate Results

These are the minimum steps required to replicate the results for the simple_intersection environment. For a detailed user guide, see the documentation.

  • Run ./scripts/install_dependencies.sh to install the Python dependencies.
  • Low-level policies:
    • You can train and test all the maneuvers, but this takes a long time and is not recommended.
      • To train all low-level policies from scratch: python3 low_level_policy_main.py --train.
      • To test all the trained low-level policies: python3 low_level_policy_main.py --test --saved_policy_in_root.
      • Make sure training has fully completed before running the test above.
    • It is easier to verify a few of the maneuvers using the commands below:
      • To train a single low-level policy, for example wait: python3 low_level_policy_main.py --option=wait --train.
      • To test one of these trained low-level policies, for example wait: python3 low_level_policy_main.py --option=wait --test --saved_policy_in_root.
      • The available maneuvers are: wait, changelane, stop, keeplane, follow.
    • These results are evaluated visually.
    • Note: Training has high variance due to the continuous action space, especially for the stop and keeplane maneuvers. It may help to train for 0.2 million steps instead of the default 0.1 million by adding the argument --nb_steps=200000 when training.
  • High-level policy:
    • To train the high-level policy from scratch using the given low-level policies: python3 high_level_policy_main.py --train.
    • To evaluate the trained high-level policy: python3 high_level_policy_main.py --evaluate --saved_policy_in_root.
    • The average success rate and its standard deviation correspond to the results of the high-level policy experiments.
  • To run MCTS using the high-level policy:
    • To build a probabilities tree and save it: python3 mcts.py --train.
    • To evaluate using the saved tree: python3 mcts.py --evaluate --saved_policy_in_root.
    • The average success rate and its standard deviation correspond to the results of the MCTS experiments.
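
Taken together, the steps above form a fixed pipeline. A hypothetical driver (the command lists below are copied from this README; the run_all helper and its injectable runner are just for illustration) could replay them in order:

```python
import subprocess

# Replication commands in the order given above. Low-level training is
# omitted here, since the given low-level policies can be used directly.
REPLICATION_STEPS = [
    ["python3", "high_level_policy_main.py", "--train"],
    ["python3", "high_level_policy_main.py", "--evaluate", "--saved_policy_in_root"],
    ["python3", "mcts.py", "--train"],
    ["python3", "mcts.py", "--evaluate", "--saved_policy_in_root"],
]


def run_all(steps, runner=subprocess.run):
    """Run each step in order, stopping at the first failure."""
    for cmd in steps:
        runner(cmd, check=True)
```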

Coding Standards

We follow the PEP8 style guidelines for code and PEP257 for docstrings. You do not need to keep these in mind while coding, but before submitting a pull request, run the following two steps on each Python file you have modified.

  1. yapf -i YOUR_MODIFIED_FILE.py
  2. docformatter --in-place YOUR_MODIFIED_FILE.py

yapf formats the code and docformatter formats the docstrings.
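
As a small, hypothetical example of the target style, the function below is already in the form those tools would leave it in: PEP8-compliant layout, and a PEP257 docstring with a one-line summary followed by a blank line:

```python
def normalize(values):
    """Scale a sequence of numbers so that they sum to one.

    Args:
        values: A non-empty sequence of numbers with a positive sum.

    Returns:
        A list of floats summing to one.
    """
    total = sum(values)
    return [v / total for v in values]
```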