    Improve and Bug-fix DQNLearner and environments. · 4a9327bd
    Jae Young Lee authored
    - Added RestrictedEpsGreedyPolicy and RestrictedGreedyPolicy, and used them as policy and test_policy in DQNLearner. The agent now never chooses an action whose Q-value is -inf as long as at least one action has a finite Q-value. If no action has a finite Q-value, it picks an action uniformly at random, which is necessary for compatibility with keras-rl (see the comments in select_action).
    
    - Now, generate_scenario in SimpleIntersectionEnv generates veh_ahead_scenario even when randomize_special_scenario = 1.
    
    - In EpisodicEnvBase, the terminal reward is now, by default, the minimum one.
    
    - Simplified initiation_condition in EpisodicEnvBase (small change).
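
The restricted action selection described in the first item can be sketched roughly as follows. This is a minimal, standalone illustration and not the repository's code: the function names `restricted_greedy` and `restricted_eps_greedy`, the `eps` parameter, and the use of plain Python lists instead of keras-rl policy classes are all assumptions made for the example.

```python
import math
import random


def restricted_greedy(q_values):
    """Return the index of the largest finite Q-value (hypothetical helper)."""
    # Assumption: an action is admissible when its Q-value is finite.
    finite = [i for i, q in enumerate(q_values) if math.isfinite(q)]
    if not finite:
        # No admissible action: fall back to a uniformly random choice.
        return random.randrange(len(q_values))
    return max(finite, key=lambda i: q_values[i])


def restricted_eps_greedy(q_values, eps=0.1):
    """Epsilon-greedy selection that explores only among finite-Q actions."""
    finite = [i for i, q in enumerate(q_values) if math.isfinite(q)]
    if not finite:
        # Same fallback as above when every Q-value is non-finite.
        return random.randrange(len(q_values))
    if random.random() < eps:
        # Exploration step, restricted to admissible actions.
        return random.choice(finite)
    # Exploitation step: best admissible action.
    return max(finite, key=lambda i: q_values[i])
```

Under these assumptions, calling `restricted_eps_greedy([float("-inf"), 1.0, 3.0], eps=1.0)` can only ever return index 1 or 2, since index 0 is masked out by its -inf Q-value.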
Name
backends
documentation
env
model_checker
options
scripts
.gitignore
LICENSE
README.md
config.json
high_level_policy_main.py
low_level_policy_main.py
mcts.py
mcts_config.json
ppo2_training.py
requirements.txt