
[Looking for contributors] Fixing BC and RL training #162

@micahcarroll

Description


Hi there, I'm the original creator of the repo. Since the initial release, the packages we were using for BC and RL training have had major updates, people have transitioned from TensorFlow to PyTorch to JAX, etc.

These changes have broken the BC and RL training. Over the years, I've tried to fix things locally without spending too much time on it, but I don't think that's worth doing anymore: it means investing in a stack (mainly built on TensorFlow) which is outdated and should be rewritten. The good news is that writing this stack today is much easier than it once was.

What would this look like?

  • For BC, you could quickly get the current TensorFlow implementation in human_aware_rl to work (at least I was able to), but it's probably best to start from scratch and write it in PyTorch (or even better, JAX). This shouldn't be too hard. If you do so, please make a PR!
  • For RL training, I'd recommend using JaxMARL: it's much faster than our training was. I don't think salvaging our rllib-based training is worth it.
  • For evaluating/visualizing/interacting with the trained RL policies, this would require some (hopefully minimal) infrastructure to port the JaxMARL-trained policies back to the overcooked-ai environment.
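To make the BC rewrite concrete, here is a minimal sketch of what a PyTorch behavioral-cloning loop could look like. It assumes discrete Overcooked actions (the environment has 6: the four movements, stay, and interact) and pre-extracted flat (observation, action) pairs; the observation dimension, network sizes, and data loading here are illustrative placeholders, not the human_aware_rl data format.

```python
# Hypothetical BC sketch in PyTorch: fit a small MLP policy to
# (observation, action) pairs with a cross-entropy loss.
import torch
import torch.nn as nn

OBS_DIM = 96       # placeholder; the real featurized obs dim differs
NUM_ACTIONS = 6    # Overcooked's discrete action space

policy = nn.Sequential(
    nn.Linear(OBS_DIM, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, NUM_ACTIONS),  # logits over actions
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_bc(obs, actions, epochs=5, batch_size=64):
    """Supervised training on demonstration (obs, action) pairs."""
    ds = torch.utils.data.TensorDataset(obs, actions)
    loader = torch.utils.data.DataLoader(ds, batch_size=batch_size, shuffle=True)
    loss = None
    for _ in range(epochs):
        for o, a in loader:
            opt.zero_grad()
            loss = loss_fn(policy(o), a)  # match logits to human actions
            loss.backward()
            opt.step()
    return loss.item()

# Synthetic stand-in data; a real run would load human trajectories.
obs = torch.randn(512, OBS_DIM)
actions = torch.randint(0, NUM_ACTIONS, (512,))
final_loss = train_bc(obs, actions)
```

At evaluation time the trained policy would be wrapped so that it samples (or argmaxes) an action from the logits for the current featurized state; that wrapper is where it would plug back into the overcooked-ai agent interface.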

Old instructions from the README about testing whether human_aware_rl is working correctly

To check whether human_aware_rl is installed correctly, you can run the following command from the src/human_aware_rl directory:

$ ./run_tests.sh
