
[Looking for contributors] Fixing BC and RL training #162

@micahcarroll

Description


Hi there, I'm the original creator of the repo. Since the initial release, the packages we were using for BC and RL training have had major updates, people have transitioned from TensorFlow to PyTorch to JAX, etc.

These changes have broken the BC and RL training. Over the years, I've tried to fix things locally without spending too much time on it, but I don't think that's worth doing anymore: it means investing in a stack (mainly built on TensorFlow) which is outdated and should be rewritten. The good news is that writing this stack today is much easier than it once was.

What would this look like?

  • For BC, you could quickly get the current TensorFlow implementation in human_aware_rl to work (at least I was able to), but it's probably best to start from scratch and write it in PyTorch (or even better, JAX). This shouldn't be too hard. If you do so, please make a PR!
  • For RL training, I'd recommend using JaxMARL: it's much faster than our training was. I don't think salvaging our rllib-based training is worth it.
  • For evaluating/visualizing/interacting with the trained RL policies, this would require some (hopefully minimal) infrastructure to port the JaxMARL-trained policies back to the overcooked-ai environment.
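To make the BC rewrite concrete, here is a minimal sketch of what a PyTorch behavioral-cloning loop could look like. It assumes discrete Overcooked actions (the environment has 6: the four movements, stay, and interact) and pre-extracted flat (observation, action) pairs; the observation dimension, network sizes, and data loading here are illustrative placeholders, not the human_aware_rl data format.

```python
# Hypothetical BC sketch in PyTorch: fit a small MLP policy to
# (observation, action) pairs with a cross-entropy loss.
import torch
import torch.nn as nn

OBS_DIM = 96       # placeholder; the real featurized obs dim differs
NUM_ACTIONS = 6    # Overcooked's discrete action space

policy = nn.Sequential(
    nn.Linear(OBS_DIM, 64), nn.ReLU(),
    nn.Linear(64, 64), nn.ReLU(),
    nn.Linear(64, NUM_ACTIONS),  # logits over actions
)
opt = torch.optim.Adam(policy.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()

def train_bc(obs, actions, epochs=5, batch_size=64):
    """Supervised training on demonstration (obs, action) pairs."""
    ds = torch.utils.data.TensorDataset(obs, actions)
    loader = torch.utils.data.DataLoader(ds, batch_size=batch_size, shuffle=True)
    loss = None
    for _ in range(epochs):
        for o, a in loader:
            opt.zero_grad()
            loss = loss_fn(policy(o), a)  # match logits to human actions
            loss.backward()
            opt.step()
    return loss.item()

# Synthetic stand-in data; a real run would load human trajectories.
obs = torch.randn(512, OBS_DIM)
actions = torch.randint(0, NUM_ACTIONS, (512,))
final_loss = train_bc(obs, actions)
```

At evaluation time the trained policy would be wrapped so that it samples (or argmaxes) an action from the logits for the current featurized state; that wrapper is where it would plug back into the overcooked-ai agent interface.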

Old instructions from the README about testing whether human_aware_rl is working correctly

To check whether human_aware_rl is installed correctly, you can run the following command from the src/human_aware_rl directory:

$ ./run_tests.sh
