Skip to content

Conversation

chriselion
Copy link
Contributor

@chriselion chriselion commented Jul 28, 2020

Proposed change(s)

  • Add optional "virtual root" GameObject to the RigidBodyPoseExtractor. If set, this acts as the reference position, and the root RB is parented to this.
  • Update the Crawler scene to use a RigidBodySensorComponent, and remove related code from CollectObservations.
  • Enable toggling (in code) specific bodies in the PoseExtractor. The root body is disabled by default, since that observations for that are always the identity. Need to enable setting these for other bodies in a custom inspector (e.g. disabling the eyeballs in the worm).
  • Changed pose access to be an IEnumerable, to make skipping disabled poses transparent.

image

Followup:

  • retrain UR3 model, because the layout of the observations changed (maybe not a problem if not using the velocities)
  • Add VirtualRoot to ArticulationBodyPoseExtractor (not needed by anything at the moment).

Useful links (Github issues, JIRA tickets, ML-Agents forum threads etc.)

https://jira.unity3d.com/browse/MLA-1135
https://jira.unity3d.com/browse/MLA-1166
https://jira.unity3d.com/browse/MLA-1167

Types of change(s)

  • Bug fix
  • New feature
  • Code refactor
  • Breaking change
  • Documentation update
  • Other (please describe)

Checklist

  • Added tests that prove my fix is effective or that my feature works
  • Updated the changelog (if applicable)
  • Updated the documentation (if applicable)
  • Updated the migration guide (if applicable)

Other comments

@chriselion chriselion marked this pull request as ready for review July 29, 2020 18:52
@chriselion chriselion marked this pull request as draft July 29, 2020 18:52

[assembly: InternalsVisibleTo("Unity.ML-Agents.Editor.Tests")]
[assembly: InternalsVisibleTo("Unity.ML-Agents.Editor")]
[assembly: InternalsVisibleTo("Unity.ML-Agents.Extensions")]
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Possibly controversial; I added this to get timers. But it's something we could use in the future, e.g. public actuators in extensions, but keep IActuator internal.

@chriselion chriselion marked this pull request as ready for review July 30, 2020 20:00
@chriselion chriselion requested a review from Hunter-Unity July 30, 2020 20:00

//Get velocities in the context of our orientation cube's space
//Note: You can get these velocities in world space as well but it may not train as well.
sensor.AddObservation(orientationCube.transform.InverseTransformDirection(bp.rb.velocity));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corresponds to model space velocity

//Get velocities in the context of our orientation cube's space
//Note: You can get these velocities in world space as well but it may not train as well.
sensor.AddObservation(orientationCube.transform.InverseTransformDirection(bp.rb.velocity));
sensor.AddObservation(orientationCube.transform.InverseTransformDirection(bp.rb.angularVelocity));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No direct correspondence, but removing it didn't seem to hurt much. I also need to convince myself that you can just transform angular velocities like that.

sensor.AddObservation(orientationCube.transform.InverseTransformDirection(bp.rb.angularVelocity));

//Get position relative to hips in the context of our orientation cube's space
sensor.AddObservation(orientationCube.transform.InverseTransformDirection(bp.rb.position - body.position));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corresponds to model space positions


if (bp.rb.transform != body)
{
sensor.AddObservation(bp.rb.transform.localRotation);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Corresponds to local space rotations

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these decisions that could have been made with the help of the tool @andrewcoh was talking about for saliency?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case, I removed everything from CollectObservations that had an equivalent in the sensor.

In general, we could turn everything on in the sensor, run the tool to see what's important, and then disable unimportant stuff. However:

  1. It could be tedious to attribute a particular index in the tool to a specific option in the sensor.
  2. The tool might be too fine-grained. For example, if it said that the w component of one quaternion wasn't relevant, we couldn't just turn that one off. Or if local space velocities was only relevant for one body, we couldn't keep that one but disable the rest.

propertyPath: targetToLookAt
value:
objectReference: {fileID: 7802320107249901494}
- target: {fileID: 4622120667686875944, guid: 0456c89e8c9c243d595b039fe7aa0bf9,
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, don't think I meant to override this. Now I need to double check the model settings

@chriselion
Copy link
Contributor Author

Training times and rewards:

master

Job Id Scene Name Expected Reward Mean Reward Max Reward Inference Reward Training Time Hours
crawdy-ppo crawlerdynamictarget 400 810.8445 813.6515 645.7593 10.79
crawsta-ppo crawlerstatictarget 2000 2069.4256 2073.8216 1988.6075 11.01

this branch

Job Id Scene Name Expected Reward Mean Reward Max Reward Inference Reward Training Time Hours
crawdy-ppo crawlerdynamictarget 400 801.0141 821.2339 832.2614 10.79
crawsta-ppo crawlerstatictarget 2000 1789.9456 1863.3114 1777.4036 13.13

I'm going to retrain the crawlerstatictarget scene again tonight; I got some prefabs mixed up so it was using different settings (added local space velocities too). Hopefully that will bring the training time down and the reward up a bit.

Also note that for crawlerstatictarget, master is above the expected reward and below on this branch. However, the reward on master is the outlier (unusually good) if we look at the last few weeks of data:
image

/// </summary>
/// <param name="index"></param>
/// <param name="val"></param>
public void SetPoseEnabled(int index, bool val)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

SetPoseEnabled is only used to make sure the Virtual Root is not being observed right? I did not see being used elsewhere. Making it public might be dangerous, dot sure what would happen if the user called SetPoseEnabled in the middle of training.
Also, nothing to do with Enabled / Disabled GameObjects right?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At the moment, it's only used on the root (virtual or otherwise). But I was to add a custom Editor to expose it per-body. For example, on the Worm scene, the eyes are actually rigid bodies constrained to the head, with locked linear motion and free angular motion:
image
We would basically be wasting observations on those, so I'd like to be able to filter them out.

For now, I'd prefer to keep things more accessible, and tighten up the visibility if we move to the core package.

Correct, nothing to do Enabled / Disabled GameObjects. I hadn't thought about that - is there a better name that would make this more obvious?

/// </summary>
/// <param name="parentIndices"></param>
protected void SetParentIndices(int[] parentIndices)
protected void Setup(int[] parentIndices)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the assumption be checked since this is a protected method on a public class?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done now.


/// <inheritdoc/>
protected override Vector3 GetLinearVelocityAt(int index)
protected internal override Vector3 GetLinearVelocityAt(int index)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why the internal addition?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wanted to be able to assert on it

Assert.AreEqual(rb1.velocity, poseExtractor.GetLinearVelocityAt(0));

// * go2
// - rb2
// - joint
var virtualRoot = new GameObject("I am vroot");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🌳

@chriselion chriselion merged commit fe7d147 into master Jul 31, 2020
@delete-merged-branch delete-merged-branch bot deleted the MLA-1135-reference-body branch July 31, 2020 17:29
@chriselion chriselion changed the title [WIP] [MLA-1135] Physics sensors - optional reference body [MLA-1135] Physics sensors - optional reference body Jul 31, 2020
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Jul 31, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants