Skip to content

Conversation

machshev
Copy link
Collaborator

@machshev machshev commented Oct 3, 2025

Debug feature to dump a JSON representation of the Deploy objects, just before they are sent to the scheduler. This feature is hidden behind an environment variable to avoid cluttering the CLI interface. The full Deploy object is not JSON serialisable in it's current form, so we only serialise the attributes that the LocalLauncher uses.

To enable set DVSIM_DEPLOY_DUMP=true, run DVSim and there will be a file called deploy-<timestamp>.json generated in the scratch directory scratch/<branch>/deploy-<timestamp.json>.

This file can they be compared between runs to check for regressions in the flow leading up to job deployment. With --fixed-seed=1 --branch=baseline, the resulting json files should be identical. Setting the "branch" doesn't actually change the git branch, it just overrides what DVSim
thinks is the branch as far as paths in the scratch directory are concerned. We need to either set this to something fixed so that the deployment objects contain the same paths, otherwise a diff will fail.

Generating a diff or hash of the two files and comparing shows if the job deployments would be identical or not.

Using this functionality I have confirmed, by backporting this commit to the first release tag, that none of the refactorings made so far in this repository have changed the deployment objects in such a way that the launched jobs are different in any way.

✦ ❯ sha512sum baseline.json deploy_25.10.03_11.35.33.json
af732c3011753cfc7dc40068e1ce9b6cf3624973ffbbd25f828e4d7727f27dd65b5ada19407500315ea6f279946222d185dc939fe4978b0b32b3e7e8b4f7f60c  baseline.json
af732c3011753cfc7dc40068e1ce9b6cf3624973ffbbd25f828e4d7727f27dd65b5ada19407500315ea6f279946222d185dc939fe4978b0b32b3e7e8b4f7f60c  deploy_25.10.03_11.35.33.json

The first JSON file was generated from backporting the this feature to the tagged commit. The second file was generated with this branch and includes all the tidyup work made so far to DVSim.

The DVSim command used is:

DVSIM_DEPLOY_DUMP=true dvsim \
    ~/base/opentitan/hw/top_earlgrey/dv/top_earlgrey_sim_cfgs.hjson \
    --proj-root ~/base/opentitan -i nightly --scratch ~/scratch \
    --branch baseline --fixed-seed 1 \
    --verbose=debug --max-parallel=50 \
    --cov

Note: your hashes will be different from mine as the directory paths will be different. However if you run this against the same versions your hashes should be the same as each other.

Copy link

@mkj121 mkj121 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pirmin is away until Oct 20 so I suggest you remove him as a reviewer

@machshev machshev removed the request for review from vogelpi October 3, 2025 13:13
@machshev machshev force-pushed the deploy_object_dump branch 2 times, most recently from c630250 to 4699d57 Compare October 3, 2025 15:48
Debug feature to dump a JSON representation of the `Deploy` objects,
just before they are sent to the scheduler. This feature is hidden behind
an environment variable to avoid cluttering the CLI interface. The full
`Deploy` object is not JSON serialisable in it's current form, so we
only serialise the attributes that the `LocalLauncher` uses and some
identifying attributes like the full name of the job and the deployment
class name.

To enable set `DVSIM_DEPLOY_DUMP=true`, run DVSim and there will be a
file called `deploy-<branch>-<timestamp>.json` generated in the scratch
directory `scratch/deploy-<branch>-<timestamp.json>`.

This file can they be compared between runs to check for regressions in
the flow leading up to job deployment. With `--fixed-seed=1 --branch=baseline`,
the resulting json files should be identical. Setting the "branch"
doesn't actually change the git branch, it just overrides what DVSim
thinks is the branch as far as paths in the scratch directory are
concerned. We need to either set this to something fixed so that the
deployment objects contain the same paths, otherwise a diff will fail.

Generating a `diff` or hash of the two files and comparing shows if the
job deployments would be identical or not.

Using this functionality I have confirmed, by backporting this commit to
the first release tag, that none of the refactorings made so far in this
repository have changed the deployment objects in such a way that the
launched jobs are different in any way.

```
✦ ❯ sha512sum baseline.json deploy_25.10.03_11.35.33.json
af732c3011753cfc7dc40068e1ce9b6cf3624973ffbbd25f828e4d7727f27dd65b5ada19407500315ea6f279946222d185dc939fe4978b0b32b3e7e8b4f7f60c  baseline.json
af732c3011753cfc7dc40068e1ce9b6cf3624973ffbbd25f828e4d7727f27dd65b5ada19407500315ea6f279946222d185dc939fe4978b0b32b3e7e8b4f7f60c  deploy_25.10.03_11.35.33.json
```

The first JSON file was generated from backporting the this feature to
the tagged commit. The second file was generated with this branch and
includes all the tidyup work made so far to DVSim.

The DVSim command used is:

```
DVSIM_DEPLOY_DUMP=true dvsim \
    ~/base/opentitan/hw/top_earlgrey/dv/top_earlgrey_sim_cfgs.hjson \
    --proj-root ~/base/opentitan -i nightly --scratch ~/scratch \
    --branch baseline --fixed-seed 1 \
    --verbose=debug --max-parallel=50 \
    --cov
```

**Note:** *your hashes will be different from mine as the directory paths
will be different. However if you run this against the same versions
your hashes should be the same as each other.*

**Note:** *Depending on the flags you pass to DVSim there may be some
minor indeterminism in the configuration. For example with CovMerge the
coverage directories are not always given in the same order. It can take
several runs to get a different hash, or it could be different on every
run. In such cases it's worth using `diff` across the files to see what
the actual differences are, they may not be consequential.*

Signed-off-by: James McCorrie <[email protected]>
@machshev machshev force-pushed the deploy_object_dump branch from 4699d57 to 61776ff Compare October 3, 2025 16:17
@machshev
Copy link
Collaborator Author

machshev commented Oct 3, 2025

Turns out that the coverage merge directories are provided in a semi random order. Which means that if you run DVSim several times then the hash can be different due to that.

I've updated the PR to sort the deployment object model dump JSON objects by the job full name to make sure the jobs themselves are in the same order each time. However small variations may be present anyway due to issues like the above.

We could make sure these directory paths are sorted to make them more deterministic. But that's for another PR. For the moment this mechanism seems good enough.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants