Commit d795f5d

Expand evaluation doc (#1396)
1 parent d076ce4 commit d795f5d

1 file changed

+23
-1
lines changed

docs/source/trainer/evaluation.rst

Lines changed: 23 additions & 1 deletion
@@ -17,6 +17,26 @@ specified by the :class:`.Trainer` parameter ``eval_interval``.
1717
)
1818
1919
The metrics should be provided by :meth:`.ComposerModel.metrics`.
20+
For more information, see the "Metrics" section in :doc:`/composer_model`.
21+
22+
To provide a deeper intuition, here's pseudocode for the evaluation logic that occurs every ``eval_interval``:
23+
24+
.. code:: python

    metrics = model.metrics(train=False)

    for batch in eval_dataloader:
        outputs, targets = model.validate(batch)
        metrics.update(outputs, targets)  # implements the torchmetrics interface

    metrics.compute()
33+
34+
- The trainer iterates over ``eval_dataloader`` and passes each batch to the model's :meth:`.ComposerModel.validate` method.
35+
- The outputs of ``model.validate`` are used to update ``metrics`` (a :class:`torchmetrics.Metric` or :class:`torchmetrics.MetricCollection` returned by :meth:`model.metrics(train=False) <.ComposerModel.metrics>`).
36+
- Finally, metrics over the whole validation dataset are computed.
37+
38+
Note that the tuple returned by :meth:`.ComposerModel.validate` provides the positional arguments to ``metrics.update``.
39+
Please keep this in mind when using custom models and/or metrics.
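
The contract between ``validate`` and ``metrics.update`` can be illustrated with a minimal, dependency-free sketch. ``ToyAccuracy``, ``ToyModel`` and ``evaluate`` are hypothetical stand-ins invented for this example; they only mimic the shape of the interfaces described above and are not Composer or torchmetrics code.

```python
# Hypothetical stand-ins: these classes only mimic the shape of the
# interfaces described above; they are not Composer or torchmetrics code.


class ToyAccuracy:
    """Minimal metric with torchmetrics-style update/compute methods."""

    def __init__(self):
        self.correct = 0
        self.total = 0

    def update(self, outputs, targets):
        # Receives the two items of the tuple returned by validate(),
        # in the same order, as positional arguments.
        for predicted, expected in zip(outputs, targets):
            self.correct += int(predicted == expected)
            self.total += 1

    def compute(self):
        # Aggregate over every batch seen since construction.
        return self.correct / self.total if self.total else 0.0


class ToyModel:
    """Toy model that predicts 1 for positive inputs and 0 otherwise."""

    def metrics(self, train=False):
        return ToyAccuracy()

    def validate(self, batch):
        inputs, targets = batch
        outputs = [int(x > 0) for x in inputs]
        # The order of this tuple must match the signature of update().
        return outputs, targets


def evaluate(model, eval_dataloader):
    """Mirror of the pseudocode above, using the toy classes."""
    metrics = model.metrics(train=False)
    for batch in eval_dataloader:
        outputs, targets = model.validate(batch)
        metrics.update(outputs, targets)
    return metrics.compute()


# A plain list of (inputs, targets) pairs stands in for a dataloader.
eval_dataloader = [
    ([1.5, -2.0, 0.3], [1, 0, 0]),
    ([-1.0, 4.0], [0, 1]),
]
accuracy = evaluate(ToyModel(), eval_dataloader)
```

If a custom ``validate`` returned ``(targets, outputs)`` instead, a metric whose ``update`` treats its two arguments differently would silently compute the wrong value, which is why the ordering matters for custom models and metrics.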
2040

2141
Multiple Datasets
2242
-----------------
@@ -25,7 +45,7 @@ If there are multiple validation datasets that may have different metrics,
2545
use :class:`.Evaluator` to specify each pair of dataloader and metrics.
2646
This class is just a container for a few attributes:
2747

28-
- ``label``: a user-specified name for the metric.
48+
- ``label``: a user-specified name for the evaluator.
2949
- ``dataloader``: PyTorch :class:`~torch.utils.data.DataLoader` or our :class:`.DataSpec`.
3050
See :doc:`DataLoaders</trainer/dataloaders>` for more details.
3151
- ``metric_names``: list of names of metrics to track.
@@ -55,3 +75,5 @@ can be specified as in the following example:
5575
eval_dataloader=[glue_mrpc_task, glue_mnli_task],
5676
...
5777
)
78+
79+
Note that ``metric_names`` must be a subset of the metrics provided by the model in :meth:`.ComposerModel.metrics`.
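
The subset requirement can be checked before training starts. The helper below is a hypothetical sketch, not part of Composer, and the metric names are illustrative placeholders rather than names taken from a real model.

```python
def check_metric_names(metric_names, model_metric_names):
    """Return metric_names if all are provided by the model, else raise.

    Hypothetical helper for illustration; it is not a Composer function.
    """
    missing = sorted(set(metric_names) - set(model_metric_names))
    if missing:
        raise ValueError(f"Metrics not provided by the model: {missing}")
    return list(metric_names)


# Illustrative names only; a real model defines its own metrics.
available = ["Accuracy", "F1", "MatthewsCorrcoef"]
selected = check_metric_names(["Accuracy"], available)
```

Failing early with an explicit list of the missing names is usually easier to debug than a lookup error raised in the middle of an evaluation pass.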
