Fix path of the states in SeedGenerator and tracking of torch_params #19495
Conversation
Codecov Report. Attention: Patch coverage is

Additional details and impacted files:

| | master | #19495 | +/- |
|---|---|---|---|
| Coverage | 76.25% | 76.25% | |
| Files | 367 | 367 | |
| Lines | 41195 | 41226 | +31 |
| Branches | 8066 | 8077 | +11 |
| Hits | 31413 | 31438 | +25 |
| Misses | 8060 | 8060 | |
| Partials | 1722 | 1728 | +6 |

Flags with carried forward coverage won't be shown. View full report in Codecov by Sentry.
Thanks for the PR!
Uniquifying paths for seed generator states is a good idea.
    def _track_variables(self):
-       self.torch_params = torch.nn.ParameterList(
-           [variable.value for variable in self.variables]
+       self.torch_params = torch.nn.ParameterDict(
The dict is more readable, but it's unsafe -- there's no guarantee that variable names are unique within a model (except for Functional models). It's entirely possible to create models with duplicate variable paths, which would cause tracking issues above. So the list is preferable.
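A hedged illustration of this point (not from the PR): in a subclassed model, explicitly passed layer names are not auto-uniquified, so two sibling sub-layers can produce colliding variable paths that a path-keyed dict could not tell apart. The class and layer names below are made up for the example.

```python
import numpy as np
from keras import layers, models


class TwoSameName(models.Model):
    def __init__(self, **kwargs):
        super().__init__(**kwargs)
        # Explicitly passed names are not auto-uniquified,
        # so both sub-layers are called "dense".
        self.d1 = layers.Dense(4, name="dense")
        self.d2 = layers.Dense(4, name="dense")

    def call(self, x):
        return self.d2(self.d1(x))


model = TwoSameName()
model(np.zeros((1, 8)))  # build both sub-layers
# The two kernels (and biases) can end up sharing a path, so a dict
# keyed by variable path would silently overwrite one of them.
print([v.path for v in model.weights])
```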
It should be safe to use `id(variable)` as a key for `torch.nn.ParameterDict`. I believe `BaseOptimizer` adopts the same approach to get the mapping of variables. I could not find a way to safely add/remove variables with `torch.nn.ParameterList` (`in` and `remove` are not supported for `KerasVariable`).
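A minimal, self-contained sketch of why `id()` keys work here (plain torch, no Keras; illustration only):

```python
import torch

# Two parameters that could, in principle, share the same Keras path.
p1 = torch.nn.Parameter(torch.zeros(3))
p2 = torch.nn.Parameter(torch.ones(3))

params = torch.nn.ParameterDict()
# ParameterDict keys must be strings, so the integer id() is stringified.
params[str(id(p1))] = p1
params[str(id(p2))] = p2
assert len(params) == 2  # no collision even if the variable paths were equal

# Unlike ParameterList, entries can be removed again, which is what
# int8 quantization needs when dropping the old floating-point kernel.
del params[str(id(p1))]
assert len(params) == 1
```

Because `id()` is unique among live Python objects, two variables that happen to share a path still get distinct keys, and removal stays a simple dict deletion.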
What's the original issue with using a list though?
The problem is that `torch.nn.ParameterList` cannot remove elements. This causes a failure in int8 quantization because we need to remove the old floating-point `_kernel`. Also, it is hard to determine whether it is safe to `append` a new variable, because we cannot check whether the item is already in the list.
from keras import layers
layer = layers.Dense(units=16)
layer.build([None, 8])
assert len(layer.torch_params) == len(layer.variables)
layer.enable_lora(rank=2)
assert len(layer.torch_params) == len(layer.variables) # <--
| master | this pr |
|---|---|
| Failed at line 8 (4 vs. 2) | Success |
The biggest issue is that `zero_grad()` will fail to reset the uncaptured variables. I believe the current LoRA implementation will not be trained correctly on the torch backend.
To demonstrate the `zero_grad` error using LoRA:
import torch
from keras import layers

layer = layers.Dense(units=3)
layer.build([None, 2])
layer.enable_lora(rank=1)

def list_grads(layer):
    grads = dict()
    for v in layer.trainable_variables:
        grads[v.name] = v.value.grad
    return grads

# Fake grads
for v in layer.trainable_variables:
    v.value.grad = torch.rand_like(v.value)
print(list_grads(layer))
layer.zero_grad()
print(list_grads(layer))
# master branch
{'bias': tensor([0.6259, 0.4827, 0.6012], device='cuda:0'), 'lora_kernel_a': tensor([[0.6620],
[0.7231]], device='cuda:0'), 'lora_kernel_b': tensor([[0.7123, 0.9257, 0.1676]], device='cuda:0')}
{'bias': None, 'lora_kernel_a': tensor([[0.6620],
[0.7231]], device='cuda:0'), 'lora_kernel_b': tensor([[0.7123, 0.9257, 0.1676]], device='cuda:0')}
# this pr
{'bias': tensor([0.5960, 0.2336, 0.1569], device='cuda:0'), 'lora_kernel_a': tensor([[0.9123],
[0.9217]], device='cuda:0'), 'lora_kernel_b': tensor([[0.3435, 0.9276, 0.6599]], device='cuda:0')}
{'bias': None, 'lora_kernel_a': None, 'lora_kernel_b': None}
Thanks for the update!
        self.sparse = sparse

        if self.bin_boundaries:
            self.built = True
Why this change?
Explicitly setting `self.built = True` in `__init__` will fail the `run_build_asserts` of the symbolic call test in `run_layer_test`. The root cause is that `self.torch_params` will not be initialized. I'm unsure where exactly the issue is, but it should be acceptable to leave setting `self.built = True` to the `build` method.
from keras import layers
inputs = layers.Input([2])
layer = layers.Dropout(rate=0.2)
layer(inputs)
assert len(layer.torch_params) == len(layer.variables)
This script fails on the master branch but works fine with this PR.
        if rate > 0:
            self.seed_generator = backend.random.SeedGenerator(seed)
        self.supports_masking = True
        self.built = True
Why this change?
Same as above
"""Can be overridden for per backend post track actions.""" | ||
pass | ||
|
||
def _post_untrack_variable(self, variable): |
Two hooks have been introduced to enable post-processing of `_track_variable` and `_untrack_variable` in the torch backend.
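A rough sketch of how such hooks can be wired up (illustration only; the class layout and attribute names below are assumed, not copied from the PR, and `variable.value` is assumed to be a `torch.nn.Parameter`):

```python
import torch


class BaseLayer:
    """Backend-agnostic variable tracking with overridable hooks."""

    def __init__(self):
        self._variables = []

    def _track_variable(self, variable):
        self._variables.append(variable)
        self._post_track_variable(variable)

    def _untrack_variable(self, variable):
        self._variables.remove(variable)
        self._post_untrack_variable(variable)

    def _post_track_variable(self, variable):
        """Can be overridden for per backend post track actions."""
        pass

    def _post_untrack_variable(self, variable):
        """Can be overridden for per backend post untrack actions."""
        pass


class TorchLayer(BaseLayer):
    """Torch-side subclass keeps `torch_params` in sync with tracking."""

    def __init__(self):
        super().__init__()
        self.torch_params = torch.nn.ParameterDict()

    def _post_track_variable(self, variable):
        # Reuses the id-keyed ParameterDict idea from earlier in the thread.
        self.torch_params[str(id(variable))] = variable.value

    def _post_untrack_variable(self, variable):
        key = str(id(variable))
        if key in self.torch_params:
            del self.torch_params[key]
```

Keeping the base implementations as no-ops means other backends pay no cost, while the torch backend gets a single place to keep `torch_params` consistent with tracking and untracking.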
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
This PR:
- fixes the path of the states in `SeedGenerator`
- adds a `seed_generators` attr to `Layer` (primarily for testing purposes)
- fixes the tracking of `torch_params` in the torch backend
- updates `torch_params` in `_track_variable` and `_untrack_variable`

In the current codebase, the fact that all states in the seed generator share an identical path, `seed_generator_state`, should be considered an issue. Additionally, I have found some incorrect `expected_num_seed_generators` in `run_layer_test` within certain tests.
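A hedged way to observe the path issue described above (illustration only; exact paths depend on the Keras version, it assumes `Model.variables` exposes seed generator state, and the claim about the shared path comes from the PR description rather than from running this snippet):

```python
import numpy as np
import keras
from keras import layers

# Build a small model with two layers that each own a SeedGenerator.
inputs = keras.Input([4])
x = layers.Dropout(rate=0.5)(inputs)
outputs = layers.GaussianDropout(rate=0.5)(x)
model = keras.Model(inputs, outputs)
model(np.zeros((1, 4)))

# Print the paths of the seed generator state variables. Before this PR,
# both states reportedly shared the single path "seed_generator_state";
# with unique paths, the two states can be told apart (e.g. when saving).
print([v.path for v in model.variables if "seed_generator" in v.path])
```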