Can't run bert_vocab_from_dataset without TypeError: Tensor is unhashable #61781

@mihalt

Description

Issue type

Support

Have you reproduced the bug with TensorFlow Nightly?

No

Source

source

TensorFlow version

2.13

Custom code

Yes

OS platform and distribution

No response

Mobile device

No response

Python version

No response

Bazel version

No response

GCC/compiler version

No response

CUDA/cuDNN version

No response

GPU model and memory

No response

Current behavior?

This code is taken from your manual, and I don't understand why I get this error. What causes it?

If I add
tf.compat.v1.disable_eager_execution()
tf.compat.v1.disable_v2_behavior()

I get
RuntimeError: input_dataset: Attempting to capture an EagerTensor without building a function.

Standalone code to reproduce the issue

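import tensorflow as tf
from tensorflow_text.tools.wordpiece_vocab import bert_vocab_from_dataset as bert_vocab

# SENTENCES_PATH and TAGS_PATH are paths to text files; their values are not shown in this report
data = tf.data.TextLineDataset([SENTENCES_PATH, TAGS_PATH])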

tokens = bert_vocab.bert_vocab_from_dataset(
    data,
    # The target vocabulary size
    vocab_size=50000,
    # Reserved tokens that must be included in the vocabulary
    reserved_tokens=["[PAD]", "[UNK]", "[START]", "[END]"],
    # Arguments for `text.BertTokenizer`
    bert_tokenizer_params=dict(lower_case=True),
    # Arguments for `wordpiece_vocab.wordpiece_tokenizer_learner_lib.learn`
    learn_params={},
)

Relevant log output

TypeError: Tensor is unhashable. Instead, use tensor.ref() as the key.
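
For context on the message itself: in TF 2.x, a `tf.Tensor` raises this `TypeError` whenever it is hashed, for example when it is used as a dict key or counted with `collections.Counter`. The sketch below only illustrates that mechanism in plain Python. `FakeTensor` and `count_items` are hypothetical stand-ins written for this illustration, not TensorFlow or TensorFlow Text code, and the sketch does not establish where inside `bert_vocab_from_dataset` the hashing takes place.

```python
import collections


class FakeTensor:
    """Hypothetical stand-in for an object that refuses to be hashed."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, FakeTensor) and self.value == other.value

    def __hash__(self):
        # Mirrors the wording of the error in the log above.
        raise TypeError(
            "Tensor is unhashable. Instead, use tensor.ref() as the key."
        )


def count_items(items):
    """Count occurrences; every item must be hashable to become a Counter key."""
    counts = collections.Counter()
    counts.update(items)
    return counts


# Plain bytes are hashable, so counting works.
word_counts = count_items([b"hello", b"hello", b"world"])

# An unhashable wrapper object fails as soon as it is used as a key.
try:
    count_items([FakeTensor(b"hello")])
    error_message = None
except TypeError as exc:
    error_message = str(exc)
```

If the same thing happens here, some step is receiving whole tensor objects where it expects plain hashable values such as bytes. That is an assumption about this report, not a confirmed diagnosis.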

Metadata

Labels

TF 2.13 (For issues related to Tensorflow 2.13)
comp:apis (Highlevel API related issues)
stale (This label marks the issue/pr stale - to be closed automatically if no activity)
stat:awaiting response (Status - Awaiting response from author)
type:support (Support issues)
