The ##word should not be predicted  #5

@AnblueWang

In the BERT paper, it seems that word pieces starting with '##' should not be predicted. You do compute the is_head variable, so why is it not used when computing the loss?
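To make the question concrete, here is a minimal sketch of what using the head mask in the loss could look like. It is plain Python rather than the repository's actual training code, and the names `masked_cross_entropy`, `logits`, `labels`, and `is_heads` are hypothetical, chosen only for illustration. It assumes `is_heads` holds 1 for the first word piece of each word and 0 for '##' continuation pieces.

```python
import math

def masked_cross_entropy(logits, labels, is_heads):
    """Mean cross-entropy over head positions only (hypothetical helper).

    logits:   list of per-token score lists, one score per tag
    labels:   list of gold tag indices, one per token
    is_heads: list of 0/1 flags; 0 marks a '##' continuation piece
    """
    total, count = 0.0, 0
    for row, label, head in zip(logits, labels, is_heads):
        if not head:
            continue  # '##' pieces contribute nothing to the loss
        # Numerically stable log-sum-exp for the softmax normaliser.
        m = max(row)
        log_z = m + math.log(sum(math.exp(x - m) for x in row))
        total += log_z - row[label]  # negative log-likelihood of the gold tag
        count += 1
    # Assumption: return 0.0 when the sequence has no head tokens at all.
    return total / count if count else 0.0

# "playing" split into ["play", "##ing"]: only "play" is scored.
loss = masked_cross_entropy(
    logits=[[0.0, 0.0], [5.0, -5.0]],
    labels=[0, 1],
    is_heads=[1, 0],
)
```

With this masking, whatever the model outputs for the '##ing' position does not change the loss. In PyTorch, a similar effect is commonly obtained by setting the labels of non-head positions to the `ignore_index` value of `torch.nn.CrossEntropyLoss` (default -100); whether that fits this repository's label encoding is an assumption.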
