What is the actual max-model-len for this model?

#1
by vrdn23 - opened

Hi,
I was curious about an apparent inconsistency in the max-model-len for this model.
The README says the maximum model length is 1024 tokens, but the encoder appears to be a DeBERTa model with only 512 positional embeddings. I would assume this means it cannot be used reliably on inputs longer than 512 tokens, or did I misunderstand how GLiNER operates internally?

cc @Ihor

Knowledgator Engineering org

Hi @vrdn23, DeBERTa uses relative positional encoding, so attention depends on the distance between tokens rather than on their absolute positions. Although the encoder was pre-trained on a 512-token window, it can extrapolate beyond that window. Our model was trained with a 1024-token window, but it can also extrapolate to longer sequences.
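
To illustrate why relative positions are not tied to a fixed input length, here is a minimal sketch based on the clipped relative-distance formulation described in the DeBERTa paper. The function name `relative_index` and the values of `k` and `seq_len` are illustrative assumptions, not values read from this model, and the actual encoder may bucket distances differently (for example, logarithmically).

```python
def relative_index(i: int, j: int, k: int) -> int:
    """Map the offset (i - j) to an index in [0, 2k), clipping long distances."""
    delta = i - j
    if delta <= -k:
        return 0
    if delta >= k:
        return 2 * k - 1
    return delta + k


k = 512         # assumed maximum relative distance (hypothetical value)
seq_len = 1024  # input longer than the 512-token pre-training window

# Collect the index for every offset that can occur in a seq_len-token input.
offsets = range(-(seq_len - 1), seq_len)
indices = {relative_index(d, 0, k) for d in offsets}

# All indices stay inside a table of size 2k, whatever the input length.
print(min(indices), max(indices))
```

Every offset maps into the same fixed-size table, so no position lookup fails at 1024 tokens. Distances beyond `k` simply share the outermost entries; how well accuracy holds on much longer inputs is an empirical question.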
