⚠Warning⚠ These are experimental weights. They may not perform well enough for practical use.
Also, to use these weights you must manually edit or replace the model file.
The model file is available here:
https://github.com/lucidrains/BS-RoFormer
The BS-RoFormer architecture has been updated for the first time in a while.
The 0.5.x update introduced a mechanism called "Value Residual Learning" (https://arxiv.org/abs/2410.17897).
The paper argues that this mechanism can reduce attention concentration (attention over-focusing on a few positions) and further reduce the vanishing gradient problem.
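
To make the idea concrete, here is a minimal NumPy sketch of a value residual, not the code from the BS-RoFormer repository. It assumes a single attention head, random weights, and a fixed scalar `mix` that blends each later layer's values with the first layer's values. The names `attention`, `first_values`, and `mix` are hypothetical, and the actual implementation may weight or learn the blend differently.

```python
import numpy as np


def softmax(x, axis=-1):
    # Subtract the row maximum for numerical stability.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(x, w_q, w_k, w_v, first_values=None, mix=0.5):
    """Single-head self-attention with an optional value residual.

    Returns (output, own_values), where own_values are this layer's
    values before any mixing.
    """
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    own_values = v
    if first_values is not None:
        # Value residual: blend this layer's values with the first layer's.
        # A fixed scalar mix is an assumption made for illustration.
        v = (1.0 - mix) * v + mix * first_values
    scores = q @ k.T / np.sqrt(q.shape[-1])
    return softmax(scores) @ v, own_values


rng = np.random.default_rng(0)
seq_len, dim, depth = 4, 8, 3
x = rng.standard_normal((seq_len, dim))
layers = [
    tuple(rng.standard_normal((dim, dim)) / np.sqrt(dim) for _ in range(3))
    for _ in range(depth)
]

first_values = None
for w_q, w_k, w_v in layers:
    out, values = attention(x, w_q, w_k, w_v, first_values)
    if first_values is None:
        first_values = values  # cache the first layer's values for later layers
    x = x + out  # ordinary residual connection on the hidden state
```

With `mix=0.0` the function reduces to ordinary attention, so the first layer's values act as a shortcut only when the mix is non-zero.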