
⚠Warning⚠ These are experimental weights. They may not reach practical performance.
Also, the model file must be manually rewritten or replaced to use these weights.

The model file is available here:
https://github.com/lucidrains/BS-RoFormer

The BS-RoFormer architecture has been updated for the first time in a while.
The 0.5.x update introduced a mechanism called "Value Residual Learning" (https://arxiv.org/abs/2410.17897).
The paper argues that this mechanism reduces the over-concentration of attention and further mitigates the vanishing gradient problem.
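
To make the idea concrete, here is a minimal pure-Python sketch of a value residual: each layer after the first blends its own attention values with the values computed in the first layer before applying attention. This is an illustration only, not the code from the BS-RoFormer repository. The function names and the fixed scalar `mix=0.5` are assumptions made for this example; an actual implementation may use a learned mixing weight instead.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of floats.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def mix_values(v_current, v_first, mix=0.5):
    # Hypothetical helper: blend this layer's values with the first layer's values.
    # `mix` is a fixed scalar here (an assumption for illustration).
    return [
        [mix * a + (1.0 - mix) * b for a, b in zip(row_cur, row_first)]
        for row_cur, row_first in zip(v_current, v_first)
    ]

def attention(q, k, v):
    # Plain single-head scaled dot-product attention over lists of vectors.
    scale = 1.0 / math.sqrt(len(q[0]))
    out = []
    for qi in q:
        scores = [scale * sum(a * b for a, b in zip(qi, kj)) for kj in k]
        weights = softmax(scores)
        out.append([sum(w * vj[d] for w, vj in zip(weights, v))
                    for d in range(len(v[0]))])
    return out

def attention_with_value_residual(q, k, v, v_first=None, mix=0.5):
    # Returns (output, values_to_share_with_later_layers).
    # The first layer is called with v_first=None and shares its own values.
    if v_first is None:
        return attention(q, k, v), v
    return attention(q, k, mix_values(v, v_first, mix)), v_first
```

In use, the first layer is called without `v_first`, and the values it returns are passed to every later layer, so each later layer keeps direct access to the first layer's values.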
