johnlockejrr (John Locke)

reacted to victor's post with 🚀 2 days ago

Post

2754

Finally, an open-source AI that turns your lyrics into full songs is here—meet YuE! Unlike other tools that only create short clips, YuE can make entire songs (up to 5 minutes) with vocals, melody, and instruments all working together. Letsss go!

m-a-p/YuE-s1-7B-anneal-en-cot

liked a Space 3 days ago

Sleeping

31

🏢

HTRFLOW

htrflow demo app

New activity in Teklia/pylaia-belfort 11 days ago

PyLaia enhancement

2

#7 opened 13 days ago by

johnlockejrr

published a Space 13 days ago

Sleeping

📈

Yolo Pylaia

Samaritan OCR with YOLOv8 and PyLaia

New activity in cantillation/Teamim-small_Random_WeightDecay-0.05_Augmented_New-Data_date-02-08-2024 28 days ago

A little info

#1 opened 28 days ago by

johnlockejrr

reacted to singhsidhukuldeep's post with 🚀 about 1 month ago

Post

3665

Exciting breakthrough in AI: @Meta's new Byte Latent Transformer (BLT) revolutionizes language models by eliminating tokenization!

The BLT architecture introduces a groundbreaking approach that processes raw bytes instead of tokens, achieving state-of-the-art performance while being more efficient and robust. Here's what makes it special:

>> Key Innovations
Dynamic Patching: BLT groups bytes into variable-sized patches based on entropy, allocating more compute power where the data is more complex. This results in up to 50% fewer FLOPs during inference compared to traditional token-based models.
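
The idea of entropy-based patching can be illustrated with a minimal sketch. It is not the BLT implementation: the entropy model here is a bigram count table built from the input itself (an assumption standing in for a learned byte-level model), and the function names and the threshold rule are hypothetical. A new patch starts wherever the estimated next-byte entropy exceeds the threshold.

```python
import math
from collections import Counter, defaultdict


def next_byte_entropies(data: bytes) -> list[float]:
    """Entropy in bits of the next byte given the previous byte.

    Stand-in entropy model (assumption): bigram counts taken from `data`.
    The result has len(data) - 1 entries; entry k scores data[k + 1].
    """
    follow = defaultdict(Counter)
    for prev, nxt in zip(data, data[1:]):
        follow[prev][nxt] += 1
    entropies = []
    for i in range(1, len(data)):
        counts = follow[data[i - 1]]
        total = sum(counts.values())
        h = -sum((c / total) * math.log2(c / total) for c in counts.values())
        entropies.append(h)
    return entropies


def entropy_patches(data: bytes, threshold: float) -> list[bytes]:
    """Split `data` into variable-sized patches at high-entropy positions."""
    if not data:
        return []
    entropies = next_byte_entropies(data)
    patches = []
    start = 0
    for i in range(1, len(data)):
        # Hard-to-predict byte: close the current patch and start a new one.
        if entropies[i - 1] > threshold:
            patches.append(data[start:i])
            start = i
    patches.append(data[start:])
    return patches
```

Predictable runs end up in long patches and unpredictable regions in short ones, which is how more compute would be spent where the data is harder to model.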

Three-Component Architecture:
• Lightweight Local Encoder that converts bytes to patch representations
• Powerful Global Latent Transformer that processes patches
• Local Decoder that converts patches back to bytes

>> Technical Advantages
• Matches performance of Llama 3 at 8B parameters while being more efficient
• Superior handling of non-English languages and rare character sequences
• Remarkable 99.9% accuracy on spelling tasks
• Better scaling properties than token-based models

>> Under the Hood
The system uses an entropy model to determine patch boundaries, cross-attention mechanisms for information flow, and hash n-gram embeddings for improved representation. The architecture allows simultaneous scaling of both patch and model size while maintaining fixed inference costs.
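
As a rough illustration of hash n-gram embeddings, the sketch below maps each byte position to one bucket index per n-gram length. The hash function (`zlib.crc32`), the n-gram lengths, and the table size are all assumptions chosen for the example, not values from the post. In a model, each index would select a row of a learned embedding table, and the rows would be added to the byte's own embedding.

```python
import zlib


def ngram_bucket_ids(
    data: bytes,
    n_values: tuple[int, ...] = (3, 4, 5),
    table_size: int = 1000,
) -> list[dict[int, int]]:
    """For each position, hash the n-grams ending there into table buckets.

    Returns one dict per byte position, mapping n -> bucket index.
    Positions without enough preceding bytes simply omit that n.
    """
    ids = []
    for i in range(len(data)):
        row = {}
        for n in n_values:
            if i + 1 >= n:
                gram = data[i + 1 - n : i + 1]
                row[n] = zlib.crc32(gram) % table_size
        ids.append(row)
    return ids
```

Because the table has a fixed size, distinct n-grams can collide in one bucket; that is the usual trade-off of hashing instead of keeping an explicit n-gram vocabulary.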

This is a game-changer for multilingual AI and could reshape how we build future language models. Excited to see how this technology evolves!

3 replies

liked a Space about 2 months ago

Running

7

🏃

Ituria

New activity in Gabriel/Qwen2-VL-2B-Instruct about 2 months ago

Model inference

1

#1 opened about 2 months ago by

johnlockejrr

reacted to MohamedRashad's post with ❤️❤️ about 2 months ago

Post

1661

A while back I shared this model MohamedRashad/arabic-small-nougat, a fine-tune of facebook/nougat-small for the Arabic language.

Today this humble project has been scaled up with new models, new datasets, a new Space, and a new paper.

Check everything through this collection here:
MohamedRashad/arabic-nougat-673a3f540bd92904c9b92a8e

1 reply

·

upvoted a paper about 2 months ago

Arabic-Nougat: Fine-Tuning Vision Transformers for Arabic OCR and Markdown Extraction

Paper • 2411.17835 • Published Nov 19, 2024 • 3

New activity in MohamedRashad/arabic-small-nougat 2 months ago

Arabic Small Nougat

10

#1 opened 9 months ago by

johnlockejrr

reacted to MohamedRashad's post with 🤗🚀 2 months ago

upvoted an article 2 months ago

Article

HTRflow - A tool for HTR and OCR

Oct 1, 2024 • 15

reacted to jsulz's post with 🔥 2 months ago

Post

2926

When the XetHub crew joined Hugging Face this fall, @erinys and I started brainstorming how to share our work to replace Git LFS on the Hub. Uploading and downloading large models and datasets takes precious time. That’s where our chunk-based approach comes in.

Instead of versioning files (like Git and Git LFS), we version variable-sized chunks of data. For the Hugging Face community, this means:

⏩ Only upload the chunks that changed.
🚀 Download just the updates, not the whole file.
🧠 We store your file as deduplicated chunks.
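
A toy sketch of chunk-level deduplication follows. It is not the Hub's storage backend: the gear-style rolling hash, the `mask`, `min_size`, and `max_size` values, and the `store` helper are all hypothetical choices made for illustration. Chunk boundaries are picked from the content itself, each chunk is keyed by its SHA-256 digest, and only chunks not already in the store count as uploads.

```python
import hashlib
import random

# Fixed pseudo-random table for the rolling hash (illustrative assumption).
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]


def cdc_chunks(data: bytes, mask: int = 0x3F,
               min_size: int = 16, max_size: int = 256) -> list[bytes]:
    """Split `data` at positions where the rolling hash matches `mask`."""
    chunks = []
    start = 0
    h = 0
    for i, b in enumerate(data):
        h = ((h << 1) + GEAR[b]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= min_size and (h & mask) == 0) or size >= max_size:
            chunks.append(data[start : i + 1])
            start = i + 1
            h = 0
    if start < len(data):
        chunks.append(data[start:])
    return chunks


def store(data: bytes, chunk_store: dict[str, bytes]) -> tuple[list[str], int]:
    """Add `data` to `chunk_store`; return its chunk recipe and upload count."""
    recipe = []
    uploaded = 0
    for chunk in cdc_chunks(data):
        key = hashlib.sha256(chunk).hexdigest()
        if key not in chunk_store:
            chunk_store[key] = chunk  # only unseen chunks are "uploaded"
            uploaded += 1
        recipe.append(key)
    return recipe, uploaded
```

A file is reconstructed by joining the stored chunks in recipe order. Storing the same bytes a second time uploads nothing, since every chunk key is already present.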

In our benchmarks, we found that using content-defined chunking (CDC) to store iterative model and dataset versions led to transfer speedups of ~2x, but this isn’t just a performance boost. It’s a rethinking of how we manage models and datasets on the Hub.

We're planning to roll out our new storage backend to the Hub in early 2025 - check out our blog to dive deeper, and let us know: how could this improve your workflows?

https://huggingface.co/blog/from-files-to-chunks