Garreth Lee
PRO
garrethlee
15 followers · 89 following
garrethleee
AI & ML interests
None yet
Recent Activity
Reacted to jsulz's post with 🔥 • 3 days ago
When the XetHub crew joined Hugging Face this fall, @erinys and I started brainstorming how to share our work to replace Git LFS on the Hub.

Uploading and downloading large models and datasets takes precious time. That's where our chunk-based approach comes in. Instead of versioning files (like Git and Git LFS), we version variable-sized chunks of data. For the Hugging Face community, this means:
⏩ Only upload the chunks that changed.
🚀 Download just the updates, not the whole file.
🧠 We store your file as deduplicated chunks.

In our benchmarks, we found that using content-defined chunking (CDC) to store iterative model and dataset versions led to transfer speedups of ~2x, but this isn't just a performance boost. It's a rethinking of how we manage models and datasets on the Hub.

We're planning to bring our new storage backend to the Hub in early 2025 - check out our blog to dive deeper, and let us know: how could this improve your workflows? https://huggingface.co/blog/from-files-to-chunks
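The chunk-level versioning described in the post can be illustrated with a toy content-defined chunker. This is only a sketch under stated assumptions, not the Xet team's implementation: the gear-style rolling hash, the `MASK`, `MIN_SIZE`, and `MAX_SIZE` values, and the helper names `split_chunks`, `store`, and `restore` are all hypothetical choices made for illustration.

```python
import hashlib
import random

# Hypothetical parameters, chosen only for illustration.
_rng = random.Random(0)
GEAR = [_rng.getrandbits(32) for _ in range(256)]  # one random value per byte value
MASK = 0xFFC00000  # top 10 bits: a cut roughly every 1 KiB past MIN_SIZE
MIN_SIZE = 64      # never cut a chunk shorter than this
MAX_SIZE = 8192    # always cut once a chunk reaches this size


def split_chunks(data: bytes) -> list[bytes]:
    """Cut data wherever a rolling hash of the most recent bytes matches MASK."""
    chunks, start, h = [], 0, 0
    for i, byte in enumerate(data):
        # Gear-style update: older bytes shift out of the 32-bit state.
        h = ((h << 1) + GEAR[byte]) & 0xFFFFFFFF
        size = i - start + 1
        if (size >= MIN_SIZE and (h & MASK) == 0) or size >= MAX_SIZE:
            chunks.append(data[start:i + 1])
            start, h = i + 1, 0
    if start < len(data):
        chunks.append(data[start:])  # trailing partial chunk
    return chunks


def store(data: bytes, chunk_store: dict[str, bytes]) -> tuple[list[str], int]:
    """Add unseen chunks to chunk_store; return the chunk-hash list and the number of new chunks."""
    manifest, new = [], 0
    for chunk in split_chunks(data):
        key = hashlib.sha256(chunk).hexdigest()
        if key not in chunk_store:
            chunk_store[key] = chunk
            new += 1
        manifest.append(key)
    return manifest, new


def restore(manifest: list[str], chunk_store: dict[str, bytes]) -> bytes:
    """Rebuild a file from its ordered list of chunk hashes."""
    return b"".join(chunk_store[key] for key in manifest)
```

Because cut points depend on the bytes themselves rather than on fixed offsets, inserting a few bytes into the middle of a file should change only the chunks near the edit; storing the new version with `store` then adds just those chunks, and everything else is already present in `chunk_store`.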
liked a Space • 3 days ago
xet-team/lfs-analysis
New activity • 4 days ago
garrethlee/comprehensive-arithmetic-problems-carries: Convert dataset to Parquet
Organizations
garrethlee's activity
liked a Space • 3 days ago
Running • 16 • 📈 Hub LFS Analysis
An analysis of LFS files on the Hub.
liked a model • 6 days ago
GoToCompany/gemma2-9b-cpt-sahabatai-v1-instruct • Updated 18 days ago • 1.48k • 22
liked a Space • 8 days ago
Running on Zero • 2 • 😻 Sahabat-AI Chatbot (Gemma2 9b)
Chatbot
liked 2 datasets • 9 days ago
indolem/IndoMMLU • Updated Oct 11, 2023 • 23.1k • 13
PleIAs/common_corpus • Viewer • Updated 1 day ago • 397M • 39.6k • 154
liked a Space • about 1 month ago
Running • 43 • 📝 Scaling FineWeb to 1000+ languages: Step 1: finding signal in 100s of evaluation tasks
liked 2 Spaces • about 2 months ago
Running • 89 • 📖 TxT360: Trillion Extracted Text
Running on CPU Upgrade • 817 • 🚀 Model Memory Utility
liked a Space • 3 months ago
Running • 538 • 🍷 FineWeb: decanting the web for the finest text data at scale
liked a model • 8 months ago
mistralai/Mistral-7B-Instruct-v0.2 • Text Generation • Updated Sep 27 • 820k • 2.58k