Post 214
MobileCLIP2 Complete On-device Study: TMLR 2025 Featured Model on Mobile
Major Release: Comprehensive mobile deployment study of Apple's MobileCLIP2 (TMLR August 2025 Featured) with detailed performance benchmarks across 52+ mobile devices!
Model Overview:
- Training: Multi-modal reinforced training (vision + language)
- Research: TMLR 2025 Featured Certification
- Innovation: Improved efficiency-accuracy trade-offs vs SigLIP/OpenAI CLIP
- Specialty: Zero-shot image classification and retrieval (see the sketch below)
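If you haven't used a CLIP-style model before, the zero-shot setup is simple: embed an image and a set of candidate text labels into the same space, then pick the label with the highest similarity. Below is a minimal sketch using the open_clip Python API with a stand-in checkpoint; the actual MobileCLIP2 weights are distributed by Apple, and the on-device numbers in this post come from an exported model, not this desktop workflow.

```python
# Minimal CLIP-style zero-shot classification sketch (desktop, via open_clip).
# The model name / checkpoint tag are stand-ins, NOT the MobileCLIP2 deployment
# path benchmarked in this post; they only illustrate the workflow.
import torch
from PIL import Image
import open_clip

model, _, preprocess = open_clip.create_model_and_transforms(
    "ViT-B-32", pretrained="laion2b_s34b_b79k"  # stand-in checkpoint
)
tokenizer = open_clip.get_tokenizer("ViT-B-32")
model.eval()

image = preprocess(Image.open("photo.jpg")).unsqueeze(0)        # 1 x 3 x H x W
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
text = tokenizer(labels)

with torch.no_grad():
    img_feat = model.encode_image(image)
    txt_feat = model.encode_text(text)
    img_feat /= img_feat.norm(dim=-1, keepdim=True)             # unit-normalize
    txt_feat /= txt_feat.norm(dim=-1, keepdim=True)
    probs = (100.0 * img_feat @ txt_feat.T).softmax(dim=-1)     # cosine sim -> softmax

print(dict(zip(labels, probs.squeeze(0).tolist())))
```

The same image/text embeddings also power retrieval: embed a gallery of images once, then rank them by similarity to an embedded text query.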
Mobile Performance Results:
Latency Metrics:
- NPU (Best): 9.74ms average inference
- GPU: 39.00ms average
- CPU: 494.89ms average
- NPU Advantage: 115.94x speedup over CPU baseline!
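For reference, here is how speedup ratios fall out of the average latencies quoted above (simple ratio of baseline to accelerated latency); the 115.94x headline presumably comes from a per-device best-vs-worst comparison rather than these averages:

```python
# Speedup = baseline latency / accelerated latency, using the averages above.
cpu_ms, gpu_ms, npu_ms = 494.89, 39.00, 9.74

print(f"NPU vs CPU: {cpu_ms / npu_ms:.1f}x")  # ~50.8x
print(f"GPU vs CPU: {cpu_ms / gpu_ms:.1f}x")  # ~12.7x
print(f"NPU vs GPU: {gpu_ms / npu_ms:.1f}x")  # ~4.0x
```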
Memory Efficiency:
- Model Size: 1.66 GB (production optimized)
- Runtime Memory: 466.18 MB peak consumption
- Load Range: 0-1,884 MB across device categories
- Inference Range: 431-1,616 MB
Accuracy Preservation:
- FP16 Precision: 39.78 dB output fidelity maintained (see the SNR sketch below)
- Quantized Mode: 15.07 dB (INT quantization available)
- Zero-shot Quality: Production-grade vision-language matching
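The dB figures above read like a signal-to-noise comparison between the deployed model's outputs and a full-precision reference; that interpretation is an assumption on my part (the study link has the exact methodology), but a minimal sketch of such a metric looks like this:

```python
import numpy as np

def output_snr_db(reference: np.ndarray, candidate: np.ndarray) -> float:
    """SNR (in dB) of a candidate model's outputs against a full-precision reference.

    Higher is better: ~40 dB means the outputs are nearly indistinguishable,
    while ~15 dB indicates visible degradation from aggressive quantization.
    """
    noise = reference - candidate
    return 10.0 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))

# Hypothetical usage: outputs from an FP32 reference run vs. an FP16 or INT8 run.
# fp32_out, fp16_out = run_reference(image), run_quantized(image)
# print(f"{output_snr_db(fp32_out, fp16_out):.2f} dB")
```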
Research Highlights:
MobileCLIP2-S4 Performance:
- Matches SigLIP-SO400M/14 accuracy
- 2x fewer parameters
- 2.5x lower latency than DFN ViT-L/14
MobileCLIP-S0 Efficiency:
- Similar zero-shot performance to OpenAI ViT-B/16
- 4.8x faster inference
- 2.8x smaller model size
MobileCLIP-S2 Advantages:
- Better avg zero-shot than SigLIP ViT-B/16
- 2.3x faster, 2.1x smaller
- Trained on 3x fewer seen samples
MobileCLIP-B (LT) Accuracy:
- 77.2% ImageNet zero-shot
- Surpasses OpenAI ViT-L/14@336
- Outperforms comparable DFN and SigLIP architectures
Resources:
- Complete Study: https://mlange.zetic.ai/p/Steve/MobileCLIP2-image
Ready to build vision-language applications that run entirely on-device?
The future of multi-modal AI runs locally in everyone's pocket!