Yeonseok Kim PRO

yeonseok-zeticai

AI & ML interests

On-device AI with mobile hardware utilization

Recent Activity

reacted to their post with 🧠 14 days ago
⚡ MobileCLIP2 Complete On-device Study: TMLR 2025 Featured Model on Mobile

Major Release: A comprehensive mobile deployment study of Apple's MobileCLIP2 (TMLR August 2025 Featured) with detailed performance benchmarks across 52+ mobile devices!

🎯 Model Overview:
- Architecture: Multi-modal reinforced training (vision + language)
- Research: TMLR 2025 Featured Certification
- Innovation: Improved efficiency-accuracy trade-offs vs. SigLIP and OpenAI CLIP
- Specialty: Zero-shot image classification and retrieval

📊 Mobile Performance Results:

Latency Metrics:
- NPU (best): 9.74 ms average inference
- GPU: 39.00 ms average
- CPU: 494.89 ms average
- NPU advantage: 115.94x speedup over the CPU baseline!

Memory Efficiency:
- Model Size: 1.66 GB (production optimized)
- Runtime Memory: 466.18 MB peak consumption
- Load Range: 0-1,884 MB across device categories
- Inference Range: 431-1,616 MB

Accuracy Preservation:
- FP16 Precision: 39.78 dB maintained
- Quantized Mode: 15.07 dB (INT optimization available)
- Zero-shot Quality: Production-grade vision-language matching

🏆 Research Highlights:

MobileCLIP2-S4 Performance:
- Matches SigLIP-SO400M/14 accuracy
- 2x fewer parameters
- 2.5x lower latency than DFN ViT-L/14

MobileCLIP-S0 Efficiency:
- Similar zero-shot performance to OpenAI ViT-B/16
- 4.8x faster inference
- 2.8x smaller model size

MobileCLIP-S2 Advantages:
- Better average zero-shot performance than SigLIP ViT-B/16
- 2.3x faster, 2.1x smaller
- Trained on 3x fewer seen samples

MobileCLIP-B (LT) Accuracy:
- 77.2% ImageNet zero-shot accuracy
- Surpasses OpenAI ViT-L/14@336
- Better than comparable DFN and SigLIP architectures

🔗 Resources:
- Complete study: https://mlange.zetic.ai/p/Steve/MobileCLIP2-image

Ready to build vision-language applications that run entirely on-device? The future of multi-modal AI runs locally in everyone's pocket! 🚀
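As a rough illustration of the zero-shot image classification workflow the post refers to, here is a minimal desktop-side sketch using the open_clip library. The model identifier "MobileCLIP-S2", the pretrained tag, the image path, and the label set are assumptions for illustration only, not artifacts of the study; the on-device NPU/GPU deployment benchmarked above runs through ZETIC.MLange rather than this Python path.

```python
# Minimal sketch: zero-shot classification with a CLIP-style model via open_clip.
# Assumptions: the "MobileCLIP-S2" / "datacompdr" identifiers, the image path,
# and the candidate labels are illustrative; check open_clip.list_pretrained()
# for the checkpoint names your installed version actually exposes.
import torch
from PIL import Image
import open_clip

model, _, preprocess = open_clip.create_model_and_transforms(
    "MobileCLIP-S2",          # assumed model identifier
    pretrained="datacompdr",  # assumed pretrained tag
)
tokenizer = open_clip.get_tokenizer("MobileCLIP-S2")
model.eval()

# Encode one image and a small set of candidate text labels.
image = preprocess(Image.open("example.jpg").convert("RGB")).unsqueeze(0)
labels = ["a photo of a cat", "a photo of a dog", "a photo of a car"]
text = tokenizer(labels)

with torch.no_grad():
    image_features = model.encode_image(image)
    text_features = model.encode_text(text)
    # Normalize embeddings so the dot product is a cosine similarity.
    image_features /= image_features.norm(dim=-1, keepdim=True)
    text_features /= text_features.norm(dim=-1, keepdim=True)
    probs = (100.0 * image_features @ text_features.T).softmax(dim=-1)

for label, p in zip(labels, probs[0].tolist()):
    print(f"{label}: {p:.3f}")
```

The same encode-and-compare structure carries over to retrieval: embed a gallery of images once, then rank them by cosine similarity against an embedded text query.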
reacted to their post with 🤗 14 days ago
reacted to their post with โค๏ธ 14 days ago

Organizations

ZETIC.ai On-device AI