Article Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference Jan 16 • 75
Article CPU Optimized Embeddings with 🤗 Optimum Intel and fastRAG Mar 15, 2024 • 11
Article CPU Optimized Embeddings with 🤗 Optimum Intel and fastRAG (Chinese translation) Mar 15, 2024
Article Accelerating SD Turbo and SDXL Turbo Inference with ONNX Runtime and Olive Jan 15, 2024 • 7
Article AMD + 🤗: Large Language Models Out-of-the-Box Acceleration with AMD GPU Dec 5, 2023 • 4
Article Optimum-NVIDIA - Unlock blazingly fast LLM inference in just 1 line of code Dec 5, 2023 • 5
Article Accelerating over 130,000 Hugging Face models with ONNX Runtime Oct 4, 2023 • 1
Article Case Study: Millisecond Latency using Hugging Face Infinity and modern CPUs Jan 13, 2022 • 2
Article Introducing Optimum: The Optimization Toolkit for Transformers at Scale Sep 14, 2021 • 2