The GluMLP does not work without flash attention because the tensors are passed in a different shape. This PR fixes the issue. I also verified that the embeddings produced with and without flash attention are identical.
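For context, a minimal sketch of the shape issue, assuming the GluMLP follows the usual gated-linear-unit pattern (the class and layer names here are hypothetical, not the actual modules from this repository): the non-flash path passes padded `(batch, seq_len, hidden)` tensors, while the flash-attention path passes unpadded `(total_tokens, hidden)` tensors, so the gate/value split must be done on the last dimension to work in both cases.

```python
import torch
import torch.nn as nn

class GluMLP(nn.Module):
    """Gated-linear-unit MLP sketch; names are illustrative only."""

    def __init__(self, hidden_size: int, intermediate_size: int):
        super().__init__()
        # Single projection producing both the gate and the value halves.
        self.gated_layers = nn.Linear(hidden_size, intermediate_size * 2, bias=False)
        self.act = nn.GELU()
        self.wo = nn.Linear(intermediate_size, hidden_size)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Splitting on dim=-1 handles both layouts: padded
        # (batch, seq_len, hidden) from the non-flash path and
        # unpadded (total_tokens, hidden) from the flash path.
        gated = self.gated_layers(hidden_states)
        gate, value = gated.chunk(2, dim=-1)
        return self.wo(self.act(gate) * value)
```

A quick equivalence check in the spirit of the test described above, comparing the two layouts on the same inputs:

```python
mlp = GluMLP(hidden_size=768, intermediate_size=3072).eval()
padded = torch.randn(2, 5, 768)        # non-flash layout
packed = padded.reshape(-1, 768)       # flash-style unpadded layout
with torch.no_grad():
    assert torch.allclose(mlp(padded).reshape(-1, 768), mlp(packed), atol=1e-6)
```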

michael-guenther changed pull request status to open
LGTM!

michael-guenther changed pull request status to merged