Integration in transformers lib.
#27 opened by sudhir2016
When do you plan to integrate this into the transformers library as a pipeline function?
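For context, a minimal sketch of how the model can already be used through the text-generation pipeline while waiting for native support; the checkpoint id and the `trust_remote_code` flag are assumptions based on how custom-code checkpoints are typically loaded, not the final integration:

```python
from transformers import pipeline

# Sketch only: model id and flags are assumptions, not the final integration.
# Until phi ships natively in transformers, the checkpoint's own remote
# modeling code can be loaded by passing trust_remote_code=True.
generator = pipeline(
    "text-generation",
    model="microsoft/phi-1_5",
    trust_remote_code=True,
)

print(generator("def fibonacci(n):", max_new_tokens=64)[0]["generated_text"])
```

Once the integration lands, the same call should presumably work without `trust_remote_code`.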
On behalf of the transformers team, we'd be happy to help with the integration within the library if there is interest from @gugarosa or @suriyagunasekar 🤗

Thank you!!
Will it support fine-tuning these models, such as phi-1 and phi-1.5?
Currently, during my fine-tuning run, I encountered this warning:
`attention_mask` is not supported during training. Using it might lead to unexpected results.
```
{'loss': 1.3228, 'learning_rate': 1.999875577156579e-05, 'epoch': 0.02}
  1%|▏ | 300/59745 [06:19<20:47:29, 1.26s/it]`attention_mask` is not supported during training. Using it might lead to unexpected results.
  1%|▏ | 301/59745 [06:20<20:48:14, 1.26s/it]`attention_mask` is not supported during training. Using it might lead to unexpected results.
  1%|▏ | 302/59745 [06:22<20:48:01, 1.26s/it]`attention_mask` is not supported during training. Using it might lead to unexpected results.
  1%|▏ | 303/59745 [06:23<20:47:31, 1.26s/it]`attention_mask` is not supported during training. Using it might lead to unexpected results.
  1%|▏ | 304/59745 [06:24<20:48:13, 1.26s/it]`attention_mask` is not supported during training. Using it might lead to unexpected results.
  1%|▏ | 305/59745 [06:25<20:49:27, 1.26s/it]`attention_mask` is not supported during training. Using it might lead to unexpected results.
  1%|▏ | 306/59745 [06:27<20:48:52, 1.26s/it]`attention_mask` is not supported during training. Using it might lead to unexpected results.
  1%|▏ | 307/59745 [06:28<20:48:29, 1.26s/it]`attention_mask` is not supported during training. Using it might lead to unexpected results.
  1%|▏ | 308/59745 [06:29<20:49:14, 1.26s/it]`attention_mask` is not supported during training. Using it might lead to unexpected results.
  1%|▏ | 309/59745 [06:30<20:49:49, 1.26s/it]`attention_mask` is not supported during training. Using it might lead to unexpected results.
{'loss': 1.5263, 'learning_rate': 1.9998671442394832e-05, 'epoch': 0.02}
```
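For reference, a minimal sketch of the kind of setup that produces this warning (the model id, prompts, and loss masking are placeholders, not my exact training script): padding a batch creates an `attention_mask`, and the older remote modeling code warns when it is passed during training.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch only: model id and prompts are placeholders.
model_id = "microsoft/phi-1_5"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True)
model.train()

# Padding a batch produces an attention_mask; passing it to the model during
# training is what triggers the warning with the older remote modeling code.
batch = tokenizer(
    ["def add(a, b):", "print('hello world')"],
    padding=True,
    return_tensors="pt",
)
labels = batch["input_ids"].clone()
labels[batch["attention_mask"] == 0] = -100  # ignore padded positions in the loss

outputs = model(**batch, labels=labels)
outputs.loss.backward()
```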
Hi @SinclairWang, yes, it will support `attention_mask`, so you won't get this warning.
Hello @SinclairWang! Until phi is fully integrated into transformers, we added support for training/fine-tuning with `attention_mask` in the files located in this repository. You should not get the warning anymore if you are using the latest revision.
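A quick way to check, as a sketch (the `revision` pin and prompts are illustrative): reload the model so the updated remote files are pulled, then run the same padded forward pass; the `attention_mask` warning should no longer be printed.

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Sketch only: "main" here just means "whatever the latest revision is".
model_id = "microsoft/phi-1_5"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True, revision="main")
tokenizer.pad_token = tokenizer.eos_token
model = AutoModelForCausalLM.from_pretrained(model_id, trust_remote_code=True, revision="main")
model.train()

batch = tokenizer(["def add(a, b):", "print('hi')"], padding=True, return_tensors="pt")
labels = batch["input_ids"].clone()
labels[batch["attention_mask"] == 0] = -100  # ignore padded positions in the loss

# With the updated files, attention_mask should be honored during training and
# no "`attention_mask` is not supported" warning printed.
loss = model(**batch, labels=labels).loss
loss.backward()
```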
gugarosa changed discussion status to closed