AjayP13 committed on
Commit 31b4a2a · verified · 1 Parent(s): ec74662

Update README.md

Files changed (1):
  1. README.md +5 -32
README.md CHANGED
@@ -9,38 +9,11 @@ tags:
 - text2text-generation
 widget:
 - text: >-
-    In this paper, we delve into advanced techniques and methods in Natural
-    Language Processing (NLP), innovatively incorporating Transformer
-    architectures and self-supervised learning methods. We aim to reiterate the
-    current understanding of Transformer-based models in executing various
-    language tasks by dissecting their versatility and expandability on broad
-    language systems.
-
-
-    Moreover, stabilization measures, tokenization assortment, and interpreting
-    latent spaces provide an in-depth novelty to our pipeline, overcoming
-    long-known obstacles. We explore meta-architectural modifications focusing
-    on enhancing prompt language models' efficiency, allowing flexible
-    adaptations to the core Transformer technique's abundance in BERT, GPT-like
-    systems.
-
-
-    To implement these adaptations, several experiments were conducted on varied
-    benchmark datasets to evaluate core metrics such as Bleu, Rouge, and
-    Warp-CTC metrics in translation and transcription tasks. We carried out
-    significant analysis focusing on module interpretability, additional error
-    inspection, task-specific regulatory mechanisms, execution speed, and
-    computational considerations.
-
-
-    Our experimental results bring in distraction from widespread but
-    sub-optimal benchmarks and offer evidence underpinning the contrary yet
-    potent issues yet to be addressed methodically. We invite the community to
-    reflect on these novel insights, develop and refine our proposed techniques,
-    speeding technical progress, avoiding prototypical retrodiction in the
-    Natural Language Understanding ecosystem to respect inclusive, diverse, and
-    correctly perceived expressive content.
-  example_title: Example 1
+    An important paradigm of natural language processing consists of large-scale pre-training on general domain data and adaptation to particular tasks or domains. As we pre-train larger models, full fine-tuning, which retrains all model parameters, becomes less feasible. Using GPT-3 175B as an example -- deploying independent instances of fine-tuned models, each with 175B parameters, is prohibitively expensive. We propose Low-Rank Adaptation, or LoRA, which freezes the pre-trained model weights and injects trainable rank decomposition matrices into each layer of the Transformer architecture, greatly reducing the number of trainable parameters for downstream tasks. Compared to GPT-3 175B fine-tuned with Adam, LoRA can reduce the number of trainable parameters by 10,000 times and the GPU memory requirement by 3 times. LoRA performs on-par or better than fine-tuning in model quality on RoBERTa, DeBERTa, GPT-2, and GPT-3, despite having fewer trainable parameters, a higher training throughput, and, unlike adapters, no additional inference latency. We also provide an empirical investigation into rank-deficiency in language model adaptation, which sheds light on the efficacy of LoRA. We release a package that facilitates the integration of LoRA with PyTorch models and provide our implementations and model checkpoints for RoBERTa, DeBERTa, and GPT-2 at this https URL.
+  output:
+    text: >-
+      "Exciting news in #NLP! We've developed Low-Rank Adaptation, or LoRA, to reduce the number of trainable parameters for downstream tasks. It reduces model weights by 10,000 times and GPU memory by 3 times. #AI #MachineLearning"
+  example_title: LoRA Abstract
 - text: >-
     In this research paper, we propose a novel approach to Natural Language
     Processing (NLP) that addresses several limitations of existing methods. By
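The new widget text describes LoRA's core mechanism: the pre-trained weight is frozen and a trainable rank decomposition is added on top of it. For readers of the updated example, here is a minimal PyTorch sketch of that idea; the class name, dimensions, and hyperparameter defaults are illustrative and not taken from the LoRA package itself.

```python
import torch
import torch.nn as nn

class LoRALinear(nn.Module):
    """Minimal LoRA-style layer: frozen base weight plus a trainable low-rank update."""

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: int = 16):
        super().__init__()
        # Frozen pre-trained weight (stands in for one layer of the Transformer).
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)
        # Trainable rank decomposition: delta_W = B @ A has rank at most r.
        self.A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(out_features, r))  # zero init: no change at step 0
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # y = x W^T + scaling * x A^T B^T; only A and B receive gradients.
        return self.base(x) + (x @ self.A.T @ self.B.T) * self.scaling
```

With `layer = LoRALinear(768, 768)`, only `layer.A` and `layer.B` are trainable, which is how LoRA cuts the number of trainable parameters while leaving the pre-trained weights untouched.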
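The `widget:` entry itself corresponds to an ordinary text2text-generation call against this repository's model. A hedged sketch of how a reader might reproduce the widget's behavior locally; the model id below is a placeholder, not this repository's actual id.

```python
from transformers import pipeline

# Placeholder model id: substitute this repository's actual id on the Hub.
tweeter = pipeline("text2text-generation", model="AjayP13/abstract-to-tweet")

abstract = "An important paradigm of natural language processing consists of ..."
print(tweeter(abstract)[0]["generated_text"])
```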