Update README.md
README.md
@@ -7,10 +7,17 @@ sdk: static
pinned: false
---
人工智能的显著进步产生了许多伟大的模型,特别是基于预训练的基础模型成为了一种新兴的范式。传统的AI模型必须要在专门的巨大的数据集上为一个或几个有限的场景进行训练,相比之下,基础模型可以适应广泛的下游任务。基础模型造就了AI在低资源的场景下落地的可能。
如今的基础模型,尤其是语言模型,正在被英文社区主导着。与此同时,中文作为这个世界上最大的口语语种(母语者中),却缺乏系统性的研究资源支撑,这使得中文领域的研究进展相较于英文来说有些滞后。
为了解决中文领域研究进展滞后和研究资源严重不足的问题,[IDEA研究院](https://idea.edu.cn/)正式宣布,开启 “封神榜”开源体系——一个以中文驱动的基础生态系统,其中包括了预训练大模型,特定任务的微调应用,基准和数据集等。我们的目标是构建一个全面的,标准化的,以用户为中心的生态系统。尽管这一目标可以通过多种方式去实现,但是我们经过对中文社区的重新审视与思考,提出了我们认为最为有效的方案:
- 步骤1: 从我们的[封神榜模型库](https://huggingface.co/IDEA-CCNL)中选择一个预训练好的中文NLP模型。
- 步骤2: 通过阅读我们的教程示例,使用[封神框架](https://github.com/IDEA-CCNL/Fengshenbang-LM)调整模型。
- 步骤3: 在我们的[封神榜单](https://fengshenbang-lm.com/benchmarks) (敬请期待)或者自定义任务中评估模型在下游任务上的表现。

Remarkable advances in Artificial Intelligence (AI) have produced many powerful models, and foundation models built on pre-training have become an emerging paradigm. Traditional AI models must be trained on large, dedicated datasets for one or a few narrow scenarios. Foundation models, by contrast, can be adapted to a wide range of downstream tasks, which reduces the resources needed to get an AI project off the ground.

Foundation models, most notably language models, are currently dominated by the English-language community. Chinese, although it is the world's most widely spoken language by number of native speakers, lacks systematic research resources, so progress in the Chinese-language domain lags behind.

[IDEA](https://idea.edu.cn/) (International Digital Economy Academy) officially announces the launch of the "Fengshenbang" open-source project: a Chinese-language-driven foundation ecosystem that incorporates pre-trained models, task-specific fine-tuned applications, benchmarks, and datasets. Our goal is to build a comprehensive, standardized, and user-centered ecosystem. Although this goal can be realized in a variety of ways, we present the following design, which we find particularly effective:
- Step 1: Choose a pre-trained Chinese NLP model from our [open-source library](https://huggingface.co/IDEA-CCNL) of Fengshenbang models.
- Step 2: Use the [Fengshen Framework](https://github.com/IDEA-CCNL/Fengshenbang-LM) to adapt the model, following our tutorial examples.
- Step 3: Evaluate the model on downstream tasks, either on the [Fengshenbang Benchmarks](https://fengshenbang-lm.com/benchmarks) (coming soon) or on custom tasks.
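
As a rough illustration of Steps 1 and 3, the sketch below builds a Hub repository id under the `IDEA-CCNL` organization, loads it with the generic `transformers` Auto classes, and scores predictions on a custom task with plain accuracy. The helper names (`hub_model_id`, `load_model`, `accuracy`) are hypothetical and not part of the Fengshen Framework, and it is an assumption that a given model works with the generic Auto classes; some models may require classes from the framework itself. Step 2 (fine-tuning) is left out because it depends on the framework's tutorial examples.

```python
ORG = "IDEA-CCNL"  # organization name taken from the model library URL


def hub_model_id(model_name: str, org: str = ORG) -> str:
    """Build a Hugging Face Hub repository id of the form '<org>/<model_name>'."""
    name = model_name.strip()
    if not name or "/" in name:
        raise ValueError("model_name must be a bare repository name")
    return f"{org}/{name}"


def load_model(model_name: str):
    """Step 1 (sketch): download a tokenizer and model from the Hub.

    Assumption: the chosen model is compatible with the generic Auto classes.
    """
    # Imported lazily so the other helpers work without transformers installed.
    from transformers import AutoModel, AutoTokenizer

    repo_id = hub_model_id(model_name)
    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModel.from_pretrained(repo_id)
    return tokenizer, model


def accuracy(predictions, labels) -> float:
    """Step 3 (sketch): fraction of predictions that match the gold labels."""
    if len(predictions) != len(labels):
        raise ValueError("predictions and labels must have the same length")
    if len(labels) == 0:
        raise ValueError("cannot compute accuracy on an empty task")
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```

Calling `load_model("<model-name>")` with a repository name picked from the model library needs network access and the `transformers` package. `accuracy` is only a stand-in for whatever metric a custom task actually calls for.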