ytzfhqs committed on
Commit
f26ca2d
1 Parent(s): d822c70

Update README.md

  1. README.md +1 -0
README.md CHANGED
@@ -8,6 +8,7 @@ metrics:
  base_model:
  - Qwen/Qwen2.5-0.5B
  ---
+ # Qwen2.5-med-book-main-classification
  The model is an intermediate product of the [EDCP (Easy-Data-Clean-Pipeline)](https://github.com/ytzfhqs/EDCP) project, primarily used to distinguish between the main content and non-content (such as book introductions, publisher information, writing standards, revision notes) of **medical textbooks** after performing OCR using [MinerU](https://github.com/opendatalab/MinerU). The base model is [Qwen2.5-0.5B](https://huggingface.co/Qwen/Qwen2.5-0.5B), which avoids the input-length limitation of the BERT tokenizer while providing higher accuracy.
 
  # Data Composition