burgerbee committed on
Commit
81bd948
1 Parent(s): 70eeb6d

Update README.md

Files changed (1)
  1. README.md +3 -5
README.md CHANGED
@@ -18,12 +18,10 @@ Embeddings is the engine that delivers semantic search. Data is transformed into
  An embeddings index generated by txtai is a fully encapsulated index format. It DOESN'T require a database server.
 
  This index is built from the [Wikipedia Februari 2024 dataset](https://huggingface.co/datasets/burgerbee/wikipedia-sv-20240220).
- Only the first two paragraph from each article is included. The Wikipedia index works well as a fact-based context source for retrieval augmented generation (RAG).
-
- It also uses [Wikipedia Page Views](https://dumps.wikimedia.org/other/pageviews/readme.html) data to add a `percentile` field. The `percentile` field can be used
  to only match commonly visited pages.
 
- txtai must be [installed](https://neuml.github.io/txtai/install/) to use this model.
 
  ## Example
 
@@ -44,7 +42,7 @@ for x in embeddings.search("SELECT id, text, score, percentile FROM txtai WHERE
  print(json.dumps(x, indent=2))
  ```
 
- # Source
 
  https://dumps.wikimedia.org/svwiki/20240220/dumpstatus.json
 
 
  An embeddings index generated by txtai is a fully encapsulated index format. It DOESN'T require a database server.
 
  This index is built from the [Wikipedia Februari 2024 dataset](https://huggingface.co/datasets/burgerbee/wikipedia-sv-20240220).
+ Only the first two paragraph from each article is included. The Wikipedia index works well as a fact-based context source for retrieval augmented generation (RAG). It also uses [Wikipedia Page Views](https://dumps.wikimedia.org/other/pageviews/readme.html) data to add a `percentile` field. The `percentile` field can be used
  to only match commonly visited pages.
 
+ txtai must be (pip) [installed](https://neuml.github.io/txtai/install/) to use this model.
 
  ## Example
 
 
  print(json.dumps(x, indent=2))
  ```
 
+ # Data source
 
  https://dumps.wikimedia.org/svwiki/20240220/dumpstatus.json
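
Reviewer note: the README text in this change says the `percentile` field can be used to match only commonly visited pages. A minimal sketch of how such a filter might be expressed is below. The helper `build_query` is hypothetical and not part of the README or of txtai. It assumes `percentile` is stored on a 0 to 1 scale and that the search text contains no single quotes. The SELECT columns follow the query visible in the hunk header above.

```python
def build_query(text: str, min_percentile: float = 0.99) -> str:
    """Build a txtai SQL query string restricted to frequently visited pages.

    Hypothetical helper for illustration only.
    Assumption: `percentile` is a value between 0 and 1.
    Assumption: `text` contains no single quotes (no escaping is done here).
    """
    if not 0.0 <= min_percentile <= 1.0:
        raise ValueError("min_percentile must be between 0 and 1")
    if "'" in text:
        raise ValueError("single quotes in the search text are not handled")
    return (
        "SELECT id, text, score, percentile FROM txtai "
        f"WHERE similar('{text}') AND percentile >= {min_percentile}"
    )


query = build_query("Stockholm", 0.99)
print(query)

# With a loaded txtai Embeddings instance named `embeddings` (loading is not
# shown here), the string could be passed to the search call, for example:
#   for x in embeddings.search(query, 5):
#       print(x)
```

Raising `min_percentile` narrows results to the most visited articles. Lowering it toward 0 widens the match to the whole index.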