TobDeBer committed
Commit a653b79
Parent: ee372b9

update README

Files changed (2):
  1. README.md +41 -4
  2. serve-from-url.sh +10 -0
README.md CHANGED
@@ -1,13 +1,50 @@
-# Container Repository for CPU adaptations of Inference code
+# Model Repository for Big Endian models
 
 ## Variants
 
+### Arco 500M Q4 BE
+
+
+# Container Repository for CPU adaptations of Inference code
+
+## Variants for Inference
+
+### Slim container
+
+- run std binaries
+
+
 ### CPUdiffusion
 
 - inference diffusion models on CPU
 - include CUDAonCPU stack
 
-### CPUgguf
+### Diffusion container
+
+- run diffusion app.py variants
+- support CPU and CUDA
+- include Flux
+
+### Slim CUDA container
+
+- run CUDA binaries
+
+
+## Variants for Build
+
+### Llama.cpp build container
+
+- build llama-cli-static
+- build llama-server-static
+
+### sd build container
+
+- build sd
+- optional: build sd-server
+
+### CUDA build container
+
+- build cuda binaries
+- support sd_cuda
+
 
-- inference gguf models on CPU
-- include GUI libraries
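For orientation, the Llama.cpp build container's job could look roughly like the sketch below. The clone URL, cmake flags, and output names are assumptions based on upstream llama.cpp, not taken from this repo.

```sh
# Hypothetical sketch of a static llama.cpp build; flags and paths
# are assumptions, not taken from this repository.
git clone --depth 1 https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Link statically so the binaries run in a slim container without
# shared-library dependencies.
cmake -B build -DBUILD_SHARED_LIBS=OFF -DGGML_STATIC=ON
cmake --build build --target llama-cli llama-server -j"$(nproc)"

# The README's artifact names suggest a -static suffix on the outputs.
cp build/bin/llama-cli    llama-cli-static
cp build/bin/llama-server llama-server-static
```

Static linking is what lets the same binaries run in the Slim container, which only provides a minimal userland.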
 
 
 
serve-from-url.sh ADDED
@@ -0,0 +1,10 @@
+# 1) wget the model
+# 2) save hostip: ip route | grep 'default' | awk '{print $9}' >hostip
+# 3a) calls llama-server in container
+# 3b) calls sed + llama-server in container
+
+podman run -d --net=host -v ~/funstreams:/models localhost/bookworm:server ./models/llama-server-static -m /models/qwen2-500.gguf
+
+
+
+
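Read as a whole, serve-from-url.sh appears meant to download a gguf model, record the host's IP, and then serve the model from the container. A fuller sketch under those assumptions follows; the URL argument and variable names are invented here, and only the podman line comes from the commit.

```sh
#!/bin/sh
# Hypothetical expansion of serve-from-url.sh; only the podman line
# is from the commit, everything else is an assumption.
set -e

MODEL_URL="$1"                          # e.g. a direct .gguf link
MODEL_FILE="$(basename "$MODEL_URL")"

# 1) wget the model into the directory mounted as /models
wget -O ~/funstreams/"$MODEL_FILE" "$MODEL_URL"

# 2) save hostip: the host address on the default route
ip route | grep 'default' | awk '{print $9}' > hostip

# 3a) call llama-server in the container to serve the model
podman run -d --net=host -v ~/funstreams:/models \
    localhost/bookworm:server \
    ./models/llama-server-static -m /models/"$MODEL_FILE"
```

With --net=host the server should then be reachable directly on the host, e.g. `curl http://localhost:8080/completion -d '{"prompt": "Hello", "n_predict": 16}'`, assuming llama-server's default port of 8080.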