TobDeBer committed
Commit a653b79
Parent: ee372b9

update README

Files changed (2):
  1. README.md +41 -4
  2. serve-from-url.sh +10 -0
README.md CHANGED
@@ -1,13 +1,50 @@
-# Container Repository for CPU adaptations of Inference code
+# Model Repository for Big Endian models
 
 ## Variants
 
+### Arco 500M Q4 BE
+
+
+# Container Repository for CPU adaptations of Inference code
+
+## Variants for Inference
+
+### Slim container
+
+- run std binaries
+
+
 ### CPUdiffusion
 
 - inference diffusion models on CPU
 - include CUDAonCPU stack
 
-### CPUgguf
+### Diffusion container
+
+- run diffusion app.py variants
+- support CPU and CUDA
+- include Flux
+
+### Slim CUDA container
+
+- run CUDA binaries
+
+
+## Variants for Build
+
+### Llama.cpp build container
+
+- build llama-cli-static
+- build llama-server-static
+
+### sd build container
+
+- build sd
+- optional: build sd-server
+
+### CUDA build container
+
+- build cuda binaries
+- support sd_cuda
+
 
-- inference gguf models on CPU
-- include GUI libraries
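For orientation, the Llama.cpp build container's job could look roughly like the sketch below. The clone URL, cmake flags, and output names are assumptions based on upstream llama.cpp, not taken from this repo.

```sh
# Hypothetical sketch of a static llama.cpp build; flags and paths
# are assumptions, not taken from this repository.
git clone --depth 1 https://github.com/ggerganov/llama.cpp
cd llama.cpp

# Link statically so the binaries run in a slim container without
# shared-library dependencies.
cmake -B build -DBUILD_SHARED_LIBS=OFF -DGGML_STATIC=ON
cmake --build build --target llama-cli llama-server -j"$(nproc)"

# The README's artifact names suggest a -static suffix on the outputs.
cp build/bin/llama-cli    llama-cli-static
cp build/bin/llama-server llama-server-static
```

Static linking is what lets the same binaries run in the Slim container, which only provides a minimal userland.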
 
 
 
serve-from-url.sh ADDED
@@ -0,0 +1,10 @@
+# 1) wget the model
+# 2) save hostip: ip route | grep 'default' | awk '{print $9}' >hostip
+# 3a) calls llama-server in container
+# 3b) calls sed + llama-server in container
+
+podman run -d --net=host -v ~/funstreams:/models localhost/bookworm:server ./models/llama-server-static -m /models/qwen2-500.gguf
+
+
+
+
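Read as a whole, serve-from-url.sh appears meant to download a gguf model, record the host's IP, and then serve the model from the container. A fuller sketch under those assumptions follows; the URL argument and variable names are invented here, and only the podman line comes from the commit.

```sh
#!/bin/sh
# Hypothetical expansion of serve-from-url.sh; only the podman line
# is from the commit, everything else is an assumption.
set -e

MODEL_URL="$1"                          # e.g. a direct .gguf link
MODEL_FILE="$(basename "$MODEL_URL")"

# 1) wget the model into the directory mounted as /models
wget -O ~/funstreams/"$MODEL_FILE" "$MODEL_URL"

# 2) save hostip: the host address on the default route
ip route | grep 'default' | awk '{print $9}' > hostip

# 3a) call llama-server in the container to serve the model
podman run -d --net=host -v ~/funstreams:/models \
    localhost/bookworm:server \
    ./models/llama-server-static -m /models/"$MODEL_FILE"
```

With --net=host the server should then be reachable directly on the host, e.g. `curl http://localhost:8080/completion -d '{"prompt": "Hello", "n_predict": 16}'`, assuming llama-server's default port of 8080.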