BlackSamorez
commited on
Commit
•
955c824
1
Parent(s):
0b99023
Create README.md
Browse files
README.md
ADDED
@@ -0,0 +1,18 @@
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
|
1 |
+
# Bringing SOTA quantization to mobile LLM deployment: A practical Executorch integration guide
|
2 |
+
|
3 |
+
Article: <\TODO>
|
4 |
+
|
5 |
+
## Usage
|
6 |
+
|
7 |
+
- Download and install the `.apk` file on your Android phone.
|
8 |
+
- Download the `.pte` and `.model` files and put them into the `/data/local/tmp/llama` folder on your Android phone.
|
9 |
+
- Running the app you will see the option to load the `.pte` and `.model` files. After loading them, you'll be able to chat with the model.
|
10 |
+
|
11 |
+
## Requirements
|
12 |
+
|
13 |
+
This app was tested on `Samsung S24 Ultra` running `Android 14`.
|
14 |
+
|
15 |
+
## Limitations
|
16 |
+
|
17 |
+
- Although the app looks like chat, generation requests are independent.
|
18 |
+
- Llama-3 chat template is hard-coded into the app.
|