Commit e241a15 by christinetyip · verified · 1 Parent(s): b123cc5

Update README.md

Files changed (1)
  1. README.md +1 -1
README.md CHANGED
@@ -14,7 +14,7 @@ library_name: kernels
 
  Core attention kernel from [Open-TQ-Metal](https://github.com/mutable-state-inc/Open-TQ-Metal).
 
- Open-TQ-Metal is a Metal-native implementation of fused compressed-domain attention built by Ensue. The full release enables Llama 3.1 70B at 128K context on a single 64GB Mac, and includes a C++ inference engine, multiple attention kernels, a 330-experiment cross-architecture analysis, and a paper.
+ Open-TQ-Metal is a Metal-native implementation of fused compressed-domain attention built by Ensue. The full release enables Llama 3.1 70B at 128K context on a single 64GB Mac, 48x faster attention at 128K context, and includes a C++ inference engine, multiple attention kernels, a 330-experiment cross-architecture analysis, and a paper.
 
  - Paper: https://arxiv.org/pdf/2604.16957
  - Write-up: https://ensue.dev/blog/introducing-open-tq-metal/