Do you know any good literature on how they created the base model?
I've always been fascinated with the idea of using a string rewriting system to generate 4/4 time techno:
https://en.m.wikipedia.org/wiki/Rewriting
https://en.m.wikipedia.org/wiki/Semi-Thue_system
Sadly, the wiki page doesn't do the idea any justice, but Lindenmayer systems (L-systems) show the complexity these can generate for images and 3D models:
https://en.m.wikipedia.org/wiki/L-system
https://mathworld.wolfram.com/LindenmayerSystem.html
A friend and I were thinking about this 30 years ago and tried all sorts of things, like using GAs with a human judge to evolve the grammars, but nothing really worked that well... We did find that rewriting systems naturally generated really satisfying 4/4 time minimal techno, though (probably because all our rules were of the form a --> bc or ab --> acab, IIRC).
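For anyone who wants to play with the idea, here's a rough Python sketch of the a --> bc style of rule. To be clear about the assumptions: the rule table and the drum symbols (k = kick, s = snare, h = hat, - = rest) are made up for illustration and are not our original rules, and it applies every rule in parallel on each pass (L-system style) rather than rewriting one substring at a time like a proper semi-Thue system. It also leaves out the context-sensitive ab --> acab form. One plausible reason this kind of rule suits 4/4: if every symbol expands to exactly two symbols, the string doubles in length each pass, so four passes from a single symbol give 16 steps, i.e. one bar of sixteenth notes.

```python
# Hypothetical rule table: every symbol rewrites to exactly two symbols,
# mirroring the "a --> bc" form. The symbols are illustrative only:
# k = kick, s = snare, h = hi-hat, - = rest.
RULES = {
    "k": "kh",
    "h": "sh",
    "s": "k-",
    "-": "-h",
}


def rewrite(pattern, rules):
    """Apply the rules to every symbol in parallel (one L-system pass)."""
    return "".join(rules.get(symbol, symbol) for symbol in pattern)


def generate(axiom, rules, passes):
    """Rewrite the axiom the given number of times and return the result."""
    pattern = axiom
    for _ in range(passes):
        pattern = rewrite(pattern, rules)
    return pattern


if __name__ == "__main__":
    bar = generate("k", RULES, 4)
    # Each pass doubles the length, so 4 passes from 1 symbol give 16 steps.
    print(len(bar), bar)
    # Show the bar as four beats of four sixteenth-note steps each.
    print(" | ".join(bar[i:i + 4] for i in range(0, len(bar), 4)))
```

Running more passes gives 32, 64, ... steps, so longer phrases still land on bar boundaries, and swapping the rule table changes the groove without breaking that alignment.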
No idea if it's applicable to LLMs, but thought I'd share it :)