🤝 Open to Collab


AbstractPhila PRO

AbstractPhil

https://civitai.com/user/AbstractPhila

AbstractEyes

AI & ML interests

datasets, research papers, experimentation, vision, classification, text encoders, tokenization, llms, diffusion, distillation, and more.


replied to their post about 2 hours ago

Upcoming behavioral assessments cover a large array of Qwen VLM models, and I will publish benchmarks for them.

These will be aligned to generic use cases, meaning as many tasks as possible that do not require finetuning.

Which produces valid json schema?
image classification
bounding box location
image text identification and accuracy checking
structural and spatial awareness
3d geometric object identification and awareness
camera rotational offset
subject fixation and awareness
semantic association
depth analysis
segmentation potential
vit accuracy to image prompting
outline and association testing
style identification and structural awareness
type differentiation across data formats: JSON, YAML, Markdown, and many others
utilization and response to those types and the expected prompts
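
As a rough illustration of the first question in the list above (whether a model produces valid JSON), here is a minimal sketch of a scoring helper. The function names, the sample outputs, and the required key `subject` are all assumptions made for illustration; the actual benchmark harness and schema are not described in this post.

```python
import json


def is_valid_json_object(text, required_keys=()):
    """Return True if `text` parses as a JSON object containing every required key."""
    try:
        parsed = json.loads(text)
    except json.JSONDecodeError:
        return False
    return isinstance(parsed, dict) and all(key in parsed for key in required_keys)


def valid_json_rate(outputs, required_keys=()):
    """Fraction of model outputs that pass the validity check above."""
    if not outputs:
        return 0.0
    passed = sum(is_valid_json_object(text, required_keys) for text in outputs)
    return passed / len(outputs)


# Hypothetical raw model outputs, made up for this example.
sample_outputs = [
    '{"subject": "cat", "style": "anime"}',  # valid, has the required key
    '{"subject": "dog"',                     # truncated, does not parse
    '["not", "an", "object"]',               # parses, but is not a JSON object
    '{"style": "sketch"}',                   # parses, but is missing "subject"
]

rate = valid_json_rate(sample_outputs, required_keys=("subject",))
print(f"valid JSON rate: {rate:.2f}")
```

A stricter check would validate against a full schema rather than a list of required keys; the key check is used here only to keep the sketch dependency-free.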

updated a dataset about 6 hours ago

AbstractPhil/anima-90k-cache

Updated 22 minutes ago

published a dataset about 9 hours ago

AbstractPhil/anima-90k-cache

Updated 22 minutes ago

published an article about 11 hours ago

Article

Subject Bucketing: Teaching a Diffusion Model New Prompt Languages Without Forgetting

AbstractPhil • about 11 hours ago

posted an update about 11 hours ago

Post

Anima - Brent JSON (PREVIEW) - Subject Bucketing

Full article available https://huggingface.co/blog/AbstractPhil/subject-bucketing.

There is also a Civitai model release:
https://civitai.com/models/2730503/anima-jsonenglish

https://huggingface.co/AbstractPhil/anima-prelim-1k-r64
This is the JSON multi-prompt diffusion model prototype, using the Anima 1.0 base as the pretrained model and finetuning it toward the JSON target. The upcoming JSON LoRA is being cached and trained on 40,000 of the full 83,000 valid images from the Qwen set.

This first preview version is ready to use as a ComfyUI-compatible LoRA, so you can load the epoch you want in ComfyUI without any special setup and have at it. You can currently use plain English in conjunction with tagging to produce useful and meaningful prompt targets without the JSON.

https://huggingface.co/AbstractPhil/anima-prelim-1k-r64/tree/main/comfy-qwen-json
The ComfyUI nodes are present and work for testing, but they are not ready for production use yet.

-- Technical --
The primary target was the VLM JSON captions, followed by the AnimeTIMM ViT output processed through the VLM JSON processor. The first 12 epochs trained on images with VLM-generated JSON captions; the last 8 epochs, from epoch 12 onward to 20, finetuned on the AnimeTIMM captions converted into JSON instead.
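
The two-phase schedule described above can be sketched as a simple lookup from epoch to caption source. This is only an illustration of the 12 + 8 split stated in the post; the function name and the source labels `vlm_json` and `animetimm_json` are hypothetical, and the real training script is not shown here.

```python
VLM_EPOCHS = 12    # epochs trained on VLM-generated JSON captions
TOTAL_EPOCHS = 20  # 12 VLM epochs plus 8 AnimeTIMM-derived epochs


def caption_source_for_epoch(epoch):
    """Return which caption set a 1-indexed epoch draws from (labels are hypothetical)."""
    if not 1 <= epoch <= TOTAL_EPOCHS:
        raise ValueError(f"epoch must be in 1..{TOTAL_EPOCHS}, got {epoch}")
    return "vlm_json" if epoch <= VLM_EPOCHS else "animetimm_json"


schedule = [caption_source_for_epoch(epoch) for epoch in range(1, TOTAL_EPOCHS + 1)]
for epoch, source in enumerate(schedule, start=1):
    print(epoch, source)
```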

The Anima model itself accepted the 1,000 images, and the JSON prompting works quite well. In the process I set up a couple of ComfyUI nodes that translate base prompts into the same language the model is learning. Those are present in the repo.
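
To give a feel for what translating a base prompt into a JSON prompt might look like, here is a minimal sketch. The JSON layout (a `subject` field plus a `tags` list) and the function name are assumptions invented for this example; the schema the model was actually trained on is defined by the nodes in the repo and the linked article, not by this snippet.

```python
import json


def tags_to_json_prompt(prompt):
    """Convert a comma-separated tag prompt into a JSON string (hypothetical schema)."""
    parts = [part.strip() for part in prompt.split(",") if part.strip()]
    if not parts:
        raise ValueError("prompt contains no tags")
    # Assumption: treat the first tag as the subject and the rest as descriptors.
    payload = {"subject": parts[0], "tags": parts[1:]}
    return json.dumps(payload, ensure_ascii=False)


example = tags_to_json_prompt("1girl, silver hair, night sky")
print(example)
```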

1 reply

updated 2 models 1 day ago

AbstractPhil/Qwen3.5-0.8B-json-captioner

Image-Text-to-Text • 0.9B • Updated 1 day ago • 73

AbstractPhil/anima-prelim-1k-r64

Text-to-Image • Updated about 11 hours ago

published a model 1 day ago

AbstractPhil/anima-prelim-1k-r64

Text-to-Image • Updated about 11 hours ago

updated a collection 3 days ago

Flagships

Collection

My flagship models: the ones that actually work, or the best I currently have in a given category. • 13 items • Updated 3 days ago

published a model 3 days ago

AbstractPhil/Qwen3.5-0.8B-json-captioner

Image-Text-to-Text • 0.9B • Updated 1 day ago • 73

updated a dataset 3 days ago

AbstractPhil/diffusion-pretrain-set-ft1

Viewer • Updated 3 days ago • 1.46M • 2.06k • 1

liked a model 5 days ago

mnemic/paligemma-longprompt-v1-safetensors

Image-Text-to-Text • 3B • Updated Jan 10 • 4 • 2

updated a model 7 days ago

AbstractPhil/geolip-constellation-aleph

Updated 7 days ago

published a model 7 days ago

AbstractPhil/geolip-constellation-aleph

Updated 7 days ago

posted an update 9 days ago

Post


The article on aleph attention routing needs more work on the vision side: the vision portion has not been fully validated, while the LM prototype has been semi-validated at small and medium-small scale. I will post my findings in the coming days, including the consequences of training an LM and a ViT with the prototype system.

The current structure of the Geometric Vocabulary nearly reflects the intended shape discussed in the earlier posts and articles, so that's coming along nicely - but there are stipulations and problems involved that I did not foresee.

My apologies for the incomplete article I released on a whim. I jumped to the conclusion early, before the formulas had fully converged. I also released an early post the other day about the prototype AlephLM, which I removed as an invalid conclusion.

I'm doing my best to release only validated empirical information rather than speculation - however, I do sometimes jump to conclusions without proper validation. Occasionally I get a bit theory-overzealous and need to tidy up through thorough experimentation, which I'm currently working on directly.

updated a model 10 days ago

AbstractPhil/geolip-aleph-lm

Text Generation • Updated 10 days ago • 1

published a model 10 days ago

AbstractPhil/geolip-aleph-lm

Text Generation • Updated 10 days ago • 1

updated a model 12 days ago

AbstractPhil/geolip-aleph-void

Feature Extraction • Updated 12 days ago

reacted to OzTianlu's post with 🧠 12 days ago

Post


ResNet is Explicit Euler. GPT is Implicit Euler. What Else is Hiding in Plain Sight?

Read online: https://datawhalechina.github.io/learning-terrain/

I wrote an open-source monograph on learning dynamics — The Terrain of Learning. Bilingual (Chinese/English), 4 volumes, 12 chapters, 30+ print-grade figures. Completely free (CC BY-NC-SA 4.0).

The core argument: gradient descent is not optimization. It's terrain motion. The loss function is a landscape. The gradient is the direction of slope. The optimizer is how you choose each step. Once you see it this way, everything clicks:

ResNet = explicit Euler integration on a vector field. The residual branch is the vector field. Each layer takes one Euler step.
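
A minimal sketch of the ResNet-as-Euler reading, assuming a toy scalar vector field dx/dt = -x chosen only because its exact solution is known. The names `vector_field`, `residual_block`, and `resnet_forward` are illustrative and do not come from the monograph.

```python
import math


def vector_field(x):
    """Toy vector field f(x) = -x; the ODE dx/dt = f(x) has solution x0 * exp(-t)."""
    return -x


def residual_block(x, step):
    """One residual update x + step * f(x), which is exactly one explicit Euler step."""
    return x + step * vector_field(x)


def resnet_forward(x0, num_layers, step):
    """Stack residual blocks; each layer advances the state by one Euler step."""
    x = x0
    for _ in range(num_layers):
        x = residual_block(x, step)
    return x


x0, total_time, num_layers = 1.0, 1.0, 100
approx = resnet_forward(x0, num_layers, total_time / num_layers)
exact = x0 * math.exp(-total_time)
print(f"stacked residual blocks: {approx:.5f}, exact ODE solution: {exact:.5f}")
```

With more layers and a correspondingly smaller step, the stacked output moves closer to the exact solution, which is the sense in which depth plays the role of integration time.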

GPT autoregression = implicit-state Euler iteration. Stable where explicit Euler explodes. That's why transformers handle long-range dependencies.

DEQ = the Banach fixed-point theorem in production. The forward pass is root-finding. There are no layers to backprop through.
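
The fixed-point view can be sketched with a scalar toy layer. This assumes a layer z -> tanh(w * z + x) with |w| < 1, which is a contraction in z, so the Banach fixed-point theorem guarantees that plain iteration converges. The names and the weight value are illustrative, and plain iteration is used here for simplicity rather than a faster root-finding solver.

```python
import math


def layer(z, x, weight=0.5):
    """Toy layer; tanh is 1-Lipschitz, so with |weight| < 1 this map is a contraction in z."""
    return math.tanh(weight * z + x)


def solve_fixed_point(x, tol=1e-10, max_iter=1000):
    """Iterate z <- layer(z, x) until successive iterates differ by less than `tol`."""
    z = 0.0
    for _ in range(max_iter):
        z_next = layer(z, x)
        if abs(z_next - z) < tol:
            return z_next
        z = z_next
    raise RuntimeError("fixed-point iteration did not converge")


z_star = solve_fixed_point(1.0)
residual = abs(layer(z_star, 1.0) - z_star)
print(f"fixed point z* = {z_star:.6f}, residual = {residual:.2e}")
```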

KL divergence = a Bregman divergence on the entropy landscape. Your belief space is curved, not flat.
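
The KL statement can be checked numerically. The Bregman divergence generated by a convex function phi is D(p, q) = phi(p) - phi(q) - <grad phi(q), p - q>; taking phi to be the negative entropy, phi(p) = sum_i p_i log p_i, recovers KL(p || q) whenever p and q both sum to 1. The sketch below assumes strictly positive probabilities, and the two example distributions are arbitrary.

```python
import math


def neg_entropy(p):
    """phi(p) = sum_i p_i * log(p_i), the convex generator."""
    return sum(pi * math.log(pi) for pi in p)


def bregman_neg_entropy(p, q):
    """Bregman divergence phi(p) - phi(q) - <grad phi(q), p - q> for negative entropy."""
    grad_q = [math.log(qi) + 1.0 for qi in q]
    inner = sum(g * (pi - qi) for g, pi, qi in zip(grad_q, p, q))
    return neg_entropy(p) - neg_entropy(q) - inner


def kl_divergence(p, q):
    """KL(p || q) = sum_i p_i * log(p_i / q_i)."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))


p = [0.5, 0.3, 0.2]
q = [0.2, 0.5, 0.3]
bregman_value = bregman_neg_entropy(p, q)
kl_value = kl_divergence(p, q)
print(f"Bregman: {bregman_value:.12f}, KL: {kl_value:.12f}")
```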

Chain-of-thought reasoning = hidden states flowing along a reasoning field toward an attractor basin. Correct answers have wide basins. The number of reasoning steps is determined by the terrain, not by the problem.

Diffusion models = systems flowing downhill along a score vector field, from noise to structure, from high energy to low energy.
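
A toy version of flowing along a score field, assuming a one-dimensional Gaussian target whose score is known in closed form. Real diffusion samplers use a learned, noise-level-dependent score and usually inject noise at each step; this deterministic sketch only illustrates the downhill-in-energy picture, and all names and parameter values are illustrative.

```python
def gaussian_score(x, mean=2.0, variance=1.0):
    """Score of N(mean, variance): d/dx log p(x) = -(x - mean) / variance."""
    return -(x - mean) / variance


def energy(x, mean=2.0, variance=1.0):
    """Energy E(x) = -log p(x) up to an additive constant."""
    return (x - mean) ** 2 / (2.0 * variance)


def flow_along_score(x0, step=0.1, num_steps=200):
    """Repeatedly move along the score, recording the state at every step."""
    trajectory = [x0]
    for _ in range(num_steps):
        trajectory.append(trajectory[-1] + step * gaussian_score(trajectory[-1]))
    return trajectory


trajectory = flow_along_score(-5.0)
energies = [energy(x) for x in trajectory]
print(f"start: {trajectory[0]:.3f}, end: {trajectory[-1]:.6f}")
print(f"energy start: {energies[0]:.3f}, energy end: {energies[-1]:.2e}")
```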

The book traces one idea across 337 years — from F=ma (Newton, 1687) to H=T+V (Hamilton, 1833) to loss landscape + gradient field (2020s). Hamilton replaced a catalog of forces with one geometric object. This book does the same for deep learning.

GitHub: https://github.com/datawhalechina/learning-terrain
Discussion: https://github.com/datawhalechina/learning-terrain/discussions/2

Convergence is not hope. Convergence is geometry. You see.

1 reply

replied to OzTianlu's post 12 days ago

geolip-aleph-void and the LM aleph routing are implicit recursive infinities confined to a microcosm of rebounding finite space and forced through a GELU sift - more akin to an emulated quaternion. This is all because quaternions are computationally heavy and Cantor's fractals additionally demand high precision (often beyond fp64), which required an entirely deviant approach to rotary to stabilize the system at BF16, so that a single epoch on a 35M-parameter model does not take 2 weeks.

Makes me feel a little overdressed for the occasion.
