Recent Activity

toshas 
posted an update 3 months ago
Introducing StereoSpace, our new end-to-end method for turning photos into stereo images without explicit geometry or depth maps. Skipping the explicit depth estimate makes it especially robust with thin structures and transparencies. Try the demo below:

🌐 Project: prs-eth/stereospace_web
📕 Paper: StereoSpace: Depth-Free Synthesis of Stereo Geometry via End-to-End Diffusion in a Canonical Space (2512.10959)
🐙 Code: https://github.com/prs-eth/stereospace
🤗 Demo: toshas/stereospace
🤗 Weights: prs-eth/stereospace-v1-0

By ETH Zürich ( @behretj , @Bingxin , @konradschindler ), University of Bologna ( @fabiotosi92 , @mpoggi ), HUAWEI Bayer Lab ( @toshas ).
toshas 
posted an update 3 months ago
Introducing 🇨🇭WindowSeat🇨🇭, our new method for removing reflections from photos taken through windows, on planes, in malls, offices, and other glass-filled environments.

Finetuning a foundation diffusion transformer for reflection removal quickly runs up against the limits of what existing datasets and techniques can offer. To fill that gap, we generate physically accurate examples in Blender that simulate realistic glass and reflection effects. This data enables strong performance on both established benchmarks and previously unseen images.

To make this practical, the open-source Apache-2 model builds on Qwen-Image-Edit-2509, a 20B image-editing diffusion transformer that runs on a single GPU and can be fine-tuned in about a day. WindowSeat keeps its use of the underlying DiT cleanly separated from the data and training recipe, allowing future advances in base models to be incorporated with minimal friction.

Try it out with your own photos in this interactive demo:
🤗 toshas/windowseat-reflection-removal

Other resources:
🌎 Website: huawei-bayerlab/windowseat-reflection-removal-web
🎓 Paper: Reflection Removal through Efficient Adaptation of Diffusion Transformers (2512.05000)
🤗 Model: huawei-bayerlab/windowseat-reflection-removal-v1-0
🐙 Code: https://github.com/huawei-bayerlab/windowseat-reflection-removal

Team: Daniyar Zakarin ( @daniyarzt )*, Thiemo Wandel ( @thiemo-wandel )*, Anton Obukhov ( @toshas ), Dengxin Dai.
*Work done during internships at HUAWEI Bayer Lab
multimodalart 
posted an update 5 months ago
Want to iterate on a Hugging Face Space with an LLM?

Now you can easily convert an entire HF repo (Model, Dataset, or Space) to a text file and feed it to a language model!

multimodalart/repo2txt
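
To make the idea concrete, here is a minimal sketch of flattening a repo into one LLM-friendly text blob. It is not the implementation behind the repo2txt Space; the function name `repo_to_text`, the `=== path ===` header format, and the `max_bytes` cutoff are all assumptions chosen for illustration. It works on a local copy of a repo and uses only the standard library.

```python
from pathlib import Path


def repo_to_text(repo_dir, max_bytes=100_000):
    """Concatenate the UTF-8 text files under repo_dir into a single string.

    Hypothetical helper: each file is preceded by a '=== relative/path ==='
    header so a language model can tell the files apart.
    """
    root = Path(repo_dir)
    chunks = []
    for path in sorted(p for p in root.rglob("*") if p.is_file()):
        rel = path.relative_to(root)
        if ".git" in rel.parts:
            continue  # skip version-control internals
        header = f"=== {rel.as_posix()} ==="
        if path.stat().st_size > max_bytes:
            # Large files (weights, datasets) would swamp the context window.
            chunks.append(f"{header}\n[skipped: larger than {max_bytes} bytes]\n")
            continue
        try:
            body = path.read_text(encoding="utf-8")
        except UnicodeDecodeError:
            # Binary content is noted but not included.
            chunks.append(f"{header}\n[skipped: not UTF-8 text]\n")
            continue
        chunks.append(f"{header}\n{body}\n")
    return "\n".join(chunks)
```

A local copy to run this on could be obtained, for example, with `huggingface_hub.snapshot_download(repo_id=..., repo_type="space")`; the resulting string can then be pasted into a prompt or saved with `Path("repo.txt").write_text(...)`.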
multimodalart 
posted an update 9 months ago
Self-Forcing, a real-time video model distilled from Wan 2.1 by @adobe, is out, and they open-sourced it 🐐

I've built a live real-time demo on Spaces 📹💨

multimodalart/self-forcing