mo2

UCCTeam

AI & ML interests

None yet

Recent Activity


Organizations

None yet

UCCTeam's activity

reacted to Xenova's post with πŸ‘ 19 days ago
We just released Transformers.js v3.1 and you're not going to believe what's now possible in the browser w/ WebGPU! 🀯 Let's take a look:
πŸ”€ Janus from DeepSeek for unified multimodal understanding and generation (Text-to-Image and Image-Text-to-Text)
πŸ‘οΈ Qwen2-VL from Qwen for dynamic-resolution image understanding
πŸ”’ JinaCLIP from Jina AI for general-purpose multilingual multimodal embeddings
πŸŒ‹ LLaVA-OneVision from ByteDance for Image-Text-to-Text generation
πŸ€Έβ€β™€οΈ ViTPose for pose estimation
πŸ“„ MGP-STR for optical character recognition (OCR)
πŸ“ˆ PatchTST & PatchTSMixer for time series forecasting

That's right, everything running 100% locally in your browser (no data sent to a server)! πŸ”₯ Huge for privacy!

Check out the release notes for more information. πŸ‘‡
https://github.com/huggingface/transformers.js/releases/tag/3.1.0

Demo link (+ source code): webml-community/Janus-1.3B-WebGPU
reacted to AdinaY's post with πŸ€— 19 days ago
Zhipu AI, the Chinese generative AI startup behind CogVideo, just launched their first productized AI Agent - AutoGLM πŸ”₯
πŸ‘‰ https://agent.aminer.cn

With simple text or voice commands, it:
✨ Simulates phone operations effortlessly
✨ Autonomously handles 50+ step tasks
✨ Seamlessly operates across apps

Powered by Zhipu's "Decoupled Interface" and "Self-Evolving Learning Framework" to achieve major performance gains in Phone Use and Web Browser Use!

Meanwhile, GLM4-Edge is now on the Hugging Face Hub πŸš€
πŸ‘‰ THUDM/glm-edge-6743283c5809de4a7b9e0b8b
Packed with advanced dialogue + multimodal models:
πŸ“± 1.5B / 2B models: Built for mobile & in-car systems
πŸ’» 4B / 5B models: Optimized for PCs
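
As a rough sketch of trying one of the smaller chat variants with transformers (the exact repo id THUDM/glm-edge-1.5b-chat and its chat-template support are assumptions based on the collection above, not confirmed usage instructions):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

# Assumed repo id from the GLM-Edge collection above; adjust to the variant you want
model_id = "THUDM/glm-edge-1.5b-chat"
tokenizer = AutoTokenizer.from_pretrained(model_id, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.bfloat16, device_map="auto", trust_remote_code=True
)

messages = [{"role": "user", "content": "What can an on-device agent like AutoGLM do?"}]
inputs = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
).to(model.device)
outputs = model.generate(inputs, max_new_tokens=128)
print(tokenizer.decode(outputs[0][inputs.shape[-1]:], skip_special_tokens=True))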
reacted to prithivMLmods's post with πŸ€— 20 days ago
Fine-Textured [Polygon] Character 3D Design Renders πŸ™‰

These adapters provide better lighting control (Bn+, Bn-) and richer textures than previous sets, but they require more contextual prompts for optimal performance.

The ideal settings are achieved at inference steps around 30–35, with the best dimensions being 1280 x 832 [ 3:2 ]. However, it also performs well with the default settings of 1024 x 1024 [ 1:1 ].
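
A minimal sketch of applying these settings with diffusers (the FLUX.1-dev base model and the adapter repo's default LoRA weight name are assumptions; the adapter is one of the repos listed below):

import torch
from diffusers import FluxPipeline

# Assumed base model; the adapters listed below are LoRAs, so they need a Flux base to load onto
pipe = FluxPipeline.from_pretrained(
    "black-forest-labs/FLUX.1-dev", torch_dtype=torch.bfloat16
).to("cuda")
pipe.load_lora_weights("strangerzonehf/Flux-3DXL-Partfile-0001")  # one of the adapters listed below

image = pipe(
    "fine-textured polygon character, 3D design render, studio lighting",
    num_inference_steps=32,   # recommended range: 30-35
    width=1280, height=832,   # recommended 3:2 dimensions; 1024 x 1024 also works
    guidance_scale=3.5,
).images[0]
image.save("3dxl_render.png")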

πŸ’’Models DLC :
+ strangerzonehf/Flux-3DXL-Partfile-0001
+ strangerzonehf/Flux-3DXL-Partfile-0002
+ strangerzonehf/Flux-3DXL-Partfile-0003
+ strangerzonehf/Flux-3DXL-Partfile-0004
+ strangerzonehf/Flux-3DXL-Partfile-C0001

πŸ’’Collections :
1] strangerzonehf/flux-3dxl-engine-674833c14a001d5b1fdb5139
2] prithivMLmods/flux-lora-collections-66dd5908be2206cfaa8519be

πŸ’’Space :
1] prithivMLmods/FLUX-LoRA-DLC

πŸ’’Page :
1] Stranger Zone: https://huggingface.co/strangerzonehf

.
.
.
@prithivMLmods πŸ€—
reacted to luigi12345's post with πŸ‘ 22 days ago
MinimalScrap
Only free dependencies. Save it; it's quite useful.


!pip install googlesearch-python requests

from googlesearch import search
import requests

query = "Glaucoma"
# Search nih.gov for PDFs matching the query and save each result locally
for url in search(f"{query} site:nih.gov filetype:pdf", num_results=20):
    if url.endswith(".pdf"):
        filename = url.split("/")[-1]
        with open(filename, "wb") as f:
            f.write(requests.get(url).content)
        print("βœ…" + filename)
print("Done!")

reacted to Dref360's post with πŸ€— 23 days ago
New week, new #cv Gradio app for human understanding (Dref360/human-interaction-demo) πŸ₯³

This demo highlights when a person touches an object. For instance, it is useful to know if someone is touching a wall, a vase or a door. It works for multiple people too!

Still using nielsr/vitpose-base-simple for pose estimation, very excited to see the PR approved!


reacted to Jaward's post with πŸ€— 23 days ago
reacted to fantaxy's post with πŸ‘€ 2 months ago
NSFW Erotic Novel AI Generation
-NSFW Text (Data) Generator for Detecting 'NSFW' Text: Multilingual Experience

The multilingual NSFW text (data) auto-generator is a tool designed to automatically generate and analyze adult content in various languages. This service uses AI-based text generation to produce various types of NSFW content, which can then be used as training data to build effective filtering models. It supports multiple languages, including English, and allows users to input the desired language through the system prompt in the on-screen options to generate content in the specified language. Users can create datasets from the generated data, train machine learning models, and improve the accuracy of text analysis systems. Furthermore, content generation can be customized according to user specifications, allowing for the creation of tailored data. This maximizes the performance of NSFW text detection models.


Web: https://fantaxy-erotica.hf.space
API: https://replicate.com/aitechtree/nsfw-novel-generation

Usage Warnings and Notices: This tool is intended for research and development purposes only, and the generated NSFW content must adhere to appropriate legal and ethical guidelines. Proper monitoring is required to prevent the misuse of inappropriate content, and legal responsibility lies with the user. Users must comply with local laws and regulations when using the data, and the service provider is not liable for any issues arising from the misuse of the data.
reacted to DeFactOfficial's post with πŸ”₯ 2 months ago
Ladies and Gents, please try my new Assistant, Image Gen - Uncensored Edition, on HuggingChat.

https://hf.co/chat/assistant/66fccce0c0fafc94ab557ef2

This is a multimodal assistant: Qwen 2.5 72B + SOTA diffusion models for image generation. Same architecture as Image Gen+ but with some MAJOR improvements! These are as follows:

- Switched the LLM to Qwen 2.5 72B, the most powerful model currently available on HuggingChat. This results in higher-quality prompts for the txt2img model and much better adherence to the prompt-in-URL format that the upstream provider requires (image gen models are hosted by pollinations, as with most other assistants on HuggingChat that offer image generation; a rough sketch of this URL format follows the list below).

- Cleaned up the system prompt, including the examples of the prompt-in-URL format, and adjusted the logic that determines how many images to generate based on the quality of the user prompt... these changes further improve the results.

- Assistant has access to multiple image generation models and will by default choose whatever model is most appropriate for the task. This includes NSFW generations, which it makes using an uncensored SD3 turbo. For other workloads, the Assistant preferentially uses one of the flux variants or any-dark (an artistic SDXL finetune), based on the nature of the task. Available models include turbo, flux, flux-realism, flux-anime, flux-3d, any-dark

- Added verbiage to system prompt which greatly reduces censorship / refusals by the LLM (the txt2img models are uncensored to start off)
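
For reference, a rough sketch of what such a prompt-in-URL request to pollinations might look like (the endpoint and query parameters below are assumptions, not the assistant's exact format):

import requests
from urllib.parse import quote

prompt = "a psychedelic painting of elephants and gerbils"
model = "flux-realism"  # e.g. turbo, flux, flux-realism, flux-anime, flux-3d, any-dark
# Assumed pollinations image endpoint and parameter names
url = f"https://image.pollinations.ai/prompt/{quote(prompt)}?model={model}&width=1024&height=1024"

response = requests.get(url, timeout=120)
with open("generated.jpg", "wb") as f:
    f.write(response.content)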

Here are the user-entered prompts used to create the images you see here... feel free to try them yourself!

"Ayatollah Khameini and Kamala Harris having a secret romantic rendezvous. Use flux-realism model"
"A self portrait of your consciousness"
"The chien of andalous, in a psychedelic style"
"Make me 4 paintings in the style of Frida Kahlo that I can sell to tourists in a mexican hippie town"
"Paint me a van gogh and greg rutkowski style scene involving elephants and gerbils"
reacted to nyuuzyou's post with πŸ”₯ 2 months ago
πŸŽ“ Introducing Bigslide.ru Presentations Dataset - nyuuzyou/bigslide

Dataset highlights:
- 50,872 presentations from bigslide.ru, a platform for storing and viewing presentations for school students
- Primarily in Russian, with some English and potentially other languages
- Each entry includes: URL, title, download URL, filepath, and extracted text content (where available)
- Contains original PPT/PPTX files in addition to metadata
- Data covers a wide range of educational topics and presentation materials
- Dedicated to the public domain under Creative Commons Zero (CC0) license

The dataset can be used for analyzing educational presentation content in Russian and other languages, text classification tasks, and information retrieval systems. It's particularly valuable for examining trends in educational presentation materials and sharing practices in the Russian-speaking student community. The inclusion of original files allows for in-depth analysis of presentation formats and structures commonly used in educational settings.
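
A minimal sketch of loading it with the datasets library (the split name and column names below follow the description above but are assumptions):

from datasets import load_dataset

# Stream the dataset to avoid downloading everything at once; the "train" split is assumed
ds = load_dataset("nyuuzyou/bigslide", split="train", streaming=True)
for i, entry in enumerate(ds):
    # Column names such as "url" and "title" are assumed from the entry description above
    print(entry.get("url"), entry.get("title"))
    if i >= 4:
        break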
reacted to ahsanr's post with πŸ‘€ 2 months ago
I am looking for an open-source real-time TTS voice cloning model. Need suggestions!
reacted to singhsidhukuldeep's post with πŸ€— 4 months ago
AutoGen from @Microsoft is crazy! πŸš€ It's an open-source framework that allows LLM agents to chat with each other to solve your tasks. πŸ€–πŸ’¬

They use the Assistant-Agent and User-Proxy-Agent framework! πŸ› οΈ

As the name suggests, the Assistant-Agent does the work, and the User-Proxy-Agent behaves like a human, guiding the Assistant-Agent and double-checking its work! πŸ§‘β€πŸ’»βœ…

Both Assistant-Agent and User-Proxy-Agent can be the same or different LLMs. πŸ€”πŸ”„

AutoGen is an open-source programming framework for building AI agents and facilitating cooperation among multiple agents to solve tasks. 🌟

This is truly amazing for building agentic AI quickly! πŸš€βœ¨

GitHub: https://github.com/microsoft/autogen πŸ”—


from autogen import AssistantAgent, UserProxyAgent, config_list_from_json

# Load the LLM configuration (API keys, model names) from the OAI_CONFIG_LIST env var or file
config_list = config_list_from_json(env_or_file="OAI_CONFIG_LIST")

# The assistant agent does the work; the user proxy stands in for the human and runs generated code locally
assistant = AssistantAgent("assistant", llm_config={"config_list": config_list})
user_proxy = UserProxyAgent("user_proxy", code_execution_config={"work_dir": "coding", "use_docker": False})

# This initiates an automated chat between the two agents to solve the task
user_proxy.initiate_chat(assistant, message="Plot a chart of NVDA and TESLA stock price change YTD.")