All HF Hub posts

openfree
posted an update 2 days ago
🔥 Creating a qwen3-30b-a3b / qwen3-235b-a22b Chatbot with Deep Research Capabilities 🚀

openfree/qwen3-30b-a3b-research
openfree/qwen3-235b-a22b-research

Hello AI researchers! 👋 Today I'm introducing a powerful chatbot implementation with real-time web search capabilities.

✨ Key Features

🧠 Chatbot based on qwen3-30b-a3b and llama4-maverick models
🔍 LLM-based optimal keyword extraction
🌐 Real-time web search using SerpHouse API
💬 Streaming responses for natural conversation experience

🛠️ Technology Stack

Gradio: Implementation of intuitive web interface
Fireworks.ai API: Access to high-performance LLM models
SerpHouse API: Collection of real-time search results
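
The flow described above (keyword extraction, then web search, then a grounded answer) could be wired together roughly as follows. This is only a sketch, not the Space's actual code: `extract_keywords`, `web_search` and `generate` are hypothetical callables standing in for the LLM and SerpHouse calls, and the result shape (`title`, `url`, `snippet`) is an assumption.

```python
from typing import Callable, Dict, List

# Assumed result shape: each search hit has "title", "url" and "snippet" keys.
SearchResult = Dict[str, str]


def build_search_prompt(question: str, results: List[SearchResult]) -> str:
    """Format search hits as numbered sources placed ahead of the user question."""
    lines = ["Answer using the numbered sources below and cite them like [1]."]
    for i, r in enumerate(results, start=1):
        lines.append(f"[{i}] {r['title']} ({r['url']}): {r['snippet']}")
    lines.append(f"Question: {question}")
    return "\n".join(lines)


def answer_with_search(
    question: str,
    extract_keywords: Callable[[str], str],          # hypothetical: LLM call returning a search query
    web_search: Callable[[str], List[SearchResult]],  # hypothetical: wrapper around a search API
    generate: Callable[[str], str],                   # hypothetical: LLM call returning the answer
    max_results: int = 5,
) -> str:
    """Keyword extraction -> web search -> generation grounded in the search hits."""
    query = extract_keywords(question)
    results = web_search(query)[:max_results]
    return generate(build_search_prompt(question, results))
```

Passing the three steps in as callables keeps the sketch independent of any specific provider, so the same structure works whichever LLM or search backend is plugged in.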

🌟 Application Areas

Question answering systems requiring up-to-date information
Providing current information beyond training data
Delivering reliable information with accurate sources

Add real-time search capabilities to your AI applications with this project! 🎉 Leave your questions or suggestions in the comments! Let's improve it together~ 💪
#LLM #ArtificialIntelligence #WebSearch #Gradio #DeepResearch #OpenSource
merve
posted an update about 21 hours ago
A real-time object detector that is much faster and more accurate than YOLO, with an Apache 2.0 license, just landed in Hugging Face transformers 🔥

D-FINE is the SOTA real-time object detector, and it runs on a T4 (free Colab) 🤩

> Collection with all checkpoints and demo ustc-community/d-fine-68109b427cbe6ee36b4e7352

Notebooks:
> Tracking https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_tracking.ipynb
> Inference https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_inference.ipynb
> Fine-tuning https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_finetune_on_a_custom_dataset.ipynb
h/t @vladislavbro @qubvel-hf @ariG23498 and the authors of the paper 🎩

Regular object detectors attempt to predict bounding boxes as pixel-perfect (x, y, w, h) coordinates, which is very rigid and hard to solve 🥲☹️

D-FINE instead predicts a distribution over bounding box coordinates and refines it iteratively, which makes it more accurate 🤩

Another core idea behind this model is Global Optimal Localization Self-Distillation ⤵️

This model uses the final layer's distribution output (sort of like a teacher) and distills it into earlier layers to make them more performant.
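
To make the two ideas concrete, here is a toy sketch in plain Python. It is not D-FINE's actual implementation (the real model uses its own binning, weighting and refinement across decoder layers); the bin values and logits below are made up for illustration. It shows a box edge offset computed as the expected value of a distribution over bins, and a KL term that pulls an earlier layer's distribution toward the final layer's.

```python
import math
from typing import List


def softmax(logits: List[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def expected_offset(logits: List[float], bin_values: List[float]) -> float:
    """Edge offset as the expectation of a discrete distribution over candidate offsets."""
    probs = softmax(logits)
    return sum(p * v for p, v in zip(probs, bin_values))


def kl_divergence(teacher: List[float], student: List[float], eps: float = 1e-12) -> float:
    """KL(teacher || student): how far the student distribution is from the teacher."""
    return sum(t * math.log((t + eps) / (s + eps)) for t, s in zip(teacher, student))


# Illustrative values only: five candidate offsets (in pixels) for one box edge.
bins = [-2.0, -1.0, 0.0, 1.0, 2.0]
early_logits = [0.0, 0.0, 1.0, 0.0, 0.0]   # earlier decoder layer (student)
final_logits = [0.0, 0.0, 1.0, 3.0, 0.0]   # final decoder layer (teacher)

# Self-distillation signal: push the early layer toward the final layer's distribution.
distill_loss = kl_divergence(softmax(final_logits), softmax(early_logits))
```

Predicting a distribution instead of a single number lets the model express uncertainty about each edge, and the distillation term gives earlier layers a soft target rather than only the hard ground-truth box.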

Kseniase
posted an update 2 days ago
10 new Chain-of-Thought (CoT) methods

CoT has long been one of the hottest techniques in AI thanks to its effectiveness and compelling core idea: encouraging models to solve complex problems through explicit intermediate reasoning steps. Researchers often modify the original CoT approach, finding tricks that further improve LLMs' reasoning. That's what we're going to talk about today.

Here's a list of the 10 latest enhanced CoT approaches:

1. Chain-of-Defensive-Thought -> Chain-of-Defensive-Thought: Structured Reasoning Elicits Robustness in Large Language Models against Reference Corruption (2504.20769)
Provides a few structured, defensive reasoning exemplars to improve the robustness of LLMs

2. Hybrid-CoT -> AdaR1: From Long-CoT to Hybrid-CoT via Bi-Level Adaptive Reasoning Optimization (2504.21659)
Proposes the Adaptive Hybrid Reasoning Model (AdaR1), which combines Long- and Short-CoT and applies bi-level preference training to select effective reasoning styles

3. Semantic-level and token-level CoT -> T2I-R1: Reinforcing Image Generation with Collaborative Semantic-level and Token-level CoT (2505.00703)
Introduces T2I-R1, a text-to-image generation model that uses semantic-level CoT for prompt planning and token-level CoT for pixel-level generation, while BiCoT-GRPO coordinates both

4. Speculative CoT (SCoT) -> Efficient Reasoning for LLMs through Speculative Chain-of-Thought (2504.19095)
SCoT drafts multiple reasoning paths with a lightweight draft model, selects the best, and uses the target model for correction, all to reduce latency by 48–66%

5. Collaborative CoT (Co-CoT) -> Co-CoT: A Prompt-Based Framework for Collaborative Chain-of-Thought Reasoning (2504.17091)
Breaks reasoning into blocks that users can inspect, modify and re-run, promoting active engagement. An adaptation mechanism aligns outputs with diverse cognitive styles and user goals

6. XS-CoT -> Enhancing Non-Core Language Instruction-Following in Speech LLMs via Semi-Implicit Cross-Lingual CoT Reasoning (2504.20835)
It's a cross-lingual framework that integrates speech-to-text translation into reasoning, using a semi-implicit CoT approach to compress intermediate tokens. This improves non-core language responses by up to 45%
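
As a rough illustration of the draft-select-correct pattern summarized in item 4, here is a minimal sketch. It only follows the one-line summary above, not the paper's actual algorithm: `draft`, `score` and `correct` are hypothetical callables for the small model, the selection step and the target model, and the acceptance threshold is an assumption added for the example.

```python
from typing import Callable


def speculative_cot(
    question: str,
    draft: Callable[[str], str],          # hypothetical: cheap model producing one reasoning chain
    score: Callable[[str, str], float],   # hypothetical: quality score for (question, chain)
    correct: Callable[[str, str], str],   # hypothetical: target model revising a weak chain
    num_drafts: int = 4,
    accept_threshold: float = 0.5,        # assumed cutoff, not taken from the paper
) -> str:
    """Draft several reasoning chains cheaply, keep the best, fall back to correction."""
    # The lightweight model proposes several candidate reasoning chains.
    candidates = [draft(question) for _ in range(num_drafts)]
    # Keep the candidate the scorer likes most.
    best = max(candidates, key=lambda chain: score(question, chain))
    # Only pay for the expensive target model when the best draft is not good enough.
    if score(question, best) >= accept_threshold:
        return best
    return correct(question, best)
```

The latency saving comes from the common case where a cheap draft is accepted and the large model never has to generate a full chain itself.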

Read further in the comments 👇

If you liked this, also subscribe to the Turing Post -> https://www.turingpost.com/subscribe
BramVanroy
posted an update 2 days ago
📢💾 Introducing the Common Crawl Creative Commons Corpus (C5)!

C5 is a large-scale effort to heavily filter web-crawled data, as collected by the non-profit Common Crawl, to only documents that are Creative Commons-licensed such as cc-by-4.0 or public domain cc0. At this stage 150 billion tokens have been collected.

---
📄 data: BramVanroy/CommonCrawl-CreativeCommons
🧰 software: https://github.com/BramVanroy/CommonCrawl-CreativeCommons
---

</> To build C5, HTML pages are scrutinized and all links (if any) to CC licenses are collected, both from regular hyperlinks and from metadata. Additional data fields are included, such as "was the license found in the head?" or "if multiple licenses were found, do they contradict each other?", which makes further filtering a breeze.
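
The kind of check described above can be sketched with Python's standard `html.parser`. This is an illustration under stated assumptions, not the C5 pipeline itself: the URL pattern, the field names and the rule that "more than one distinct license counts as disagreement" are all choices made for this example.

```python
import re
from html.parser import HTMLParser
from typing import Dict, List, Tuple

# Assumed pattern: matches e.g. creativecommons.org/licenses/by-sa/4.0/
# and creativecommons.org/publicdomain/zero/1.0/
CC_URL = re.compile(
    r"creativecommons\.org/(?:licenses/([a-z\-]+)|publicdomain/(zero|mark))/(\d\.\d)",
    re.IGNORECASE,
)


class CCLicenseFinder(HTMLParser):
    """Collects (license, version, found_in_head) for every CC license URL in a page."""

    def __init__(self) -> None:
        super().__init__()
        self.in_head = False
        self.found: List[Tuple[str, str, bool]] = []

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        attr = dict(attrs)
        # <a href>, <link href> and <meta content> can all carry a license URL.
        for key in ("href", "content"):
            match = CC_URL.search(attr.get(key) or "")
            if match:
                name = (match.group(1) or match.group(2)).lower()
                self.found.append((name, match.group(3), self.in_head))

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def summarize_licenses(html: str) -> Dict[str, object]:
    """Return the distinct licenses plus two illustrative filtering flags."""
    finder = CCLicenseFinder()
    finder.feed(html)
    licenses = sorted({f"{name}-{version}" for name, version, _ in finder.found})
    return {
        "licenses": licenses,
        "found_in_head": any(in_head for _, _, in_head in finder.found),
        # Simplifying assumption: any two distinct licenses count as a disagreement.
        "licenses_disagree": len(licenses) > 1,
    }
```

For example, a page with a `by/4.0` link in its head and a CC0 link in its body would be reported with both licenses, `found_in_head` set, and `licenses_disagree` set, which is the kind of signal that makes downstream filtering straightforward.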

🌐 In this first version of C5, 8 languages are included (Afrikaans, German, English, French, Frisian, Italian, Dutch and Spanish). The language set was limited for two reasons: computational and storage limitations, and a collaboration with GPT-NL, which requested CC data for these languages to train a Dutch-focused, copyright-conscious LLM. In total, this V1 release contains almost 150 thousand documents and 150 billion tokens. This data was neither filtered on quality nor deduplicated, so that you can decide for yourself how much data to keep. To give some quality indication, a dataset field is present to describe whether a document is included in the FineWeb(-2) datasets, which are of high quality.

🔍 More work needs to be done! Only 7 out of 100+ Common Crawl crawls have been processed so far. That's encouraging because it means there is a lot more Creative Commons data to be collected! But to get there I need help in terms of compute. The current processing was already heavily sponsored by the Flemish Supercomputer but more is needed. If you have the compute available and wish to collaborate in an open and transparent manner, please get in touch!
ginipick
posted an update 2 days ago
🔮 Mistral Perflexity AI - Local LLM Space with Web Search Capabilities 🌐
Hello AI enthusiasts! Today I'm excited to introduce my special Hugging Face space! 🚀

ginigen/Mistral-Perflexity

✨ Key Features

Powerful Model: Using Private-BitSix-Mistral-Small-3.1-24B-Instruct-2503, optimized through 6-bit quantization to run smoothly on local 4090 GPUs! 💪
Web Search Integration: Leveraging the Brave Search API to provide real-time web search results for user queries! 🔍
Customizable Responses: Shape AI personality and response format through system messages ⚙️
Multilingual Support: Perfect handling of both English and Korean! 🇺🇸🇰🇷

🛠️ Technical Highlights

GGUF Format: Optimized quantized model with excellent memory efficiency
Flash Attention: Applied optimization technology for faster inference speeds
8K Context Window: Capable of handling lengthy conversations and complex queries
Streaming Responses: Watch text being generated in real-time

💡 Use Cases

Complex Q&A requiring real-time information
Programming assistance and code generation
Multilingual content creation and translation
Summarization and explanation of learning materials

🔧 Customization
Adjust various parameters like Temperature, Top-p, Top-k, and repetition penalty to control response creativity and accuracy. Lower temperature (0.1-0.5) produces more deterministic responses, while higher values (0.7-1.0) generate more creative outputs!
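
To see why these knobs trade determinism for variety, here is a small self-contained sketch of how temperature, top-k and top-p reshape a next-token distribution. It is a toy illustration, not the Space's or any library's implementation: the token names and logits are made up, and the repetition penalty is left out.

```python
import math
from typing import Dict


def filter_distribution(
    logits: Dict[str, float],
    temperature: float = 1.0,
    top_k: int = 0,      # 0 means "no top-k limit"
    top_p: float = 1.0,  # 1.0 means "no nucleus cutoff"
) -> Dict[str, float]:
    """Return the renormalized next-token distribution after temperature, top-k and top-p."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    # Temperature: dividing logits by T < 1 sharpens the distribution, T > 1 flattens it.
    scaled = {tok: logit / temperature for tok, logit in logits.items()}
    peak = max(scaled.values())
    exps = {tok: math.exp(v - peak) for tok, v in scaled.items()}
    total = sum(exps.values())
    ranked = sorted(((tok, e / total) for tok, e in exps.items()),
                    key=lambda kv: kv[1], reverse=True)
    # Top-k: keep only the k most likely tokens, then renormalize.
    if top_k > 0:
        ranked = ranked[:top_k]
        kept_mass = sum(p for _, p in ranked)
        ranked = [(tok, p / kept_mass) for tok, p in ranked]
    # Top-p: keep the smallest prefix whose cumulative probability reaches top_p.
    kept, cumulative = [], 0.0
    for tok, p in ranked:
        kept.append((tok, p))
        cumulative += p
        if cumulative >= top_p:
            break
    norm = sum(p for _, p in kept)
    return {tok: p / norm for tok, p in kept}


# Made-up logits for three candidate tokens.
toy_logits = {"a": 2.0, "b": 1.0, "c": 0.0}
low_temp = filter_distribution(toy_logits, temperature=0.2)
high_temp = filter_distribution(toy_logits, temperature=1.0)
```

With the toy logits, the low-temperature distribution puts almost all of its mass on the top token, while the higher temperature leaves meaningful probability on the alternatives, which is where the extra "creativity" comes from.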

🌟 Try It Yourself!
This space is available for anyone to use for free. Experience the power of a robust local LLM combined with web search capabilities! Your feedback is always welcome! 😊
samihalawa
posted an update 1 day ago
HELLO GUYS 🚀 Just released my first MCP: VUDA – Visual UI Debug Agent
Ever been stuck debugging buttons that don't work? Broken flows? Inconsistent UI behavior?

VUDA sees it, clicks it, fixes it.
An automated visual debug agent that inspects, validates, and repairs your UI, like magic 🧠✨ Better than any other Playwright / Puppeteer tool.

🔧 Install now via Smithery:

npx -y @smithery/cli@latest install @samihalawa/visual-ui-debug-agent-mcp --client cursor

nyuuzyou
posted an update 1 day ago
🖼️ PublicDomainFiles.com Collection - nyuuzyou/publicdomainfiles

Collection of 206,204 Public Domain multimedia files featuring:

- Comprehensive metadata: title, description, creator name, keywords, original page URL, and more.
- Contains various media types including images, clip art, artwork, fonts, videos, and TV shows.
- All content explicitly released into the public domain under the CC0 license.
- Organized in a single train split with 206,204 entries.
ZennyKenny
posted an update 2 days ago
When I heard the Reasoning Dataset Competition deadline was extended to 9 May, I knew I had time to get in one more entry. 🔥🔥🔥

With the rise of Vibe Coding, and the potential risks that are introduced by humans letting LLMs build their apps for them, lots of people are (rightfully) concerned about the safety of the code that is hitting prod.

In response to that, I'm happy to present my final submission to the Reasoning Dataset Competition and attempt to start benchmarking the ability of LLMs to identify unsafe and / or exploitable code by way of the CoSa (Code Safety) benchmark: ZennyKenny/cosa-benchmark-dataset

Currently a curated set of 200 examples, calibrated on OpenAI's standard-issue models (GPT-4.1, o4 mini, and GPT-3.5 Turbo) as "baseline performance" (70% decile). Check it out and drop a ❤️ if you think it could be useful or hit the Community section with suggestions / critiques.
MonsterMMORPG
posted an update 1 day ago
Just published a tutorial that shows how to properly install ComfyUI and SwarmUI, and how to use the installed ComfyUI as a backend in SwarmUI with maximum performance, including out-of-the-box Sage Attention, Flash Attention, RTX 5000 series support and more. It also covers how to upscale images at maximum quality.

Tutorial Link

https://youtu.be/fTzlQ0tjxj0

Tutorial Information

If you want to generate the very best AI videos and images locally on your Windows computer, this is the tutorial you were looking for. It takes literally 1 click to install SwarmUI, the most powerful and advanced generative AI interface (with Flash Attention, Sage Attention, Triton, DeepSpeed, xFormers and full RTX 5000 series compatibility), and to download the very best AI image and video generation models with an ultra advanced model downloader Gradio app. SwarmUI uses the famous ComfyUI, the most powerful, advanced, performant and optimized option, as its backend. So SwarmUI is the ultimate generative AI tool at the moment, with a vast amount of features and constant updates.

Tutorial Important Download and App Links
🔗 Follow the link below to download the zip file that contains the SwarmUI installer and the AI models downloader Gradio app, the one used in the tutorial ⤵️

▶️ https://www.patreon.com/posts/SwarmUI-Installer-AI-Videos-Downloader-114517862

🔗 Follow the link below to download the zip file that contains the ComfyUI 1-click installer with Flash Attention, Sage Attention, xFormers, Triton, DeepSpeed and RTX 5000 series support ⤵️

▶️ https://www.patreon.com/posts/Advanced-ComfyUI-1-Click-Installer-105023709

🔗 Python, Git, CUDA, C++, FFMPEG, MSVC installation tutorial - needed for ComfyUI ⤵️

▶️ https://youtu.be/DrhUHnYfwC0

🔗 SECourses Official Discord 10500+ Members ⤵️

▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 Stable Diffusion, FLUX, Generative AI Tutorials and Resources GitHub ⤵️

▶️ https://github.com/FurkanGozukara/Stable-Diffusion
Raahulthakur
posted an update 1 day ago
FinSightX: Your AI Financial Co-Pilot
FinSightX is a multi-agent financial assistant powered by language models. Designed for analysts, investors, and fintech developers, it combines insights from multiple domains into a single, sleek Streamlit interface.

Features
Equity Analyst Agent → Ask questions about stocks, indicators, performance.
Macro Strategist Agent → Get macroeconomic insights using language models.
News Summarizer Agent → Summarizes market headlines instantly.
Quant Backtester Agent → Run basic backtests using bt.
Regulatory Radar Agent → Monitor policy shifts and alerts.
Client Advisor Agent → Assist with client queries or hypothetical portfolios.

Tech Stack
transformers, sentence-transformers
torch, scikit-learn, neuralprophet
bt for strategy backtesting
chromadb for vector storage
Streamlit + FastAPI for UI/backend

Developed and maintained by @Raahul-Thakur
Live Space: Raahulthakur/FinsightX

Built using open-source tools and financial domain knowledge. Contributions, feedback, and forks welcome!