John Smith PRO
John6666
AI & ML interests
None yet
Recent Activity
reacted to ProCreations's post (1 minute ago)
What do you think of Intellite's new icons/logo? Let us know!
Also, Intellite Chat technically does work! But we decided to scale it up a bit (same parameter count at 100M, but going from 4B to 200B training tokens, a big upgrade!) for max quality.
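The jump from 4B to 200B tokens can be put in rough numbers with the common 6·N·D estimate of transformer training FLOPs (N = parameters, D = tokens). The figures below come from the post; the 6·N·D rule itself is a standard back-of-envelope approximation from the scaling-law literature, not something stated there.

```python
# Rough training-compute estimate using the common 6*N*D FLOPs approximation
# (N = parameter count, D = training tokens). Model/token numbers are from
# the post; 6*N*D is a back-of-envelope rule, not an exact figure.

def train_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6.0 * n_params * n_tokens

N = 100e6                     # 100M parameters (unchanged)
old = train_flops(N, 4e9)     # originally trained on 4B tokens
new = train_flops(N, 200e9)   # scaled up to 200B tokens

print(f"old: {old:.2e} FLOPs, new: {new:.2e} FLOPs, ratio: {new / old:.0f}x")
# The token budget (and thus training compute) grows 50x at fixed model size.
```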
Organizations
John6666's activity

reacted to ProCreations's post (1 minute ago)

reacted to blaise-tk's post (1 minute ago)
Post
340
Today we launch Dione.
A few months ago it was just a wild idea I shared with @bygimenez; now it's real.
Dione (Beta) is here, the easiest way to discover and install open-source apps, especially AI ones.
Think of it as the Steam of open source. Installing open-source tools is often a mess. Dione fixes that.
Beautiful UI and workflow. Soon multi-platform, multilingual & fully open-source.
Users can even write and share their own installation scripts. This is just the beginning.
Join our exclusive Beta:
https://getdione.app/beta/join

reacted to Jaward's post (1 minute ago)
Post
62
finally, a course that makes diffusion math much easier to grasp, well done! https://diffusion.csail.mit.edu/

reacted to kulia-moon's post (about 20 hours ago)

reacted to prithivMLmods's post (about 20 hours ago)
Post
1093
Dropping some image classification models for content moderation, balancers, and classifiers trained on synthetic datasets, along with others based on datasets available on the Hub. Also loaded a few low-rank datasets for realistic gender portrait classification and document-type classifiers, all fine-tuned on the SigLIP-2 Patch-16 224 backbone. Models and datasets are listed below:
Models & Datasets:
Realistic Gender Classification: prithivMLmods/Realistic-Gender-Classification
↳ prithivMLmods/Realistic-Portrait-Gender-1024px
Document Type Detection: prithivMLmods/Document-Type-Detection
↳ prithivMLmods/Document-Type-Detection
Face Mask Detection: prithivMLmods/Face-Mask-Detection
↳ DamarJati/Face-Mask-Detection
Alzheimer Stage Classifier: prithivMLmods/Alzheimer-Stage-Classifier
↳ SilpaCS/Augmented_alzheimer
Bone Fracture Detection: prithivMLmods/Bone-Fracture-Detection
↳ Hemg/bone-fracture-detection
GiD Land Cover Classification: prithivMLmods/GiD-Land-Cover-Classification
↳ jonathan-roberts1/GID
Collection: prithivMLmods/siglip2-05102025-681c2b0e406f0740a993fc1c
To know more about it, visit the model card of the respective model.
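As a hedged usage sketch (not from the post): the checkpoints above are standard image-classification fine-tunes, so they should load with the transformers `pipeline` API. The model id below is one from the list, `"photo.jpg"` is a placeholder, and the lazy import keeps the small helper testable without downloading anything; check each model card for the exact labels.

```python
# Sketch: running one of the listed SigLIP-2 fine-tunes with the transformers
# image-classification pipeline. Model id is from the post; "photo.jpg" is a
# placeholder path. See the model card for the real label set.

def classify_image(image_path: str,
                   model_id: str = "prithivMLmods/Face-Mask-Detection"):
    # Imported lazily so the pure helper below works without transformers.
    from transformers import pipeline
    clf = pipeline("image-classification", model=model_id)
    return clf(image_path)  # list of {"label": ..., "score": ...} dicts

def top_label(results) -> str:
    """Pick the highest-scoring label from pipeline-style output."""
    return max(results, key=lambda r: r["score"])["label"]

# Example with mocked pipeline output:
fake = [{"label": "with_mask", "score": 0.97},
        {"label": "without_mask", "score": 0.03}]
print(top_label(fake))  # with_mask
```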

reacted to lorraine2's post (1 day ago)
Post
384
New NVIDIA paper: Audio-SDS
We adapt Score Distillation Sampling (SDS), originally developed for text-to-3D generation, to audio diffusion models, allowing us to reuse large pretrained models for new text-guided parametric audio tasks such as source separation, physically informed impact synthesis, and more.
Project Page: https://research.nvidia.com/labs/toronto-ai/Audio-SDS/
Full Paper: https://arxiv.org/abs/2505.04621
Check out more from NVIDIA's Spatial Intelligence Lab here: https://research.nvidia.com/labs/toronto-ai/
This project was led by the great work of Jessie Richter-Powell, along with Antonio Torralba.
Notably, we find a new and exciting use case for Stable Audio Open!
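For reference, this is the SDS gradient in its usual DreamFusion-style form, which the paper adapts from images to audio; here x = g(θ) would be audio rendered by a parametric synthesizer rather than a rendered image. This is my paraphrase of the standard SDS formulation, not an equation quoted from the paper.

```latex
% Standard SDS gradient (DreamFusion notation): \hat{\epsilon}_\phi is the
% pretrained diffusion model's noise prediction, y the text condition,
% w(t) a timestep weighting, and x = g(\theta) the rendered signal.
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta)
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(\mathbf{x}_t;\, y, t) - \epsilon\bigr)\,
      \frac{\partial \mathbf{x}}{\partial \theta}
    \right],
  \qquad \mathbf{x} = g(\theta).
```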

reacted to daavoo's post (1 day ago)
Post
1025
Have you heard about the Agent2Agent Protocol (A2A)?
We have just released an option in https://github.com/mozilla-ai/any-agent to serve with A2A any of the supported agent frameworks (Agno, Google ADK, Langchain, LlamaIndex, OpenAI Agents SDK, smolagents and tinyagent)!
Check the docs https://mozilla-ai.github.io/any-agent/serving/
# google_expert.py
from any_agent import AgentConfig, AnyAgent
from any_agent.config import ServingConfig
from any_agent.tools import search_web

# Build an agent on the Google ADK framework with a web-search tool.
agent = AnyAgent.create(
    "google",
    AgentConfig(
        name="google_expert",
        model_id="gpt-4.1-nano",
        instructions="You must use the available tools to find an answer",
        description="An agent that can answer questions about the Google Agents Development Kit (ADK).",
        tools=[search_web],
    ),
)

# Expose the agent over the A2A protocol on port 5001.
agent.serve(ServingConfig(port=5001))
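For the curious, here is a rough sketch of what a raw request to the served agent could look like. The JSON-RPC envelope and the "tasks/send" method follow my reading of the early A2A spec; treat the exact field names as assumptions and prefer the any-agent docs or a proper A2A client library in practice.

```python
# Sketch of a raw A2A request to the served agent. Method name and payload
# shape are assumptions based on the early A2A spec (JSON-RPC 2.0,
# "tasks/send" with a parts-based message); verify against the any-agent docs.
import json
import uuid

def build_task_request(text: str) -> dict:
    """Build a JSON-RPC 2.0 'tasks/send' payload for a user text message."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "tasks/send",
        "params": {
            "id": str(uuid.uuid4()),  # task id
            "message": {
                "role": "user",
                "parts": [{"type": "text", "text": text}],
            },
        },
    }

payload = build_task_request("What is the Google ADK?")
print(json.dumps(payload)[:60], "...")
# POST this JSON to http://localhost:5001/ where agent.serve() is listening.
```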

reacted to YerbaPage's post (1 day ago)
Post
1537
Curated list of **Next-Gen Code Generation** papers & benchmarks!
Stay ahead with the latest in:
- Repo-level Issue Resolution (SWE-bench, Agents)
- Repo-level Code Completion (Repo understanding)
- Datasets & Benchmarks
Check it out: https://github.com/YerbaPage/Awesome-Repo-Level-Code-Generation

reacted to nyuuzyou's post (1 day ago)
Post
313
HailuoAI Video Metadata Dataset: nyuuzyou/hailuoai
Collection of 544,646 AI-generated video metadata entries from HailuoAI featuring:
- Comprehensive metadata: direct video URLs, dimensions, creation parameters, model IDs, tags, and more.
- All metadata explicitly released into the public domain under the CC0 license.
- Organized in a single train split with 544,646 entries.
This is likely the most extensive public dataset of AI-generated videos to date.
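For anyone who wants to poke at the metadata, a hedged loading sketch with the datasets library: the repo id and split name come from the post, while streaming mode and the small helper are my additions (inspect `dataset.features` for the real schema).

```python
# Sketch: loading the HailuoAI metadata with the `datasets` library.
# Repo id and the single "train" split are from the post; streaming mode
# avoids downloading all 544,646 entries at once.

def load_hailuoai(streaming: bool = True):
    # Imported lazily so the helper below stays testable without `datasets`.
    from datasets import load_dataset
    return load_dataset("nyuuzyou/hailuoai", split="train", streaming=streaming)

def take(iterable, n: int) -> list:
    """Grab the first n records from a (possibly streaming) dataset."""
    out = []
    for i, row in enumerate(iterable):
        if i >= n:
            break
        out.append(row)
    return out

# Pure-Python demo of the helper (no download needed):
print(take(range(10), 3))  # [0, 1, 2]
```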

reacted to MonsterMMORPG's post (1 day ago)
Post
1466
TRELLIS is still the leading open-source AI model for generating high-quality 3D assets from static images. Some mind-blowing examples. Supports improved multi-angle image-to-3D as well. Works on GPUs with as little as 6 GB of VRAM.
Tutorial link: https://www.youtube.com/watch?v=EhU7Jil9WAk
App link: https://www.patreon.com/posts/Trellis-App-Installer-Zip-File-117470976
Our app is highly advanced, with many features, and supports GPUs with as little as 6 GB of VRAM.
It also fully supports RTX 5000-series GPUs.
TRELLIS is currently the state-of-the-art, locally runnable, open-source generator of very high-quality 3D assets from images. I have developed a 1-click installer and an advanced Gradio app for this model with many features. In this tutorial video I show, step by step, how to use this AI tool and generate the best, highest-quality 3D assets locally. You can also use it on RunPod and Massed Compute if you are GPU-poor.
Follow the link below to download the zip file that contains the Trellis installer and Gradio app (the one used in the tutorial):
https://www.patreon.com/posts/Trellis-App-Installer-Zip-File-117470976
Python, Git, CUDA, C++ Tools, FFmpeg, cuDNN, MSVC installation tutorial (needed for AI apps, one-time setup):
https://youtu.be/DrhUHnYfwC0
SECourses Official Discord (10,500+ members):
https://discord.com/servers/software-engineering-courses-secourses-772774097734074388
Stable Diffusion, FLUX, Generative AI Tutorials and Resources GitHub:
https://github.com/FurkanGozukara/Stable-Diffusion
SECourses Official Reddit, stay subscribed to learn all the news and more:
https://www.reddit.com/r/SECourses/
Official TRELLIS Repo:
https://github.com/microsoft/TRELLIS

reacted to DawnC's post (1 day ago)
Post
2312
PawMatchAI: The Complete Dog Breed Platform
PawMatchAI offers a comprehensive suite of features designed for dog enthusiasts and prospective owners alike. This all-in-one platform delivers five essential tools to enhance your canine experience:
1. Breed Detection: Upload any dog photo and the AI accurately identifies breeds from an extensive database of 124+ dog breeds. The system detects dogs in the image and provides confident breed-identification results.
2. Breed Information: Access detailed profiles for each breed covering exercise requirements, typical lifespan, grooming needs, health considerations, and noise behavior, giving you a complete understanding of any breed's characteristics.
3. Breed Comparison: Compare any two breeds side by side with intuitive visualizations highlighting differences in care requirements, personality traits, health factors, and more, perfect for making informed decisions.
4. Breed Recommendation: Receive personalized breed suggestions based on your lifestyle preferences. The sophisticated matching system evaluates compatibility across multiple factors, including living space, exercise capacity, experience level, and family situation.
5. Style Transfer: Transform your dog photos into artistic masterpieces with five distinct styles: Japanese Anime, Classic Cartoon, Oil Painting, Watercolor, and Cyberpunk, adding a creative dimension to your pet photography.
Explore PawMatchAI today:
DawnC/PawMatchAI
If you enjoy this project or find it valuable for your canine companions, I'd greatly appreciate your support with a Like for this project.
#ArtificialIntelligence #MachineLearning #ComputerVision #PetTech #TechForLife
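As an illustration only of what "evaluates compatibility across multiple factors" could mean mechanically, here is a toy weighted-similarity scorer. The factor names, weights, and numbers are invented for this sketch; they are not PawMatchAI's actual algorithm.

```python
# Toy multi-factor compatibility scorer (illustrative only; NOT PawMatchAI's
# real matching system). Each factor is a 0-1 rating for both the user's
# lifestyle and the breed's needs; closer ratings score higher.

WEIGHTS = {"living_space": 0.3, "exercise": 0.3, "experience": 0.2, "family": 0.2}

def compatibility(user: dict, breed: dict) -> float:
    """Weighted similarity across factors; returns a score in [0, 1]."""
    score = 0.0
    for factor, weight in WEIGHTS.items():
        score += weight * (1.0 - abs(user[factor] - breed[factor]))
    return round(score, 3)

user = {"living_space": 0.4, "exercise": 0.8, "experience": 0.5, "family": 1.0}
border_collie = {"living_space": 0.9, "exercise": 1.0, "experience": 0.8, "family": 0.9}
print(compatibility(user, border_collie))  # 0.71
```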

reacted to onekq's post (1 day ago)
Post
1428
The new Mistral medium model is very impressive for its size. Will it be open sourced given the history of Mistral? Does anyone have insights?
onekq-ai/WebApp1K-models-leaderboard

reacted to Nymbo's post (1 day ago)
Post
749
Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?

reacted to m-ric's post (2 days ago)
Post
2743
I've made an open version of Google's NotebookLM, and it shows the superiority of the open-source tech stack!
The app's workflow is simple. Given a source PDF or URL, it extracts the content, then tasks Meta's Llama 3.3-70B with writing the podcast script, with a good prompt crafted by @gabrielchua ("two hosts, with lively discussion, fun notes, insightful questions, etc.").
Then it hands off the text-to-speech conversion to Kokoro-82M, and there you go: you have two hosts discussing any article.
The generation is nearly instant, because:
> Llama 3.3 70B runs at 1,000 tokens/second with Cerebras inference
> The audio is generated in streaming mode by the tiny (yet powerful) Kokoro, generating voices faster than real time.
And the audio generation runs for free on Zero GPUs, hosted by HF on H200s.
Overall, open-source solutions rival the quality of closed-source solutions at close to no cost!
Try it here: m-ric/open-notebooklm
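The script-writing step of that workflow can be sketched as a simple prompt builder. The wording below is illustrative, loosely based on the post's summary of @gabrielchua's prompt; `build_podcast_prompt` is a hypothetical helper, and the actual Space may phrase things differently.

```python
# Sketch of the script-generation step: turn extracted article text into a
# two-host podcast-script prompt for the LLM (Llama 3.3-70B in the app).
# Prompt wording is illustrative, paraphrased from the post.

def build_podcast_prompt(source_text: str) -> str:
    """Wrap extracted content in a two-host podcast-script instruction."""
    return (
        "Write a podcast script with two hosts having a lively discussion "
        "about the following content. Include fun notes and insightful "
        "questions.\n\n" + source_text
    )

prompt = build_podcast_prompt("Transformers are sequence models...")
print(prompt.splitlines()[0])
# The generated script then goes to a TTS model (Kokoro-82M in the app).
```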

reacted to Zherui's post (2 days ago)
Post
395
Hi community,
We are excited to announce the AgiBot World Challenge at IROS 2025! This competition offers an opportunity to push the limits of humanoid robotics, focusing on real-world manipulation tasks and generative modeling.
The competition features two tracks:
Manipulation: Participants will train models for various tasks ranging from easy to challenging, including precise manipulations, long-term tasks, and multi-robot collaboration in diverse environments such as homes, dining areas, and retail spaces.
World Model: This track evaluates models' ability to predict the evolution of visual perspectives based on action sequences, requiring participants to work with real-world data and simulate various robotic interactions.
For more on the challenge, visit our website at https://opendrivelab.com/challenge2025/. We look forward to your participation and advancing the future of robotics together!

reacted to ProCreations's post (2 days ago)
Post
2304
Post of the Day: Your Thoughts, Our Take
Yesterday we asked:
If AI could master just one thing, what should it be?
And the responses? Insightful, creative, and genuinely thought-provoking.
Here are a few that stood out:
@NandaKrishvaa said "Curiosity like a baby."
Instead of just answering questions, an AI that asks them with childlike wonder? That's a whole new kind of intelligence.
@MrDevolver suggested "Master being Jack of All Trades."
Sure, it bends the rules a bit, but adaptability is key. Sometimes breadth can outshine depth.
@afranco50 argued for "Perfect logic," saying it could unlock all other abilities.
It's a solid point: if an AI can reason flawlessly, it may just learn to improve everything else on its own.
Our take?
We still believe the biggest leap forward is flawless conversation: not just accurate, but deeply human. Emotional intelligence, nuance, humor, empathy. That kind of interaction is what makes AI feel real.
It's also why we're building IntellIte Chat to focus on that exact skill set:
- Emotion-aware replies
- Natural, flowing conversation
- Strong command of casual and expressive English
When it releases, it won't just talk; it'll connect. And in a world full of tools, we think the future needs more companions.
What do you think? Let us know! If we get more comments, might as well do another post on this tomorrow lol.

reacted to AdinaY's post (2 days ago)
Post
1812
HunyuanCustom: a multimodal video generation framework supporting image, audio, video & text conditions, released by TencentHunyuan
tencent/HunyuanCustom
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation (2505.04512)
- Strong identity consistency
- Outperforms SOTA methods

reacted to onekq's post (2 days ago)
Post
3068
This time Gemini is very quick with API support on its 2.5 pro May release. The performance is impressive too, now it is among top contenders like o4, R1, and Claude.
onekq-ai/WebApp1K-models-leaderboard

reacted to DawnC's post (2 days ago)
Post
5203
VisionScout: Now with Video Analysis!
I'm excited to announce a major update to VisionScout, my interactive vision tool that now supports VIDEO PROCESSING, in addition to powerful object detection and scene understanding!
NEW: Video Analysis Is Here!
- Upload any video file to detect and track objects using YOLOv8.
- Customize processing intervals to balance speed and thoroughness.
- Get comprehensive statistics and summaries showing object appearances across the entire video.
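The "processing intervals" idea reduces to simple frame-index arithmetic: analyze one frame out of every N. A minimal sketch of that trade-off in pure Python (the app itself wires the chosen frames to YOLOv8; this helper is my illustration, not VisionScout's code):

```python
# Sketch of interval-based video sampling: choose which frames to run
# detection on, trading speed (large interval) for thoroughness (small one).

def frames_to_process(total_frames: int, interval: int) -> list[int]:
    """Indices of frames to analyze, one every `interval` frames."""
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return list(range(0, total_frames, interval))

# A 300-frame clip (~10 s at 30 fps), analyzing every 30th frame:
print(frames_to_process(300, 30))      # [0, 30, 60, ..., 270] -> 10 detections
print(len(frames_to_process(300, 1)))  # 300 -> every frame, slow but thorough
```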
What else can VisionScout do?
- Analyze any image and detect 80 object types with YOLOv8.
- Switch between Nano, Medium, and XLarge models for speed or accuracy.
- Filter by object classes (people, vehicles, animals, etc.) to focus on what matters.
- View detailed stats on detections, confidence levels, and distributions.
- Understand scenes, interpreting environments and potential activities.
- Automatically identify possible safety concerns based on detected objects.
What's coming next?
- Expanding YOLO's object categories.
- Faster real-time performance.
- Improved mobile responsiveness.
My goal:
To bridge the gap between raw detection and meaningful interpretation.
I'm constantly exploring ways to help machines not just "see" but truly understand context, and to make these advanced tools accessible to everyone, regardless of technical background.
Try it now: DawnC/VisionScout
If you enjoy VisionScout, a Like for this project or feedback would mean a lot and keeps me motivated to keep building and improving!
#ComputerVision #ObjectDetection #VideoAnalysis #YOLO #SceneUnderstanding #MachineLearning #TechForLife