John Smith PRO

John6666

AI & ML interests

None yet

Organizations

open/ acc, Solving Real World Problems, FashionStash Group meeting, No More Copyright

John6666's activity

reacted to ProCreations's post with 👀 1 minute ago
What do you think of Intellite’s new icons/logo? Let us know!

Also, Intellite Chat technically does work! But we decided to scale it up a bit for max quality: same parameter count (100M), but trained on 200B tokens instead of 4B. Big upgrade!
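
For a rough sense of scale, the token-to-parameter ratio implied by the numbers in the post (100M parameters, 4B versus 200B training tokens) can be worked out directly. This is only back-of-the-envelope arithmetic; the function name is made up for illustration and nothing here comes from the Intellite project itself.

```python
def tokens_per_parameter(train_tokens: float, parameters: float) -> float:
    """Return how many training tokens the model sees per parameter."""
    return train_tokens / parameters


PARAMS = 100e6      # 100M parameters, as stated in the post
OLD_TOKENS = 4e9    # 4B training tokens (previous run)
NEW_TOKENS = 200e9  # 200B training tokens (scaled-up run)

old_ratio = tokens_per_parameter(OLD_TOKENS, PARAMS)
new_ratio = tokens_per_parameter(NEW_TOKENS, PARAMS)
scale_up = NEW_TOKENS / OLD_TOKENS

print(f"old: {old_ratio:.0f} tokens/param, new: {new_ratio:.0f} tokens/param, {scale_up:.0f}x more data")
```

With these figures the ratio goes from 40 to 2,000 tokens per parameter, a 50x increase in training data at a fixed model size.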
reacted to blaise-tk's post with 👀 1 minute ago
Today we launch Dione.

A few months ago it was just a wild idea I shared with @bygimenez , now it's real.

Dione (Beta) is here, the easiest way to discover and install open-source apps, especially AI ones.

Think of it as the Steam of open source. Installing open-source tools is often a mess. Dione fixes that.

Beautiful UI and workflow. Soon multi-platform, multilingual & fully open-source.
Users can even write and share their own installation scripts. This is just the beginning.

🚀 Join our exclusive Beta
→ https://getdione.app/beta/join
reacted to Jaward's post with 🚀 1 minute ago
reacted to kulia-moon's post with 👍 about 20 hours ago
reacted to prithivMLmods's post with 👍 about 20 hours ago
Dropping some image classification models for content moderation, balancers, and classifiers trained on synthetic datasets, along with others based on datasets available on the Hub. Also loaded a few low-rank datasets for realistic gender portrait classification and document-type classifiers, all fine-tuned on the SigLIP-2 Patch-16 224 backbone. Models and datasets are listed below:

🤗 Models & Datasets:

Realistic Gender Classification : prithivMLmods/Realistic-Gender-Classification
⎙ prithivMLmods/Realistic-Portrait-Gender-1024px
Document Type Detection : prithivMLmods/Document-Type-Detection
⎙ prithivMLmods/Document-Type-Detection
Face Mask Detection : prithivMLmods/Face-Mask-Detection
⎙ DamarJati/Face-Mask-Detection
Alzheimer Stage Classifier : prithivMLmods/Alzheimer-Stage-Classifier
⎙ SilpaCS/Augmented_alzheimer
Bone Fracture Detection : prithivMLmods/Bone-Fracture-Detection
⎙ Hemg/bone-fracture-detection
GiD Land Cover Classification : prithivMLmods/GiD-Land-Cover-Classification
⎙ jonathan-roberts1/GID

🤗 Collection: prithivMLmods/siglip2-05102025-681c2b0e406f0740a993fc1c

To learn more, visit the model card of the respective model.
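
One of the classifiers listed above might be tried along the following lines. This is a minimal sketch that assumes the checkpoints load through the standard `transformers` image-classification pipeline; the respective model card remains the authoritative usage reference. The helper names `top_label` and `classify` are made up for illustration.

```python
from typing import Dict, List


def top_label(predictions: List[Dict]) -> str:
    """Pick the highest-scoring label from pipeline-style output
    (a list of {"label": ..., "score": ...} dictionaries)."""
    return max(predictions, key=lambda p: p["score"])["label"]


def classify(image_path: str,
             model_id: str = "prithivMLmods/Document-Type-Detection") -> str:
    """Classify one image with a Hub model (assumption: the checkpoint
    is compatible with the generic image-classification pipeline)."""
    # Imported lazily so top_label() can be used without transformers installed.
    from transformers import pipeline

    classifier = pipeline("image-classification", model=model_id)
    return top_label(classifier(image_path))
```

Calling `classify("scan.png")` would download the model on first use, so it needs network access and the `transformers` package; `scan.png` is a placeholder path.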
reacted to lorraine2's post with 👀 1 day ago
🔊 New NVIDIA paper: Audio-SDS 🔊

We adapt Score Distillation Sampling (SDS), originally developed for text-to-3D generation, to audio diffusion models, allowing us to reuse large pretrained models for new text-guided parametric audio tasks such as source separation, physically informed impact synthesis, and more.

🔎 Project Page: https://research.nvidia.com/labs/toronto-ai/Audio-SDS/
📖 Full Paper: https://arxiv.org/abs/2505.04621

Check out more from NVIDIA’s Spatial Intelligence Lab here: https://research.nvidia.com/labs/toronto-ai/

This project was led by the great work of Jessie Richter-Powell, along with Antonio Torralba.

Notably, we find a new and exciting use case for Stable Audio Open 🚀
reacted to daavoo's post with 👀 1 day ago
Have you heard about the Agent2Agent Protocol (A2A)?

We have just released an option in https://github.com/mozilla-ai/any-agent to serve with A2A any of the supported agent frameworks (Agno, Google ADK, Langchain, LlamaIndex, OpenAI Agents SDK, smolagents and tinyagent)!

Check the docs https://mozilla-ai.github.io/any-agent/serving/

# google_expert.py
from any_agent import AgentConfig, AnyAgent
from any_agent.config import ServingConfig
from any_agent.tools import search_web

agent = AnyAgent.create(
    "google",
    AgentConfig(
        name="google_expert",
        model_id="gpt-4.1-nano",
        instructions="You must use the available tools to find an answer",
        description="An agent that can answer questions about the Google Agents Development Kit (ADK).",
        tools=[search_web]
    )
)

agent.serve(ServingConfig(port=5001))
reacted to YerbaPage's post with 🔥 1 day ago
reacted to nyuuzyou's post with 👍 1 day ago
🎞️ HailuoAI Video Metadata Dataset - nyuuzyou/hailuoai

Collection of 544,646 AI-generated video metadata entries from HailuoAI featuring:

- Comprehensive metadata: direct video URLs, dimensions, creation parameters, model IDs, tags, and more.
- All metadata explicitly released into the public domain under the CC0 license.
- Organized in a single train split with 544,646 entries.

This is likely the most extensive public dataset of AI-generated videos to date.
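
A quick way to inspect a dataset of this size without downloading all of it is to stream the first rows and tally one metadata field. The sketch below assumes the dataset loads with the `datasets` library under the repo id and `train` split named in the post. The helper names are illustrative, and any field name passed in (for example `model_id`) is hypothetical until checked against the dataset card.

```python
from collections import Counter
from itertools import islice
from typing import Dict, Iterable


def count_values(rows: Iterable[Dict], field: str, limit: int = 1000) -> Counter:
    """Count the values of one metadata field over the first `limit` rows."""
    return Counter(str(row.get(field)) for row in islice(rows, limit))


def peek(field: str, limit: int = 1000) -> Counter:
    """Stream the train split and tally `field` (needs network access)."""
    # Lazy import so count_values() works without the datasets package.
    from datasets import load_dataset

    rows = load_dataset("nyuuzyou/hailuoai", split="train", streaming=True)
    return count_values(rows, field, limit)
```

For example, `peek("model_id", limit=500)` would tally that field over the first 500 entries, provided such a column exists.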
reacted to MonsterMMORPG's post with 👀 1 day ago
TRELLIS is still the leading open-source AI model for generating high-quality 3D assets from static images - some mind-blowing examples - supports improved multi-angle image-to-3D as well - works on GPUs with as little as 6 GB


Tutorial link : https://www.youtube.com/watch?v=EhU7Jil9WAk

App Link : https://www.patreon.com/posts/Trellis-App-Installer-Zip-File-117470976

Our app is very advanced, with many features, and supports GPUs with as little as 6 GB.

It also fully supports RTX 5000 GPUs.

TRELLIS is currently the state-of-the-art open-source image-to-3D generator for very high-quality assets that can be run locally. I have developed 1-click installers and a very advanced Gradio app for this model with many features. In this tutorial video I show, step by step, how to use this AI tool to generate very high-quality 3D assets locally. You can also use this tool on RunPod and Massed Compute if you are GPU poor.

🔗 Follow the link below to download the zip file that contains the Trellis installer and Gradio app - the one used in the tutorial ⤵️
▶️ https://www.patreon.com/posts/Trellis-App-Installer-Zip-File-117470976

🔗 Python, Git, CUDA, C++ Tools, FFmpeg, cuDNN, MSVC installation tutorial - needed for AI apps - 1-time only setup ⤵️
▶️ https://youtu.be/DrhUHnYfwC0

🔗 SECourses Official Discord 10500+ Members ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

🔗 Stable Diffusion, FLUX, Generative AI Tutorials and Resources GitHub ⤵️
▶️ https://github.com/FurkanGozukara/Stable-Diffusion

🔗 SECourses Official Reddit - Stay Subscribed To Learn All The News and More ⤵️
▶️ https://www.reddit.com/r/SECourses/

🔗 Official TRELLIS Repo ⤵️
▶️ https://github.com/microsoft/TRELLIS
reacted to DawnC's post with 🔥 1 day ago
PawMatchAI 🐾: The Complete Dog Breed Platform

PawMatchAI offers a comprehensive suite of features designed for dog enthusiasts and prospective owners alike. This all-in-one platform delivers five essential tools to enhance your canine experience:

1. 🔍 Breed Detection: Upload any dog photo and the AI accurately identifies breeds from an extensive database of 124+ different dog breeds. The system detects dogs in the image and provides confident breed identification results.

2. 📊 Breed Information: Access detailed profiles for each breed covering exercise requirements, typical lifespan, grooming needs, health considerations, and noise behavior - giving you a complete understanding of any breed's characteristics.

3. 📋 Breed Comparison: Compare any two breeds side-by-side with intuitive visualizations highlighting differences in care requirements, personality traits, health factors, and more - perfect for making informed decisions.

4. 💡 Breed Recommendation: Receive personalized breed suggestions based on your lifestyle preferences. The sophisticated matching system evaluates compatibility across multiple factors including living space, exercise capacity, experience level, and family situation.

5. 🎨 Style Transfer: Transform your dog photos into artistic masterpieces with five distinct styles: Japanese Anime, Classic Cartoon, Oil Painting, Watercolor, and Cyberpunk - adding a creative dimension to your pet photography.

👋 Explore PawMatchAI today:
DawnC/PawMatchAI

If you enjoy this project or find it valuable for your canine companions, I'd greatly appreciate your support with a Like ❤️ for this project.

#ArtificialIntelligence #MachineLearning #ComputerVision #PetTech #TechForLife
reacted to onekq's post with 🚀 1 day ago
The new Mistral medium model is very impressive for its size. Will it be open sourced given the history of Mistral? Does anyone have insights?

onekq-ai/WebApp1K-models-leaderboard
reacted to Nymbo's post with 👀 1 day ago
Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?
reacted to m-ric's post with 🔥 2 days ago
I've made an open version of Google's NotebookLM, and it shows the superiority of the open source tech stack! 💪

The app's workflow is simple. Given a source PDF or URL, it extracts the content from it, then tasks Meta's Llama 3.3-70B with writing the podcast script, with a good prompt crafted by @gabrielchua ("two hosts, with lively discussion, fun notes, insightful questions, etc.")
Then it hands off the text-to-speech conversion to Kokoro-82M, and there you go, you have two hosts discussing any article.

The generation is nearly instant, because:
> Llama 3.3 70B is running at 1,000 tokens/second with Cerebras inference
> The audio is generated in streaming mode by the tiny (yet powerful) Kokoro, which generates voices faster than real time.
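
To make the two reasons above concrete, here is a small latency estimate. Only the 1,000 tokens/second figure comes from the post. The script length, the audio duration and the real-time factor are hypothetical inputs chosen for illustration, and the function names are made up.

```python
def script_generation_seconds(script_tokens: int,
                              tokens_per_second: float = 1000.0) -> float:
    """Time to generate the podcast script at a given LLM throughput."""
    return script_tokens / tokens_per_second


def audio_synthesis_seconds(audio_seconds: float, realtime_factor: float) -> float:
    """Time to synthesize audio; realtime_factor > 1 means faster than real time."""
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_seconds / realtime_factor


# Hypothetical example: a 3,000-token script and 10 minutes of audio
# synthesized at twice real-time speed.
script_time = script_generation_seconds(3000)
audio_time = audio_synthesis_seconds(600.0, realtime_factor=2.0)
print(f"script: {script_time:.1f} s, full audio: {audio_time:.0f} s")
```

Because the audio is streamed and synthesized faster than it plays back, a listener would only wait for the script plus the first audio chunk rather than for the full synthesis time.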

And the audio generation runs for free on Zero GPUs, hosted by HF on H200s.

Overall, open source solutions rival the quality of closed-source solutions at close to no cost!

Try it here 👉👉 m-ric/open-notebooklm
reacted to Zherui's post with 🚀 2 days ago
Hi community,

We are excited to announce the AgiBot World Challenge at IROS 2025! This competition offers an opportunity to push the limits of humanoid robotics, focusing on real-world manipulation tasks and generative modeling.

The competition features two tracks:
Manipulation: Participants will train models for various tasks ranging from easy to challenging, including precise manipulations, long-term tasks, and multi-robot collaboration in diverse environments such as homes, dining areas, and retail spaces.
World Model: This track evaluates models’ ability to predict the evolution of visual perspectives based on action sequences, requiring participants to work with real-world data and simulate various robotic interactions.

For more on the challenge, visit our website at https://opendrivelab.com/challenge2025/. We look forward to your participation and advancing the future of robotics together!
reacted to ProCreations's post with 🔥 2 days ago
Post of the Day – Your Thoughts, Our Take

Yesterday we asked:
If AI could master just one thing, what should it be?
And the responses? Insightful, creative, and genuinely thought-provoking.

Here’s a few that stood out:

🍼 @NandaKrishvaa said “Curiosity like a baby.”
Instead of just answering questions, an AI that asks them with childlike wonder? That’s a whole new kind of intelligence.

@MrDevolver suggested “Master being Jack of All Trades.”
Sure, it bends the rules a bit, but adaptability is key. Sometimes breadth can outshine depth.

@afranco50 argued for “Perfect logic,” saying it could unlock all other abilities.
It’s a solid point: if an AI can reason flawlessly, it may just learn to improve everything else on its own.

⸻

Our take?
We still believe the biggest leap forward is flawless conversation: not just accurate, but deeply human. Emotional intelligence, nuance, humor, empathy. That kind of interaction is what makes AI feel real.

It’s also why we’re building IntellIte Chat to focus on that exact skillset:
• Emotion-aware replies
• Natural, flowing conversation
• Strong command of casual and expressive English

When it releases, it won’t just talk; it’ll connect. And in a world full of tools, we think the future needs more companions.
What do you think? Let us know! If we get more comments, might as well do another post on this tomorrow lol.
reacted to AdinaY's post with 🔥 2 days ago
reacted to onekq's post with 🤗 2 days ago
This time Gemini is very quick with API support on its 2.5 Pro May release. The performance is impressive too; it is now among top contenders like o4, R1, and Claude.

onekq-ai/WebApp1K-models-leaderboard
reacted to clem's post with 🔥 2 days ago
reacted to DawnC's post with 🔥 2 days ago
VisionScout: Now with Video Analysis! 🚀

I’m excited to announce a major update to VisionScout, my interactive vision tool that now supports VIDEO PROCESSING, in addition to powerful object detection and scene understanding!

⭐️ NEW: Video Analysis Is Here!
🎬 Upload any video file to detect and track objects using YOLOv8.
⏱️ Customize processing intervals to balance speed and thoroughness.
📊 Get comprehensive statistics and summaries showing object appearances across the entire video.
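
The speed-versus-thoroughness trade-off of a processing interval can be illustrated with a small, detector-free sketch. This is not VisionScout's actual implementation: the function names are made up, and the per-frame label lists stand in for whatever a detector such as YOLOv8 would return.

```python
from collections import Counter
from typing import Dict, Iterable, List


def sampled_frame_indices(total_frames: int, interval: int) -> List[int]:
    """Indices of the frames to analyze when processing every `interval`-th frame."""
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return list(range(0, total_frames, interval))


def summarize_appearances(labels_per_frame: Iterable[List[str]]) -> Dict[str, int]:
    """Count, for each object class, the number of sampled frames it appears in."""
    counts: Counter = Counter()
    for labels in labels_per_frame:
        counts.update(set(labels))  # count each class at most once per frame
    return dict(counts)


# A 10-frame clip analyzed every 4th frame touches frames 0, 4 and 8 only.
frames = sampled_frame_indices(10, 4)
summary = summarize_appearances([["person", "person", "car"], ["person"], []])
print(frames, summary)
```

A larger interval means fewer detector calls and faster processing, at the cost of possibly missing objects that appear only briefly between sampled frames.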

What else can VisionScout do?

🖼️ Analyze any image and detect 80 object types with YOLOv8.
🔄 Switch between Nano, Medium, and XLarge models for speed or accuracy.
🎯 Filter by object classes (people, vehicles, animals, etc.) to focus on what matters.
📊 View detailed stats on detections, confidence levels, and distributions.
🧠 Understand scenes by interpreting environments and potential activities.
⚠️ Automatically identify possible safety concerns based on detected objects.

What’s coming next?
🔎 Expanding YOLO’s object categories.
⚡ Faster real-time performance.
📱 Improved mobile responsiveness.

My goal:
To bridge the gap between raw detection and meaningful interpretation.
I’m constantly exploring ways to help machines not just "see" but truly understand context, and to make these advanced tools accessible to everyone, regardless of technical background.

Try it now! 🖼️👉 DawnC/VisionScout

If you enjoy VisionScout, a ❤️ Like for this project or feedback would mean a lot and keeps me motivated to keep building and improving!

#ComputerVision #ObjectDetection #VideoAnalysis #YOLO #SceneUnderstanding #MachineLearning #TechForLife