John Smith PRO
John6666
AI & ML interests
None yet
Recent Activity
reacted to ProCreations's post (1 minute ago)
What do you think of Intellite's new icons/logo? Let us know!
Also, Intellite Chat technically does work! But we decided to scale it up a bit (same parameter count at 100M, but going from 4B to 200B training tokens, a big upgrade!) for max quality.
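The jump from 4B to 200B tokens can be put in rough numbers with the common 6·N·D estimate of transformer training FLOPs (N = parameters, D = tokens). The figures below come from the post; the 6·N·D rule itself is a standard back-of-envelope approximation from the scaling-law literature, not something stated there.

```python
# Rough training-compute estimate using the common 6*N*D FLOPs approximation
# (N = parameter count, D = training tokens). Model/token numbers are from
# the post; 6*N*D is a back-of-envelope rule, not an exact figure.

def train_flops(n_params: float, n_tokens: float) -> float:
    """Approximate total training FLOPs for a dense transformer."""
    return 6.0 * n_params * n_tokens

N = 100e6                     # 100M parameters (unchanged)
old = train_flops(N, 4e9)     # originally trained on 4B tokens
new = train_flops(N, 200e9)   # scaled up to 200B tokens

print(f"old: {old:.2e} FLOPs, new: {new:.2e} FLOPs, ratio: {new / old:.0f}x")
# The token budget (and thus training compute) grows 50x at fixed model size.
```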
Organizations
John6666's activity

reacted to ProCreations's post (1 minute ago)

reacted to blaise-tk's post (1 minute ago)
Post
340
Today we launch Dione.
A few months ago it was just a wild idea I shared with @bygimenez; now it's real.
Dione (Beta) is here, the easiest way to discover and install open-source apps, especially AI ones.
Think of it as the Steam of open source. Installing open-source tools is often a mess. Dione fixes that.
Beautiful UI and workflow. Soon multi-platform, multilingual & fully open-source.
Users can even write and share their own installation scripts. This is just the beginning.
Join our exclusive Beta:
https://getdione.app/beta/join

reacted to Jaward's post (1 minute ago)
Post
62
finally, a course that makes diffusion math much easier to grasp, well done! https://diffusion.csail.mit.edu/

reacted to kulia-moon's post (about 20 hours ago)

reacted to prithivMLmods's post (about 20 hours ago)
Post
1093
Dropping some image classification models for content moderation, balancers, and classifiers trained on synthetic datasets, along with others based on datasets available on the Hub. Also loaded a few low-rank datasets for realistic gender portrait classification and document-type classifiers, all fine-tuned on the SigLIP-2 Patch-16 224 backbone. Models and datasets are listed below:
Models & Datasets:
Realistic Gender Classification: prithivMLmods/Realistic-Gender-Classification
↳ prithivMLmods/Realistic-Portrait-Gender-1024px
Document Type Detection: prithivMLmods/Document-Type-Detection
↳ prithivMLmods/Document-Type-Detection
Face Mask Detection: prithivMLmods/Face-Mask-Detection
↳ DamarJati/Face-Mask-Detection
Alzheimer Stage Classifier: prithivMLmods/Alzheimer-Stage-Classifier
↳ SilpaCS/Augmented_alzheimer
Bone Fracture Detection: prithivMLmods/Bone-Fracture-Detection
↳ Hemg/bone-fracture-detection
GiD Land Cover Classification: prithivMLmods/GiD-Land-Cover-Classification
↳ jonathan-roberts1/GID
Collection: prithivMLmods/siglip2-05102025-681c2b0e406f0740a993fc1c
To know more about it, visit the model card of the respective model.
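As a hedged usage sketch (not from the post): the checkpoints above are standard image-classification fine-tunes, so they should load with the transformers `pipeline` API. The model id below is one from the list, `"photo.jpg"` is a placeholder, and the lazy import keeps the small helper testable without downloading anything; check each model card for the exact labels.

```python
# Sketch: running one of the listed SigLIP-2 fine-tunes with the transformers
# image-classification pipeline. Model id is from the post; "photo.jpg" is a
# placeholder path. See the model card for the real label set.

def classify_image(image_path: str,
                   model_id: str = "prithivMLmods/Face-Mask-Detection"):
    # Imported lazily so the pure helper below works without transformers.
    from transformers import pipeline
    clf = pipeline("image-classification", model=model_id)
    return clf(image_path)  # list of {"label": ..., "score": ...} dicts

def top_label(results) -> str:
    """Pick the highest-scoring label from pipeline-style output."""
    return max(results, key=lambda r: r["score"])["label"]

# Example with mocked pipeline output:
fake = [{"label": "with_mask", "score": 0.97},
        {"label": "without_mask", "score": 0.03}]
print(top_label(fake))  # with_mask
```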

reacted to lorraine2's post (1 day ago)
Post
384
New NVIDIA paper: Audio-SDS
We adapt Score Distillation Sampling (SDS), originally developed for text-to-3D generation, to audio diffusion models, allowing us to reuse large pretrained models for new text-guided parametric audio tasks such as source separation, physically informed impact synthesis, and more.
Project Page: https://research.nvidia.com/labs/toronto-ai/Audio-SDS/
Full Paper: https://arxiv.org/abs/2505.04621
Check out more from NVIDIA's Spatial Intelligence Lab here: https://research.nvidia.com/labs/toronto-ai/
This project was led by the great work of Jessie Richter-Powell, along with Antonio Torralba.
Notably, we find a new and exciting use case for Stable Audio Open!
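For reference, this is the SDS gradient in its usual DreamFusion-style form, which the paper adapts from images to audio; here x = g(θ) would be audio rendered by a parametric synthesizer rather than a rendered image. This is my paraphrase of the standard SDS formulation, not an equation quoted from the paper.

```latex
% Standard SDS gradient (DreamFusion notation): \hat{\epsilon}_\phi is the
% pretrained diffusion model's noise prediction, y the text condition,
% w(t) a timestep weighting, and x = g(\theta) the rendered signal.
\nabla_\theta \mathcal{L}_{\mathrm{SDS}}(\theta)
  = \mathbb{E}_{t,\epsilon}\!\left[
      w(t)\,\bigl(\hat{\epsilon}_\phi(\mathbf{x}_t;\, y, t) - \epsilon\bigr)\,
      \frac{\partial \mathbf{x}}{\partial \theta}
    \right],
  \qquad \mathbf{x} = g(\theta).
```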

reacted to daavoo's post (1 day ago)
Post
1025
Have you heard about the Agent2Agent Protocol (A2A)?
We have just released an option in https://github.com/mozilla-ai/any-agent to serve with A2A any of the supported agent frameworks (Agno, Google ADK, Langchain, LlamaIndex, OpenAI Agents SDK, smolagents and tinyagent)!
Check the docs https://mozilla-ai.github.io/any-agent/serving/
# google_expert.py
from any_agent import AgentConfig, AnyAgent
from any_agent.config import ServingConfig
from any_agent.tools import search_web

# Build an agent on the Google ADK framework with a web-search tool.
agent = AnyAgent.create(
    "google",
    AgentConfig(
        name="google_expert",
        model_id="gpt-4.1-nano",
        instructions="You must use the available tools to find an answer",
        description="An agent that can answer questions about the Google Agents Development Kit (ADK).",
        tools=[search_web],
    ),
)

# Expose the agent over the A2A protocol on port 5001.
agent.serve(ServingConfig(port=5001))
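For the curious, here is a rough sketch of what a raw request to the served agent could look like. The JSON-RPC envelope and the "tasks/send" method follow my reading of the early A2A spec; treat the exact field names as assumptions and prefer the any-agent docs or a proper A2A client library in practice.

```python
# Sketch of a raw A2A request to the served agent. Method name and payload
# shape are assumptions based on the early A2A spec (JSON-RPC 2.0,
# "tasks/send" with a parts-based message); verify against the any-agent docs.
import json
import uuid

def build_task_request(text: str) -> dict:
    """Build a JSON-RPC 2.0 'tasks/send' payload for a user text message."""
    return {
        "jsonrpc": "2.0",
        "id": str(uuid.uuid4()),
        "method": "tasks/send",
        "params": {
            "id": str(uuid.uuid4()),  # task id
            "message": {
                "role": "user",
                "parts": [{"type": "text", "text": text}],
            },
        },
    }

payload = build_task_request("What is the Google ADK?")
print(json.dumps(payload)[:60], "...")
# POST this JSON to http://localhost:5001/ where agent.serve() is listening.
```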

reacted to YerbaPage's post (1 day ago)
Post
1537
Curated list of **Next-Gen Code Generation** papers & benchmarks!
Stay ahead with the latest in:
- Repo-level Issue Resolution (SWE-bench, Agents)
- Repo-level Code Completion (Repo understanding)
- Datasets & Benchmarks
Check it out: https://github.com/YerbaPage/Awesome-Repo-Level-Code-Generation

reacted to nyuuzyou's post (1 day ago)
Post
313
HailuoAI Video Metadata Dataset: nyuuzyou/hailuoai
Collection of 544,646 AI-generated video metadata entries from HailuoAI featuring:
- Comprehensive metadata: direct video URLs, dimensions, creation parameters, model IDs, tags, and more.
- All metadata explicitly released into the public domain under the CC0 license.
- Organized in a single train split with 544,646 entries.
This is likely the most extensive public dataset of AI-generated videos to date.
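For anyone who wants to poke at the metadata, a hedged loading sketch with the datasets library: the repo id and split name come from the post, while streaming mode and the small helper are my additions (inspect `dataset.features` for the real schema).

```python
# Sketch: loading the HailuoAI metadata with the `datasets` library.
# Repo id and the single "train" split are from the post; streaming mode
# avoids downloading all 544,646 entries at once.

def load_hailuoai(streaming: bool = True):
    # Imported lazily so the helper below stays testable without `datasets`.
    from datasets import load_dataset
    return load_dataset("nyuuzyou/hailuoai", split="train", streaming=streaming)

def take(iterable, n: int) -> list:
    """Grab the first n records from a (possibly streaming) dataset."""
    out = []
    for i, row in enumerate(iterable):
        if i >= n:
            break
        out.append(row)
    return out

# Pure-Python demo of the helper (no download needed):
print(take(range(10), 3))  # [0, 1, 2]
```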

reacted to MonsterMMORPG's post (1 day ago)
Post
1466
TRELLIS is still the leading open-source AI model for generating high-quality 3D assets from static images. Some mind-blowing examples. Supports improved multi-angle image-to-3D as well. Works on GPUs with as little as 6 GB of VRAM.
Tutorial link: https://www.youtube.com/watch?v=EhU7Jil9WAk
App link: https://www.patreon.com/posts/Trellis-App-Installer-Zip-File-117470976
Our app is highly advanced, with many features, and supports GPUs with as little as 6 GB of VRAM.
It also fully supports RTX 5000-series GPUs.
TRELLIS is currently the state-of-the-art, locally runnable, open-source generator of very high-quality 3D assets from images. I have developed a 1-click installer and an advanced Gradio app for this model with many features. In this tutorial video I show, step by step, how to use this AI tool and generate the best, highest-quality 3D assets locally. You can also use it on RunPod and Massed Compute if you are GPU-poor.
Follow the link below to download the zip file that contains the Trellis installer and Gradio app (the one used in the tutorial):
https://www.patreon.com/posts/Trellis-App-Installer-Zip-File-117470976
Python, Git, CUDA, C++ Tools, FFmpeg, cuDNN, MSVC installation tutorial (needed for AI apps, one-time setup):
https://youtu.be/DrhUHnYfwC0
SECourses Official Discord (10,500+ members):
https://discord.com/servers/software-engineering-courses-secourses-772774097734074388
Stable Diffusion, FLUX, Generative AI Tutorials and Resources GitHub:
https://github.com/FurkanGozukara/Stable-Diffusion
SECourses Official Reddit, stay subscribed to learn all the news and more:
https://www.reddit.com/r/SECourses/
Official TRELLIS Repo:
https://github.com/microsoft/TRELLIS

reacted to DawnC's post (1 day ago)
Post
2312
PawMatchAI: The Complete Dog Breed Platform
PawMatchAI offers a comprehensive suite of features designed for dog enthusiasts and prospective owners alike. This all-in-one platform delivers five essential tools to enhance your canine experience:
1. Breed Detection: Upload any dog photo and the AI accurately identifies breeds from an extensive database of 124+ dog breeds. The system detects dogs in the image and provides confident breed-identification results.
2. Breed Information: Access detailed profiles for each breed covering exercise requirements, typical lifespan, grooming needs, health considerations, and noise behavior, giving you a complete understanding of any breed's characteristics.
3. Breed Comparison: Compare any two breeds side by side with intuitive visualizations highlighting differences in care requirements, personality traits, health factors, and more, perfect for making informed decisions.
4. Breed Recommendation: Receive personalized breed suggestions based on your lifestyle preferences. The sophisticated matching system evaluates compatibility across multiple factors, including living space, exercise capacity, experience level, and family situation.
5. Style Transfer: Transform your dog photos into artistic masterpieces with five distinct styles: Japanese Anime, Classic Cartoon, Oil Painting, Watercolor, and Cyberpunk, adding a creative dimension to your pet photography.
Explore PawMatchAI today:
DawnC/PawMatchAI
If you enjoy this project or find it valuable for your canine companions, I'd greatly appreciate your support with a Like for this project.
#ArtificialIntelligence #MachineLearning #ComputerVision #PetTech #TechForLife
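As an illustration only of what "evaluates compatibility across multiple factors" could mean mechanically, here is a toy weighted-similarity scorer. The factor names, weights, and numbers are invented for this sketch; they are not PawMatchAI's actual algorithm.

```python
# Toy multi-factor compatibility scorer (illustrative only; NOT PawMatchAI's
# real matching system). Each factor is a 0-1 rating for both the user's
# lifestyle and the breed's needs; closer ratings score higher.

WEIGHTS = {"living_space": 0.3, "exercise": 0.3, "experience": 0.2, "family": 0.2}

def compatibility(user: dict, breed: dict) -> float:
    """Weighted similarity across factors; returns a score in [0, 1]."""
    score = 0.0
    for factor, weight in WEIGHTS.items():
        score += weight * (1.0 - abs(user[factor] - breed[factor]))
    return round(score, 3)

user = {"living_space": 0.4, "exercise": 0.8, "experience": 0.5, "family": 1.0}
border_collie = {"living_space": 0.9, "exercise": 1.0, "experience": 0.8, "family": 0.9}
print(compatibility(user, border_collie))  # 0.71
```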

reacted to onekq's post (1 day ago)
Post
1428
The new Mistral medium model is very impressive for its size. Will it be open sourced given the history of Mistral? Does anyone have insights?
onekq-ai/WebApp1K-models-leaderboard

reacted to Nymbo's post (1 day ago)
Post
749
Haven't seen this posted anywhere - Llama-3.3-8B-Instruct is available on the new Llama API. Is this a new model or did someone mislabel Llama-3.1-8B?

reacted to m-ric's post (2 days ago)
Post
2743
I've made an open version of Google's NotebookLM, and it shows the superiority of the open-source tech stack!
The app's workflow is simple. Given a source PDF or URL, it extracts the content, then tasks Meta's Llama 3.3-70B with writing the podcast script, with a good prompt crafted by @gabrielchua ("two hosts, with lively discussion, fun notes, insightful questions, etc.").
Then it hands off the text-to-speech conversion to Kokoro-82M, and there you go: you have two hosts discussing any article.
The generation is nearly instant, because:
> Llama 3.3 70B runs at 1,000 tokens/second with Cerebras inference
> The audio is generated in streaming mode by the tiny (yet powerful) Kokoro, generating voices faster than real time.
And the audio generation runs for free on Zero GPUs, hosted by HF on H200s.
Overall, open-source solutions rival the quality of closed-source solutions at close to no cost!
Try it here: m-ric/open-notebooklm
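The script-writing step of that workflow can be sketched as a simple prompt builder. The wording below is illustrative, loosely based on the post's summary of @gabrielchua's prompt; `build_podcast_prompt` is a hypothetical helper, and the actual Space may phrase things differently.

```python
# Sketch of the script-generation step: turn extracted article text into a
# two-host podcast-script prompt for the LLM (Llama 3.3-70B in the app).
# Prompt wording is illustrative, paraphrased from the post.

def build_podcast_prompt(source_text: str) -> str:
    """Wrap extracted content in a two-host podcast-script instruction."""
    return (
        "Write a podcast script with two hosts having a lively discussion "
        "about the following content. Include fun notes and insightful "
        "questions.\n\n" + source_text
    )

prompt = build_podcast_prompt("Transformers are sequence models...")
print(prompt.splitlines()[0])
# The generated script then goes to a TTS model (Kokoro-82M in the app).
```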

reacted to Zherui's post (2 days ago)
Post
395
Hi community,
We are excited to announce the AgiBot World Challenge at IROS 2025! This competition offers an opportunity to push the limits of humanoid robotics, focusing on real-world manipulation tasks and generative modeling.
The competition features two tracks:
Manipulation: Participants will train models for various tasks ranging from easy to challenging, including precise manipulations, long-term tasks, and multi-robot collaboration in diverse environments such as homes, dining areas, and retail spaces.
World Model: This track evaluates models' ability to predict the evolution of visual perspectives based on action sequences, requiring participants to work with real-world data and simulate various robotic interactions.
For more on the challenge, visit our website at https://opendrivelab.com/challenge2025/. We look forward to your participation and advancing the future of robotics together!

reacted to ProCreations's post (2 days ago)
Post
2304
Post of the Day: Your Thoughts, Our Take
Yesterday we asked:
If AI could master just one thing, what should it be?
And the responses? Insightful, creative, and genuinely thought-provoking.
Here are a few that stood out:
@NandaKrishvaa said "Curiosity like a baby."
Instead of just answering questions, an AI that asks them with childlike wonder? That's a whole new kind of intelligence.
@MrDevolver suggested "Master being Jack of All Trades."
Sure, it bends the rules a bit, but adaptability is key. Sometimes breadth can outshine depth.
@afranco50 argued for "Perfect logic," saying it could unlock all other abilities.
It's a solid point: if an AI can reason flawlessly, it may just learn to improve everything else on its own.
Our take?
We still believe the biggest leap forward is flawless conversation: not just accurate, but deeply human. Emotional intelligence, nuance, humor, empathy. That kind of interaction is what makes AI feel real.
It's also why we're building IntellIte Chat to focus on that exact skill set:
- Emotion-aware replies
- Natural, flowing conversation
- Strong command of casual and expressive English
When it releases, it won't just talk; it'll connect. And in a world full of tools, we think the future needs more companions.
What do you think? Let us know! If we get more comments, might as well do another post on this tomorrow lol.

reacted to AdinaY's post (2 days ago)
Post
1812
HunyuanCustom: a multimodal video generation framework supporting image, audio, video & text conditions, released by TencentHunyuan
tencent/HunyuanCustom
HunyuanCustom: A Multimodal-Driven Architecture for Customized Video Generation (2505.04512)
- Strong identity consistency
- Outperforms SOTA methods

reacted to onekq's post (2 days ago)
Post
3068
This time Gemini is very quick with API support on its 2.5 pro May release. The performance is impressive too, now it is among top contenders like o4, R1, and Claude.
onekq-ai/WebApp1K-models-leaderboard

reacted to DawnC's post (2 days ago)
Post
5203
VisionScout: Now with Video Analysis!
I'm excited to announce a major update to VisionScout, my interactive vision tool that now supports VIDEO PROCESSING, in addition to powerful object detection and scene understanding!
NEW: Video Analysis Is Here!
- Upload any video file to detect and track objects using YOLOv8.
- Customize processing intervals to balance speed and thoroughness.
- Get comprehensive statistics and summaries showing object appearances across the entire video.
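The "processing intervals" idea reduces to simple frame-index arithmetic: analyze one frame out of every N. A minimal sketch of that trade-off in pure Python (the app itself wires the chosen frames to YOLOv8; this helper is my illustration, not VisionScout's code):

```python
# Sketch of interval-based video sampling: choose which frames to run
# detection on, trading speed (large interval) for thoroughness (small one).

def frames_to_process(total_frames: int, interval: int) -> list[int]:
    """Indices of frames to analyze, one every `interval` frames."""
    if interval < 1:
        raise ValueError("interval must be >= 1")
    return list(range(0, total_frames, interval))

# A 300-frame clip (~10 s at 30 fps), analyzing every 30th frame:
print(frames_to_process(300, 30))      # [0, 30, 60, ..., 270] -> 10 detections
print(len(frames_to_process(300, 1)))  # 300 -> every frame, slow but thorough
```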
What else can VisionScout do?
- Analyze any image and detect 80 object types with YOLOv8.
- Switch between Nano, Medium, and XLarge models for speed or accuracy.
- Filter by object classes (people, vehicles, animals, etc.) to focus on what matters.
- View detailed stats on detections, confidence levels, and distributions.
- Understand scenes, interpreting environments and potential activities.
- Automatically identify possible safety concerns based on detected objects.
What's coming next?
- Expanding YOLO's object categories.
- Faster real-time performance.
- Improved mobile responsiveness.
My goal:
To bridge the gap between raw detection and meaningful interpretation.
I'm constantly exploring ways to help machines not just "see" but truly understand context, and to make these advanced tools accessible to everyone, regardless of technical background.
Try it now: DawnC/VisionScout
If you enjoy VisionScout, a Like for this project or feedback would mean a lot and keeps me motivated to keep building and improving!
#ComputerVision #ObjectDetection #VideoAnalysis #YOLO #SceneUnderstanding #MachineLearning #TechForLife