Adina Yakefu

AdinaY

AI & ML interests

None yet

Recent Activity

Organizations

Hugging Face, Hugging Face Chinese Localization, Huggingface Projects, Blog-explorers, ICCV2023, Open LLM Leaderboard, huggingPartyParis, Qwen, Women on Hugging Face, Journalists on Hugging Face, Social Post Explorers, Chinese LLMs on Hugging Face, Hugging Face for Legal, Inference Endpoints Images, LeRobot Worldwide Hackathon

AdinaY's activity

posted an update about 5 hours ago
RoboBrain 🧠 a 32B open embodied AI model enabling multi-robot collaboration, released by BAAIBeijing.

Model: BAAI/robobrain-681e1389c64d06b3e4a45e44
Dataset: BAAI/ShareRobot

✨ Task decomposition into 20+ precise actions
✨ Operable region detection (e.g., teapot handles, drawers)
✨ Motion trajectory prediction to avoid collisions
posted an update about 6 hours ago
posted an update 3 days ago
reacted to merve's post with 🔥 5 days ago
A ton of impactful models and datasets landed in open AI this past week, let's summarize the best 🤩 merve/releases-apr-21-and-may-2-6819dcc84da4190620f448a3

💬 Qwen made it rain! They released Qwen3: new dense and MoE models ranging from 0.6B to 235B 🤯 as well as Qwen2.5-Omni, an any-to-any model in 3B and 7B!
> Microsoft AI released Phi4 reasoning models (that also come in mini and plus sizes)
> NVIDIA released new CoT reasoning datasets
πŸ–ΌοΈ > ByteDance released UI-TARS-1.5, native multimodal UI parsing agentic model
> Meta released EdgeTAM, an on-device object tracking model (SAM2 variant)
πŸ—£οΈ NVIDIA released parakeet-tdt-0.6b-v2, a smol 600M automatic speech recognition model
> Nari released Dia, a 1.6B text-to-speech model
> Moonshot AI released Kimi Audio, a new audio understanding, generation, conversation model
πŸ‘©πŸ»β€πŸ’» JetBrains released Melium models in base and SFT for coding
> Tesslate released UIGEN-T2-7B, a new text-to-frontend-code model 🀩
reacted to merve's post with 👍🚀 6 days ago
A real-time object detector much faster and more accurate than YOLO, with an Apache 2.0 license, just landed in Hugging Face transformers 🔥

D-FINE is the SOTA real-time object detector that runs on a T4 (free Colab) 🤩

> Collection with all checkpoints and demo ustc-community/d-fine-68109b427cbe6ee36b4e7352

Notebooks:
> Tracking https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_tracking.ipynb
> Inference https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_inference.ipynb
> Fine-tuning https://github.com/qubvel/transformers-notebooks/blob/main/notebooks/DFine_finetune_on_a_custom_dataset.ipynb
h/t @vladislavbro @qubvel-hf @ariG23498 and the authors of the paper 🎩

Regular object detectors attempt to predict bounding boxes in (x, y, w, h) pixel-perfect coordinates, which is very rigid and hard to solve 🥲☹️

D-FINE formulates object detection as predicting a distribution over bounding box coordinates and refines it iteratively, which makes it more accurate 🤩

Another core idea behind this model is Global Optimal Localization Self-Distillation ⤵️

This model uses the final layer's distribution output (sort of like a teacher) to distill into earlier layers, making the early layers more performant.
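
To make the two ideas above concrete, here is a toy sketch in plain Python. It is not the D-FINE implementation: the five offset bins, the logit values, and the use of a plain KL divergence as the distillation loss are all assumptions chosen for illustration. It shows one box edge predicted as a distribution over candidate offsets, decoded as an expectation, and a final-layer "teacher" distribution pulling an early-layer "student" distribution toward it.

```python
import math

def softmax(logits):
    """Turn raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def expected_offset(logits, bin_values):
    """Decode an edge offset as the expectation over the bin distribution."""
    probs = softmax(logits)
    return sum(p * v for p, v in zip(probs, bin_values))

def kl_divergence(teacher_probs, student_probs, eps=1e-12):
    """KL(teacher || student): how far the student is from the teacher."""
    return sum(
        t * math.log((t + eps) / (s + eps))
        for t, s in zip(teacher_probs, student_probs)
    )

# Hypothetical setup: 5 candidate offsets (in pixels) for one box edge.
bin_values = [-2.0, -1.0, 0.0, 1.0, 2.0]
early_layer_logits = [0.0, 0.0, 0.0, 0.0, 0.0]    # uncertain early prediction
final_layer_logits = [-4.0, -2.0, 0.0, 3.0, 0.0]  # sharper final prediction

early_offset = expected_offset(early_layer_logits, bin_values)
final_offset = expected_offset(final_layer_logits, bin_values)

# Self-distillation signal: the final layer acts as the teacher.
distill_loss = kl_divergence(
    softmax(final_layer_logits), softmax(early_layer_logits)
)

print(f"early offset: {early_offset:.3f}")
print(f"final offset: {final_offset:.3f}")
print(f"distillation loss: {distill_loss:.3f}")
```

In this sketch the uniform early-layer distribution decodes to an offset near zero, while the sharper final-layer distribution shifts the edge toward the +1 bin; minimizing the loss would push the early layer in the same direction.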

posted an update 6 days ago
ACE-Step 🎵 a music generation foundation model released by StepFun & ACEStudio

Model: ACE-Step/ACE-Step-v1-3.5B
Demo: ACE-Step/ACE-Step

✨ 3.5B, Apache 2.0 licensed
✨ 115× faster than LLMs (4-min music in 20s on A100)
✨ Diffusion + DCAE + linear transformer = speed + coherence
✨ Supports voice cloning, remixing, lyric editing & more
posted an update 6 days ago
CCI4.0-M2 📊 A powerful dataset with 3 specialized subsets, released by BAAIBeijing

BAAI/cci40-68199d90bbc798680df16d7c

✨ M2-Base: 3.5TB web data (EN/ZH), with LLM-augmented content, Apache 2.0
✨ M2-CoT: 4.2TB of auto-synthesized CoT reasoning data
✨ M2-Extra: domain-specific knowledge
