Kseniase 
posted an update 7 days ago
9 Recent advances in Multi-Agent Systems (all open-source)

The idea of splitting tasks across multiple agents instead of relying on one universal agent is now seen as one of the most effective ways to build an AI stack. Concepts like “agent swarms” were highlighted at the AI Engineer Code Summit in NYC (Nov 20–21) as the winning architecture. This trend is not limited to coding and software: it applies across all AI domains.

So here is some recent research that makes multi-agent systems (MAS) better and keeps them up to date:

1. LatentMAS → Latent Collaboration in Multi-Agent Systems (2511.20639)
AI agents share their hidden "thoughts" directly in latent space instead of talking through text. This makes collaboration and reasoning faster and more accurate, with no extra training needed

2. Puppeteer → Multi-Agent Collaboration via Evolving Orchestration (2505.19591)
Uses a “puppeteer” LLM that dynamically decides which agents (“puppets”) to call and in what order. By learning this orchestration with reinforcement learning (RL), the system solves complex tasks more efficiently and at lower compute cost

3. MADD → MADD: Multi-Agent Drug Discovery Orchestra (2511.08217)
A MAS with 4 agents for drug discovery. It lets researchers describe a drug discovery task in plain language. Then MADD automatically builds and runs the full hit-identification pipeline, making AI-driven drug design a simple end-to-end workflow

4. Multi-Agent Tool-Integrated Policy Optimization (MATPO) → Multi-Agent Tool-Integrated Policy Optimization (2510.04678)
Lets one LLM act as multiple agents (like a planner and a worker) by using different prompts and training them together with RL. So you get the benefits of a multi-agent system without needing multiple models
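
To make the "one model, several roles" idea concrete, here is a minimal sketch. It only shows the role switching via prompts, not the RL training. The prompts, `fake_llm`, and `run_task` are hypothetical names made up for illustration and are not taken from the MATPO paper:

```python
from typing import Callable, Dict, List

# Hypothetical role prompts (assumption: the paper's real prompts are not shown in this post).
ROLE_PROMPTS: Dict[str, str] = {
    "planner": "You are a planner. Break the task into subtasks, one per line.",
    "worker": "You are a worker. Complete the single subtask you are given.",
}


def fake_llm(system_prompt: str, user_message: str) -> str:
    """Stand-in for ONE shared model; swap in a real LLM call here."""
    if system_prompt == ROLE_PROMPTS["planner"]:
        # Toy "planning": split the task on the word "and".
        return "\n".join(part.strip() for part in user_message.split(" and "))
    # Toy "work": just acknowledge the subtask.
    return f"done: {user_message}"


def run_task(task: str, llm: Callable[[str, str], str] = fake_llm) -> List[str]:
    # Same model, planner role: produce a list of subtasks.
    plan = llm(ROLE_PROMPTS["planner"], task)
    subtasks = [line for line in plan.splitlines() if line.strip()]
    # Same model, worker role: handle each subtask in its own context.
    return [llm(ROLE_PROMPTS["worker"], sub) for sub in subtasks]


print(run_task("find the paper and summarize it"))
```

The only thing that changes between "agents" here is the system prompt, which is the part of the setup described above. Training both roles together with RL is outside this sketch.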

If you're interested in multi-agent trends for the future of software development, explore my article with the emergent playbook. This is super interesting → https://www.turingpost.com/p/aisoftwarestack
Also, subscribe to the Turing Post: https://www.turingpost.com/subscribe

Read further below ⬇️
  1. QuantAgent → https://huggingface.co/papers/2509.09995
    A multi-agent LLM system for high-frequency trading in real time. It splits the job among 4 agents (Indicator, Pattern, Trend, and Risk) to make quick, precise decisions based on short-term market signals

  2. MAC-Flow → https://huggingface.co/papers/2511.05005
    Learns complex multi-agent coordination with a flow model and distills it into fast one-step policies, providing diffusion-level coordination with Gaussian-level real-time speed

  3. MrlX → https://github.com/AQ-MedAI/MrlX
    A multi-agent RL framework where 2 agents talk through a multi-turn dialogue (Agent A initiates it, Agent B responds), learn from each other, and update their models in a continuous “generate → train → sync” loop. The agents co-evolve and get better at collaborative decision-making over time

  4. M-GRPO for Multi-Agent Deep Research → https://huggingface.co/papers/2511.13288
    This training method lets different agents in a MAS use their own specialized LLMs while still learning together. It gives each agent its own local reward signal and aligns their uneven trajectories, so they stay coordinated even when running at different speeds or on different servers

  5. MarsRL → https://huggingface.co/papers/2511.11373
    Trains the Solver, Verifier, and Corrector agents together, with separate rewards for each and a pipeline-style RL setup. This makes them better at catching mistakes and refining answers, and they reach much higher accuracy on math benchmarks
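
For intuition, the Solver → Verifier → Corrector flow can be sketched as a simple refinement loop. This is a toy illustration only: the three functions are hypothetical stand-ins for LLM agents, the "problem" is plain addition, and nothing here reflects MarsRL's actual training or reward setup:

```python
from typing import Tuple

Problem = Tuple[int, int]


def solver(problem: Problem) -> int:
    """Toy solver with a deliberate off-by-one bug."""
    a, b = problem
    return a + b + 1


def verifier(problem: Problem, answer: int) -> bool:
    """Toy verifier: checks the proposed answer independently."""
    a, b = problem
    return answer == a + b


def corrector(problem: Problem, answer: int) -> int:
    """Toy corrector: nudges a rejected answer down by one."""
    return answer - 1


def solve_with_refinement(problem: Problem, max_rounds: int = 3) -> Tuple[int, int]:
    """Return (final_answer, number_of_correction_rounds_used)."""
    answer = solver(problem)
    rounds = 0
    # Keep correcting until the verifier accepts or the round budget runs out.
    while not verifier(problem, answer) and rounds < max_rounds:
        answer = corrector(problem, answer)
        rounds += 1
    return answer, rounds


print(solve_with_refinement((2, 3)))  # (5, 1)
```

In a trained system each role would presumably be an LLM call with its own reward, as the summary above describes. The loop structure is the only part this sketch tries to show.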

Out of all of these, LatentMAS is super crazy and interesting
