Societies of Agents: From Individual Intelligence to Intelligent Societies

My research centers on a fundamental transition now underway: intelligence is no longer confined to isolated models or tools, but is increasingly embodied in agents that act, learn, and influence one another within shared environments.

I frame this direction as Societies of Agents: systems in which multiple learning agents form structured, adaptive, and value-driven societies rather than merely coexisting as independent components.

A three-layer view of Societies of Agents: infrastructure → agent foundation learning → societal-scale deployment (image generated by Nano Banana).

This research agenda unfolds across three interconnected layers:

Infrastructure for Agentic Learning

I develop training and execution frameworks that enable long-horizon learning, coordination, and adaptation among agents. This layer focuses on how agents are systematically organized to support sustained interaction and collective behavior.

Domain-Specific Agent Foundation Learning

Building on this infrastructure, I study how datasets, post-training, reward shaping, and workflow design give rise to stable, interpretable, and transferable agent behaviors. The central question here is not model capability, but how agents acquire roles, responsibilities, and norms within a society.

Societal-Scale Multi-Agent Systems in the Real World

At the highest level, I explore how societies of agents operate in industrial and real-world settings, where cooperation and competition coexist, and where governance, credit assignment, and value flow become central design challenges.

Across reinforcement learning, multi-agent systems, trust modeling, and governance, my work is guided by a single conviction: the future of intelligence lies not in isolated agents, but in learning societies of agents.

Current Research Projects

My current projects span reinforcement learning, multi-agent systems, post-training, and real-world deployments. Below is a structured snapshot aligned with the Societies of Agents agenda.

A. Agentic Infrastructure and Orchestration

Symphony: Decentralized multi-agent framework for large-scale coordination.
Workflow Optimization: Dynamic routing, planning, and scheduling for complex tasks.
Memory Evolve: Scalable memory systems enabling long-term learning.
LLM Hardware Acceleration: Inference optimization (e.g., speculative decoding).

B. Agent Foundation Learning (Data, Post-Training, Reward)

Post-Training Optimization: Alignment and capability improvement.
Reward Benchmark: Evaluating reward functions in multi-agent environments.
LLM Reasoning: Enhancing reasoning via agentic and collaborative design.
LLM Red-Teaming: Adversarial evaluation and robustness.

C. Societal-Scale Agents in Real-World Domains

Financial Intelligence Agents: Fraud detection and anomaly mining.
Stock Market Mining: Market analysis via RL and agent-based models.
Intelligent Customer Service: Large-scale dialogue systems.
Intelligent Transportation Modeling: Multi-agent RL for traffic systems.
Multimodal Geo Foundation Models: Vision-text-structured geospatial learning.

D. Long-Horizon Interactive Environments (Testbeds)

Reinforcement Learning for Poker: Long-horizon strategy optimization.
Pokemon Agents: Planning, exploration, and self-evolving behaviors.
Computer Use Agent: Web-environment agents with world models.

E. Multimodal and Human-Centric Systems

Digital Human Generation: Skeleton-based motion generation.
Virtual Try-On: Realistic and personalized try-on systems.
AI for Education: Interactive tutoring systems.
Medical LLM: Language models for healthcare.
Computer Vision for Surveillance: Safety-oriented visual understanding.

For collaboration opportunities or more information about my research, please contact me.