🤖 A collection of AI agent resources, including research papers, blogs, and products focused on developing autonomous systems.

algomatic-inc/awesome-ai-agents-guide

Awesome AI Agents Guide

Algomatic PR Welcome License: MIT

The awesome-ai-agents-guide repository is an initial effort to compile a comprehensive list of AI/LLM agents, covering both research and products.

Please note that this repository is a volunteer project and does not list every existing AI agent. It is a work in progress, and items are added gradually. Contributions are welcome, and we look forward to your PRs!

Disclaimer
・ If you find any errors in interpretation or quotation, please let us know.
・ Please check the license and terms of use of each resource before using it.

🌟 If you found this helpful, we'd love it if you followed us: @AlgomaticJP

Overview


AI Agent

We are not aware of a single exact definition, but one commonly cited definition of "autonomous agent" is the following:

An autonomous agent is a system situated within and a part of an environment that senses that environment and acts on it, over time, in pursuit of its own agenda and so as to effect what it senses in the future.
--- Franklin and Graesser (1997)

Additionally, in Practices for Governing Agentic AI Systems published by Shavit et al. (OpenAI) in December 2023, the degree of agenticness in a system is defined as follows:

We define the degree of agenticness in a system as “the degree to which a system can adaptably achieve complex goals in complex environments with limited direct supervision.” Agenticness as defined here thus breaks down into several components:

  • Goal complexity: How challenging would the AI system’s goal be for a human to achieve and how wide of a range of goals could the system achieve? Properties of the goal may include target levels of reliability, speed, and safety.
  • Environmental complexity: How complex are the environments under which a system can achieve the goal? (E.g., to what extent are they cross-domain, multi-stakeholder, require operating over long time-horizons, and/or involve the use of multiple external tools.)
  • Adaptability: How well can the system adapt and react to novel or unexpected circumstances?
  • Independent execution: To what extent can the system reliably achieve its goals with limited human intervention or supervision?
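
As a rough illustration of the Franklin and Graesser definition above, the sense-act loop can be sketched in a few lines of Python. This is a minimal sketch, not code from either cited source: the `ToyRoom` environment, the `ThermostatAgent`, and the `run` helper are all hypothetical names introduced here, and the thermostat "agenda" is chosen only because it is easy to trace by hand.

```python
from dataclasses import dataclass


@dataclass
class ToyRoom:
    """Hypothetical toy environment: a room with a single temperature value."""
    temperature: float = 15.0

    def sense(self) -> float:
        # What the agent can observe about the environment.
        return self.temperature

    def apply(self, action: str) -> None:
        # Heating warms the room by 1.0; any other action lets it cool by 0.5.
        self.temperature += 1.0 if action == "heat" else -0.5


@dataclass
class ThermostatAgent:
    """Hypothetical agent whose agenda is to keep the room near a target."""
    target: float = 20.0

    def act(self, observation: float) -> str:
        # Choose an action in pursuit of the agent's own agenda.
        return "heat" if observation < self.target else "idle"


def run(env: ToyRoom, agent: ThermostatAgent, steps: int) -> list[float]:
    """Sense-act loop: each action changes what the agent senses next."""
    history = []
    for _ in range(steps):
        observation = env.sense()
        env.apply(agent.act(observation))
        history.append(env.sense())
    return history


if __name__ == "__main__":
    print(run(ToyRoom(), ThermostatAgent(), steps=10))
```

The loop runs without any human input between steps, which loosely corresponds to the "independent execution" component above. The goal and the environment are deliberately trivial, so this sketch would sit at the very low end of agenticness as Shavit et al. define it.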

Survey

  • 2024.02 - Huang et al., Position Paper: Agent AI Towards a Holistic Intelligence [arXiv]
  • 2023.09 - Xi et al., The Rise and Potential of Large Language Model Based Agents: A Survey [arXiv][GitHub]
  • 2023.09 - Zhao et al., An In-depth Survey of Large Language Model-based Artificial Intelligence Agents [arXiv]
  • 2023.08 - Wang et al., A Survey on Large Language Model Based Autonomous Agents [arXiv][GitHub]
  • 2023.06 - Taniguchi et al., World models and predictive coding for cognitive and developmental robotics: frontiers and challenges (Advanced Robotics) [tandfonline]

Workshop or Tutorial

  • 2024.07 - ICML 2024 Tutorial Understanding the Role of Large Language Models in Planning [Home]
  • 2024.06 - CVPR 2024 Tutorial on Generalist Agent AI [Home]
  • 2024.05 - ICLR 2024 Workshop on LLM Agents [Home]
  • 2023.10 - CIKM 2023 Personalized Generative AI [Home]
  • 2023.08 - IJCAI 2023 Symposium on Large Language Models (LLM 2023) [Home]

Misc

  • 2025.XX - Micheal Lanham, GPT Agents in Action [manning]
  • 2024.XX - LangChain, Go autonomous with LangChain Agents [langchain]
  • 2024.06 - DeepLearning.AI, AI Agents in LangGraph [deeplearning.ai]
  • 2024.05 - DeepLearning.AI, AI Agentic Design Patterns with AutoGen [deeplearning.ai]
  • 2024.05 - DeepLearning.AI, Multi AI Agent Systems with crewAI [deeplearning.ai]
  • 2024.05 - DeepLearning.AI, Functions, Tools and Agents with LangChain [deeplearning.ai]
  • 2024.05 - DeepLearning.AI, Building Agentic RAG with LlamaIndex [deeplearning.ai]
  • 2024.05 - Chi Wang, Agents in AutoGen [autogen]
  • 2024.05 - Bavor and Taylor, The Guide to AI Agents [SIERRA]
  • 2024.05 - Yohei Nakajima, Future of Autonomous Agents [X's broadcast]
  • 2024.05 - Alex Klein, The agentic era of UX [Medium]
  • 2024.05 - Cobus Greyling, Five Levels Of AI Agents [Medium]
  • 2024.03 - Andrew Ng, Agentic Design Patterns Part 1 [deeplearning.ai]
  • 2024.03 - Harrison Chase, What's next for AI agents [Sequoia Capital, Youtube]
  • 2024.02 - Zaharia et al., The Shift from Models to Compound AI Systems [BAIR]
  • 2024.02 - Vincent Koc, Generative AI Design Patterns: A Comprehensive Guide [Medium]
  • 2023.12 - OpenAI, Practices for Governing Agentic AI Systems [OpenAI]
  • 2023.12 - Victor Dibia, Multi-Agent LLM Applications | A Review of Current Research, Tools, and Challenges [newsletter]
  • 2023.11 - Tanay Varshney, Introduction to LLM Agents [NVIDIA Blog]
  • 2023.06 - Lilian Weng, LLM Powered Autonomous Agents [Lil'Log]
  • Prompt Engineering Guide, LLM Agents [promptingguide.ai]

Libraries

Agent

  • 2023 - Yao et al., ReAct: Synergizing Reasoning and Acting in Language Models (ICLR) [openreview]

  • 2024.05 - Liu et al., Agent Design Pattern Catalogue: A Collection of Architectural Patterns for Foundation Model based Agents [arXiv]

  • 2024.05 - Liu et al., Exploring Prosocial Irrationality for LLM Agents: A Social Cognition View [arXiv]

  • 2024.03 - Spivack et al., Cognition is All You Need -- The Next Layer of AI Above Large Language Models [arXiv]

  • 2024.02 - Zhang et al., Offline Training of Language Model Agents with Functions as Learnable Weights [arXiv]

  • 2024.02 - Li et al., More Agents Is All You Need [arXiv]

  • 2024.02 - Mo et al., A Trembling House of Cards? Mapping Adversarial Attacks against Language Agents [arXiv]

  • 2023.12 - Ge et al., LLM as OS, Agents as Apps: Envisioning AIOS, Agents and the AIOS-Agent Ecosystem [arXiv]

  • 2023.10 - Zeng et al., AgentTuning: Enabling Generalized Agent Abilities for LLMs [arXiv]

  • 2023.10 - Xie et al., OpenAgents: An Open Platform for Language Agents in the Wild [arXiv]

  • 2023.09 - Sumers et al., Cognitive Architectures for Language Agents [arXiv]

  • 2023.08 - Liu et al., AgentBench: Evaluating LLMs as Agents [arXiv]

  • 2023.05 - Xie et al., OlaGPT: Empowering LLMs With Human-like Problem-Solving Abilities [arXiv]

  • 2023.03 - Shen et al., HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face [arXiv]

Profile

Survey

  • 2024.04 - Chen et al., From Persona to Personalization: A Survey on Role-Playing Language Agents [arXiv]
  • 2024.04 - Mathur et al., Advancing Social Intelligence in AI Agents: Technical Challenges and Open Questions [arXiv]

Papers

  • 2024.04 - Yang et al., Social Skill Training with Large Language Models [arXiv]
  • 2024.02 - Xie et al., Can Large Language Model Agents Simulate Human Trust Behaviors? [arXiv]
  • 2023.12 - Yan et al., LARP: Language-Agent Role Play for Open-World Games [arXiv]

Planning and Reasoning

Survey

  • 2023 - Valmeekam et al., On the Planning Abilities of Large Language Models: A Critical Investigation (NeurIPS) [arXiv]

  • 2024.04 - Zhang et al., LLM as a Mastermind: A Survey of Strategic Reasoning with Large Language Models [arXiv]

  • 2024.02 - Huang et al., Understanding the planning of LLM agents: A survey [arXiv]

  • 2023.12 - Sun et al., A Survey of Reasoning with Foundation Models [arXiv]

  • 2023.03 - Yang et al., Foundation Models for Decision Making: Problems, Methods, and Opportunities [arXiv]

Papers

  • 2024 - Chen et al., When is Tree Search Useful for LLM Planning? It Depends on the Discriminator (ACL) [arXiv]

  • 2024 - Qiao et al., AutoAct: Automatic Agent Learning from Scratch for QA via Self-Planning (ACL) [arXiv]

  • 2024 - Kim et al., An LLM Compiler for Parallel Function Calling (ICML) [arXiv]

  • 2024 - Prasad et al., ADaPT: As-Needed Decomposition and Planning with Language Models (NAACL) [arXiv]

  • 2024 - Zhou et al., Enhancing the General Agent Capabilities of Low-Parameter LLMs through Tuning and Multi-Branch Reasoning (NAACL) [arXiv]

  • 2024 - Wang et al., RecMind: Large Language Model Powered Agent For Recommendation (NAACL) [arXiv]

  • 2024 - Roy et al., FLAP: Flow-Adhering Planning with Constrained Decoding in LLMs (NAACL) [arXiv]

  • 2024 - Lee et al., PlanRAG: A Plan-then-Retrieval Augmented Generation for Generative Large Language Models as Decision Makers (NAACL) [openreview]

  • 2024 - Ning et al., Skeleton-of-Thought: Prompting LLMs for Efficient Parallel Generation (ICLR) [openreview]

  • 2024 - Choi et al., LoTa-Bench: Benchmarking Language-oriented Task Planners for Embodied Agents [openreview]

  • 2024 - Qi et al., CivRealm: A Learning and Reasoning Odyssey in Civilization for Decision-Making Agents (ICLR) [openreview]

  • 2023 - Hao et al., Reasoning with Language Model is Planning with World Model (EMNLP) [aclanthology]

  • 2023 - Press et al., Measuring and Narrowing the Compositionality Gap in Language Models (EMNLP) [aclanthology]

  • 2023 - Gupta et al., Visual Programming: Compositional Visual Reasoning Without Training (CVPR) [CVF]

  • 2023 - Khot et al., Decomposed Prompting: A Modular Approach for Solving Complex Tasks (ICLR) [openreview]

  • 2023 - Zhou et al., Least-to-Most Prompting Enables Complex Reasoning in Large Language Models (ICLR) [openreview]

  • 2023 - Valmeekam et al., PlanBench: An Extensible Benchmark for Evaluating Large Language Models on Planning and Reasoning about Change (NeurIPS) [arXiv]

  • 2023 - Yao et al., Tree of Thoughts: Deliberate Problem Solving with Large Language Models [arXiv]

  • 2023 - Wang et al., Plan-and-Solve Prompting: Improving Zero-Shot Chain-of-Thought Reasoning by Large Language Models (ACL) [aclanthology]

  • 2023 - Subramanian et al., Modular Visual Question Answering via Code Generation (ACL) [aclanthology]

  • 2023 - Chen et al., Program of Thoughts Prompting: Disentangling Computation from Reasoning for Numerical Reasoning Tasks (TMLR) [openreview]

  • 2022 - Dua et al., Successive Prompting for Decomposing Complex Questions (EMNLP) [aclanthology]

  • 2024.05 - Xu et al., Faithful Logical Reasoning via Symbolic Chain-of-Thought [arXiv]

  • 2024.05 - Stechly et al., Chain of Thoughtlessness? An Analysis of CoT in Planning [arXiv]

  • 2024.05 - Meta-Task Planning for Language Agents [arXiv]

  • 2024.05 - Verma et al., On the Brittle Foundations of ReAct Prompting for Agentic Large Language Models [arXiv]

  • 2024.04 - Jin et al., Graph Chain-of-Thought: Augmenting Large Language Models by Reasoning on Graphs [arXiv]

  • 2024.04 - Juneja et al., 𝙻𝙼𝟸: A Simple Society of Language Models Solves Complex Reasoning [arXiv]

  • 2024.03 - Zhu et al., KnowAgent: Knowledge-Augmented Planning for LLM-Based Agents [arXiv]

  • 2024.02 - Hirsch et al., What's the Plan? Evaluating and Developing Planning-Aware Techniques for Language Models [arXiv]

  • 2024.02 - Hu et al., Uncertainty of Thoughts: Uncertainty-Aware Planning Enhances Information Seeking in Large Language Models [arXiv]

  • 2024.02 - Stechly et al., On the Self-Verification Limitations of Large Language Models on Reasoning and Planning Tasks [arXiv]

  • 2024.02 - Kambhampati et al., LLMs Can't Plan, But Can Help Planning in LLM-Modulo Frameworks [arXiv]

  • 2023.10 - Zhang et al., Probing the Multi-turn Planning Capabilities of LLMs via 20 Question Games [arXiv]

  • 2023.10 - Wang et al., PromptAgent: Strategic Planning with Language Models Enables Expert-level Prompt Optimization [arXiv]

  • 2023.08 - Besta et al., Graph of Thoughts: Solving Elaborate Problems with Large Language Models [arXiv]

  • 2023.08 - Dagan et al., Dynamic Planning with a LLM [arXiv]

  • 2023.05 - Surís et al., ViperGPT: Visual Inference via Python Execution for Reasoning [arXiv]

  • 2023.05 - Xu et al., ReWOO: Decoupling Reasoning from Observations for Efficient Augmented Language Models [arXiv]

  • 2023.05 - Brahman et al., PlaSma: Making Small Language Models Better Procedural Knowledge Models for (Counterfactual) Planning [arXiv]

  • 2023.04 - Lu et al., Chameleon: Plug-and-Play Compositional Reasoning with Large Language Models [arXiv]

  • 2022.11 - Gao et al., PAL: Program-aided Language Models [arXiv]

Action

Survey

  • 2024.05 - Qu et al., Tool Learning with Large Language Models: A Survey [arXiv]
  • 2024.03 - Wang et al., What Are Tools Anyway? A Survey from the Language Model Perspective [arXiv]
  • 2023.04 - Qin et al., Tool Learning with Foundation Models [arXiv]

Papers

  • 2024 - Basu et al., API-BLEND: A Comprehensive Corpora for Training and Benchmarking API LLMs (ACL) [arXiv]
  • 2024 - Qiao et al., Making Language Models Better Tool Learners with Execution Feedback (NAACL) [arXiv]
  • 2024 - Zhang et al., Reverse Chain: A Generic-Rule for LLMs to Master Multi-API Planning (NAACL) [arXiv]
  • 2024 - Huang et al., Planning and Editing What You Retrieve for Enhanced Tool Learning (NAACL) [arXiv]
  • 2024 - Qian et al., Toolink: Linking Toolkit Creation and Using through Chain-of-Solving on Open-Source Model (NAACL) [arXiv]
  • 2024 - Zheng et al., ToolRerank: Adaptive and Hierarchy-Aware Reranking for Tool Retrieval (LREC-COLING) [arXiv]
  • 2024 - Asai et al., Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection (ICLR) [openreview]
  • 2024 - Xu et al., On the Tool Manipulation Capability of Open-sourced Large Language Models (ICLR) [openreview]
  • 2024 - Gou et al., ToRA: A Tool-Integrated Reasoning Agent for Mathematical Problem Solving (ICLR) [openreview]
  • 2024 - Li et al., Tool-Augmented Reward Modeling (ICLR) [openreview]
  • 2024 - Ruan et al., Identifying the Risks of LM Agents with an LM-Emulated Sandbox (ICLR) [openreview]
  • 2023 - Li et al., API-Bank: A Comprehensive Benchmark for Tool-Augmented LLMs (EMNLP) [aclanthology]
  • 2023 - Jacovi et al., A Comprehensive Evaluation of Tool-Assisted Generation Strategies (EMNLP) [aclanthology]
  • 2023 - Chen et al., ChatCoT: Tool-Augmented Chain-of-Thought Reasoning on Chat-based Large Language Models (EMNLP) [aclanthology]
  • 2023 - Schick et al., Toolformer: Language Models Can Teach Themselves to Use Tools (NeurIPS) [openreview]
  • 2023 - Hao et al., ToolkenGPT: Augmenting Frozen Language Models with Massive Tools via Tool Embeddings (NeurIPS) [openreview]
  • 2023 - Srinivasan et al., NexusRaven: a commercially-permissive Language Model for function calling (NeurIPS) [openreview]
  • 2023 - Yang et al., GPT4Tools: Teaching Large Language Model to Use Tools via Self-instruction (NeurIPS) [openreview]
  • 2022 - Parisi et al., TALM: Tool Augmented Language Models [arXiv]
  • 2024.03 - Wang et al., LLMs in the Imaginarium: Tool Learning through Simulated Trial and Error [arXiv]
  • 2024.02 - Das et al., MATHSENSEI: A Tool-Augmented Large Language Model for Mathematical Reasoning [arXiv]
  • 2024.02 - Du et al., AnyTool: Self-Reflective, Hierarchical Agents for Large-Scale API Calls [arXiv]
  • 2024.02 - Mekala et al., TOOLVERIFIER: Generalization to New Tools via Self-Verification [arXiv]
  • 2024.01 - Shen et al., Small LLMs Are Weak Tool Learners: A Multi-LLM Agent [arXiv]
  • 2024.01 - Gao et al., Efficient Tool Use with Chain-of-Abstraction Reasoning [arXiv]
  • 2024.01 - Yuan et al., EASYTOOL: Enhancing LLM-based Agents with Concise Tool Instruction [arXiv]
  • 2024.01 - Wang et al., TroVE: Inducing Verifiable and Efficient Toolboxes for Solving Programmatic Tasks [arXiv]
  • 2023.12 - NexusRaven-V2: Surpassing GPT-4 for Zero-shot Function Calling [Nexusflow]
  • 2023.08 - Hsieh et al., Tool Documentation Enables Zero-Shot Tool-Usage with Large Language Models [arXiv]
  • 2023.07 - Qin et al., ToolLLM: Facilitating Large Language Models to Master 16000+ Real-world APIs [arXiv]
  • 2023.06 - Song et al., RestGPT: Connecting Large Language Models with Real-World RESTful APIs [arXiv]
  • 2023.06 - Tang et al., ToolAlpaca: Generalized Tool Learning for Language Models with 3000 Simulated Cases [arXiv]
  • 2023.05 - Cai et al., Large Language Models as Tool Makers [arXiv]
  • 2023.05 - Patil et al., Gorilla: Large Language Model Connected with Massive APIs [arXiv]
  • 2023.03 - Shen et al., HuggingGPT: Solving AI Tasks with ChatGPT and its Friends in Hugging Face [arXiv]
  • 2023.03 - Paranjape et al., ART: Automatic multi-step reasoning and tool-use for large language models [arXiv]

Memory

Survey

  • 2024.04 - Zhang et al., A Survey on the Memory Mechanism of Large Language Model based Agents [arXiv][GitHub]

Papers

  • 2024 - Na et al., Efficient Episodic Memory Utilization of Cooperative Multi-Agent Reinforcement Learning (ICLR) [openreview]
  • 2024.06 - Yang et al., Buffer of Thoughts: Thought-Augmented Reasoning with Large Language Models [arXiv]
  • 2023.10 - Zhang et al., Retrieve Anything To Augment Large Language Models [arXiv]
  • 2023.05 - Modarressi et al., RET-LLM: Towards a General Read-Write Memory for Large Language Models [arXiv]

Reflection and Refinement

Survey

  • 2024 - Pan et al., Automatically Correcting Large Language Models: Surveying the landscape of diverse self-correction strategies (TACL) [aclanthology]

Papers

  • 2024 - Yu et al., Teaching Language Models to Self-Improve through Interactive Demonstrations (NAACL) [arXiv]
  • 2024 - Xu et al., LLMRefine: Pinpointing and Refining Large Language Models via Fine-Grained Actionable Feedback (NAACL) [arXiv]
  • 2024 - Miao et al., SelfCheck: Using LLMs to Zero-Shot Check Their Own Step-by-Step Reasoning (ICLR) [openreview]
  • 2024 - Zhou et al., Solving Challenging Math Word Problems Using GPT-4 Code Interpreter with Code-based Self-Verification (ICLR) [openreview]
  • 2024 - Shridhar et al., The ART of LLM Refinement: Ask, Refine, Trust (ICLR) [openreview]
  • 2024 - Huang et al., Large Language Models Cannot Self-Correct Reasoning Yet (ICLR) [openreview]
  • 2024 - Promptbreeder: Self-Referential Self-Improvement via Prompt Evolution (ICLR) [openreview]
  • 2024 - Yao et al., Retroformer: Retrospective Large Language Agents with Policy Gradient Optimization (ICLR) [openreview]
  • 2023 - Huang et al., Large Language Models Can Self-Improve (EMNLP) [aclanthology]
  • 2023 - Shinn et al., Reflexion: Language Agents with Verbal Reinforcement Learning (NeurIPS) [openreview]
  • 2023 - Madaan et al., Self-Refine: Iterative Refinement with Self-Feedback (NeurIPS) [openreview]
  • 2023 - Gero et al., Self-Verification Improves Few-Shot Clinical Information Extraction (IMLH) [openreview]
  • 2024.05 - Renze et al., Self-Reflection in LLM Agents: Effects on Problem-Solving Performance [arXiv]
  • 2024.04 - Naik et al., Generating Situated Reflection Triggers about Alternative Solution Paths: A Case Study of Generative AI for Computer-Supported Collaborative Learning [arXiv]
  • 2024.04 - Tian et al., Toward Self-Improvement of LLMs via Imagination, Searching, and Criticizing [arXiv]
  • 2024.03 - Song et al., Trial and Error: Exploration-Based Trajectory Optimization for LLM Agents [arXiv]
  • 2024.02 - Zhou et al., ArCHer: Training Language Model Agents via Hierarchical Multi-Turn RL [arXiv]
  • 2023.10 - Zhang et al., Self-Convinced Prompting: Few-Shot Question Answering with Repeated Introspection [arXiv]
  • 2023.09 - Dhuliawala et al., Chain-of-Verification Reduces Hallucination in Large Language Models [arXiv]
  • 2023.09 - Shridhar et al., SCREWS: A Modular Framework for Reasoning with Revisions [arXiv]
  • 2023.08 - Wang et al., Shepherd: A Critic for Language Model Generation [arXiv]

Multi-modal

Survey

  • 2024.05 - LLMs Meet Multimodal Generation and Editing: A Survey [arXiv]
  • 2024.01 - Agent AI: Surveying the Horizons of Multimodal Interaction [arXiv]

Papers

  • 2023 - Hu et al., AVIS: Autonomous Visual Information Seeking with Large Language Model Agent (NeurIPS) [openreview]
  • 2022 - Reed et al., A Generalist Agent (TMLR) [arXiv]
  • 2024.04 - Shaham et al., A Multimodal Automated Interpretability Agent [arXiv]
  • 2024.03 - Mar et al., Modeling Multimodal Social Interactions: New Challenges and Baselines with Densely Aligned Representations [arXiv]
  • 2022.09 - Liang et al., Code as Policies: Language Model Programs for Embodied Control [arXiv]

Environments

Survey

Papers

  • 2024 - Zheng et al., Steve-Eye: Equipping LLM-based Embodied Agents with Visual Perception in Open Worlds (ICLR) [arXiv]
  • 2024 - Zhou et al., SOTOPIA: Interactive Evaluation for Social Intelligence in Language Agents (ICLR) [openreview]
  • 2024 - Wu et al., SmartPlay : A Benchmark for LLMs as Intelligent Agents (ICLR) [openreview]
  • 2024.06 - Xi et al., AgentGym: Evolving Large Language Model-based Agents across Diverse Environments [arXiv]
  • 2024.04 - Xie et al., OSWorld: Benchmarking Multimodal Agents for Open-Ended Tasks in Real Computer Environments [arXiv]
  • 2024.03 - SIMA Team, Scaling Instructable Agents Across Many Simulated Worlds [arXiv]
  • 2023.05 - Wang et al., Voyager: An Open-Ended Embodied Agent with Large Language Models [arXiv]
  • 2023.04 - Park et al., Generative Agents: Interactive Simulacra of Human Behavior [arXiv]

Multi Agent

Survey

  • 2024.02 - Han et al., LLM Multi-Agent Systems: Challenges and Open Problems [arXiv]
  • 2024.01 - Guo et al., Large Language Model based Multi-Agents: A Survey of Progress and Challenges [arXiv]
  • 2024.01 - Cheng et al., Exploring Large Language Model based Intelligent Agents: Definitions, Methods, and Prospects [arXiv]

Papers

  • 2024 - Zhang et al., ProAgent: Building Proactive Cooperative Agents with Large Language Models (AAAI) [arXiv]

  • 2024 - Chen et al., CoMM: Collaborative Multi-Agent, Multi-Reasoning-Path Prompting for Complex Problem Solving (NAACL) [arXiv]

  • 2024 - Gong et al., MindAgent: Emergent Gaming Interaction (NAACL) [arXiv]

  • 2024 - Zhang et al., Exploring Collaboration Mechanisms for LLM Agents: A Social Psychology View (ICLR) [openreview]

  • 2024 - Du et al., Improving Factuality and Reasoning in Language Models through Multiagent Debate (ICLR) [openreview]

  • 2024 - Liu et al., Efficient Multi-agent Reinforcement Learning by Planning (ICLR) [openreview]

  • 2024 - Chan et al., ChatEval: Towards Better LLM-based Evaluators through Multi-Agent Debate (ICLR) [openreview]

  • 2024 - Chen et al., AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors (ICLR) [openreview]

  • 2024 - Ruan et al., Identifying the Risks of LM Agents with an LM-Emulated Sandbox (ICLR) [openreview]

  • 2024 - Chen et al., AutoAgents: A Framework for Automatic Agent Generation (ICLR) [openreview]

  • 2024 - Wang et al., Adapting LLM Agents Through Communication (ICLR) [openreview]

  • 2024 - Goktas et al., Efficient Inverse Multiagent Learning (ICLR) [openreview]

  • 2024 - Hu et al., Learning Multi-Agent Communication from Graph Modeling Perspective (ICLR) [openreview]

  • 2023 - Xiong et al., Examining Inter-Consistency of Large Language Models Collaboration: An In-depth Analysis via Debate (EMNLP) [aclanthology]

  • 2024.05 - Sarkar et al., Normative Modules: A Generative Agent Architecture for Learning Norms that Supports Multi-Agent Cooperation [arXiv]

  • 2024.05 - Sun et al., Facilitating Multi-Role and Multi-Behavior Collaboration of Large Language Models for Online Job Seeking and Recruiting [arXiv]

  • 2024.04 - Yue et al., MathVC: An LLM-Simulated Multi-Character Virtual Classroom for Mathematics Education [arXiv]

  • 2024.02 - Wang et al., Multi-Agent Collaboration Framework for Recommender Systems [arXiv]

  • 2024.02 - An et al., Nissist: An Incident Mitigation Copilot based on Troubleshooting Guides [arXiv]

  • 2024.02 - Li et al., Can LLMs Speak For Diverse People? Tuning LLMs via Debate to Generate Controllable Controversial Statements [arXiv]

  • 2024.02 - Fang et al., A Multi-Agent Conversational Recommender System [arXiv]

  • 2023.10 - Agashe et al., LLM-Coordination: Evaluating and Analyzing Multi-agent Coordination Abilities in Large Language Models [arXiv]

  • 2023.07 - Nascimento et al., Self-Adaptive Large Language Model (LLM)-Based Multiagent Systems [arXiv]

  • 2023.07 - Wang et al., Unleashing the Emergent Cognitive Synergy in Large Language Models: A Task-Solving Agent through Multi-Persona Self-Collaboration [arXiv]

  • 2023.06 - Cui et al., Chatlaw: A Multi-Agent Collaborative Legal Assistant with Knowledge Graph Enhanced Mixture-of-Experts Large Language Model [arXiv]

  • 2023.05 - Liang et al., Encouraging Divergent Thinking in Large Language Models through Multi-Agent Debate [arXiv]

  • 2023.04 - Chen et al., Teaching Large Language Models to Self-Debug [arXiv]

Application

Reading and Writing Assistant

  • 2024 - Shao et al., Assisting in Writing Wikipedia-like Articles From Scratch with Large Language Models (NAACL) [arXiv]
  • 2024.02 - Lee et al., A Human-Inspired Reading Agent with Gist Memory of Very Long Contexts [arXiv]

Information Seeking

  • 2024.04 - Chen et al., ChatShop: Interactive Information Seeking with Language Agents [arXiv]

Web Navigation

  • 2024 - Tao et al., WebWISE: Unlocking Web Interface Control for LLMs via Sequential Exploration (NAACL) [arXiv]

  • 2024 - Wang et al., Mobile-Agent: Autonomous Multi-Modal Mobile Device Agent with Visual Perception (ICLR) [openreview]

  • 2024 - Gur et al., A Real-World WebAgent with Planning, Long Context Understanding, and Program Synthesis (ICLR) [openreview]

  • 2024 - Furuta et al., Multimodal Web Navigation with Instruction-Finetuned Foundation Models (ICLR) [openreview]

  • 2024 - Zhang et al., You Only Look at Screens: Multimodal Chain-of-Action Agents (ICLR) [openreview]

  • 2024 - Zhou et al., WebArena: A Realistic Web Environment for Building Autonomous Agents (ICLR) [openreview]

  • 2024 - AutoDroid: LLM-powered Task Automation in Android (MobiCom) [arXiv]

  • 2023 - Ma et al., LASER: LLM Agent with State-Space Exploration for Web Navigation (NeurIPS) [openreview]

  • 2023 - Deng et al., Mind2Web: Towards a Generalist Agent for the Web (NeurIPS) [arXiv]

  • 2022 - Yao et al., WebShop: Towards Scalable Real-World Web Interaction with Grounded Language Agents (NeurIPS) [neurips.cc]

  • 2020 - Li et al., Mapping Natural Language Instructions to Mobile UI Action Sequences (ACL) [aclanthology]

  • 2018 - Liu et al., Reinforcement Learning on Web Interfaces Using Workflow-Guided Exploration (ICLR) [openreview]

  • 2017 - Shi et al., World of Bits: An Open-Domain Platform for Web-Based Agents (ICML) [PMLR]

  • 2024.06 - Li et al., On the Effects of Data Scale on Computer Control Agents [arXiv]

  • 2024.06 - Wang et al., Mobile-Agent-v2: Mobile Device Operation Assistant with Effective Navigation via Multi-Agent Collaboration [arXiv]

  • 2024.05 - Tan et al., Towards General Computer Control: A Multimodal Agent for Red Dead Redemption II as a Case Study [arXiv]

  • 2024.05 - Rawles et al., AndroidWorld: A Dynamic Benchmarking Environment for Autonomous Agents [arXiv]

  • 2024.04 - Zhang et al., MMInA: Benchmarking Multihop Multimodal Internet Agents [arXiv]

  • 2024.04 - Lai et al., AutoWebGLM: Bootstrap And Reinforce A Large Language Model-based Web Navigating Agent [arXiv]

  • 2024.04 - Huang et al., AutoCrawler: A Progressive Understanding Web Agent for Web Crawler Generation [arXiv]

  • 2024.04 - Pan et al., Autonomous Evaluation and Refinement of Digital Agents [arXiv]

  • 2024.03 - Drouin et al., WorkArena: How Capable Are Web Agents at Solving Common Knowledge Work Tasks? [arXiv]

  • 2024.02 - Zhang et al., UFO: A UI-Focused Agent for Windows OS Interaction [arXiv]

  • 2024.02 - Niu et al., ScreenAgent: A Vision Language Model-driven Computer Control Agent [arXiv]

  • 2024.02 - Lù et al., WebLINX: Real-World Website Navigation with Multi-Turn Dialogue [arXiv]

  • 2024.02 - Baechler et al., ScreenAI: A Vision-Language Model for UI and Infographics Understanding [arXiv]

  • 2024.01 - Zheng et al., GPT-4V(ision) is a Generalist Web Agent, if Grounded [arXiv]

  • 2023.12 - Zhang et al., AppAgent: Multimodal Agents as Smartphone Users [arXiv]

  • 2023.11 - Yan et al., GPT-4V in Wonderland: Large Multimodal Models for Zero-Shot Smartphone GUI Navigation [arXiv]

  • 2023.11 - Furuta et al., Exposing Limitations of Language Model Agents in Sequential-Task Compositions on the Web [arXiv]

  • 2023.10 - Ma et al., How to Teach Programming in the AI Era? Using LLMs as a Teachable Agent for Debugging [arXiv]

  • 2023.07 - Rawles et al., Android in the Wild: A Large-Scale Dataset for Android Device Control [arXiv]

  • 2022.02 - Humphreys et al., A Data-Driven Approach for Learning to Control Computers [arXiv]

  • 2021.07 - Nakano et al., WebGPT: Browser-assisted question-answering with human feedback [arXiv]

  • 2021.05 - Toyama et al., AndroidEnv: A Reinforcement Learning Platform for Android [arXiv]

  • 2021.05 - Shvo et al., AppBuddy: Learning to Accomplish Tasks in Mobile Apps via Reinforcement Learning [arXiv]

Code Generation and Software Engineering

Survey

  • 2023.11 - Zheng et al., A Survey of Large Language Models for Code: Evolution, Benchmarking, and Future Trends [arXiv]
  • 2023.10 - Fan et al., Large Language Models for Software Engineering: Survey and Open Problems [arXiv]
  • 2023.07 - Wang et al., Software Testing with Large Language Models: Survey, Landscape, and Vision (IEEE) [arXiv]

Papers

  • 2024 - Qian et al., ChatDev: Communicative Agents for Software Development (ACL) [arXiv]

  • 2024 - Wang et al., Executable Code Actions Elicit Better LLM Agents (ICML) [arXiv]

  • 2024 - Nan et al., On Evaluating the Integration of Reasoning and Action in LLM Agents with Database Question Answering (NAACL) [arXiv]

  • 2024 - Olausson et al., Is Self-Repair a Silver Bullet for Code Generation? (ICLR) [openreview]

  • 2024 - Hong et al., MetaGPT: Meta Programming for A Multi-Agent Collaborative Framework (ICLR) [openreview]

  • 2024 - Jimenez et al., SWE-bench: Can Language Models Resolve Real-world Github Issues? (ICLR) [openreview]

  • 2024 - Xu et al., Lemur: Harmonizing Natural Language and Code for Language Agents (ICLR) [arXiv]

  • 2023 - Li et al., CAMEL: Communicative Agents for "Mind" Exploration of Large Language Model Society (NeurIPS) [arXiv]

  • 2023 - Dibia, LIDA: A Tool for Automatic Generation of Grammar-Agnostic Visualizations and Infographics using Large Language Models (ACL) [aclanthology]

  • 2023 - Zhang et al., Self-Edit: Fault-Aware Code Editor for Code Generation (ACL) [openreview]

  • 2024.05 - Yang et al., SWE-agent: Agent-Computer Interfaces Enable Automated Software Engineering [arXiv]

  • 2024.05 - Tang et al., Code Repair with LLMs gives an Exploration-Exploitation Tradeoff [arXiv]

  • 2024.04 - Zhang et al., AutoCodeRover: Autonomous Program Improvement [arXiv]

  • 2024.03 - Jain et al., LiveCodeBench: Holistic and Contamination Free Evaluation of Large Language Models for Code [arXiv]

  • 2024.03 - Tufano et al., AutoDev: Automated AI-Driven Development [arXiv]

  • 2024.03 - Tao et al., MAGIS: LLM-Based Multi-Agent Framework for GitHub Issue Resolution [arXiv]

  • 2024.02 - Hong et al., Data Interpreter: An LLM Agent For Data Science [arXiv]

  • 2024.02 - Zheng et al., OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement [arXiv]

  • 2023.12 - Huang et al., AgentCoder: Multi-Agent-based Code Generation with Iterative Testing and Optimisation [arXiv]

  • 2023.11 - Qiao et al., TaskWeaver: A Code-First Agent Framework [arXiv]

  • 2023.10 - Huang et al., MLAgentBench: Evaluating Language Agents on Machine Learning Experimentation [arXiv]

  • 2023.06 - Jiang et al., SelfEvolve: A Code Evolution Framework via Large Language Models [arXiv]

  • 2023.04 - Ma et al., Demonstration of InsightPilot: An LLM-Empowered Automated Data Exploration System [arXiv]

  • 2022.11 - Lai et al., DS-1000: A Natural and Reliable Benchmark for Data Science Code Generation [arXiv]

Recruitment Information

Algomatic creates generative AI-native businesses across various fields.
We are looking for colleagues with diverse skills.

Learn More
