5 min read | OraCore Editors

AI agent papers worth tracking in one repo

A curated repo sorts agent papers by theme so you can find planning, skills, harness, and survey work fast.

This repo curates AI agent papers by theme so you can scan the field fast.

This GitHub collection sorts AI agent research into themed buckets and is updated biweekly. Its 1,494 stars suggest solid community interest. If you want a fast way to follow planning, skills, harnesses, and surveys without reading every arXiv feed, this list shows where to start.

| Item | What it covers | Example signals |
| --- | --- | --- |
| Harness | Runtime structure for agent execution | Safety, search, production workflows |
| Skills | Reusable agent abilities | Skill creation, governance, evaluation |
| Survey | Field overviews | Taxonomy, trends, benchmarks |
| Architecture | How agents are organized | Single-agent, multi-agent, ops |
| Applications | Where agents are used | Web, software, data, research |

1. Harness papers for runtime design

The harness section is the best entry point if you care about how agents actually run in production. It gathers papers on execution substrates, safety checks, search behavior, and architecture patterns, which makes it useful for builders who need more than model prompts.

Representative papers in this bucket include "AI Harness Engineering: A Runtime Substrate for Foundation-Model Software Agents", "Is Grep All You Need? How Agent Harnesses Reshape Agentic Search", and "Harness-Bench: Measuring Harness Effects across Models in Realistic Agent Workflows".

  • Focuses on execution, not just prompting
  • Useful for agent ops, evaluation, and safety work
  • Includes survey and benchmark entries

2. Skills papers for reusable agent abilities

If your interest is what agents can learn to do repeatedly, the skills section is the most practical cluster. It covers skill creation, selection, governance, and self-evolution, so you can compare papers that treat skills as modular parts of an agent system.

That makes it a strong fit for teams building long-lived agents. Papers such as "SkillOS: Learning Skill Curation for Self-Evolving Agents", "SkillsVote: Lifecycle Governance of Agent Skills from Collection, Recommendation to Evolution", and "SkillGrad: Optimizing Agent Skills Like Gradient Descent" show how broad the topic has become.

Skill themes you will see here:

  • Skill generation
  • Skill memory and management
  • Least-privilege enforcement
  • Skill evaluation
  • Self-evolving skill systems

3. Survey papers for fast field orientation

The survey bucket is the quickest way to understand where the research is going. Instead of one method, these papers map taxonomies, techniques, and open questions, which is helpful when you need a clean overview before choosing a subtopic.

For a broad starting point, "A Comprehensive Survey on Agent Skills: Taxonomy, Techniques, and Applications" and "Generate, Filter, Control, Replay: A Comprehensive Survey of Rollout Strategies for LLM Reinforcement Learning" show the repository's survey style. The collection also points to related work on collaboration, failure attribution, and self-evaluation.

  • Good for literature reviews and slide prep
  • Helps identify subfields worth deeper reading
  • Pairs well with benchmark papers

4. Architecture papers for agent system design

The architecture section organizes papers around single-agent, multi-agent, and agent-ops patterns. That is useful if you are deciding how to structure a product, because the papers here are about system shape as much as model behavior.

Use this section when you need to compare coordination styles or operational patterns. The repo’s links make it easy to jump from broad architecture choices to more specific application areas like digital agents or enterprise agents.

  • Single-agent setups for focused tasks
  • Multi-agent setups for coordination and division of labor
  • Agent-ops and UX for production deployment

5. Application papers for domain-specific use cases

The application sections are where the repository becomes especially useful for practitioners. Instead of staying abstract, it sorts papers into embodied, web, mobile, software, data, research, API, deep research, enterprise, and finance agents.

That lets you jump straight to the environment you care about. If you are building a browser worker, a coding assistant, or a research copilot, the application pages narrow the reading list quickly and reduce time spent on irrelevant papers.

Examples of application clusters:

  • Web agents
  • GUI agents
  • Software agents
  • Research agents
  • Enterprise agents

How to decide

Pick harness papers if you care about execution and safety, skills papers if you want reusable capabilities, surveys if you need orientation, and architecture or application papers if you are building a system for a specific setting. For most readers, the fastest path is survey first, then harness or skills, then the application area that matches the product.
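The guidance above can be written down as a small lookup plus a default reading order. This is only an illustrative sketch: the goal labels, the `BUCKET_FOR_GOAL` table, and the `default_path` helper are hypothetical names chosen here, not part of the repo or any tool it ships.

```python
from typing import Optional

# Goal -> bucket, paraphrasing the guidance above.
# The goal labels are made up for this sketch; the repo does not define them.
BUCKET_FOR_GOAL = {
    "execution_and_safety": "harness",
    "reusable_capabilities": "skills",
    "orientation": "survey",
    "system_structure": "architecture",
    "specific_setting": "applications",
}


def default_path(focus: str, application_area: Optional[str] = None) -> list:
    """Survey first, then harness or skills, then the matching application area."""
    if focus not in ("harness", "skills"):
        raise ValueError("focus must be 'harness' or 'skills'")
    path = ["survey", focus]
    if application_area:
        # Hypothetical label for an application cluster, e.g. "web".
        path.append("applications/" + application_area)
    return path


if __name__ == "__main__":
    print(BUCKET_FOR_GOAL["execution_and_safety"])
    print(default_path("skills", "web"))
```

A team building a browser worker, for instance, might call `default_path("harness", "web")` to get a three-step reading order and then adjust it by hand.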

Because the repo is updated biweekly, it works well as a living reading list rather than a one-time roundup.