Simplifying Coding Agents: Understanding Rules, Skills, and Everything In Between

The world of AI coding agents is filled with a dizzying array of terms. Rules, commands, MCP servers, subagents, modes, hooks, and tools all represent different layers of functionality. But does it need to be this complex? Let’s demystify these concepts, explore their history, and simplify them into a more coherent framework.

The Early Days: Static Context and Rules

When agents were first emerging, language models would frequently hallucinate or invent incorrect information. To combat this, the concept of a “rules file” was introduced. This file was a simple yet effective way to include essential context in every single conversation with the agent. It could contain details about your codebase, specific business requirements, or corrections for common model errors.

This worked quite well.

Over time, as these rules files grew longer, they were broken into smaller, nested sub-files for better organization. Regardless of the structure—one file or ten—they all combined into what can be called static context. This is a block of information included in every single interaction, providing a consistent foundation for the agent.
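As a minimal sketch of how nested rules files collapse into one block of static context, the function below walks a directory and concatenates every Markdown file it finds. The directory layout, the `.md` extension, and the name `build_static_context` are assumptions made for illustration; real agents each have their own discovery rules.

```python
from pathlib import Path


def build_static_context(rules_dir: Path) -> str:
    """Concatenate every rules file under rules_dir into one block of text."""
    sections = []
    # Sort the paths so the combined context is identical on every run.
    for path in sorted(rules_dir.rglob("*.md")):
        body = path.read_text(encoding="utf-8").strip()
        if body:
            # Label each section with its file so the origin stays traceable.
            label = path.relative_to(rules_dir).as_posix()
            sections.append(f"# Rules from {label}\n{body}")
    return "\n\n".join(sections)
```

Whether the rules live in one file or ten, the agent sees a single string, which is why the file structure is purely an organizational choice.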

The main limitation was that this context was always present. Ideally, an agent could conditionally decide which rules were relevant for a specific task and include only those. At the time, however, models were not consistently good at tool-calling or advanced reasoning, making this a difficult challenge. It was the right idea, but a little ahead of its time.

Introducing Workflows: The Rise of Commands

As developers began integrating agents into their daily work, common patterns and workflows started to emerge. This led to the idea of a slash command or a custom command. These commands allow you to package a repeated prompt and run it on demand.

You can define a workflow, share it with your team, and even check it into version control. It’s a powerful way to conditionally execute a specific process right from your agent’s input box.

For instance, a common use case is streamlining the Git process.

Example command: a single command like /create-pr could be configured to perform a series of actions:

Stage all current changes.
Generate a commit message based on the diff.
Commit the changes.
Push the branch to the remote repository.
Open a pull request on GitHub.
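The steps above can be sketched as a plain list of shell commands. This is an illustration rather than how any particular agent implements /create-pr: it assumes the commit message has already been generated from the diff (the agent's job), that the remote is named `origin`, and that the GitHub CLI (`gh`) is installed. The function names are hypothetical.

```python
import shlex
import subprocess


def create_pr_steps(branch: str, commit_message: str) -> list[list[str]]:
    """Return the commands for the workflow, in order.

    The commit message is passed in because generating it from the diff
    is done by the agent, not by this script.
    """
    return [
        ["git", "add", "--all"],                              # stage all changes
        ["git", "commit", "-m", commit_message],              # commit them
        ["git", "push", "--set-upstream", "origin", branch],  # push the branch
        ["gh", "pr", "create", "--fill"],                     # open the pull request
    ]


def run_steps(steps: list[list[str]], dry_run: bool = True) -> list[str]:
    """Run each command, or only report it when dry_run is True."""
    report = []
    for cmd in steps:
        if not dry_run:
            # check=True stops the workflow at the first failing command.
            subprocess.run(cmd, check=True)
        report.append(shlex.join(cmd))
    return report
```

Calling `run_steps(create_pr_steps("my-branch", "Fix typo"))` only reports the commands; pass `dry_run=False` to execute them.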

Expanding Capabilities: MCP Servers and Third-Party Tools

So far, we’ve only discussed injecting text into the agent’s context. But what about running code? This is where MCP (Model Context Protocol) servers come in.

An MCP server was more than just a prompt. It allowed you to run a full server, connect to existing systems, and—most importantly—handle authentication protocols like OAuth.

While first-party tools gave agents the ability to run shell commands or edit files, MCP servers exposed third-party tools. Suddenly, the agent could gain new abilities, like reading a Slack message or creating a new issue in a project management tool. The definitions of these tools were added to the context window, effectively expanding the agent’s capabilities.

The primary downside was context bloat. If you had many different tools installed, the initial context window could become enormous, consuming valuable tokens and potentially slowing down performance.

Refining Control: Modes and Sub-Agents

The next evolution focused on providing more granular control through modes and sub-agents. A sub-agent functions like a specialized prompt. You can assign it a persona or a specific task and, crucially, limit the scope of the tools it can access.
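To make the idea of limiting a sub-agent's tool scope concrete, here is a small sketch. The `TOOLS` table, the `SubAgent` class, and the tool names are all hypothetical stand-ins; the point is only that the allow-list is checked before any tool is dispatched.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical first-party tools, stubbed so the example runs on its own.
TOOLS: dict[str, Callable[[str], str]] = {
    "read_file": lambda arg: f"contents of {arg}",
    "run_shell": lambda arg: f"ran {arg}",
}


@dataclass(frozen=True)
class SubAgent:
    persona: str
    allowed_tools: frozenset[str]

    def call_tool(self, name: str, arg: str) -> str:
        # Refuse anything outside the sub-agent's scope before dispatching.
        if name not in self.allowed_tools:
            raise PermissionError(f"{self.persona} may not use {name}")
        return TOOLS[name](arg)
```

A reviewer persona created with `frozenset({"read_file"})` can read files but raises `PermissionError` if it tries to call `run_shell`.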

Modes took this concept a step further. A mode not only provides instructions (e.g., “you are in planning mode”) but can also:

Modify the system prompt to inform the agent about available tools and modes.
Grant access to new tools, such as a GUI for creating and modifying plans.
Introduce other UI changes to reflect the current mode.
Add reminders in the prompt to keep the agent focused on its task.

The entire point of these concepts was to make the agent’s behavior more reliable and its capabilities more discoverable. A mode is easily surfaced in a UI, and the underlying tricks help ensure the agent produces a consistent, high-quality output. However, it’s important to remember that these are still non-deterministic systems, and things can go wrong.

Ensuring Reliability: Deterministic Hooks

To introduce absolute predictability, the next evolution was hooks. Hooks are deterministic actions that fire at specific points in the agent’s lifecycle.

For example:

Pre-execution hook: You could use a hook to always inject a piece of static context into every single run.
Post-execution hook: After a conversation finishes, a hook could log the interaction or save it to a database.

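Hooks provide a deterministic way to run code before or after an agent’s main task: the hook always fires at its point in the lifecycle, even though the model’s own output remains non-deterministic.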
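A rough sketch of the pre- and post-execution pattern follows. `HookRunner` and the shape of the `state` dictionary are assumptions for illustration, and the "agent" is any callable that maps a prompt to a response; actual agents define their own hook events and payloads.

```python
from typing import Callable

Hook = Callable[[dict], None]


class HookRunner:
    """Runs registered hooks around a single agent call."""

    def __init__(self) -> None:
        self.pre_hooks: list[Hook] = []
        self.post_hooks: list[Hook] = []

    def run(self, prompt: str, agent: Callable[[str], str]) -> dict:
        state = {"prompt": prompt, "response": None}
        # Pre-execution hooks always run, in registration order,
        # and may rewrite the prompt (for example to inject static context).
        for hook in self.pre_hooks:
            hook(state)
        state["response"] = agent(state["prompt"])
        # Post-execution hooks always run after the agent returns,
        # for example to log the interaction.
        for hook in self.post_hooks:
            hook(state)
        return state
```

The hooks are ordinary code, so their behavior does not depend on what the model decides to do; only the `agent` call in the middle is non-deterministic.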

The Unification: The Power of Skills

All these concepts—prompts, code execution, static context, and dynamic capabilities—ideally can be unified under an open standard that everyone can rally behind. This is the goal of skills.

A skill is a versatile concept that encapsulates these ideas.

In its basic form, a skill can be just like a command. A repeatable workflow, like the Git pull request flow mentioned earlier, can be modeled as a skill. It doesn’t bloat the initial context size and can be easily packaged and shared.
In its most advanced form, a skill can be a combination of scripts, executables, and assets—anything you want to bundle together. It’s just code, and you can distribute it to your team without the negative performance hit of bloating the context when it’s not in use.
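One way to picture why a skill avoids context bloat is the sketch below: only a short name and description sit in the initial context, and the full body is loaded when the skill is actually invoked. The `Skill` class and both functions are hypothetical, and the exact on-disk format of a skill is left out on purpose.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Skill:
    name: str
    description: str
    load_body: Callable[[], str]  # called only when the skill is invoked


def initial_context(skills: list[Skill]) -> str:
    """Build the short listing the agent sees in every conversation."""
    return "\n".join(f"- {s.name}: {s.description}" for s in skills)


def invoke(skills: list[Skill], name: str) -> str:
    """Load the full instructions for one skill on demand."""
    for skill in skills:
        if skill.name == name:
            return skill.load_body()
    raise KeyError(f"unknown skill: {name}")
```

The listing costs one line per skill regardless of how large the bundled scripts and assets are, which is the property that separates dynamic context from static context.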

This is a fantastic simplification. For a user of a coding agent, it boils the landscape down to two primary concepts: static and dynamic context.

The Modern Approach: Rules and Skills

With this new paradigm, we can focus on just two things:

Rules (Static Context): Prompts and information that you want to include in every single conversation.
Skills (Dynamic Context): Ways to run code or give your agent new powers without bloating the initial context window.

Coding agents themselves can then make optimizations based on this pattern. For example, some modern agents have learned from the skills pattern to improve how they handle third-party tools. If you have ten MCP servers installed, each with ten different tools, the agent can be optimized to only load the tools when they are actually used, preserving features like OAuth where needed.
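The lazy-loading optimization described above might look roughly like this. `LazyToolRegistry` is a hypothetical name, each server is represented by a plain callable that returns its tool schemas, and no real MCP client library is used; an actual agent would fetch the schemas over the protocol.

```python
from typing import Callable

# A loader returns one server's tools as {tool name: schema text}.
ToolLoader = Callable[[], dict[str, str]]


class LazyToolRegistry:
    """Fetches a server's tool schemas the first time one of them is used."""

    def __init__(self, servers: dict[str, ToolLoader]) -> None:
        self._servers = servers
        self._loaded: dict[str, dict[str, str]] = {}

    def loaded_servers(self) -> list[str]:
        return sorted(self._loaded)

    def get_tool(self, server: str, tool: str) -> str:
        # Load and cache this server's schemas only on first use.
        if server not in self._loaded:
            self._loaded[server] = self._servers[server]()
        return self._loaded[server][tool]
```

With ten servers registered, using one tool loads ten schemas rather than all one hundred, and repeated calls to the same server reuse the cached result.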

While skills and rules cover most use cases, you might still need hooks for deterministic runs or custom sub-agents for complex tasks like parallel processing or research. However, for most users, thinking in terms of skills and rules is sufficient.

Best Practices and The Road Ahead

For rules, the best practice is to provide minimal, high-quality context. Since this information is included in every conversation, it should be a living artifact. If you see the coding agent make a mistake, update your AGENTS.md or equivalent rules file. This allows the agent’s core knowledge to evolve and improve over time.

For skills, the ecosystem is still new. As more developers adopt the skills-based model over the next year, we can expect best practices to emerge and the ecosystem to flourish, making skills an increasingly important part of the development workflow.

Hopefully, this article helps you make sense of the complex history of coding agents. By compressing these concepts down to rules and skills, you can become a more effective and powerful user of AI in your daily coding.