Reduce tool count to make tool calling more reliable.

AI Christianson 2025-02-08 18:26:08 -05:00
parent 13016278e5
commit 0c86900ce4
6 changed files with 54 additions and 63 deletions

@@ -31,7 +31,7 @@ def print_stage_header(stage: str) -> None:
icon = icons.get(stage_key, "🚀")
# Create styled panel with icon
panel_content = f"{icon} {stage_title}"
panel_content = f" {icon} {stage_title}"
console.print(Panel(panel_content, style="green bold", padding=0))

@@ -119,7 +119,9 @@ Prioritize checking current documentation for technical advice.
# New project hints
NEW_PROJECT_HINTS = """
If the user did not specify a stack, use your best judgment, or make a proposal and ask the human if the human-in-the-loop tool is available.
Because this is a new project:
- If the user did not specify a stack, use your best judgment, or make a proposal and ask the human if the human-in-the-loop tool is available.
- If the user did not specify a directory to create the project in, create directly in the current directory.
"""
# Research stage prompt - guides initial codebase analysis
@@ -286,10 +288,6 @@ You have often been criticized for:
{human_section}
{web_research_section}
If you make tool calls incorrectly, you **WILL** get errors like the following:
Error: 1 validation error for emit_research_notes notes Field required [type=missing, input_value=, input_type=dict]
NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
"""
@@ -501,12 +499,6 @@ You have often been criticized for:
- Not indicating confidence levels or noting uncertainties
- Not calling tools/functions properly, e.g. leaving off required arguments, calling a tool in a loop, calling tools inappropriately.
If you make tool calls incorrectly, you **WILL** get errors like the following:
Error: 1 validation error for emit_research_notes notes Field required [type=missing, input_value=, input_type=dict]
NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
"""
@@ -602,52 +594,32 @@ NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
IMPLEMENTATION_PROMPT = """Current Date: {current_date}
Working Directory: {working_directory}
Base-level task (for reference only):
<base task>
{base_task}
</base task>
keep it simple. if the expert tool is available, use it frequently for high level logic and planning.
Plan Overview (for reference only, remember you are only implementing your specific task):
{plan}
Key Facts:
<key facts>
{key_facts}
</key facts>
Key Snippets:
<key snippets>
{key_snippets}
</key snippets>
Relevant Files:
<relevant files>
{related_files}
</relevant files>
Research Notes:
<research notes>
{research_notes}
Work done so far:
<work log>
{work_log}
</work log>
</research notes>
Important Notes:
- Focus solely on the given task and implement it as described.
- Scale the complexity of your solution to the complexity of the request. For simple requests, keep it straightforward and minimal. For complex requests, maintain the previously planned depth.
- Use delete_key_facts to remove facts that become outdated, irrelevant, or duplicated.
- Use emit_key_snippets to manage code sections before and after modifications in batches.
- Regularly remove outdated snippets with delete_key_snippets.
Instructions:
1. Review the provided base task, plan, and key facts.
2. Implement only the specified task:
<task definition>
{task}
</task definition>
3. Work incrementally, validating as you go. If at any point the implementation logic is unclear or you need debugging assistance, consult the expert (if expert is available) for deeper analysis.
4. Use delete_key_facts to remove any key facts that no longer apply.
5. Do not add features not explicitly required.
6. Only create or modify files directly related to this task.
7. Use file_str_replace and write_file_tool for simple file modifications.
8. Delegate to run_programming_task for more complex programming tasks. This is a capable human programmer that can work on multiple files at once.
- Work incrementally, validating as you go. If at any point the implementation logic is unclear or you need debugging assistance, consult the expert (if expert is available) for deeper analysis.
- Do not add features not explicitly required.
- Only create or modify files directly related to this task.
- Use file_str_replace and write_file_tool for simple file modifications.
- Delegate to run_programming_task for more complex programming tasks. This is a capable human programmer that can work on multiple files at once.
Testing:
@@ -659,7 +631,7 @@ Testing:
- Only test UI components if there is already a UI testing system in place.
- Only test things that can be tested by an automated process.
Once the task is complete, ensure all updated files are emitted.
Once the task is complete, ensure all updated files are registered with emit_related_files.
{expert_section}
{human_section}
@@ -670,10 +642,16 @@ You have often been criticized for:
- Doing changes outside of the specific scoped instructions.
- Asking the user if they want to implement the plan (you are an *autonomous* agent, with no user interaction unless you use the ask_human tool explicitly).
- Not calling tools/functions properly, e.g. leaving off required arguments, calling a tool in a loop, calling tools inappropriately.
- Using run_programming_task to simply write the full contents of files when you could have used write_file_tool instead.
If you make tool calls incorrectly, you **WILL** get errors like the following:
Instructions:
1. Review the provided base task, plan, and key facts.
2. Implement only the specified task:
<task definition>
{task}
</task definition>
Error: 1 validation error for run_programming_task instructions Field required [type=missing, input_value=, input_type=dict]
KEEP IT SIMPLE
NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
"""

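The prompt constants in this file are ordinary Python strings with `{placeholder}` fields. As a minimal sketch of how such a template could be filled, assuming `str.format` is the mechanism (the diff does not show the call site), the shortened template `PROMPT_SKETCH` and the helper `render_prompt` below are hypothetical stand-ins, not the project's code:

```python
from datetime import date

# Hypothetical, shortened stand-in for IMPLEMENTATION_PROMPT.
PROMPT_SKETCH = """Current Date: {current_date}
Working Directory: {working_directory}

Instructions:
1. Review the provided base task, plan, and key facts.
2. Implement only the specified task:
<task definition>
{task}
</task definition>

KEEP IT SIMPLE
"""


def render_prompt(task: str, working_directory: str) -> str:
    """Fill every placeholder in the sketch template (assumes str.format)."""
    return PROMPT_SKETCH.format(
        current_date=date.today().isoformat(),
        working_directory=working_directory,
        task=task,
    )


rendered = render_prompt("Add a --version flag", "/tmp/project")
print(rendered)
```

Placing the task definition near the end of the template, as the commit does, keeps the concrete task close to where the model starts generating.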
@@ -30,6 +30,7 @@ from ra_aid.tools.agent import (
request_web_research,
)
from ra_aid.tools.memory import one_shot_completed
from ra_aid.tools.write_file import write_file_tool
# Read-only tools that don't modify system state
@@ -48,10 +49,11 @@ def get_read_only_tools(
tools = [
emit_related_files,
emit_key_facts,
delete_key_facts,
emit_key_snippets,
delete_key_snippets,
deregister_related_files,
# *TEMPORARILY* disabled to improve tool calling perf.
# delete_key_facts,
# delete_key_snippets,
# deregister_related_files,
list_directory_tree,
read_file_tool,
fuzzy_find_project_files,
@@ -70,14 +72,15 @@ def get_read_only_tools(
# Define constant tool groups
READ_ONLY_TOOLS = get_read_only_tools()
MODIFICATION_TOOLS = [run_programming_task]
MODIFICATION_TOOLS = [run_programming_task, write_file_tool]
COMMON_TOOLS = get_read_only_tools()
EXPERT_TOOLS = [emit_expert_context, ask_expert]
RESEARCH_TOOLS = [
emit_research_notes,
one_shot_completed,
monorepo_detected,
ui_detected,
# *TEMPORARILY* disabled to improve tool calling perf.
# one_shot_completed,
# monorepo_detected,
# ui_detected,
]
@@ -129,9 +132,10 @@ def get_planning_tools(
# Add planning-specific tools
planning_tools = [
emit_plan,
request_task_implementation,
plan_implementation_completed,
# *TEMPORARILY* disabled to improve tool calling perf.
# emit_plan,
# plan_implementation_completed,
]
tools.extend(planning_tools)
@@ -202,9 +206,10 @@ def get_chat_tools(
request_research,
request_research_and_implementation,
emit_key_facts,
delete_key_facts,
delete_key_snippets,
deregister_related_files,
# *TEMPORARILY* disabled to improve tool calling perf.
# delete_key_facts,
# delete_key_snippets,
# deregister_related_files,
]
if web_research_enabled:
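
The hunks above shrink each tool group by commenting entries out and add `write_file_tool` to `MODIFICATION_TOOLS`. The following is a minimal sketch of that composition, not the project's code: the tool functions are empty stand-ins, and `get_read_only_tools_sketch` and `tool_names` are hypothetical names (the real `get_read_only_tools` takes parameters that the diff truncates).

```python
from typing import Callable, List


# Empty stand-ins; the real tools live in ra_aid.tools and are not reproduced here.
def emit_related_files() -> None: ...
def emit_key_facts() -> None: ...
def emit_key_snippets() -> None: ...
def run_programming_task() -> None: ...
def write_file_tool() -> None: ...


def get_read_only_tools_sketch() -> List[Callable]:
    """Return the reduced read-only group (hypothetical, simplified)."""
    return [
        emit_related_files,
        emit_key_facts,
        emit_key_snippets,
        # delete_key_facts, delete_key_snippets and deregister_related_files
        # are omitted, mirroring the entries the commit comments out.
    ]


# After the commit, the modification group holds two tools instead of one.
MODIFICATION_TOOLS_SKETCH = [run_programming_task, write_file_tool]


def tool_names(tools: List[Callable]) -> List[str]:
    """Hypothetical helper: list tool names to inspect what an agent is offered."""
    return [tool.__name__ for tool in tools]


names = tool_names(get_read_only_tools_sketch() + MODIFICATION_TOOLS_SKETCH)
print(len(names), names)
```

Listing the names this way makes it easy to check how many tools an agent is offered after a group is trimmed.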

@@ -248,6 +248,8 @@ def request_research_and_implementation(query: str) -> Dict[str, Any]:
def request_task_implementation(task_spec: str) -> Dict[str, Any]:
"""Spawn an implementation agent to execute the given task.
Task specs should have the requirements. Generally, the spec will not include any code.
Args:
task_spec: REQUIRED The full task specification (markdown format, typically one part of the overall plan)
"""

@@ -387,6 +387,8 @@ def plan_implementation_completed(message: str) -> str:
Returns:
Confirmation message
"""
_global_memory["task_completed"] = True
_global_memory["completion_message"] = message
_global_memory["plan_completed"] = True
_global_memory["completion_message"] = message
_global_memory["tasks"].clear() # Clear task list when plan is completed

@@ -33,16 +33,20 @@ def run_programming_task(
The programmer sees only what you provide, no conversation history.
Give detailed instructions but do not write their code.
Give detailed instructions including multi-file tasks but do not write their code.
They are intelligent and can edit multiple files.
The programmer cannot run commands.
If new files are created, emit them after finishing.
They can add/modify files, but not remove. Use run_shell_command to remove files. If referencing files you'll delete, remove them after they finish.
Use write_file_tool instead if you need to write the entire contents of file(s).
If the programmer wrote files, they actually wrote to disk. You do not need to rewrite the output of what the programmer showed you.
Args:
instructions: REQUIRED Programming task instructions (markdown format, use newlines and as many tokens as needed)
instructions: REQUIRED Programming task instructions (markdown format, use newlines and as many tokens as needed, no commands allowed)
files: Optional; if not provided, uses related_files
Returns: { "output": stdout+stderr, "return_code": 0 if success, "success": True/False }
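
The docstring describes the result as a dict with `output`, `return_code`, and `success` keys. The sketch below shows one way such a dict could be built from a subprocess, assuming stdout and stderr are merged into a single stream. `run_and_report` is a hypothetical helper, not the body of `run_programming_task`, whose implementation is not part of this diff.

```python
import subprocess
import sys
from typing import Any, Dict, List


def run_and_report(cmd: List[str]) -> Dict[str, Any]:
    """Run a command and report it in the shape the docstring describes."""
    proc = subprocess.run(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr into stdout ("stdout+stderr")
        text=True,
    )
    return {
        "output": proc.stdout,
        "return_code": proc.returncode,
        "success": proc.returncode == 0,
    }


# Use the current interpreter so the example does not depend on external programs.
result = run_and_report([sys.executable, "-c", "print('done')"])
print(result)
```

A caller would typically check `success` first and only inspect `output` when diagnosing a failure.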