Reduce tool count to make tool calling more reliable.

2025-02-08 18:26:08 -05:00
parent 13016278e5
commit 0c86900ce4
6 changed files with 54 additions and 63 deletions
--- a/ra_aid/console/formatting.py
+++ b/ra_aid/console/formatting.py
@@ -31,7 +31,7 @@ def print_stage_header(stage: str) -> None:
    icon = icons.get(stage_key, "🚀")
    # Create styled panel with icon
-    panel_content = f"{icon} {stage_title}"
+    panel_content = f" {icon} {stage_title}"
    console.print(Panel(panel_content, style="green bold", padding=0))
--- a/ra_aid/prompts.py
+++ b/ra_aid/prompts.py
@@ -119,7 +119,9 @@ Prioritize checking current documentation for technical advice.
 # New project hints
 NEW_PROJECT_HINTS = """
-If the user did not specify a stack, use your best judgment, or make a proposal and ask the human if the human-in-the-loop tool is available.
+Because this is a new project:
 - If the user did not specify a stack, use your best judgment, or make a proposal and ask the human if the human-in-the-loop tool is available.
 - If the user did not specify a directory to create the project in, create directly in the current directory.
 """
 # Research stage prompt - guides initial codebase analysis
@@ -286,10 +288,6 @@ You have often been criticized for:
 {human_section}
 {web_research_section}
 If you make tool calls incorrectly, you **WILL** get errors like the following:
 Error: 1 validation error for emit_research_notes notes Field required [type=missing, input_value=, input_type=dict]
 NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
 """
@@ -501,12 +499,6 @@ You have often been criticized for:
    - Not indicating confidence levels or noting uncertainties
    - Not calling tools/functions properly, e.g. leaving off required arguments, calling a tool in a loop, calling tools inappropriately.
 If you make tool calls incorrectly, you **WILL** get errors like the following:
 Error: 1 validation error for emit_research_notes notes Field required [type=missing, input_value=, input_type=dict]
 NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
 """
@@ -602,52 +594,32 @@ NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
 IMPLEMENTATION_PROMPT = """Current Date: {current_date}
 Working Directory: {working_directory}
-Base-level task (for reference only):
+<key facts>
 <base task>
 {base_task}
 </base task>
 keep it simple. if the expert tool is available, use it frequently for high level logic and planning.
 Plan Overview (for reference only, remember you are only implementing your specific task):
 {plan}
 Key Facts:
 {key_facts}
 </key facts>
-Key Snippets:
+<key snippets>
 {key_snippets}
 </key snippets>
-Relevant Files:
+<relevant files>
 {related_files}
 </relevant files>
-Research Notes:
+<research notes>
 {research_notes}
-
+</research notes>
 Work done so far:
 <work log>
 {work_log}
 </work log>
 Important Notes:
 - Focus solely on the given task and implement it as described.
 - Scale the complexity of your solution to the complexity of the request. For simple requests, keep it straightforward and minimal. For complex requests, maintain the previously planned depth.
 - Use delete_key_facts to remove facts that become outdated, irrelevant, or duplicated.
 - Use emit_key_snippets to manage code sections before and after modifications in batches.
 - Regularly remove outdated snippets with delete_key_snippets.
 Instructions:
 1. Review the provided base task, plan, and key facts.
 2. Implement only the specified task:
 <task definition>
 {task}
 </task definition>
-3. Work incrementally, validating as you go. If at any point the implementation logic is unclear or you need debugging assistance, consult the expert (if expert is available) for deeper analysis.
+- Work incrementally, validating as you go. If at any point the implementation logic is unclear or you need debugging assistance, consult the expert (if expert is available) for deeper analysis.
-4. Use delete_key_facts to remove any key facts that no longer apply.
+- Do not add features not explicitly required.
-5. Do not add features not explicitly required.
+- Only create or modify files directly related to this task.
-6. Only create or modify files directly related to this task.
+- Use file_str_replace and write_file_tool for simple file modifications.
-7. Use file_str_replace and write_file_tool for simple file modifications.
+- Delegate to run_programming_task for more complex programming tasks. This is a capable human programmer that can work on multiple files at once.
 8. Delegate to run_programming_task for more complex programming tasks. This is a capable human programmer that can work on multiple files at once.
 Testing:
@@ -659,7 +631,7 @@ Testing:
 - Only test UI components if there is already a UI testing system in place.
 - Only test things that can be tested by an automated process.
-Once the task is complete, ensure all updated files are emitted.
+Once the task is complete, ensure all updated files are registered with emit_related_files.
 {expert_section}
 {human_section}
@@ -670,10 +642,16 @@ You have often been criticized for:
  - Doing changes outside of the specific scoped instructions.
  - Asking the user if they want to implement the plan (you are an *autonomous* agent, with no user interaction unless you use the ask_human tool explicitly).
  - Not calling tools/functions properly, e.g. leaving off required arguments, calling a tool in a loop, calling tools inappropriately.
  - Using run_programming_task to simply write the full contents of files when you could have used write_file_tool instead.
-If you make tool calls incorrectly, you **WILL** get errors like the following:
+Instructions:
 1. Review the provided base task, plan, and key facts.
 2. Implement only the specified task:
 <task definition>
 {task}
 </task definition>
-Error: 1 validation error for run_programming_task instructions Field required [type=missing, input_value=, input_type=dict] 
+KEEP IT SIMPLE
 NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
 """
--- a/ra_aid/tool_configs.py
+++ b/ra_aid/tool_configs.py
@@ -30,6 +30,7 @@ from ra_aid.tools.agent import (
    request_web_research,
 )
 from ra_aid.tools.memory import one_shot_completed
 from ra_aid.tools.write_file import write_file_tool
 # Read-only tools that don't modify system state
@@ -48,10 +49,11 @@ def get_read_only_tools(
    tools = [
        emit_related_files,
        emit_key_facts,
        delete_key_facts,
        emit_key_snippets,
-        delete_key_snippets,
+        # *TEMPORARILY* disabled to improve tool calling perf.
-        deregister_related_files,
+        # delete_key_facts,
        # delete_key_snippets,
        # deregister_related_files,
        list_directory_tree,
        read_file_tool,
        fuzzy_find_project_files,
@@ -70,14 +72,15 @@ def get_read_only_tools(
 # Define constant tool groups
 READ_ONLY_TOOLS = get_read_only_tools()
-MODIFICATION_TOOLS = [run_programming_task]
+MODIFICATION_TOOLS = [run_programming_task, write_file_tool]
 COMMON_TOOLS = get_read_only_tools()
 EXPERT_TOOLS = [emit_expert_context, ask_expert]
 RESEARCH_TOOLS = [
    emit_research_notes,
-    one_shot_completed,
+    # *TEMPORARILY* disabled to improve tool calling perf.
-    monorepo_detected,
+    # one_shot_completed,
-    ui_detected,
+    # monorepo_detected,
    # ui_detected,
 ]
@@ -129,9 +132,10 @@ def get_planning_tools(
    # Add planning-specific tools
    planning_tools = [
        emit_plan,
        request_task_implementation,
-        plan_implementation_completed,
+        # *TEMPORARILY* disabled to improve tool calling perf.
        # emit_plan,
        # plan_implementation_completed,
    ]
    tools.extend(planning_tools)
@@ -202,9 +206,10 @@ def get_chat_tools(
        request_research,
        request_research_and_implementation,
        emit_key_facts,
-        delete_key_facts,
+        # *TEMPORARILY* disabled to improve tool calling perf.
-        delete_key_snippets,
+        # delete_key_facts,
-        deregister_related_files,
+        # delete_key_snippets,
        # deregister_related_files,
    ]
    if web_research_enabled:
--- a/ra_aid/tools/agent.py
+++ b/ra_aid/tools/agent.py
@@ -248,6 +248,8 @@ def request_research_and_implementation(query: str) -> Dict[str, Any]:
 def request_task_implementation(task_spec: str) -> Dict[str, Any]:
    """Spawn an implementation agent to execute the given task.
    Task specs should have the requirements. Generally, the spec will not include any code.
    Args:
        task_spec: REQUIRED The full task specification (markdown format, typically one part of the overall plan)
    """
--- a/ra_aid/tools/memory.py
+++ b/ra_aid/tools/memory.py
@@ -387,6 +387,8 @@ def plan_implementation_completed(message: str) -> str:
    Returns:
        Confirmation message
    """
    _global_memory["task_completed"] = True
    _global_memory["completion_message"] = message
    _global_memory["plan_completed"] = True
    _global_memory["completion_message"] = message
    _global_memory["tasks"].clear()  # Clear task list when plan is completed
--- a/ra_aid/tools/programmer.py
+++ b/ra_aid/tools/programmer.py
@@ -33,16 +33,20 @@ def run_programming_task(
    The programmer sees only what you provide, no conversation history.
-    Give detailed instructions but do not write their code.
+    Give detailed instructions including multi-file tasks but do not write their code.
-    They are intelligent and can edit multiple files.
+    The programmer cannot run commands.
    If new files are created, emit them after finishing.
    They can add/modify files, but not remove. Use run_shell_command to remove files. If referencing files you’ll delete, remove them after they finish.
    Use write_file_tool instead if you need to write the entire contents of file(s).
    If the programmer wrote files, they actually wrote to disk. You do not need to rewrite the output of what the programmer showed you.
    Args:
-     instructions: REQUIRED Programming task instructions (markdown format, use newlines and as many tokens as needed)
+     instructions: REQUIRED Programming task instructions (markdown format, use newlines and as many tokens as needed, no commands allowed)
     files: Optional; if not provided, uses related_files
    Returns: { "output": stdout+stderr, "return_code": 0 if success, "success": True/False }
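
Note on the tool-count reduction: this commit trims the tool lists by commenting entries out in place. A minimal sketch of an alternative is below, assuming hypothetical stand-in tools and a hypothetical `build_tool_list` helper (neither is part of ra_aid as shown in this diff). It keeps the temporarily disabled tools in a separate group behind a flag, so they could be restored without editing each list.

```python
from typing import Callable, Iterable, List

Tool = Callable[[], str]


# Hypothetical stand-ins; the real ra_aid tools take arguments and are
# registered differently. Only the names echo the diff above.
def emit_key_facts() -> str:
    return "facts emitted"


def read_file_tool() -> str:
    return "file read"


def delete_key_facts() -> str:
    return "facts deleted"


def build_tool_list(
    core: Iterable[Tool],
    optional: Iterable[Tool],
    include_optional: bool = False,
) -> List[Tool]:
    """Return the core tools, adding optional ones only when requested."""
    tools = list(core)
    if include_optional:
        tools.extend(optional)
    return tools


CORE_TOOLS: List[Tool] = [emit_key_facts, read_file_tool]
# Analogous to the entries marked "*TEMPORARILY* disabled" in the diff.
OPTIONAL_TOOLS: List[Tool] = [delete_key_facts]

if __name__ == "__main__":
    reduced = build_tool_list(CORE_TOOLS, OPTIONAL_TOOLS)
    full = build_tool_list(CORE_TOOLS, OPTIONAL_TOOLS, include_optional=True)
    print([tool.__name__ for tool in reduced])
    print([tool.__name__ for tool in full])
```

With the flag left at its default, the model is offered only the core tools, which is the smaller surface this commit is aiming for.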