Reduce tool count to make tool calling more reliable.

2025-02-08 18:26:08 -05:00 · 0c86900ce4
parent 13016278e5
commit 0c86900ce4
6 changed files with 54 additions and 63 deletions
--- a/ra_aid/console/formatting.py
+++ b/ra_aid/console/formatting.py
@@ -31,7 +31,7 @@ def print_stage_header(stage: str) -> None:
    icon = icons.get(stage_key, "🚀")

    # Create styled panel with icon
-    panel_content = f"{icon} {stage_title}"
+    panel_content = f" {icon} {stage_title}"
    console.print(Panel(panel_content, style="green bold", padding=0))


--- a/ra_aid/prompts.py
+++ b/ra_aid/prompts.py
@@ -119,7 +119,9 @@ Prioritize checking current documentation for technical advice.

 # New project hints
 NEW_PROJECT_HINTS = """
-If the user did not specify a stack, use your best judgment, or make a proposal and ask the human if the human-in-the-loop tool is available.
+Because this is a new project:
+- If the user did not specify a stack, use your best judgment, or make a proposal and ask the human if the human-in-the-loop tool is available.
+- If the user did not specify a directory to create the project in, create directly in the current directory.
 """

 # Research stage prompt - guides initial codebase analysis
@@ -286,10 +288,6 @@ You have often been criticized for:
 {human_section}
 {web_research_section}

-If you make tool calls incorrectly, you **WILL** get errors like the following:
-
-Error: 1 validation error for emit_research_notes notes Field required [type=missing, input_value=, input_type=dict]
-
 NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
 """

@@ -501,12 +499,6 @@ You have often been criticized for:
    - Not indicating confidence levels or noting uncertainties
    - Not calling tools/functions properly, e.g. leaving off required arguments, calling a tool in a loop, calling tools inappropriately.

-
-
-If you make tool calls incorrectly, you **WILL** get errors like the following:
-
-Error: 1 validation error for emit_research_notes notes Field required [type=missing, input_value=, input_type=dict]
-
 NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
 """

@@ -602,52 +594,32 @@ NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
 IMPLEMENTATION_PROMPT = """Current Date: {current_date}
 Working Directory: {working_directory}

-Base-level task (for reference only):
-<base task>
-{base_task}
-</base task>
-
-keep it simple. if the expert tool is available, use it frequently for high level logic and planning.
-
-Plan Overview (for reference only, remember you are only implementing your specific task):
-{plan}
-
-Key Facts:
+<key facts>
 {key_facts}
+</key facts>

-Key Snippets:
+<key snippets>
 {key_snippets}
+</key snippets>

-Relevant Files:
+<relevant files>
 {related_files}
+</relevant files>

-Research Notes:
+<research notes>
 {research_notes}
-
-Work done so far:
-<work log>
-{work_log}
-</work log>
+</research notes>

 Important Notes:
 - Focus solely on the given task and implement it as described.
 - Scale the complexity of your solution to the complexity of the request. For simple requests, keep it straightforward and minimal. For complex requests, maintain the previously planned depth.
- Use delete_key_facts to remove facts that become outdated, irrelevant, or duplicated.
 - Use emit_key_snippets to manage code sections before and after modifications in batches.
- Regularly remove outdated snippets with delete_key_snippets.
-Instructions:
-1. Review the provided base task, plan, and key facts.
-2. Implement only the specified task:
-<task definition>
-{task}
-</task definition>

-3. Work incrementally, validating as you go. If at any point the implementation logic is unclear or you need debugging assistance, consult the expert (if expert is available) for deeper analysis.
-4. Use delete_key_facts to remove any key facts that no longer apply.
-5. Do not add features not explicitly required.
-6. Only create or modify files directly related to this task.
-7. Use file_str_replace and write_file_tool for simple file modifications.
-8. Delegate to run_programming_task for more complex programming tasks. This is a capable human programmer that can work on multiple files at once.
+- Work incrementally, validating as you go. If at any point the implementation logic is unclear or you need debugging assistance, consult the expert (if expert is available) for deeper analysis.
+- Do not add features not explicitly required.
+- Only create or modify files directly related to this task.
+- Use file_str_replace and write_file_tool for simple file modifications.
+- Delegate to run_programming_task for more complex programming tasks. This is a capable human programmer that can work on multiple files at once.

 Testing:

@@ -659,7 +631,7 @@ Testing:
 - Only test UI components if there is already a UI testing system in place.
 - Only test things that can be tested by an automated process.

-Once the task is complete, ensure all updated files are emitted.
+Once the task is complete, ensure all updated files are registered with emit_related_files.

 {expert_section}
 {human_section}
@@ -670,10 +642,16 @@ You have often been criticized for:
  - Doing changes outside of the specific scoped instructions.
  - Asking the user if they want to implement the plan (you are an *autonomous* agent, with no user interaction unless you use the ask_human tool explicitly).
  - Not calling tools/functions properly, e.g. leaving off required arguments, calling a tool in a loop, calling tools inappropriately.
+  - Using run_programming_task to simply write the full contents of files when you could have used write_file_tool instead.

-If you make tool calls incorrectly, you **WILL** get errors like the following:
+Instructions:
+1. Review the provided base task, plan, and key facts.
+2. Implement only the specified task:
+<task definition>
+{task}
+</task definition>

-Error: 1 validation error for run_programming_task instructions Field required [type=missing, input_value=, input_type=dict] 
+KEEP IT SIMPLE

 NEVER ANNOUNCE WHAT YOU ARE DOING, JUST DO IT!
 """
--- a/ra_aid/tool_configs.py
+++ b/ra_aid/tool_configs.py
@@ -30,6 +30,7 @@ from ra_aid.tools.agent import (
    request_web_research,
 )
 from ra_aid.tools.memory import one_shot_completed
+from ra_aid.tools.write_file import write_file_tool


 # Read-only tools that don't modify system state
@@ -48,10 +49,11 @@ def get_read_only_tools(
    tools = [
        emit_related_files,
        emit_key_facts,
-        delete_key_facts,
        emit_key_snippets,
-        delete_key_snippets,
-        deregister_related_files,
+        # *TEMPORARILY* disabled to improve tool calling perf.
+        # delete_key_facts,
+        # delete_key_snippets,
+        # deregister_related_files,
        list_directory_tree,
        read_file_tool,
        fuzzy_find_project_files,
@@ -70,14 +72,15 @@ def get_read_only_tools(

 # Define constant tool groups
 READ_ONLY_TOOLS = get_read_only_tools()
-MODIFICATION_TOOLS = [run_programming_task]
+MODIFICATION_TOOLS = [run_programming_task, write_file_tool]
 COMMON_TOOLS = get_read_only_tools()
 EXPERT_TOOLS = [emit_expert_context, ask_expert]
 RESEARCH_TOOLS = [
    emit_research_notes,
-    one_shot_completed,
-    monorepo_detected,
-    ui_detected,
+    # *TEMPORARILY* disabled to improve tool calling perf.
+    # one_shot_completed,
+    # monorepo_detected,
+    # ui_detected,
 ]


@@ -129,9 +132,10 @@ def get_planning_tools(

    # Add planning-specific tools
    planning_tools = [
-        emit_plan,
        request_task_implementation,
-        plan_implementation_completed,
+        # *TEMPORARILY* disabled to improve tool calling perf.
+        # emit_plan,
+        # plan_implementation_completed,
    ]
    tools.extend(planning_tools)

@@ -202,9 +206,10 @@ def get_chat_tools(
        request_research,
        request_research_and_implementation,
        emit_key_facts,
-        delete_key_facts,
-        delete_key_snippets,
-        deregister_related_files,
+        # *TEMPORARILY* disabled to improve tool calling perf.
+        # delete_key_facts,
+        # delete_key_snippets,
+        # deregister_related_files,
    ]

    if web_research_enabled:
--- a/ra_aid/tools/agent.py
+++ b/ra_aid/tools/agent.py
@@ -248,6 +248,8 @@ def request_research_and_implementation(query: str) -> Dict[str, Any]:
 def request_task_implementation(task_spec: str) -> Dict[str, Any]:
    """Spawn an implementation agent to execute the given task.

+    Task specs should have the requirements. Generally, the spec will not include any code.
+
    Args:
        task_spec: REQUIRED The full task specification (markdown format, typically one part of the overall plan)
    """
--- a/ra_aid/tools/memory.py
+++ b/ra_aid/tools/memory.py
@@ -387,6 +387,8 @@ def plan_implementation_completed(message: str) -> str:
    Returns:
        Confirmation message
    """
+    _global_memory["task_completed"] = True
+    _global_memory["completion_message"] = message
    _global_memory["plan_completed"] = True
    _global_memory["completion_message"] = message
    _global_memory["tasks"].clear()  # Clear task list when plan is completed
--- a/ra_aid/tools/programmer.py
+++ b/ra_aid/tools/programmer.py
@@ -33,16 +33,20 @@ def run_programming_task(

    The programmer sees only what you provide, no conversation history.

-    Give detailed instructions but do not write their code.
+    Give detailed instructions including multi-file tasks but do not write their code.

-    They are intelligent and can edit multiple files.
+    The programmer cannot run commands.

    If new files are created, emit them after finishing.

    They can add/modify files, but not remove. Use run_shell_command to remove files. If referencing files you’ll delete, remove them after they finish.
+  
+    Use write_file_tool instead if you need to write the entire contents of file(s).
+
+    If the programmer wrote files, they actually wrote to disk. You do not need to rewrite the output of what the programmer showed you.

    Args:
-     instructions: REQUIRED Programming task instructions (markdown format, use newlines and as many tokens as needed)
+     instructions: REQUIRED Programming task instructions (markdown format, use newlines and as many tokens as needed, no commands allowed)
     files: Optional; if not provided, uses related_files

    Returns: { "output": stdout+stderr, "return_code": 0 if success, "success": True/False }
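
Note on the tool_configs.py change: the commit disables tools by commenting them out in each list. A minimal sketch of one alternative follows, assuming the tools are plain Python functions identified by `__name__`; the `DISABLED_TOOLS` set, the `filter_enabled` helper, and the stub functions are hypothetical and are not part of this commit. Only the tool names are taken from the diff.

```python
from typing import Callable, FrozenSet, List


# Hypothetical stand-ins for the real tools; only the names come from the diff.
def emit_related_files() -> None: ...
def emit_key_facts() -> None: ...
def delete_key_facts() -> None: ...
def emit_key_snippets() -> None: ...
def delete_key_snippets() -> None: ...
def deregister_related_files() -> None: ...


# Assumed mechanism (not in the commit): one place that names every tool
# that is temporarily switched off, instead of commented-out list entries.
DISABLED_TOOLS: FrozenSet[str] = frozenset(
    {"delete_key_facts", "delete_key_snippets", "deregister_related_files"}
)


def filter_enabled(
    tools: List[Callable], disabled: FrozenSet[str] = DISABLED_TOOLS
) -> List[Callable]:
    """Return the tools whose function name is not in `disabled`, keeping order."""
    return [tool for tool in tools if tool.__name__ not in disabled]


ALL_MEMORY_TOOLS: List[Callable] = [
    emit_related_files,
    emit_key_facts,
    delete_key_facts,
    emit_key_snippets,
    delete_key_snippets,
    deregister_related_files,
]

ACTIVE_MEMORY_TOOLS = filter_enabled(ALL_MEMORY_TOOLS)
print([tool.__name__ for tool in ACTIVE_MEMORY_TOOLS])
# prints ['emit_related_files', 'emit_key_facts', 'emit_key_snippets']
```

With this shape, re-enabling a tool would mean removing its name from the set rather than uncommenting entries in several lists. If the real tools are wrapped objects that expose a different name attribute, the filter would need to read that attribute instead.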