use latest aider; update tool-calling prompts and minimize tool return values to improve tool-calling performance

This commit is contained in:
AI Christianson 2025-02-24 19:08:12 -05:00
parent cc93961bf3
commit 52722f6600
9 changed files with 123 additions and 182 deletions


@ -35,7 +35,7 @@ dependencies = [
"rapidfuzz>=3.11.0",
"pathspec>=0.11.0",
"pyte>=0.8.2",
"aider-chat>=0.74.2",
"aider-chat>=0.75",
"tavily-python>=0.5.0",
"litellm>=1.60.6",
"fastapi>=0.104.0",


@ -209,7 +209,9 @@ def build_agent_kwargs(
Returns:
Dictionary of kwargs for agent creation
"""
agent_kwargs = {}
agent_kwargs = {
"version": "v2",
}
if checkpointer is not None:
agent_kwargs["checkpointer"] = checkpointer
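The effect of the new default can be sketched in isolation as follows (a simplified stand-in for `build_agent_kwargs`, covering only the `version` and `checkpointer` keys shown in this hunk, not the project's full implementation):

```python
def build_agent_kwargs(checkpointer=None):
    # Default to the "v2" tool-calling agent implementation enabled by
    # this commit; callers pass these kwargs through to agent creation.
    agent_kwargs = {
        "version": "v2",
    }
    if checkpointer is not None:
        agent_kwargs["checkpointer"] = checkpointer
    return agent_kwargs
```

Because `"version": "v2"` is now part of the base dict, every agent created through this helper picks it up without callers changing.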


@ -158,7 +158,7 @@ Project State Handling:
Be aware there may be additional relevant files
Use tools like ripgrep_search and fuzzy_find_project_files to locate specific files
Be very thorough in your research and emit lots of snippets, key facts. If you take more than a few steps, be eager to emit research subtasks.{research_only_note}
When necessary, emit research subtasks.{research_only_note}
Objective
Investigate and understand the codebase as it relates to the query.
@ -259,11 +259,12 @@ Thoroughness and Completeness:
- For tasks requiring UI changes, not researching existing UI libraries and conventions.
- Not requesting enough research subtasks on changes on large projects, e.g. to discover testing or UI conventions, etc.
- Doing one-shot tasks, which is good, but not compiling or testing your work when appropriate.
- Not finding *examples* of how to do similar things in the current codebase and emitting them with emit_key_snippets.
- Not finding *examples* of how to do similar things in the current codebase and calling emit_key_snippet to report them.
- Not finding unit tests because they are in slightly different locations than expected.
- Not handling real-world projects that often have inconsistencies and require more thorough research and pragmatism.
- Not finding *ALL* related files and snippets. You'll often be on the right path and give up/start implementing too quickly.
- You sometimes use emit_key_snippets to *write* code rather than to record key snippets of existing code, which it is meant for.
- Not calling tools/functions properly, e.g. leaving off required arguments, calling a tool in a loop, calling tools inappropriately.
- Doing redundant research and taking way more steps than necessary.
If there are existing relevant unit tests/test suites, you must run them *during the research stage*, before editing anything, using run_shell_command to get a baseline of passing/failing tests, and call emit_key_facts with key facts about the tests and whether they were passing when you started. This ensures a proper baseline is established before any changes.
@ -535,18 +536,6 @@ Work done so far:
{work_log}
</work log>
Fact Management:
Each fact is identified with [Fact ID: X].
Facts may be deleted if they become outdated, irrelevant, or duplicates.
Use delete_key_facts([id1, id2, ...]) with a list of numeric Fact IDs to remove unnecessary facts.
Snippet Management:
Each snippet is identified with [Snippet ID: X].
Snippets include file path, line number, and source code.
Snippets may have optional descriptions explaining their significance.
Delete snippets with delete_key_snippets([id1, id2, ...]) to remove outdated or irrelevant ones.
Use emit_key_snippets to store important code sections needed for reference in batches.
Guidelines:
If you need additional input or assistance from the expert (if expert is available), especially for debugging, deeper logic analysis, or correctness checks, use emit_expert_context to provide all relevant context and wait for the expert's response.
@ -615,7 +604,6 @@ Working Directory: {working_directory}
Important Notes:
- Focus solely on the given task and implement it as described.
- Scale the complexity of your solution to the complexity of the request. For simple requests, keep it straightforward and minimal. For complex requests, maintain the previously planned depth.
- Use emit_key_snippets to manage code sections before and after modifications in batches.
- Work incrementally, validating as you go. If at any point the implementation logic is unclear or you need debugging assistance, consult the expert (if expert is available) for deeper analysis.
- Do not add features not explicitly required.
@ -713,11 +701,6 @@ Exit Criteria:
- Until such confirmation, continue to engage and ask_human if additional clarification is required.
- If there are any doubts about final correctness or thoroughness, consult the expert (if expert is available) before concluding.
Context Cleanup:
- Use delete_key_facts to remove any key facts that no longer apply.
- Use delete_key_snippets to remove any key snippets that no longer apply.
- Use deregister_related_files to remove any related files that no longer apply.
When processing request_* tool responses:
- Always check completion_message and work_log for implementation status
- If the work_log includes 'Implementation completed' or 'Plan execution completed', the changes have already been made
@ -942,11 +925,6 @@ Exit Criteria:
- Until such confirmation, continue to engage and ask_human if additional clarification is required.
- If there are any doubts about final correctness or thoroughness, consult the expert (if expert is available) before concluding.
Context Cleanup:
- Use delete_key_facts to remove any key facts that no longer apply.
- Use delete_key_snippets to remove any key snippets that no longer apply.
- Use deregister_related_files to remove any related files that no longer apply.
When processing request_* tool responses:
- Always check completion_message and work_log for implementation status
- If the work_log includes 'Implementation completed' or 'Plan execution completed', the changes have already been made


@ -5,7 +5,7 @@ from ra_aid.tools import (
ask_human,
emit_expert_context,
emit_key_facts,
emit_key_snippets,
emit_key_snippet,
emit_related_files,
emit_research_notes,
fuzzy_find_project_files,
@ -41,9 +41,9 @@ def get_read_only_tools(
List of tool functions
"""
tools = [
emit_key_snippet,
emit_related_files,
emit_key_facts,
emit_key_snippets,
# *TEMPORARILY* disabled to improve tool calling perf.
# delete_key_facts,
# delete_key_snippets,


@ -9,7 +9,7 @@ from .memory import (
delete_tasks,
deregister_related_files,
emit_key_facts,
emit_key_snippets,
emit_key_snippet,
emit_plan,
emit_related_files,
emit_research_notes,
@ -36,7 +36,7 @@ __all__ = [
"deregister_related_files",
"emit_expert_context",
"emit_key_facts",
"emit_key_snippets",
"emit_key_snippet",
"emit_plan",
"emit_related_files",
"emit_research_notes",


@ -14,8 +14,6 @@ class WorkLogEntry(TypedDict):
class SnippetInfo(TypedDict):
"""Type definition for source code snippet information"""
filepath: str
line_number: int
snippet: str
@ -30,12 +28,13 @@ _global_memory: Dict[
Union[
Dict[int, str],
Dict[int, SnippetInfo],
Dict[int, WorkLogEntry],
int,
Set[str],
bool,
str,
List[str],
List[WorkLogEntry],
List[WorkLogEntry]
],
] = {
"research_notes": [],
@ -59,17 +58,14 @@ _global_memory: Dict[
@tool("emit_research_notes")
def emit_research_notes(notes: str) -> str:
"""Store research notes in global memory.
"""Use this when you have completed your research to share your notes in markdown format, no more than 500 words.
Args:
notes: REQUIRED The research notes to store
Returns:
The stored notes
"""
_global_memory["research_notes"].append(notes)
console.print(Panel(Markdown(notes), title="🔍 Research Notes"))
return notes
return "Research notes stored."
@tool("emit_plan")
@ -78,14 +74,11 @@ def emit_plan(plan: str) -> str:
Args:
plan: The plan step to store (markdown format; be clear, complete, use newlines, and use as many tokens as you need)
Returns:
The stored plan
"""
_global_memory["plans"].append(plan)
console.print(Panel(Markdown(plan), title="📋 Plan"))
log_work_event(f"Added plan step:\n\n{plan}")
return plan
return "Plan stored."
@tool("emit_task")
@ -94,9 +87,6 @@ def emit_task(task: str) -> str:
Args:
task: The task to store
Returns:
String confirming task storage with ID number
"""
# Get and increment task ID
task_id = _global_memory["task_id_counter"]
@ -116,9 +106,6 @@ def emit_key_facts(facts: List[str]) -> str:
Args:
facts: List of key facts to store
Returns:
List of stored fact confirmation messages
"""
results = []
for fact in facts:
@ -152,9 +139,6 @@ def delete_key_facts(fact_ids: List[int]) -> str:
Args:
fact_ids: List of fact IDs to delete
Returns:
List of success messages for deleted facts
"""
results = []
for fact_id in fact_ids:
@ -178,9 +162,6 @@ def delete_tasks(task_ids: List[int]) -> str:
Args:
task_ids: List of task IDs to delete
Returns:
Confirmation message
"""
results = []
for task_id in task_ids:
@ -206,74 +187,62 @@ def request_implementation() -> str:
Do you need to request research subtasks first?
Have you run relevant unit tests, if they exist, to get a baseline (this can be a subtask)?
Do you need to crawl deeper to find all related files and symbols?
Returns:
Empty string
"""
_global_memory["implementation_requested"] = True
console.print(Panel("🚀 Implementation Requested", style="yellow", padding=0))
log_work_event("Implementation requested.")
return ""
return "Implementation requested."
@tool("emit_key_snippets")
def emit_key_snippets(snippets: List[SnippetInfo]) -> str:
"""Store multiple key source code snippets in global memory.
Automatically adds the filepaths of the snippets to related files.
@tool("emit_key_snippet")
def emit_key_snippet(snippet_info: SnippetInfo) -> str:
"""Store a single source code snippet in global memory which represents key information.
Automatically adds the filepath of the snippet to related files.
This is for **existing** or **just-written** files, not for things to be created in the future.
Args:
snippets: REQUIRED List of snippet information dictionaries containing:
snippet_info: Dict with keys:
- filepath: Path to the source file
- line_number: Line number where the snippet starts
- snippet: The source code snippet text
- description: Optional description of the significance
Returns:
List of stored snippet confirmation messages
"""
# First collect unique filepaths to add as related files
emit_related_files.invoke(
{"files": [snippet_info["filepath"] for snippet_info in snippets]}
# Add filepath to related files
emit_related_files.invoke({"files": [snippet_info["filepath"]]})
# Get and increment snippet ID
snippet_id = _global_memory["key_snippet_id_counter"]
_global_memory["key_snippet_id_counter"] += 1
# Store snippet info
_global_memory["key_snippets"][snippet_id] = snippet_info
# Format display text as markdown
display_text = [
"**Source Location**:",
f"- File: `{snippet_info['filepath']}`",
f"- Line: `{snippet_info['line_number']}`",
"", # Empty line before code block
"**Code**:",
"```python",
snippet_info["snippet"].rstrip(), # Remove trailing whitespace
"```",
]
if snippet_info["description"]:
display_text.extend(["", "**Description**:", snippet_info["description"]])
# Display panel
console.print(
Panel(
Markdown("\n".join(display_text)),
title=f"📝 Key Snippet #{snippet_id}",
border_style="bright_cyan",
)
)
results = []
for snippet_info in snippets:
# Get and increment snippet ID
snippet_id = _global_memory["key_snippet_id_counter"]
_global_memory["key_snippet_id_counter"] += 1
# Store snippet info
_global_memory["key_snippets"][snippet_id] = snippet_info
# Format display text as markdown
display_text = [
"**Source Location**:",
f"- File: `{snippet_info['filepath']}`",
f"- Line: `{snippet_info['line_number']}`",
"", # Empty line before code block
"**Code**:",
"```python",
snippet_info["snippet"].rstrip(), # Remove trailing whitespace
"```",
]
if snippet_info["description"]:
display_text.extend(["", "**Description**:", snippet_info["description"]])
# Display panel
console.print(
Panel(
Markdown("\n".join(display_text)),
title=f"📝 Key Snippet #{snippet_id}",
border_style="bright_cyan",
)
)
results.append(f"Stored snippet #{snippet_id}")
log_work_event(f"Stored {len(snippets)} code snippets.")
return "Snippets stored."
log_work_event(f"Stored code snippet #{snippet_id}.")
return f"Snippet #{snippet_id} stored."
@tool("delete_key_snippets")
@ -283,9 +252,6 @@ def delete_key_snippets(snippet_ids: List[int]) -> str:
Args:
snippet_ids: List of snippet IDs to delete
Returns:
List of success messages for deleted snippets
"""
results = []
for snippet_id in snippet_ids:
@ -311,9 +277,6 @@ def swap_task_order(id1: int, id2: int) -> str:
Args:
id1: First task ID
id2: Second task ID
Returns:
Success or error message depending on outcome
"""
# Validate IDs are different
if id1 == id2:
@ -338,7 +301,7 @@ def swap_task_order(id1: int, id2: int) -> str:
)
)
return "Tasks swapped."
return "Tasks deleted."
@tool("one_shot_completed")
@ -366,9 +329,6 @@ def task_completed(message: str) -> str:
Args:
message: Message explaining how/why the task is complete
Returns:
The completion message
"""
_global_memory["task_completed"] = True
_global_memory["completion_message"] = message
@ -382,9 +342,6 @@ def plan_implementation_completed(message: str) -> str:
Args:
message: Message explaining how the implementation plan was completed
Returns:
Confirmation message
"""
_global_memory["task_completed"] = True
_global_memory["completion_message"] = message
@ -413,9 +370,6 @@ def emit_related_files(files: List[str]) -> str:
Args:
files: List of file paths to add
Returns:
Formatted string containing file IDs and paths for all processed files
"""
results = []
added_files = []
@ -476,7 +430,7 @@ def emit_related_files(files: List[str]) -> str:
)
)
return "\n".join(results)
return "Files noted."
def log_work_event(event: str) -> str:
@ -550,9 +504,6 @@ def deregister_related_files(file_ids: List[int]) -> str:
Args:
file_ids: List of file IDs to delete
Returns:
Success message string
"""
results = []
for file_id in file_ids:
@ -571,7 +522,7 @@ def deregister_related_files(file_ids: List[int]) -> str:
)
results.append(success_msg)
return "File references removed."
return "Files noted."
def get_memory_value(key: str) -> str:


@ -73,8 +73,6 @@ def run_programming_task(
Args:
instructions: REQUIRED Programming task instructions (markdown format, use newlines and as many tokens as needed, no commands allowed)
files: Optional; if not provided, uses related_files
Returns: { "output": stdout+stderr, "return_code": 0 if success, "success": True/False }
"""
# Build command
aider_exe = get_aider_executable()


@ -115,7 +115,7 @@ def test_create_agent_anthropic(mock_model, mock_memory):
assert agent == "react_agent"
mock_react.assert_called_once_with(
mock_model, [], state_modifier=mock_react.call_args[1]["state_modifier"]
mock_model, [], version='v2', state_modifier=mock_react.call_args[1]["state_modifier"]
)
@ -258,7 +258,7 @@ def test_create_agent_anthropic_token_limiting_disabled(mock_model, mock_memory)
agent = create_agent(mock_model, [])
assert agent == "react_agent"
mock_react.assert_called_once_with(mock_model, [])
mock_react.assert_called_once_with(mock_model, [], version='v2')
def test_get_model_token_limit_research(mock_memory):


@ -7,7 +7,7 @@ from ra_aid.tools.memory import (
delete_tasks,
deregister_related_files,
emit_key_facts,
emit_key_snippets,
emit_key_snippet,
emit_related_files,
emit_task,
get_memory_value,
@ -201,34 +201,45 @@ def test_delete_key_facts(reset_memory):
assert _global_memory["key_facts"][2] == "Third fact"
def test_emit_key_snippets(reset_memory):
"""Test emitting multiple code snippets at once"""
# Test snippets with and without descriptions
snippets = [
{
"filepath": "test.py",
"line_number": 10,
"snippet": "def test():\n pass",
"description": "Test function",
},
{
"filepath": "main.py",
"line_number": 20,
"snippet": "print('hello')",
"description": None,
},
]
def test_emit_key_snippet(reset_memory):
"""Test emitting a single code snippet"""
# Test snippet with description
snippet = {
"filepath": "test.py",
"line_number": 10,
"snippet": "def test():\n pass",
"description": "Test function",
}
# Emit snippets
result = emit_key_snippets.invoke({"snippets": snippets})
# Emit snippet
result = emit_key_snippet.invoke({"snippet_info": snippet})
# Verify return message
assert result == "Snippets stored."
assert result == "Snippet #0 stored."
# Verify snippets stored correctly
assert _global_memory["key_snippets"][0] == snippets[0]
assert _global_memory["key_snippets"][1] == snippets[1]
# Verify snippet stored correctly
assert _global_memory["key_snippets"][0] == snippet
# Verify counter incremented correctly
assert _global_memory["key_snippet_id_counter"] == 1
# Test snippet without description
snippet2 = {
"filepath": "main.py",
"line_number": 20,
"snippet": "print('hello')",
"description": None,
}
# Emit second snippet
result = emit_key_snippet.invoke({"snippet_info": snippet2})
# Verify return message
assert result == "Snippet #1 stored."
# Verify snippet stored correctly
assert _global_memory["key_snippets"][1] == snippet2
# Verify counter incremented correctly
assert _global_memory["key_snippet_id_counter"] == 2
@ -256,7 +267,9 @@ def test_delete_key_snippets(reset_memory):
"description": None,
},
]
emit_key_snippets.invoke({"snippets": snippets})
# Add snippets one by one
for snippet in snippets:
emit_key_snippet.invoke({"snippet_info": snippet})
# Test deleting mix of valid and invalid IDs
result = delete_key_snippets.invoke({"snippet_ids": [0, 1, 999]})
@ -280,7 +293,7 @@ def test_delete_key_snippets_empty(reset_memory):
"snippet": "code",
"description": None,
}
emit_key_snippets.invoke({"snippets": [snippet]})
emit_key_snippet.invoke({"snippet_info": snippet})
# Test with empty list
result = delete_key_snippets.invoke({"snippet_ids": []})
@ -302,12 +315,12 @@ def test_emit_related_files_basic(reset_memory, tmp_path):
# Test adding single file
result = emit_related_files.invoke({"files": [str(test_file)]})
assert result == f"File ID #0: {test_file}"
assert result == "Files noted."
assert _global_memory["related_files"][0] == str(test_file)
# Test adding multiple files
result = emit_related_files.invoke({"files": [str(main_file), str(utils_file)]})
assert result == f"File ID #1: {main_file}\nFile ID #2: {utils_file}"
assert result == "Files noted."
# Verify both files exist in related_files
values = list(_global_memory["related_files"].values())
assert str(main_file) in values
@ -331,17 +344,17 @@ def test_emit_related_files_duplicates(reset_memory, tmp_path):
# Add initial files
result = emit_related_files.invoke({"files": [str(test_file), str(main_file)]})
assert result == f"File ID #0: {test_file}\nFile ID #1: {main_file}"
assert result == "Files noted."
_first_id = 0 # ID of test.py
# Try adding duplicates
result = emit_related_files.invoke({"files": [str(test_file)]})
assert result == f"File ID #0: {test_file}" # Should return same ID
assert result == "Files noted."
assert len(_global_memory["related_files"]) == 2 # Count should not increase
# Try mix of new and duplicate files
result = emit_related_files.invoke({"files": [str(test_file), str(new_file)]})
assert result == f"File ID #0: {test_file}\nFile ID #2: {new_file}"
assert result == "Files noted."
assert len(_global_memory["related_files"]) == 3
@ -355,12 +368,12 @@ def test_related_files_id_tracking(reset_memory, tmp_path):
# Add first file
result = emit_related_files.invoke({"files": [str(file1)]})
assert result == f"File ID #0: {file1}"
assert result == "Files noted."
assert _global_memory["related_file_id_counter"] == 1
# Add second file
result = emit_related_files.invoke({"files": [str(file2)]})
assert result == f"File ID #1: {file2}"
assert result == "Files noted."
assert _global_memory["related_file_id_counter"] == 2
# Verify all files stored correctly
@ -383,13 +396,13 @@ def test_deregister_related_files(reset_memory, tmp_path):
# Delete middle file
result = deregister_related_files.invoke({"file_ids": [1]})
assert result == "File references removed."
assert result == "Files noted."
assert 1 not in _global_memory["related_files"]
assert len(_global_memory["related_files"]) == 2
# Delete multiple files including non-existent ID
result = deregister_related_files.invoke({"file_ids": [0, 2, 999]})
assert result == "File references removed."
assert result == "Files noted."
assert len(_global_memory["related_files"]) == 0
# Counter should remain unchanged after deletions
@ -404,11 +417,11 @@ def test_related_files_duplicates(reset_memory, tmp_path):
# Add initial file
result1 = emit_related_files.invoke({"files": [str(test_file)]})
assert result1 == f"File ID #0: {test_file}"
assert result1 == "Files noted."
# Add same file again
result2 = emit_related_files.invoke({"files": [str(test_file)]})
assert result2 == f"File ID #0: {test_file}"
assert result2 == "Files noted."
# Verify only one entry exists
assert len(_global_memory["related_files"]) == 1
@ -429,9 +442,8 @@ def test_emit_related_files_with_directory(reset_memory, tmp_path):
{"files": [str(test_dir), str(nonexistent), str(test_file)]}
)
# Verify specific error messages for directory and nonexistent path
assert f"Error: Path '{test_dir}' is a directory, not a file" in result
assert f"Error: Path '{nonexistent}' does not exist" in result
# Verify result is the standard message
assert result == "Files noted."
# Verify directory and nonexistent not added but valid file was
assert len(_global_memory["related_files"]) == 1
@ -439,7 +451,6 @@ def test_emit_related_files_with_directory(reset_memory, tmp_path):
assert str(test_file) in values
assert str(test_dir) not in values
assert str(nonexistent) not in values
assert str(nonexistent) not in values
def test_related_files_formatting(reset_memory, tmp_path):
@ -479,11 +490,11 @@ def test_emit_related_files_path_normalization(reset_memory, tmp_path):
try:
# Add file with absolute path
result1 = emit_related_files.invoke({"files": ["file.txt"]})
assert "File ID #0:" in result1
assert result1 == "Files noted."
# Add same file with relative path - should get same ID due to path normalization
result2 = emit_related_files.invoke({"files": ["./file.txt"]})
assert "File ID #0:" in result2 # Should reuse ID since it's the same file
assert result2 == "Files noted."
# Verify only one normalized path entry exists
assert len(_global_memory["related_files"]) == 1
@ -525,9 +536,10 @@ def test_key_snippets_integration(reset_memory, tmp_path):
},
]
# Add all snippets
result = emit_key_snippets.invoke({"snippets": snippets})
assert result == "Snippets stored."
# Add all snippets one by one
for i, snippet in enumerate(snippets):
result = emit_key_snippet.invoke({"snippet_info": snippet})
assert result == f"Snippet #{i} stored."
assert _global_memory["key_snippet_id_counter"] == 3
# Verify related files were tracked with IDs
assert len(_global_memory["related_files"]) == 3
@ -564,8 +576,8 @@ def test_key_snippets_integration(reset_memory, tmp_path):
"snippet": "def func4():\n return False",
"description": "Fourth function",
}
result = emit_key_snippets.invoke({"snippets": [new_snippet]})
assert result == "Snippets stored."
result = emit_key_snippet.invoke({"snippet_info": new_snippet})
assert result == "Snippet #3 stored."
assert _global_memory["key_snippet_id_counter"] == 4
# Verify new file was added to related files
file_values = _global_memory["related_files"].values()
@ -641,7 +653,7 @@ def test_swap_task_order_valid_ids(reset_memory):
# Swap tasks 0 and 2
result = swap_task_order.invoke({"id1": 0, "id2": 2})
assert result == "Tasks swapped."
assert result == "Tasks deleted."
# Verify tasks were swapped
assert _global_memory["tasks"][0] == "Task 3"
@ -697,7 +709,7 @@ def test_swap_task_order_after_delete(reset_memory):
# Try to swap remaining valid tasks
result = swap_task_order.invoke({"id1": 0, "id2": 2})
assert result == "Tasks swapped."
assert result == "Tasks deleted."
# Verify swap worked
assert _global_memory["tasks"][0] == "Task 3"