LLM Tool Call Fallback Feature

Overview

Add functionality to automatically fall back to alternative LLM models when a tool call experiences multiple consecutive failures.

Background

Currently, when a tool call fails due to LLM-related errors (e.g., an invalid format), there is no automatic fallback mechanism. This often causes an infinite loop of failing tool calls.

Implementation Details

Configuration

  • Add a new configuration value max_tool_failures (default: 3) that sets how many consecutive failures are allowed before fallback is triggered
  • Add new command line argument --no-fallback-tool to disable fallback behavior (enabled by default)
  • Add new command line argument --fallback-tool-models to specify a comma-separated list of fallback tool models (default: "gpt-3.5-turbo,gpt-4")
    This list defines the fallback model sequence used by forced tool calls (via bind_tools) when tool call failures occur.
  • Track failure count per tool call context
  • Reset failure counter on successful tool call
  • Store fallback model sequence per provider
  • Before using a fallback model, validate that the environment variables required by its provider are set; if they are not, skip to the next model in the sequence
  • Provide a default list of common models: try claude-3-5-sonnet-20241022 first, followed by several alternative fallback models (a sketch of the CLI wiring follows this list)
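
A minimal sketch of how these options could be wired up, assuming argparse-based CLI parsing; the helper name, flag handling, and defaults are illustrative, not existing RA.Aid code:

import argparse

# Hypothetical wiring for ra_aid/main.py; exact option handling in RA.Aid may differ.
def add_fallback_args(parser: argparse.ArgumentParser) -> None:
    parser.add_argument(
        "--no-fallback-tool",
        action="store_true",
        help="Disable automatic model fallback after repeated tool call failures.",
    )
    parser.add_argument(
        "--fallback-tool-models",
        type=lambda s: [m.strip() for m in s.split(",") if m.strip()],
        default=["gpt-3.5-turbo", "gpt-4"],
        help="Comma-separated list of fallback models for forced tool calls.",
    )
    # max_tool_failures (default: 3) would live in the shared config rather than the CLI.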

Tool Call Wrapper

Create a new wrapper function to handle tool call execution with fallback logic:

def execute_tool_with_fallback(tool_call_func, *args, **kwargs):
    failures = 0
    max_failures = get_config().max_tool_failures

    while True:
        try:
            return tool_call_func(*args, **kwargs)
        except LLMError:
            failures += 1
            if failures < max_failures:
                continue  # Retry with the current model.
            # Threshold reached: use a forced tool call via bind_tools with retry.
            llm_retry = llm_model.with_retry(stop_after_attempt=3)  # Try three times.
            # try_fallback_model is expected to re-raise if all fallback models are exhausted.
            try_fallback_model(force=True, model=llm_retry)
            # Merge the fallback model's chat messages back into the original chat history.
            merge_fallback_chat_history()
            failures = 0  # Reset counter for the new model.

The prompt passed to try_fallback_model should include the last few failed tool calls.
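
A rough sketch of how that prompt might be assembled; the helper name and the shape of the failed-call records are assumptions for illustration:

def build_fallback_prompt(failed_tool_calls, limit=3):
    # Summarize the most recent failed tool calls for the fallback model (illustrative only).
    lines = ["The following tool calls failed; produce a corrected tool call:"]
    for call in failed_tool_calls[-limit:]:
        lines.append(f"- tool: {call['name']}, args: {call['args']}, error: {call['error']}")
    return "\n".join(lines)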

Model Fallback Sequence

Define fallback sequences for each provider based on model capabilities (a sketch follows the list below):

  1. Try same provider's smaller models
  2. Try alternative providers' similar models
  3. Raise final error if all fallbacks fail
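
One possible shape for the per-provider sequences (e.g. in ra_aid/models_params.py) combined with the environment-variable check described above; the table contents, variable names, and key mapping are illustrative assumptions:

import os

# Hypothetical fallback table: same-provider smaller models first, then similar
# models from other providers. Entries are (provider, model) pairs.
FALLBACK_SEQUENCES = {
    "anthropic": [
        ("anthropic", "claude-3-5-sonnet-20241022"),
        ("anthropic", "claude-3-5-haiku-20241022"),
        ("openai", "gpt-4o"),
    ],
    "openai": [
        ("openai", "gpt-4o"),
        ("openai", "gpt-4o-mini"),
        ("anthropic", "claude-3-5-sonnet-20241022"),
    ],
}

# Assumed mapping of providers to the credentials they require.
PROVIDER_ENV_VARS = {
    "anthropic": "ANTHROPIC_API_KEY",
    "openai": "OPENAI_API_KEY",
}

def next_available_fallback(provider, tried):
    # Return the first untried fallback whose provider credentials are configured, else None.
    for fb_provider, model in FALLBACK_SEQUENCES.get(provider, []):
        if model in tried:
            continue
        if not os.environ.get(PROVIDER_ENV_VARS.get(fb_provider, "")):
            continue  # Provider API key missing: skip to the next model.
        return fb_provider, model
    return None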

Risks and Mitigations

  1. Cost

    • Risk: Fallback to more expensive models
    • Mitigation: Configure cost limits and preferred fallback sequences
  2. State Management

    • Risk: Loss of context during fallbacks
    • Mitigation: Preserve conversation state and tool context

Relevant Files

  • ra_aid/agents/ciayn_agent.py
  • ra_aid/llm.py
  • ra_aid/agent_utils.py
  • ra_aid/main.py
  • ra_aid/models_params.py

Acceptance Criteria

  1. Tool calls automatically attempt fallback models after N consecutive failures
  2. --no-fallback-tool argument successfully disables fallback behavior
  3. Fallback sequence respects provider and model capabilities
  4. Original error is preserved if all fallbacks fail
  5. Unit tests cover fallback scenarios and edge cases (see the test sketch below)
  6. README.md updated to reflect new behavior
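
A minimal test sketch for the retry-then-fallback path, assuming pytest, the wrapper above, and that the helpers live in ra_aid.agent_utils (the import paths and patched names are assumptions):

from unittest.mock import MagicMock, patch

from ra_aid.agent_utils import execute_tool_with_fallback  # assumed location
from ra_aid.llm import LLMError  # assumed location

def test_fallback_triggered_after_max_consecutive_failures():
    # Tool fails three times (the assumed default threshold), then succeeds.
    tool = MagicMock(side_effect=[LLMError("bad call")] * 3 + ["ok"])
    config = MagicMock(max_tool_failures=3)

    with patch("ra_aid.agent_utils.get_config", return_value=config), \
         patch("ra_aid.agent_utils.llm_model"), \
         patch("ra_aid.agent_utils.try_fallback_model") as fallback, \
         patch("ra_aid.agent_utils.merge_fallback_chat_history"):
        assert execute_tool_with_fallback(tool) == "ok"

    fallback.assert_called_once()  # Fallback kicked in exactly once before success.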

Documentation Updates

  1. Add fallback feature to main README
  2. Document --no-fallback-tool in CLI help
  3. Document provider-specific fallback sequences

Future Considerations

  1. Allow custom fallback sequences via configuration
  2. Add monitoring and alerting for fallback frequency
  3. Optimize fallback selection based on historical success rates
  4. Cost-aware fallback routing