LLM Tool Call Fallback Feature

Overview

Add functionality to fall back automatically to alternative LLM models when a tool call experiences multiple consecutive failures.

Background

Currently, when a tool call fails due to LLM-related errors (e.g., API timeouts, rate limits, context length issues), there is no automatic fallback mechanism. This can lead to interrupted workflows and a poor user experience.

Relevant Files

ra_aid/agents/ciayn_agent.py
ra_aid/llm.py
ra_aid/agent_utils.py
ra_aid/main.py
ra_aid/models_params.py

Implementation Details

Configuration

Add a new configuration value max_tool_failures (default: 3): the number of consecutive failures allowed before fallback is triggered
Add a new command line argument --no-fallback-tool to disable fallback behavior (fallback is enabled by default)
Add new command line argument --fallback-tool-models to specify a comma-separated list of fallback tool models (default: "gpt-3.5-turbo,gpt-4")
This list defines the fallback model sequence used by forced tool calls (via bind_tools) when tool call failures occur.
Track failure count per tool call context
Reset failure counter on successful tool call
Store fallback model sequence per provider
Before using a fallback model, validate that the environment variables required by its provider are set; if they are not, fall back to the next model
Keep a default list of common models: try claude-3-5-sonnet-20241022 first, followed by several alternative fallback models
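
The command line options and the environment check listed above might be wired up roughly as follows. This is a minimal sketch, not the project's actual parser: the PROVIDER_ENV_VARS mapping and the helpers guess_provider, build_parser, and usable_fallback_models are names assumed here for illustration, and inferring the provider from the model name prefix is only one possible approach.

```python
import argparse
import os

# Assumption: which environment variable each provider needs. Illustrative only.
PROVIDER_ENV_VARS = {"openai": "OPENAI_API_KEY", "anthropic": "ANTHROPIC_API_KEY"}


def guess_provider(model_name):
    """Hypothetical helper: infer the provider from the model name prefix."""
    if model_name.startswith("gpt-"):
        return "openai"
    if model_name.startswith("claude-"):
        return "anthropic"
    return None


def build_parser():
    parser = argparse.ArgumentParser()
    # Fallback is on by default; this flag turns it off.
    parser.add_argument("--no-fallback-tool", action="store_true")
    parser.add_argument("--fallback-tool-models", default="gpt-3.5-turbo,gpt-4")
    return parser


def usable_fallback_models(raw_list, env=None):
    """Split the comma-separated list and drop models whose provider key is unset."""
    env = os.environ if env is None else env
    models = [name.strip() for name in raw_list.split(",") if name.strip()]
    usable = []
    for name in models:
        env_var = PROVIDER_ENV_VARS.get(guess_provider(name))
        if env_var and env.get(env_var):
            usable.append(name)  # provider credentials are present
    return usable


args = build_parser().parse_args(
    ["--fallback-tool-models", "claude-3-5-sonnet-20241022, gpt-4"]
)
# Only the OpenAI key is set here, so the Claude model is skipped.
print(usable_fallback_models(args.fallback_tool_models, env={"OPENAI_API_KEY": "sk-test"}))
```

Passing env explicitly keeps the check easy to test; in normal use the function reads os.environ.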

Tool Call Wrapper

Create a new wrapper function to handle tool call execution with fallback logic:

def execute_tool_with_fallback(tool_call_func, *args, **kwargs):
    # get_config, LLMError, llm_model, try_fallback_model and
    # merge_fallback_chat_history are assumed to come from the surrounding module.
    failures = 0
    max_failures = get_config().max_tool_failures

    while failures < max_failures:
        try:
            return tool_call_func(*args, **kwargs)
        except LLMError:
            failures += 1
            if failures >= max_failures:
                # Use forced tool call via bind_tools with retry:
                llm_retry = llm_model.with_retry(stop_after_attempt=3)  # Try three times
                # Assumed to raise once every fallback model has failed,
                # which is what ends this loop.
                try_fallback_model(force=True, model=llm_retry)
                # Merge fallback model chat messages back into the original chat history.
                merge_fallback_chat_history()
                failures = 0  # Reset counter for new model
            # Below the threshold: loop around and retry with the current model.

The prompt passed to try_fallback_model should consist of the last few failed tool calls.
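
One way to assemble that prompt is to keep a short rolling log of failed calls and render it as text. The sketch below is an assumption about how this could look: FailedToolCall, FailedCallLog, and the prompt wording are hypothetical and not part of the existing code base.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class FailedToolCall:
    """Hypothetical record of one failed tool call."""
    tool_name: str
    arguments: str
    error: str


class FailedCallLog:
    """Keeps only the most recent failed tool calls."""

    def __init__(self, keep_last=3):
        # A bounded deque drops the oldest entry automatically.
        self._calls = deque(maxlen=keep_last)

    def record(self, call):
        self._calls.append(call)

    def to_prompt(self):
        """Render the retained failures as a prompt for the fallback model."""
        lines = ["The following tool calls failed. Produce a corrected tool call."]
        for i, call in enumerate(self._calls, start=1):
            lines.append(f"{i}. {call.tool_name}({call.arguments}) -> {call.error}")
        return "\n".join(lines)


log = FailedCallLog(keep_last=2)
log.record(FailedToolCall("read_file", "path='a.txt'", "timeout"))
log.record(FailedToolCall("run_shell", "cmd='ls'", "rate limit"))
log.record(FailedToolCall("write_file", "path='b.txt'", "context length"))
print(log.to_prompt())  # only the two most recent failures are included
```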

Model Fallback Sequence

Define fallback sequences for each provider based on model capabilities:

Try same provider's smaller models
Try alternative providers' equivalent models
Raise final error if all fallbacks fail
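
The ordering above can be sketched as a generator that yields the current provider's models first and then the other providers' models, with a final error chained to the original failure. The names AllFallbacksFailed, fallback_order, and run_with_fallbacks are hypothetical, and the sequences shown are placeholders rather than a recommended configuration.

```python
class AllFallbacksFailed(Exception):
    """Hypothetical error raised when every fallback model has been tried."""


def fallback_order(current_provider, sequences):
    """Yield (provider, model) pairs: same provider first, then the others."""
    for model in sequences.get(current_provider, []):
        yield current_provider, model
    for provider, models in sequences.items():
        if provider != current_provider:
            for model in models:
                yield provider, model


def run_with_fallbacks(call_model, current_provider, sequences, original_error):
    """Try each fallback in order; keep the original error if none succeeds."""
    for provider, model in fallback_order(current_provider, sequences):
        try:
            return call_model(provider, model)
        except Exception:
            # A real version would catch only the LLM-specific error type.
            continue
    # Chain the original error so it is preserved as __cause__.
    raise AllFallbacksFailed("all fallback models failed") from original_error


# Placeholder sequences, assumed for illustration.
sequences = {
    "openai": ["gpt-3.5-turbo"],
    "anthropic": ["claude-3-5-sonnet-20241022"],
}
print(list(fallback_order("openai", sequences)))
```

Chaining with `from original_error` is one way to satisfy the requirement that the original error survives when all fallbacks fail.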

Provider Strategy Updates

Update provider strategies to support fallback configuration:

Add provider-specific fallback sequences
Handle model capability validation during fallback
Track successful/failed attempts
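
Attempt tracking could be as small as a pair of counters per model. The following is a sketch under that assumption; AttemptTracker and its method names are hypothetical, and the key could equally be a tool call context instead of a model name.

```python
from collections import defaultdict


class AttemptTracker:
    """Hypothetical bookkeeping for tool call outcomes, keyed by model name."""

    def __init__(self, max_tool_failures=3):
        self.max_tool_failures = max_tool_failures
        self.consecutive_failures = defaultdict(int)
        self.totals = defaultdict(lambda: {"success": 0, "failure": 0})

    def record_success(self, model):
        self.totals[model]["success"] += 1
        self.consecutive_failures[model] = 0  # a success resets the streak

    def record_failure(self, model):
        self.totals[model]["failure"] += 1
        self.consecutive_failures[model] += 1

    def should_fall_back(self, model):
        """True once the failure streak reaches the configured threshold."""
        return self.consecutive_failures[model] >= self.max_tool_failures


tracker = AttemptTracker(max_tool_failures=3)
for _ in range(3):
    tracker.record_failure("gpt-4")
print(tracker.should_fall_back("gpt-4"))
```

Keeping lifetime totals separate from the consecutive streak means the streak can reset on success without losing the history needed for later reporting.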

Risks and Mitigations

Performance Impact
- Risk: Multiple fallback attempts could increase latency
- Mitigation: Set a reasonable max_tool_failures limit and timeouts
Consistency
- Risk: Different models may give slightly different outputs
- Mitigation: Validate output schema consistency across models
Cost
- Risk: Fallback to more expensive models
- Mitigation: Configure cost limits and preferred fallback sequences
State Management
- Risk: Loss of context during fallbacks
- Mitigation: Preserve conversation state and tool context

Acceptance Criteria

Tool calls automatically attempt fallback models after N consecutive failures
--no-fallback-tool argument successfully disables fallback behavior
Fallback sequence respects provider and model capabilities
Original error is preserved if all fallbacks fail
Unit tests cover fallback scenarios and edge cases
README.md updated to reflect new behavior

Testing

Unit tests for fallback wrapper
Integration tests with mock LLM failures
Provider strategy fallback tests
Command line argument handling
Error preservation and reporting
Performance impact measurement
Edge cases (e.g., partial failures, timeout handling)
State preservation during fallbacks
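
Mocked LLM failures can be simulated with unittest.mock, whose side_effect accepts a sequence of exceptions and return values. The sketch below tests a simplified stand-in, not the real wrapper: FakeLLMError and call_with_retries are assumptions made so the example is self-contained.

```python
from unittest.mock import Mock


class FakeLLMError(Exception):
    """Stand-in for the project's LLM error type (an assumption)."""


def call_with_retries(tool_call_func, max_failures=3):
    """Simplified stand-in for the wrapper under test."""
    for attempt in range(max_failures):
        try:
            return tool_call_func()
        except FakeLLMError:
            if attempt == max_failures - 1:
                raise  # out of attempts: surface the last error unchanged


def test_succeeds_after_two_failures():
    # The mock raises twice, then returns a value on the third call.
    tool = Mock(side_effect=[FakeLLMError("timeout"), FakeLLMError("rate limit"), "ok"])
    assert call_with_retries(tool) == "ok"
    assert tool.call_count == 3


def test_error_preserved_when_every_attempt_fails():
    error = FakeLLMError("context length")
    tool = Mock(side_effect=error)  # raises the same error on every call
    try:
        call_with_retries(tool, max_failures=2)
    except FakeLLMError as caught:
        assert caught is error
    else:
        raise AssertionError("expected FakeLLMError")
    assert tool.call_count == 2
```

Both functions follow the pytest naming convention, so they can be collected as-is or called directly.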

Documentation Updates

Add fallback feature to main README
Document --no-fallback-tool in CLI help
Document provider-specific fallback sequences

Future Considerations

Allow custom fallback sequences via configuration
Add monitoring and alerting for fallback frequency
Optimize fallback selection based on historical success rates
Cost-aware fallback routing
