Overview
The Agent Triage Protocol (ATP) defines standard error responses that enable consistent error handling across implementations. This page describes the error response format, standard error codes, and best practices for handling errors in ATP implementations.
Error Response Format
Error responses use a structured format that provides both machine-readable codes and human-readable messages. The error object contains:
Field | Type | Description |
---|---|---|
code | string | Standardized error identifier for programmatic handling |
message | string | Human-readable error description |
details | object | Additional context specific to the error type |
request_id | string | Unique identifier for request tracing |
The request_id field is particularly important for troubleshooting, as it allows correlation of error reports across system boundaries and log files.
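As an illustration of the format above, the following minimal sketch parses an error body into a typed object. It assumes the four fields sit at the top level of a JSON body; the `ATPError` class, the `parse_error` helper, and the sample payload values are hypothetical and not part of the protocol.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict


@dataclass
class ATPError:
    """The four fields of the error object described above."""
    code: str
    message: str
    details: Dict[str, Any] = field(default_factory=dict)
    request_id: str = ""


def parse_error(body: str) -> ATPError:
    """Parse a JSON error body into an ATPError.

    Assumption: the fields are top-level keys. A real server may wrap
    them differently (for example under an "error" key).
    """
    data = json.loads(body)
    return ATPError(
        code=data["code"],
        message=data["message"],
        details=data.get("details") or {},
        request_id=data.get("request_id", ""),
    )


# Hypothetical payload, for illustration only.
sample = json.dumps({
    "code": "NOTIFICATION_EXPIRED",
    "message": "Notification deadline has passed",
    "details": {"notification_id": "ntf_123"},
    "request_id": "req_abc",
})
err = parse_error(sample)
# Put the request ID first so log lines are easy to correlate.
print(f"[{err.request_id}] {err.code}: {err.message}")
```

Keeping `request_id` on the parsed object makes it straightforward to attach to every log line and support ticket.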
HTTP Status Codes
The protocol uses standard HTTP status codes to indicate the class of error:
Status Code | Description | When Used |
---|---|---|
400 Bad Request | Invalid request format or parameters | Malformed JSON, missing required fields |
401 Unauthorized | Missing or invalid authentication | Invalid or expired API key/token |
403 Forbidden | Valid auth but insufficient permissions | Attempting to access another service’s notifications |
404 Not Found | Resource doesn’t exist | Notification ID not found |
409 Conflict | Request conflicts with current state | Responding to already-answered notification |
422 Unprocessable Entity | Request validation failed | Response data doesn’t match expected format |
429 Too Many Requests | Rate limit exceeded | Too many requests in time period |
500 Internal Server Error | Server-side failure | Unexpected errors in ATP server |
503 Service Unavailable | Temporary service issues | Server maintenance or overload |
Error Codes
The protocol defines specific error codes that provide more detail than the HTTP status alone. These codes allow client applications to implement specific handling logic for different error conditions.
Authentication Errors
Code | Description |
---|---|
AUTH_INVALID_TOKEN | The provided token is malformed or invalid |
AUTH_EXPIRED_TOKEN | The authentication token has expired |
AUTH_INSUFFICIENT_PERMISSIONS | Token lacks required permissions |
Notification Errors
Code | Description |
---|---|
NOTIFICATION_NOT_FOUND | Notification doesn’t exist or is no longer accessible |
NOTIFICATION_EXPIRED | Notification deadline has passed |
NOTIFICATION_ALREADY_RESPONDED | Notification has already been answered |
NOTIFICATION_INVALIDATED | Service marked notification as invalid |
Validation Errors
Code | Description |
---|---|
INVALID_ACTION_ID | The specified action_id doesn’t exist for this notification |
INVALID_RESPONSE_DATA | Response data doesn’t match expected format |
CONSTRAINT_VIOLATION | Response violates defined constraints |
MISSING_REQUIRED_FIELD | Required field is missing from request |
Service Errors
Code | Description |
---|---|
SERVICE_NOT_REGISTERED | Service hasn’t been registered with ATP |
SERVICE_SUSPENDED | Service has been temporarily suspended |
CALLBACK_FAILED | Failed to deliver response to service callback |
Rate Limiting
Code | Description |
---|---|
RATE_LIMIT_EXCEEDED | Too many requests from this client/service |
QUOTA_EXCEEDED | Monthly/daily quota has been exceeded |
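One way a client might use these codes is to group them by the families listed in the tables above and branch on the family. The sketch below is only an illustration: the `ERROR_CATEGORIES` mapping and the `category_of` helper are hypothetical names, and the "unknown" fallback for unlisted codes is an assumption.

```python
# Error codes grouped by the families in the tables above.
ERROR_CATEGORIES = {
    "authentication": {
        "AUTH_INVALID_TOKEN",
        "AUTH_EXPIRED_TOKEN",
        "AUTH_INSUFFICIENT_PERMISSIONS",
    },
    "notification": {
        "NOTIFICATION_NOT_FOUND",
        "NOTIFICATION_EXPIRED",
        "NOTIFICATION_ALREADY_RESPONDED",
        "NOTIFICATION_INVALIDATED",
    },
    "validation": {
        "INVALID_ACTION_ID",
        "INVALID_RESPONSE_DATA",
        "CONSTRAINT_VIOLATION",
        "MISSING_REQUIRED_FIELD",
    },
    "service": {
        "SERVICE_NOT_REGISTERED",
        "SERVICE_SUSPENDED",
        "CALLBACK_FAILED",
    },
    "rate_limiting": {
        "RATE_LIMIT_EXCEEDED",
        "QUOTA_EXCEEDED",
    },
}


def category_of(code: str) -> str:
    """Return the family an error code belongs to.

    Assumption: codes not listed here fall back to "unknown" so that
    callers can apply generic handling instead of failing.
    """
    for category, codes in ERROR_CATEGORIES.items():
        if code in codes:
            return category
    return "unknown"


print(category_of("AUTH_EXPIRED_TOKEN"))
print(category_of("QUOTA_EXCEEDED"))
```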
Client Error Handling
Robust client implementations need comprehensive error handling to operate reliably in production. The protocol distinguishes between transient failures, which warrant retry attempts, and permanent errors, which require user intervention or alternative action.
Transient vs. Permanent Errors
Transient errors are temporary issues that may resolve with time or retries:
- All 5xx series errors
- Rate limiting (429) responses
- Network connectivity issues
- Webhook delivery failures
Permanent errors will not resolve on retry and require user intervention or a different action:
- Authentication errors (except token expiration)
- Resource not found errors
- Validation errors
- Business logic errors (e.g., notification already responded)
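The distinction above can be sketched as a small classifier. This is an illustration under stated assumptions, not protocol-defined behavior: the `classify` function is hypothetical, a status of `None` is used here to represent a network failure with no HTTP response, `CALLBACK_FAILED` is assumed to be the code for webhook delivery failures, and an expired token is reported separately because it is the one authentication error singled out as recoverable.

```python
from typing import Optional


def classify(status: Optional[int], code: Optional[str] = None) -> str:
    """Classify a failure as "transient", "refresh_token", or "permanent".

    status is the HTTP status code, or None when no response was
    received at all (assumed here to mean a connectivity problem).
    """
    if status is None:
        return "transient"          # network connectivity issue
    if status >= 500 or status == 429:
        return "transient"          # 5xx series and rate limiting
    if code == "CALLBACK_FAILED":
        return "transient"          # assumed webhook delivery failure
    if code == "AUTH_EXPIRED_TOKEN":
        return "refresh_token"      # recoverable by obtaining a new token
    return "permanent"              # auth, not found, validation, business logic


print(classify(503))
print(classify(401, "AUTH_EXPIRED_TOKEN"))
print(classify(409, "NOTIFICATION_ALREADY_RESPONDED"))
```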
Retry Strategies
For temporary failures, clients should implement exponential backoff retry strategies:
- Start with an initial retry delay of one second
- Double the delay with each subsequent attempt
- Add small random jitter to prevent thundering herd problems
- Cap maximum delay at 60 seconds
- Limit total retry attempts (typically 3-5 is reasonable)
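The retry guidance above can be sketched as follows. The `TransientError` exception, the function names, the 10% jitter fraction, and the default of four retries are assumptions chosen for illustration; only the one-second start, the doubling, and the 60-second cap come from the list above.

```python
import random
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class TransientError(Exception):
    """Hypothetical marker for failures that are worth retrying."""


def backoff_delay(attempt: int, base: float = 1.0, cap: float = 60.0,
                  jitter: float = 0.1) -> float:
    """Seconds to wait before retry number `attempt` (0-based).

    The delay doubles with each attempt, gains up to `jitter` (a
    fraction of the delay) of random extra time, and never exceeds `cap`.
    """
    delay = min(cap, base * (2 ** attempt))
    return min(cap, delay + random.uniform(0, jitter * delay))


def call_with_retries(operation: Callable[[], T], max_retries: int = 4,
                      sleep: Callable[[float], None] = time.sleep) -> T:
    """Run `operation`, retrying on TransientError with backoff."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_retries:
                raise  # retries exhausted: surface the last error
            sleep(backoff_delay(attempt))
    raise AssertionError("unreachable")
```

Passing `sleep` as a parameter keeps the function testable without real waiting. Only errors classified as transient should be raised as `TransientError`; permanent errors should propagate immediately.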
User Feedback
Client applications should provide appropriate feedback to users based on error types:
- For transient errors, show a temporary “retrying” message
- For permanent errors, show clear explanation of the issue
- For validation errors, highlight the specific fields with problems
- For expired or invalidated notifications, remove them from the UI
- For authentication issues, prompt for re-authentication
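A client might translate the guidance above into a single decision function that the UI layer consumes. The sketch below is hypothetical throughout: the action names, the `feedback_for` helper, and the `fields` key inside `details` are assumptions, since the protocol does not specify how validation details are structured.

```python
from typing import Any, Dict, Optional

AUTH_CODES = {"AUTH_INVALID_TOKEN", "AUTH_EXPIRED_TOKEN",
              "AUTH_INSUFFICIENT_PERMISSIONS"}
VALIDATION_CODES = {"INVALID_ACTION_ID", "INVALID_RESPONSE_DATA",
                    "CONSTRAINT_VIOLATION", "MISSING_REQUIRED_FIELD"}
REMOVE_CODES = {"NOTIFICATION_EXPIRED", "NOTIFICATION_INVALIDATED"}


def feedback_for(code: str, transient: bool,
                 details: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Pick a UI action for an error, following the list above."""
    details = details or {}
    if code in REMOVE_CODES:
        return {"action": "remove_notification"}
    if code in AUTH_CODES:
        return {"action": "prompt_reauthentication"}
    if code in VALIDATION_CODES:
        # "fields" is a hypothetical key naming the offending inputs.
        return {"action": "highlight_fields",
                "fields": list(details.get("fields", []))}
    if transient:
        return {"action": "show_retrying"}
    return {"action": "show_error"}


print(feedback_for("MISSING_REQUIRED_FIELD", False, {"fields": ["reason"]}))
```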
Service Callback Errors
When the ATP server delivers responses to service webhook endpoints, services may encounter processing errors that prevent successful handling of user decisions. Services should communicate these errors using a consistent format that enables appropriate ATP server behavior:
Field | Type | Description |
---|---|---|
code | string | Service-specific error identifier |
message | string | Technical error description for logging |
user_message | string | Human-readable message for potential user display |
retriable | boolean | Indicates whether retry attempts may succeed |
The retriable field is particularly important, as it tells the ATP server whether it should attempt to deliver the response again later. If set to false, the ATP server will not retry and may notify the user that their response could not be processed.
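The following sketch shows how a service might build such an error body and how a server might act on the retriable flag. It assumes a JSON body with the four fields as top-level keys; the helper names and the `DB_UNAVAILABLE` code are hypothetical, and treating a missing or unparseable flag as "do not retry" is an assumption rather than documented behavior.

```python
import json


def build_callback_error(code: str, message: str, user_message: str,
                         retriable: bool) -> str:
    """Serialize a service callback error with the fields listed above."""
    return json.dumps({
        "code": code,
        "message": message,
        "user_message": user_message,
        "retriable": retriable,
    })


def should_redeliver(error_body: str) -> bool:
    """Decide whether delivery should be attempted again later.

    Assumption: anything other than an explicit `retriable: true`
    (including an unparseable body) means no retry.
    """
    try:
        payload = json.loads(error_body)
    except json.JSONDecodeError:
        return False
    return isinstance(payload, dict) and payload.get("retriable") is True


# Hypothetical service-side failure, for illustration only.
body = build_callback_error(
    code="DB_UNAVAILABLE",
    message="Timed out writing decision to the database",
    user_message="We could not record your response yet.",
    retriable=True,
)
print(should_redeliver(body))
```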
Logging and Monitoring
Robust ATP implementations should include comprehensive logging and monitoring for error conditions:
- Log all errors with their request IDs
- Include contextual information in logs (user ID, service ID, notification ID)
- Monitor error rates by type and service
- Set up alerts for unusual error patterns
- Implement distributed tracing for complex deployments
- Never log authentication tokens
- Redact personal information from error logs
- Sanitize potentially sensitive fields in error details
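The logging and redaction points above can be combined in a small helper. This is a minimal sketch using Python's standard `logging` module; the `SENSITIVE_KEYS` set, the `sanitize` and `log_error` names, and the `[REDACTED]` placeholder are assumptions, and a real deployment would tailor the key list to its own data.

```python
import logging
from typing import Any, Dict

# Hypothetical list of keys whose values must never reach the logs.
SENSITIVE_KEYS = {"token", "authorization", "api_key", "email"}


def sanitize(value: Any) -> Any:
    """Recursively replace values of sensitive keys with a placeholder."""
    if isinstance(value, dict):
        return {
            key: "[REDACTED]" if str(key).lower() in SENSITIVE_KEYS
            else sanitize(item)
            for key, item in value.items()
        }
    if isinstance(value, list):
        return [sanitize(item) for item in value]
    return value


def log_error(logger: logging.Logger, error: Dict[str, Any],
              context: Dict[str, Any]) -> None:
    """Log an error with its request ID and sanitized context."""
    logger.error(
        "ATP error code=%s request_id=%s context=%s details=%s",
        error.get("code"),
        error.get("request_id"),
        sanitize(context),
        sanitize(error.get("details", {})),
    )


logging.basicConfig(level=logging.INFO)
log_error(
    logging.getLogger("atp"),
    {"code": "AUTH_INVALID_TOKEN", "request_id": "req_abc",
     "details": {"token": "secret-value"}},
    {"service_id": "svc_1", "notification_id": "ntf_123"},
)
```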