Action System
Update Summary
Changes Made
- Updated Coding Actions section to reflect enhanced code generation prompt with structured role, task instructions, reasoning framework, output requirements, and safety constraints
- Added new section on Prompt Management System to explain centralized prompt handling
- Enhanced security considerations with additional safety constraints from the enhanced prompt
- Updated referenced files list to include prompt_manager.py, code_generation.json, and llm.py
- Added details about dynamic prompt enhancement with mood adaptation and safety constraints
Table of Contents
- Introduction
- Action Base Class and Execute Contract
- Action Registry and Discovery Mechanism
- Action Manager Execution Lifecycle
- Built-in Action Types and Use Cases
- Prompt Management System
- Defining and Registering Custom Actions
- System State and Service Interaction
- Security Considerations for Action Execution
- Troubleshooting Common Issues
- Conclusion
Introduction
The Action System is a core component of the Ravana AGI framework, responsible for executing tasks based on decisions made by the decision engine. It provides a structured, extensible mechanism for defining, registering, and executing actions that the AGI can perform. The system is designed with modularity, safety, and scalability in mind, enabling both built-in and custom actions to be seamlessly integrated. This document provides a comprehensive overview of the action system's architecture, functionality, and best practices.
Action Base Class and Execute Contract
The Action class serves as the abstract base class for all executable actions within the system. It defines a standardized interface that ensures consistency across different types of actions.
Key Properties
- name: A unique identifier for the action (e.g., "write_python_code")
- description: Human-readable explanation of what the action does
- parameters: List of dictionaries defining input parameters with name, type, description, and required status
Execute Method Contract
The execute method is an abstract async method that must be implemented by all subclasses. It receives keyword arguments matching the defined parameters and returns the result of the action execution. The method contract requires:
- Parameter validation via validate_params before execution
- Asynchronous execution using the async/await pattern
- Proper error handling and logging
- Return of a dictionary with status information or execution results
Validation and Serialization
The base class provides built-in validation through validate_params, which checks for missing required parameters and unexpected parameters. It also includes to_dict and to_json methods for serializing action metadata, which is used in LLM prompts and API responses.
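The contract described above can be sketched as a small abstract base class. This is an illustration under assumptions, not the framework's implementation: SketchAction is a hypothetical name, the InvalidActionParams class defined here is a stand-in for the framework's exception, and the error messages are invented. Only the method names validate_params, to_dict, to_json, and execute come from the text.

```python
import json
from abc import ABC, abstractmethod
from typing import Any, Dict, List


class InvalidActionParams(Exception):
    """Stand-in for the framework's exception of the same name (assumed shape)."""


class SketchAction(ABC):
    """Hypothetical base class mirroring the contract described in the text."""

    @property
    @abstractmethod
    def name(self) -> str: ...

    @property
    @abstractmethod
    def description(self) -> str: ...

    @property
    @abstractmethod
    def parameters(self) -> List[Dict[str, Any]]: ...

    @abstractmethod
    async def execute(self, **kwargs: Any) -> Any: ...

    def validate_params(self, params: Dict[str, Any]) -> None:
        """Reject missing required parameters and parameters that were never declared."""
        declared = {p["name"] for p in self.parameters}
        required = {p["name"] for p in self.parameters if p.get("required")}
        provided = set(params)
        missing = required - provided
        unexpected = provided - declared
        if missing:
            raise InvalidActionParams(f"Missing required parameters: {sorted(missing)}")
        if unexpected:
            raise InvalidActionParams(f"Unexpected parameters: {sorted(unexpected)}")

    def to_dict(self) -> Dict[str, Any]:
        """Metadata only; suitable for embedding in a prompt or an API response."""
        return {
            "name": self.name,
            "description": self.description,
            "parameters": self.parameters,
        }

    def to_json(self) -> str:
        return json.dumps(self.to_dict(), indent=2)
```

A concrete subclass would call self.validate_params(kwargs) at the top of execute, so that malformed input fails before any side effects occur.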
Action Registry and Discovery Mechanism
The ActionRegistry is responsible for managing the collection of available actions and providing lookup functionality.
Registration Process
The registry initializes with several built-in actions:
- ProposeAndTestInventionAction
- LogMessageAction
- WritePythonCodeAction
- ExecutePythonFileAction
- BlogPublishAction
- CollaborativeTaskAction
These are registered during initialization with their required dependencies (system and data_service). The CollaborativeTaskAction is now included as a core built-in action, reflecting its importance in the cross-system task delegation workflow.
Public Registration Methods
- register_action: Public method to register a new action instance
- _register_action: Internal method that handles registration with overwrite warnings
Automatic Discovery
The discover_actions method uses Python's pkgutil.walk_packages to automatically discover and register all action classes in the core.actions package. It:
- Iterates through all modules in the actions package
- Imports each module
- Finds all classes that inherit from Action (excluding the base class itself)
- Instantiates and registers each action
- Logs warnings for duplicate names and errors for instantiation failures
This discovery mechanism enables plug-and-play extensibility - new actions can be added by simply creating a new file in the actions directory with a properly defined action class.
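The discovery steps listed above can be sketched with pkgutil.walk_packages and inspect. This is a simplified illustration, not the registry's code: discover_subclasses is a hypothetical helper, it keys results by class name rather than by the action's name property, and it stops at collecting classes, leaving out instantiation with system and data_service.

```python
import importlib
import inspect
import logging
import pkgutil
from types import ModuleType
from typing import Dict

logger = logging.getLogger(__name__)


def discover_subclasses(package: ModuleType, base_cls: type) -> Dict[str, type]:
    """Import every module under `package` and collect concrete subclasses of `base_cls`."""
    found: Dict[str, type] = {}
    prefix = package.__name__ + "."
    for module_info in pkgutil.walk_packages(package.__path__, prefix):
        try:
            module = importlib.import_module(module_info.name)
        except Exception as exc:  # one broken module should not abort discovery
            logger.error("Could not import %s: %s", module_info.name, exc)
            continue
        for _, cls in inspect.getmembers(module, inspect.isclass):
            if cls is base_cls or not issubclass(cls, base_cls):
                continue
            if inspect.isabstract(cls):
                continue  # abstract intermediates cannot be instantiated
            existing = found.get(cls.__name__)
            if existing is not None and existing is not cls:
                logger.warning("Duplicate class name %s; keeping the first one", cls.__name__)
                continue
            found[cls.__name__] = cls
    return found
```

A class that is merely imported into a second module is seen twice by this loop; the identity check keeps that case from being reported as a duplicate.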
Action Manager Execution Lifecycle
The ActionManager orchestrates the execution of actions, handling the complete lifecycle from decision parsing to result return.
Decision Parsing
The execute_action method handles two decision formats:
- Pre-parsed action dictionary: Contains "action" and "params" keys
- Raw LLM response: Contains "raw_response" with JSON embedded in markdown code blocks
For raw responses, the system extracts the JSON block using string manipulation and parses it. If no valid JSON block is found, it attempts to parse the entire response as JSON.
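A minimal sketch of this parsing logic, assuming the fenced block is tagged json and that the decision keys are exactly "action", "params", and "raw_response" as listed above. The function name parse_decision and the choice to default missing params to an empty dictionary are assumptions made for illustration.

```python
import json
from typing import Any, Dict

FENCE = "`" * 3  # a markdown code fence, built this way to keep the example readable


def parse_decision(decision: Dict[str, Any]) -> Dict[str, Any]:
    """Return {"action": ..., "params": ...} from either supported decision format."""
    # Format 1: already parsed by the caller.
    if "action" in decision:
        return {"action": decision["action"], "params": decision.get("params", {})}

    # Format 2: raw LLM text, ideally with a fenced json block inside.
    raw = decision.get("raw_response", "")
    opener = FENCE + "json"
    start = raw.find(opener)
    if start != -1:
        start += len(opener)
        end = raw.find(FENCE, start)
        candidate = raw[start:end] if end != -1 else raw[start:]
    else:
        candidate = raw  # fall back to treating the whole response as JSON

    parsed = json.loads(candidate.strip())  # raises json.JSONDecodeError on bad input
    return {"action": parsed["action"], "params": parsed.get("params", {})}
```

Letting json.JSONDecodeError propagate keeps the sketch short; a caller would normally convert it into the standardized error object.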
Execution Flow
- Extract action name and parameters from the decision
- Retrieve the action instance from the registry
- Log the execution attempt
- Execute the action with provided parameters
- Log successful execution to the database
- Return the result
Error Handling
The execution lifecycle includes comprehensive error handling:
- ActionException: For expected action-related errors (logged with error status)
- General Exception: For unexpected errors (logged with full traceback)
- All errors are logged to the database via save_action_log
The method returns a standardized error object with an "error" key, ensuring consistent error reporting to the decision engine.
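The execution flow and error handling can be sketched together as one coroutine. Everything here is a simplified stand-in: run_action is a hypothetical function, the registry is reduced to a plain dictionary of coroutine functions, ActionException is redefined locally, and the four-argument save_action_log signature is an assumption rather than the DataService API.

```python
import asyncio
import logging
import traceback
from typing import Any, Awaitable, Callable, Dict

logger = logging.getLogger(__name__)


class ActionException(Exception):
    """Stand-in for the framework's expected-error type."""


async def run_action(
    actions: Dict[str, Callable[..., Awaitable[Any]]],
    name: str,
    params: Dict[str, Any],
    save_action_log: Callable[[str, Dict[str, Any], str, Any], None],
) -> Any:
    """Look up, execute, and log one action; never raise to the caller."""
    try:
        if name not in actions:
            raise ActionException(f"Action '{name}' not found")
        logger.info("Executing action %s with params %s", name, params)
        result = await actions[name](**params)
        save_action_log(name, params, "success", result)
        return result
    except ActionException as exc:
        # Expected, action-related failure: log it and report a clean message.
        save_action_log(name, params, "error", str(exc))
        return {"error": str(exc)}
    except Exception as exc:
        # Unexpected failure: keep the full traceback for later debugging.
        logger.error("Unexpected error in %s", name, exc_info=True)
        save_action_log(name, params, "error", traceback.format_exc())
        return {"error": f"An unexpected error occurred: {exc}"}
```

Because every path returns a value, the decision engine can rely on checking for an "error" key instead of wrapping each call in its own try block.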
Built-in Action Types and Use Cases
The system provides several built-in action types categorized by functionality.
Coding Actions
These actions enable the AGI to generate and execute code.
WritePythonCodeAction
Generates Python code based on a hypothesis and test plan using the LLM with an enhanced prompt structure.
Parameters:
- file_path: Where to save the generated code
- hypothesis: The concept to test
- test_plan: How to test the hypothesis
The action uses a structured prompt template with multiple sections to guide code generation:
- [ROLE DEFINITION]: Defines the AI as an expert programmer
- [CONTEXT]: Provides the hypothesis and test plan
- [TASK INSTRUCTIONS]: Step-by-step process for code generation
- [REASONING FRAMEWORK]: Software engineering best practices
- [OUTPUT REQUIREMENTS]: Specifications for code quality and format
- [SAFETY CONSTRAINTS]: Security and reliability guidelines
The prompt ensures high-quality code generation by requiring:
- Clear, descriptive variable and function names
- Comprehensive inline documentation
- Proper error handling and edge case management
- Efficient algorithms and data structures
- Adherence to Python conventions and best practices
- Confidence score for solution correctness (0.0-1.0)
The action extracts code from markdown code blocks in the LLM response and writes it to the specified file.
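The extraction step can be sketched with a regular expression. This is an assumption-laden illustration: extract_code and write_generated_code are hypothetical helper names, the fence is assumed to be tagged python or py (or untagged), and falling back to the whole response when no fence is present is a design choice of this sketch, not documented behavior.

```python
import re
from pathlib import Path

FENCE = "`" * 3  # a markdown code fence
_CODE_BLOCK = re.compile(FENCE + r"(?:python|py)?[ \t]*\n(.*?)" + FENCE, re.DOTALL)


def extract_code(llm_response: str) -> str:
    """Return the body of the first fenced block, or the whole response if none is found."""
    match = _CODE_BLOCK.search(llm_response)
    body = match.group(1) if match else llm_response
    return body.strip() + "\n"


def write_generated_code(llm_response: str, file_path: str) -> Path:
    """Extract the code and write it to file_path, creating parent directories."""
    path = Path(file_path)
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(extract_code(llm_response), encoding="utf-8")
    return path
```

Only the first fenced block is used, so any explanation or confidence score the model writes after the code is dropped from the saved file.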
ExecutePythonFileAction
Executes a Python script and captures its output.
Parameters:
- file_path: Path to the Python script
Uses asyncio.create_subprocess_shell to run the script, capturing stdout and stderr. Returns execution results including return code and output.
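A minimal sketch of running a script this way and capturing its output. The function name run_python_file and the keys of the returned dictionary are assumptions. The sketch uses sys.executable and shlex.quote to build the command, which is a POSIX-oriented choice; the action's real command line is not shown in this document.

```python
import asyncio
import shlex
import sys
from typing import Any, Dict


async def run_python_file(file_path: str) -> Dict[str, Any]:
    """Run a Python script in a subprocess and capture stdout, stderr, and the exit code."""
    # Quote both parts so paths containing spaces survive the shell (POSIX quoting rules).
    command = f"{shlex.quote(sys.executable)} {shlex.quote(file_path)}"
    process = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.PIPE,
    )
    stdout, stderr = await process.communicate()  # waits for the process to exit
    return {
        "return_code": process.returncode,
        "stdout": stdout.decode(errors="replace"),
        "stderr": stderr.decode(errors="replace"),
    }
```

When the command is built from untrusted input, asyncio.create_subprocess_exec avoids shell interpretation entirely and is usually the safer call.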
IO Actions
LogMessageAction
Records messages to the console with configurable logging levels.
Parameters:
- message: Content to log
- level: Logging level (info, warning, error)
Prepends an "[AGI Thought]:" prefix to messages for easy identification in logs.
Multi-modal Actions
These actions process various media types and are registered by the EnhancedActionManager.
ProcessImageAction
Analyzes image files using multi-modal services.
ProcessAudioAction
Processes and analyzes audio files.
AnalyzeDirectoryAction
Analyzes all media files in a directory, optionally recursively.
CrossModalAnalysisAction
Performs analysis across multiple content types.
These actions validate file existence before processing and can add results to the knowledge base.
Experimental Actions
ProposeAndTestInventionAction
Enables the AGI to propose novel ideas and test them through the experimentation engine.
Parameters:
- invention_description: The novel concept
- test_plan_suggestion: How to test it
Frames the invention as a hypothesis and runs it through the advanced experimentation engine, logging results to the database. This action represents the AGI's creative and scientific reasoning capabilities.
Blog Actions
New blog-specific actions have been added to support autonomous content creation and publishing.
BlogPublishAction
Orchestrates the complete blog publishing workflow, from content generation to platform publication.
Parameters:
- topic: Main subject for the blog post (required)
- style: Writing style from available options (optional)
- context: Additional aspects to focus on (optional)
- custom_tags: Tags to include beyond auto-generated ones (optional)
- dry_run: If true, generates content without publishing (optional)
The action follows a comprehensive workflow:
- Validates configuration and parameters
- Generates content using LLM with memory context
- Validates API configuration
- Publishes to the blog platform
- Logs results to data and memory services
The action includes comprehensive error handling for content generation failures, API errors, and unexpected exceptions. It also provides utility methods like test_connection() for API connectivity testing and get_configuration_info() for retrieving current blog settings.
Collaborative Task Actions
New collaborative task actions have been added to support cross-system task delegation and user collaboration.
CollaborativeTaskAction
Manages collaborative tasks between RAVANA and users with feedback mechanisms.
Parameters:
- task_type: Type of collaborative task (create, update, complete, cancel, provide_feedback, request_feedback)
- task_id: Unique identifier for the task (required for update, complete, cancel, provide_feedback)
- title: Title of the task (required for create)
- description: Detailed description of the task
- user_id: User ID to collaborate with
- priority: Priority level (low, medium, high, critical)
- deadline: Deadline for task completion (ISO format)
- feedback: Feedback content (required for provide_feedback)
- feedback_type: Type of feedback (positive, negative, suggestion, question)
- rating: Numerical rating for the task (1-10)
The action supports a comprehensive workflow for collaborative task management:
- Create tasks: Initiates new collaborative tasks with users
- Update tasks: Modifies existing task details
- Complete tasks: Marks tasks as completed with user notification
- Cancel tasks: Cancels tasks with user notification
- Provide feedback: Collects user feedback on task performance
- Request feedback: Proactively requests feedback from users
The action integrates with the conversational AI system to send notifications and messages to users at key points in the task lifecycle. It maintains in-memory storage for tasks and feedback, with comprehensive logging and history tracking for all collaboration events.
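The per-task-type requirements in the parameter list can be expressed as a small lookup table. The sketch below only encodes the rules stated above (which fields are required for which task_type, and the 1-10 rating range). The function name check_task_params and the wording of the messages are hypothetical, and the action's own validation may differ.

```python
from typing import Any, Dict, List

# Required fields per task_type, taken from the parameter descriptions above.
REQUIRED_BY_TYPE: Dict[str, List[str]] = {
    "create": ["title"],
    "update": ["task_id"],
    "complete": ["task_id"],
    "cancel": ["task_id"],
    "provide_feedback": ["task_id", "feedback"],
    "request_feedback": [],
}


def check_task_params(params: Dict[str, Any]) -> List[str]:
    """Return a list of problems; an empty list means the parameters look consistent."""
    task_type = params.get("task_type")
    if task_type not in REQUIRED_BY_TYPE:
        return [f"Unknown task_type: {task_type!r}"]

    problems: List[str] = []
    for field in REQUIRED_BY_TYPE[task_type]:
        if not params.get(field):
            problems.append(f"'{field}' is required for task_type '{task_type}'")

    rating = params.get("rating")
    if rating is not None:
        is_number = isinstance(rating, (int, float)) and not isinstance(rating, bool)
        if not is_number or not 1 <= rating <= 10:
            problems.append("'rating' must be a number between 1 and 10")
    return problems
```

Returning all problems at once, rather than raising on the first one, lets the caller report every issue in a single response.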
Prompt Management System
The system now features a centralized PromptManager that handles all prompt templates and their dynamic enhancement.
Centralized Prompt Repository
The PromptManager uses a repository pattern to store and retrieve prompt templates from JSON files in the prompts directory. Each template includes:
- name: Unique identifier for the prompt
- template: The actual prompt text with placeholders
- metadata: Additional information about category, description, and version
- version: Version tracking for prompt evolution
- created_at/updated_at: Timestamps for version history
Enhanced Prompt Structure
The coding action now uses a structured prompt with multiple sections that guide the LLM's response:
- [ROLE DEFINITION]: Establishes the AI's identity and expertise
- [CONTEXT]: Provides specific task details and parameters
- [TASK INSTRUCTIONS]: Step-by-step process for completing the task
- [REASONING FRAMEWORK]: Methodological approach to problem-solving
- [OUTPUT REQUIREMENTS]: Specific format and quality expectations
- [SAFETY CONSTRAINTS]: Security and reliability guidelines
Dynamic Prompt Enhancement
The system applies dynamic enhancements to prompts through post-processing:
- Mood adaptation: Adjusts prompt tone based on the AI's emotional state
- Safety constraints: Adds context-specific safety requirements
- Confidence scoring: Requires the LLM to include confidence scores
- Risk assessment: Adds requirements for risk identification and mitigation
Template Registration and Retrieval
The system automatically loads all JSON files in the prompts directory as templates. Templates can be retrieved by name with context variables that are automatically substituted. The system validates prompts for required sections before use.
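Loading, validating, and filling a template can be sketched as follows. Several details are assumptions made for illustration: the helper names load_templates and render_prompt, the {placeholder} syntax, and treating all six bracketed sections as required. The PromptManager's actual placeholder format and validation rules are not specified in this document.

```python
import json
import re
from pathlib import Path
from typing import Any, Dict, Sequence

# Assumed to be the required sections; taken from the structure listed in this document.
DEFAULT_REQUIRED_SECTIONS = (
    "[ROLE DEFINITION]",
    "[CONTEXT]",
    "[TASK INSTRUCTIONS]",
    "[REASONING FRAMEWORK]",
    "[OUTPUT REQUIREMENTS]",
    "[SAFETY CONSTRAINTS]",
)


def load_templates(prompts_dir: str) -> Dict[str, Dict[str, Any]]:
    """Load every *.json file in prompts_dir, keyed by the template's "name" field."""
    templates: Dict[str, Dict[str, Any]] = {}
    for path in sorted(Path(prompts_dir).glob("*.json")):
        data = json.loads(path.read_text(encoding="utf-8"))
        templates[data["name"]] = data
    return templates


def render_prompt(
    template: Dict[str, Any],
    context: Dict[str, Any],
    required_sections: Sequence[str] = DEFAULT_REQUIRED_SECTIONS,
) -> str:
    """Check for required sections, then substitute {placeholders} found in context."""
    text = template["template"]
    missing = [section for section in required_sections if section not in text]
    if missing:
        raise ValueError(f"Template {template['name']!r} is missing sections: {missing}")
    # Placeholders without a matching context key are left untouched.
    return re.sub(r"\{(\w+)\}", lambda m: str(context.get(m.group(1), m.group(0))), text)
```

A regular expression is used instead of str.format so that literal braces in the template, such as JSON examples, do not raise errors.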
Defining and Registering Custom Actions
Creating custom actions follows a straightforward process outlined in the developer guide.
Action Structure Requirements
All custom actions must:
- Inherit from the Action base class
- Implement the required properties (name, description, parameters)
- Implement the execute method with async functionality
Step-by-Step Implementation
- Create a Python file in core/actions/ (e.g., core/actions/misc.py)
- Define the action class inheriting from Action
- Implement required properties with appropriate metadata
- Implement the execute method with the desired functionality
from typing import Any, Dict, List

from core.actions.action import Action


class HelloWorldAction(Action):
    """Example action that returns a greeting for the given name."""

    @property
    def name(self) -> str:
        return "hello_world"

    @property
    def description(self) -> str:
        return "A simple action that prints a greeting."

    @property
    def parameters(self) -> List[Dict[str, Any]]:
        return [
            {
                "name": "name",
                "type": "string",
                "description": "The name to include in the greeting.",
                "required": True,
            }
        ]

    async def execute(self, **kwargs: Any) -> Any:
        # "name" is declared as required, so it is expected to be present after validation.
        name = kwargs.get("name")
        return f"Hello, {name}!"
Registration Mechanism
Custom actions are automatically discovered and registered when:
- The action class is defined in a module within core.actions
- The class inherits from Action and is not the base class itself
- The module is importable
No manual registration is required: the discover_actions method will automatically find and instantiate the action during system initialization.
System State and Service Interaction
Actions interact with system state and services through dependencies injected during initialization.
Dependency Injection
The Action base class constructor accepts two dependencies:
- system: Reference to the main AGISystem instance
- data_service: Reference to the DataService for database operations
These dependencies are passed down from the ActionManager through the ActionRegistry to all action instances.
State Access Patterns
Actions access system state through the injected dependencies:
- AGISystem: Provides access to other system components like knowledge_service
- DataService: Enables database operations like logging actions and experiments
For example, the ProposeAndTestInventionAction uses self.data_service.save_experiment_log to record experiment results, while multi-modal actions use self.system.knowledge_service.add_knowledge to update the knowledge base.
Service Integration
The enhanced action manager demonstrates deeper service integration by:
- Creating a MultiModalService instance for media processing
- Using asyncio.to_thread to call synchronous service methods without blocking the event loop
- Implementing caching mechanisms to improve performance
This pattern ensures that actions can leverage system services while maintaining proper separation of concerns.
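The combination of asyncio.to_thread and caching can be sketched with a small helper. ThreadedServiceCaller is a hypothetical class, and the unbounded in-memory cache keyed by function name and arguments is an assumption; the caching strategy actually used by the enhanced action manager is not described here. asyncio.to_thread requires Python 3.9 or later.

```python
import asyncio
from typing import Any, Callable, Dict, Hashable, Tuple


class ThreadedServiceCaller:
    """Runs blocking callables in a worker thread and memoizes their results."""

    def __init__(self) -> None:
        self._cache: Dict[Tuple[str, Tuple[Hashable, ...]], Any] = {}

    async def call(self, func: Callable[..., Any], *args: Hashable) -> Any:
        # Arguments must be hashable because they form part of the cache key.
        key = (func.__qualname__, args)
        if key in self._cache:
            return self._cache[key]
        # The event loop stays free to run other tasks while func blocks in a thread.
        result = await asyncio.to_thread(func, *args)
        self._cache[key] = result
        return result
```

A production cache would also need an eviction policy and invalidation when the underlying file changes; both are omitted to keep the sketch short.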
Security Considerations for Action Execution
The action system incorporates several security measures to prevent misuse and ensure safe execution.
Code Execution Safety
The ExecutePythonFileAction executes code in subprocesses, which provides isolation from the main application process. However, this still represents a potential security risk as executed code has full system access.
The EnhancedActionManager implements a 5-minute timeout for action execution using asyncio.wait_for, preventing infinite loops or long-running processes from blocking the system.
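A timeout wrapper of this kind can be sketched in a few lines. The name run_with_timeout and the shape of the returned error dictionary are assumptions; the 300-second default corresponds to the 5-minute limit mentioned above.

```python
import asyncio
from typing import Any, Awaitable

ACTION_TIMEOUT_SECONDS = 300  # the 5-minute limit described in the text


async def run_with_timeout(
    awaitable: Awaitable[Any], timeout: float = ACTION_TIMEOUT_SECONDS
) -> Any:
    """Await the action, cancelling it and returning an error dict if it runs too long."""
    try:
        return await asyncio.wait_for(awaitable, timeout=timeout)
    except asyncio.TimeoutError:
        # wait_for has already cancelled the inner task by the time this runs.
        return {"error": "Action timed out"}
```

Cancellation only takes effect at an await point, so a coroutine stuck in blocking synchronous code will not be interrupted by this wrapper.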
Enhanced Safety Constraints
The enhanced code generation prompt includes comprehensive safety constraints:
- Security vulnerabilities: Instructs the model to avoid injection flaws, buffer overflows, and other vulnerabilities
- Resource leaks: Requires proper resource management and memory handling
- Unintended actions: Requires that the code performs only the intended operations
- Input/output validation: Requires validation of all inputs and outputs
- Secure coding practices: Requires adherence to security best practices
These constraints are embedded in the prompt structure and apply to all code generation requests. Because they are instructions to the LLM rather than enforced checks, generated code should still be treated as untrusted.
Blog-Specific Security
The BlogPublishAction introduces new security considerations:
- API Key Management: The action uses the enhanced Gemini API key management system with automatic rotation
- Content Validation: Generated content is validated before publication
- Dry Run Mode: Allows content generation without actual publication for review
- Configuration Validation: API configuration is validated before any publication attempts
Collaboration-Specific Security
The CollaborativeTaskAction introduces new security considerations:
- User ID Validation: Ensures user IDs are properly validated before task creation
- Rating Validation: Validates that ratings are within the 1-10 range
- Feedback Sanitization: Should implement input sanitization for feedback content
- Task Ownership: Ensures users can only modify tasks they own
Input Validation
The base Action class includes validate_params which:
- Checks for missing required parameters
- Rejects unexpected parameters
- Raises InvalidActionParams for invalid inputs
This prevents actions from executing with incomplete or malformed data.
Permission and Access Control
Currently, the system lacks explicit permission controls. Actions have access to:
- Full file system (via file paths)
- Database operations
- System process execution
For production use, additional security layers should be implemented, such as:
- Sandboxed execution environments
- File system access restrictions
- Rate limiting for resource-intensive actions
- Action-specific permission policies
Self-Modification Risks
The system allows actions to write and execute code, creating potential self-modification capabilities. While this enables powerful autonomous behavior, it also introduces risks:
- Code generation errors could corrupt system functionality
- Malicious LLM output could introduce harmful code
- Recursive self-modification could lead to instability
Mitigation strategies include:
- Code review before execution (not currently implemented)
- Backup and rollback mechanisms
- Execution in isolated environments
- Static analysis of generated code
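As one illustration of the static-analysis idea, generated code could be parsed with Python's ast module and scanned for imports and calls on a deny list before it is executed. This is not part of the current system. The function name find_risky_constructs and both deny lists are hypothetical, and a check like this is easy to bypass, so it complements sandboxing rather than replacing it.

```python
import ast
from typing import FrozenSet, List, Tuple

# Illustrative deny lists only; a real policy would be project-specific.
DEFAULT_BLOCKED_MODULES = frozenset({"os", "subprocess", "shutil", "socket"})
DEFAULT_BLOCKED_CALLS = frozenset({"eval", "exec", "__import__"})


def find_risky_constructs(
    source: str,
    blocked_modules: FrozenSet[str] = DEFAULT_BLOCKED_MODULES,
    blocked_calls: FrozenSet[str] = DEFAULT_BLOCKED_CALLS,
) -> List[str]:
    """Return human-readable findings, ordered by line; raises SyntaxError on invalid code."""
    findings: List[Tuple[int, str]] = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            for alias in node.names:
                if alias.name.split(".")[0] in blocked_modules:
                    findings.append((node.lineno, f"import of '{alias.name}'"))
        elif isinstance(node, ast.ImportFrom):
            # node.module is None for relative imports such as "from . import x".
            if node.module and node.module.split(".")[0] in blocked_modules:
                findings.append((node.lineno, f"import from '{node.module}'"))
        elif isinstance(node, ast.Call) and isinstance(node.func, ast.Name):
            if node.func.id in blocked_calls:
                findings.append((node.lineno, f"call to '{node.func.id}'"))
    return [f"line {lineno}: {message}" for lineno, message in sorted(findings)]
```

An empty result means only that nothing on the deny lists was found, not that the code is safe to run.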
Section sources
- action_manager.py
- enhanced_action_manager.py
- coding.py
- blog.py
- collaborative_task.py
- prompt_manager.py
Troubleshooting Common Issues
Registration Failures
Symptoms: Action not found in registry, ActionException with "Action 'action_name' not found"
Causes and Solutions:
- Class not discovered: Ensure the action class is in a module within core.actions and properly imports the Action base class
- Instantiation error: Check for exceptions in the __init__ method; the discovery mechanism logs instantiation errors
- Name collision: The registry overwrites actions with duplicate names; check logs for overwrite warnings
- Missing required properties: Verify implementation of name, description, and parameters properties
Execution Timeouts
Symptoms: Actions failing with "Action timed out" message
Causes and Solutions:
- Long-running operations: The EnhancedActionManager enforces a 5-minute timeout; optimize action logic or increase timeout if appropriate
- Blocking operations: Ensure all I/O operations use async patterns; use asyncio.to_thread for synchronous calls
- Resource constraints: Check system resources (CPU, memory, disk I/O) that may slow execution
Permission Issues
Symptoms: File not found, permission denied, or subprocess execution failures
Causes and Solutions:
- File access: Verify the application has read/write permissions to specified file paths
- Directory existence: Check that directories exist before writing files
- Subprocess execution: Ensure Python is in the system PATH and the executing user has permission to run subprocesses
- Absolute vs relative paths: Use absolute paths or ensure relative paths are correct relative to the working directory
Parameter Validation Errors
Symptoms: InvalidActionParams exceptions with messages about missing or unexpected parameters
Causes and Solutions:
- Missing required parameters: Ensure all parameters marked as required: True are provided
- Typos in parameter names: Double-check parameter names match exactly
- Extra parameters: Remove parameters not defined in the action's parameters list
- Type mismatches: Ensure parameter values match the expected types
General Debugging Tips
- Check application logs for detailed error messages and stack traces
- Verify action registration by checking the "Available Actions" log output during startup
- Use the get_action_definitions method to inspect registered action schemas
- Test actions in isolation when possible
- Monitor the action log database table for execution history and errors
Conclusion
The Action System in the Ravana AGI framework provides a robust, extensible foundation for autonomous behavior. By defining a clear contract through the Action base class, implementing a flexible registration mechanism in ActionRegistry, and orchestrating execution through ActionManager, the system enables both built-in and custom actions to be seamlessly integrated. The architecture supports various action types including coding, IO, multi-modal, experimental, blog-specific, and now collaborative task actions, allowing the AGI to perform diverse tasks. The recent addition of CollaborativeTaskAction enhances the system's cross-system task delegation capabilities, enabling end-to-end collaborative workflows with user feedback mechanisms. The enhanced prompt management system with structured templates and dynamic enhancement significantly improves code generation quality and safety. While the system includes basic security measures like input validation and execution timeouts, additional safeguards would be beneficial for production deployment, particularly around code execution and permission controls. The automatic discovery mechanism and clear implementation guidelines make it straightforward to extend the AGI's capabilities with new actions, supporting the system's goal of continuous learning and adaptation.
Referenced Files in This Document
- action.py - Base Action class implementation
- registry.py - ActionRegistry with CollaborativeTaskAction registration
- action_manager.py
- coding.py - Enhanced code generation prompt implementation
- io.py
- multi_modal.py
- experimental.py
- exceptions.py
- enhanced_action_manager.py
- blog.py - Added in recent commit
- blog_api.py - Dependency for BlogPublishAction
- blog_content_generator.py - Dependency for BlogPublishAction
- collaborative_task.py - Added in recent commit
- DEVELOPER_GUIDE.md
- prompt_manager.py - Prompt management system with enhanced templates
- code_generation.json - Enhanced code generation prompt template
- llm.py - LLM integration and call handling