ATTENTION: THIS PROJECT IS STILL UNDER DEVELOPMENT; SOME PARTS ARE PLANNED BUT NOT YET IMPLEMENTED.
AMPLEC (Automated Malware-analysis Processing with Language Explanation for Consumers) is a software project to build a system in which a Large Language Model (LLM) interprets and explains the results of Karton, an automated malware analysis pipeline. The goal is to condense the complex data that malware analysis generates into clear, concise natural-language explanations for security analysts.
-
Automated Interpretation:
- Uses an LLM to interpret and explain malware analysis results in natural language.
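
As a rough illustration of the interpretation step, the sketch below turns one analysis result into a prompt that could be sent to an LLM. The function name `build_explanation_prompt` and the fields in `sample` are assumptions made for this example; real Karton output will have a different shape, and the prompt wording is only a starting point.

```python
import json


def build_explanation_prompt(analysis: dict, audience: str = "security analyst") -> str:
    """Turn a raw analysis result into a plain-language prompt for an LLM."""
    # Serialise deterministically so the same result always yields the same prompt.
    payload = json.dumps(analysis, indent=2, sort_keys=True, default=str)
    return (
        f"You are assisting a {audience}.\n"
        "Explain the following automated malware analysis result in clear, "
        "concise natural language. Only use facts present in the data.\n\n"
        f"Analysis result (JSON):\n{payload}\n"
    )


# Hypothetical result for illustration only; not a real Karton schema.
sample = {"sha256": "0" * 64, "tags": ["ransomware"], "yara_matches": ["example_rule"]}
prompt = build_explanation_prompt(sample)
```

Keeping prompt construction in a pure function makes it easy to test without a model attached.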
-
User Interaction via Prompts:
- Provides predefined prompts for users to select relevant interpretations quickly.
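
One simple way to offer predefined prompts is a lookup table keyed by a short name. The keys and prompt texts below are invented for illustration and are not the project's actual prompt set.

```python
# Hypothetical prompt catalogue; the real set of prompts is not yet defined.
PREDEFINED_PROMPTS = {
    "summary": "Summarise what this sample does in three sentences.",
    "iocs": "List the indicators of compromise found in this analysis.",
    "severity": "Assess how dangerous this sample is and explain why.",
}


def select_prompt(key: str) -> str:
    """Return the predefined prompt for `key`, or fail with the valid choices."""
    try:
        return PREDEFINED_PROMPTS[key]
    except KeyError:
        options = ", ".join(sorted(PREDEFINED_PROMPTS))
        raise ValueError(f"Unknown prompt {key!r}; choose one of: {options}") from None
```

For example, `select_prompt("iocs")` returns the indicator-of-compromise prompt, while an unknown key raises a `ValueError` that lists the available choices.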
-
Dynamic Data Handling:
- Manages evolving data from malware analysis, adapting to new threats and pipeline changes.
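
Because the pipeline's output may gain or lose fields over time, one possible approach is to avoid a fixed schema and flatten whatever arrives into dotted paths. The helper below is a sketch of that idea only; the name `flatten` and the sample data are assumptions, not part of Karton or of this project.

```python
def flatten(data, prefix=""):
    """Flatten nested dicts and lists into {"dotted.path": value} pairs."""
    items = {}
    if isinstance(data, dict):
        for key, value in data.items():
            items.update(flatten(value, f"{prefix}{key}."))
    elif isinstance(data, list):
        for index, value in enumerate(data):
            items.update(flatten(value, f"{prefix}{index}."))
    else:
        # Leaf value: drop the trailing dot that was added by the parent level.
        items[prefix[:-1]] = data
    return items


# Unknown or newly added fields are preserved rather than rejected.
flat = flatten({"static": {"entropy": 7.2}, "tags": ["packed", "upx"]})
```

Here `flat` contains the keys `static.entropy`, `tags.0` and `tags.1`, so a field added to the pipeline later still reaches the LLM without code changes.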
-
System Integration:
- Integrates with existing systems via APIs using Python and Flask.
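
A minimal sketch of what such an API endpoint could look like with Flask is shown below. The `/explain` route, the `analysis` field and the `explain` callable are all hypothetical names chosen for this example. The request validation lives in a plain function so it can be exercised without a running server.

```python
def explain_payload(payload, explain):
    """Validate a request body and return (response_body, http_status)."""
    if not isinstance(payload, dict) or "analysis" not in payload:
        return {"error": "request body must be JSON with an 'analysis' field"}, 400
    return {"explanation": explain(payload["analysis"])}, 200


def create_app(explain):
    """Build the Flask app; `explain` is any callable mapping an analysis to text."""
    # Imported here so the validation logic above has no Flask dependency.
    from flask import Flask, jsonify, request

    app = Flask(__name__)

    @app.post("/explain")  # hypothetical route name
    def explain_route():
        # silent=True yields None instead of raising on a missing or invalid JSON body.
        body, status = explain_payload(request.get_json(silent=True), explain)
        return jsonify(body), status

    return app
```

With this layout, `create_app(my_explain_function).run()` would start a development server, and the LLM call can be swapped for a stub during testing.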
-
Optional Advanced Features:
- Retrieval Augmented Generation (RAG): Adds context from external data sources.
- Function Calling: Allows the LLM to trigger further analysis tasks.
- Open Prompting: Users can create custom prompts for more flexibility.
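
To illustrate the function-calling idea, the sketch below assumes the LLM replies with a small JSON object naming a tool and its arguments. That JSON shape, the `tool` decorator and `request_reanalysis` are assumptions for this example; an actual model or framework may use a different calling convention.

```python
import json

# Registry of functions the LLM is allowed to trigger.
TOOLS = {}


def tool(func):
    """Register a function under its own name."""
    TOOLS[func.__name__] = func
    return func


@tool
def request_reanalysis(sha256: str) -> str:
    # Placeholder: a real implementation would submit a task to the pipeline.
    return f"queued reanalysis for {sha256}"


def dispatch(llm_output: str) -> str:
    """Run the tool named in JSON like {"tool": "...", "arguments": {...}}."""
    call = json.loads(llm_output)
    name = call.get("tool")
    if name not in TOOLS:
        # Never execute anything that was not explicitly registered.
        raise ValueError(f"LLM requested unknown tool: {name!r}")
    return TOOLS[name](**call.get("arguments", {}))
```

Restricting dispatch to an explicit registry keeps the model from invoking arbitrary code, which matters in a malware-analysis setting.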
-
Architecture:
- Three main components: the Karton malware analysis system, the LLM, and a web interface/API.
- The LLM interprets analysis results and communicates with the web interface.
-
System Connections:
- Uses APIs for component communication, implemented in Python.
-
LLM Implementation:
- Selects and configures a local LLM.
- Uses LangChain to integrate actions and a potential RAG extension.
- The system's effectiveness will be tested with known malware samples and compared with existing tools and human analysts to identify strengths and areas for improvement.
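
One way the planned evaluation could be quantified is to compare the system's verdicts on known samples with reference labels. The sketch below assumes each verdict is reduced to a simple label such as "malicious" or "benign"; that reduction, the function name `score` and the choice of metrics are illustrative, not a decided methodology.

```python
def score(predicted: list[str], expected: list[str], positive: str = "malicious") -> dict:
    """Compute accuracy, precision and recall for verdicts against reference labels."""
    if len(predicted) != len(expected):
        raise ValueError("predicted and expected must have the same length")
    pairs = list(zip(predicted, expected))
    true_pos = sum(p == positive and e == positive for p, e in pairs)
    false_pos = sum(p == positive and e != positive for p, e in pairs)
    false_neg = sum(p != positive and e == positive for p, e in pairs)
    correct = sum(p == e for p, e in pairs)
    return {
        "accuracy": correct / len(pairs) if pairs else 0.0,
        "precision": true_pos / (true_pos + false_pos) if true_pos + false_pos else 0.0,
        "recall": true_pos / (true_pos + false_neg) if true_pos + false_neg else 0.0,
    }
```

The same function can score an existing tool or a human analyst on the same sample set, which gives a like-for-like comparison.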