Malware AI Agent

The Malware AI Agent is a tool for malware analysis and threat intelligence generation. It collects malware metadata from multiple sources (MalwareBazaar, VirusTotal, VirusShare, and local samples), extracts features, classifies samples using machine learning, and generates threat intelligence reports in JSON, Markdown, and text formats. It uses YARA rules for static analysis, a Random Forest classifier for malware type prediction, and environment variables to keep API keys out of the source code.

Features

  • Multi-Source Data Collection: Fetches malware metadata from:
    • MalwareBazaar (abuse.ch)
    • VirusTotal (optional, with API key)
    • VirusShare (optional, with API key)
    • Local sample directories
  • Feature Extraction:
    • Static analysis using YARA rules and pattern matching
    • Metadata-based features (file size, type, tags, etc.)
    • String analysis for suspicious patterns (URLs, IP addresses, registry keys)
    • PE file analysis for Windows executables
  • Machine Learning Classification: Uses a Random Forest Classifier to predict malware types (e.g., ransomware, trojan, botnet).
  • Threat Intelligence Reports: Generates reports in JSON, Markdown, and text formats, summarizing malware families, file types, and predictions.
  • Secure Configuration: Stores API keys in a .env file or secure configuration file, excluded from version control.
  • Extensible Design: Supports additional data sources, YARA rules, and machine learning models (e.g., deep learning with transformers and torch).
  • Logging and Error Handling: Comprehensive logging to malware_agent.log for debugging and monitoring.
  • Safe Malware Handling: Optional sample downloading with containment in a quarantine directory (disabled by default).
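
As a rough illustration of the string-analysis and entropy features listed above, the sketch below scans raw bytes for URLs, IPv4 addresses, and registry keys, and computes Shannon entropy. The function names and regular expressions are assumptions made for this example; they are not taken from the project's FeatureExtractor.

```python
import math
import re
from collections import Counter
from typing import List, Tuple

# Hypothetical patterns for illustration only; the project keeps its own
# definitions in patterns/malware_patterns.json.
SUSPICIOUS_PATTERNS = {
    "url": re.compile(rb"https?://[\w./:%?&=#~-]+"),
    "ip": re.compile(rb"\b(?:\d{1,3}\.){3}\d{1,3}\b"),
    "registry": re.compile(rb"HKEY_[A-Z_]+\\[\w\\]+"),
}


def shannon_entropy(data: bytes) -> float:
    """Return the Shannon entropy of data in bits per byte (0.0 to 8.0)."""
    if not data:
        return 0.0
    total = len(data)
    counts = Counter(data)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def find_suspicious_strings(data: bytes) -> List[Tuple[str, str]]:
    """Return (category, matched text) pairs for every pattern hit in data."""
    hits = []
    for category, pattern in SUSPICIOUS_PATTERNS.items():
        for match in pattern.findall(data):
            hits.append((category, match.decode("ascii", errors="replace")))
    return hits
```

Values near 8.0 bits per byte usually indicate compressed or encrypted content, which is why entropy is a common static feature for packed samples.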

Prerequisites

  • Python 3.8+
  • API Keys (optional, depending on data sources): MalwareBazaar, VirusTotal, VirusShare
  • System Requirements:
    • Adequate disk space for data, models, and samples (if downloading)
    • Optional: GPU for deep learning with torch (if enabled)
  • Dependencies: Listed in requirements.txt (see Installation section)

Installation

Follow these steps to set up the Malware AI Agent:

  1. Clone the Repository:

    git clone https://github.com/ties2/malware-ai-agent
    cd malware-ai-agent
  2. Create a Virtual Environment (recommended):

    python -m venv venv
    source venv/bin/activate  # Linux/macOS
    venv\Scripts\activate     # Windows
  3. Install Dependencies:

    pip install -r requirements.txt
  4. Configure API Keys:

    • Create a .env file in the project root:
      MALWAREBAZAAR_API_KEY=your_malwarebazaar_api_key
      VIRUSTOTAL_API_KEY=your_virustotal_api_key
      VIRUSSHARE_API_KEY=your_virusshare_api_key
      
    • Alternatively, create a ~/.malware_agent_config.json file:
      {
          "MALWAREBAZAAR_API_KEY": "your_malwarebazaar_api_key",
          "VIRUSTOTAL_API_KEY": "your_virustotal_api_key",
          "VIRUSSHARE_API_KEY": "your_virusshare_api_key"
      }
    • Set file permissions (Linux/macOS):
      chmod 600 .env
      chmod 600 ~/.malware_agent_config.json
  5. Verify Setup:

    • Run the script to ensure dependencies and configuration are correct:
      python malware_agent.py
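
To make the key configuration in step 4 concrete, here is a minimal sketch of how a key could be resolved: the environment is checked first, then the JSON configuration file. The function name `load_api_key` and the lookup order are assumptions, not the project's confirmed behavior. Python does not read .env files on its own, so the sketch assumes the values have already been exported into the process environment, for example by a loader such as python-dotenv.

```python
import json
import os
from pathlib import Path
from typing import Optional

# Path taken from the installation steps; the lookup order is an assumption.
CONFIG_PATH = Path.home() / ".malware_agent_config.json"


def load_api_key(name: str, config_path: Path = CONFIG_PATH) -> Optional[str]:
    """Return the API key called name, or None if it is not configured."""
    # 1. Prefer the process environment (populated from .env by a loader).
    value = os.environ.get(name)
    if value:
        return value
    # 2. Fall back to the JSON configuration file, if it exists.
    if config_path.is_file():
        with config_path.open(encoding="utf-8") as fh:
            return json.load(fh).get(name) or None
    return None
```

Returning `None` for a missing key lets the caller skip an optional source such as VirusTotal instead of failing at startup.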

Usage

  1. Run the Script:

    python malware_agent.py

    This will:

    • Collect malware metadata from configured sources
    • Extract features and classify malware
    • Generate reports in output/ (JSON, Markdown, text)
  2. Customize Configuration (optional):

    • Edit malware_agent.py to modify Config settings, such as:
      • DATASOURCES: Add/remove sources (["malwarebazaar", "local", "virusshare", "virustotal"])
      • DOWNLOAD_SAMPLES: Set to True to download samples (use with caution in a secure environment)
      • REPORT_FORMATS: Choose output formats (["json", "markdown", "txt"])
  3. View Output:

    • Check output/ for generated reports (e.g., malware_report_20250428_120000.md)
    • Review malware_agent.log for execution details and errors
  4. Safety Note:

    • Do not enable DOWNLOAD_SAMPLES unless running in a secure, isolated environment (e.g., a sandbox or VM).
    • Ensure .env or ~/.malware_agent_config.json is excluded from version control (see .gitignore).
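
The settings named in step 2 could be modeled along the following lines. This is a hypothetical stand-in for the `Config` in malware_agent.py: the setting names and allowed values come from this README, while the dataclass layout, the default source list, and the `validate` method are assumptions.

```python
from dataclasses import dataclass, field
from typing import List

# Allowed values as listed in the Usage section.
KNOWN_SOURCES = {"malwarebazaar", "local", "virusshare", "virustotal"}
KNOWN_FORMATS = {"json", "markdown", "txt"}


@dataclass
class Config:
    """Hypothetical sketch of the Config settings; not the project's class."""

    # Assumed default: the two sources that need no optional API key setup.
    DATASOURCES: List[str] = field(default_factory=lambda: ["malwarebazaar", "local"])
    # Sample downloading is disabled by default, as stated under Features.
    DOWNLOAD_SAMPLES: bool = False
    REPORT_FORMATS: List[str] = field(default_factory=lambda: ["json", "markdown", "txt"])

    def validate(self) -> None:
        """Raise ValueError if a source or report format is not recognized."""
        unknown_sources = set(self.DATASOURCES) - KNOWN_SOURCES
        if unknown_sources:
            raise ValueError(f"Unknown data sources: {sorted(unknown_sources)}")
        unknown_formats = set(self.REPORT_FORMATS) - KNOWN_FORMATS
        if unknown_formats:
            raise ValueError(f"Unknown report formats: {sorted(unknown_formats)}")
```

Validating early means a typo such as `"virus_total"` fails before any network request is made.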

Sample Output

Below are excerpts from sample reports generated by the Malware AI Agent.

Markdown Report (malware_report_20250428_120000.md)

# Malware Analysis Report

**Generated on:** 2025-04-28 12:00:00

## Summary
- **Total Samples Analyzed:** 50
- **Detected Malware Types:**
  - ransomware: 15 samples
  - trojan: 20 samples
  - botnet: 10 samples
  - unknown: 5 samples

## Detailed Analysis

### Sample 1
- **SHA256 Hash:** a1b2c3d4e5f6...
- **File Name:** sample1.exe
- **Predicted Malware Type:** ransomware
- **Confidence:** 85.23%
- **Probability Distribution:**
  - ransomware: 85.23%
  - trojan: 10.12%
  - botnet: 4.65%
- **Key Features:**
  - file_size: 204800
  - is_pe: 1
  - tag_ransomware: 1
  - static_entropy: 7.8
  - static_suspicious_string_count: 3
- **Static Analysis:**
  - YARA Matches:
    - Rule: Ransomware_Generic (Detects potential ransomware characteristics)
  - Suspicious Strings:
    - "your files have been encrypted" (ransomware)
    - "bitcoin payment" (ransomware)
    - "http://malicious.site" (url)

### Sample 2
- **SHA256 Hash:** f6e5d4c3b2a1...
- **File Name:** trojan.dll
- **Predicted Malware Type:** trojan
- **Confidence:** 92.15%
...

## Notes
- This report was generated automatically by the Malware AI Agent.
- For detailed technical information, refer to the JSON report.
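
In the sample report above, the confidence equals the highest value in the probability distribution, and the summary counts samples per predicted type. The sketch below shows one way to derive both, assuming per-class probabilities are available as a dictionary (for instance built from a classifier's `predict_proba` output). The helper names are hypothetical.

```python
from collections import Counter
from typing import Dict, List, Tuple


def top_prediction(probabilities: Dict[str, float]) -> Tuple[str, float]:
    """Return the most likely label and its probability."""
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


def render_summary(predicted_types: List[str]) -> str:
    """Render a Summary section in the Markdown layout shown above."""
    counts = Counter(predicted_types)
    lines = [
        "## Summary",
        f"- **Total Samples Analyzed:** {len(predicted_types)}",
        "- **Detected Malware Types:**",
    ]
    lines += [f"  - {label}: {n} samples" for label, n in counts.items()]
    return "\n".join(lines)


# Values copied from Sample 1 in the report excerpt.
sample = {"ransomware": 0.8523, "trojan": 0.1012, "botnet": 0.0465}
label, confidence = top_prediction(sample)
print(f"{label}: {confidence:.2%}")  # prints "ransomware: 85.23%"
```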

Text Report (malware_report_20250428_120000.txt)

Malware Analysis Report
==============================
Generated on: 2025-04-28 12:00:00

Summary
------------------------------
Total Samples Analyzed: 50
Detected Malware Types:
  - ransomware: 15 samples
  - trojan: 20 samples
  - botnet: 10 samples
  - unknown: 5 samples

Detailed Analysis
------------------------------
Sample 1
  SHA256 Hash: a1b2c3d4e5f6...
  File Name: sample1.exe
  Predicted Malware Type: ransomware
  Confidence: 85.23%
  Probability Distribution:
    - ransomware: 85.23%
    - trojan: 10.12%
    - botnet: 4.65%

Sample 2
  SHA256 Hash: f6e5d4c3b2a1...
  File Name: trojan.dll
  Predicted Malware Type: trojan
  Confidence: 92.15%
...

Notes
------------------------------
- This report was generated automatically by the Malware AI Agent.

Log File (malware_agent.log)

2025-04-28 12:00:00,000 - malware_ai_agent - INFO - Starting Malware AI Agent
2025-04-28 12:00:00,001 - malware_ai_agent - INFO - Loaded configuration from .env
2025-04-28 12:00:00,002 - malware_ai_agent - INFO - Created directory: output
2025-04-28 12:00:01,123 - malware_ai_agent - INFO - Collecting data from sources: malwarebazaar, local, virusshare
2025-04-28 12:00:05,456 - malware_ai_agent - INFO - Collected 50 unique samples from 3 sources
2025-04-28 12:00:10,789 - malware_ai_agent - INFO - Loaded model from models/malware_model.pkl
2025-04-28 12:00:12,234 - malware_ai_agent - INFO - Report generated: output/malware_report_20250428_120000.md
2025-04-28 12:00:12,345 - malware_ai_agent - INFO - Malware AI Agent completed successfully
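
Log lines in this shape can be produced with the standard `logging` module. The format string below is inferred from the sample lines above and is an assumption rather than the project's confirmed configuration; `build_logger` is a hypothetical helper name.

```python
import logging


def build_logger(log_path: str = "malware_agent.log") -> logging.Logger:
    """Return a logger that writes 'timestamp - name - level - message' lines."""
    logger = logging.getLogger("malware_ai_agent")
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid attaching duplicate handlers on repeated calls
        handler = logging.FileHandler(log_path, encoding="utf-8")
        handler.setFormatter(
            logging.Formatter("%(asctime)s - %(name)s - %(levelname)s - %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

Calling `build_logger().info("Starting Malware AI Agent")` appends a line with the default `asctime` layout, which includes milliseconds after a comma as in the excerpt.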

CI/CD

This project uses GitHub Actions for continuous integration and deployment:

  • Linting: Runs flake8 to enforce code style.
  • Formatting: Uses black to check code formatting.
  • Type Checking: Executes mypy for static type analysis.
  • Testing: Runs pytest for unit tests (add tests in a tests/ directory).
  • Build: Ensures dependencies are installed and the script runs without errors.

Testing

The project includes unit tests for key components using pytest. To run tests:

pytest tests/
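
For orientation, a pytest-style test might look like the sketch below. It checks that report file names follow the `malware_report_YYYYMMDD_HHMMSS` pattern seen in the Usage section. The helper `report_filename` is hypothetical and is defined inline so the example is self-contained; in practice a test would import the function under test from the project.

```python
from datetime import datetime

# Maps REPORT_FORMATS values to file extensions, as implied by the output listing.
EXTENSIONS = {"json": "json", "markdown": "md", "txt": "txt"}


def report_filename(fmt: str, when: datetime) -> str:
    """Hypothetical helper: build a timestamped report file name."""
    return f"malware_report_{when:%Y%m%d_%H%M%S}.{EXTENSIONS[fmt]}"


def test_report_filename_matches_documented_pattern() -> None:
    when = datetime(2025, 4, 28, 12, 0, 0)
    assert report_filename("markdown", when) == "malware_report_20250428_120000.md"
```

Saved as, say, tests/test_reports.py (a hypothetical path), the file would be picked up by the `pytest tests/` command shown above.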

Project Structure

malware-ai-agent/
├── .env                    # Environment file for API keys (e.g., MalwareBazaar, VirusTotal, VirusShare)
├── .gitignore              # Specifies files/directories to ignore in version control (e.g., .env, logs, data)
├── requirements.txt        # Lists Python dependencies (e.g., Flask, pandas, scikit-learn, yara-python)
├── malware_agent.py        # Core Malware AI Agent script containing DataCollector, FeatureExtractor, etc.
├── Dashboard.py            # Flask web application for the black and green themed dashboard
├── templates/              # HTML templates for Flask web interface
│   ├── index.html          # Home page for initiating analysis or file uploads
│   ├── upload.html         # Page for uploading files for malware analysis
│   ├── report.html         # Page for viewing generated analysis reports
├── static/                 # Static assets for web interface
│   ├── css/
│   │   ├── style.css       # CSS file defining the black and green theme
├── data/                   # Stores collected data and extracted features
│   ├── malware_data.json   # Collected malware metadata
│   ├── extracted_features.pkl # Extracted feature data
│   ├── downloaded_samples.json # Metadata for downloaded samples
├── models/                 # Stores trained machine learning models
│   ├── malware_model.pkl   # Trained RandomForestClassifier model
│   ├── model_metadata.json # Model metadata (feature names, labels, timestamp)
├── output/                 # Stores generated analysis reports
│   ├── malware_report_*.{json,md,txt} # Reports in JSON, Markdown, and text formats
├── samples/                # Stores malware samples
│   ├── uploads/            # Subdirectory for user-uploaded files
│   ├── quarantine/         # Subdirectory for downloaded malware samples
├── patterns/               # Stores malware pattern definitions
│   ├── malware_patterns.json # JSON file with regex and hex patterns
├── yara_rules/             # Stores YARA rules for malware detection
│   ├── ransomware.yar      # YARA rule for ransomware detection
│   ├── backdoor.yar        # YARA rule for backdoor detection
│   ├── trojan.yar          # YARA rule for trojan detection
├── malware_agent.log       # Log file for Malware AI Agent operations
├── web_interface.log       # Log file for Flask web interface operations
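
The working directories in the tree above can be created on first run with a few lines of `pathlib`. This is a sketch under assumptions: the directory names are copied from the tree, while `ensure_directories` is a hypothetical helper and may not match how the project sets up its folders.

```python
from pathlib import Path
from typing import List

# Directory names taken from the project tree.
PROJECT_DIRS = [
    "data",
    "models",
    "output",
    "samples/uploads",
    "samples/quarantine",
    "patterns",
    "yara_rules",
]


def ensure_directories(root: Path) -> List[Path]:
    """Create any missing project directories under root; return those created."""
    created = []
    for name in PROJECT_DIRS:
        path = root / name
        if not path.exists():
            path.mkdir(parents=True)
            created.append(path)
    return created
```

Returning only the newly created paths makes it easy to log a message such as "Created directory: output" once per directory.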

Notes

  • Malware Handling: Only enable sample downloading in a secure, isolated environment.

For issues or feature requests, open an issue on the GitHub repository.
