A command-line tool for ingesting and querying Software Bill of Materials (SBOM) files. It parses SBOM files in CycloneDX 1.6 JSON format, stores them in a local SQLite database, and lets you search for components and licenses across your SBOM library.
- Ingest SBOMs: Parse and store CycloneDX 1.6 JSON SBOM files
- Query by Component: Find all SBOMs containing a specific component (with optional version filtering)
- Query by License: Search for SBOMs by license identifier or name
- SQLite Storage: Lightweight, file-based database with no external dependencies
- Rich Output: Beautiful, formatted tables in the terminal
- Clone this repository or navigate to the project directory
- Create a virtual environment (recommended):
python3 -m venv venv
source venv/bin/activate  # On Windows: venv\Scripts\activate
- Install the CLI tool:
pip install -e .
Parse and store an SBOM file in the database:
sbom-cli ingest <sbom-file.json>
Example:
sbom-cli ingest examples/sample-sbom.json
Find all SBOMs containing a specific component:
sbom-cli query --component <component-name>
Find SBOMs with a specific component version:
sbom-cli query --component <component-name> --version <version>
Examples:
sbom-cli query --component numpy
sbom-cli query --component requests --version 2.31.0
Find all SBOMs containing a specific license:
sbom-cli query --license <license-identifier>
Example:
sbom-cli query --license MIT
sbom-cli query --license Apache-2.0
- main.py: CLI interface built with Typer, handles commands and user interaction
- database.py: SQLite database management with schema definition and CRUD operations
- parser.py: SBOM parser supporting CycloneDX 1.6 JSON format
- requirements.txt: Python dependencies (typer, rich)
- setup.py: Package configuration for installation
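As a rough illustration of what the parsing step in parser.py might involve, the sketch below pulls component and license fields out of a CycloneDX JSON document. The function name `parse_cyclonedx` and the shape of the returned dictionaries are assumptions made for this example, not the project's actual API; only the CycloneDX field names (`bomFormat`, `components`, `licenses`, `license.id`, `license.name`, `expression`, `purl`) come from the format itself.

```python
import json
from typing import Any


def parse_cyclonedx(text: str) -> list[dict[str, Any]]:
    """Extract top-level components and their licenses from CycloneDX JSON.

    Hypothetical helper: the real parser.py may be organised differently.
    Nested components (a component's own "components" array) are ignored here.
    """
    doc = json.loads(text)
    if doc.get("bomFormat") != "CycloneDX":
        raise ValueError("not a CycloneDX document")

    components = []
    for comp in doc.get("components", []):
        licenses = []
        for entry in comp.get("licenses", []):
            # Each entry holds either a "license" object (with "id" or "name")
            # or an SPDX "expression" string.
            lic = entry.get("license", {})
            value = lic.get("id") or lic.get("name") or entry.get("expression")
            if value:
                licenses.append(value)
        components.append({
            "name": comp["name"],
            "version": comp.get("version"),
            "type": comp.get("type"),
            "purl": comp.get("purl"),
            "licenses": licenses,
        })
    return components
```

Returning plain dictionaries keeps the parser independent of the storage layer, so the same output could feed SQLite inserts or be printed directly.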
The tool uses a normalized SQLite schema with three tables:
- sboms: Stores SBOM metadata (filename, format, version, ingestion timestamp)
- components: Stores component information (name, version, type, PURL)
- licenses: Stores license information linked to components
Indexes are created on frequently queried fields (component name, version, license ID) to speed up lookups.
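The three-table layout and its indexes could look roughly like the sketch below. The column names, index names, and the `find_sboms_by_component` helper are assumptions chosen to match the description above; the actual definitions in database.py may differ.

```python
import sqlite3

# Assumed schema: column and index names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS sboms (
    id INTEGER PRIMARY KEY,
    filename TEXT NOT NULL,
    format TEXT,
    version TEXT,
    ingested_at TEXT DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS components (
    id INTEGER PRIMARY KEY,
    sbom_id INTEGER NOT NULL REFERENCES sboms(id),
    name TEXT NOT NULL,
    version TEXT,
    type TEXT,
    purl TEXT
);
CREATE TABLE IF NOT EXISTS licenses (
    id INTEGER PRIMARY KEY,
    component_id INTEGER NOT NULL REFERENCES components(id),
    license_id TEXT,
    license_name TEXT
);
CREATE INDEX IF NOT EXISTS idx_components_name ON components(name);
CREATE INDEX IF NOT EXISTS idx_components_version ON components(version);
CREATE INDEX IF NOT EXISTS idx_licenses_license_id ON licenses(license_id);
"""


def find_sboms_by_component(conn, name, version=None):
    """Return filenames of SBOMs containing the component (hypothetical helper)."""
    query = (
        "SELECT DISTINCT s.filename FROM sboms s "
        "JOIN components c ON c.sbom_id = s.id WHERE c.name = ?"
    )
    params = [name]
    if version is not None:
        query += " AND c.version = ?"
        params.append(version)
    query += " ORDER BY s.filename"
    return [row[0] for row in conn.execute(query, params)]


# In-memory database so the example leaves no file behind.
conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
cur = conn.execute(
    "INSERT INTO sboms (filename, format, version) VALUES (?, ?, ?)",
    ("app.json", "CycloneDX", "1.6"),
)
conn.execute(
    "INSERT INTO components (sbom_id, name, version, type, purl) VALUES (?, ?, ?, ?, ?)",
    (cur.lastrowid, "requests", "2.31.0", "library", "pkg:pypi/requests@2.31.0"),
)
print(find_sboms_by_component(conn, "requests", "2.31.0"))
```

The optional `version` argument mirrors the `--component` / `--version` pair of the CLI: the version filter is appended only when one is given.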
Sample SBOM files are provided in the examples/ directory for testing.
- SQLite: Chosen for its simplicity, zero configuration, and persistence. It suits local usage, and the schema can be migrated to PostgreSQL for production.
- CycloneDX 1.6: Focus on one widely-adopted SBOM format. The parser architecture is extensible for adding SPDX support.
- Typer + Rich: Provides excellent developer experience with type hints, automatic help generation, and beautiful terminal output.
- Normalized Schema: Separate tables for SBOMs, components, and licenses allow for flexible queries and avoid data duplication.
Given the 1-hour time constraint, the following features were left out and could be added:
- SPDX Support: Add parser for SPDX 3.0 format
- Bulk Operations: Batch ingestion of multiple SBOMs
- Advanced Queries: Search by component type, PURL patterns, or multiple criteria
- Export: Export query results to JSON/CSV
- Deduplication: Identify and merge duplicate components across SBOMs
- Vulnerability Integration: Cross-reference with CVE databases
- Web API: REST API for multi-user access
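To make the export idea concrete, here is one way query results might be serialized using only the standard library. The `export_rows` function and the row layout (a list of dictionaries with identical keys) are hypothetical; nothing like this exists in the tool yet.

```python
import csv
import io
import json


def export_rows(rows, fmt="json"):
    """Serialize query result rows to a JSON or CSV string (hypothetical helper)."""
    if fmt == "json":
        return json.dumps(rows, indent=2)
    if fmt == "csv":
        if not rows:
            return ""
        buf = io.StringIO()
        # Column order follows the keys of the first row.
        writer = csv.DictWriter(buf, fieldnames=list(rows[0].keys()), lineterminator="\n")
        writer.writeheader()
        writer.writerows(rows)
        return buf.getvalue()
    raise ValueError(f"unsupported format: {fmt}")


rows = [{"sbom": "app.json", "component": "requests", "version": "2.31.0"}]
print(export_rows(rows, "csv"))
```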
To handle thousands of SBOMs with millions of components:
- Database: Migrate to PostgreSQL with connection pooling
- Indexing: Add full-text search (GIN/GiST indexes) and partial indexes
- Caching: Implement Redis for frequently accessed queries
- Async Processing: Use task queues (Celery) for parallel ingestion
- Partitioning: Partition tables by date or SBOM count
- Monitoring: Add logging, metrics (Prometheus), and tracing
- API Layer: Build FastAPI service with rate limiting and authentication
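The caching item can be illustrated in miniature. The sketch below uses `functools.lru_cache` as an in-process stand-in for Redis, which is a deliberate simplification: a real deployment would need a shared cache and an invalidation strategy. The single-table layout and the `sboms_containing` function are assumptions made to keep the example short.

```python
import sqlite3
from functools import lru_cache

# Simplified, hypothetical single-table layout for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE components (sbom TEXT, name TEXT)")
conn.executemany(
    "INSERT INTO components VALUES (?, ?)",
    [("app.json", "numpy"), ("api.json", "numpy"), ("api.json", "requests")],
)


@lru_cache(maxsize=1024)
def sboms_containing(name: str) -> tuple[str, ...]:
    """Return SBOM filenames containing the component, cached per name."""
    rows = conn.execute(
        "SELECT DISTINCT sbom FROM components WHERE name = ? ORDER BY sbom",
        (name,),
    )
    return tuple(row[0] for row in rows)


sboms_containing("numpy")  # first call runs the SQL query
sboms_containing("numpy")  # second call is served from the cache
print(sboms_containing.cache_info())
```

Any ingestion that changes the underlying tables would need to call `sboms_containing.cache_clear()`, otherwise stale results would be returned.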
MIT