While we initially explored both Node.js and Python implementations, we've found that Node.js offers significant advantages for this project:
- Multiple Error Reporting: The Node.js version, using Ajv and better-ajv-errors, can report multiple validation errors in a single run. This is crucial for efficiently identifying and fixing all issues in complex configurations.
- Detailed Error Information: The Node.js implementation provides more comprehensive error details, including the exact location, context, and suggestions for fixing issues.
- Better Schema Support: Ajv in Node.js offers more extensive JSON Schema support, allowing for more complex and fine-grained validation rules.
- Performance: For large configuration files, Node.js typically offers better performance in parsing and validating JSON and YAML.
- Ecosystem: The Node.js ecosystem has more mature and actively maintained libraries for JSON Schema validation and error reporting.
Given these advantages, we recommend using the Node.js version of the validator for the best experience and most comprehensive validation results.
This project provides a Node.js script to validate YAML configuration files against a JSON schema. It uses Ajv for validation, js-yaml for YAML parsing, and better-ajv-errors for detailed error reporting.
node validate.js
Validating configuration...
Validation errors:
TYPE must be number
216 | {
217 | "module_type": "recency_filter",
> 218 | "threshold": "2015-01-01T00:00:00.000Z"
| ^^^^^^^^^^^^^^^^^^^^^^^^^^ 👈🏽 type must be number
219 | },
220 | {
221 | "module_type": "threshold_cutoff",
Error 1:
Path: /node_lines/1/nodes/3/modules/3/threshold
Message: must be number
Additional details:
{
"type": "number"
}
Problematic config part:
"2015-01-01T00:00:00.000Z"
Error 2:
Path: /node_lines/2/nodes/1/modules/1
Message: must have required property 'module_type'
Additional details:
{
"missingProperty": "module_type"
}
Problematic config part:
{
"module_type1": "openai_llm",
"llm": "gpt-3.5-turbo",
"temperature": 0.8
}
Config is invalid. Please correct the errors and try again.
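The numbered "Error N" entries in the output above can be produced from the error objects that Ajv returns. The following is a minimal sketch, not the project's actual code: it assumes Ajv v8-style error objects (with `instancePath`, `message`, and `params` fields), and `formatErrors` and `sampleErrors` are hypothetical names introduced for illustration.

```javascript
// Format Ajv-style error objects into the "Error N" layout shown above.
// Assumption: each error has instancePath, message, and params (Ajv v8 naming).
function formatErrors(errors) {
  return errors
    .map((err, i) =>
      [
        `Error ${i + 1}:`,
        // An empty instancePath means the error applies to the document root.
        `Path: ${err.instancePath || '(root)'}`,
        `Message: ${err.message}`,
        'Additional details:',
        JSON.stringify(err.params, null, 2),
      ].join('\n')
    )
    .join('\n');
}

// Sample errors copied from the output above; only the fields used here are included.
const sampleErrors = [
  {
    instancePath: '/node_lines/1/nodes/3/modules/3/threshold',
    message: 'must be number',
    params: { type: 'number' },
  },
  {
    instancePath: '/node_lines/2/nodes/1/modules/1',
    message: "must have required property 'module_type'",
    params: { missingProperty: 'module_type' },
  },
];

console.log(formatErrors(sampleErrors));
```

Keeping the formatter separate from the validation call makes it easy to reuse the same report layout for every configuration file that is checked.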
The main objectives of this validator are:
- Verify the structural correctness of YAML configuration files against a JSON schema
- Identify multiple issues in a single run, including type mismatches and missing required fields
- Provide detailed error information with exact locations and context
- Facilitate efficient correction of configuration files
- Install required packages:
  npm install js-yaml ajv better-ajv-errors glob
- Place the JSON schema file (config_schema.json) and YAML configuration files in the config folder.
- Run the validation script:
  node main.js
- The script will validate the configuration file and output the results:
  - If valid: Outputs "Configuration is valid!" message
  - If invalid: Outputs detailed information for all encountered errors
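As an illustration of the folder layout described in the steps above, the sketch below collects the YAML files placed in a `config` folder using only Node's built-in modules. It is an assumption about how file discovery could work, not a description of the actual script (which may rely on the `glob` package instead); `listYamlFiles` and the relative `config` path are hypothetical.

```javascript
const fs = require('fs');
const path = require('path');

// Return the YAML files found directly inside a folder (non-recursive), sorted by name.
function listYamlFiles(dir) {
  return fs
    .readdirSync(dir, { withFileTypes: true })
    .filter((entry) => entry.isFile() && /\.ya?ml$/i.test(entry.name))
    .map((entry) => path.join(dir, entry.name))
    .sort();
}

// Assumed location: a "config" folder relative to the current working directory.
const configDir = 'config';
if (fs.existsSync(configDir)) {
  console.log(listYamlFiles(configDir));
}
```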
- Run the validation script and carefully read all error messages.
- For each error, locate the problematic part in your configuration file using the specified path.
- Modify the configuration file to address all reported errors.
- Re-run the validation script to ensure all issues have been resolved.
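The path printed with each error (for example `/node_lines/1/nodes/3/modules/3/threshold`) has the form of a JSON Pointer, so it can be followed step by step through the parsed configuration to find the offending value. The sketch below shows one way to do that; `resolvePointer` is a hypothetical helper and the `config` object is a trimmed-down, made-up example rather than a real configuration.

```javascript
// Resolve a JSON Pointer (RFC 6901) against a parsed configuration object.
// Returns undefined when any segment of the path does not exist.
function resolvePointer(root, pointer) {
  if (pointer === '') return root; // the empty pointer refers to the whole document
  return pointer
    .split('/')
    .slice(1) // drop the empty segment before the leading "/"
    .map((token) => token.replace(/~1/g, '/').replace(/~0/g, '~')) // unescape per RFC 6901
    .reduce((node, key) => (node == null ? undefined : node[key]), root);
}

// Hypothetical, trimmed-down config used only for illustration.
const config = {
  node_lines: [
    { nodes: [{ modules: [{ module_type: 'bm25' }] }] },
    {
      nodes: [
        {
          modules: [
            { module_type: 'recency_filter', threshold: '2015-01-01T00:00:00.000Z' },
          ],
        },
      ],
    },
  ],
};

console.log(resolvePointer(config, '/node_lines/1/nodes/0/modules/0/threshold'));
// prints 2015-01-01T00:00:00.000Z
```

Array indices in the path are zero-based, so `/node_lines/1` refers to the second entry of `node_lines`.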
- Implement custom validation rules
- Develop a web-based interface for easier interaction
- Integrate with CI/CD pipelines for automated config validation
- Explore options for generating configuration templates based on the schema
Contributions to improve the validator are welcome. Please submit pull requests or open issues to discuss potential improvements or report bugs.
python validate.py
Validating configuration...
Validation errors:
Path: node_lines -> 1 -> nodes -> 3 -> modules -> 3 -> threshold
Message: datetime.date(2015, 1, 1) is not of type 'number'
Problematic config part:
Traceback (most recent call last):
File "/Users/martin/Development/autorag-saas/yaml-schema/validate.py", line 24, in validate_config
validate(instance=config, schema=schema)
File "/Users/martin/.pyenv/versions/3.10.13/lib/python3.10/site-packages/jsonschema/validators.py", line 1307, in validate
raise error
jsonschema.exceptions.ValidationError: datetime.date(2015, 1, 1) is not of type 'number'
Failed validating 'type' in schema['properties']['node_lines']['items']['properties']['nodes']['items']['properties']['modules']['items']['properties']['threshold']:
{'type': 'number'}
On instance['node_lines'][1]['nodes'][3]['modules'][3]['threshold']:
datetime.date(2015, 1, 1)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/Users/martin/Development/autorag-saas/yaml-schema/validate.py", line 51, in <module>
is_valid = validate_config(config, schema)
File "/Users/martin/Development/autorag-saas/yaml-schema/validate.py", line 41, in validate_config
print(json.dumps(current_obj, indent=2))
File "/Users/martin/.pyenv/versions/3.10.13/lib/python3.10/json/__init__.py", line 238, in dumps
**kw).encode(obj)
File "/Users/martin/.pyenv/versions/3.10.13/lib/python3.10/json/encoder.py", line 201, in encode
chunks = list(chunks)
File "/Users/martin/.pyenv/versions/3.10.13/lib/python3.10/json/encoder.py", line 438, in _iterencode
o = _default(o)
File "/Users/martin/.pyenv/versions/3.10.13/lib/python3.10/json/encoder.py", line 179, in default
raise TypeError(f'Object of type {o.__class__.__name__} '
TypeError: Object of type date is not JSON serializable
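The Python run above stops at the first schema violation and then fails a second time, because `json.dumps` cannot serialize the `datetime.date` that the YAML loader produced. The Node.js run shows the same value as the string `"2015-01-01T00:00:00.000Z"`. A minimal sketch of why the Node.js side can still print the value, assuming the YAML parser turns the unquoted timestamp into a JavaScript `Date`:

```javascript
// Assumption: the YAML value 2015-01-01 was parsed as a timestamp,
// so the validator receives a Date object rather than a number.
const threshold = new Date(Date.UTC(2015, 0, 1));

// A Date is an object, which is why a schema requiring "type": "number" rejects it.
console.log(typeof threshold); // prints object

// JSON.stringify calls Date.prototype.toJSON, so the value is emitted as an ISO string
// instead of raising a serialization error.
const printed = JSON.stringify({ module_type: 'recency_filter', threshold }, null, 2);
console.log(printed);
```

In both implementations the underlying fix is the same: `threshold` must be a number in the configuration file.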