Simple script that queries OpenSearch logs and exports them to CSV or JSON.
Create and activate a virtual environment, then install the dependencies (the activation command shown is for Windows):
python -m venv .venv
.venv\Scripts\activate
python -m pip install -r ./requirements.txt
or
pip install -r requirements.txt
The script connects to an OpenSearch cluster using the credentials and connection details provided in parameters.json. It then runs a query built from the configuration in the same file, fetching documents that fall within the specified time range and match the defined criteria. The results are streamed and saved to either a JSON or a CSV file, as configured.
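The streaming step described above can be sketched as follows. This is a minimal illustration, not the script's actual code: the names `flatten` and `write_hits_csv` are hypothetical, and it assumes each hit is a dict carrying its document under `_source`, as in a standard OpenSearch search response. Batches are written as they arrive, so the full result set never has to sit in memory.

```python
import csv
import io
from typing import Any, Dict, Iterable, List, TextIO


def flatten(doc: Dict[str, Any], prefix: str = "") -> Dict[str, Any]:
    """Flatten nested dicts into dotted keys, e.g. fields.eventCode."""
    flat: Dict[str, Any] = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=f"{name}."))
        else:
            flat[name] = value
    return flat


def write_hits_csv(
    batches: Iterable[List[Dict[str, Any]]],
    fieldnames: List[str],
    out: TextIO,
) -> int:
    """Write hits batch by batch; return the number of rows written."""
    # Missing fields become empty cells; fields not listed are ignored.
    writer = csv.DictWriter(
        out, fieldnames=fieldnames, restval="", extrasaction="ignore"
    )
    writer.writeheader()
    written = 0
    for batch in batches:
        for hit in batch:
            writer.writerow(flatten(hit.get("_source", {})))
            written += 1
    return written


if __name__ == "__main__":
    # Two hand-made batches standing in for scroll pages (hypothetical data).
    pages = [
        [{"_source": {"timestamp": "t1", "fields": {"eventCode": "E1"}}}],
        [{"_source": {"timestamp": "t2"}}],
    ]
    buffer = io.StringIO()
    write_hits_csv(pages, ["timestamp", "fields.eventCode"], buffer)
    print(buffer.getvalue())
```

In a real run, `batches` would be fed by successive scroll pages from the cluster rather than a hand-made list.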
The parameters.json file contains all the necessary settings for the script to run. Here's a breakdown of the main sections:
- connection: Specifies the OpenSearch host, port, username, password, and SSL settings.
- index: The index pattern to query (e.g., your-index-pattern-*).
- timespan: Defines the start and end time for the data query in YYYY-MM-DDTHH:mm:ss format.
- query: Contains the specific query details (see below).
- output: Configures the output format (json or csv), file path, and batch size for fetching data.
- scroll: Sets the scroll time for fetching large datasets.
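A quick sanity check of a loaded configuration might look like the sketch below. The section names come from the list above; treating every one of them as required is an assumption, and the helper names `missing_sections` and `parse_time` are hypothetical rather than taken from fetchData.py.

```python
from datetime import datetime
from typing import Any, Dict, List

# Section names from the list above; requiring all of them is an assumption.
EXPECTED_SECTIONS = ("connection", "index", "timespan", "query", "output", "scroll")

# strptime equivalent of the YYYY-MM-DDTHH:mm:ss format.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S"


def missing_sections(params: Dict[str, Any]) -> List[str]:
    """Return the expected top-level sections that are absent from params."""
    return [name for name in EXPECTED_SECTIONS if name not in params]


def parse_time(value: str) -> datetime:
    """Parse a timespan value; raises ValueError if the format does not match."""
    return datetime.strptime(value, TIME_FORMAT)


if __name__ == "__main__":
    partial = {"connection": {}, "index": "your-index-pattern-*"}
    print(missing_sections(partial))
    print(parse_time("2024-01-31T23:59:59"))
```

Checking the configuration up front lets a malformed file fail before any connection to the cluster is attempted.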
The query object within parameters.json allows you to specify the search criteria using the OpenSearch Query DSL.
- _source: (Optional) A list of fields to include in the results. If omitted, all fields are returned.
- bool_conditions: (Optional) Defines boolean clauses (must, should, must_not, filter) to combine multiple query criteria. You can nest boolean queries and use various query types like term, match, range, wildcard, exists, etc.
Example Query Structure:
"query": {
  "_source": [
    "timestamp",
    "applicationName",
    "fields.eventCode"
  ],
  "bool_conditions": {
    "must": [
      {
        "bool": {
          "should": [
            {
              "bool": {
                "must": [
                  {"wildcard": {"applicationName": "app-prefix*"}},
                  {"term": {"fields.eventCode.keyword": "EVENT_CODE_1"}}
                ]
              }
            },
            {
              "bool": {
                "must": [
                  {"wildcard": {"applicationName": "another-app-prefix*"}},
                  {"exists": {"field": "fields.correlationId"}}
                ]
              }
            }
          ],
          "minimum_should_match": 1
        }
      }
    ]
  }
}
This example fetches only the fields listed in _source, for documents where applicationName starts with app-prefix AND fields.eventCode.keyword equals EVENT_CODE_1, OR where applicationName starts with another-app-prefix AND the fields.correlationId field exists.
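One plausible way to turn the query object and the timespan into an OpenSearch request body is sketched below. It is not necessarily how fetchData.py does it: the function name `build_search_body`, the default `timestamp` field used for the time range, and the inclusive bounds are all assumptions.

```python
import copy
from typing import Any, Dict


def build_search_body(
    query_cfg: Dict[str, Any],
    start: str,
    end: str,
    time_field: str = "timestamp",  # assumed name of the date field
) -> Dict[str, Any]:
    """Wrap bool_conditions in a bool query and add a time-range filter."""
    # Copy so the loaded configuration is never mutated.
    bool_query = copy.deepcopy(query_cfg.get("bool_conditions", {}))

    # Inclusive bounds are an assumption; swap in "lt" for a half-open range.
    time_range = {"range": {time_field: {"gte": start, "lte": end}}}

    # A filter clause may be a single object or a list; normalize to a list.
    existing = bool_query.get("filter", [])
    if isinstance(existing, dict):
        existing = [existing]
    bool_query["filter"] = list(existing) + [time_range]

    body: Dict[str, Any] = {"query": {"bool": bool_query}}
    if "_source" in query_cfg:
        body["_source"] = query_cfg["_source"]
    return body


if __name__ == "__main__":
    cfg = {
        "_source": ["timestamp", "applicationName"],
        "bool_conditions": {
            "must": [{"wildcard": {"applicationName": "app-prefix*"}}]
        },
    }
    print(build_search_body(cfg, "2024-01-01T00:00:00", "2024-01-02T00:00:00"))
```

Placing the time range in filter rather than must keeps it out of relevance scoring, which suits an export where only matching matters.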
Once configured, run the script from your activated virtual environment:
python fetchData.py
You can optionally provide a path to a different configuration file:
python fetchData.py /path/to/your/custom_parameters.json
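The optional argument could be handled along these lines. This is a sketch only: `config_path` is a hypothetical helper, and the assumption that the default file is parameters.json in the current working directory is inferred from the description above, not confirmed by the script.

```python
import sys
from typing import List

# Assumed default: parameters.json in the current working directory.
DEFAULT_CONFIG = "parameters.json"


def config_path(argv: List[str]) -> str:
    """Return the first command-line argument if given, else the default path."""
    return argv[1] if len(argv) > 1 else DEFAULT_CONFIG


if __name__ == "__main__":
    print(config_path(sys.argv))
```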