elasticsearch-reindex is a CLI tool for transferring Elasticsearch indexes between different servers.
Install the package using pip:
    pip install elasticsearch-reindex

Ensure the source Elasticsearch host is whitelisted on the destination host by editing the Elasticsearch configuration file on the destination server:

    /etc/elasticsearch/elasticsearch.yml

Add the following line to the file and restart the node so the setting takes effect:

    reindex.remote.whitelist: <es-source-host>:<es-source-port>
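If you want to confirm the setting is in place after the restart, you can query the destination node's settings over its REST API. The snippet below is a minimal standard-library sketch; the `es-dest-host:9200` endpoint is a placeholder, and it assumes the node is reachable without authentication:

    import json
    from urllib.request import urlopen

    # Placeholder endpoint -- replace with your destination Elasticsearch node.
    DEST_HOST = "http://es-dest-host:9200"

    with urlopen(f"{DEST_HOST}/_nodes/settings") as resp:
        data = json.load(resp)

    for node_id, node in data["nodes"].items():
        # Settings come back as a nested object unless ?flat_settings=true is used.
        whitelist = (
            node.get("settings", {})
            .get("reindex", {})
            .get("remote", {})
            .get("whitelist")
        )
        print(f"{node_id}: reindex.remote.whitelist = {whitelist}")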
Use the CLI to migrate data between Elasticsearch instances:

    elasticsearch_reindex \
      --source_host http(s)://es-source-host:es-source-port \
      --source_http_auth username:password \
      --dest_host http(s)://es-dest-host:es-dest-port \
      --dest_http_auth username:password \
      --check_interval 5 \
      --concurrent_tasks 3 \
      -i test_index_1 -i test_index_2

There is also a command alias, elasticsearch-reindex:
    elasticsearch-reindex ...

Required fields:
- `source_host` - Elasticsearch endpoint where data will be extracted.
- `dest_host` - Elasticsearch endpoint where data will be transferred.
Optional fields:
- `source_http_auth` - HTTP Basic authentication username and password for the source host.
- `dest_http_auth` - HTTP Basic authentication username and password for the destination host.
- `check_interval` - Time period (in seconds) between checks of the task status. Default value: 10 seconds.
- `concurrent_tasks` - How many reindex tasks Elasticsearch will process in parallel. Default value: 1 (sync mode).
- `indexes` - List of user-specified ES indexes to migrate instead of all source indexes.
The same migration can be run from Python:

    from elasticsearch_reindex import ReindexManager


    def main() -> None:
        """
        Example reindex function.
        """
        dict_config = {
            "source_host": "http://localhost:9201",
            "dest_host": "http://localhost:9202",
            "check_interval": 20,
            "concurrent_tasks": 5,
        }
        reindex_manager = ReindexManager.from_dict(data=dict_config)
        reindex_manager.start_reindex()


    if __name__ == "__main__":
        main()

With custom user indexes and optional HTTP Basic authentication:
    from elasticsearch_reindex import ReindexManager


    def main() -> None:
        """
        Example reindex function with HTTP Basic authentication.
        """
        dict_config = {
            "source_host": "http://localhost:9201",
            # If the source host requires authentication
            # "source_http_auth": "tmp-source-user:tmp-source-PASSWD.220718",
            "dest_host": "http://localhost:9202",
            # If the destination host requires authentication
            # "dest_http_auth": "tmp-reindex-user:tmp--PASSWD.220718",
            "check_interval": 20,
            "concurrent_tasks": 5,
            "indexes": ["es-index-1", "es-index-2", "es-index-n"],
        }
        reindex_manager = ReindexManager.from_dict(data=dict_config)
        reindex_manager.start_reindex()


    if __name__ == "__main__":
        main()
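To avoid hardcoding credentials, the config dict can also be assembled from environment variables before it is passed to `ReindexManager.from_dict`. This is only a sketch; the `ES_SOURCE_HOST`, `ES_DEST_HOST`, `ES_SOURCE_HTTP_AUTH` and `ES_DEST_HTTP_AUTH` variable names are assumptions, not part of the package:

    import os

    from elasticsearch_reindex import ReindexManager


    def main() -> None:
        """Build the reindex config from environment variables."""
        # The variable names below are examples -- use whatever your deployment provides.
        dict_config = {
            "source_host": os.environ["ES_SOURCE_HOST"],
            "dest_host": os.environ["ES_DEST_HOST"],
            "check_interval": 20,
            "concurrent_tasks": 5,
        }
        # Only pass HTTP Basic auth when the corresponding variable is set.
        if os.getenv("ES_SOURCE_HTTP_AUTH"):
            dict_config["source_http_auth"] = os.environ["ES_SOURCE_HTTP_AUTH"]
        if os.getenv("ES_DEST_HTTP_AUTH"):
            dict_config["dest_http_auth"] = os.environ["ES_DEST_HTTP_AUTH"]

        reindex_manager = ReindexManager.from_dict(data=dict_config)
        reindex_manager.start_reindex()


    if __name__ == "__main__":
        main()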
For local development, set up and activate a Python 3 virtual environment:

    make ve

To install Git hooks:

    make install_hooks

Create a .env file and fill in the data:

    cp .env.example .env

Export the env variables:

    export $(xargs < .env)
The `ENV` variable enables testing mode; to activate test mode, set it to `test`.
Elasticsearch Docker settings:

- `ES_SOURCE_PORT` - Source Elasticsearch port.
- `ES_DEST_PORT` - Destination Elasticsearch port.
- `ES_VERSION` - Elasticsearch version.
- `LOCAL_IP` - Address of your local host machine on the LAN, like 192.168.4.106.
- macOS (find it in the command output):

      ifconfig

- Linux (find it in the command output):

      ip r
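If you prefer to detect the LAN address from code, the standard-library sketch below works on both platforms; the 8.8.8.8 address is only used to select a routable interface, and the UDP connect sends no packets:

    import socket


    def local_ip() -> str:
        """Return the LAN address the OS would use for outbound traffic."""
        with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
            # Connecting a UDP socket only selects a route; nothing is transmitted.
            sock.connect(("8.8.8.8", 80))
            return sock.getsockname()[0]


    if __name__ == "__main__":
        print(local_ip())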
Start Elasticsearch nodes using Docker Compose:

    docker-compose up -d

Verify the Elasticsearch nodes are running:
- Source Elasticsearch:

      curl -X GET $LOCAL_IP:$ES_SOURCE_PORT

- Destination Elasticsearch:

      curl -X GET $LOCAL_IP:$ES_DEST_PORT
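The same check can be scripted in Python. This sketch assumes the variables from .env have already been exported and that neither node requires authentication:

    import os
    from urllib.request import urlopen

    # LOCAL_IP, ES_SOURCE_PORT and ES_DEST_PORT come from the exported .env file.
    local_ip = os.environ["LOCAL_IP"]

    for name, port_var in (("source", "ES_SOURCE_PORT"), ("destination", "ES_DEST_PORT")):
        url = f"http://{local_ip}:{os.environ[port_var]}"
        with urlopen(url) as resp:
            print(f"{name}: {url} -> HTTP {resp.status}")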
Export the PYTHONPATH env variable:

    export PYTHONPATH="."

To run tests with pytest:

    make test

To run tests with pytest and a coverage report:
    make test-cov