myyrakle/clockpipe

clockpipe

  • A data synchronization pipeline for users of self-hosted ClickHouse.
  • Automatically replicates data from a source database into ClickHouse using change data capture (CDC).

Supported Sources

  • PostgreSQL (ready)
  • MongoDB (ready)
  • MySQL (not yet)
  • CassandraDB (not yet)
  • ScyllaDB (not yet)

Install

Build from source code

git clone https://github.com/myyrakle/clockpipe
cd clockpipe
cargo install --path .

Using Cargo

cargo install clockpipe

Using Docker

sudo docker run -v $(pwd)/clockpipe-config.json:/app/config.json --network host myyrakle/clockpipe:v0.5.4

Requirements & Limits

  • Each source has its own prerequisites and limitations.
  • Refer to the documentation for the source you are using:
  1. PostgreSQL
  2. MongoDB

How to Run

  • Prepare a config file. (See the documentation.)
  • List the source tables you want to synchronize. (PostgreSQL example)
    "tables": [
        {
            "schema_name": "public",
            "table_name": "foo"
        },
        {
            "schema_name": "public",
            "table_name": "user_table"
        }
    ]
  • Then run it:
clockpipe run --config-file ./clockpipe-config.json
  • The pipeline queries the source table information, then automatically creates the corresponding tables in ClickHouse and keeps them synchronized.

  • To skip the initial synchronization for a table, set the skip_copy option to true. (CDC-based synchronization still works.)

    "tables": [
        {
            "schema_name": "public",
            "table_name": "user_table",
            "skip_copy": true
        }
    ]
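
When many tables are involved, the "tables" array can be generated rather than written by hand. The following is a minimal sketch, assuming only the three keys shown above (schema_name, table_name, skip_copy). The helper names table_entry and tables_fragment are hypothetical and not part of clockpipe, and the rest of the config file (including where "tables" sits inside it) is not covered here; see the documentation for that.

```python
import json


def table_entry(schema_name, table_name, skip_copy=False):
    """Build one entry of the "tables" array (hypothetical helper).

    Only the keys shown in this README are emitted. "skip_copy" is written
    only when enabled, matching the examples above.
    """
    entry = {"schema_name": schema_name, "table_name": table_name}
    if skip_copy:
        entry["skip_copy"] = True
    return entry


def tables_fragment(entries):
    """Return a {"tables": [...]} object as indented JSON text."""
    return json.dumps({"tables": entries}, indent=4)


if __name__ == "__main__":
    entries = [
        table_entry("public", "foo"),
        # Skip the initial synchronization for this table; CDC still applies.
        table_entry("public", "user_table", skip_copy=True),
    ]
    print(tables_fragment(entries))
```

The printed object is only a fragment: copy its "tables" array into the config file at the place the documentation specifies.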

ETC

  • You can adjust the log level by setting the "RUST_LOG" environment variable to a value such as error, warn, info, or debug.
RUST_LOG=debug clockpipe run --config-file ./clockpipe-config.json
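
If clockpipe is launched from a script rather than a shell, the same invocation can be assembled programmatically. This is a sketch under stated assumptions: build_run_command is a hypothetical helper, its "info" default is a choice made for this example and not a documented clockpipe default, and it accepts only the four levels named above (RUST_LOG may accept other values not listed in this README).

```python
import os

# Levels named in this README; other values may also be valid for RUST_LOG.
KNOWN_LEVELS = ("error", "warn", "info", "debug")


def build_run_command(config_file, log_level="info"):
    """Return (argv, env) for `clockpipe run` with RUST_LOG set.

    Hypothetical helper: it only assembles the command shown above and
    does not start the process.
    """
    if log_level not in KNOWN_LEVELS:
        raise ValueError(f"unexpected log level: {log_level!r}")
    argv = ["clockpipe", "run", "--config-file", config_file]
    # Copy the current environment and add or override RUST_LOG.
    env = dict(os.environ, RUST_LOG=log_level)
    return argv, env


if __name__ == "__main__":
    argv, env = build_run_command("./clockpipe-config.json", "debug")
    # Show the equivalent shell command without executing it.
    print(f"RUST_LOG={env['RUST_LOG']} {' '.join(argv)}")
```

To actually start the pipeline, pass the result to subprocess.run(argv, env=env, check=True) on a machine where clockpipe is installed and on the PATH.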
