Connect any data source, combine data in real time, and instantly get low-latency data APIs.
⚡ All with just a simple configuration! ⚡️
Dozer makes it easy to build low-latency data APIs (gRPC and REST) from any data source. Data is transformed on the fly using Dozer's reactive SQL engine and stored in a high-performance cache to offer the best possible experience. Dozer is useful for quickly building data products.
Follow the instructions below to install Dozer on your machine and run a quick sample using the NY Taxi Dataset.
MacOS Monterey (12) and above
brew tap getdozer/dozer && brew install dozer
Ubuntu 20.04
curl -sLO https://github.com/getdozer/dozer/releases/download/latest/dozer_linux_x86_64.deb && sudo dpkg -i dozer_linux_x86_64.deb
Build from source
cargo install --path dozer-orchestrator --locked
Download sample configuration and data
Create a new empty directory and run the commands below. This will download a sample configuration file and a sample NY Taxi Dataset file.
curl -o dozer-config.yaml https://raw.githubusercontent.com/getdozer/dozer-samples/main/local-storage/dozer-config.yaml
curl --create-dirs -o data/trips/fhvhv_tripdata_2022-01.parquet https://d37ci6vzurychx.cloudfront.net/trip-data/fhvhv_tripdata_2022-01.parquet
Run Dozer binary
dozer -c dozer-config.yaml
Dozer will start processing the data and populating the cache. You can follow the progress of the execution in the console.
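Because the cache fills in the background, a script that starts Dozer may want to wait until the API answers before querying it. The following is a minimal sketch, not part of Dozer: `wait_for_api` is a hypothetical helper, and it assumes the REST endpoint from this sample (`http://localhost:8080/trips/query`) returns a successful HTTP status once the server is ready to serve queries.

```shell
# Hypothetical helper (not shipped with Dozer): poll a REST query endpoint
# until it answers successfully or the attempt budget runs out.
# Usage: wait_for_api URL [ATTEMPTS] [DELAY_SECONDS]
wait_for_api() {
  url=$1
  attempts=${2:-30}   # assumed default: 30 tries
  delay=${3:-2}       # assumed default: 2 seconds between tries
  i=0
  while [ "$i" -lt "$attempts" ]; do
    # -s: silent, -f: treat HTTP errors as failure, -o /dev/null: drop the body
    if curl -sf -o /dev/null -X POST "$url" \
         --header 'Content-Type: application/json' \
         --data-raw '{"$limit":1}'; then
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  return 1
}
```

For example, `wait_for_api http://localhost:8080/trips/query && echo "API is up"` blocks until the endpoint responds, then prints a message. The retry count and delay are arbitrary and should be tuned to the size of the dataset.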
Query the APIs
Once some data has been loaded, you can query the cache using gRPC or REST.
# gRPC
grpcurl -d '{"query": "{\"$limit\": 1}"}' -plaintext localhost:50051 dozer.generated.trips_cache.TripsCaches/query
# REST
curl -X POST http://localhost:8080/trips/query --header 'Content-Type: application/json' --data-raw '{"$limit":3}'
Alternatively, you can use Postman to discover gRPC endpoints through gRPC reflection.
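The REST query above can be wrapped in a small shell function so that the row limit is a parameter. This is only a sketch: `build_query` and `query_trips` are hypothetical names, and the base URL defaults to the `localhost:8080` address used in this sample.

```shell
# Hypothetical helpers for illustration; not part of Dozer.

# Emit a JSON query body with the requested row limit.
# The format string is single-quoted so "$limit" stays literal.
build_query() {
  printf '{"$limit":%d}' "$1"
}

# POST the query to the trips endpoint.
# Usage: query_trips [LIMIT] [BASE_URL]
query_trips() {
  limit=${1:-3}
  base=${2:-http://localhost:8080}
  curl -s -X POST "$base/trips/query" \
    --header 'Content-Type: application/json' \
    --data-raw "$(build_query "$limit")"
}
```

With Dozer running, `query_trips 5` would request five rows from the sample `trips` endpoint. Keeping the payload builder separate makes it easy to check the JSON without a running server.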
Read more about Dozer here. And remember to star 🌟 our repo to support us!
Check out Dozer's samples repository for more comprehensive examples and use case scenarios.