🔥 Deploy machine learning models in Ruby (and Rails)
Works great with XGBoost, Torch.rb, fastText, and many other gems
Add this line to your application’s Gemfile:
gem "trove"
And run:
bundle install
trove init
And configure your storage in .trove.yml.
Create a bucket and enable object versioning.
Next, set up your AWS credentials. You can use the AWS CLI:
pip install awscli
aws configure
Or environment variables:
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=...
IAM users need:
- s3:GetObject and s3:GetObjectVersion to pull files
- s3:PutObject to push files
- s3:ListBucket and s3:ListBucketVersions to list files and versions
- s3:DeleteObject and s3:DeleteObjectVersion to delete files
Here’s an example policy:
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Trove",
      "Effect": "Allow",
      "Action": [
        "s3:GetObject",
        "s3:GetObjectVersion",
        "s3:PutObject",
        "s3:ListBucket",
        "s3:ListBucketVersions",
        "s3:DeleteObject",
        "s3:DeleteObjectVersion"
      ],
      "Resource": [
        "arn:aws:s3:::my-bucket",
        "arn:aws:s3:::my-bucket/trove/*"
      ]
    }
  ]
}
If your production servers only need to pull files, only give them s3:GetObject and s3:GetObjectVersion permissions.
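As a rough illustration of the pull-only case, the sketch below builds a reduced policy in Ruby and prints it as JSON. The method name `pull_only_policy` and the `TrovePullOnly` statement ID are made up for this example, and the bucket and prefix are the same placeholders used in the policy above. Both actions apply to objects, so only the object ARN is listed.

```ruby
require "json"

# Builds a read-only variant of the example policy.
# Bucket name and prefix are placeholders; adjust them to your storage path.
def pull_only_policy(bucket: "my-bucket", prefix: "trove")
  {
    "Version" => "2012-10-17",
    "Statement" => [
      {
        "Sid" => "TrovePullOnly", # hypothetical statement ID
        "Effect" => "Allow",
        "Action" => ["s3:GetObject", "s3:GetObjectVersion"],
        "Resource" => ["arn:aws:s3:::#{bucket}/#{prefix}/*"]
      }
    ]
  }
end

puts JSON.pretty_generate(pull_only_policy)
```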
Git is great for code, but it’s not ideal for large files like models. Instead, we use an object store like Amazon S3 to store and version them.
Trove creates a trove directory for you to use as a workspace. Files in this directory are ignored by Git but can be pushed and pulled from the object store. By default, files are tracked in .trove.yml to make it easy to deploy specific versions with code changes.
Use the trove directory to save and load models.
# training code
model.save_model("trove/model.bin")
# prediction code
model = FastText.load_model("trove/model.bin")
When a model is ready, push it to the object store with:
trove push model.bin
And commit the changes to .trove.yml. The model is now ready to be deployed.
We recommend pulling files during the build process.
Make sure your storage credentials are available in the build environment.
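One way to catch a missing credential early is a small check at the start of the build. This is only a sketch: the helper `missing_aws_env` is hypothetical, and it assumes you are using the environment variables from the setup section. If your build environment gets credentials another way (for example, from the AWS CLI configuration), this check does not apply.

```ruby
# Environment variables named in the setup section.
REQUIRED_AWS_ENV = %w[AWS_ACCESS_KEY_ID AWS_SECRET_ACCESS_KEY AWS_REGION].freeze

# Returns the names of required variables that are unset or blank.
# Accepts any hash-like object so it can be checked without touching ENV.
def missing_aws_env(env = ENV)
  REQUIRED_AWS_ENV.select { |name| env[name].to_s.strip.empty? }
end

missing = missing_aws_env
warn "Missing AWS settings: #{missing.join(", ")}" unless missing.empty?
```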
Add to your Rakefile:
Rake::Task["assets:precompile"].enhance do
  Trove.pull
end
This will pull files at the very end of the asset precompile. Check the build output for:
remote: Pulling model.bin...
remote: Asset precompilation completed (30.00s)
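To see why the pull happens at the end, here is a small standalone sketch of how Rake's `enhance` behaves when given a block: the block is appended to the task's existing actions. It uses a stand-in `assets:precompile` task and a log entry in place of `Trove.pull`, so it runs without Rails or Trove. The method name `enhance_order_demo` is made up for this example.

```ruby
require "rake"

# Runs a stand-in "assets:precompile" task and returns the order in which
# its actions fired.
def enhance_order_demo
  Rake::Task.clear # start from a clean task list
  log = []

  # Stand-in for the real Rails task.
  Rake::Task.define_task("assets:precompile") { log << "precompile" }

  # Same pattern as the Rakefile snippet; Trove.pull is replaced by a log entry.
  Rake::Task["assets:precompile"].enhance { log << "pull" }

  Rake::Task["assets:precompile"].invoke
  log
end

p enhance_order_demo
```

The original task body runs first and the enhanced block runs after it, which matches the order of the build output shown above.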
Add to your Dockerfile:
RUN bundle exec trove pull
Push a file
trove push model.bin
Pull all files in .trove.yml
trove pull
Pull a specific file (uses the version in .trove.yml if present)
trove pull model.bin
Pull a specific version of a file
trove pull model.bin --version 123
Delete a file
trove delete model.bin
List files
trove list
List versions
trove versions model.bin
You can use the Ruby API in addition to the CLI.
Trove.push(filename)
Trove.pull
Trove.pull(filename)
Trove.pull(filename, version: version)
Trove.delete(filename)
Trove.list
Trove.versions(filename)
This makes it easy to perform operations from code, IRuby notebooks, and the Rails console.
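As one possible way to use the API from training code, the sketch below wraps "save the model into the workspace, then push it" in a single helper. `save_and_push` is a hypothetical helper, not part of Trove. It takes the store as an argument (anything that responds to `push(filename)`, such as `Trove`), and it assumes the workspace is the `trove` directory relative to the current directory.

```ruby
require "fileutils"

# Hypothetical helper (not part of Trove): yields a path inside the workspace,
# checks that the block wrote a file there, then pushes it by filename.
def save_and_push(filename, store:, dir: "trove")
  FileUtils.mkdir_p(dir)
  path = File.join(dir, filename)
  yield path # training code writes the model to this path
  raise ArgumentError, "no file written to #{path}" unless File.exist?(path)
  store.push(filename)
  path
end

# Usage, assuming the trove gem is loaded and `model` responds to save_model:
#   save_and_push("model.bin", store: Trove) { |path| model.save_model(path) }
```

Passing the store in keeps the helper easy to exercise with a stand-in object before pointing it at a real bucket.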
By default, Trove tracks files in .trove.yml to make it easy to deploy specific versions with code changes. However, this functionality is entirely optional. Disable it with:
vcs: false
This is useful if you want to automate training or build more complex workflows.
Trove can be used in non-Ruby projects as well.
gem install trove
trove init
View the changelog
Everyone is encouraged to help improve this project. Here are a few ways you can help:
- Report bugs
- Fix bugs and submit pull requests
- Write, clarify, or fix documentation
- Suggest or add new features
To get started with development:
git clone https://github.com/ankane/trove.git
cd trove
bundle install
export AWS_ACCESS_KEY_ID=...
export AWS_SECRET_ACCESS_KEY=...
export AWS_REGION=...
export S3_BUCKET=my-bucket
bundle exec rake test