imputr/imputr

🎯 What is Imputr?

Imputr is an open-source library for imputing missing values in tabular datasets with both ML-based and conventional techniques. Its API is designed to be simple yet extensive, so users of any experience level can fit the library into their workflows.

🚀 Getting started

Install Imputr with pip:

pip install imputr

AutoImputer

The simplest way to use Imputr is through the AutoImputer, our recommended workflow for beginner and intermediate users. By default, it imputes the missing values in every column using a modern version of the missForest algorithm.

from imputr import AutoImputer
import pandas as pd

# Import dataset with missing values
df = pd.read_csv("example.csv")

# Initialize AutoImputer with data 
imputer = AutoImputer(data=df)

# Retrieve fully imputed dataset
imputed_df = imputer.impute()
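
To confirm that nothing is left unfilled, it can help to count missing values per column before and after imputation. The sketch below uses only pandas; the `missing_report` helper and the toy data are hypothetical and not part of Imputr, and it assumes that `impute()` returns a pandas DataFrame.

```python
import numpy as np
import pandas as pd


def missing_report(frame: pd.DataFrame) -> pd.Series:
    """Return the number of missing values in each column."""
    return frame.isna().sum()


# Made-up data standing in for example.csv
df = pd.DataFrame(
    {
        "age": [25.0, np.nan, 47.0, 31.0],
        "city": ["Oslo", "Bergen", None, "Oslo"],
    }
)

before = missing_report(df)
print(before)
```

Calling `missing_report(imputed_df)` on the output of the AutoImputer example should then show zero for every column if the dataset was fully imputed.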

Here you can see an example of how the AutoImputer works internally.
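
For readers unfamiliar with missForest, the general idea is to fill each column's gaps with predictions from a random forest trained on the other columns, and to repeat this over several rounds. The sketch below approximates that idea with scikit-learn's `IterativeImputer` and a `RandomForestRegressor`. It is not Imputr's implementation, it handles numeric columns only, and the toy data and parameter values are assumptions chosen for illustration.

```python
import numpy as np
from sklearn.experimental import enable_iterative_imputer  # noqa: F401  (enables IterativeImputer)
from sklearn.impute import IterativeImputer
from sklearn.ensemble import RandomForestRegressor

# Made-up numeric data in which the third column depends on the first two
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[:, 2] = X[:, 0] + 0.5 * X[:, 1]

# Knock out roughly 10% of the cells
mask = rng.random(X.shape) < 0.1
X_missing = X.copy()
X_missing[mask] = np.nan

# Each round, every column with gaps is predicted from the other columns
imputer = IterativeImputer(
    estimator=RandomForestRegressor(n_estimators=20, random_state=0),
    max_iter=5,
    random_state=0,
)
X_imputed = imputer.fit_transform(X_missing)

print(np.isnan(X_imputed).sum())  # number of cells still missing
```

Observed cells are passed through unchanged; only the cells that were missing receive predicted values.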

To see what else can be done with the AutoImputer API to customise its behaviour, refer to our documentation.

📕 Documentation

Multiple links to documentation:

👨🏽‍💻 Contribution

Imputr is an ever-evolving open-source library and always welcomes contributors who want to help build it with the community.

See the Contribution Jumpstart page to get started with your first contribution!


Imputr is distributed under the Apache License, Version 2.0. A complete version can be found here. All future contributions will continue to be distributed under this license.

About

Python library for easy and fast ML-based & conventional imputation techniques.
