unaccent is a simple and efficient Rust crate designed to remove accents (diacritical marks) from strings. Inspired by the PostgreSQL unaccent extension, this crate offers an easy-to-use API for developers who need to normalize text by removing accents in their Rust applications.
- Unicode Support: Aims to cover the full Unicode range for accurate normalization across many languages. Coverage is still a work in progress.
- Lightweight: Minimal dependencies, keeping the crate efficient and easy to integrate.
- High Performance: Uses the unicode-normalization crate under the hood for robust and efficient text processing.
- Cross-Platform: Works seamlessly on all platforms supported by Rust.
Add unaccent to your Cargo.toml:
[dependencies]
unaccent = "0.1.1"
Then, include it in your project:
use unaccent::unaccent;
Here’s a quick example:
use unaccent::unaccent;
fn main() {
    let input = "Café au lait élégant";
    // Strip the diacritical marks from the input string.
    let result = unaccent(input);
    println!("Unaccented: {}", result); // Prints: Unaccented: Cafe au lait elegant
}
- Text preprocessing for search or indexing.
- Standardizing user input for comparison.
- Cleaning text for machine learning or natural language processing.
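The comparison use case can be sketched roughly as follows. This is a minimal illustration, not the crate's implementation: `fold_accents` is a hypothetical stand-in for `unaccent::unaccent` that handles only a handful of precomposed Latin letters so the sketch compiles without dependencies, and `search_key` and `matches_ignoring_accents` are likewise names invented for this example.

```rust
/// Hypothetical stand-in for `unaccent::unaccent`. It maps only a few
/// precomposed Latin letters; the real crate is expected to cover far more.
fn fold_accents(input: &str) -> String {
    input
        .chars()
        .map(|c| match c {
            'à' | 'á' | 'â' | 'ä' => 'a',
            'è' | 'é' | 'ê' | 'ë' => 'e',
            'ì' | 'í' | 'î' | 'ï' => 'i',
            'ò' | 'ó' | 'ô' | 'ö' => 'o',
            'ù' | 'ú' | 'û' | 'ü' => 'u',
            'ç' => 'c',
            'É' => 'E',
            // Anything not in the table passes through unchanged.
            other => other,
        })
        .collect()
}

/// Builds a comparison key: accents folded first, then lowercased.
fn search_key(input: &str) -> String {
    fold_accents(input).to_lowercase()
}

/// Returns true when two strings are equal once accents and case are ignored.
fn matches_ignoring_accents(a: &str, b: &str) -> bool {
    search_key(a) == search_key(b)
}
```

In a real project, the body of `search_key` would presumably call `unaccent` in place of `fold_accents`. Storing the key next to the original text keeps the accented form available for display while comparisons and lookups run on the normalized form.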
Contributions are welcome! If you find a bug or have a feature request, please open an issue or submit a pull request.
- Clone the repository:
  git clone https://github.com/crowdtech-io/unaccent.git
  cd unaccent
- Run tests:
  cargo test
This project adheres to the Rust Code of Conduct. By participating, you are expected to uphold this standard.
This project is licensed under the MIT License.
Special thanks to the creators of the PostgreSQL unaccent extension and the maintainers of the unicode-normalization crate for their foundational work.
Note: This crate is not affiliated with or endorsed by the PostgreSQL project.