Starred repositories
DuckDB is an analytical in-process SQL database management system
R package for converting databases of various formats to Parquet format
A drop-in replacement for dplyr, powered by DuckDB for speed.
Initialization scripts for interactive services
An introductory training course on MLOps with MLFlow
A training course on good development practices with Git and R
Collection of helm charts to deploy services from Onyxia's data science catalog
A Git extension for JupyterLab
Validate certain data produced by Insee with R
Python composable command line interface toolkit
Collection of Docker images to build the data science catalog of the Onyxia project
Modular, fast NLP framework, compatible with Pytorch and spaCy, offering tailored support for French clinical notes.
Run current & prior versions of R using docker. rocker/r-ver, rocker/rstudio, rocker/shiny, rocker/tidyverse, and so on.
Project aiming to simplify the retrieval of official shapefiles
An attempt to answer the age-old interview question "What happens when you type google.com into your browser and press enter?"
State-of-the-Art Text Embeddings
Repository for the course Python for data scientists (ENSAE, 2nd year)
RSTutorials: A Curated List of Algorithms for Traditional and Social Recommender Systems.
QRec: A Python Framework for quick implementation of recommender systems (TensorFlow Based)
Data Exploration in PySpark made easy - Pyspark_dist_explore provides methods to get fast insights in your Spark DataFrames.
Fast keyword extraction from text using graph degeneracy-based approaches