
Organizations

@user2014 @DataScienceLA


scikit-learn: machine learning in Python

Python 63,998 26,429 Updated Nov 11, 2025

Scalable, Portable and Distributed Gradient Boosting (GBDT, GBRT or GBM) Library, for Python, R, Java, Scala, C++ and more. Runs on single machine, Hadoop, Spark, Dask, Flink and DataFlow

C++ 27,600 8,819 Updated Nov 10, 2025

Lightweight, Portable, Flexible Distributed/Mobile Deep Learning with Dynamic, Mutation-aware Dataflow Dep Scheduler; for Python, R, Julia, Scala, Go, Javascript and more

C++ 20,832 6,752 Updated Oct 25, 2023

H2O is an Open Source, Distributed, Fast & Scalable Machine Learning Platform: Deep Learning, Gradient Boosting (GBM) & XGBoost, Random Forest, Generalized Linear Modeling (GLM with Elastic Net), K…

Jupyter Notebook 7,361 2,024 Updated Nov 11, 2025

An implementation of the Grammar of Graphics in R

R 6,817 2,112 Updated Nov 11, 2025

Easy interactive web applications with R

R 5,554 1,877 Updated Nov 11, 2025

dplyr: A grammar of data manipulation

R 4,951 2,135 Updated Nov 11, 2025

RStudio is an integrated development environment (IDE) for R

Java 4,897 1,144 Updated Nov 11, 2025

R's data.table package extends data.frame:

R 3,805 1,017 Updated Nov 10, 2025

Dynamic Documents for R

R 2,988 997 Updated Sep 29, 2025

Advanced R: a book

TeX 2,429 1,710 Updated Mar 13, 2025

A minimal benchmark for scalability, speed and accuracy of commonly used open source implementations (R packages, Python scikit-learn, H2O, xgboost, Spark MLlib etc.) of the top machine learning al…

R 1,889 332 Updated Sep 16, 2022

R configurations for Docker

Shell 1,495 267 Updated Oct 31, 2025

Read flat files (csv, tsv, fwf) into R

R 1,019 289 Updated Oct 16, 2025

Seamless R and C++ Integration

C++ 778 218 Updated Nov 5, 2025

useR! 2016 Tutorial: Machine Learning Algorithmic Deep Dive http://user2016.org/tutorials/10.html

Jupyter Notebook 400 204 Updated Mar 5, 2018

Performance of various open source GBM implementations

HTML 221 29 Updated Nov 6, 2025

Materials for STATS 418 - Tools in Data Science course taught in the Master of Applied Statistics at UCLA

HTML 137 63 Updated Jun 10, 2017

A minimal benchmark of various tools (statistical software, databases etc.) for working with tabular data of moderately large sizes (interactive data analysis).

R 89 17 Updated Jul 25, 2017

Some thoughts on how to use machine learning in production

71 10 Updated May 17, 2017

Adaptive and automatic gradient boosting computations.

R 70 11 Updated Aug 20, 2022

Quick informal survey at the Los Angeles Machine learning meetup about tools used for machine learning.

51 6 Updated Jun 28, 2015

Machine Learning #1 and #2 courses at CEU Master of Science in Business Analytics

HTML 38 44 Updated Mar 28, 2021

Materials for a short introductory/intermediate Data Science course taught in the MSc in Business Analytics program at the Central European University

HTML 33 13 Updated Sep 8, 2017

Advanced workshop on XGBoost with Tianqi Chen in Santa Monica, June 2, 2016

R 26 7 Updated Nov 21, 2016

Machine Learning #1 and #2 courses at CEU Master of Science in Business Analytics

HTML 22 58 Updated Feb 2, 2019

Tuning GBMs (hyperparameter tuning) and impact on out-of-sample predictions

HTML 21 3 Updated Sep 11, 2017

Compare the scoring speed of several open source machine learning libraries.

R 20 4 Updated Jun 19, 2017

GBM multicore scaling: h2o, xgboost and lightgbm on multicore and multi-socket systems

HTML 20 1 Updated May 13, 2018

Latency numbers every data scientist should know (aka the pyramid of analytical tasks) - the order of magnitude of computational time for the most common analytical tasks (SQL-like data munging, li…

20 4 Updated Apr 13, 2017