Organizations

@USCDataScience

11 stars written in Java

Suite of tools for deploying and training deep learning models using the JVM. Highlights include model import for keras, tensorflow, and onnx/pytorch, a modular and tiny c++ library for running mat…

Java 14,214 3,843 Updated Mar 27, 2026

Statistical Machine Intelligence & Learning Engine

Java 6,357 1,146 Updated Mar 27, 2026

Example code from Learning Spark book

Java 3,900 2,410 Updated Jul 12, 2025

Java version of the Playwright testing and automation library

Java 1,459 272 Updated Mar 27, 2026

A scalable, mature and versatile web crawler based on Apache Storm

Java 972 274 Updated Mar 27, 2026

Spark-Crawler: Apache Nutch-like crawler that runs on Apache Spark.

Java 420 138 Updated Mar 30, 2023

WInte.r is a Java framework for end-to-end data integration. The WInte.r framework implements well-known methods for data pre-processing, schema matching, identity resolution, data fusion, and resu…

Java 114 32 Updated May 20, 2022

ImageCat is an Apache OODT RADIX application that uses Apache Solr, Apache Tika and Apache OODT to ingest tens of millions of files (images, but could be extended to other files) in place, and to ext…

Java 95 40 Updated Aug 26, 2018

DKPro C4CorpusTools is a collection of tools for processing CommonCrawl corpus, including Creative Commons license detection, boilerplate removal, language detection, and near-duplicate removal.

Java 52 7 Updated Jun 12, 2020

Extraction code used to create the Dresden Web Table Corpus

Java 14 8 Updated Feb 25, 2015
Java 6 Updated Mar 2, 2026