kongty
  • Stanford University


A high-performance algorithmic trading platform and event-driven backtester

Rust 19,661 2,316 Updated Feb 16, 2026

CGRA-Flow is an integrated framework for CGRA compilation, exploration, synthesis, and development.

Python 153 24 Updated Feb 15, 2026

NeuroSpector: Dataflow and Mapping Optimizer for Deep Neural Network Accelerators

C++ 21 3 Updated Mar 20, 2025

A PyTorch version of frustum-pointnets

Python 130 30 Updated Mar 18, 2020

HW Architecture-Mapping Design Space Exploration Framework for Deep Learning Accelerators

C++ 183 54 Updated Jan 23, 2026

A Python library for large-scale nearest neighbor computations via k-d trees and GPUs.

C 64 19 Updated Jun 21, 2022

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Python 23,603 3,108 Updated Aug 15, 2024

Benchmark suite for embedded autonomous vehicle applications

C++ 17 8 Updated Dec 28, 2022

An end-to-end benchmark suite of multi-modal DNN applications for system-architecture co-design

Python 22 9 Updated Dec 13, 2024

Convert a PointPillars PyTorch model to ONNX for TensorRT inference

Python 406 85 Updated Nov 11, 2020

Official PyTorch implementation of "BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework"

Python 945 122 Updated Apr 5, 2023

Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures.

C++ 454 128 Updated Sep 26, 2025

An analytical framework that models hardware dataflow of tensor applications on spatial architectures using the relation-centric notation.

C++ 87 13 Updated Apr 28, 2024
C 14 4 Updated Feb 1, 2026

Mnemosyne: Multi-Bank Memories for Heterogeneous Architectures

C++ 6 1 Updated Jun 25, 2021

🚀 A very efficient Texas Holdem GTO solver ♠️♥️♣️♦️

C++ 2,315 406 Updated Nov 5, 2024

This is the implementation of the paper [Optimus: Towards Optimal Layer-Fusion on Deep Learning Processors].

Python 9 6 Updated May 10, 2021
C++ 1,226 517 Updated Jan 19, 2026

Prototype-network-on-chip (ProNoC) is an EDA tool that facilitates prototyping of custom heterogeneous NoC-based many-core-SoC (MCSoC).

Verilog 62 23 Updated Dec 15, 2025
SystemVerilog 209 67 Updated Mar 6, 2025

RaveNoC is a configurable HDL NoC (Network-On-Chip) suitable for MPSoCs and different MP applications

SystemVerilog 189 39 Updated Nov 18, 2024

Hierarchical Deep Stereo Matching on High Resolution Images, CVPR 2019.

Python 426 78 Updated Jul 21, 2023

Memory Enhanced Global-Local Aggregation for Video Object Detection, CVPR2020

Python 577 120 Updated May 13, 2021

RTL implementation of Flex-DPE.

Verilog 115 33 Updated Feb 22, 2020

Repository to host and maintain SCALE-Sim code

Python 412 142 Updated Feb 2, 2026

Implementation of a Tensor Processing Unit for embedded systems and the IoT.

VHDL 544 71 Updated Jan 5, 2019

An open-source reimplementation of Google's Tensor Processing Unit (TPU).

Python 732 91 Updated Dec 6, 2017

Classical equations and diagrams in machine learning

TeX 8,005 1,340 Updated Jul 30, 2024