  • Stanford University


Production-grade Rust-native trading engine with deterministic event-driven architecture

Rust 23,466 2,972 Updated Jun 14, 2026

CGRA-Flow is an integrated framework for CGRA compilation, exploration, synthesis, and development.

Python 161 25 Updated Feb 18, 2026

NeuroSpector: Dataflow and Mapping Optimizer for Deep Neural Network Accelerators

C++ 23 3 Updated Mar 20, 2025

A PyTorch version of frustum-pointnets

Python 131 30 Updated Mar 18, 2020

HW Architecture-Mapping Design Space Exploration Framework for Deep Learning Accelerators

C++ 194 60 Updated Apr 27, 2026

A Python library for large-scale nearest neighbor computations via k-d trees and GPUs.

C 64 19 Updated Jun 21, 2022

A minimal PyTorch re-implementation of the OpenAI GPT (Generative Pretrained Transformer) training

Python 24,532 3,271 Updated Aug 15, 2024

Benchmark suite for embedded autonomous vehicle application

C++ 17 8 Updated Dec 28, 2022

An end-to-end benchmark suite of multi-modal DNN applications for system-architecture co-design

Python 22 9 Updated Dec 13, 2024

Convert PointPillars PyTorch model to ONNX for TensorRT inference

Python 408 84 Updated Nov 11, 2020

Official PyTorch implementation of "BEVFusion: A Simple and Robust LiDAR-Camera Fusion Framework"

Python 967 127 Updated Apr 5, 2023

Timeloop performs modeling, mapping and code-generation for tensor algebra workloads on various accelerator architectures.

C++ 495 131 Updated Apr 30, 2026

An analytical framework that models hardware dataflow of tensor applications on spatial architectures using the relation-centric notation.

C++ 88 14 Updated Apr 28, 2024
C 15 5 Updated Mar 17, 2026

Mnemosyne: Multi-Bank Memories for Heterogeneous Architectures

C++ 6 1 Updated Jun 25, 2021

🚀 A very efficient Texas Holdem GTO solver ♠️♥️♣️♦️

C++ 2,436 428 Updated Mar 31, 2026

This is the implementation of the paper [Optimus: Towards Optimal Layer-Fusion on Deep Learning Processors].

Python 9 6 Updated May 10, 2021
C++ 1,251 521 Updated Jan 19, 2026

Prototype-network-on-chip (ProNoC) is an EDA tool that facilitates prototyping of custom heterogeneous NoC-based many-core-SoC (MCSoC).

Verilog 65 23 Updated Jun 10, 2026
SystemVerilog 218 71 Updated May 30, 2026

RaveNoC is a configurable HDL NoC (Network-On-Chip) suitable for MPSoCs and different MP applications

SystemVerilog 192 40 Updated Nov 18, 2024

Hierarchical Deep Stereo Matching on High Resolution Images, CVPR 2019.

Python 425 77 Updated Jul 21, 2023

Memory Enhanced Global-Local Aggregation for Video Object Detection, CVPR2020

Python 576 121 Updated May 13, 2021

RTL implementation of Flex-DPE.

Verilog 117 33 Updated Feb 22, 2020

Repository to host and maintain SCALE-Sim code

Python 480 164 Updated Feb 2, 2026

Implementation of a Tensor Processing Unit for embedded systems and the IoT.

VHDL 567 75 Updated Jan 5, 2019

An open source reimplementation of Google's Tensor Processing Unit (TPU).

Python 758 97 Updated Dec 6, 2017

Classical equations and diagrams in machine learning

TeX 8,013 1,336 Updated Jul 30, 2024