LWNN - Lightweight Neural Network

Inspired mostly by NNOM and CMSIS-NN, I want to do something for Edge AI.

However, I think NNOM is not well designed for different runtimes (CPU/DSP/GPU/NPU, etc.): it doesn't have a clear path for handling different types of runtime. Nowadays I also really want to study OpenCL, and I came across MACE and found a bunch of CL kernels that can be used directly.

So I decided to do something meaningful: study OpenCL and, in the meantime, create a Lightweight Neural Network that is suitable for devices such as PCs, mobiles, and MCUs.

Architecture

To support various deep learning frameworks such as tensorflow/keras/caffe2 and pytorch, lwnn supports onnx. Older frameworks such as caffe/darknet that don't support onnx are supported by special handling.

Layers/Runtime	cpu float	cpu s8	cpu q8	cpu q16	opencl	comments
Conv1D	Y d	Y	Y	Y	Y	based on Conv2D
Conv2D	Y d	Y	Y	Y	Y
DeConv2D	Y	Y	Y	Y	Y
DepthwiseConv2D	Y	Y	Y	Y	Y
DilatedConv2D	Y	N	N	N	Y
ElementWise Max	Y d	Y	Y	Y	Y
ReLU	Y d	Y	Y	Y	Y
PReLU	Y d	N	N	N	Y
MaxPool1D	Y d	Y	Y	Y	Y	based on MaxPool2D
MaxPool2D	Y d	Y	Y	Y	Y
Dense	Y	Y	Y	Y	Y
Softmax	Y d	Y	Y	Y	Y
Reshape	Y d	Y	Y	Y	Y
Pad	Y	Y	Y	Y	Y
BatchNorm	Y	Y	Y	Y	Y
Concat	Y	Y	Y	Y	Y
AvgPool1D	Y d	Y	Y	Y	Y	based on AvgPool2D
AvgPool2D	Y d	Y	Y	Y	Y
Add	Y d	Y	Y	Y	Y
PriorBox	Y	N	N	N	F
DetectionOutput	Y	F	F	F	F
Upsample	Y	Y	Y	Y	Y
Yolo	Y	F	F	F	F
YoloOutput	Y	F	F	F	F
Mfcc	Y	F	F	F	F
LSTM	Y	N	Y	N	F
Proposal	Y	N	N	N	N
Mul	Y d	N	N	N	Y

F means fallback to another runtime that supports that layer.
d means dynamic shape support.
s8/q8/q16: all are in Q format.
s8: 8-bit symmetric quantization with a zero offset, very similar to tflite quantization.
q8/q16: 8/16-bit symmetric quantization, no zero offset.
q8/s8/q16 activation (ReLU/Clip) reuses its input layer's buffer, so the activation layer's input layer must have exactly one consumer: the activation layer itself.
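
To make the q8/q16 note concrete, here is a minimal sketch of Q-format quantization without a zero offset. It is an illustration under assumptions, not lwnn's actual quantizer: the helper names (choose_frac_bits, quantize, dequantize) are hypothetical, and picking the largest number of fractional bits that still fits the largest absolute value is just one common choice.

```python
import math

def choose_frac_bits(values, bits=8):
    """Largest Q such that max|x| * 2**Q still fits a signed `bits`-bit integer."""
    max_abs = max(abs(v) for v in values)
    int_max = 2 ** (bits - 1) - 1
    if max_abs == 0:
        return bits - 1  # all zeros: any Q works, use the finest resolution
    return math.floor(math.log2(int_max / max_abs))

def quantize(values, frac_bits, bits=8):
    """Scale by 2**frac_bits, round to nearest, and saturate to the integer range."""
    int_min, int_max = -(2 ** (bits - 1)), 2 ** (bits - 1) - 1
    scale = 2.0 ** frac_bits
    return [max(int_min, min(int_max, round(v * scale))) for v in values]

def dequantize(qvalues, frac_bits):
    """Map integers back to real values; there is no zero offset to subtract."""
    scale = 2.0 ** frac_bits
    return [q / scale for q in qvalues]

weights = [0.5, -1.25, 0.03125, 1.9]       # made-up example values
q = choose_frac_bits(weights)              # 6, so one step is 1/64
quantized = quantize(weights, q)           # [32, -80, 2, 122]
restored = dequantize(quantized, q)        # [0.5, -1.25, 0.03125, 1.90625]
print(q, quantized, restored)
```

With bits=16 the same helpers sketch the q16 case. A scheme with a zero offset would additionally add an integer offset after scaling, which this sketch deliberately leaves out.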

Supported Famous Models

Below is a list of commands to run the supported models on the OpenCL or CPU runtime.

# object detection
lwnn_gtest --gtest_filter=*CL*SSDFloat -i images/dog.jpg
lwnn_gtest --gtest_filter=*CPU*SSDFloat -i images/dog.jpg
lwnn_gtest --gtest_filter=*CL*YOLOV3Float -i images/dog.jpg
lwnn_gtest --gtest_filter=*CPU*YOLOV3Float -i images/dog.jpg
lwnn_gtest --gtest_filter=*CPU*MASKRCNNFloat -i images/dog.jpg
# semantic segmentation
lwnn_gtest --gtest_filter=*CL*ENETFloat -i ENet/example_image/munich_000000_000019_leftImg8bit.png
lwnn_gtest --gtest_filter=*CPU*ENETFloat -i ENet/example_image/munich_000000_000019_leftImg8bit.png
# speech to text
lwnn_gtest --gtest_filter=*CPU*DSFloat -i speech_dataset/bird/042ea76c_nohash_0.wav
stt 49/29:                                 b  irr  d

Note: These models suffer a big accuracy drop when quantized. I think quantization-aware training or something like TensorRT calibration is necessary.
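
One reason calibration can matter is that a single outlier activation may dictate the whole quantization range. The sketch below is only an illustration under assumptions: it uses plain top-k outlier clipping, which is a much simpler stand-in and not TensorRT's calibration algorithm, and the data and helper names are made up.

```python
import math

def frac_bits_for(max_abs, bits=8):
    """Largest Q such that max_abs * 2**Q fits a signed `bits`-bit integer."""
    int_max = 2 ** (bits - 1) - 1
    return math.floor(math.log2(int_max / max_abs))

def clipped_max_abs(values, drop_top_k):
    """Largest |x| after ignoring the `drop_top_k` largest magnitudes."""
    ordered = sorted(abs(v) for v in values)
    return ordered[-(drop_top_k + 1)]

# 999 synthetic activations in [-1.0, 0.99] plus one large outlier.
activations = [(i % 200 - 100) / 100.0 for i in range(999)] + [60.0]

q_max = frac_bits_for(max(abs(v) for v in activations))   # range set by the outlier
q_clip = frac_bits_for(clipped_max_abs(activations, 1))   # outlier ignored

print("max-based:  Q =", q_max, "step =", 2.0 ** -q_max)
print("clip-based: Q =", q_clip, "step =", 2.0 ** -q_clip)
```

In this made-up data the max-based range leaves only 1 fractional bit (step 0.5), while ignoring the single outlier leaves 6 (step 1/64). The trade-off is that the ignored value saturates when quantized.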

Development

prepare environment

conda create -n lwnn python=3.6
source activate lwnn
conda install scons 
pip install tensorflow keras keras2onnx onnxruntime
sudo apt install nvidia-opencl-dev

build

scons
