FAQ

What is this?

A starter CUDA optimization project. It exercises several common optimization concerns: reducing global memory loads, sizing grids appropriately, measuring the cost of scattered writes, and so on. The algorithm itself is simple, but it presents a rich surface for optimization.

How to run?

Run run.sh. The script expects nvcc at a specific location, so it will fail if nvcc is installed somewhere else on your system.

What are the key points?

Learn how grid.sync() works in a cooperative kernel launch.
Reading and writing the array best[] at different indices within one kernel requires synchronization between threads. Without it, the kernel produces wrong results.
A single kernel launch is fast, but the overhead becomes noticeable when you launch 100K times.
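
The points above can be illustrated with a minimal sketch. It is not the code in bf.cu: the kernel name relax, the scratch buffer, and the "take the minimum of yourself and your left neighbor" update are all assumptions chosen to keep the example small. What it shows is one cooperative launch that runs many steps, with grid.sync() separating the phase that reads best[] from the phase that writes it.

```cuda
#include <cooperative_groups.h>
#include <cuda_runtime.h>
#include <cstdio>
#include <vector>

namespace cg = cooperative_groups;

// Hypothetical update rule: best[i] = min(best[i], best[i-1]).
// Each thread reads an index that another thread writes, so the read phase
// and the write phase must be separated by a grid-wide barrier.
__global__ void relax(int *best, int *scratch, int n, int steps) {
    cg::grid_group grid = cg::this_grid();
    int tid = blockIdx.x * blockDim.x + threadIdx.x;
    int stride = gridDim.x * blockDim.x;

    for (int s = 0; s < steps; ++s) {
        // Phase 1: read best[], write results to scratch[] only.
        for (int i = tid; i < n; i += stride) {
            int left = (i > 0) ? best[i - 1] : best[i];
            scratch[i] = min(best[i], left);
        }
        grid.sync();  // every read of best[] is done before anyone overwrites it

        // Phase 2: copy results back into best[].
        for (int i = tid; i < n; i += stride) best[i] = scratch[i];
        grid.sync();  // every write is visible before the next step reads
    }
}

int main() {
    const int n = 1 << 16, steps = 8, threads = 256;

    cudaDeviceProp prop;
    cudaGetDeviceProperties(&prop, 0);
    if (!prop.cooperativeLaunch) {
        printf("device does not support cooperative launch\n");
        return 0;
    }

    // A cooperative grid must be fully resident on the device, so size it
    // from the occupancy query instead of from n.
    int blocksPerSm = 0;
    cudaOccupancyMaxActiveBlocksPerMultiprocessor(&blocksPerSm, relax, threads, 0);
    int blocks = blocksPerSm * prop.multiProcessorCount;

    std::vector<int> host(n);
    for (int i = 0; i < n; ++i) host[i] = i;  // increasing input

    int *best = nullptr, *scratch = nullptr;
    cudaMalloc(&best, n * sizeof(int));
    cudaMalloc(&scratch, n * sizeof(int));
    cudaMemcpy(best, host.data(), n * sizeof(int), cudaMemcpyHostToDevice);

    int nArg = n, stepsArg = steps;
    void *args[] = {&best, &scratch, &nArg, &stepsArg};
    cudaError_t err = cudaLaunchCooperativeKernel(
        (void *)relax, dim3(blocks), dim3(threads), args, 0, 0);
    if (err == cudaSuccess) err = cudaDeviceSynchronize();
    if (err != cudaSuccess) {
        printf("CUDA error: %s\n", cudaGetErrorString(err));
        return 1;
    }

    cudaMemcpy(host.data(), best, n * sizeof(int), cudaMemcpyDeviceToHost);

    // With this input and rule, each step shifts values right by one,
    // so after `steps` steps best[i] should equal max(i - steps, 0).
    int mismatches = 0;
    for (int i = 0; i < n; ++i) {
        int expected = (i > steps) ? i - steps : 0;
        if (host[i] != expected) ++mismatches;
    }
    printf("mismatches: %d\n", mismatches);

    cudaFree(best);
    cudaFree(scratch);
    return 0;
}
```

Removing either grid.sync() call lets some blocks overwrite best[] while others are still reading it, which is the kind of wrong result described above. Running all steps inside one cooperative launch also avoids paying the launch overhead once per step. Older CUDA toolkits may need -rdc=true when compiling kernels that call grid.sync().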

Results?

See the results folder.

