0% found this document useful (0 votes)

33 views50 pages

Vulkan Tutorial En-151-200

ini adalah bagian ke empat dari vulkan turorial

Uploaded by

rendy anggara

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

33 views50 pages

Vulkan Tutorial En-151-200

ini adalah bagian ke empat dari vulkan turorial

Uploaded by

rendy anggara

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 50

Pipeline vertex input

We now need to set up the graphics pipeline to accept vertex data in this
format by referencing the structures in createGraphicsPipeline. Find the
vertexInputInfo struct and modify it to reference the two descriptions:
1 auto bindingDescription = Vertex::getBindingDescription();
2 auto attributeDescriptions = Vertex::getAttributeDescriptions();
3
4 vertexInputInfo.vertexBindingDescriptionCount = 1;
5 vertexInputInfo.vertexAttributeDescriptionCount =
static_cast<uint32_t>(attributeDescriptions.size());
6 vertexInputInfo.pVertexBindingDescriptions = &bindingDescription;
7 vertexInputInfo.pVertexAttributeDescriptions =
attributeDescriptions.data();

The pipeline is now ready to accept vertex data in the format of the vertices
container and pass it on to our vertex shader. If you run the program now with
validation layers enabled, you’ll see that it complains that there is no vertex
buffer bound to the binding. The next step is to create a vertex buffer and
move the vertex data to it so the GPU is able to access it.
C++ code / Vertex shader / Fragment shader

Vertex buffer creation

Introduction
Buffers in Vulkan are regions of memory used for storing arbitrary data that can
be read by the graphics card. They can be used to store vertex data, which we’ll
do in this chapter, but they can also be used for many other purposes that we’ll
explore in future chapters. Unlike the Vulkan objects we’ve been dealing with
so far, buffers do not automatically allocate memory for themselves. The work
from the previous chapters has shown that the Vulkan API puts the programmer
in control of almost everything and memory management is one of those things.

Buffer creation
Create a new function createVertexBuffer and call it from initVulkan right
before createCommandBuffers.
1 void initVulkan() {
2 createInstance();
3 setupDebugMessenger();
4 createSurface();
5 pickPhysicalDevice();
6 createLogicalDevice();
7 createSwapChain();

150
8 createImageViews();
9 createRenderPass();
10 createGraphicsPipeline();
11 createFramebuffers();
12 createCommandPool();
13 createVertexBuffer();
14 createCommandBuffers();
15 createSyncObjects();
16 }
17
18 ...
19
20 void createVertexBuffer() {
21
22 }

Creating a buffer requires us to fill a VkBufferCreateInfo structure.

1 VkBufferCreateInfo bufferInfo{};
2 bufferInfo.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
3 bufferInfo.size = sizeof(vertices[0]) * vertices.size();

The first field of the struct is size, which specifies the size of the buffer in bytes.
Calculating the byte size of the vertex data is straightforward with sizeof.
1 bufferInfo.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT;

The second field is usage, which indicates for which purposes the data in the
buffer is going to be used. It is possible to specify multiple purposes using a
bitwise or. Our use case will be a vertex buffer, we’ll look at other types of
usage in future chapters.
1 bufferInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;

Just like the images in the swap chain, buffers can also be owned by a specific
queue family or be shared between multiple at the same time. The buffer will
only be used from the graphics queue, so we can stick to exclusive access.
The flags parameter is used to configure sparse buffer memory, which is not
relevant right now. We’ll leave it at the default value of 0.
We can now create the buffer with vkCreateBuffer. Define a class member to
hold the buffer handle and call it vertexBuffer.
1 VkBuffer vertexBuffer;
2
3 ...
4
5 void createVertexBuffer() {

151
6 VkBufferCreateInfo bufferInfo{};
7 bufferInfo.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
8 bufferInfo.size = sizeof(vertices[0]) * vertices.size();
9 bufferInfo.usage = VK_BUFFER_USAGE_VERTEX_BUFFER_BIT;
10 bufferInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
11
12 if (vkCreateBuffer(device, &bufferInfo, nullptr, &vertexBuffer)
!= VK_SUCCESS) {
13 throw std::runtime_error("failed to create vertex buffer!");
14 }
15 }

The buffer should be available for use in rendering commands until the end of
the program and it does not depend on the swap chain, so we’ll clean it up in
the original cleanup function:
1 void cleanup() {
2 cleanupSwapChain();
3
4 vkDestroyBuffer(device, vertexBuffer, nullptr);
5
6 ...
7 }

Memory requirements
The buffer has been created, but it doesn’t actually have any memory assigned to
it yet. The first step of allocating memory for the buffer is to query its memory
requirements using the aptly named vkGetBufferMemoryRequirements func-
tion.
1 VkMemoryRequirements memRequirements;
2 vkGetBufferMemoryRequirements(device, vertexBuffer,
&memRequirements);

The VkMemoryRequirements struct has three fields:

• size: The size of the required amount of memory in bytes, may differ
from bufferInfo.size.
• alignment: The offset in bytes where the buffer begins in the allocated re-
gion of memory, depends on bufferInfo.usage and bufferInfo.flags.
• memoryTypeBits: Bit field of the memory types that are suitable for the
buffer.
Graphics cards can offer different types of memory to allocate from. Each type of
memory varies in terms of allowed operations and performance characteristics.
We need to combine the requirements of the buffer and our own application

152
requirements to find the right type of memory to use. Let’s create a new function
findMemoryType for this purpose.
1 uint32_t findMemoryType(uint32_t typeFilter, VkMemoryPropertyFlags
properties) {
2
3 }

First we need to query info about the available types of memory using
vkGetPhysicalDeviceMemoryProperties.
1 VkPhysicalDeviceMemoryProperties memProperties;
2 vkGetPhysicalDeviceMemoryProperties(physicalDevice, &memProperties);

The VkPhysicalDeviceMemoryProperties structure has two arrays

memoryTypes and memoryHeaps. Memory heaps are distinct memory re-
sources like dedicated VRAM and swap space in RAM for when VRAM runs
out. The different types of memory exist within these heaps. Right now we’ll
only concern ourselves with the type of memory and not the heap it comes
from, but you can imagine that this can affect performance.
Let’s first find a memory type that is suitable for the buffer itself:
1 for (uint32_t i = 0; i < memProperties.memoryTypeCount; i++) {
2 if (typeFilter & (1 << i)) {
3 return i;
4 }
5 }
6
7 throw std::runtime_error("failed to find suitable memory type!");

The typeFilter parameter will be used to specify the bit field of memory types
that are suitable. That means that we can find the index of a suitable memory
type by simply iterating over them and checking if the corresponding bit is set
to 1.
However, we’re not just interested in a memory type that is suitable for the
vertex buffer. We also need to be able to write our vertex data to that memory.
The memoryTypes array consists of VkMemoryType structs that specify the heap
and properties of each type of memory. The properties define special features
of the memory, like being able to map it so we can write to it from the CPU.
This property is indicated with VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT, but
we also need to use the VK_MEMORY_PROPERTY_HOST_COHERENT_BIT property.
We’ll see why when we map the memory.
We can now modify the loop to also check for the support of this property:
1 for (uint32_t i = 0; i < memProperties.memoryTypeCount; i++) {

153
2 if ((typeFilter & (1 << i)) &&
(memProperties.memoryTypes[i].propertyFlags & properties) ==
properties) {
3 return i;
4 }
5 }

We may have more than one desirable property, so we should check if the result
of the bitwise AND is not just non-zero, but equal to the desired properties bit
field. If there is a memory type suitable for the buffer that also has all of the
properties we need, then we return its index, otherwise we throw an exception.

Memory allocation
We now have a way to determine the right memory type, so we can actually
allocate the memory by filling in the VkMemoryAllocateInfo structure.
1 VkMemoryAllocateInfo allocInfo{};
2 allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
3 allocInfo.allocationSize = memRequirements.size;
4 allocInfo.memoryTypeIndex =
findMemoryType(memRequirements.memoryTypeBits,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT);

Memory allocation is now as simple as specifying the size and type, both of
which are derived from the memory requirements of the vertex buffer and the
desired property. Create a class member to store the handle to the memory and
allocate it with vkAllocateMemory.
1 VkBuffer vertexBuffer;
2 VkDeviceMemory vertexBufferMemory;
3
4 ...
5
6 if (vkAllocateMemory(device, &allocInfo, nullptr,
&vertexBufferMemory) != VK_SUCCESS) {
7 throw std::runtime_error("failed to allocate vertex buffer
memory!");
8 }

If memory allocation was successful, then we can now associate this memory
with the buffer using vkBindBufferMemory:
1 vkBindBufferMemory(device, vertexBuffer, vertexBufferMemory, 0);

The first three parameters are self-explanatory and the fourth parameter is the
offset within the region of memory. Since this memory is allocated specifically

154
for this the vertex buffer, the offset is simply 0. If the offset is non-zero, then it
is required to be divisible by memRequirements.alignment.
Of course, just like dynamic memory allocation in C++, the memory should be
freed at some point. Memory that is bound to a buffer object may be freed once
the buffer is no longer used, so let’s free it after the buffer has been destroyed:
1 void cleanup() {
2 cleanupSwapChain();
3
4 vkDestroyBuffer(device, vertexBuffer, nullptr);
5 vkFreeMemory(device, vertexBufferMemory, nullptr);

Filling the vertex buffer

It is now time to copy the vertex data to the buffer. This is done by mapping
the buffer memory into CPU accessible memory with vkMapMemory.
1 void* data;
2 vkMapMemory(device, vertexBufferMemory, 0, bufferInfo.size, 0,
&data);

This function allows us to access a region of the specified memory resource de-
fined by an offset and size. The offset and size here are 0 and bufferInfo.size,
respectively. It is also possible to specify the special value VK_WHOLE_SIZE to
map all of the memory. The second to last parameter can be used to specify
flags, but there aren’t any available yet in the current API. It must be set to the
value 0. The last parameter specifies the output for the pointer to the mapped
memory.
1 void* data;
2 vkMapMemory(device, vertexBufferMemory, 0, bufferInfo.size, 0,
&data);
3 memcpy(data, vertices.data(), (size_t) bufferInfo.size);
4 vkUnmapMemory(device, vertexBufferMemory);

You can now simply memcpy the vertex data to the mapped memory and unmap
it again using vkUnmapMemory. Unfortunately the driver may not immediately
copy the data into the buffer memory, for example because of caching. It is also
possible that writes to the buffer are not visible in the mapped memory yet.
There are two ways to deal with that problem:
• Use a memory heap that is host coherent, indicated with VK_MEMORY_PROPERTY_HOST_COHERENT_BIT
• Call vkFlushMappedMemoryRanges after writing to the mapped memory,
and call vkInvalidateMappedMemoryRanges before reading from the
mapped memory
We went for the first approach, which ensures that the mapped memory always
matches the contents of the allocated memory. Do keep in mind that this may

155
lead to slightly worse performance than explicit flushing, but we’ll see why that
doesn’t matter in the next chapter.
Flushing memory ranges or using a coherent memory heap means that the driver
will be aware of our writes to the buffer, but it doesn’t mean that they are
actually visible on the GPU yet. The transfer of data to the GPU is an operation
that happens in the background and the specification simply tells us that it is
guaranteed to be complete as of the next call to vkQueueSubmit.

Binding the vertex buffer

All that remains now is binding the vertex buffer during rendering operations.
We’re going to extend the recordCommandBuffer function to do that.
1 vkCmdBindPipeline(commandBuffer, VK_PIPELINE_BIND_POINT_GRAPHICS,
graphicsPipeline);
2
3 VkBuffer vertexBuffers[] = {vertexBuffer};
4 VkDeviceSize offsets[] = {0};
5 vkCmdBindVertexBuffers(commandBuffer, 0, 1, vertexBuffers, offsets);
6
7 vkCmdDraw(commandBuffer, static_cast<uint32_t>(vertices.size()), 1,
0, 0);

The vkCmdBindVertexBuffers function is used to bind vertex buffers to bind-

ings, like the one we set up in the previous chapter. The first two parameters,
besides the command buffer, specify the offset and number of bindings we’re
going to specify vertex buffers for. The last two parameters specify the array
of vertex buffers to bind and the byte offsets to start reading vertex data from.
You should also change the call to vkCmdDraw to pass the number of vertices in
the buffer as opposed to the hardcoded number 3.
Now run the program and you should see the familiar triangle again:

156
Try changing the color of the top vertex to white by modifying the vertices
array:
1 const std::vector<Vertex> vertices = {
2 {{0.0f, -0.5f}, {1.0f, 1.0f, 1.0f}},
3 {{0.5f, 0.5f}, {0.0f, 1.0f, 0.0f}},
4 {{-0.5f, 0.5f}, {0.0f, 0.0f, 1.0f}}
5 };

Run the program again and you should see the following:

157
In the next chapter we’ll look at a different way to copy vertex data to a vertex
buffer that results in better performance, but takes some more work.
C++ code / Vertex shader / Fragment shader

Staging buffer
Introduction
The vertex buffer we have right now works correctly, but the memory type that
allows us to access it from the CPU may not be the most optimal memory type
for the graphics card itself to read from. The most optimal memory has the
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT flag and is usually not accessible by
the CPU on dedicated graphics cards. In this chapter we’re going to create
two vertex buffers. One staging buffer in CPU accessible memory to upload the
data from the vertex array to, and the final vertex buffer in device local memory.
We’ll then use a buffer copy command to move the data from the staging buffer
to the actual vertex buffer.

Transfer queue
The buffer copy command requires a queue family that supports transfer opera-
tions, which is indicated using VK_QUEUE_TRANSFER_BIT. The good news is that

158
any queue family with VK_QUEUE_GRAPHICS_BIT or VK_QUEUE_COMPUTE_BIT ca-
pabilities already implicitly support VK_QUEUE_TRANSFER_BIT operations. The
implementation is not required to explicitly list it in queueFlags in those cases.
If you like a challenge, then you can still try to use a different queue family
specifically for transfer operations. It will require you to make the following
modifications to your program:
• Modify QueueFamilyIndices and findQueueFamilies to explicitly look
for a queue family with the VK_QUEUE_TRANSFER_BIT bit, but not the
VK_QUEUE_GRAPHICS_BIT.
• Modify createLogicalDevice to request a handle to the transfer queue
• Create a second command pool for command buffers that are submitted
on the transfer queue family
• Change the sharingMode of resources to be VK_SHARING_MODE_CONCURRENT
and specify both the graphics and transfer queue families
• Submit any transfer commands like vkCmdCopyBuffer (which we’ll be
using in this chapter) to the transfer queue instead of the graphics queue
It’s a bit of work, but it’ll teach you a lot about how resources are shared
between queue families.

Abstracting buffer creation

Because we’re going to create multiple buffers in this chapter, it’s a good idea to
move buffer creation to a helper function. Create a new function createBuffer
and move the code in createVertexBuffer (except mapping) to it.
1 void createBuffer(VkDeviceSize size, VkBufferUsageFlags usage,
VkMemoryPropertyFlags properties, VkBuffer& buffer,
VkDeviceMemory& bufferMemory) {
2 VkBufferCreateInfo bufferInfo{};
3 bufferInfo.sType = VK_STRUCTURE_TYPE_BUFFER_CREATE_INFO;
4 bufferInfo.size = size;
5 bufferInfo.usage = usage;
6 bufferInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
7
8 if (vkCreateBuffer(device, &bufferInfo, nullptr, &buffer) !=
VK_SUCCESS) {
9 throw std::runtime_error("failed to create buffer!");
10 }
11
12 VkMemoryRequirements memRequirements;
13 vkGetBufferMemoryRequirements(device, buffer, &memRequirements);
14
15 VkMemoryAllocateInfo allocInfo{};
16 allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
17 allocInfo.allocationSize = memRequirements.size;

159
18 allocInfo.memoryTypeIndex =
findMemoryType(memRequirements.memoryTypeBits, properties);
19
20 if (vkAllocateMemory(device, &allocInfo, nullptr, &bufferMemory)
!= VK_SUCCESS) {
21 throw std::runtime_error("failed to allocate buffer
memory!");
22 }
23
24 vkBindBufferMemory(device, buffer, bufferMemory, 0);
25 }

Make sure to add parameters for the buffer size, memory properties and usage
so that we can use this function to create many different types of buffers. The
last two parameters are output variables to write the handles to.
You can now remove the buffer creation and memory allocation code from
createVertexBuffer and just call createBuffer instead:
1 void createVertexBuffer() {
2 VkDeviceSize bufferSize = sizeof(vertices[0]) * vertices.size();
3 createBuffer(bufferSize, VK_BUFFER_USAGE_VERTEX_BUFFER_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, vertexBuffer,
vertexBufferMemory);
4
5 void* data;
6 vkMapMemory(device, vertexBufferMemory, 0, bufferSize, 0, &data);
7 memcpy(data, vertices.data(), (size_t) bufferSize);
8 vkUnmapMemory(device, vertexBufferMemory);
9 }

Run your program to make sure that the vertex buffer still works properly.

Using a staging buffer

We’re now going to change createVertexBuffer to only use a host visible buffer
as temporary buffer and use a device local one as actual vertex buffer.
1 void createVertexBuffer() {
2 VkDeviceSize bufferSize = sizeof(vertices[0]) * vertices.size();
3
4 VkBuffer stagingBuffer;
5 VkDeviceMemory stagingBufferMemory;
6 createBuffer(bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, stagingBuffer,
stagingBufferMemory);

160
7
8 void* data;
9 vkMapMemory(device, stagingBufferMemory, 0, bufferSize, 0,
&data);
10 memcpy(data, vertices.data(), (size_t) bufferSize);
11 vkUnmapMemory(device, stagingBufferMemory);
12
13 createBuffer(bufferSize, VK_BUFFER_USAGE_TRANSFER_DST_BIT |
VK_BUFFER_USAGE_VERTEX_BUFFER_BIT,
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, vertexBuffer,
vertexBufferMemory);
14 }

We’re now using a new stagingBuffer with stagingBufferMemory for mapping

and copying the vertex data. In this chapter we’re going to use two new buffer
usage flags:
• VK_BUFFER_USAGE_TRANSFER_SRC_BIT: Buffer can be used as source in a
memory transfer operation.
• VK_BUFFER_USAGE_TRANSFER_DST_BIT: Buffer can be used as destination
in a memory transfer operation.
The vertexBuffer is now allocated from a memory type that is device local,
which generally means that we’re not able to use vkMapMemory. However, we
can copy data from the stagingBuffer to the vertexBuffer. We have to
indicate that we intend to do that by specifying the transfer source flag for the
stagingBuffer and the transfer destination flag for the vertexBuffer, along
with the vertex buffer usage flag.
We’re now going to write a function to copy the contents from one buffer to
another, called copyBuffer.
1 void copyBuffer(VkBuffer srcBuffer, VkBuffer dstBuffer, VkDeviceSize
size) {
2
3 }

Memory transfer operations are executed using command buffers, just like draw-
ing commands. Therefore we must first allocate a temporary command buffer.
You may wish to create a separate command pool for these kinds of short-lived
buffers, because the implementation may be able to apply memory allocation
optimizations. You should use the VK_COMMAND_POOL_CREATE_TRANSIENT_BIT
flag during command pool generation in that case.
1 void copyBuffer(VkBuffer srcBuffer, VkBuffer dstBuffer, VkDeviceSize
size) {
2 VkCommandBufferAllocateInfo allocInfo{};
3 allocInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;

161
4 allocInfo.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
5 allocInfo.commandPool = commandPool;
6 allocInfo.commandBufferCount = 1;
7
8 VkCommandBuffer commandBuffer;
9 vkAllocateCommandBuffers(device, &allocInfo, &commandBuffer);
10 }

And immediately start recording the command buffer:

1 VkCommandBufferBeginInfo beginInfo{};
2 beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
3 beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
4
5 vkBeginCommandBuffer(commandBuffer, &beginInfo);

We’re only going to use the command buffer once and wait with re-
turning from the function until the copy operation has finished exe-
cuting. It’s good practice to tell the driver about our intent using
VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT.
1 VkBufferCopy copyRegion{};
2 copyRegion.srcOffset = 0; // Optional
3 copyRegion.dstOffset = 0; // Optional
4 copyRegion.size = size;
5 vkCmdCopyBuffer(commandBuffer, srcBuffer, dstBuffer, 1, &copyRegion);

Contents of buffers are transferred using the vkCmdCopyBuffer command. It

takes the source and destination buffers as arguments, and an array of regions to
copy. The regions are defined in VkBufferCopy structs and consist of a source
buffer offset, destination buffer offset and size. It is not possible to specify
VK_WHOLE_SIZE here, unlike the vkMapMemory command.
1 vkEndCommandBuffer(commandBuffer);

This command buffer only contains the copy command, so we can stop recording
right after that. Now execute the command buffer to complete the transfer:
1 VkSubmitInfo submitInfo{};
2 submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
3 submitInfo.commandBufferCount = 1;
4 submitInfo.pCommandBuffers = &commandBuffer;
5
6 vkQueueSubmit(graphicsQueue, 1, &submitInfo, VK_NULL_HANDLE);
7 vkQueueWaitIdle(graphicsQueue);

Unlike the draw commands, there are no events we need to wait on this time.
We just want to execute the transfer on the buffers immediately. There are

162
again two possible ways to wait on this transfer to complete. We could use a
fence and wait with vkWaitForFences, or simply wait for the transfer queue
to become idle with vkQueueWaitIdle. A fence would allow you to schedule
multiple transfers simultaneously and wait for all of them complete, instead
of executing one at a time. That may give the driver more opportunities to
optimize.
1 vkFreeCommandBuffers(device, commandPool, 1, &commandBuffer);

Don’t forget to clean up the command buffer used for the transfer operation.
We can now call copyBuffer from the createVertexBuffer function to move
the vertex data to the device local buffer:
1 createBuffer(bufferSize, VK_BUFFER_USAGE_TRANSFER_DST_BIT |
VK_BUFFER_USAGE_VERTEX_BUFFER_BIT,
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, vertexBuffer,
vertexBufferMemory);
2
3 copyBuffer(stagingBuffer, vertexBuffer, bufferSize);

After copying the data from the staging buffer to the device buffer, we should
clean it up:
1 ...
2
3 copyBuffer(stagingBuffer, vertexBuffer, bufferSize);
4
5 vkDestroyBuffer(device, stagingBuffer, nullptr);
6 vkFreeMemory(device, stagingBufferMemory, nullptr);
7 }

Run your program to verify that you’re seeing the familiar triangle again. The
improvement may not be visible right now, but its vertex data is now being
loaded from high performance memory. This will matter when we’re going to
start rendering more complex geometry.

Conclusion
It should be noted that in a real world application, you’re not supposed
to actually call vkAllocateMemory for every individual buffer. The
maximum number of simultaneous memory allocations is limited by the
maxMemoryAllocationCount physical device limit, which may be as low as
4096 even on high end hardware like an NVIDIA GTX 1080. The right way to
allocate memory for a large number of objects at the same time is to create a
custom allocator that splits up a single allocation among many different objects
by using the offset parameters that we’ve seen in many functions.

163
You can either implement such an allocator yourself, or use the VulkanMem-
oryAllocator library provided by the GPUOpen initiative. However, for this
tutorial it’s okay to use a separate allocation for every resource, because we
won’t come close to hitting any of these limits for now.
C++ code / Vertex shader / Fragment shader

Index buffer
Introduction
The 3D meshes you’ll be rendering in a real world application will often share
vertices between multiple triangles. This already happens even with something
simple like drawing a rectangle:

Drawing a rectangle takes two triangles, which means that we need a vertex
buffer with 6 vertices. The problem is that the data of two vertices needs to be
duplicated resulting in 50% redundancy. It only gets worse with more complex
meshes, where vertices are reused in an average number of 3 triangles. The
solution to this problem is to use an index buffer.
An index buffer is essentially an array of pointers into the vertex buffer. It allows
you to reorder the vertex data, and reuse existing data for multiple vertices. The
illustration above demonstrates what the index buffer would look like for the
rectangle if we have a vertex buffer containing each of the four unique vertices.
The first three indices define the upper-right triangle and the last three indices
define the vertices for the bottom-left triangle.

164
Index buffer creation
In this chapter we’re going to modify the vertex data and add index data to
draw a rectangle like the one in the illustration. Modify the vertex data to
represent the four corners:
1 const std::vector<Vertex> vertices = {
2 {{-0.5f, -0.5f}, {1.0f, 0.0f, 0.0f}},
3 {{0.5f, -0.5f}, {0.0f, 1.0f, 0.0f}},
4 {{0.5f, 0.5f}, {0.0f, 0.0f, 1.0f}},
5 {{-0.5f, 0.5f}, {1.0f, 1.0f, 1.0f}}
6 };

The top-left corner is red, top-right is green, bottom-right is blue and the
bottom-left is white. We’ll add a new array indices to represent the contents
of the index buffer. It should match the indices in the illustration to draw the
upper-right triangle and bottom-left triangle.
1 const std::vector<uint16_t> indices = {
2 0, 1, 2, 2, 3, 0
3 };

It is possible to use either uint16_t or uint32_t for your index buffer depending
on the number of entries in vertices. We can stick to uint16_t for now because
we’re using less than 65535 unique vertices.
Just like the vertex data, the indices need to be uploaded into a VkBuffer for
the GPU to be able to access them. Define two new class members to hold the
resources for the index buffer:
1 VkBuffer vertexBuffer;
2 VkDeviceMemory vertexBufferMemory;
3 VkBuffer indexBuffer;
4 VkDeviceMemory indexBufferMemory;

The createIndexBuffer function that we’ll add now is almost identical to

createVertexBuffer:
1 void initVulkan() {
2 ...
3 createVertexBuffer();
4 createIndexBuffer();
5 ...
6 }
7
8 void createIndexBuffer() {
9 VkDeviceSize bufferSize = sizeof(indices[0]) * indices.size();
10
11 VkBuffer stagingBuffer;

165
12 VkDeviceMemory stagingBufferMemory;
13 createBuffer(bufferSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, stagingBuffer,
stagingBufferMemory);
14
15 void* data;
16 vkMapMemory(device, stagingBufferMemory, 0, bufferSize, 0,
&data);
17 memcpy(data, indices.data(), (size_t) bufferSize);
18 vkUnmapMemory(device, stagingBufferMemory);
19
20 createBuffer(bufferSize, VK_BUFFER_USAGE_TRANSFER_DST_BIT |
VK_BUFFER_USAGE_INDEX_BUFFER_BIT,
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, indexBuffer,
indexBufferMemory);
21
22 copyBuffer(stagingBuffer, indexBuffer, bufferSize);
23
24 vkDestroyBuffer(device, stagingBuffer, nullptr);
25 vkFreeMemory(device, stagingBufferMemory, nullptr);
26 }

There are only two notable differences. The bufferSize is now equal to the
number of indices times the size of the index type, either uint16_t or uint32_t.
The usage of the indexBuffer should be VK_BUFFER_USAGE_INDEX_BUFFER_BIT
instead of VK_BUFFER_USAGE_VERTEX_BUFFER_BIT, which makes sense. Other
than that, the process is exactly the same. We create a staging buffer to copy
the contents of indices to and then copy it to the final device local index buffer.
The index buffer should be cleaned up at the end of the program, just like the
vertex buffer:
1 void cleanup() {
2 cleanupSwapChain();
3
4 vkDestroyBuffer(device, indexBuffer, nullptr);
5 vkFreeMemory(device, indexBufferMemory, nullptr);
6
7 vkDestroyBuffer(device, vertexBuffer, nullptr);
8 vkFreeMemory(device, vertexBufferMemory, nullptr);
9
10 ...
11 }

166
Using an index buffer
Using an index buffer for drawing involves two changes to recordCommandBuffer.
We first need to bind the index buffer, just like we did for the vertex buffer.
The difference is that you can only have a single index buffer. It’s unfortunately
not possible to use different indices for each vertex attribute, so we do still
have to completely duplicate vertex data even if just one attribute varies.
1 vkCmdBindVertexBuffers(commandBuffer, 0, 1, vertexBuffers, offsets);
2
3 vkCmdBindIndexBuffer(commandBuffer, indexBuffer, 0,
VK_INDEX_TYPE_UINT16);

An index buffer is bound with vkCmdBindIndexBuffer which has the index

buffer, a byte offset into it, and the type of index data as parameters.
As mentioned before, the possible types are VK_INDEX_TYPE_UINT16 and
VK_INDEX_TYPE_UINT32.
Just binding an index buffer doesn’t change anything yet, we also need to change
the drawing command to tell Vulkan to use the index buffer. Remove the
vkCmdDraw line and replace it with vkCmdDrawIndexed:
1 vkCmdDrawIndexed(commandBuffer,
static_cast<uint32_t>(indices.size()), 1, 0, 0, 0);

A call to this function is very similar to vkCmdDraw. The first two parameters
specify the number of indices and the number of instances. We’re not using
instancing, so just specify 1 instance. The number of indices represents the
number of vertices that will be passed to the vertex shader. The next parameter
specifies an offset into the index buffer, using a value of 1 would cause the
graphics card to start reading at the second index. The second to last parameter
specifies an offset to add to the indices in the index buffer. The final parameter
specifies an offset for instancing, which we’re not using.
Now run your program and you should see the following:

167
You now know how to save memory by reusing vertices with index buffers. This
will become especially important in a future chapter where we’re going to load
complex 3D models.
The previous chapter already mentioned that you should allocate multiple re-
sources like buffers from a single memory allocation, but in fact you should
go a step further. Driver developers recommend that you also store multiple
buffers, like the vertex and index buffer, into a single VkBuffer and use offsets
in commands like vkCmdBindVertexBuffers. The advantage is that your data
is more cache friendly in that case, because it’s closer together. It is even pos-
sible to reuse the same chunk of memory for multiple resources if they are not
used during the same render operations, provided that their data is refreshed,
of course. This is known as aliasing and some Vulkan functions have explicit
flags to specify that you want to do this.
C++ code / Vertex shader / Fragment shader

168
Uniform buffers

Descriptor layout and buffer

Introduction
We’re now able to pass arbitrary attributes to the vertex shader for each ver-
tex, but what about global variables? We’re going to move on to 3D graphics
from this chapter on and that requires a model-view-projection matrix. We
could include it as vertex data, but that’s a waste of memory and it would re-
quire us to update the vertex buffer whenever the transformation changes. The
transformation could easily change every single frame.
The right way to tackle this in Vulkan is to use resource descriptors. A descriptor
is a way for shaders to freely access resources like buffers and images. We’re
going to set up a buffer that contains the transformation matrices and have the
vertex shader access them through a descriptor. Usage of descriptors consists
of three parts:
• Specify a descriptor layout during pipeline creation
• Allocate a descriptor set from a descriptor pool
• Bind the descriptor set during rendering
The descriptor layout specifies the types of resources that are going to be ac-
cessed by the pipeline, just like a render pass specifies the types of attachments
that will be accessed. A descriptor set specifies the actual buffer or image re-
sources that will be bound to the descriptors, just like a framebuffer specifies
the actual image views to bind to render pass attachments. The descriptor
set is then bound for the drawing commands just like the vertex buffers and
framebuffer.
There are many types of descriptors, but in this chapter we’ll work with uniform
buffer objects (UBO). We’ll look at other types of descriptors in future chapters,
but the basic process is the same. Let’s say we have the data we want the vertex
shader to have in a C struct like this:
1 struct UniformBufferObject {
2 glm::mat4 model;

169
3 glm::mat4 view;
4 glm::mat4 proj;
5 };

Then we can copy the data to a VkBuffer and access it through a uniform buffer
object descriptor from the vertex shader like this:
1 layout(binding = 0) uniform UniformBufferObject {
2 mat4 model;
3 mat4 view;
4 mat4 proj;
5 } ubo;
6
7 void main() {
8 gl_Position = ubo.proj * ubo.view * ubo.model * vec4(inPosition,
0.0, 1.0);
9 fragColor = inColor;
10 }

We’re going to update the model, view and projection matrices every frame to
make the rectangle from the previous chapter spin around in 3D.

Vertex shader
Modify the vertex shader to include the uniform buffer object like it was specified
above. I will assume that you are familiar with MVP transformations. If you’re
not, see the resource mentioned in the first chapter.
1 #version 450
2
3 layout(binding = 0) uniform UniformBufferObject {
4 mat4 model;
5 mat4 view;
6 mat4 proj;
7 } ubo;
8
9 layout(location = 0) in vec2 inPosition;
10 layout(location = 1) in vec3 inColor;
11
12 layout(location = 0) out vec3 fragColor;
13
14 void main() {
15 gl_Position = ubo.proj * ubo.view * ubo.model * vec4(inPosition,
0.0, 1.0);
16 fragColor = inColor;
17 }

170
Note that the order of the uniform, in and out declarations doesn’t matter. The
binding directive is similar to the location directive for attributes. We’re going
to reference this binding in the descriptor layout. The line with gl_Position is
changed to use the transformations to compute the final position in clip coordi-
nates. Unlike the 2D triangles, the last component of the clip coordinates may
not be 1, which will result in a division when converted to the final normalized
device coordinates on the screen. This is used in perspective projection as the
perspective division and is essential for making closer objects look larger than
objects that are further away.

Descriptor set layout

The next step is to define the UBO on the C++ side and to tell Vulkan about
this descriptor in the vertex shader.
1 struct UniformBufferObject {
2 glm::mat4 model;
3 glm::mat4 view;
4 glm::mat4 proj;
5 };

We can exactly match the definition in the shader using data types in GLM.
The data in the matrices is binary compatible with the way the shader expects
it, so we can later just memcpy a UniformBufferObject to a VkBuffer.
We need to provide details about every descriptor binding used in the shaders
for pipeline creation, just like we had to do for every vertex attribute and its
location index. We’ll set up a new function to define all of this information
called createDescriptorSetLayout. It should be called right before pipeline
creation, because we’re going to need it there.
1 void initVulkan() {
2 ...
3 createDescriptorSetLayout();
4 createGraphicsPipeline();
5 ...
6 }
7
8 ...
9
10 void createDescriptorSetLayout() {
11
12 }

Every binding needs to be described through a VkDescriptorSetLayoutBinding

struct.
1 void createDescriptorSetLayout() {

171
2 VkDescriptorSetLayoutBinding uboLayoutBinding{};
3 uboLayoutBinding.binding = 0;
4 uboLayoutBinding.descriptorType =
VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
5 uboLayoutBinding.descriptorCount = 1;
6 }

The first two fields specify the binding used in the shader and the type of de-
scriptor, which is a uniform buffer object. It is possible for the shader variable
to represent an array of uniform buffer objects, and descriptorCount specifies
the number of values in the array. This could be used to specify a transfor-
mation for each of the bones in a skeleton for skeletal animation, for example.
Our MVP transformation is in a single uniform buffer object, so we’re using a
descriptorCount of 1.
1 uboLayoutBinding.stageFlags = VK_SHADER_STAGE_VERTEX_BIT;

We also need to specify in which shader stages the descriptor is going to be refer-
enced. The stageFlags field can be a combination of VkShaderStageFlagBits
values or the value VK_SHADER_STAGE_ALL_GRAPHICS. In our case, we’re only
referencing the descriptor from the vertex shader.
1 uboLayoutBinding.pImmutableSamplers = nullptr; // Optional

The pImmutableSamplers field is only relevant for image sampling related de-
scriptors, which we’ll look at later. You can leave this to its default value.
All of the descriptor bindings are combined into a single VkDescriptorSetLayout
object. Define a new class member above pipelineLayout:
1 VkDescriptorSetLayout descriptorSetLayout;
2 VkPipelineLayout pipelineLayout;

We can then create it using vkCreateDescriptorSetLayout. This function ac-

cepts a simple VkDescriptorSetLayoutCreateInfo with the array of bindings:
1 VkDescriptorSetLayoutCreateInfo layoutInfo{};
2 layoutInfo.sType =
VK_STRUCTURE_TYPE_DESCRIPTOR_SET_LAYOUT_CREATE_INFO;
3 layoutInfo.bindingCount = 1;
4 layoutInfo.pBindings = &uboLayoutBinding;
5
6 if (vkCreateDescriptorSetLayout(device, &layoutInfo, nullptr,
&descriptorSetLayout) != VK_SUCCESS) {
7 throw std::runtime_error("failed to create descriptor set
layout!");
8 }

172
We need to specify the descriptor set layout during pipeline creation
to tell Vulkan which descriptors the shaders will be using. Descrip-
tor set layouts are specified in the pipeline layout object. Modify the
VkPipelineLayoutCreateInfo to reference the layout object:
1 VkPipelineLayoutCreateInfo pipelineLayoutInfo{};
2 pipelineLayoutInfo.sType =
VK_STRUCTURE_TYPE_PIPELINE_LAYOUT_CREATE_INFO;
3 pipelineLayoutInfo.setLayoutCount = 1;
4 pipelineLayoutInfo.pSetLayouts = &descriptorSetLayout;

You may be wondering why it’s possible to specify multiple descriptor set layouts
here, because a single one already includes all of the bindings. We’ll get back to
that in the next chapter, where we’ll look into descriptor pools and descriptor
sets.
The descriptor layout should stick around while we may create new graphics
pipelines i.e. until the program ends:
1 void cleanup() {
2 cleanupSwapChain();
3
4 vkDestroyDescriptorSetLayout(device, descriptorSetLayout,
nullptr);
5
6 ...
7 }

Uniform buffer
In the next chapter we’ll specify the buffer that contains the UBO data for the
shader, but we need to create this buffer first. We’re going to copy new data
to the uniform buffer every frame, so it doesn’t really make any sense to have a
staging buffer. It would just add extra overhead in this case and likely degrade
performance instead of improving it.
We should have multiple buffers, because multiple frames may be in flight at the
same time and we don’t want to update the buffer in preparation of the next
frame while a previous one is still reading from it! Thus, we need to have as
many uniform buffers as we have frames in flight, and write to a uniform buffer
that is not currently being read by the GPU
To that end, add new class members for uniformBuffers, and uniformBuffersMemory:
1 VkBuffer indexBuffer;
2 VkDeviceMemory indexBufferMemory;
3
4 std::vector<VkBuffer> uniformBuffers;

173
5 std::vector<VkDeviceMemory> uniformBuffersMemory;
6 std::vector<void*> uniformBuffersMapped;

Similarly, create a new function createUniformBuffers that is called after

createIndexBuffer and allocates the buffers:
1 void initVulkan() {
2 ...
3 createVertexBuffer();
4 createIndexBuffer();
5 createUniformBuffers();
6 ...
7 }
8
9 ...
10
11 void createUniformBuffers() {
12 VkDeviceSize bufferSize = sizeof(UniformBufferObject);
13
14 uniformBuffers.resize(MAX_FRAMES_IN_FLIGHT);
15 uniformBuffersMemory.resize(MAX_FRAMES_IN_FLIGHT);
16 uniformBuffersMapped.resize(MAX_FRAMES_IN_FLIGHT);
17
18 for (size_t i = 0; i < MAX_FRAMES_IN_FLIGHT; i++) {
19 createBuffer(bufferSize, VK_BUFFER_USAGE_UNIFORM_BUFFER_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, uniformBuffers[i],
uniformBuffersMemory[i]);
20
21 vkMapMemory(device, uniformBuffersMemory[i], 0, bufferSize,
0, &uniformBuffersMapped[i]);
22 }
23 }

We map the buffer right after creation using vkMapMemory to get a pointer
to which we can write the data later on. The buffer stays mapped to this
pointer for the application’s whole lifetime. This technique is called “persistent
mapping” and works on all Vulkan implementations. Not having to map the
buffer every time we need to update it increases performances, as mapping is
not free.
The uniform data will be used for all draw calls, so the buffer containing it
should only be destroyed when we stop rendering.
1 void cleanup() {
2 ...
3

174
4 for (size_t i = 0; i < MAX_FRAMES_IN_FLIGHT; i++) {
5 vkDestroyBuffer(device, uniformBuffers[i], nullptr);
6 vkFreeMemory(device, uniformBuffersMemory[i], nullptr);
7 }
8
9 vkDestroyDescriptorSetLayout(device, descriptorSetLayout,
nullptr);
10
11 ...
12
13 }

Updating uniform data

Create a new function updateUniformBuffer and add a call to it from the
drawFrame function before submitting the next frame:
1 void drawFrame() {
2 ...
3
4 updateUniformBuffer(currentFrame);
5
6 ...
7
8 VkSubmitInfo submitInfo{};
9 submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
10
11 ...
12 }
13
14 ...
15
16 void updateUniformBuffer(uint32_t currentImage) {
17
18 }

This function will generate a new transformation every frame to make the ge-
ometry spin around. We need to include two new headers to implement this
functionality:
1 #define GLM_FORCE_RADIANS
2 #include <glm/glm.hpp>
3 #include <glm/gtc/matrix_transform.hpp>
4
5 #include <chrono>

175
The glm/gtc/matrix_transform.hpp header exposes functions that can be
used to generate model transformations like glm::rotate, view transforma-
tions like glm::lookAt and projection transformations like glm::perspective.
The GLM_FORCE_RADIANS definition is necessary to make sure that functions like
glm::rotate use radians as arguments, to avoid any possible confusion.
The chrono standard library header exposes functions to do precise timekeeping.
We’ll use this to make sure that the geometry rotates 90 degrees per second
regardless of frame rate.
1 void updateUniformBuffer(uint32_t currentImage) {
2 static auto startTime =
std::chrono::high_resolution_clock::now();
3
4 auto currentTime = std::chrono::high_resolution_clock::now();
5 float time = std::chrono::duration<float,
std::chrono::seconds::period>(currentTime -
startTime).count();
6 }

The updateUniformBuffer function will start out with some logic to calculate
the time in seconds since rendering has started with floating point accuracy.
We will now define the model, view and projection transformations in the uni-
form buffer object. The model rotation will be a simple rotation around the
Z-axis using the time variable:
1 UniformBufferObject ubo{};
2 ubo.model = glm::rotate(glm::mat4(1.0f), time * glm::radians(90.0f),
glm::vec3(0.0f, 0.0f, 1.0f));

The glm::rotate function takes an existing transformation, rotation angle and

rotation axis as parameters. The glm::mat4(1.0f) constructor returns an iden-
tity matrix. Using a rotation angle of time * glm::radians(90.0f) accom-
plishes the purpose of rotation 90 degrees per second.
1 ubo.view = glm::lookAt(glm::vec3(2.0f, 2.0f, 2.0f), glm::vec3(0.0f,
0.0f, 0.0f), glm::vec3(0.0f, 0.0f, 1.0f));

For the view transformation I’ve decided to look at the geometry from above
at a 45 degree angle. The glm::lookAt function takes the eye position, center
position and up axis as parameters.
1 ubo.proj = glm::perspective(glm::radians(45.0f),
swapChainExtent.width / (float) swapChainExtent.height, 0.1f,
10.0f);

I’ve chosen to use a perspective projection with a 45 degree vertical field-of-

view. The other parameters are the aspect ratio, near and far view planes. It

176
is important to use the current swap chain extent to calculate the aspect ratio
to take into account the new width and height of the window after a resize.
1 ubo.proj[1][1] *= -1;

GLM was originally designed for OpenGL, where the Y coordinate of the clip
coordinates is inverted. The easiest way to compensate for that is to flip the
sign on the scaling factor of the Y axis in the projection matrix. If you don’t
do this, then the image will be rendered upside down.
All of the transformations are defined now, so we can copy the data in the
uniform buffer object to the current uniform buffer. This happens in exactly
the same way as we did for vertex buffers, except without a staging buffer. As
noted earlier, we only map the uniform buffer once, so we can directly write to
it without having to map again:
1 memcpy(uniformBuffersMapped[currentImage], &ubo, sizeof(ubo));

Using a UBO this way is not the most eﬀicient way to pass frequently changing
values to the shader. A more eﬀicient way to pass a small buffer of data to
shaders are push constants. We may look at these in a future chapter.
In the next chapter we’ll look at descriptor sets, which will actually bind the
VkBuffers to the uniform buffer descriptors so that the shader can access this
transformation data.
C++ code / Vertex shader / Fragment shader

Descriptor pool and sets

Introduction
The descriptor layout from the previous chapter describes the type of descriptors
that can be bound. In this chapter we’re going to create a descriptor set for
each VkBuffer resource to bind it to the uniform buffer descriptor.

Descriptor pool
Descriptor sets can’t be created directly, they must be allocated from a pool like
command buffers. The equivalent for descriptor sets is unsurprisingly called a
descriptor pool. We’ll write a new function createDescriptorPool to set it up.
1 void initVulkan() {
2 ...
3 createUniformBuffers();
4 createDescriptorPool();
5 ...
6 }
7

177
8 ...
9
10 void createDescriptorPool() {
11
12 }

We first need to describe which descriptor types our descriptor sets are going to
contain and how many of them, using VkDescriptorPoolSize structures.
1 VkDescriptorPoolSize poolSize{};
2 poolSize.type = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
3 poolSize.descriptorCount =
static_cast<uint32_t>(MAX_FRAMES_IN_FLIGHT);

We will allocate one of these descriptors for every frame. This pool size structure
is referenced by the main VkDescriptorPoolCreateInfo:
1 VkDescriptorPoolCreateInfo poolInfo{};
2 poolInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_POOL_CREATE_INFO;
3 poolInfo.poolSizeCount = 1;
4 poolInfo.pPoolSizes = &poolSize;

Aside from the maximum number of individual descriptors that are available,
we also need to specify the maximum number of descriptor sets that may be
allocated:
1 poolInfo.maxSets = static_cast<uint32_t>(MAX_FRAMES_IN_FLIGHT);

The structure has an optional flag similar to command pools that determines if
individual descriptor sets can be freed or not: VK_DESCRIPTOR_POOL_CREATE_FREE_DESCRIPTOR_SET_BIT.
We’re not going to touch the descriptor set after creating it, so we don’t need
this flag. You can leave flags to its default value of 0.
1 VkDescriptorPool descriptorPool;
2
3 ...
4
5 if (vkCreateDescriptorPool(device, &poolInfo, nullptr,
&descriptorPool) != VK_SUCCESS) {
6 throw std::runtime_error("failed to create descriptor pool!");
7 }

Add a new class member to store the handle of the descriptor pool and call
vkCreateDescriptorPool to create it.

Descriptor set
We can now allocate the descriptor sets themselves. Add a createDescriptorSets
function for that purpose:

178
1 void initVulkan() {
2 ...
3 createDescriptorPool();
4 createDescriptorSets();
5 ...
6 }
7
8 ...
9
10 void createDescriptorSets() {
11
12 }

A descriptor set allocation is described with a VkDescriptorSetAllocateInfo

struct. You need to specify the descriptor pool to allocate from, the number of
descriptor sets to allocate, and the descriptor layout to base them on:
1 std::vector<VkDescriptorSetLayout> layouts(MAX_FRAMES_IN_FLIGHT,
descriptorSetLayout);
2 VkDescriptorSetAllocateInfo allocInfo{};
3 allocInfo.sType = VK_STRUCTURE_TYPE_DESCRIPTOR_SET_ALLOCATE_INFO;
4 allocInfo.descriptorPool = descriptorPool;
5 allocInfo.descriptorSetCount =
static_cast<uint32_t>(MAX_FRAMES_IN_FLIGHT);
6 allocInfo.pSetLayouts = layouts.data();

In our case we will create one descriptor set for each frame in flight, all with the
same layout. Unfortunately we do need all the copies of the layout because the
next function expects an array matching the number of sets.
Add a class member to hold the descriptor set handles and allocate them with
vkAllocateDescriptorSets:
1 VkDescriptorPool descriptorPool;
2 std::vector<VkDescriptorSet> descriptorSets;
3
4 ...
5
6 descriptorSets.resize(MAX_FRAMES_IN_FLIGHT);
7 if (vkAllocateDescriptorSets(device, &allocInfo,
descriptorSets.data()) != VK_SUCCESS) {
8 throw std::runtime_error("failed to allocate descriptor sets!");
9 }

You don’t need to explicitly clean up descriptor sets, because they will
be automatically freed when the descriptor pool is destroyed. The call
to vkAllocateDescriptorSets will allocate descriptor sets, each with one
uniform buffer descriptor.

179
1 void cleanup() {
2 ...
3 vkDestroyDescriptorPool(device, descriptorPool, nullptr);
4
5 vkDestroyDescriptorSetLayout(device, descriptorSetLayout,
nullptr);
6 ...
7 }

The descriptor sets have been allocated now, but the descriptors within still
need to be configured. We’ll now add a loop to populate every descriptor:
1 for (size_t i = 0; i < MAX_FRAMES_IN_FLIGHT; i++) {
2
3 }

Descriptors that refer to buffers, like our uniform buffer descriptor, are config-
ured with a VkDescriptorBufferInfo struct. This structure specifies the buffer
and the region within it that contains the data for the descriptor.
1 for (size_t i = 0; i < MAX_FRAMES_IN_FLIGHT; i++) {
2 VkDescriptorBufferInfo bufferInfo{};
3 bufferInfo.buffer = uniformBuffers[i];
4 bufferInfo.offset = 0;
5 bufferInfo.range = sizeof(UniformBufferObject);
6 }

If you’re overwriting the whole buffer, like we are in this case, then it is also
possible to use the VK_WHOLE_SIZE value for the range. The configuration of de-
scriptors is updated using the vkUpdateDescriptorSets function, which takes
an array of VkWriteDescriptorSet structs as parameter.
1 VkWriteDescriptorSet descriptorWrite{};
2 descriptorWrite.sType = VK_STRUCTURE_TYPE_WRITE_DESCRIPTOR_SET;
3 descriptorWrite.dstSet = descriptorSets[i];
4 descriptorWrite.dstBinding = 0;
5 descriptorWrite.dstArrayElement = 0;

The first two fields specify the descriptor set to update and the binding. We
gave our uniform buffer binding index 0. Remember that descriptors can be
arrays, so we also need to specify the first index in the array that we want to
update. We’re not using an array, so the index is simply 0.
1 descriptorWrite.descriptorType = VK_DESCRIPTOR_TYPE_UNIFORM_BUFFER;
2 descriptorWrite.descriptorCount = 1;

We need to specify the type of descriptor again. It’s possible to update multi-
ple descriptors at once in an array, starting at index dstArrayElement. The
descriptorCount field specifies how many array elements you want to update.

180
1 descriptorWrite.pBufferInfo = &bufferInfo;
2 descriptorWrite.pImageInfo = nullptr; // Optional
3 descriptorWrite.pTexelBufferView = nullptr; // Optional

The last field references an array with descriptorCount structs that actually
configure the descriptors. It depends on the type of descriptor which one of the
three you actually need to use. The pBufferInfo field is used for descriptors
that refer to buffer data, pImageInfo is used for descriptors that refer to image
data, and pTexelBufferView is used for descriptors that refer to buffer views.
Our descriptor is based on buffers, so we’re using pBufferInfo.
1 vkUpdateDescriptorSets(device, 1, &descriptorWrite, 0, nullptr);

The updates are applied using vkUpdateDescriptorSets. It accepts two kinds

of arrays as parameters: an array of VkWriteDescriptorSet and an array of
VkCopyDescriptorSet. The latter can be used to copy descriptors to each other,
as its name implies.

Using descriptor sets

We now need to update the recordCommandBuffer function to actually
bind the right descriptor set for each frame to the descriptors in the
shader with vkCmdBindDescriptorSets. This needs to be done before the
vkCmdDrawIndexed call:
1 vkCmdBindDescriptorSets(commandBuffer,
VK_PIPELINE_BIND_POINT_GRAPHICS, pipelineLayout, 0, 1,
&descriptorSets[currentFrame], 0, nullptr);
2 vkCmdDrawIndexed(commandBuffer,
static_cast<uint32_t>(indices.size()), 1, 0, 0, 0);

Unlike vertex and index buffers, descriptor sets are not unique to graphics
pipelines. Therefore we need to specify if we want to bind descriptor sets to
the graphics or compute pipeline. The next parameter is the layout that the
descriptors are based on. The next three parameters specify the index of the
first descriptor set, the number of sets to bind, and the array of sets to bind.
We’ll get back to this in a moment. The last two parameters specify an array
of offsets that are used for dynamic descriptors. We’ll look at these in a future
chapter.
If you run your program now, then you’ll notice that unfortunately nothing is
visible. The problem is that because of the Y-flip we did in the projection matrix,
the vertices are now being drawn in counter-clockwise order instead of clockwise
order. This causes backface culling to kick in and prevents any geometry from
being drawn. Go to the createGraphicsPipeline function and modify the
frontFace in VkPipelineRasterizationStateCreateInfo to correct this:
1 rasterizer.cullMode = VK_CULL_MODE_BACK_BIT;

181
2 rasterizer.frontFace = VK_FRONT_FACE_COUNTER_CLOCKWISE;

Run your program again and you should now see the following:

The rectangle has changed into a square because the projection matrix now cor-
rects for aspect ratio. The updateUniformBuffer takes care of screen resizing,
so we don’t need to recreate the descriptor set in recreateSwapChain.

Alignment requirements
One thing we’ve glossed over so far is how exactly the data in the C++ structure
should match with the uniform definition in the shader. It seems obvious enough
to simply use the same types in both:
1 struct UniformBufferObject {
2 glm::mat4 model;
3 glm::mat4 view;
4 glm::mat4 proj;
5 };
6
7 layout(binding = 0) uniform UniformBufferObject {
8 mat4 model;
9 mat4 view;

182
10 mat4 proj;
11 } ubo;

However, that’s not all there is to it. For example, try modifying the struct and
shader to look like this:
1 struct UniformBufferObject {
2 glm::vec2 foo;
3 glm::mat4 model;
4 glm::mat4 view;
5 glm::mat4 proj;
6 };
7
8 layout(binding = 0) uniform UniformBufferObject {
9 vec2 foo;
10 mat4 model;
11 mat4 view;
12 mat4 proj;
13 } ubo;

Recompile your shader and your program and run it and you’ll find that the
colorful square you worked so far has disappeared! That’s because we haven’t
taken into account the alignment requirements.
Vulkan expects the data in your structure to be aligned in memory in a specific
way, for example:
• Scalars have to be aligned by N (= 4 bytes given 32 bit floats).
• A vec2 must be aligned by 2N (= 8 bytes)
• A vec3 or vec4 must be aligned by 4N (= 16 bytes)
• A nested structure must be aligned by the base alignment of its members
rounded up to a multiple of 16.
• A mat4 matrix must have the same alignment as a vec4.
You can find the full list of alignment requirements in the specification.
Our original shader with just three mat4 fields already met the alignment re-
quirements. As each mat4 is 4 x 4 x 4 = 64 bytes in size, model has an offset
of 0, view has an offset of 64 and proj has an offset of 128. All of these are
multiples of 16 and that’s why it worked fine.
The new structure starts with a vec2 which is only 8 bytes in size and therefore
throws off all of the offsets. Now model has an offset of 8, view an offset of 72
and proj an offset of 136, none of which are multiples of 16. To fix this problem
we can use the alignas specifier introduced in C++11:
1 struct UniformBufferObject {
2 glm::vec2 foo;
3 alignas(16) glm::mat4 model;

183
4 glm::mat4 view;
5 glm::mat4 proj;
6 };

If you now compile and run your program again you should see that the shader
correctly receives its matrix values once again.
Luckily there is a way to not have to think about these alignment requirements
most of the time. We can define GLM_FORCE_DEFAULT_ALIGNED_GENTYPES right
before including GLM:
1 #define GLM_FORCE_RADIANS
2 #define GLM_FORCE_DEFAULT_ALIGNED_GENTYPES
3 #include <glm/glm.hpp>

This will force GLM to use a version of vec2 and mat4 that has the alignment
requirements already specified for us. If you add this definition then you can
remove the alignas specifier and your program should still work.
Unfortunately this method can break down if you start using nested structures.
Consider the following definition in the C++ code:
1 struct Foo {
2 glm::vec2 v;
3 };
4
5 struct UniformBufferObject {
6 Foo f1;
7 Foo f2;
8 };

And the following shader definition:

1 struct Foo {
2 vec2 v;
3 };
4
5 layout(binding = 0) uniform UniformBufferObject {
6 Foo f1;
7 Foo f2;
8 } ubo;

In this case f2 will have an offset of 8 whereas it should have an offset of 16 since
it is a nested structure. In this case you must specify the alignment yourself:
1 struct UniformBufferObject {
2 Foo f1;
3 alignas(16) Foo f2;
4 };

184
These gotchas are a good reason to always be explicit about alignment. That
way you won’t be caught offguard by the strange symptoms of alignment errors.
1 struct UniformBufferObject {
2 alignas(16) glm::mat4 model;
3 alignas(16) glm::mat4 view;
4 alignas(16) glm::mat4 proj;
5 };

Don’t forget to recompile your shader after removing the foo field.

Multiple descriptor sets

As some of the structures and function calls hinted at, it is actually possible to
bind multiple descriptor sets simultaneously. You need to specify a descriptor
layout for each descriptor set when creating the pipeline layout. Shaders can
then reference specific descriptor sets like this:
1 layout(set = 0, binding = 0) uniform UniformBufferObject { ... }

You can use this feature to put descriptors that vary per-object and descriptors
that are shared into separate descriptor sets. In that case you avoid rebinding
most of the descriptors across draw calls which is potentially more eﬀicient.
C++ code / Vertex shader / Fragment shader

185
Texture mapping

Images
Introduction
The geometry has been colored using per-vertex colors so far, which is a rather
limited approach. In this part of the tutorial we’re going to implement texture
mapping to make the geometry look more interesting. This will also allow us to
load and draw basic 3D models in a future chapter.
Adding a texture to our application will involve the following steps:
• Create an image object backed by device memory
• Fill it with pixels from an image file
• Create an image sampler
• Add a combined image sampler descriptor to sample colors from the tex-
ture
We’ve already worked with image objects before, but those were automatically
created by the swap chain extension. This time we’ll have to create one by
ourselves. Creating an image and filling it with data is similar to vertex buffer
creation. We’ll start by creating a staging resource and filling it with pixel data
and then we copy this to the final image object that we’ll use for rendering.
Although it is possible to create a staging image for this purpose, Vulkan also
allows you to copy pixels from a VkBuffer to an image and the API for this is
actually faster on some hardware. We’ll first create this buffer and fill it with
pixel values, and then we’ll create an image to copy the pixels to. Creating
an image is not very different from creating buffers. It involves querying the
memory requirements, allocating device memory and binding it, just like we’ve
seen before.
However, there is something extra that we’ll have to take care of when working
with images. Images can have different layouts that affect how the pixels are
organized in memory. Due to the way graphics hardware works, simply storing
the pixels row by row may not lead to the best performance, for example. When
performing any operation on images, you must make sure that they have the

186
layout that is optimal for use in that operation. We’ve actually already seen
some of these layouts when we specified the render pass:
• VK_IMAGE_LAYOUT_PRESENT_SRC_KHR: Optimal for presentation
• VK_IMAGE_LAYOUT_COLOR_ATTACHMENT_OPTIMAL: Optimal as attachment
for writing colors from the fragment shader
• VK_IMAGE_LAYOUT_TRANSFER_SRC_OPTIMAL: Optimal as source in a trans-
fer operation, like vkCmdCopyImageToBuffer
• VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL: Optimal as destination in a
transfer operation, like vkCmdCopyBufferToImage
• VK_IMAGE_LAYOUT_SHADER_READ_ONLY_OPTIMAL: Optimal for sampling
from a shader
One of the most common ways to transition the layout of an image is a pipeline
barrier. Pipeline barriers are primarily used for synchronizing access to re-
sources, like making sure that an image was written to before it is read, but they
can also be used to transition layouts. In this chapter we’ll see how pipeline
barriers are used for this purpose. Barriers can additionally be used to transfer
queue family ownership when using VK_SHARING_MODE_EXCLUSIVE.

Image library
There are many libraries available for loading images, and you can even write
your own code to load simple formats like BMP and PPM. In this tutorial we’ll
be using the stb_image library from the stb collection. The advantage of it
is that all of the code is in a single file, so it doesn’t require any tricky build
configuration. Download stb_image.h and store it in a convenient location,
like the directory where you saved GLFW and GLM. Add the location to your
include path.
Visual Studio
Add the directory with stb_image.h in it to the Additional Include
Directories paths.

Makefile
Add the directory with stb_image.h to the include directories for GCC:

187
1 VULKAN_SDK_PATH = /home/user/VulkanSDK/x.x.x.x/x86_64
2 STB_INCLUDE_PATH = /home/user/libraries/stb
3
4 ...
5
6 CFLAGS = -std=c++17 -I$(VULKAN_SDK_PATH)/include
-I$(STB_INCLUDE_PATH)

Loading an image
Include the image library like this:
1 #define STB_IMAGE_IMPLEMENTATION
2 #include <stb_image.h>

The header only defines the prototypes of the functions by default. One code
file needs to include the header with the STB_IMAGE_IMPLEMENTATION definition
to include the function bodies, otherwise we’ll get linking errors.
1 void initVulkan() {
2 ...
3 createCommandPool();
4 createTextureImage();
5 createVertexBuffer();
6 ...
7 }
8
9 ...
10
11 void createTextureImage() {
12
13 }

Create a new function createTextureImage where we’ll load an image and

upload it into a Vulkan image object. We’re going to use command buffers, so
it should be called after createCommandPool.
Create a new directory textures next to the shaders directory to store texture
images in. We’re going to load an image called texture.jpg from that directory.
I’ve chosen to use the following CC0 licensed image resized to 512 x 512 pixels,
but feel free to pick any image you want. The library supports most common
image file formats, like JPEG, PNG, BMP and GIF.

188
Loading an image with this library is really easy:
1 void createTextureImage() {
2 int texWidth, texHeight, texChannels;
3 stbi_uc* pixels = stbi_load("textures/texture.jpg", &texWidth,
&texHeight, &texChannels, STBI_rgb_alpha);
4 VkDeviceSize imageSize = texWidth * texHeight * 4;
5
6 if (!pixels) {
7 throw std::runtime_error("failed to load texture image!");
8 }
9 }

The stbi_load function takes the file path and number of channels to load as
arguments. The STBI_rgb_alpha value forces the image to be loaded with an
alpha channel, even if it doesn’t have one, which is nice for consistency with
other textures in the future. The middle three parameters are outputs for the

189
width, height and actual number of channels in the image. The pointer that is
returned is the first element in an array of pixel values. The pixels are laid out
row by row with 4 bytes per pixel in the case of STBI_rgb_alpha for a total of
texWidth * texHeight * 4 values.

Staging buffer
We’re now going to create a buffer in host visible memory so that we can use
vkMapMemory and copy the pixels to it. Add variables for this temporary buffer
to the createTextureImage function:
1 VkBuffer stagingBuffer;
2 VkDeviceMemory stagingBufferMemory;

The buffer should be in host visible memory so that we can map it and it should
be usable as a transfer source so that we can copy it to an image later on:
1 createBuffer(imageSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, stagingBuffer,
stagingBufferMemory);

We can then directly copy the pixel values that we got from the image loading
library to the buffer:
1 void* data;
2 vkMapMemory(device, stagingBufferMemory, 0, imageSize, 0, &data);
3 memcpy(data, pixels, static_cast<size_t>(imageSize));
4 vkUnmapMemory(device, stagingBufferMemory);

Don’t forget to clean up the original pixel array now:

1 stbi_image_free(pixels);

Texture Image
Although we could set up the shader to access the pixel values in the buffer, it’s
better to use image objects in Vulkan for this purpose. Image objects will make
it easier and faster to retrieve colors by allowing us to use 2D coordinates, for
one. Pixels within an image object are known as texels and we’ll use that name
from this point on. Add the following new class members:
1 VkImage textureImage;
2 VkDeviceMemory textureImageMemory;

The parameters for an image are specified in a VkImageCreateInfo struct:

190
1 VkImageCreateInfo imageInfo{};
2 imageInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
3 imageInfo.imageType = VK_IMAGE_TYPE_2D;
4 imageInfo.extent.width = static_cast<uint32_t>(texWidth);
5 imageInfo.extent.height = static_cast<uint32_t>(texHeight);
6 imageInfo.extent.depth = 1;
7 imageInfo.mipLevels = 1;
8 imageInfo.arrayLayers = 1;

The image type, specified in the imageType field, tells Vulkan with what kind
of coordinate system the texels in the image are going to be addressed. It is
possible to create 1D, 2D and 3D images. One dimensional images can be used
to store an array of data or gradient, two dimensional images are mainly used
for textures, and three dimensional images can be used to store voxel volumes,
for example. The extent field specifies the dimensions of the image, basically
how many texels there are on each axis. That’s why depth must be 1 instead
of 0. Our texture will not be an array and we won’t be using mipmapping for
now.
1 imageInfo.format = VK_FORMAT_R8G8B8A8_SRGB;

Vulkan supports many possible image formats, but we should use the same
format for the texels as the pixels in the buffer, otherwise the copy operation
will fail.
1 imageInfo.tiling = VK_IMAGE_TILING_OPTIMAL;

The tiling field can have one of two values:

• VK_IMAGE_TILING_LINEAR: Texels are laid out in row-major order like our
pixels array
• VK_IMAGE_TILING_OPTIMAL: Texels are laid out in an implementation de-
fined order for optimal access
Unlike the layout of an image, the tiling mode cannot be changed at a later
time. If you want to be able to directly access texels in the memory of the
image, then you must use VK_IMAGE_TILING_LINEAR. We will be using a staging
buffer instead of a staging image, so this won’t be necessary. We will be using
VK_IMAGE_TILING_OPTIMAL for eﬀicient access from the shader.
1 imageInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;

There are only two possible values for the initialLayout of an image:
• VK_IMAGE_LAYOUT_UNDEFINED: Not usable by the GPU and the very first
transition will discard the texels.
• VK_IMAGE_LAYOUT_PREINITIALIZED: Not usable by the GPU, but the first
transition will preserve the texels.

191
There are few situations where it is necessary for the texels to be preserved
during the first transition. One example, however, would be if you wanted to use
an image as a staging image in combination with the VK_IMAGE_TILING_LINEAR
layout. In that case, you’d want to upload the texel data to it and then transition
the image to be a transfer source without losing the data. In our case, however,
we’re first going to transition the image to be a transfer destination and then
copy texel data to it from a buffer object, so we don’t need this property and
can safely use VK_IMAGE_LAYOUT_UNDEFINED.
1 imageInfo.usage = VK_IMAGE_USAGE_TRANSFER_DST_BIT |
VK_IMAGE_USAGE_SAMPLED_BIT;

The usage field has the same semantics as the one during buffer creation.
The image is going to be used as destination for the buffer copy, so it should
be set up as a transfer destination. We also want to be able to access
the image from the shader to color our mesh, so the usage should include
VK_IMAGE_USAGE_SAMPLED_BIT.
1 imageInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;

The image will only be used by one queue family: the one that supports graphics
(and therefore also) transfer operations.
1 imageInfo.samples = VK_SAMPLE_COUNT_1_BIT;
2 imageInfo.flags = 0; // Optional

The samples flag is related to multisampling. This is only relevant for images
that will be used as attachments, so stick to one sample. There are some optional
flags for images that are related to sparse images. Sparse images are images
where only certain regions are actually backed by memory. If you were using
a 3D texture for a voxel terrain, for example, then you could use this to avoid
allocating memory to store large volumes of “air” values. We won’t be using it
in this tutorial, so leave it to its default value of 0.
1 if (vkCreateImage(device, &imageInfo, nullptr, &textureImage) !=
VK_SUCCESS) {
2 throw std::runtime_error("failed to create image!");
3 }

The image is created using vkCreateImage, which doesn’t have any particularly
noteworthy parameters. It is possible that the VK_FORMAT_R8G8B8A8_SRGB for-
mat is not supported by the graphics hardware. You should have a list of
acceptable alternatives and go with the best one that is supported. However,
support for this particular format is so widespread that we’ll skip this step. Us-
ing different formats would also require annoying conversions. We will get back
to this in the depth buffer chapter, where we’ll implement such a system.
1 VkMemoryRequirements memRequirements;

192
2 vkGetImageMemoryRequirements(device, textureImage, &memRequirements);
3
4 VkMemoryAllocateInfo allocInfo{};
5 allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
6 allocInfo.allocationSize = memRequirements.size;
7 allocInfo.memoryTypeIndex =
findMemoryType(memRequirements.memoryTypeBits,
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT);
8
9 if (vkAllocateMemory(device, &allocInfo, nullptr,
&textureImageMemory) != VK_SUCCESS) {
10 throw std::runtime_error("failed to allocate image memory!");
11 }
12
13 vkBindImageMemory(device, textureImage, textureImageMemory, 0);

Allocating memory for an image works in exactly the same way as allocat-
ing memory for a buffer. Use vkGetImageMemoryRequirements instead of
vkGetBufferMemoryRequirements, and use vkBindImageMemory instead of
vkBindBufferMemory.
This function is already getting quite large and there’ll be a need to create
more images in later chapters, so we should abstract image creation into a
createImage function, like we did for buffers. Create the function and move
the image object creation and memory allocation to it:
1 void createImage(uint32_t width, uint32_t height, VkFormat format,
VkImageTiling tiling, VkImageUsageFlags usage,
VkMemoryPropertyFlags properties, VkImage& image,
VkDeviceMemory& imageMemory) {
2 VkImageCreateInfo imageInfo{};
3 imageInfo.sType = VK_STRUCTURE_TYPE_IMAGE_CREATE_INFO;
4 imageInfo.imageType = VK_IMAGE_TYPE_2D;
5 imageInfo.extent.width = width;
6 imageInfo.extent.height = height;
7 imageInfo.extent.depth = 1;
8 imageInfo.mipLevels = 1;
9 imageInfo.arrayLayers = 1;
10 imageInfo.format = format;
11 imageInfo.tiling = tiling;
12 imageInfo.initialLayout = VK_IMAGE_LAYOUT_UNDEFINED;
13 imageInfo.usage = usage;
14 imageInfo.samples = VK_SAMPLE_COUNT_1_BIT;
15 imageInfo.sharingMode = VK_SHARING_MODE_EXCLUSIVE;
16
17 if (vkCreateImage(device, &imageInfo, nullptr, &image) !=
VK_SUCCESS) {

193
18 throw std::runtime_error("failed to create image!");
19 }
20
21 VkMemoryRequirements memRequirements;
22 vkGetImageMemoryRequirements(device, image, &memRequirements);
23
24 VkMemoryAllocateInfo allocInfo{};
25 allocInfo.sType = VK_STRUCTURE_TYPE_MEMORY_ALLOCATE_INFO;
26 allocInfo.allocationSize = memRequirements.size;
27 allocInfo.memoryTypeIndex =
findMemoryType(memRequirements.memoryTypeBits, properties);
28
29 if (vkAllocateMemory(device, &allocInfo, nullptr, &imageMemory)
!= VK_SUCCESS) {
30 throw std::runtime_error("failed to allocate image memory!");
31 }
32
33 vkBindImageMemory(device, image, imageMemory, 0);
34 }

I’ve made the width, height, format, tiling mode, usage, and memory properties
parameters, because these will all vary between the images we’ll be creating
throughout this tutorial.
The createTextureImage function can now be simplified to:
1 void createTextureImage() {
2 int texWidth, texHeight, texChannels;
3 stbi_uc* pixels = stbi_load("textures/texture.jpg", &texWidth,
&texHeight, &texChannels, STBI_rgb_alpha);
4 VkDeviceSize imageSize = texWidth * texHeight * 4;
5
6 if (!pixels) {
7 throw std::runtime_error("failed to load texture image!");
8 }
9
10 VkBuffer stagingBuffer;
11 VkDeviceMemory stagingBufferMemory;
12 createBuffer(imageSize, VK_BUFFER_USAGE_TRANSFER_SRC_BIT,
VK_MEMORY_PROPERTY_HOST_VISIBLE_BIT |
VK_MEMORY_PROPERTY_HOST_COHERENT_BIT, stagingBuffer,
stagingBufferMemory);
13
14 void* data;
15 vkMapMemory(device, stagingBufferMemory, 0, imageSize, 0, &data);
16 memcpy(data, pixels, static_cast<size_t>(imageSize));
17 vkUnmapMemory(device, stagingBufferMemory);

194
18
19 stbi_image_free(pixels);
20
21 createImage(texWidth, texHeight, VK_FORMAT_R8G8B8A8_SRGB,
VK_IMAGE_TILING_OPTIMAL, VK_IMAGE_USAGE_TRANSFER_DST_BIT |
VK_IMAGE_USAGE_SAMPLED_BIT,
VK_MEMORY_PROPERTY_DEVICE_LOCAL_BIT, textureImage,
textureImageMemory);
22 }

Layout transitions
The function we’re going to write now involves recording and executing a com-
mand buffer again, so now’s a good time to move that logic into a helper function
or two:
1 VkCommandBuffer beginSingleTimeCommands() {
2 VkCommandBufferAllocateInfo allocInfo{};
3 allocInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_ALLOCATE_INFO;
4 allocInfo.level = VK_COMMAND_BUFFER_LEVEL_PRIMARY;
5 allocInfo.commandPool = commandPool;
6 allocInfo.commandBufferCount = 1;
7
8 VkCommandBuffer commandBuffer;
9 vkAllocateCommandBuffers(device, &allocInfo, &commandBuffer);
10
11 VkCommandBufferBeginInfo beginInfo{};
12 beginInfo.sType = VK_STRUCTURE_TYPE_COMMAND_BUFFER_BEGIN_INFO;
13 beginInfo.flags = VK_COMMAND_BUFFER_USAGE_ONE_TIME_SUBMIT_BIT;
14
15 vkBeginCommandBuffer(commandBuffer, &beginInfo);
16
17 return commandBuffer;
18 }
19
20 void endSingleTimeCommands(VkCommandBuffer commandBuffer) {
21 vkEndCommandBuffer(commandBuffer);
22
23 VkSubmitInfo submitInfo{};
24 submitInfo.sType = VK_STRUCTURE_TYPE_SUBMIT_INFO;
25 submitInfo.commandBufferCount = 1;
26 submitInfo.pCommandBuffers = &commandBuffer;
27
28 vkQueueSubmit(graphicsQueue, 1, &submitInfo, VK_NULL_HANDLE);
29 vkQueueWaitIdle(graphicsQueue);
30

195
31 vkFreeCommandBuffers(device, commandPool, 1, &commandBuffer);
32 }

The code for these functions is based on the existing code in copyBuffer. You
can now simplify that function to:
1 void copyBuffer(VkBuffer srcBuffer, VkBuffer dstBuffer, VkDeviceSize
size) {
2 VkCommandBuffer commandBuffer = beginSingleTimeCommands();
3
4 VkBufferCopy copyRegion{};
5 copyRegion.size = size;
6 vkCmdCopyBuffer(commandBuffer, srcBuffer, dstBuffer, 1,
&copyRegion);
7
8 endSingleTimeCommands(commandBuffer);
9 }

If we were still using buffers, then we could now write a function to record and
execute vkCmdCopyBufferToImage to finish the job, but this command requires
the image to be in the right layout first. Create a new function to handle layout
transitions:
1 void transitionImageLayout(VkImage image, VkFormat format,
VkImageLayout oldLayout, VkImageLayout newLayout) {
2 VkCommandBuffer commandBuffer = beginSingleTimeCommands();
3
4 endSingleTimeCommands(commandBuffer);
5 }

One of the most common ways to perform layout transitions is using an image
memory barrier. A pipeline barrier like that is generally used to synchronize
access to resources, like ensuring that a write to a buffer completes before read-
ing from it, but it can also be used to transition image layouts and transfer
queue family ownership when VK_SHARING_MODE_EXCLUSIVE is used. There is
an equivalent buffer memory barrier to do this for buffers.
1 VkImageMemoryBarrier barrier{};
2 barrier.sType = VK_STRUCTURE_TYPE_IMAGE_MEMORY_BARRIER;
3 barrier.oldLayout = oldLayout;
4 barrier.newLayout = newLayout;

The first two fields specify layout transition. It is possible to use

VK_IMAGE_LAYOUT_UNDEFINED as oldLayout if you don’t care about the
existing contents of the image.
1 barrier.srcQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;
2 barrier.dstQueueFamilyIndex = VK_QUEUE_FAMILY_IGNORED;

196
If you are using the barrier to transfer queue family ownership, then these
two fields should be the indices of the queue families. They must be set to
VK_QUEUE_FAMILY_IGNORED if you don’t want to do this (not the default value!).
1 barrier.image = image;
2 barrier.subresourceRange.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
3 barrier.subresourceRange.baseMipLevel = 0;
4 barrier.subresourceRange.levelCount = 1;
5 barrier.subresourceRange.baseArrayLayer = 0;
6 barrier.subresourceRange.layerCount = 1;

The image and subresourceRange specify the image that is affected and the
specific part of the image. Our image is not an array and does not have mipmap-
ping levels, so only one level and layer are specified.
1 barrier.srcAccessMask = 0; // TODO
2 barrier.dstAccessMask = 0; // TODO

Barriers are primarily used for synchronization purposes, so you must specify
which types of operations that involve the resource must happen before the bar-
rier, and which operations that involve the resource must wait on the barrier.
We need to do that despite already using vkQueueWaitIdle to manually syn-
chronize. The right values depend on the old and new layout, so we’ll get back
to this once we’ve figured out which transitions we’re going to use.
1 vkCmdPipelineBarrier(
2 commandBuffer,
3 0 /* TODO */, 0 /* TODO */,
4 0,
5 0, nullptr,
6 0, nullptr,
7 1, &barrier
8 );

All types of pipeline barriers are submitted using the same function. The first pa-
rameter after the command buffer specifies in which pipeline stage the operations
occur that should happen before the barrier. The second parameter specifies the
pipeline stage in which operations will wait on the barrier. The pipeline stages
that you are allowed to specify before and after the barrier depend on how you
use the resource before and after the barrier. The allowed values are listed in this
table of the specification. For example, if you’re going to read from a uniform
after the barrier, you would specify a usage of VK_ACCESS_UNIFORM_READ_BIT
and the earliest shader that will read from the uniform as pipeline stage, for
example VK_PIPELINE_STAGE_FRAGMENT_SHADER_BIT. It would not make sense
to specify a non-shader pipeline stage for this type of usage and the validation
layers will warn you when you specify a pipeline stage that does not match the
type of usage.

197
The third parameter is either 0 or VK_DEPENDENCY_BY_REGION_BIT. The latter
turns the barrier into a per-region condition. That means that the implementa-
tion is allowed to already begin reading from the parts of a resource that were
written so far, for example.
The last three pairs of parameters reference arrays of pipeline barriers of the
three available types: memory barriers, buffer memory barriers, and image
memory barriers like the one we’re using here. Note that we’re not using the
VkFormat parameter yet, but we’ll be using that one for special transitions in
the depth buffer chapter.

Copying buffer to image

Before we get back to createTextureImage, we’re going to write one more
helper function: copyBufferToImage:
1 void copyBufferToImage(VkBuffer buffer, VkImage image, uint32_t
width, uint32_t height) {
2 VkCommandBuffer commandBuffer = beginSingleTimeCommands();
3
4 endSingleTimeCommands(commandBuffer);
5 }

Just like with buffer copies, you need to specify which part of the buffer
is going to be copied to which part of the image. This happens through
VkBufferImageCopy structs:
1 VkBufferImageCopy region{};
2 region.bufferOffset = 0;
3 region.bufferRowLength = 0;
4 region.bufferImageHeight = 0;
5
6 region.imageSubresource.aspectMask = VK_IMAGE_ASPECT_COLOR_BIT;
7 region.imageSubresource.mipLevel = 0;
8 region.imageSubresource.baseArrayLayer = 0;
9 region.imageSubresource.layerCount = 1;
10
11 region.imageOffset = {0, 0, 0};
12 region.imageExtent = {
13 width,
14 height,
15 1
16 };

Most of these fields are self-explanatory. The bufferOffset specifies the byte
offset in the buffer at which the pixel values start. The bufferRowLength and
bufferImageHeight fields specify how the pixels are laid out in memory. For

198
example, you could have some padding bytes between rows of the image. Spec-
ifying 0 for both indicates that the pixels are simply tightly packed like they
are in our case. The imageSubresource, imageOffset and imageExtent fields
indicate to which part of the image we want to copy the pixels.
Buffer to image copy operations are enqueued using the vkCmdCopyBufferToImage
function:
1 vkCmdCopyBufferToImage(
2 commandBuffer,
3 buffer,
4 image,
5 VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL,
6 1,
7 &region
8 );

The fourth parameter indicates which layout the image is currently using. I’m
assuming here that the image has already been transitioned to the layout that is
optimal for copying pixels to. Right now we’re only copying one chunk of pixels
to the whole image, but it’s possible to specify an array of VkBufferImageCopy
to perform many different copies from this buffer to the image in one operation.

Preparing the texture image

We now have all of the tools we need to finish setting up the texture image,
so we’re going back to the createTextureImage function. The last thing we
did there was creating the texture image. The next step is to copy the staging
buffer to the texture image. This involves two steps:
• Transition the texture image to VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL
• Execute the buffer to image copy operation
This is easy to do with the functions we just created:
1 transitionImageLayout(textureImage, VK_FORMAT_R8G8B8A8_SRGB,
VK_IMAGE_LAYOUT_UNDEFINED, VK_IMAGE_LAYOUT_TRANSFER_DST_OPTIMAL);
2 copyBufferToImage(stagingBuffer, textureImage,
static_cast<uint32_t>(texWidth),
static_cast<uint32_t>(texHeight));

The image was created with the VK_IMAGE_LAYOUT_UNDEFINED layout, so that

one should be specified as old layout when transitioning textureImage. Re-
member that we can do this because we don’t care about its contents before
performing the copy operation.
To be able to start sampling from the texture image in the shader, we need one
last transition to prepare it for shader access:

199

Vulkan in C++ (By Nvidia)
100% (1)
Vulkan in C++ (By Nvidia)
32 pages
Vulkan11 Reference Guide
No ratings yet
Vulkan11 Reference Guide
16 pages
Vulkan Tutorial En-101-150
No ratings yet
Vulkan Tutorial En-101-150
50 pages
Vulkan Tutorial En-210-288
No ratings yet
Vulkan Tutorial En-210-288
79 pages
NVVK RTX On
No ratings yet
NVVK RTX On
39 pages
Vulkan Setup for Developers
No ratings yet
Vulkan Setup for Developers
32 pages
Vulkan 1.0 API Quick Reference
No ratings yet
Vulkan 1.0 API Quick Reference
13 pages
Vulkan
No ratings yet
Vulkan
77 pages
Command Function Pointers Return Codes: Vulkan
No ratings yet
Command Function Pointers Return Codes: Vulkan
13 pages
OpenGL ES 2 0 Reference Card
No ratings yet
OpenGL ES 2 0 Reference Card
4 pages
Log - 2025 04 15 - 01 40 57
No ratings yet
Log - 2025 04 15 - 01 40 57
12 pages
Amd 2018 Porting To Vulkan dx12 Adam Sawicki
No ratings yet
Amd 2018 Porting To Vulkan dx12 Adam Sawicki
45 pages
Vulkan Klein
No ratings yet
Vulkan Klein
6 pages
Assignment5 Writeup
No ratings yet
Assignment5 Writeup
7 pages
VK Loader Core
No ratings yet
VK Loader Core
54 pages
5 - Paging
No ratings yet
5 - Paging
4 pages
Virtual Memory Classnotes
No ratings yet
Virtual Memory Classnotes
70 pages
Log - 2025 04 15 - 01 44 31
No ratings yet
Log - 2025 04 15 - 01 44 31
9,018 pages
Ashmem
No ratings yet
Ashmem
10 pages
Lecture 23
No ratings yet
Lecture 23
62 pages
DMA Rendering and Store Queues With KOS On The Sega Dreamcast
No ratings yet
DMA Rendering and Store Queues With KOS On The Sega Dreamcast
3 pages
Script Imgui
No ratings yet
Script Imgui
32 pages
Complex Computing Problem
No ratings yet
Complex Computing Problem
18 pages
NetworkBuffers TheBSD Unix SVR4 and Linux Approach - SKB Reduced
No ratings yet
NetworkBuffers TheBSD Unix SVR4 and Linux Approach - SKB Reduced
124 pages
Understanding Vnodes in VFS
100% (2)
Understanding Vnodes in VFS
491 pages
Opengl44 Quick Reference Card
No ratings yet
Opengl44 Quick Reference Card
12 pages
Vulkan Abridged
No ratings yet
Vulkan Abridged
256 pages
PDC Lecture 10
No ratings yet
PDC Lecture 10
32 pages
Data-Oriented Design and C++ - Mike Acton - CppCon 2014
100% (1)
Data-Oriented Design and C++ - Mike Acton - CppCon 2014
201 pages
7 - C++ Data Types
No ratings yet
7 - C++ Data Types
3 pages
Virtual Memory & Heap Management
No ratings yet
Virtual Memory & Heap Management
61 pages
Vulkan Tutorial En-1-50
No ratings yet
Vulkan Tutorial En-1-50
50 pages
DigitalLogic ComputerOrganization L24 VirtualMemory Handout
No ratings yet
DigitalLogic ComputerOrganization L24 VirtualMemory Handout
30 pages
Class 13
No ratings yet
Class 13
19 pages
N Ntroduction TO Rogramming: Through
No ratings yet
N Ntroduction TO Rogramming: Through
21 pages
VK Loader Core2
No ratings yet
VK Loader Core2
57 pages
Less Slow C++ - Hacker News
No ratings yet
Less Slow C++ - Hacker News
3 pages
OS Internals Volume II - Kernel Mode - Jonathan Levin
No ratings yet
OS Internals Volume II - Kernel Mode - Jonathan Levin
492 pages
Lec 31
No ratings yet
Lec 31
20 pages
Vbo Vao
No ratings yet
Vbo Vao
19 pages
C++ Refresher
No ratings yet
C++ Refresher
27 pages
Chapter 3 Memory Managment
No ratings yet
Chapter 3 Memory Managment
38 pages
Double Buffering on LPC1788: Solutions
No ratings yet
Double Buffering on LPC1788: Solutions
3 pages
Large Code Bases
No ratings yet
Large Code Bases
31 pages
CEN 515-New
No ratings yet
CEN 515-New
82 pages
Pipeline
No ratings yet
Pipeline
13 pages
Section 4 - Drawing Colors
No ratings yet
Section 4 - Drawing Colors
60 pages
CUDA OpenGL
No ratings yet
CUDA OpenGL
9 pages
Linux VM MM
No ratings yet
Linux VM MM
126 pages
Lecture 8
No ratings yet
Lecture 8
67 pages
Array of Vector Pointers
No ratings yet
Array of Vector Pointers
5 pages
Modern C++ Performance Optimization
No ratings yet
Modern C++ Performance Optimization
92 pages
Lecture 13 Memory Management V2
No ratings yet
Lecture 13 Memory Management V2
46 pages
Migdalskiy Sergiy Physics Optimization Strategies
No ratings yet
Migdalskiy Sergiy Physics Optimization Strategies
104 pages
Parallel BFS On Graphs Using GPGPU
No ratings yet
Parallel BFS On Graphs Using GPGPU
10 pages
Stoic Wisdom for Personal Growth
No ratings yet
Stoic Wisdom for Personal Growth
10 pages
Vulkan Debugging with Validation Layers
No ratings yet
Vulkan Debugging with Validation Layers
50 pages
Stoic Wisdom for Modern Leaders
No ratings yet
Stoic Wisdom for Modern Leaders
10 pages
A Novel Method For Sea-Land Clutter Separation Using Regularized Randomized and Kernel Ridge Neural Networks
No ratings yet
A Novel Method For Sea-Land Clutter Separation Using Regularized Randomized and Kernel Ridge Neural Networks
21 pages
Sea Clutter Suppression For Radar PPI Images Based On SCS-GAN
No ratings yet
Sea Clutter Suppression For Radar PPI Images Based On SCS-GAN
6 pages
Multi Gpu-5
No ratings yet
Multi Gpu-5
1 page
Multi Gpu-7
No ratings yet
Multi Gpu-7
1 page
868 MHZ and 915 MHZ PCB Antenna PDF
No ratings yet
868 MHZ and 915 MHZ PCB Antenna PDF
17 pages
Fishing Reel Parts Guide
No ratings yet
Fishing Reel Parts Guide
1 page
Imaging and Design For Visual Message Using Infographics: LAS For Empowerment Technologies (Grade 11)
No ratings yet
Imaging and Design For Visual Message Using Infographics: LAS For Empowerment Technologies (Grade 11)
3 pages
Vehicle Diagnostics & Maintenance Guide
No ratings yet
Vehicle Diagnostics & Maintenance Guide
183 pages
Faqs YOURSHOT Contest
No ratings yet
Faqs YOURSHOT Contest
1 page
CS211 Exam PDF
No ratings yet
CS211 Exam PDF
8 pages
CSE (IOT) OOPS Theory Internal
No ratings yet
CSE (IOT) OOPS Theory Internal
2 pages
Fisherr GX Control Valve and Actuator System: Scope of Manual
No ratings yet
Fisherr GX Control Valve and Actuator System: Scope of Manual
68 pages
Sample Procedural Manual
No ratings yet
Sample Procedural Manual
60 pages
Start Up Routine OVF10 PDF
100% (2)
Start Up Routine OVF10 PDF
12 pages
High School Enrollment System
0% (1)
High School Enrollment System
6 pages
Semi-Thue System - Wikipedia
No ratings yet
Semi-Thue System - Wikipedia
1 page
Gas Leakage Controller Using Arduino
No ratings yet
Gas Leakage Controller Using Arduino
11 pages
Literature Review Examples For Social Work
100% (2)
Literature Review Examples For Social Work
7 pages
Cal Poly Pomona: Nguyen Jason, Andrew Co
No ratings yet
Cal Poly Pomona: Nguyen Jason, Andrew Co
6 pages
Math 2 All
100% (1)
Math 2 All
107 pages
Barco Alchemy ICMP Datasheet
No ratings yet
Barco Alchemy ICMP Datasheet
2 pages
3HAC049108 SP IRB 4600-En PDF
No ratings yet
3HAC049108 SP IRB 4600-En PDF
50 pages
Rta For Vis2515 - bcd112 (FTTH Overlay)
No ratings yet
Rta For Vis2515 - bcd112 (FTTH Overlay)
35 pages
Computer 7 TQ
No ratings yet
Computer 7 TQ
4 pages
Types of Digital Data
No ratings yet
Types of Digital Data
33 pages
CNS V2 PDF
No ratings yet
CNS V2 PDF
332 pages
2 Bac U 2 Writing A Funny Story Model PDF
No ratings yet
2 Bac U 2 Writing A Funny Story Model PDF
1 page
New Headway Academic Skills Level 2 Stud
No ratings yet
New Headway Academic Skills Level 2 Stud
7 pages
Advanced PDE Homework Solutions
No ratings yet
Advanced PDE Homework Solutions
4 pages
Updated References Second Sem 2015 2016
No ratings yet
Updated References Second Sem 2015 2016
50 pages
OS Practical Notes
No ratings yet
OS Practical Notes
21 pages
Stereo Optical Product Catalog 09 2024 Reduced
No ratings yet
Stereo Optical Product Catalog 09 2024 Reduced
15 pages
Case Studies in Dynamic Systems and Control: Figure 11.14 Simulink Diagram For The Solenoid Actuator
No ratings yet
Case Studies in Dynamic Systems and Control: Figure 11.14 Simulink Diagram For The Solenoid Actuator
1 page
Admission Brochure For PH.D Programme - Academic Year 2025-2026 (July Session)
No ratings yet
Admission Brochure For PH.D Programme - Academic Year 2025-2026 (July Session)
9 pages
Google Cloud Messaging For Android
No ratings yet
Google Cloud Messaging For Android
63 pages

Vulkan Tutorial En-151-200

Uploaded by

Vulkan Tutorial En-151-200

Uploaded by

Pipeline vertex input

Vertex buffer creation

Creating a buffer requires us to fill a VkBufferCreateInfo structure.

The VkMemoryRequirements struct has three fields:

The VkPhysicalDeviceMemoryProperties structure has two arrays

Filling the vertex buffer

Binding the vertex buffer

The vkCmdBindVertexBuffers function is used to bind vertex buffers to bind-

Abstracting buffer creation

Using a staging buffer

We’re now using a new stagingBuffer with stagingBufferMemory for mapping

And immediately start recording the command buffer:

Contents of buffers are transferred using the vkCmdCopyBuffer command. It

The createIndexBuffer function that we’ll add now is almost identical to

An index buffer is bound with vkCmdBindIndexBuffer which has the index

Descriptor layout and buffer

Descriptor set layout

Every binding needs to be described through a VkDescriptorSetLayoutBinding

We can then create it using vkCreateDescriptorSetLayout. This function ac-

Similarly, create a new function createUniformBuffers that is called after

Updating uniform data

The glm::rotate function takes an existing transformation, rotation angle and

I’ve chosen to use a perspective projection with a 45 degree vertical field-of-

Descriptor pool and sets

A descriptor set allocation is described with a VkDescriptorSetAllocateInfo

The updates are applied using vkUpdateDescriptorSets. It accepts two kinds

Using descriptor sets

And the following shader definition:

Multiple descriptor sets

Create a new function createTextureImage where we’ll load an image and

Don’t forget to clean up the original pixel array now:

The parameters for an image are specified in a VkImageCreateInfo struct:

The tiling field can have one of two values:

The first two fields specify layout transition. It is possible to use

Copying buffer to image

Preparing the texture image

The image was created with the VK_IMAGE_LAYOUT_UNDEFINED layout, so that

You might also like