Today's task: Save the terrain as a vertex buffer on the GPU to avoid
copying the vertex information every frame.
Create the Vertex Buffer
The terrain is stored in CPU memory as a grid, while the GPU expects
everything to be defined as triangles. ("Tessellation" is the process
of breaking a solid shape down into its surface triangles.) The simplest
approach is to convert each set of 4 vertices into a tile composed of
2 triangles of 3 vertices each. Each vertex is a VertexPositionNormalColor
struct. The triangles need to be "wound clockwise" (looking down on the
XZ plane).
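To make the tile-to-triangles step concrete, here is a minimal sketch that tessellates a small grid using plain integer coordinates. GridPt, tessellate() and signedArea2() are hypothetical names for illustration only, and n is assumed to be the number of vertices along one side of the grid. The signed area does not say which way is "clockwise" on screen (that depends on the handedness of the view), only that every triangle is wound the same way.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a terrain vertex: grid coordinates only.
struct GridPt { int x, z; };

// Split an n-by-n grid of vertices into a triangle list:
// (n-1)*(n-1) tiles, 2 triangles per tile, 3 vertices per triangle.
std::vector<GridPt> tessellate(int n) {
    std::vector<GridPt> out;
    if (n < 2) return out;
    out.reserve(static_cast<std::size_t>(n - 1) * (n - 1) * 6);
    for (int x = 0; x + 1 < n; x++) {
        for (int z = 0; z + 1 < n; z++) {
            // First triangle of the tile.
            out.push_back({x, z});
            out.push_back({x + 1, z});
            out.push_back({x + 1, z + 1});
            // Second triangle shares the diagonal with the first.
            out.push_back({x, z});
            out.push_back({x + 1, z + 1});
            out.push_back({x, z + 1});
        }
    }
    return out;
}

// Twice the signed area of a triangle in the XZ plane. Triangles with
// the same sign are wound in the same direction.
int signedArea2(const GridPt &a, const GridPt &b, const GridPt &c) {
    return (b.x - a.x) * (c.z - a.z) - (b.z - a.z) * (c.x - a.x);
}
```

Calling tessellate(3) gives 4 tiles and 24 vertices, and signedArea2() returns the same non-zero value for every triangle in the list.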
CreateVB():
//Copy vertices to GPU as a static vertex buffer.
//Vertices are arranged as a triangle list (complete triangles)
//3 vertices per triangle, 2 triangles per tile.
HRESULT nTerrain::CreateVB(ID3D11DeviceContext *pDC) {
  HRESULT Err= 0;
  std::vector<VertexPositionNormalColor> vecVrt;
  vecVrt.resize(cTile*cTile*2*3);
  int nVrt= 0;
  for(int x0=0;x0+1<cTile;x0++) {
    for(int z0=0;z0+1<cTile;z0++) {
      //First triangle of the tile
      vecVrt[nVrt++]= getVertex(x0,z0);
      vecVrt[nVrt++]= getVertex(x0+1,z0);
      vecVrt[nVrt++]= getVertex(x0+1,z0+1);
      //Second triangle of the tile
      vecVrt[nVrt++]= getVertex(x0,z0);
      vecVrt[nVrt++]= getVertex(x0+1,z0+1);
      vecVrt[nVrt++]= getVertex(x0,z0+1);
    }
  }
  //Describe the buffer: static (GPU-only) vertex data.
  D3D11_BUFFER_DESC vbDesc;
  Zero(vbDesc);
  vbDesc.Usage= D3D11_USAGE_DEFAULT;
  vbDesc.ByteWidth= cTile*cTile*2*3*sizeof(VertexPositionNormalColor);
  vbDesc.BindFlags= D3D11_BIND_VERTEX_BUFFER;
  vbDesc.CPUAccessFlags= 0;
  vbDesc.MiscFlags= 0;
  //Point to the source data in CPU memory.
  D3D11_SUBRESOURCE_DATA vbInit;
  Zero(vbInit);
  vbInit.pSysMem= vecVrt.data();
  vbInit.SysMemPitch= 0;
  vbInit.SysMemSlicePitch= 0;
  //GetDevice() adds a reference to the device; the ComPtr releases it,
  //even if ThrowIfFailed() throws.
  ComPtr<ID3D11Device> pDevice;
  pDC->GetDevice(pDevice.GetAddressOf());
  DX::ThrowIfFailed(Err= pDevice->CreateBuffer(&vbDesc,&vbInit,&pVB));
  return(Err);
}
A triangle list is the easiest primitive topology to use because
each triangle is fully defined without referencing any other triangle.
The downside is a lot of redundant data: every interior vertex is
shared by 4 tiles and, with this tessellation, is copied into the
list 6 times. A triangle strip is more efficient, but constructing it
is a bit more complicated and can wait as an optimization step until
the basic vertex buffer is working.
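The redundancy is easy to put numbers on. The sketch below compares the vertex counts for a square grid, assuming n vertices along each side and assuming the strip stitches each row to the next with 2 repeated vertices (one common scheme, not necessarily the only one). The function names are hypothetical.

```cpp
#include <cstdint>

// n = vertices along one side of a square grid (assumed >= 2).
std::uint64_t uniqueVertices(std::uint64_t n) { return n * n; }

// Triangle list: 6 vertices for each of the (n-1)*(n-1) tiles.
std::uint64_t listVertices(std::uint64_t n) { return (n - 1) * (n - 1) * 6; }

// Triangle strip: one strip of 2*n vertices per row of tiles, plus
// 2 repeated vertices to stitch each row to the next (assumed scheme).
std::uint64_t stripVertices(std::uint64_t n) {
    return (n - 1) * 2 * n + (n - 2) * 2;
}
```

For a 3000x3000 grid (9,000,000 unique vertices) this gives 53,964,006 vertices for the list and 17,999,996 for the strip, so the strip is roughly a third of the size.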
The D3D11_BUFFER_DESC describes the buffer to be created: its size,
how it will be used, and where it can be bound in the pipeline.
ByteWidth is the number of bytes in the entire buffer.
D3D11_BIND_VERTEX_BUFFER allows the buffer to be bound as a list of vertices.
D3D11_SUBRESOURCE_DATA contains information about where to find
the source data in CPU memory.
CreateBuffer() will both allocate GPU memory for the buffer and
copy the source data from the CPU memory described by D3D11_SUBRESOURCE_DATA.
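Since ByteWidth is a 32-bit UINT, it is worth checking that a very large terrain still fits. This is a hedged sketch, not part of the terrain class: VertexPNC is a hypothetical stand-in that assumes VertexPositionNormalColor is a tightly packed float3 position, float3 normal and float4 color (40 bytes), and bufferByteWidth() is an illustrative helper.

```cpp
#include <cstdint>
#include <limits>

// Stand-in with the layout assumed for VertexPositionNormalColor:
// float3 position, float3 normal, float4 color.
struct VertexPNC {
    float position[3];
    float normal[3];
    float color[4];
};
static_assert(sizeof(VertexPNC) == 40, "expected a tightly packed 40-byte vertex");

// Compute the buffer size in 64 bits first. Returns true and writes
// byteWidth only when the size fits in a 32-bit value like ByteWidth.
bool bufferByteWidth(std::uint64_t vertexCount, std::uint32_t &byteWidth) {
    const std::uint64_t bytes = vertexCount * sizeof(VertexPNC);
    if (bytes > std::numeric_limits<std::uint32_t>::max()) return false;
    byteWidth = static_cast<std::uint32_t>(bytes);
    return true;
}
```

With these assumptions a 24-vertex list needs 960 bytes, while 200,000,000 vertices would need 8,000,000,000 bytes and no longer fit in a 32-bit ByteWidth.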
Render
Now I can render the grid directly from GPU memory without copying
any vertex data. RenderSys() is the original version that rebuilds the
vertex buffer every frame. RenderVB() assumes the vertex buffer is
already present in GPU memory and simply tells the GPU to run through
the triangle list. Note that Draw() takes the count of vertices, not
triangles.
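Because Draw() counts vertices, the number of triangles that actually get assembled depends on the topology. A small sketch of that relationship, using a hypothetical Topology enum in place of the D3D11 constants:

```cpp
#include <cstdint>

// Hypothetical stand-in for the two D3D11 topologies used here.
enum class Topology { TriangleList, TriangleStrip };

// Number of triangles assembled from a given vertex count.
// A list consumes 3 vertices per triangle; a strip produces one
// triangle for every vertex after the first two.
std::uint32_t trianglesDrawn(Topology t, std::uint32_t vertexCount) {
    if (vertexCount < 3) return 0;
    return (t == Topology::TriangleList) ? vertexCount / 3 : vertexCount - 2;
}
```

For example, 24 list vertices give 8 triangles, while 14 strip vertices give 12 triangles (some of which may be zero-area stitching triangles).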
Triangle strips
Triangle strips are a more efficient way to store the vertices for
sets of conjoined triangles. I suspect there is no rendering
performance benefit to triangle strips over triangle lists, since the
GPU renders the same polygons either way. Triangle strips just remove
redundant data from the model, reducing the memory requirements for
very large and complex shapes.
The triangle strip can "jump" from one vertex to another by
defining "degenerate" triangles with an area of zero. A triangle is
degenerate when two of its three vertices coincide (or all three lie
on one line), so the usual trick is to repeat a vertex in the strip.
In this case, I need to jump from the end of each Z row to the
beginning of the next.
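As a sketch of the stitching idea (one common scheme, assuming n vertices along each side of the grid), the code below repeats the last vertex of each row and the first vertex of the next. GridPt, buildStrip(), signedArea2() and countRealTriangles() are hypothetical names for illustration, and plain grid coordinates stand in for the real vertices.

```cpp
#include <cstddef>
#include <vector>

struct GridPt { int x, z; };

// Build one triangle strip covering an n-by-n grid of vertices.
// Rows are joined by repeating the last vertex of one row and the
// first vertex of the next, so every seam triangle has zero area.
std::vector<GridPt> buildStrip(int n) {
    std::vector<GridPt> out;
    for (int x = 0; x + 1 < n; x++) {
        if (x > 0) out.push_back({x, 0});              // repeat first vertex of this row
        for (int z = 0; z < n; z++) {
            out.push_back({x, z});
            out.push_back({x + 1, z});
        }
        if (x + 2 < n) out.push_back({x + 1, n - 1});  // repeat last vertex of this row
    }
    return out;
}

// Twice the signed area of a triangle in the XZ plane (0 = degenerate).
int signedArea2(const GridPt &a, const GridPt &b, const GridPt &c) {
    return (b.x - a.x) * (c.z - a.z) - (b.z - a.z) * (c.x - a.x);
}

// Count the strip triangles that cover a non-zero area.
int countRealTriangles(const std::vector<GridPt> &v) {
    int count = 0;
    for (std::size_t i = 0; i + 2 < v.size(); i++)
        if (signedArea2(v[i], v[i + 1], v[i + 2]) != 0) count++;
    return count;
}
```

For a 3x3 grid this produces 14 vertices and 12 strip triangles, of which 8 cover the 4 tiles and 4 are zero-area seam triangles. Adding 2 vertices per seam keeps the vertex count of each row even, so the winding of the following row is not flipped.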
CreateVB():
class nTerrain {
private:
  ComPtr<ID3D11Buffer> pVB;             //Vertices in GPU memory.
  D3D11_PRIMITIVE_TOPOLOGY vtxTopology; //How vertices are arranged
  UINT cVertex;                         //Vertex count
};
//Copy vertices to GPU as a static vertex buffer.
//Vertices are arranged as a triangle strip (#if 1) or as a
//triangle list (#else: 3 vertices per triangle, 2 triangles per tile).
HRESULT nTerrain::CreateVB(ID3D11DeviceContext *pDC) {
  HRESULT Err= 0;
  std::vector<VertexPositionNormalColor> vecVrt;
  vecVrt.resize(cTile*cTile*2*3);
  cVertex= 0;
#if 1
  vtxTopology= D3D11_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP;
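  for(int x0=0;x0+1<cTile;x0++) {
    //Repeating the first vertex of this row (and the last vertex of
    //the previous row, below) creates "degenerate" triangles with an
    //area of 0. This rasters back to the beginning of the next strip.
    if(x0>0) vecVrt[cVertex++]= getVertex(x0,0);
    vecVrt[cVertex++]= getVertex(x0,0);
    vecVrt[cVertex++]= getVertex(x0+1,0);
    for(int z0=0;z0+1<cTile;z0++) {
      vecVrt[cVertex++]= getVertex(x0,z0+1);
      vecVrt[cVertex++]= getVertex(x0+1,z0+1);
    }
    //Repeat the last vertex of the row unless this is the final row.
    if(x0+2<cTile) vecVrt[cVertex++]= getVertex(x0+1,cTile-1);
  }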
#else
  vtxTopology= D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST;
  for(int x0=0;x0+1<cTile;x0++) {
    for(int z0=0;z0+1<cTile;z0++) {
      //First triangle of the tile
      vecVrt[cVertex++]= getVertex(x0,z0);
      vecVrt[cVertex++]= getVertex(x0+1,z0);
      vecVrt[cVertex++]= getVertex(x0+1,z0+1);
      //Second triangle of the tile
      vecVrt[cVertex++]= getVertex(x0,z0);
      vecVrt[cVertex++]= getVertex(x0+1,z0+1);
      vecVrt[cVertex++]= getVertex(x0,z0+1);
    }
  }
#endif
  //Buffer creation is the same as in the first version.
  return(Err);
}
// Render using GPU vertex buffer.
void nTerrain::RenderVB(ID3D11DeviceContext *pDC, CommonStates *pStates, const XMMATRIX &view, const XMMATRIX &proj) {
  UINT stride=sizeof(VertexPositionNormalColor), offset=0;
  pDC->IASetVertexBuffers(0,1,pVB.GetAddressOf(),&stride,&offset);
  pDC->IASetPrimitiveTopology(vtxTopology);
  pDC->Draw(cVertex,0);
}
My first attempt was not quite right...
Rendering from the GPU vertex buffer makes a dramatic difference in
performance. I can render a terrain of 9 million vertices at 28fps
(running natively on the Intel Iris Pro 580). See Performance. At
9,000,000 vertices each tile is only about a quarter of a pixel
across (window size 800x600).
(C)2018 nlited