dev.nlited.com


Vertex Buffer


2017-12-03 22:04:50 chip Page 2079 📢 PUBLIC

Dec 3 2017

Today's task: save the terrain as a vertex buffer on the GPU, to avoid copying the vertex information every frame.

Create the Vertex Buffer

The terrain is stored in CPU memory as a grid, while the GPU expects everything to be defined as triangles. ("Tessellation" is the process of breaking a solid shape down to its surface triangles.) The simplest approach is to convert each set of 4 vertices into a tile made up of 2 triangles of 3 vertices each. Each vertex is a VertexPositionNormalColor struct. The triangles need to be "wound clockwise" (looking down on the XZ plane).


CreateVB():

    //Copy vertices to GPU as a static vertex buffer.
    //Vertices are arranged as a triangle list (complete triangles)
    //3 vertices per triangle, 2 triangles per tile.
    HRESULT nTerrain::CreateVB(ID3D11DeviceContext *pDC) {
      HRESULT Err= 0;
      std::vector<VertexPositionNormalColor> vecVrt;
      //Sized for cTile*cTile tiles. The loops below fill (cTile-1)*(cTile-1)
      //tiles, so the tail of the vector stays value-initialized.
      vecVrt.resize(cTile*cTile*2*3);
      int nVrt= 0;
      for(int x0=0;x0+1<cTile;x0++) {
        for(int z0=0;z0+1<cTile;z0++) {
          //First triangle of the tile
          vecVrt[nVrt++]= getVertex(x0,z0);
          vecVrt[nVrt++]= getVertex(x0+1,z0);
          vecVrt[nVrt++]= getVertex(x0+1,z0+1);
          //Second triangle of the tile
          vecVrt[nVrt++]= getVertex(x0,z0);
          vecVrt[nVrt++]= getVertex(x0+1,z0+1);
          vecVrt[nVrt++]= getVertex(x0,z0+1);
        }
      }
      //Describe the GPU buffer: size, usage, and how it will be bound.
      D3D11_BUFFER_DESC vbDesc;
      Zero(vbDesc);
      vbDesc.Usage= D3D11_USAGE_DEFAULT;
      vbDesc.ByteWidth= cTile*cTile*2*3*sizeof(VertexPositionNormalColor);
      vbDesc.BindFlags= D3D11_BIND_VERTEX_BUFFER;
      vbDesc.CPUAccessFlags= 0;
      vbDesc.MiscFlags= 0;
      //Point at the source data in CPU memory.
      D3D11_SUBRESOURCE_DATA vbInit;
      Zero(vbInit);
      vbInit.pSysMem= vecVrt.data();
      vbInit.SysMemPitch= 0;
      vbInit.SysMemSlicePitch= 0;
      //GetDevice() adds a reference; the ComPtr releases it on scope exit.
      ComPtr<ID3D11Device> pDevice;
      pDC->GetDevice(pDevice.GetAddressOf());
      DX::ThrowIfFailed(Err= pDevice->CreateBuffer(&vbDesc,&vbInit,&pVB));
      return(Err);
    }

A triangle list is the easiest primitive topology to use because each triangle is fully defined without referencing any other triangle. The downside is a lot of redundant data: with two triangles per tile, each interior grid vertex is stored six times. A triangle strip is more compact, but constructing it is a bit more complicated, so it can wait as an optimization step until the basic vertex buffer is working.
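To see how much duplication the triangle list carries, here is a small sketch that counts how often each grid point is emitted when the tiles are expanded in the same order as CreateVB(). It works on bare (x,z) grid coordinates rather than real vertices, and `CountListCopies` and its parameter `n` (grid points per side) are names I made up for the illustration.

```cpp
#include <cassert>
#include <map>
#include <utility>

// Count how many times each grid point (x,z) is written when an n-by-n grid
// of points is expanded into a triangle list, using the same tile layout as
// CreateVB(). Hypothetical helper, for illustration only.
std::map<std::pair<int,int>, int> CountListCopies(int n) {
    assert(n >= 2);
    std::map<std::pair<int,int>, int> copies;
    for (int x0 = 0; x0 + 1 < n; x0++) {
        for (int z0 = 0; z0 + 1 < n; z0++) {
            // First triangle of the tile
            copies[{x0, z0}]++;
            copies[{x0 + 1, z0}]++;
            copies[{x0 + 1, z0 + 1}]++;
            // Second triangle of the tile
            copies[{x0, z0}]++;
            copies[{x0 + 1, z0 + 1}]++;
            copies[{x0, z0 + 1}]++;
        }
    }
    return copies;
}
```

For a 4x4 grid of points, an interior point such as (1,1) comes out at six copies: two from each of the two tiles that use it as a diagonal corner and one from each of the other two tiles that touch it. Edge and corner points are stored fewer times.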

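The D3D11_BUFFER_DESC describes the buffer to create. ByteWidth is the size of the entire buffer in bytes. D3D11_USAGE_DEFAULT with CPUAccessFlags set to 0 gives a buffer that the GPU can read and write but the CPU cannot map. D3D11_BIND_VERTEX_BUFFER allows the buffer to be bound to the input assembler as a vertex buffer.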

D3D11_SUBRESOURCE_DATA contains information about where to find the source data in CPU memory.

CreateBuffer() both allocates GPU memory for the buffer and copies the source data from the CPU memory described by D3D11_SUBRESOURCE_DATA.

Render

Now I can render the grid directly from GPU memory without copying any vertex data. RenderSys() is the original version that rebuilds the vertex buffer every frame. RenderVB() assumes the vertex buffer is already present in GPU memory and simply tells the GPU to run through the triangle list. Note that Draw() takes the count of vertices, not triangles.

Render():

    void nTerrain::Render(ID3D11DeviceContext *pDC, CommonStates *pStates, const XMMATRIX &view, const XMMATRIX &proj) {
      pDC->OMSetBlendState(pStates->Opaque(),nullptr,0xFFFFFFFF);
      pDC->OMSetDepthStencilState(pStates->DepthDefault(),0);
      pDC->RSSetState(pStates->CullCounterClockwise());
      pDC->IASetInputLayout(d3InputLayout.Get());
      if(pVB.Get()) {
        RenderVB(pDC,pStates,view,proj);
      } else {
        RenderSys(pDC,pStates,view,proj);
      }
    }

    // Render using system memory
    void nTerrain::RenderSys(ID3D11DeviceContext *pDC, CommonStates *pStates, const XMMATRIX &view, const XMMATRIX &proj) {
      batchVertex->Begin();
      TERRAINVTX v0,v1,v2,v3;
      for(int x0=0;x0+1<cTile;x0++) {
        for(int z0=0;z0+1<cTile;z0++) {
          v0= getVertex(x0,z0);
          v1= getVertex(x0+1,z0);
          v2= getVertex(x0+1,z0+1);
          v3= getVertex(x0,z0+1);
          if(doWireFrame) {
            batchVertex->DrawLine(v0,v1);
            batchVertex->DrawLine(v1,v2);
          } else {
            batchVertex->DrawQuad(v0,v1,v2,v3);
          }
        }
      }
      batchVertex->End();
    }

    // Render using GPU vertex buffer.
    void nTerrain::RenderVB(ID3D11DeviceContext *pDC, CommonStates *pStates, const XMMATRIX &view, const XMMATRIX &proj) {
      UINT stride=sizeof(VertexPositionNormalColor), offset=0;
      pDC->IASetVertexBuffers(0,1,pVB.GetAddressOf(),&stride,&offset);
      pDC->IASetPrimitiveTopology(D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST);
      pDC->Draw(cTile*cTile*2*3,0);
    }
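Since Draw() takes a vertex count, it is worth comparing the count passed to it with the number of vertices the fill loops actually write. This is only a sketch under an assumption: the loop bounds (x0+1<cTile) suggest cTile is the number of grid points per side, but the page never says so. The function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// n = grid points per side (what the loops appear to treat cTile as; an assumption).

// Vertices written by the triangle-list loops: 2 triangles of 3 vertices per tile,
// with (n-1)*(n-1) tiles.
std::uint32_t FilledListVertices(std::uint32_t n) {
    assert(n >= 1);
    return (n - 1) * (n - 1) * 2 * 3;
}

// Vertices the buffer is sized for and Draw() is asked to process.
std::uint32_t DrawnListVertices(std::uint32_t n) {
    return n * n * 2 * 3;
}
```

Under that assumption the two counts differ (54 written versus 96 drawn for n=4). The extra entries are value-initialized, so they should collapse to zero-area triangles rather than visible geometry, but passing the written count to Draw() would avoid processing them at all.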

[Image: Direct3D simple grid]

Triangle Strips

Triangle strips are a more compact way to store the vertices for sets of connected triangles. I suspect there is no rendering benefit to triangle strips over triangle lists; the GPU draws the same polygons either way. Triangle strips just remove redundant data from the model, reducing the memory requirements for very large and complex shapes.
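A rough way to put numbers on the memory difference is to count vertices for both layouts. The sketch below assumes one strip per column of tiles with two vertices per grid row and no extra stitching vertices, and it uses a stand-in struct in place of VertexPositionNormalColor (I am assuming a float3 position, float3 normal and float4 color, 40 bytes in total). All names here are hypothetical.

```cpp
#include <cassert>
#include <cstddef>

// Stand-in with the layout assumed for VertexPositionNormalColor:
// float3 position, float3 normal, float4 color.
struct VertexStandIn {
    float position[3];
    float normal[3];
    float color[4];
};

// n = grid points per side (an assumption about what cTile means).

// Triangle list: 6 vertices for each of the (n-1)*(n-1) tiles.
std::size_t ListVertices(std::size_t n) {
    assert(n >= 2);
    return (n - 1) * (n - 1) * 6;
}

// Triangle strip: (n-1) columns of tiles, each column storing 2 vertices
// for each of the n grid rows.
std::size_t StripVertices(std::size_t n) {
    assert(n >= 2);
    return (n - 1) * n * 2;
}

// Size in bytes of a vertex buffer holding vertexCount vertices.
std::size_t BufferBytes(std::size_t vertexCount) {
    return vertexCount * sizeof(VertexStandIn);
}
```

For large grids the strip needs roughly a third of the vertices of the list (2 per tile instead of 6), so the buffer shrinks by about the same factor.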

The triangle strip can "jump" from one vertex to another by defining a "degenerate" triangle with an area of zero. A triangle is degenerate when two of its three vertices coincide (or all three lie on one line), so it produces no pixels. In this case, I need to jump from the end of each Z row to the beginning of the next.
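One common way to make that jump is to repeat the last vertex of one strip and the first vertex of the next, so every triangle that spans the gap has a duplicated vertex. The sketch below is an illustration on bare grid coordinates, not the terrain class itself: `GridPt`, `BuildStitchedStrip` and `CountDrawnTriangles` are hypothetical names, and `n` is the number of grid points per side.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Grid coordinate of a terrain vertex (stand-in for what getVertex() returns).
struct GridPt {
    int x, z;
    bool operator==(const GridPt &o) const { return x == o.x && z == o.z; }
};

// Build one triangle strip covering an n-by-n grid of points.
// Consecutive columns are joined by repeating the last vertex of the
// previous column and the first vertex of the new column.
std::vector<GridPt> BuildStitchedStrip(int n) {
    assert(n >= 2);
    std::vector<GridPt> strip;
    for (int x0 = 0; x0 + 1 < n; x0++) {
        if (x0 > 0) {
            GridPt last = strip.back();
            strip.push_back(last);                 // repeat end of previous column
            strip.push_back(GridPt{x0, 0});        // repeat start of this column
        }
        for (int z = 0; z < n; z++) {
            strip.push_back(GridPt{x0, z});
            strip.push_back(GridPt{x0 + 1, z});
        }
    }
    return strip;
}

// A strip triangle is degenerate if any two of its three vertices coincide.
bool IsDegenerate(const GridPt &a, const GridPt &b, const GridPt &c) {
    return a == b || b == c || a == c;
}

// Count the strip triangles that have three distinct vertices.
std::size_t CountDrawnTriangles(const std::vector<GridPt> &strip) {
    std::size_t drawn = 0;
    for (std::size_t i = 0; i + 2 < strip.size(); i++) {
        if (!IsDegenerate(strip[i], strip[i + 1], strip[i + 2])) drawn++;
    }
    return drawn;
}
```

Each join adds two vertices and four degenerate triangles. Because the number of added vertices is even, the alternating winding of the strip is the same in every column.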

[Image: DirectX triangle strip]


CreateVB():

    class nTerrain {
    private:
      ComPtr<ID3D11Buffer> pVB;             //Vertices in GPU memory.
      D3D11_PRIMITIVE_TOPOLOGY vtxTopology; //How vertices are arranged
      UINT cVertex;                         //Vertex count
    };

    //Copy vertices to GPU as a static vertex buffer.
    //Vertices are arranged as a triangle strip (#if 1) or as a triangle
    //list (#else: 3 vertices per triangle, 2 triangles per tile).
    HRESULT nTerrain::CreateVB(ID3D11DeviceContext *pDC) {
      HRESULT Err= 0;
      std::vector<VertexPositionNormalColor> vecVrt;
      vecVrt.resize(cTile*cTile*2*3);
      cVertex= 0;
    #if 1
      vtxTopology= D3D11_PRIMITIVE_TOPOLOGY_TRIANGLESTRIP;
      for(int x0=0;x0+1<cTile;x0++) {
        //Create a "degenerate" triangle with an area of 0.
        //This rasters back to the beginning of the next strip.
        //Note: no vertex is repeated here, so as written the two triangles
        //that join consecutive strips are not zero-area.
        vecVrt[cVertex++]= getVertex(x0,0);
        vecVrt[cVertex++]= getVertex(x0+1,0);
        for(int z0=0;z0+1<cTile;z0++) {
          vecVrt[cVertex++]= getVertex(x0,z0+1);
          vecVrt[cVertex++]= getVertex(x0+1,z0+1);
        }
      }
    #else
      vtxTopology= D3D11_PRIMITIVE_TOPOLOGY_TRIANGLELIST;
      for(int x0=0;x0+1<cTile;x0++) {
        for(int z0=0;z0+1<cTile;z0++) {
          //First triangle of the tile
          vecVrt[cVertex++]= getVertex(x0,z0);
          vecVrt[cVertex++]= getVertex(x0+1,z0);
          vecVrt[cVertex++]= getVertex(x0+1,z0+1);
          //Second triangle of the tile
          vecVrt[cVertex++]= getVertex(x0,z0);
          vecVrt[cVertex++]= getVertex(x0+1,z0+1);
          vecVrt[cVertex++]= getVertex(x0,z0+1);
        }
      }
    #endif
      //(The rest of CreateVB() is not shown in this listing.)
    }

    // Render using GPU vertex buffer.
    void nTerrain::RenderVB(ID3D11DeviceContext *pDC, CommonStates *pStates, const XMMATRIX &view, const XMMATRIX &proj) {
      UINT stride=sizeof(VertexPositionNormalColor), offset=0;
      pDC->IASetVertexBuffers(0,1,pVB.GetAddressOf(),&stride,&offset);
      pDC->IASetPrimitiveTopology(vtxTopology);
      pDC->Draw(cVertex,0);
    }

My first attempt was not quite right...

[Image: Direct3D simple grid]

This makes a dramatic difference in performance. I can render a terrain of 9 million vertices at 28fps (running natively on the Intel Iris Pro 580). See Performance. At 9,000,000 vertices, each tile is only a quarter of a pixel (window size 800x600).



(C)2018 nlited