dev.nlited.com


Direct2D SwapChains


2016-01-24 19:29:16 chip Page 1538 📢 PUBLIC

Today's task is to use swap chains to avoid the expensive BitBlt.

I am using this MSDN article as a guide.

The original code retains these objects:
ComPtr<ID2D1DeviceContext> m_target;
ComPtr<IDXGISwapChain1> m_swapChain;
ComPtr<ID2D1Factory1> m_factory;

I have renamed these to:
m_target : pD2DC
m_swapChain : pSwapChain
m_factory : pD2DFactory

I originally dropped the ComPtr<> wrappers, but then decided they do provide some value by automatically calling Release() when they drop out of scope.
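The scope-exit behavior can be shown with a minimal, portable sketch. This is not the real ComPtr<> (which lives in the Windows-only WRL headers); FakeUnknown, ScopedRelease and RefsAfterScopeExit are hypothetical names invented for illustration, and the wrapper here adopts an existing reference rather than adding its own.

```cpp
#include <cassert>

// Hypothetical stand-in for a COM object: it only counts references.
struct FakeUnknown {
    int refCount = 1;   // a freshly created object starts with one reference
    unsigned long AddRef()  { return static_cast<unsigned long>(++refCount); }
    unsigned long Release() {
        assert(refCount > 0);   // releasing more often than acquiring is a bug
        return static_cast<unsigned long>(--refCount);
    }
};

// Hypothetical wrapper: adopts one reference and gives it back in the
// destructor, which is the part of ComPtr<> the text is talking about.
template <typename T>
class ScopedRelease {
public:
    explicit ScopedRelease(T *p = nullptr) : ptr_(p) {}
    ~ScopedRelease() { if (ptr_) ptr_->Release(); }
    ScopedRelease(const ScopedRelease &) = delete;
    ScopedRelease &operator=(const ScopedRelease &) = delete;
    T *operator->() const { return ptr_; }
    T *Get() const { return ptr_; }
private:
    T *ptr_;
};

// Returns the reference count left after the wrapper has gone out of scope.
inline int RefsAfterScopeExit() {
    FakeUnknown obj;                          // refCount == 1
    {
        ScopedRelease<FakeUnknown> holder(&obj);
        holder->AddRef();                     // hand out an extra reference: 2
        holder->Release();                    // and give it back: 1
    }                                         // destructor calls Release(): 0
    return obj.refCount;
}
```

Calling RefsAfterScopeExit() leaves no outstanding references, without an explicit Release() at the end of the block.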

The switch to SwapChains requires pervasive changes to the original code, which used an "offscreen image" and painted to the screen. The offscreen code would derive a compatible DC (hdcImg) and bitmap (hbmImg) from the onscreen DC, draw everything using hdcImg, then BitBlt from hdcImg to hdcDst during the WM_PAINT message. The draw phase was quick; the expensive operation was copying the pixels from hbmImg to the onscreen device. I am not sure why this was so slow. It seems the system should be smart enough to do this entirely inside the GPU, but the fact that the BitBlt would take 14ms tells me it was using the CPU to copy the pixels. Using D2D and SwapChains should keep everything inside the GPU, and the final "paint" operation should involve nothing more than changing a GPU pointer -- nearly instantaneous.

The flow of the rendering code using hdcImg was:
CreateCompatibleDC > CreateCompatibleBitmap
Update Data
Redraw hdcImg
Invalidate hWnd
WM_PAINT
BitBlt(Pnt.hdc,hdcImg)

The flow using SwapChains will be:
Create ID2D1DeviceContext
Create SwapChain (with 2 buffers)
Run {
  Update Data
  BeginDraw
  Redraw
  EndDraw
  Present (Swap buffers)
}

The WM_PAINT message is no longer used. The display is updated whenever there is a data change, which triggers a redraw to the back buffer. The back buffer is presented as soon as the redraw completes.
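The idea behind the two-buffer flow can be sketched as a CPU-side model. This is not DXGI code: TwoBufferChain and RunFrames are hypothetical names, and the "pixels" are plain integers. The only point is that Present() changes which buffer is visible without copying any pixels.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical CPU-side model of a swap chain with two buffers.
struct TwoBufferChain {
    std::array<std::vector<std::uint32_t>, 2> buffers;
    int front = 0;   // index of the buffer currently "on screen"

    explicit TwoBufferChain(std::size_t pixelCount) {
        buffers[0].assign(pixelCount, 0);
        buffers[1].assign(pixelCount, 0);
    }
    std::vector<std::uint32_t> &Back() { return buffers[1 - front]; }
    const std::vector<std::uint32_t> &Front() const { return buffers[front]; }

    // Presenting flips an index; no pixel data is copied.
    void Present() { front = 1 - front; }
};

// Run { Update Data, Redraw, Present } for frameCount frames.
// The "redraw" fills the back buffer with the frame number.
inline void RunFrames(TwoBufferChain &chain, int frameCount) {
    for (int frame = 1; frame <= frameCount; ++frame) {
        for (auto &px : chain.Back())
            px = static_cast<std::uint32_t>(frame);   // draw to the hidden buffer
        chain.Present();                              // make it the visible one
    }
}
```

After three frames on a small chain, the visible buffer holds frame 3 and the hidden buffer still holds frame 2, ready to be overwritten by the next redraw.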

The Rocket project now compiles and builds using the SwapChains code.

The call to pDxFactory->CreateSwapChainForHwnd() fails with the exquisitely unhelpful error "bad parameter".

I tried to install the Direct2D Debug Layer, but the download link on the Microsoft site is broken. Fortunately, someone was kind enough to provide direct links.

I was not able to link to the debug dll, and it was only for Direct2D -- the error is in a call to the Direct3D library.

Found it. I needed to clear the SwapProp struct first.
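A portable sketch of why clearing the struct matters. SwapProps below is a simplified stand-in, not the real DXGI_SWAP_CHAIN_DESC1, and LooksValid is a hypothetical validator standing in for whatever checks the API performs. Leftover stack contents are simulated with memset so the behavior is deterministic.

```cpp
#include <cstring>

// Simplified stand-in for a swap chain description (NOT the real DXGI struct).
struct SwapProps {
    unsigned Width, Height;   // 0 is assumed to mean "use the window size"
    unsigned Format;
    unsigned SampleCount;
    unsigned BufferCount;
    unsigned Scaling;
    unsigned Flags;
};

// Hypothetical validator: rejects values that make no sense.
inline bool LooksValid(const SwapProps &p) {
    return p.SampleCount == 1 && p.BufferCount == 2 &&
           p.Scaling <= 2 && p.Flags == 0;
}

// Set only the members the caller cares about, as the original code did.
inline void FillRequired(SwapProps &p) {
    p.Format = 1;        // placeholder format code (assumption)
    p.SampleCount = 1;
    p.BufferCount = 2;
}

// Cleared first: every member not set explicitly is zero.
inline bool ClearedThenFilled() {
    SwapProps p = {};
    FillRequired(p);
    return LooksValid(p);
}

// Not cleared: the untouched members keep whatever was in memory.
inline bool GarbageThenFilled() {
    SwapProps p;
    std::memset(&p, 0xCD, sizeof p);   // simulate leftover stack contents
    FillRequired(p);
    return LooksValid(p);
}
```

In this model ClearedThenFilled() passes validation and GarbageThenFilled() does not, because Scaling and Flags are left holding garbage. The same `= {}` initialization works on the real description struct.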

All the initialization seemed to succeed, but nothing is visible.

Closer... The screen is now being painted once, but the WM_TIMER message doesn't seem to fire. This turned out to be an optimizer obfuscation.

OK. I now have the screen updating. The Draw times are a bit disappointing, about 25ms. This is about the same time required to perform the BitBlt, so it seems I took a long trip to nowhere. I need to do a bit more evaluation to be sure.

It appears the SwapChains method is achieving only 4fps with 101 sprites, or 6fps when running natively on Pogo. Very disappointing.

My original WM_PAINT method also runs at 4fps, with much less, and much simpler, code.

So the question is, how does Doom3 achieve 60fps while painting a whole lot more pixels?

I think my SwapChains code is not actually using the GPU. Yes it is, it just isn't any faster than my Paint approach. Or, conversely, my Paint method is just as fast as SwapChains without a lot of the complexity.

OK, this was a display error. The WM_PAINT code is actually achieving 40.8fps (60fps on Pogo) and the SwapChains code is running at 41fps (60fps on Pogo). This is better, but there is still not enough of a difference between the two to justify all the trouble required to use SwapChains.

I created a "tight loop" version that spawns a thread that calls GameUpdate() and ImgPaint() in a tight loop. It maxes out at 60fps on Pogo.

The upper limit seems to be 60fps. I see 60fps whether there are 420 sprites or 1.

There is a definite performance penalty to running in the VM, where the max frame rate is around 40fps.

If I turn off the "SyncInterval" (passing 0 for that parameter) in the call to pSwapChain->Present() I can achieve an FPS as high as 120 in the VM, although it fluctuates quite a bit. The Paint method seems to be capped at 60fps; I am assuming there is an implicit SyncInterval wait.
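The 60fps ceiling can be modelled with a rough sketch. This is an idealized model, not a measurement: ModelFps is a hypothetical helper that assumes a synced frame always waits for the next refresh boundary and that an unsynced frame is limited only by its draw time.

```cpp
#include <cassert>
#include <cmath>

// Idealized frame rate for a given draw time.
//   syncInterval == 0 : present immediately, rate limited only by drawMs
//   syncInterval >= 1 : each frame occupies a whole number of refresh
//                       periods, and at least syncInterval of them
inline double ModelFps(double drawMs, unsigned syncInterval, double refreshHz) {
    assert(drawMs > 0.0 && refreshHz > 0.0);
    if (syncInterval == 0)
        return 1000.0 / drawMs;
    const double refreshMs = 1000.0 / refreshHz;
    double periods = std::ceil(drawMs / refreshMs);   // round up to a refresh
    if (periods < syncInterval)
        periods = syncInterval;
    return 1000.0 / (periods * refreshMs);
}
```

Under these assumptions a 1ms draw on a 60Hz display gives 60fps with sync on and 1000fps with sync off, and any draw time under one refresh period gives the same 60fps. That would explain seeing 60fps with 1 sprite or 420. A 20ms draw misses one refresh and drops to 30fps in this model.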

Running natively on Pogo, the SwapNoSync version hits a staggering 1800fps with a sustained 1300fps! This is with 100 sprites; the FPS drops in direct relation to the number of sprites. With 300 sprites I see 700fps.

Conversely, this should mean I can have many more sprites if I turn on the SyncInterval. I can run 1000 sprites at a steady 60fps, and the action is much smoother.

The Paint version is also able to hit 60fps with 1000 sprites, and 58fps with 2800 sprites! 3200 sprites run at 59fps at full screen! 5000 sprites at full screen run at 40fps.


I now have 2 versions of Rocket: Paint and SwapChains. The Paint method creates an offscreen bitmap, draws on it, then uses BitBlt to present it in the window. SwapChains uses two buffers, one visible and the other hidden, draws to the hidden ("back") buffer, then swaps the buffers for each frame. With SyncInterval enabled, the performance of the two versions is identical. I prefer the Paint method because the code is simpler, it does not rely on Direct3D, and it is closer to the traditional GDI approach. SwapChains is better only when frames per second is much more important than being smooth -- which is never. The only other reason to use SwapChains is to have access to some of the Direct3D interfaces, such as the antialiasing functions.

This is the API for MyD2D, SwapChains method. The most significant difference is that the API uses an ID2D1DeviceContext to draw everything.


MyD2D.h:
/*************************************************************************/
/** MyD2D.h: Helper functions to simplify the use of Direct2D.          **/
/** (C)2013 nlited systems, Chip Doran                                  **/
/*************************************************************************/
#pragma once
#define __MYD2D_H__ 0x0101
#include
#include
#include

typedef Handle_s *HD2D;
typedef Handle_s *HD2DBITMAP;

EXTERNC int MyD2DCreate(HD2D *phD2D, HINSTANCE hModule);
EXTERNC int MyD2DDestroy(HD2D hD2D);
EXTERNC int MyD2DCreateResourcesWnd(HD2D hD2D, HWND hWnd);
EXTERNC int MyD2DReleaseResources(HD2D hD2D, HWND hWnd);
EXTERNC int MyD2DResize(HD2D hD2D, const RECT *pR);
#ifdef _D2D1_H_
EXTERNC int MyD2DBegin(HD2D hD2D, ID2D1DeviceContext **ppDC);
EXTERNC ID2D1DeviceContext *MyD2DGetDC(HD2D hD2D);
EXTERNC int MyD2DEnd(HD2D hD2D, HRESULT *pWinErr);
EXTERNC IDWriteTextFormat *MyD2DTextFormat(HD2D hD2D);
struct Bitmap_s {
  UINT BitmapID;          //Sprite index
  const WCHAR *Type;      //Resource type
  const WCHAR *Name;      //Resource name
  UINT FrameCt;           //Number of animation frames (arranged across)
  POINT AnchorPt;         //Anchor point
  D2D1_SIZE_F SizeDPI;    //Dimensions of bitmap
  ID2D1Bitmap *pBitmap;   //pRenderTgt interface
};
EXTERNC int MyD2DBitmapCreate(HD2D hD2D, struct Bitmap_s *pBitmap);
#endif

Sprite.h:
/*************************************************************************/
/** Sprite.h: Defines all the info needed to draw a game item.          **/
/** (C)2013 nlited systems, Chip Doran                                  **/
/*************************************************************************/
#pragma once
#define __SPRITE_H__ 0x0201
#include

struct Sprite_s {
  UINT SpriteID;
  UINT BitmapID;
  struct Bitmap_s *pBitmap;
  UINT Size;
  D2D_POINT_2F Pos;       //Current position
  D2D_POINT_2F PosVel;    //Positional velocity
  D2D_POINT_2F PosPrev;   //Previous position
  FLOAT Rotation;         //Current rotation
  FLOAT RotationVel;      //Rotational velocity
  FLOAT RotationPrev;     //Previous rotation
  D2D_COLOR_F Color;
  UINT FrameID;
  UINT FrameIDPrev;       //Previous FrameID
};

enum SPRITE_IDS {
  SPRITE_NONE,
  SPRITE_BOX,
  SPRITE_BITMAP,
  SPRITE_BULLET,
  SPRITE_END
};

EXTERNC int SpriteDraw(HD2D hD2D, HWND hWnd, const struct Sprite_s *pSprite);
EXTERNC int SpriteDrawImg(ID2D1DeviceContext *pDC,const struct Sprite_s *pSprite);

The code behind MyD2D is here: MyD2D: SwapChains Method


