A little preface: lately I’ve been quite busy with exams (I have another one on the 24th of June), so I haven’t had enough time to keep working on the game engine and keep this blog updated with new material at the same time.
This post is about one of the areas I’ve worked on during this little “break”: profiling your game engine. (I’ve also been working on recreating the Visual Novel Reader sub-engine, but that’s a story for another time.)
Before starting, let’s define what profiling is. Profiling is the process of extracting information about something. In our case, the information we extract is how fast our routines execute. There are several ways to profile an application:
1) Use a profiler application like YourKit .NET Profiler.
2) Instrument your own code with a hand-written profiler.
3) Use a custom-built performance test framework.
The first and most common way is to use a profiler application, which records the time spent in, and the number of invocations of, every method in your whole application. This is the first and most important way to profile the performance of your application (CPU-wise). If you are using C#, remember to profile with the executable launched from outside the development environment (that is, without the debugger attached); otherwise you will miss some JIT compiler optimizations (such as property inlining) and get a misleading picture of what is slowing down your application.
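As a small safeguard, a timing run can check its own environment before trusting any numbers. The sketch below is only an illustration: `ProfilingGuard` and `GetWarning` are hypothetical names, and it assumes the standard `System.Diagnostics.Debugger.IsAttached` property and the usual `DEBUG` compilation symbol.

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: tells you when timings are likely to be misleading.
public static class ProfilingGuard
{
    // Returns a warning message, or null when the environment looks fine.
    public static string GetWarning(bool debuggerAttached, bool debugBuild)
    {
        if (debuggerAttached)
            return "A debugger is attached: JIT optimizations may be suppressed.";
        if (debugBuild)
            return "This is a debug build: the code is compiled without optimizations.";
        return null;
    }

    // Convenience overload that inspects the current process.
    public static string GetWarning()
    {
        bool debugBuild = false;
#if DEBUG
        debugBuild = true;
#endif
        return GetWarning(Debugger.IsAttached, debugBuild);
    }
}
```

Calling `ProfilingGuard.GetWarning()` at the start of a measurement and printing the result (if any) is enough to catch a run accidentally started from inside the IDE.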
The second way (one I personally discourage) is to instrument your own code with checks that record times and numbers of invocations. It is not flexible, and it will make your code quite complex.
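To give an idea of what this kind of instrumentation looks like, here is a minimal sketch. `ManualProfiler` and its members are hypothetical names invented for this illustration, not part of the engine; the only real API used is `System.Diagnostics.Stopwatch`.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;

// Hypothetical hand-written profiler: accumulates time and call counts per label.
public static class ManualProfiler
{
    private static readonly Dictionary<string, long> Ticks = new Dictionary<string, long>();
    private static readonly Dictionary<string, int> Calls = new Dictionary<string, int>();

    // Runs 'body' and records how long it took under the given label.
    public static void Record(string name, Action body)
    {
        Stopwatch watch = Stopwatch.StartNew();
        try
        {
            body();
        }
        finally
        {
            watch.Stop();
            long ticks;
            Ticks.TryGetValue(name, out ticks);   // leaves 0 when the label is new
            Ticks[name] = ticks + watch.ElapsedTicks;
            int calls;
            Calls.TryGetValue(name, out calls);
            Calls[name] = calls + 1;
        }
    }

    // Number of times the label has been recorded (0 if never).
    public static int CallCount(string name)
    {
        int calls;
        Calls.TryGetValue(name, out calls);
        return calls;
    }

    // Total time recorded for the label, converted from Stopwatch ticks.
    public static double TotalMilliseconds(string name)
    {
        long ticks;
        Ticks.TryGetValue(name, out ticks);
        return ticks * 1000.0 / Stopwatch.Frequency;
    }
}
```

Every routine you want to measure has to be wrapped in a `ManualProfiler.Record("SomeLabel", ...)` call, which is exactly why this approach clutters the code so quickly.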
The third way is to use a custom-built test framework to run small, self-contained pieces of code and compare their speed. This is what I’ve been preparing over the past few days.
I built this little framework because I had to compare several pieces of code and tell which one is faster, and a profiler application was not a very adequate tool for that job: these routines are VERY fast and rarely called.
The idea was to have a series of classes, each one focused on testing one piece of functionality. The classes created this way are then handed to a batch management class that handles the batch execution.
After some hours of work, here’s the result:
The framework is divided into two major areas: the ITest/Test area and the TestBatcher area.
The first area defines the base classes all tests must derive from.
The Test class is composed of:
1) Filename – The name of the file where the results of the measurements are saved.
2) NumeroEsecuzioni – The number of times each test code inside the Test class is executed.
3) Results – Contains the results of a test run, in milliseconds.
4) TestCodes – A list of delegates holding the pieces of code to be tested against each other.
5) Destroy() – Destroys every resource initialized and used by the test.
6) ExecuteTests() – Executes every test code NumeroEsecuzioni times and saves the best result obtained for each one of them.
7) Initialize() – Initializes all the resources needed by the test at runtime.
8) WriteResultToFile() – Saves the results to a text file.
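The members listed above could be sketched roughly as follows. This is not the actual framework source, only a minimal reconstruction from the list: the member names come from the post, while the property types, the use of `Stopwatch`, the reading of “best” as the lowest time, and the output file format are all assumptions.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.IO;

// Sketch of the Test base class (assumed shape, see the note above).
public class Test
{
    public string Filename { get; set; }
    public int NumeroEsecuzioni { get; set; }
    public List<double> Results { get; private set; }
    public List<Action> TestCodes { get; private set; }

    public Test()
    {
        NumeroEsecuzioni = 1;
        Results = new List<double>();
        TestCodes = new List<Action>();
    }

    // Derived tests allocate their resources and fill TestCodes here.
    public virtual void Initialize() { }

    // Derived tests release their resources here.
    public virtual void Destroy() { }

    // Runs every test code NumeroEsecuzioni times and keeps the best
    // (lowest) time in milliseconds, one entry per test code.
    public virtual void ExecuteTests()
    {
        Results.Clear();
        foreach (Action code in TestCodes)
        {
            double best = double.MaxValue;
            for (int i = 0; i < NumeroEsecuzioni; i++)
            {
                Stopwatch watch = Stopwatch.StartNew();
                code();
                watch.Stop();
                best = Math.Min(best, watch.Elapsed.TotalMilliseconds);
            }
            Results.Add(best);
        }
    }

    // Writes one line per test code (the line format is an assumption).
    public virtual void WriteResultToFile()
    {
        using (StreamWriter writer = new StreamWriter(Filename))
        {
            for (int i = 0; i < Results.Count; i++)
                writer.WriteLine("Test code {0}: {1} milliseconds", i, Results[i]);
        }
    }
}
```

Keeping the best time instead of the average makes the measurement less sensitive to one-off hiccups such as JIT compilation on the first call or a garbage collection in the middle of a run.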
The second area defines the TestBatcher class, which allows several Test classes to be executed as a batch.
It is composed of:
1) AddTest(Test tst) – Adds a new test to the batch system.
2) ExecuteBatch() – Executes all the tests queued in the system.
3) Initialize() – Initializes the batch system.
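A batcher with these three methods could look like the sketch below. Again, this is an assumed reconstruction rather than the real source: the order in which each test’s methods are called is my guess, the `Test` class here is only a minimal stand-in so the example compiles on its own, and `LoggingTest` is a hypothetical test used to show the call order.

```csharp
using System;
using System.Collections.Generic;

// Minimal stand-in for the Test base class (only what the batcher needs).
public class Test
{
    public virtual void Initialize() { }
    public virtual void ExecuteTests() { }
    public virtual void WriteResultToFile() { }
    public virtual void Destroy() { }
}

// Sketch of the batch manager (assumed behavior, see the note above).
public class TestBatcher
{
    private readonly List<Test> tests = new List<Test>();

    // Resets the queue so the batcher can be reused.
    public void Initialize()
    {
        tests.Clear();
    }

    public void AddTest(Test tst)
    {
        if (tst == null) throw new ArgumentNullException("tst");
        tests.Add(tst);
    }

    // Runs the full life cycle of every queued test, one after the other.
    public void ExecuteBatch()
    {
        foreach (Test tst in tests)
        {
            tst.Initialize();
            try
            {
                tst.ExecuteTests();
                tst.WriteResultToFile();
            }
            finally
            {
                tst.Destroy();   // release resources even if a test throws
            }
        }
    }
}

// Hypothetical test that just records which methods were called.
public class LoggingTest : Test
{
    public readonly List<string> Log = new List<string>();
    public override void Initialize() { Log.Add("Initialize"); }
    public override void ExecuteTests() { Log.Add("ExecuteTests"); }
    public override void WriteResultToFile() { Log.Add("WriteResultToFile"); }
    public override void Destroy() { Log.Add("Destroy"); }
}
```

Usage is then a matter of calling `Initialize()`, adding each test with `AddTest(...)`, and calling `ExecuteBatch()` once.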
With this simple framework you can measure how long an operation takes with each candidate method and then pick the fastest one.
This is an example test class I used to find out which is faster, Map() or UpdateSubresource(), when updating a resource buffer in DirectX 10 with SlimDX:
/*
 * Copyright (c) 2009 Ferreri Alessio
 *
 * Permission is hereby granted, free of charge, to any person obtaining a copy
 * of this software and associated documentation files (the "Software"), to deal
 * in the Software without restriction, including without limitation the rights
 * to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
 * copies of the Software, and to permit persons to whom the Software is
 * furnished to do so, subject to the following conditions:
 *
 * The above copyright notice and this permission notice shall be included in
 * all copies or substantial portions of the Software.
 *
 * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
 * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
 * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
 * AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
 * LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
 * OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
 * THE SOFTWARE.
 */
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using SlimDX;
using System.Drawing;
#if DX9
using SlimDX.Direct3D9;
#else
using SlimDX.Direct3D10;
using GameLibrary;
#endif

namespace TestFramework
{
    public class TestUpdateSubresourceVsMap : Test
    {
#if DX10
        // One buffer per update strategy, each created with the usage flags that strategy needs.
        private SlimDX.Direct3D10.Buffer BufferToUpdateMap;
        private SlimDX.Direct3D10.Buffer BufferToUpdateUpdateSubresource;
        private Matrix WorldMatrix;
        private Single[] FinalMatrix;
        private byte[] FinalArray;
        private Device device;

        public override void Initialize()
        {
            base.Initialize();
            device = Program.gc.Manager.Device;

            // Buffer is fully qualified to avoid ambiguity with System.Buffer.
            // Dynamic + CPU write access: required for Map(WriteDiscard).
            BufferToUpdateMap = new SlimDX.Direct3D10.Buffer(Program.gc.Manager.Device, 64,
                ResourceUsage.Dynamic, BindFlags.VertexBuffer,
                CpuAccessFlags.Write, ResourceOptionFlags.None);
            // Default usage, no CPU access: updated through UpdateSubresource().
            BufferToUpdateUpdateSubresource = new SlimDX.Direct3D10.Buffer(Program.gc.Manager.Device, 64,
                ResourceUsage.Default, BindFlags.VertexBuffer,
                CpuAccessFlags.None, ResourceOptionFlags.None);

            // Store the world matrix transposed in a flat array of 16 floats (64 bytes).
            WorldMatrix = Matrix.Identity;
            FinalMatrix = new Single[16];
            FinalMatrix[0] = WorldMatrix.M11;
            FinalMatrix[1] = WorldMatrix.M21;
            FinalMatrix[2] = WorldMatrix.M31;
            FinalMatrix[3] = WorldMatrix.M41;
            FinalMatrix[4] = WorldMatrix.M12;
            FinalMatrix[5] = WorldMatrix.M22;
            FinalMatrix[6] = WorldMatrix.M32;
            FinalMatrix[7] = WorldMatrix.M42;
            FinalMatrix[8] = WorldMatrix.M13;
            FinalMatrix[9] = WorldMatrix.M23;
            FinalMatrix[10] = WorldMatrix.M33;
            FinalMatrix[11] = WorldMatrix.M43;
            FinalMatrix[12] = WorldMatrix.M14;
            FinalMatrix[13] = WorldMatrix.M24;
            FinalMatrix[14] = WorldMatrix.M34;
            FinalMatrix[15] = WorldMatrix.M44;
            FinalArray = new byte[64];

            // Test code 0: update the buffer with Map()/Unmap().
            TestCodes.Add(new Action(delegate
            {
                DataStream ds = BufferToUpdateMap.Map(MapMode.WriteDiscard, MapFlags.None);
                ds.WriteRange(FinalMatrix, 0, 16);
                BufferToUpdateMap.Unmap();
                ds.Dispose();
            }));

            // Test code 1: update the buffer with UpdateSubresource().
            TestCodes.Add(new Action(delegate
            {
                unsafe
                {
                    ByteConverter.WriteSingleArrayToByte(ref FinalMatrix, ref FinalArray, 0);
                    fixed (byte* arr = FinalArray)
                    {
                        device.UpdateSubresource(arr, 64, 64, BufferToUpdateUpdateSubresource, 0);
                    }
                }
            }));

            NumeroEsecuzioni = 1000;
            Filename = AppDomain.CurrentDomain.BaseDirectory + "ResultsUpdateSubresourceMap.txt";
        }

        public override void Destroy()
        {
            base.Destroy();
            BufferToUpdateMap.Dispose();
            BufferToUpdateUpdateSubresource.Dispose();
        }
#endif
    }
}
By the way, at least for this small 64-byte buffer, it is faster to use the new UpdateSubresource() I’ve added to SlimDX than to use Map()/Unmap(): it took roughly 15% less time. Here are the results on my notebook:
0.004050794 milliseconds with Map()/Unmap()
0.003422223 milliseconds with UpdateSubresource()
I hope this post has been useful to you.
See ya 😉