<br><div class="gmail_quote">On Mon, Aug 15, 2011 at 1:03 PM, Marco van de Voort <span dir="ltr"><<a href="mailto:marcov@stack.nl">marcov@stack.nl</a>></span> wrote:<br><blockquote class="gmail_quote" style="margin: 0pt 0pt 0pt 0.8ex; border-left: 1px solid rgb(204, 204, 204); padding-left: 1ex;">
<div><div></div><div class="h5">In our previous episode, Max Vlasov said:<br>
> about collecting some statistics.<br>
><br>
> The idea is similar to some disk utilities that collects and sort the sizes<br>
> of directories. You know when every folder on the computer is scanned and<br>
> all the resulting paths are sorted by the summed size. Such utilities<br>
> usually help to find space on the hard drive to free.<br>
><br>
> With memory for every memory allocation we can call the path the addresses<br>
> of procedures in stack. So when a function funci() is called from the<br>
> function parentfunc() and getmem is called inside func() then we should add<br>
> this size to the corresponding entries both for func() and parentfunc().<br>> ...<br>
> Do tools(units) like this already exist? If not, is developing such unit<br>
> technically possible/hard with the currently available debug information?<br>
<br>
</div></div>This is basically what valgrind (or fulldebugmode of fastmm) does. But they<br>
do this by parsing the stack on each call to the memory manager, and then<br>
keep track of it.<br>
<br>
Note that all these techniques can be very, very slowing. E.g. I tried to debug<br>
the CHM support with valgrind, and I terminated the valgrind process after 5<br>
hours because it was not even half way to where the bug was.<br>
<br>
Without valgrind the program reached the point in 1-2 minutes.....<br>
<br></blockquote></div><br>Based on this discussion and some follow-up research, I made an attempt to implement something like this for fpc/lazarus. <br><br>The final result is a couple of units and a dialog that lets you view the results at any time from inside the program. You can see a real example in the screenshot:<br>
<br> <a href="http://www.maxerist.net/downloads/procmemstat_ss.png">http://www.maxerist.net/downloads/procmemstat_ss.png</a><br><br>The download link for pascal sources:<br><br> <a href="http://www.maxerist.net/downloads/procmemstat.zip">http://www.maxerist.net/downloads/procmemstat.zip</a> (5k)<br>
<br>The module installs its own memory manager replacement procs in the usual way and collects statistics. To keep this step fast, it only detects and saves the addresses on the stack that fall into the code segment range of the main module. When the dialog is requested, it parses all the allocated data and resolves the addresses to symbols with the GetLineInfo proc. This step takes much longer: the dialog in the screenshot took 15 seconds to appear (on my 1.7 GHz Celeron). <br>
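<br>The split between the cheap per-allocation step and the expensive on-request step can be sketched roughly like this. This is only a language-neutral illustration in Python, not the code from the units: the address range, the symbol table and the function names are made-up assumptions, and the sorted-table lookup merely stands in for what GetLineInfo does in the real unit.<br>
<pre>
```python
import bisect

# Hypothetical code-segment bounds of the main module (illustrative values).
CODE_START = 0x00401000
CODE_END = 0x00480000

# Hypothetical symbol table: sorted start addresses and matching names.
SYMBOL_STARTS = [0x00401000, 0x00402000, 0x00403000]
SYMBOL_NAMES = ["main", "parentfunc", "func"]


def capture_code_addresses(stack_words):
    """Cheap step, done on every allocation: keep only the stack words
    that fall inside the code range; nothing is resolved yet."""
    return [w for w in stack_words if w in range(CODE_START, CODE_END)]


def resolve(address):
    """Expensive step, done only when a report is requested: map an
    address to the symbol that starts at or before it."""
    index = bisect.bisect_right(SYMBOL_STARTS, address) - 1
    return SYMBOL_NAMES[index]


# One simulated allocation: return addresses mixed with ordinary data.
stack_words = [0x00403010, 0x0012FF70, 0x00402044, 42, 0x00401008]
saved = capture_code_addresses(stack_words)

# Later, when the statistics are requested:
names = [resolve(a) for a in saved]
print(names)
```
</pre>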
<br>Currently the monitor is win32-only, but the only platform-specific code is for getting the current stack range (I took it from this information: <a href="http://en.wikipedia.org/wiki/Win32_Thread_Information_Block">http://en.wikipedia.org/wiki/Win32_Thread_Information_Block</a>) and for getting the code segment range, which is done with toolhelp32 snapshots and VirtualQuery. Finding a way to do the same on Linux would probably make it compatible with Linux as well. <br>
<br><br>Usage:<br>- add uProcMemMon as the first unit in the lpr <br>- define -dPROCMEMMON in Project Options / Other / Custom options<br>- add uProcMemDlg to the unit (form) from which you want to show the statistics.<br>- call ProcMemShowDialog from any click handler. Of course, you can call the dialog as many times as you want to see what's wrong or right in different states of your program<br>
- if you don't want a GUI, don't use uProcMemDlg; just call PMMCollectStat, which returns a TStringList with the sizes stored as its objects (unsorted)<br><br>Current limitations:<br>- As I already said, currently it's win32-only<br>
- I assume it's currently not thread-safe because it uses global variables<br>- The first several lines of the dialog are not very useful, since they are either part of the getmem call chain or main-related procedures, which are always on the stack. <br>
- The collecting speed is OK, but since it uses a linked list, it can drop with many allocations. Resolving to symbols takes some time (GetLineInfo is not very good for thousands of queries), so some further optimization might be possible in these areas.<br>
- No support for dynamically loaded modules. This is because the monitor uses only the address range of the main module. <br>- Initially the top entries showed numbers bigger than the total allocated memory. This was because of exception blocks: from the parser's point of view, the same getmem was called at once from multiple lines of the same function. I fixed this by ignoring duplicates of a function while parsing the same memory block. It seems to have worked, but maybe there are cases where some new trick will be necessary<br>
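<br>To illustrate the last point, here is a small Python sketch of the summing step with and without ignoring duplicates. It is not the code from the units: the block sizes and function names are invented for the example, and the function collect_stats is a hypothetical stand-in for the real parser.<br>
<pre>
```python
from collections import defaultdict


def collect_stats(blocks, dedupe=True):
    """Sum each block's size into every function found on its path."""
    totals = defaultdict(int)
    for size, path in blocks:
        # With dedupe, a function counts once per block even if several
        # of its addresses were found on the stack for that block.
        functions = set(path) if dedupe else path
        for name in functions:
            totals[name] += size
    return dict(totals)


# Hypothetical data: (block size, function names resolved from the stack).
# "parentfunc" appears twice for the first block, as it might when an
# exception block leaves a second address of the same function on the stack.
blocks = [
    (100, ["func", "parentfunc", "parentfunc", "main"]),
    (50, ["parentfunc", "main"]),
]

naive = collect_stats(blocks, dedupe=False)
fixed = collect_stats(blocks, dedupe=True)
total_allocated = sum(size for size, _ in blocks)

# Without dedupe the entry for parentfunc exceeds the total allocated size.
print(naive["parentfunc"], fixed["parentfunc"], total_allocated)
```
</pre>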
<br>If someone finds time to test this approach on real projects, that would be great. As for usefulness, time will tell; for now I'd just like to know that it at least provides sane results :)<br><br>Thanks<br><br>Max Vlasov<br>