CodeBetter.Com

Greg Young [MVP]

July 2007 - Posts

  • Books that influenced me

     

    After seeing Jeremy’s list I figured I would put up the books that have influenced me. This is by no means a complete list, but ...

     

    First programming book I owned: Mastering Turbo Assembler, Tom Swan

    OO

    Gang of Four

    POSA 1-5 (note 4 & 5 are recently out)

    Domain Driven Design, Evans

    P of EAA, Refactoring, Fowler

    OOAD, Booch

    Object Design; Designing Object-Oriented Software, Wirfs-Brock (I agree with Jeremy, these are some of the most underrated books out there)

     

    AI type stuff

    PDP 1, PDP 2, Explorations, McClelland

    Foundations of Genetic Programming, Langdon

    GP 2, GP 3, Koza

     

    Agile Stuff

    The Six Sigma Handbook (what’s the point in becoming agile if you can’t measure the benefit?)

    Extreme Programming Explained, Beck

    Lean Software Development, Poppendieck

    Lean Thinking, Womack (yeah, this is kind of weird here, but read it!)

     

    Thinking

    How To Solve It, Polya

    And Suddenly the Inventor Appeared; Innovation Algorithm, Altshuller

     

     

    Other

    TAOCP belongs on every software engineer’s desk, as does Sedgewick

    Code Complete, McConnell

    Writing Secure Code, Howard

    The Pragmatic Programmer, Hunt

    The Pi-Calculus (you will start hearing about this more and more)

    Purely Functional Data Structures (MUST HAVE)

    The Little Lisper (this book is just beautiful in its simplicity; check out the whole series)

    The Zen of Code Optimization, Abrash

     

     

  • Article on InfoQ

    As some of you know, I have become an editor over at InfoQ in the architecture section. I put up my first article today: an interview with Galen Hunt and Jim Larus about Singularity and their experiences with topics like strong typing, the pi-calculus, and static verification (Spec#) in their efforts.

    http://www.infoq.com/news/2007/07/singularity

     
     



     

  • More on Pinning [Perfmon Counters Broken]

    In my last two posts (the original and its continuation) we spent a bit of time looking at asynchronous socket buffer management. While working on those two posts I came across a few things that appear to be problems in the framework.

     

    I promised that I would be a good boy and blog more, so here we go ... The first problem I ran into is the "# of pinned objects" counter that the CLR exposes.

    Using the code provided in the second post with 500 clients, run the server (release mode). Bring up perfmon and point it at the server for “# of pinned objects”, located under the .NET CLR Memory category. It should show 7 pinned objects.

     

     

     

    Now break the server and load SOS. Run !GCHandles

     

    !GCHandles
    GC Handle Statistics:
    Strong Handles: 28
    Pinned Handles: 7
    Async Pinned Handles: 0
    Ref Count Handles: 0
    Weak Long Handles: 46
    Weak Short Handles: 9
    Other Handles: 0
    Statistics:
          MT    Count    TotalSize Class Name
    790f7698        1           12 System.Object
    79153fd4        1           16 System.Threading.RegisteredWaitHandle
    791540c8        1           20 System.Threading._ThreadPoolWaitOrTimerCallback
    791138d0        1           24 System.Threading.ManualResetEvent
    790f90cc        1           28 System.SharedStatics
    790f8688        1           72 System.ExecutionEngineException
    790f85e4        1           72 System.StackOverflowException
    790f8540        1           72 System.OutOfMemoryException
    790f92b8        1          100 System.AppDomain
    790f8c24        2          112 System.Threading.Thread
    790fa154        5          120 System.Reflection.Assembly
    7a77273c        4          144 System.Net.Logging+NclTraceSource
    790f8790        2          144 System.Threading.ThreadAbortException
    7a772810        4          160 System.Diagnostics.SourceSwitch
    79104da8        4          192 System.Reflection.Module
    790fb43c        7          252 System.Security.PermissionSet
    790ff76c       46         3496 System.RuntimeType+RuntimeTypeCache
    7911eb1c        7        17456 System.Object[]
    Total 90 objects

     

    So far it is correct: the 7 pinned handles match the 7 pinned objects shown in perfmon. Start the client and let all 500 clients connect (it will tell you when they are done). Watch perfmon until you see a pattern like in the picture here (note it goes up and then falls back down).

     

     

     

    When it has come back down to 0 (this will take about 45 seconds), break into SOS again.
     

     

    !GCHandles
    PDB symbol for mscorwks.dll not loaded
    GC Handle Statistics:
    Strong Handles: 27
    Pinned Handles: 7
    Async Pinned Handles: 501
    Ref Count Handles: 0
    Weak Long Handles: 3
    Weak Short Handles: 11
    Other Handles: 0
    Statistics:
          MT    Count    TotalSize Class Name
    790f7698        1           12 System.Object
    790f90cc        1           28 System.SharedStatics
    790f8688        1           72 System.ExecutionEngineException
    790f85e4        1           72 System.StackOverflowException
    790f8540        1           72 System.OutOfMemoryException
    790f92b8        1          100 System.AppDomain
    790fa154        5          120 System.Reflection.Assembly
    7a77273c        4          144 System.Net.Logging+NclTraceSource
    790f8790        2          144 System.Threading.ThreadAbortException
    7a772810        4          160 System.Diagnostics.SourceSwitch
    79104da8        4          192 System.Reflection.Module
    790ff76c        3          228 System.RuntimeType+RuntimeTypeCache
    790fb43c        7          252 System.Security.PermissionSet
    790f8c24        6          336 System.Threading.Thread
    7911eb1c        7        17456 System.Object[]
    79153be0      501        34068 System.Threading.OverlappedData
    Total 549 objects

     

    The perf counter is telling us there are 0 pinned objects, but this does not seem to agree with SOS. This is very disconcerting, since one of the things we should be watching during debugging is our # of pinned objects to help watch out for heap fragmentation. But it gets worse.

     

    This counter only shows us the pinned objects generated through GCHandle, and these are not the only pinned objects. The marshaller also pins objects when it handles calls to unmanaged code (like when we call into, say, the unmanaged socket libraries). In this particular example there are about 3-4 times as many pinned objects as we can see. You will notice that the pins here are on OverlappedData objects; my actual buffers are also pinned, but we can’t see that they are pinned from here. From some research, there is apparently no way to view the objects that the marshaller has pinned, so in order to gauge what’s going on we are limited to looking at the resulting heap fragmentation and trying to minimize it.
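
    To make the mismatch with the counter concrete, the handle summary from the second !GCHandles run can be tallied. This is only a small sketch of the arithmetic, written in Python rather than anything from the server project; parse_handle_summary is a hypothetical helper, and the input is simply the summary text copied from the output above.

```python
import re

# Handle summary copied from the second !GCHandles run in the post.
SUMMARY = """\
Strong Handles: 27
Pinned Handles: 7
Async Pinned Handles: 501
Ref Count Handles: 0
Weak Long Handles: 3
Weak Short Handles: 11
Other Handles: 0
"""


def parse_handle_summary(text):
    """Hypothetical helper: map each handle kind to its count.

    Expects lines shaped like 'Pinned Handles: 7'.
    """
    counts = {}
    for line in text.splitlines():
        match = re.match(r"\s*(.+?) Handles:\s*(\d+)\s*$", line)
        if match:
            counts[match.group(1)] = int(match.group(2))
    return counts


counts = parse_handle_summary(SUMMARY)

# Both kinds of handle keep an object from moving during a GC.
pinning = counts["Pinned"] + counts["Async Pinned"]

print("pinning handles:", pinning)
print("all handles:", sum(counts.values()))
```

    The two pinning kinds add up to 508 handles, and the grand total of 549 matches the "Total 549 objects" line, while perfmon was reporting 0 at the same moment.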

     

    This can be seen by comparing the two versions of the server (with and without buffer management). If you run !GCHandles at the point where I ran the heap commands in the previous post, you will notice that the same number of GCHandles is listed for each version, yet the BufferManager version has far less heap fragmentation. This is because the pins on the actual byte arrays are done by the marshaller; we can't see them, only their effect of causing heap fragmentation. Other things are also pinned in this process, which is why both versions show heap fragmentation (3.5 includes a new version of this which pins far fewer objects; perhaps we should give it a whirl in another post?).

     

     

     

    I have filed these issues on MS Connect; feel free to vote or comment.

     

  • How bad can it be?

    Someone sent me this yesterday and I just had to share.

    Classic 

     They just need more of a delay in the 3rd frame and this is all too accurate. This has gone on the wall in my office.

     This comic and many other great comics can be found at http://xkcd.com/



     

  • Async Sockets and Buffer Management [CTD]

    Recently I posted a buffer manager for async sockets, and quite a few people asked me to put up a fuller example of the code. Well, here it is ... You can download the associated project here (http://codebetter.com/files/folders/codebetter_downloads/entry166135.aspx). Sorry for the delay to those who were waiting.

    To start, I will go over what the code actually does, then I will show why the buffer manager improves the server. I then will pose a question or two to the readers as to my next post or two.

    Let us quickly take a sidebar on how to get the included code working. If you have never changed this setting before, the client app will probably throw an exception saying "Only one usage of each socket address (protocol/network address/port) is normally permitted." when it hits about 2500 connections. To get around this we need to tell Windows to allow more ports to be used. Run regedit and, under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters, add a new DWORD value MaxUserPort with a value of 0x4000 (16384; this should be plenty of ports, although you can go higher). You will need to reboot for this change to take effect.

    The solution layout consists of two projects; a client and a server.

    The code presented is quite simple: it is a basic client-push-only protocol (because BeginReceive is the main problem spot). The clients send data to the server in the form <STX>data<ETX> (the parsing is handled by StxEtxChunker). In a real-life scenario the server would probably do some form of processing on this data, though as of now it just kind of "pretends" to do such processing. The server at this time works with fixed-size receive buffers; if people are curious how to implement dynamically sized receive buffers (as I discussed in the last post, using the IList<ArraySegment<byte>> overload) just let me know and it will be another post.
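
    For readers who have not met this kind of framing before, here is a rough sketch of how <STX>data<ETX> messages can be pulled out of a stream that arrives in arbitrary pieces. It is written in Python purely for illustration and is not the StxEtxChunker from the download; FrameChunker and its feed method are hypothetical names, and the sketch assumes the payload never contains the STX (0x02) or ETX (0x03) bytes.

```python
STX = 0x02  # ASCII start-of-text
ETX = 0x03  # ASCII end-of-text


class FrameChunker:
    """Hypothetical incremental parser for <STX>data<ETX> frames."""

    def __init__(self):
        self._buffer = bytearray()
        self._in_frame = False

    def feed(self, chunk):
        """Consume received bytes and return the frames this chunk completed."""
        frames = []
        for byte in chunk:
            if byte == STX:
                # Start (or restart) a frame, dropping any partial one.
                self._buffer.clear()
                self._in_frame = True
            elif byte == ETX and self._in_frame:
                frames.append(bytes(self._buffer))
                self._buffer.clear()
                self._in_frame = False
            elif self._in_frame:
                self._buffer.append(byte)
            # Bytes that arrive outside a frame are ignored.
        return frames


chunker = FrameChunker()
# A message may be split across receives, and one receive may hold several.
first = chunker.feed(b"\x02hel")
second = chunker.feed(b"lo\x03\x02world\x03")
print(first, second)
```

    The point of keeping state between calls is that a receive completion can end in the middle of a message, so the first call above yields nothing and the second yields both "hello" and "world".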

    You may be asking, "Wait, how come it only receives data?" Well, the BeginReceive method is the one that is most notorious for pinning problems, so I chose to isolate it. The same problems exist with BeginSend but tend to be more minor, as the pinning is not as long lived (I can have a client connected who doesn't send me anything for 45 minutes, but it is quite rare that I send something and it takes more than a second to return as completed).

    Let's get into some CODE!!

    The major change the BufferManager makes to the code is in the constructor:

    Example A: m_ReadBuffer = ApplicationContext.BufferProvider.CheckOut();

    In many systems this would be implemented as something like:

    Example B: m_ReadBuffer = new ArraySegment<byte>(new byte[4096]);
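
    The idea behind Example A can be sketched as a pool that carves fixed-size buffers out of one large allocation instead of allocating a small array per connection. The following is only a conceptual sketch in Python, not the BufferManager from the download: SlabBufferPool, check_out and check_in are hypothetical names, a single slab of fixed capacity is an assumption, and Python has no pinning, so only the bookkeeping is shown.

```python
class SlabBufferPool:
    """Hypothetical pool handing out fixed-size slices of one large allocation."""

    def __init__(self, buffer_size, buffer_count):
        self.buffer_size = buffer_size
        # One big allocation that every buffer is a window into.
        self._slab = bytearray(buffer_size * buffer_count)
        self._view = memoryview(self._slab)
        # Offsets of the slices that are currently free.
        self._free_offsets = [i * buffer_size for i in range(buffer_count)]

    @property
    def available(self):
        return len(self._free_offsets)

    def check_out(self):
        """Return (offset, writable view) for a free slice of the slab."""
        if not self._free_offsets:
            raise RuntimeError("pool exhausted")
        offset = self._free_offsets.pop()
        return offset, self._view[offset:offset + self.buffer_size]

    def check_in(self, offset):
        """Give a slice back so another connection can reuse it."""
        self._free_offsets.append(offset)


pool = SlabBufferPool(buffer_size=4096, buffer_count=4)
offset, buf = pool.check_out()
buf[:5] = b"hello"          # writes land in the shared slab
in_use = pool.available     # 3 slices left while one is checked out
pool.check_in(offset)
print(in_use, pool.available)
```

    Because every buffer is a window into the same large block, the number of separately allocated arrays stays small no matter how many connections are open, which is the property the heap dumps below are measuring.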

    The client code sends randomly sized packets at somewhat random intervals to the server to simulate a load on it.

     

    Now for some analysis; we are interested in a few things based upon the previous post.

    1) What is our heap fragmentation?

    2) Where does the memory live?

     

    For the tests I will run the server and client until all clients are connected (when the client prints "All clients up press enter to exit"). We will then break into the debugger and look at what SOS has to say about our heaps. For the test and in the posted code I am running 10,000 clients with a buffer size of 4k, so it is important that you make the registry change described above in order to duplicate these results.

    In the first example (Example B), which is the normal case and just creates an array to use, we end up with the following output from SOS:

     

    !EEHeap -gc
    PDB symbol for mscorwks.dll not loaded
    Number of GC Heaps: 1
    generation 0 starts at 0x0c7286d4
    generation 1 starts at 0x0c6bba44
    generation 2 starts at 0x013c1000
    ephemeral segment allocation context: none
    segment begin allocated size
    001a7c70 7a72c42c 7a74d308 0x00020edc(134876)
    001966c8 790d5588 790f4b38 0x0001f5b0(128432)
    013c0000 013c1000 023b7fe8 0x00ff6fe8(16740328)
    03fc0000 03fc1000 04ed9b18 0x00f18b18(15829784)
    057f0000 057f1000 0668b7ac 0x00e9a7ac(15312812)
    08740000 08741000 0973e354 0x00ffd354(16765780)
    06d40000 06d41000 07d2f1c8 0x00fee1c8(16703944)
    0c1d0000 0c1d1000 0caef020 0x0091e020(9560096)
    Large object heap starts at 0x023c1000
    segment begin allocated size
    023c0000 023c1000 023c5260 0x00004260(16992)
    Total Size 0x56f7ed4(91193044)
    ------------------------------
    GC Heap Size 0x56f7ed4(91193044)

    !DumpHeap -type Free -stat
    total 18455 objects
    Statistics:
    MT Count TotalSize Class Name
    00153a70 18455 21495952 Free
    Total 18455 objects

    For those not familiar with these outputs, the key number to look at is the total Free size relative to the GC Heap Size, as it shows our fragmentation. In this example roughly 21 MB of a 91 MB heap is free space, which doesn't hit the common "30%" oh-my-god benchmark but isn't too far off. Essentially 1/4 of our heap is wasted. Let's try running our buffer-managed server (Example A) in the same test and see how it performs.


    !EEHeap -gc
    PDB symbol for mscorwks.dll not loaded
    Number of GC Heaps: 1
    generation 0 starts at 0x0bdd9ba0
    generation 1 starts at 0x0bcd3970
    generation 2 starts at 0x013c1000
    ephemeral segment allocation context: none
    segment begin allocated size
    001a7c70 7a72c42c 7a74d308 0x00020edc(134876)
    001966c8 790d5588 790f4b38 0x0001f5b0(128432)
    013c0000 013c1000 023bcd40 0x00ffbd40(16760128)
    06200000 06201000 070d8918 0x00ed7918(15563032)
    0bb70000 0bb71000 0bedbef0 0x0036aef0(3583728)
    Large object heap starts at 0x023c1000
    segment begin allocated size
    023c0000 023c1000 03365340 0x00fa4340(16401216)
    049c0000 049c1000 059610e0 0x00fa00e0(16384224)
    0a680000 0a681000 0ae51040 0x007d0040(8192064)
    Total Size 0x4992e34(77147700)
    ------------------------------
    GC Heap Size 0x4992e34(77147700)

    !DumpHeap -type Free -stat
    total 9560 objects
    Statistics:
    MT Count TotalSize Class Name
    00153a70 9560 3593904 Free
    Total 9560 objects

    Our fragmentation has gone from roughly 21 MB to about 3.6 MB! This is a huge difference and far less waste. As we can see, the BufferManager has done its job in helping to prevent heap fragmentation. In most applications it will also help keep our % of time in GC down, as it reuses the same buffers as opposed to creating new ones.
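
    The comparison can also be put as a ratio of Free bytes to GC Heap Size for each run. This is just the arithmetic on the numbers reported by SOS above, written as a small Python sketch; fragmentation_ratio is a hypothetical helper name.

```python
def fragmentation_ratio(free_bytes, heap_bytes):
    """Share of the GC heap that is occupied by Free objects."""
    return free_bytes / heap_bytes


# (total Free size, GC Heap Size) as reported by SOS for each run.
plain_arrays = fragmentation_ratio(21495952, 91193044)   # new byte[4096] per client
buffer_manager = fragmentation_ratio(3593904, 77147700)  # buffers checked out of the pool

print(f"array per client: {plain_arrays:.1%}")
print(f"BufferManager:    {buffer_manager:.1%}")
```

    That works out to about 23.6% of the heap lost to free space with an array per client, against about 4.7% with the BufferManager.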

    The second question was about where the memory lives. Looking at the results above we can see very quickly that the BufferManager example does in fact put its buffers into the LOH (40977504 bytes with the BufferManager (Example A) versus 16992 without (Example B)). Without the BufferManager the difference is made up in the normal heap, which puts further strain on the GC as it moves our buffers from generation to generation, and further strain again when compacting the gen2 heap where they finally end up.
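
    The LOH totals quoted here are the sums of the segment sizes listed under "Large object heap" in the two !EEHeap outputs. As a quick check, the sketch below (Python, with a hypothetical total_segment_bytes helper) re-adds them from the segment lines copied out of those outputs and compares the result with the raw size of 10,000 buffers of 4096 bytes.

```python
import re

# LOH segment lines copied from the two "!EEHeap -gc" outputs in the post.
LOH_WITHOUT_MANAGER = """\
023c0000 023c1000 023c5260 0x00004260(16992)
"""
LOH_WITH_MANAGER = """\
023c0000 023c1000 03365340 0x00fa4340(16401216)
049c0000 049c1000 059610e0 0x00fa00e0(16384224)
0a680000 0a681000 0ae51040 0x007d0040(8192064)
"""

# segment, begin, allocated, then size as hex with the decimal in parentheses
SEGMENT_LINE = re.compile(
    r"^\s*[0-9a-f]{8}\s+[0-9a-f]{8}\s+[0-9a-f]{8}\s+0x([0-9a-f]+)\((\d+)\)\s*$"
)


def total_segment_bytes(text):
    """Hypothetical helper: sum the size column of EEHeap segment lines."""
    total = 0
    for line in text.splitlines():
        match = SEGMENT_LINE.match(line)
        if not match:
            continue
        size = int(match.group(1), 16)
        if size != int(match.group(2)):
            raise ValueError(f"hex and decimal sizes disagree: {line!r}")
        total += size
    return total


loh_without = total_segment_bytes(LOH_WITHOUT_MANAGER)
loh_with = total_segment_bytes(LOH_WITH_MANAGER)
raw_buffer_bytes = 10_000 * 4096

print(loh_without, loh_with, raw_buffer_bytes)
```

    The BufferManager run comes to 40977504 bytes in the LOH, only slightly more than the 40,960,000 bytes that 10,000 buffers of 4k need, which is consistent with the buffers living there.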

    This brings us to the interesting part of this post: would you like me to continue this example? There are lots of other interesting things to show, like implementing SendQueues, Authentication, and even Encryption. It would seem to me that if this can scale to over 4000 messages per second with 10,000 connections on my LAPTOP, we have a good beginning to an example server. My initial thought is a chat server with a Flash front end (or Silverlight, wink wink); what are yours? If you are a Flash geek or someone playing with Silverlight and want to help, leave a comment.

     

    UPDATE: since people were having trouble with the original download, I have re-uploaded it here: http://codebetter.com/files/folders/codebetter_downloads/entry166135.aspx

    Posted Jul 20 2007, 05:15 AM by Greg with 16 comment(s)