CodeBetter.Com
CodeBetter.Com
RSS 2.0 via Feedburner
           Do you Twitter? Follow us @CodeBetter

Greg Young [MVP]

June 2007 - Posts

  • Async Sockets and Buffer Management

    A long time ago I promised to release my buffer manager class that I use with my Async socket server with some explanations on my blog as to why I use it. Chris Mullins and JD Conley were a lot of help to me in coming up with this strategy some time ago (August? of last year I think) and they are continued resources on advanced socket programming who are great to bounce ideas off of. Unfortunately due to being so gosh darn busy I never had the time to share it with everyone. I should also add that Sebastien Lorion also made a really valuable suggestion in the move from a queue to a stack as the operation is much faster and can provide better localization and that Jeffrey Richter also provided some information on the non-assurance of LOH in the future.

    Yun Jin gives a very good description of the issue in this post but I will briefly summarize the issue. When you call BeginReceive/BeginSend on a socket the memory that you pass to the method will be pinned for the duration of the call. This will lead to heap fragmentation and to make matters worse it will often be in your gen0 heap if you aren't careful. This becomes especially true in the most common socket applications (request/response) as BeginReceive is often called and waits for a long time to receive data (by long time I mean a few seconds or more).

    Unfortunately the solution presented by Yun Jin is far from optimal in 2.0. This is of course of no fault of his own as it was written for 1.1 although some optimizations could apply there as well. Even as presented it will run into a few issues. The first issue and I consider this the biggest is based on one of his statements.

    If the pinned objects are allocated around same time, the free slots between each two objects would be smaller, and the situation is better.

    The code presented there does a good job of creating objects at the same time. The problem is that you don't generally pin them for the same amount of time (and they are often pinned for very long periods of time) so as collections run the GC will move the unpinned ones thus creating holes around the pinned ones (heap fragmentation) which is what we were trying to avoid. Given he does qualify his buffer manager with another rule of

    The shorter the objects are pinned, the easier GC could compact the heap

    Unfortunately in the case of most TCP servers this is not something that can always be realized.

    The main area that we can improve in by moving to .NET 2.0 is by using ArraySegments<byte> structures instead of using byte[]. This may seem like a very simple change but it in fact makes a huge difference in two areas.

    The first thing it will allow us to do which was a limitation of the previous buffer manager is to vary our buffer size per client. If you have a client who is receiving a large file for instance you probably want to use larger send buffers than a client who is not really doing much. The reason you want to do this is to limit the number of thread context switches associated with that socket (less context switches = better and more scalable). The buffer manager as presented used single sized byte arrays and as such all buffers were to be considered of equal size. .NET 2.0 added a few new overloads in dealing with BeginReceive and BeginSend that allow me to pass a List<ArraySegment<byte>> by making my BufferManager now manage ArraySegment<byte> objects a connection can at its whim increase and decrease its buffer size by checking out more buffers or returning some buffer back to the cache respectively.

    The next change that was allowed because of the use of ArraySegments was the further isolation of the memory. Instead of creating numerous small arrays which then move slowly from Gen0->Gen1 etc I can create really big arrays which will be put right into the LOH (Large Object Heap). At present objects over 80k or so are put in the LOH but this is subject to change in the future *keep this in mind as it may affect your performance slightly later as you could end up with a really big array in gen0 which is pinned at almost all times*. One of the nice things about the LOH is that there is no compaction in other words we don't have to worry about the object being pinned or about fragmentation. As you can see the ability to create this one BIG array will be quite a bit better for us as opposed to dealing with lots of little arrays.

    The biggest issue I had with the code as it stood was the handling when we ran out of buffers, it would create them one by one leading to all the same issues we would have without a buffer manager (objects not living near each other and being pinned thus fragmenting the heap). This issue is removed automatically when dealing with chunks because we always N chunks at a time (i.e. if we run out of buffers and we are told to deal with 8k chunks in 1000 buffer segments we won't create 1 new buffer but a whole new 8mb segment consisting of 1000 buffers). Although this could be wasteful for memory one is generally better off in a server application to grow items like this once as opposed to often. In fact the reason why this is not such a big issue overall is that with the previous incarnate it should always be the case that BUFFER_SIZE is larger than the maximum number of buffers you think you would need and thus you never need to resize (an extra few mb in buffer space is no big deal in the context of such an application and this is a common pattern)

    The other changes which were made were making the buffer cache thread safe. I used LockFreeStacks to do this. There are many implementations of lock free stacks that you can use available on the net Julian Bucknall has a great series of articles on this as well. I used this particularly because I run on a machine with lots of processors, you will likely be better off running with simple monitor based locking around a normal Stack<T> if running on less than 8 processors and this should
    be a fairly easy change to the code. If people are fairly interested I will spend the 15 minutes to make this change.

    UPDATE: I put the lock version in a seperate listing (you could make the locks finer grained than my quick update though)

    As a quick sidebar: the checkout method may look odd to you in the LockFreeStack version. The TryPop method will try to pop a value of the stack for me (in the tryxxx pattern format). The reason this pattern is needed is because I don't lock my count test would be innacurate, I need to do the operation attomically If a buffer is not found then I need to create more a new segment of buffers ... but another thread may have already done this for me (so I check the count in my lock before creating again). There is also an interesting case where although a new segment was created I am unlucky enough that all of it has been used up when I get to it so I have to loop. This has all of the problems of lock free algorithms especially starvation but is so unlikely to happen that I am not considerring to be a major concern.

    All of that said here is the code. Note that even with all of this long explanation the code is very short (as is the original) but its a perfect example of how subtle asynchronous socket programming with .NET can be!

     

       /// <summary>
       /// A manager to handle buffers for the socket connections
       /// </summary>
       /// <remarks>
       /// When used in an async call a buffer is pinned. Large numbers of pinned buffers
       /// cause problem with the GC (in particular it causes heap fragmentation).
       ///
       /// This class maintains a set of large segments and gives clients pieces of these
       /// segments that they can use for their buffers. The alternative to this would be to
       /// create many small arrays which it then maintained. This methodology should be slightly
       /// better than the many small array methodology because in creating only a few very
       /// large objects it will force these objects to be placed on the LOH. Since the
       /// objects are on the LOH they are at this time not subject to compacting which would
       /// require an update of all GC roots as would be the case with lots of smaller arrays
       /// that were in the normal heap.
       /// </remarks>
       public class BufferManager {
           private readonly int m_SegmentChunks;
           private readonly int m_ChunkSize;
           private readonly int m_SegmentSize;
           private readonly LockFreeStack<ArraySegment<byte>> m_Buffers;
           private readonly object m_LockObject = new Object();
           private readonly List<byte[]> m_Segments;

           /// <summary>
           /// The current number of buffers available
           /// </summary>
           public int AvailableBuffers {
               get { return m_Buffers.Count; } //do we really care about volatility here?
           }

           /// <summary>
           /// The total size of all buffers
           /// </summary>
           public int TotalBufferSize {
               get { return m_Segments.Count * m_SegmentSize; } //do we really care about volatility here?
           }

           /// <summary>
           /// Creates a new segment, makes buffers available
           /// </summary>
           private void CreateNewSegment() {
               byte[] bytes = new byte[m_SegmentChunks * m_ChunkSize];
               m_Segments.Add(bytes);
               for (int i = 0; i < m_SegmentChunks; i++) {
                   ArraySegment<byte> chunk = new
    ArraySegment<byte>(bytes, i * m_ChunkSize, m_ChunkSize);
                   m_Buffers.Push(chunk);
               }
           }

           /// <summary>
           /// Checks out a buffer from the manager
           /// </summary>
           /// <remarks>
           /// It is the client's responsibility to return the buffer to the manger by
           /// calling <see cref="Checkin"></see> on the buffer
           /// </remarks>
           /// <returns>A <see cref="ArraySegment"></see> that can be used as a buffer</returns>
           public ArraySegment<byte> CheckOut() {
               ArraySegment<byte> b;
               while (true)
               {
               while (true)
               {
                   bool found = !m_Buffers.TryPop(out b);
                   if (!found)
                   {
                       lock (m_LockObject)
                       {
                           if (m_Buffers.Count == 0)
                           {
                               CreateNewSegment();
                           }
                       }
                   }
                   else
                   {
                       return b;
                   }
               }

           }

           /// <summary>
           /// Returns a buffer to the control of the manager
           /// </summary>
           /// <remarks>
           /// It is the client's responsibility to return the buffer to the manger by
           /// calling <see cref="Checkin"></see> on the buffer
           /// </remarks>
           /// <param name="_Buffer">The <see cref="ArraySegment"></see> to return to the cache</param>
           public void CheckIn(ArraySegment<byte> _Buffer) {
               m_Buffers.Push(_Buffer);
           }

           #region constructors

           /// <summary>
           /// Constructs a new <see cref="BufferManager"></see> object
           /// </summary>
           /// <param name="_SegmentChunks">The number of chunks tocreate per segment</param>
           /// <param name="_ChunkSize">The size of a chunk in bytes</param>
           public BufferManager(int _SegmentChunks, int _ChunkSize) :
                this(_SegmentChunks, _ChunkSize, 1) { }

           /// <summary>
           /// Constructs a new <see cref="BufferManager"></see> object
           /// </summary>
           /// <param name="_SegmentChunks">The number of chunks tocreate per segment</param>
           /// <param name="_ChunkSize">The size of a chunk in bytes</param>
           /// <param name="_InitialSegments">The initial number of segments to create</param>
           public BufferManager(int _SegmentChunks, int _ChunkSize, int _InitialSegments) {
               m_SegmentChunks = _SegmentChunks;
               m_ChunkSize = _ChunkSize;
               m_SegmentSize = m_SegmentChunks * m_ChunkSize;
               m_Buffers = new LockFreeStack<ArraySegment<byte>>(_SegmentChunks * _InitialSegments);
               m_Segments = new List<byte[]>();
               for (int i = 0; i < _InitialSegments; i++) {
                   CreateNewSegment();
               }
           }
       }

    Listing 1: Code with LockFreeStack

     

    using System;
    using System.Collections.Generic;
    using System.Text;

    namespace TickerPlant
    {
       /// <summary>
       /// A manager to handle buffers for the socket connections
       /// </summary>
       /// <remarks>
       /// When used in an async call a buffer is pinned. Large numbers of pinned buffers
       /// cause problem with the GC (in particular it causes heap fragmentation).
       ///
       /// This class maintains a set of large segments and gives clients pieces of these
       /// segments that they can use for their buffers. The alternative to this would be to
       /// create many small arrays which it then maintained. This methodology should be slightly
       /// better than the many small array methodology because in creating only a few very
       /// large objects it will force these objects to be placed on the LOH. Since the
       /// objects are on the LOH they are at this time not subject to compacting which would
       /// require an update of all GC roots as would be the case with lots of smaller arrays
       /// that were in the normal heap.
       /// </remarks>
       public class BufferManager {
           private readonly int m_SegmentChunks;
           private readonly int m_ChunkSize;
           private readonly int m_SegmentSize;
           private readonly Stack<ArraySegment<byte>> m_Buffers;
           private readonly object m_LockObject = new Object();
           private readonly List<byte[]> m_Segments;

           /// <summary>
           /// The current number of buffers available
           /// </summary>
           public int AvailableBuffers {
               get { return m_Buffers.Count; } //do we really care about volatility here?
           }

           /// <summary>
           /// The total size of all buffers
           /// </summary>
           public int TotalBufferSize {
               get { return m_Segments.Count * m_SegmentSize; } //do we really care about volatility here?
           }

           /// <summary>
           /// Creates a new segment, makes buffers available
           /// </summary>
           private void CreateNewSegment() {
               byte[] bytes = new byte[m_SegmentChunks * m_ChunkSize];
               m_Segments.Add(bytes);
               for (int i = 0; i < m_SegmentChunks; i++) {
                   ArraySegment<byte> chunk = new
    ArraySegment<byte>(bytes, i * m_ChunkSize, m_ChunkSize);
                   m_Buffers.Push(chunk);
               }
           }

           /// <summary>
           /// Checks out a buffer from the manager
           /// </summary>
           /// <remarks>
           /// It is the client's responsibility to return the buffer to the manger by
           /// calling <see cref="Checkin"></see> on the buffer
           /// </remarks>
           /// <returns>A <see cref="ArraySegment"></see> that can be used as a buffer</returns>
           public ArraySegment<byte> CheckOut() {
               lock (m_LockObject) {
                   if (m_Buffers.Count == 0) {
                       CreateNewSegment();
                   }
                   return m_Buffers.Pop();
               }
           }

           /// <summary>
           /// Returns a buffer to the control of the manager
           /// </summary>
           /// <remarks>
           /// It is the client's responsibility to return the buffer to the manger by
           /// calling <see cref="Checkin"></see> on the buffer
           /// </remarks>
           /// <param name="_Buffer">The <see cref="ArraySegment"></see> to return to the cache</param>
           public void CheckIn(ArraySegment<byte> _Buffer) {
               lock (m_LockObject) {
                   m_Buffers.Push(_Buffer);
               }
           }

           #region constructors

           /// <summary>
           /// Constructs a new <see cref="BufferManager"></see> object
           /// </summary>
           /// <param name="_SegmentChunks">The number of chunks tocreate per segment</param>
           /// <param name="_ChunkSize">The size of a chunk in bytes</param>
           public BufferManager(int _SegmentChunks, int _ChunkSize) :
                this(_SegmentChunks, _ChunkSize, 1) { }

           /// <summary>
           /// Constructs a new <see cref="BufferManager"></see> object
           /// </summary>
           /// <param name="_SegmentChunks">The number of chunks tocreate per segment</param>
           /// <param name="_ChunkSize">The size of a chunk in bytes</param>
           /// <param name="_InitialSegments">The initial number of segments to create</param>
           public BufferManager(int _SegmentChunks, int _ChunkSize, int _InitialSegments) {
               m_SegmentChunks = _SegmentChunks;
               m_ChunkSize = _ChunkSize;
               m_SegmentSize = m_SegmentChunks * m_ChunkSize;
               m_Buffers = new LockFreeStack<ArraySegment<byte>>(_SegmentChunks * _InitialSegments);
               m_Segments = new List<byte[]>();
               for (int i = 0; i < _InitialSegments; i++) {
                   CreateNewSegment();
               }
           }
       }
    }

    Listing 2: Code for BufferManager with Stack<T> + monitor locking
  • More on modopt

    Previously I discussed some issues with modopt. Well there are more. In total I have logged three issues (could be many more as nearly every language I have looked at has the same problems). I figured I would put up some links and describe each issue here.

     C# specification does not define modopt varying overloads in its best match algorithm

    From 335 (7.1.1): "Two signatures that differ only by the addition of
    a custom modifier (required or optional) shall not be considered to
    match."

    There is no mention of C#'s behavior when dealing with these overloads.

    C# if given an overload based upon modopt will always select the version without the modopt. I would consider this to be part of its overload best match seeking algorithm.

    https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=282404

    Generics do not support modopt

    Generics do not properly support modopt generation when a modopt is included in a type paremeter. As an example:

    class A<T> {
        public foo(T t) {
            Console.WriteLine(t.ToString());
        }
    }

    A<modopt(foo) string> a = new A<modopt(foo) string>();

    the type created by the CLR will be of type A<string> not A<modopt(foo) string>. These items are deemed by the CLR (quote above) to be non-equivalent and will often times have different behavior associated to them by a compiler or by runtime code.

    https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=282406

    Suggestion to change C# modopt overloading

    in something else that can generate modopts ...
    public void PutSomething(string modopt(Foo) something);
    public void PutSomething(string something);
    public string modopt(Foo) GetSomething();


    in C#
    PutSomething(GetSomething())

    This call will generate a call to PutSomething(string) even though the value being returned from the library is a string modopt(Foo). C# can logically determine to call the modopt included overload here regardless of if it understands the modifier or not as the library returning it the value obviously does support the modopt.

    https://connect.microsoft.com/VisualStudio/feedback/ViewFeedback.aspx?FeedbackID=282405

     

     

    All in all this seems to be a very obscure area of the spec that was never taken into heavy consideration when designing many things and as such has alot of bugs associated with it.

  • Modopt, method signatures, and incomplete specs oh my!

    The last few days I have been playing around with spec# and reflector alot. In doing so I came accross a pretty serious bug I had to fix in my dynamic proxy. There is this mostly unused but largely important item in IL called a modopt/req. For those more interested in C#/spec# and handling ... scroll down as there is further information.

    In posting to the castle group to see if their dynamic proxy sufferred from the same issue Ayende asked if its like ref and out and the answer is sort of... Basically a modopt is like a custom attribute on a parameter to a method that can be emitted by a compiler. The big difference between the modopt and the attribute is that the modopt actually changes the signature of the method as far as the CLS is concerned. They are however very similar to out/ref in that the callee must specify it when calling and the fact that they change the signature of the method.

    From ECMA 335 (7.1.1): "Two signatures that differ only by the addition of a custom modifier (required or optional) shall not be considered to match."

    C++/CLI uses modopts fairly extensively but I have never really proxied many C++/CLI classes.

    OK on with the modopt stuff

    Well in seeing this I was quickly able to create two failing unit tests for my dynamic proxy. The first unit test involves overloading based only on a modopt difference. Let's propose I have the following two methods (as defined in spec#)

    void Foo(string s) {}
    and
    void Foo(string! s) { } //notnull string

    When I compile this code I receive method signatures as follows.

    .method public hidebysig instance void OverloadByModOpt(string foo) cil managed

    .method public hidebysig instance void OverloadByModOpt(string modopt([System.Compiler.Runtime]Microsoft.Contracts.NonNullType) foo) cil managed

    As you can see the only difference between the two signatures is the modopt of being a non-null type. My dynamic proxy at present was not aware of modopt and therefor would generate an invalid class as it would use the same signature twice.

    The second problem is when you call a method with a modopt. Let's imagine that the overload without the modopt no longer exists and I am just calling the version with the modopt. Because it varies the signature you have to actually include it in the signature of your call. Since I was not doing this I would receive a runtime method not found exception. An example of calling the method with the modopt:

    call instance void ClassLibrary1.Class1::OverloadByModOpt(string
    modopt([System.Compiler.Runtime]Microsoft.Contracts.NonNullType))

    What about C#?

    Well if you go look at ECMA 334 (the C# spec) and you search for the word "modopt" you will find 0 instances. The C# specification does not actually mention modopts in any way. I have already posted something about this to MS. In the case that C# is dealing with a modopt varied overload it will prefer the version without the modopt.

    One item which I have not yet had time to test (someone else feel free) is to give C# two methods both with different modopts to see which one it chooses. My guess is that it will give an error or it will just take the first one it sees (hopefully it gives an error).

    Obviously there are rules associated with the handling of choosing an overload here they are just unfortunately missing from the spec. Because they are missing from the spec it is quite possible that that *insert other spec compliant implementation* could work in a completely different yet compliant manner.

    I have at this point not heard back from the team on this but promise to update this post when I do.

    Finally back to spec#

    I mentioned as a spec# con in my presentation its use of modopt and people looked at me like I had 3 heads ... here is some further info. 

    The most interesting part of all of this is the spec# compiler which generated this code. It in fact suffers from the same problem as C# (which it not surprising since it is derived from C#). It will generate the overloaded method solely by modopt but will never generate a call to the modopt version of the method. In general the logic that C# ignores the optional modifier makes alot of sense since it does not understand it and the modifier is by its very definition 'optional'.

    When we look at spec# however which is generating the modifer I would expect it to honor it in its overload resolution (it understands it since it generated it) or to not let me do it all. I personally prefer not at all (largely in part due to language interop reasons since some languages can't deal with these types of overloads, J# being an example but I think there are more developers at MS maintaining J# than there are professionals using it ... talk about a death march :)) I sent an email as such to the spec# list and was told that yes it should be considerred a bug and that the compiler would in the future not do this.

     The interesting conclusion about all of this is that if the compiler won't let me overload based upon the modopt why use a modopt? The key difference between the modopt and an attribute is that it varies the signature allowing for overloading based upon it. If I can't overload might as well make it an attribute which alleviates all of the issues with my dynamic proxy (well atleast until something else with a modopt is proxied :))

    Moral of the story

    Modopt is obscure .. it is partially handled in various places. Using modopt will likely cause you problems interacting with other languages and/or environments (I believe mono did not understand modopt for a while). If you are writing a tool that deals with signatures DON'T FORGET MODOPT!

  • Presentation tomorrow

    As always I am up for a few beers and interesting conversation after!

    Where: Vancouver BC (BCIT)

    Abstract: Let us, if only for an evening, trade our microscope for a telescope and look at what is a bit further down the pipeline from Microsoft.

    Recently there has been such a surge of new things coming out of the various .NET product teams that it has been nearly impossible to keep up with what is being released, let alone what is on the distant horizon. This presentation looks at some of these products that are a bit further out. We may have to severely change how we develop software for the .NET platform if they reach adoption (ex. spec#, C omega, functional programming, and dynamic languages).

    For each piece of technology a few fundamental questions will be examined:

    1) What does it do?
    2) What types of knowledge prerequisites are needed in order to use this technology?
    3) How will this affect my current development strategy?
    4) Is there anything I can do now to make adoption in the future easier?

    The goal of looking at these questions is both to familiarize you with the technologies coming out, and to help you build a roadmap for how and if to integrate these new technologies into your own products.

    Directions and further information

More Posts