
Looking for advice on performance difference between DotNET C Sharp and Visual C++ handling memory mapped files

I am looking for a debate on the performance difference between these two languages. As for Java, I am under the impression that it offers little advantage over either language unless you want to run under Linux. To do the equivalent in Java, use the NIO framework.

Visual C++ examples:


C# with performance comparison


From this benchmark link:


I conclude that:
1. C++ can still be 2x faster than C# in certain critical number-crunching scenarios

2. Drop Mono, as it is not worth using. Stick with Windows Server editions for the extra horsepower and the many other server-side advantages

Other highlights:

C++, on the other hand, reveals absolutely no difference between the template <double> version and the original non-template version

The results are clear. First of all, C++ wins in every case. If you use the x86 JIT, 

Anyway, VC++ wins big this time, typically running twice as fast as .NET, if not more (and Mono brings up the rear again, at 8.6 seconds). My experience suggests that array range checks can’t account for such a big difference, but investigation is hindered because I still don’t know how to look at C#’s optimized assembly. Go ahead and laugh at me. Ha!

The first time I ran this “Polynomials” benchmark in C# and C++ (VS 2008, x86), the result was that C++ finished in 0.034 seconds, and C# in 7.085 seconds, making C++ 208 times faster! Of course, other than the C++ DEBUG hash_map test that took over 10 hours, this was the most extreme result I had ever seen. The C++ result seemed impossible: the outer loop does 10 million iterations and the two inner loops do 100 iterations each, for a total of 2 billion iterations. There was no way so much work could be done in 0.034 seconds on a 2.83 GHz processor. Indeed, looking at the assembly of the outer loop, you can see that the compiler had done something amazing:
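The arithmetic behind the "impossible" claim can be checked with a small sketch. It uses only the figures quoted above (10 million outer iterations, two inner loops of 100, 0.034 seconds, 2.83 GHz). The function names are made up for illustration and are not part of the benchmark.

```cpp
#include <cstdint>

// Total inner-loop iterations described in the text:
// 10 million outer iterations, each running two inner loops of 100.
constexpr std::uint64_t total_iterations() {
    return 10000000ULL * (100 + 100);
}

// Clock cycles available in `seconds` on a CPU running at `hertz`.
constexpr double cycles_available(double seconds, double hertz) {
    return seconds * hertz;
}

// Iterations that would have to complete per clock cycle to finish in time.
constexpr double iterations_per_cycle(double seconds, double hertz) {
    return static_cast<double>(total_iterations()) /
           cycles_available(seconds, hertz);
}
```

Calling iterations_per_cycle(0.034, 2.83e9) gives a value of about 20, meaning the CPU would have to retire roughly twenty loop iterations every clock cycle. That is why the measured time points to the compiler having removed most of the work rather than executing it.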

Strangely, however, the MS STL’s map<K,V> is actually faster than its hashtable (hash_map or unordered_map), although not as fast as my hash_map. In the .NET world, SortedDictionary is slower than Dictionary as you would expect… but does it have to be this much slower?

C++ wins by a landslide! C++’s map<K,V> is quite fast, while C#’s SortedDictionary<K,V> is quite slow. 

There’s no clear winner for this test. Generally, C# did a better job in the “double” test (even when C++ is allowed to use SSE2), but a worse job in the FPL8 and FPL16 tests. C# is tied with C++ in the float test, until you enable SSE, which causes C++ to win.


Evidently, C++ is faster than C# if you read the whole file as one block. This makes sense, because .NET actually has to do more work in these tests: it is decoding UTF-8 to UTF-16. Standard C++, in contrast, doesn’t support conversion between text encodings, and these tests use “char” strings (not wchar_t), so the only extra work C++ has to do is to convert line endings (\r\n to \n), work which C# does too.

It is therefore peculiar that C# wins the second test, reading line-by-line (ReadLine() in C# vs getline() or fgets() in C++). It occurs to me I could have optimized the FILE* version better; for example, fgets() does not return the string length. So, to eliminate the '\n' at the end of the string (necessary to make the code’s behavior equivalent to the other two versions), I call strlen() to get the string length, then change the last character to '\0'. It might have been faster to convert to std::string first (which determines the string length) and then remove the last character.
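The two ways of stripping the newline described above might look roughly like this. It is a sketch, not the benchmark's actual code: the function names and buffer handling are assumptions, and a check that the last character really is '\n' has been added so that a final line without a newline is not truncated.

```cpp
#include <cstddef>
#include <cstdio>
#include <cstring>
#include <string>

// Approach described in the text: fgets() keeps the trailing '\n' and does
// not report the length, so strlen() is needed to find and overwrite it.
// Returns false at end of file or on a read error.
bool read_line_strlen(std::FILE* f, char* buf, int cap) {
    if (std::fgets(buf, cap, f) == nullptr) return false;
    std::size_t len = std::strlen(buf);
    if (len > 0 && buf[len - 1] == '\n') buf[len - 1] = '\0';
    return true;
}

// Alternative the text speculates about: copy into a std::string first
// (which determines the length once), then drop the last character.
bool read_line_string(std::FILE* f, char* buf, int cap, std::string& out) {
    if (std::fgets(buf, cap, f) == nullptr) return false;
    out.assign(buf);
    if (!out.empty() && out.back() == '\n') out.pop_back();
    return true;
}
```

Whether the second version is actually faster would have to be measured. Both still scan the buffer once to find its end, so any difference would come from what the caller does with the line afterwards.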

If you use P/Invoke a lot, though, it’s important to consider how long it takes to cross the boundary between C# and C++.

Now, these results are not terrible. But non-P/Invoke calls are much faster, as you’ll see below.





A comment came in from a newsletter subscriber, so thanks to him:


I’ve done this in C#.  I did not do it in C++ as for my needs C# performance was more than sufficient for this case.
When used in conjunction with something like .NET’s AutoResetEvent which can be shared across multiple processes it is surprisingly fast.  So if you are using it for something like inter process communication it works quite well!
Now with respect to how it compares to C++, I am not sure but you could probably test this out pretty quickly.  My guess is that the underlying parts which handle this might even be written in C/C++ and .NET is using P/Invoke into these OS DLLs to offer the functionality.  A simple test might be writing in 100 million objects and seeing how each performs.
Keep in mind that you can mix native/managed fairly easily through the use of C++/CLI or P/Invoke.  Both work very well.
I find C++ to require a bit more effort than C# and a lot more than F# so it’s best saved for algorithms that REALLY benefit from it and even then you can offload to the native side where necessary.  You can always use C++/CLI to wrap your native code and hold a pointer to whatever your native class might be.  The only penalty you are faced with in this case is the additional chatter that might be taking place between the native/managed boundary.
One scenario where managed simply might not work is if it is critical that the operation you are performing MUST take place within a specific tolerance level, as you are vulnerable to garbage collection operations (but even in this case there is room to tune).
Hopefully you find this useful!
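
The simple write test suggested in the comment above could be sketched on the native side roughly as follows. This is only an illustration under stated assumptions: the function name time_mapped_writes is made up, plain 32-bit integers stand in for the "objects", and only the in-memory writes are timed, not the flush to disk. The Windows branch uses the Win32 calls CreateFileA, CreateFileMappingA and MapViewOfFile. The POSIX branch is an addition so the sketch can also be tried outside Windows.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>

#ifdef _WIN32
#include <windows.h>
#else
#include <fcntl.h>
#include <sys/mman.h>
#include <unistd.h>
#endif

// Creates (or overwrites) `path`, maps it as `count` 32-bit integers, writes
// every element through the mapping, and returns the elapsed seconds for the
// write loop. Returns -1.0 on any failure.
double time_mapped_writes(const char* path, std::size_t count) {
    const std::size_t bytes = count * sizeof(std::uint32_t);
    if (bytes == 0) return -1.0;
#ifdef _WIN32
    HANDLE file = CreateFileA(path, GENERIC_READ | GENERIC_WRITE, 0, nullptr,
                              CREATE_ALWAYS, FILE_ATTRIBUTE_NORMAL, nullptr);
    if (file == INVALID_HANDLE_VALUE) return -1.0;
    const std::uint64_t size = bytes;
    // Creating the mapping with an explicit size grows the file to that size.
    HANDLE mapping = CreateFileMappingA(file, nullptr, PAGE_READWRITE,
                                        static_cast<DWORD>(size >> 32),
                                        static_cast<DWORD>(size & 0xFFFFFFFFu),
                                        nullptr);
    if (mapping == nullptr) { CloseHandle(file); return -1.0; }
    void* view = MapViewOfFile(mapping, FILE_MAP_ALL_ACCESS, 0, 0, bytes);
    if (view == nullptr) {
        CloseHandle(mapping);
        CloseHandle(file);
        return -1.0;
    }
#else
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1.0;
    // The file must be grown to the mapped size before it is written to.
    if (ftruncate(fd, static_cast<off_t>(bytes)) != 0) { close(fd); return -1.0; }
    void* view = mmap(nullptr, bytes, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (view == MAP_FAILED) { close(fd); return -1.0; }
#endif
    auto* data = static_cast<std::uint32_t*>(view);
    const auto start = std::chrono::steady_clock::now();
    for (std::size_t i = 0; i < count; ++i) {
        data[i] = static_cast<std::uint32_t>(i);
    }
    const auto stop = std::chrono::steady_clock::now();
#ifdef _WIN32
    UnmapViewOfFile(view);
    CloseHandle(mapping);
    CloseHandle(file);
#else
    munmap(view, bytes);
    close(fd);
#endif
    return std::chrono::duration<double>(stop - start).count();
}
```

A call such as time_mapped_writes("test.bin", 100000000) would write 100 million integers (about 400 MB) and return the time taken. Comparing that number with an equivalent C# loop over a memory mapped view would be one way to run the test the commenter describes.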