Tag Archives: garbage collection

How do you get around the garbage collection issue when using C# as you have been doing lately for HFT?

A question from a newsletter reader:

Bryan,

How do you get around the garbage collection issue when using C# as you have been doing lately for HFT?

Why is the garbage collection issue not a deal breaker?

This is a fairly valid concern.
What is the other choice? Manage your own memory in C++? That is just as complicated.

Try reading tips like this:
http://stackoverflow.com/questions/3267613/how-to-reduce-garbage-collection-performance-overhead

Also, my important backend model engines can easily be converted to C or C++.

Read how I will do this through my FREE newsletter

HOW DO YOU START A PROFITABLE TRADING BUSINESS? Read more NOW >>>

NOTE I now post my TRADING ALERTS into my personal FACEBOOK ACCOUNT and TWITTER. Don't worry as I don't post stupid cat videos or what I eat!

Hypertable Beats HBase in Performance Test — HBase Overwhelmed by Garbage Collection


Hypertable is an open source, scalable database, based on Google’s proprietary Bigtable design. It is similar to HBase except that it is written in C++ for optimal performance. In this High Scalability post, we summarize a test we recently conducted comparing the performance of Hypertable with that of HBase under a number of realistic workloads.

 

 

Hmmm.
Sounds like someone didn’t bother tuning HBase…
(I kid, I kid.)

It’s interesting, but somewhat self-serving…

 


Quant development: Is it possible to achieve Java zero GC garbage collection? How?


Zero GC is of course possible if you allocate objects only at startup, do not use third-party libraries, and use JNI where you need to replace “standard” libraries that are sure to allocate garbage (I think all of Sun’s native libraries (io, nio) allocate garbage). Agreed.
GC is not the only source of unpredictable latency in a Java application. There are others (at least on non-RT OSes like regular Linux):
A. The default process scheduler, which favors the RT policies (SCHED_FIFO, SCHED_RR) over SCHED_OTHER (the default for Java)
B. Occasional hardware interrupts, which of course have a higher priority than your precious Java thread
C. Intel HT, which creates excessive cache contention
D. Some BIOS default features that need to be disabled, such as processor C-states
E. Occasional Java process page faults
F. Lack of thread affinity support in Java, which results in poor cache reuse (a thread can be rescheduled to another CPU core)
That is just for starters. I am not even talking about network and file I/O yet.

This was a comment from Vladimir Rodionov


Zero Garbage Collection (GC) in Java?

Zero Garbage Collection (GC)

I have recently heard from two different sources that they have implemented low latency Java systems with ZERO GC.

Is that really possible? If you have achieved this, then please share your ideas or opinions on how you think this can be achieved.

Use static final class member variables and static scalars only.

I don’t believe that it is possible. Eventually the JVM will need to clear memory. You can minimize the frequency and impact of GCs, but they have to happen.
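One way to check a claim like this on your own system is to count collections around the code you care about. The sketch below uses the standard `java.lang.management` beans; the class name `GcCounter` and the sample loop are made up for illustration, and the counts only cover the collectors that the running JVM chooses to report.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;

// Hypothetical helper: sums GC activity over every collector the JVM reports.
final class GcCounter {

    /** Total number of collections so far, across all reported collectors. */
    static long totalCollections() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long count = gc.getCollectionCount(); // -1 when the collector does not define it
            if (count > 0) {
                total += count;
            }
        }
        return total;
    }

    /** Approximate accumulated collection time in milliseconds. */
    static long totalCollectionMillis() {
        long total = 0;
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            long millis = gc.getCollectionTime(); // -1 when the collector does not define it
            if (millis > 0) {
                total += millis;
            }
        }
        return total;
    }

    public static void main(String[] args) {
        long before = totalCollections();
        long sum = 0;
        for (int i = 0; i < 1_000_000; i++) {
            sum += i; // primitives only, so this loop itself allocates nothing
        }
        long after = totalCollections();
        System.out.println("sum=" + sum + ", collections during loop: " + (after - before));
    }
}
```

If the difference stays at zero over a long run of the real processing loop, that is evidence (not proof) that the loop is not producing garbage.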

Why not? You just allocate as much memory as you need and never release it. In fact it would mean implementing your own memory management but since you can customize it for your specific task it makes perfect sense.
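A minimal sketch of that "allocate once, never release" idea is a fixed-size object pool. Everything here is hypothetical: the `Order` type, its fields, and the `OrderPool` name are invented for illustration. It assumes a single thread, and it does not guard against releasing the same object twice.

```java
// Hypothetical mutable message type; the fields are illustrative only.
final class Order {
    long id;
    double price;
    int quantity;

    void clear() {
        id = 0;
        price = 0.0;
        quantity = 0;
    }
}

// Fixed-size pool: every Order is created in the constructor and reused afterwards.
final class OrderPool {
    private final Order[] free;
    private int top; // number of objects currently available

    OrderPool(int capacity) {
        free = new Order[capacity];
        for (int i = 0; i < capacity; i++) {
            free[i] = new Order();
        }
        top = capacity;
    }

    /** Hands out a pooled object; never calls new on the normal path. */
    Order acquire() {
        if (top == 0) {
            // The exception itself allocates, so size the pool so this never happens.
            throw new IllegalStateException("pool exhausted");
        }
        return free[--top];
    }

    /** Resets the object and returns it to the pool. */
    void release(Order order) {
        order.clear();
        free[top++] = order;
    }

    int available() {
        return top;
    }
}
```

The trade-off is the one the reply names: you are now doing your own memory management, including deciding the capacity up front and making sure nothing holds on to an object after it is released.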

A long rambling reply …

Well everything is possible, with constraints and some “yes buts”.

I spent a year working on a Java codebase that had been built as a framework/toolset intended for low latency applications. The assumption was that the majority of the performance issues were a result of GCs. The appeal to me was that, as a long time Java user who specialised in performance tuning, I’d be able to work at the sharp edge and assess whether Java really was suited for low latency. Despite being a Java fanboy I had real skepticism about Java’s suitability.

So after a year of frustration I had convinced myself that yes, you can write low latency code in Java, in the same way that you can walk on fire, or teach a pig to dance, and the results will be just as pretty, and just as much fun. In my case it turned out that the costs of L2 cache busting on Nehalem, and the costs of mode switches and context switches, were all having more impact than GC costs. Now these issues will also apply in C/C++ and other languages.

Why Java isn’t right for low latency:
1. Culturally, what many Java folk view as best practices are the practices that guarantee failure if you’re attempting low latency
2. Java tends to push the developer further away from the metal with too much magic happening “under the covers.” Java makes it too easy to write enormous amounts of code.
3. Java makes it easy to write concurrent code. Concurrency is expensive.
4. C++ is harder, which means the entry criteria for being a C++ developer are higher. This sounds and is elitist, and it’s also misleading.

Why C++ isn’t right for low latency:
1. I’ve worked on too much C++ code that performs badly to believe the arrogant nonsense that C++ folk like to spout about Java programmers. My experience has been that most software is between 2 and 5 orders of magnitude slower than it needs to be.

Performance architecture is a holistic practice. Large organizations will fail at low latency regardless of language. That’s because you need to jump between the language level, the messaging, DB, kernel, BIOS, NIC, physical network layers. In large corporations these are typically owned by different teams.

If you want to use Java or C, or even Perl, for low latency code, what is most important is a willingness to embrace a scientific approach, enough humility to question all your assumptions, and a willingness to work at every layer of the tech stack, not just the ones where you feel comfortable.

So as for ZERO GC: There are 4 or 4.5 ways to do this:

1. If you write C or Fortran in Java, and don’t allocate memory once the software is initialized, then you can avoid GC cycles.
2. If, as is true for a friend of mine, your timescale of interest is only a few minutes (investing at market open/close), then you can get away without GCs by allocating an enormous heap (even 50GB or more), being confident that you won’t use even 60% of that heap in your active time, and then recycling the process once things are quiet.
3. Azul Systems make massively scalable hardware that runs a custom JVM designed to avoid GC pauses with massive heaps. I am currently working on a system that is deployed on 96-core hosts and runs with a 30GB heap and has no substantive pauses when running for a day. It really works but isn’t perfect. The hardware is scalable, but the 900MHz CPUs are slow (in a straight line).
4. IBM ships a RealTime JVM that’s designed for contexts where predictability is more important than absolute throughput. You can configure your maximum acceptable pause and the VM will ensure that it keeps to that boundary. The downside is that the throughput of this JVM is less than that of a conventional JVM.
5. Azul also offer a virtualized version of their JVM technology that can be run on commodity Intel hardware on top of VMware.
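For option 2 in the list above, the process has to notice when it is getting close to its heap budget. Here is a rough sketch of such a check using the standard `Runtime` methods. The class name `HeapHeadroom` is hypothetical, the 0.60 budget is simply the 60% figure mentioned above, and when a collection actually triggers depends on the JVM and how it sizes its generations, so treat this as a monitoring aid rather than a guarantee. The large fixed heap itself would typically be requested with the `-Xms` and `-Xmx` JVM options.

```java
// Hypothetical helper for the "huge heap, recycle when quiet" approach.
final class HeapHeadroom {

    /** Fraction of the maximum heap currently in use (live objects plus uncollected garbage). */
    static double usedFraction() {
        Runtime rt = Runtime.getRuntime();
        long used = rt.totalMemory() - rt.freeMemory();
        return (double) used / rt.maxMemory();
    }

    /** True while usage is at or below the given budget, e.g. 0.60. */
    static boolean withinBudget(double budget) {
        return usedFraction() <= budget;
    }

    public static void main(String[] args) {
        double budget = 0.60; // the 60% figure from the text; tune for your own system
        System.out.println("heap used fraction: " + usedFraction());
        System.out.println("within budget: " + withinBudget(budget));
    }
}
```

A process following this approach would poll the check outside the hot path and schedule its restart for a quiet period once the budget is close.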

To avoid GC you just need to make sure not to allocate memory after the startup and warm-up phase of your application.
This implies that you have to avoid using most Java libraries because of their unwanted memory allocations.
You also have to avoid using immutable objects like String, avoid boxing, and re-implement collections so that they pre-allocate enough entries for the whole lifetime of your program.
You also need to use JNI to work around some parts of Java NIO that allocate temporary objects (selector).
Writing Java code this way is not easy, but certainly possible.
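To give a feel for what "re-implement Collections" and "avoid Boxing" mean in practice, here is a small sketch of a queue of `long` values backed by an array that is allocated once. The class `LongRingBuffer` and its method names are invented for this example. It is single-threaded and fixed-capacity, and an empty queue is signalled with a caller-supplied sentinel instead of `null` or an exception, since both of those would mean objects.

```java
// Hypothetical pre-allocated FIFO of primitive longs: no boxing, no allocation after construction.
final class LongRingBuffer {
    private final long[] slots;
    private int head; // index of the oldest element
    private int size; // number of stored elements

    LongRingBuffer(int capacity) {
        slots = new long[capacity]; // the only allocation this class ever makes
    }

    /** Appends a value; returns false instead of growing when the buffer is full. */
    boolean offer(long value) {
        if (size == slots.length) {
            return false;
        }
        slots[(head + size) % slots.length] = value;
        size++;
        return true;
    }

    /** Removes and returns the oldest value, or emptyValue when nothing is stored. */
    long pollOr(long emptyValue) {
        if (size == 0) {
            return emptyValue;
        }
        long value = slots[head];
        head = (head + 1) % slots.length;
        size--;
        return value;
    }

    int size() {
        return size;
    }
}
```

Compare this with `ArrayDeque<Long>`, where every `offer` may box the value into a new `Long` and the backing array is replaced when it fills up.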

—-

Are you hearing the claim from developers or from vendors of some product?

Absolute Zero-GC in Java is pretty much impossible, however there are systems that achieve Zero-GC (or near it) for their long running, primary processing loops. GC-fearing Java developers can spend lots of time reducing GC risk in their low-latency applications, and some of the trade-offs you have to make aren’t always clear.
For example, if you’re reading XML documents over a socket, you may make a choice to operate on the XML document directly as bytes in an NIO ByteBuffer instead of parsing into a user friendly structure that carries some GC risk. In doing so, you now have to be very careful how you extract meaningful values out of this ByteBuffer. If you want to convert a byte sequence to an int, you’ll likely have to roll your own function for converting the sequence of bytes into an int. If you didn’t care about latency, you could just convert the bytes to a String and then use parseInt. This becomes even more complicated by the fact that a home grown byte[]->int conversion routine in Java will likely be slower than the methods that carry GC risk (i.e. it’s often faster to convert to a String and use parseInt, because any home-grown method will have to index into the byte array doing bounds checking on every byte access). There are many cases where code becomes slower by reducing GC risk and you have to make the active choice based on a belief that the system will yield a better latency profile with less GC risk.
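A home-grown conversion of the kind described might look roughly like the sketch below. The class name `AsciiInts` and the sample message are made up. It assumes the digits are plain ASCII, it does not check for overflow, and the error path still allocates an exception, so it only avoids garbage when the input is well formed.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: reads a decimal int straight out of a ByteBuffer without creating a String.
final class AsciiInts {

    /** Parses an optional '-' followed by ASCII digits in buf[offset, offset + length). */
    static int parseInt(ByteBuffer buf, int offset, int length) {
        int i = offset;
        int end = offset + length;
        boolean negative = false;
        if (length > 0 && buf.get(i) == '-') { // absolute get: does not move the buffer position
            negative = true;
            i++;
        }
        if (i == end) {
            throw new NumberFormatException("no digits");
        }
        int result = 0;
        for (; i < end; i++) {
            int digit = buf.get(i) - '0';
            if (digit < 0 || digit > 9) {
                throw new NumberFormatException("not a digit at index " + i);
            }
            result = result * 10 + digit; // no overflow check in this sketch
        }
        return negative ? -result : result;
    }

    public static void main(String[] args) {
        // Setup allocates (String, byte[]); only the parse call itself is allocation-free.
        ByteBuffer msg = ByteBuffer.wrap("<qty>-1250</qty>".getBytes(StandardCharsets.US_ASCII));
        System.out.println(parseInt(msg, 5, 5)); // the field "-1250" starts at index 5
    }
}
```

Whether this is faster or slower than going through a String is exactly the kind of thing the comment says you have to measure rather than assume.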

Java systems written in this way tend to look a lot like their C/C++ equivalents, but IMO it’s more difficult and time consuming than just writing the low-latency system in C/C++, where the GC risk is already ZERO, in the first place.

—-

This could be in reference to Sun’s Real-Time Java, or Java RTS.

The critical point is that Java RTS uses thread prioritization to control the effect of GC scheduling (i.e. threads can be prioritized over GC) and new types of memory allocation that allow allocated memory to be exempted from GC processing.

There are a couple of catches with Java RTS. It requires specific OS versions that have the necessary real-time kernel extensions (e.g. SUSE, Red Hat and Solaris). These OS versions in turn have specific hardware requirements. I would assume that since access to real hardware clocking signals is a critical part of real time systems, then any form of virtualization is out.

The question would be what effect this has on latency and/or jitter (i.e. latency variation) compared to other options.

 

 


C++ interview questions on garbage collection, Java, casting

Garbage Collection:
If you were to implement a garbage collector in C++ how will you do it?

The easiest way is to use reference counting, keeping a list of reference counts.
Check out:

http://en.wikipedia.org/wiki/Garbage_collection_(computer_science)#Implementation_strategies
Create a daemon thread.

“What is multi-generational garbage collection?”
Variable scope:
Scope of variables in a program that has goto and multiple blocks

casting
size_t
(Programming Language Test) When casting an object of a polymorphic class from a base class, which kind of cast executes only if it is valid?
The choices were a number of different kinds of casts: dynamic, static, reinterpret, etc.
You should use dynamic_cast. It returns a null pointer if the cast is not valid. reinterpret_cast can give you a bad pointer, since the result may point to the wrong part of the polymorphic object.

Also, it’s important to note the difference between using pointers and references in casts. In C++, if you use a pointer in a dynamic_cast<>() and it fails, the result is a null pointer. However, since you cannot have a null reference, a failed dynamic_cast<>() on a reference throws std::bad_cast, so reference casts need to be wrapped in try/catch blocks.
Data structures
Data structures for handling high-volume data

time complexities
java:
What are the trade-offs in C++ vs Java?

http://en.wikipedia.org/wiki/Comparison_of_Java_and_C%2B%2B

What support does java.io have for preventing malicious code from altering Serialized Objects?
When using an ArrayList as the implementation for a list collection, what happens if adding an element exceeds the ArrayList’s capacity?
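
For the ArrayList question, the documented behavior is that the capacity grows automatically as elements are added; the exact growth policy is not specified beyond adds having amortized constant cost. A small sketch to demonstrate it (the class and method names are made up):

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical demo: adds more elements than the initial capacity requested.
final class ArrayListGrowth {

    /** Creates a list with the given initial capacity and adds count elements to it. */
    static List<Integer> fillPast(int initialCapacity, int count) {
        List<Integer> list = new ArrayList<>(initialCapacity);
        for (int i = 0; i < count; i++) {
            list.add(i); // no exception when capacity is exceeded; the list resizes itself
        }
        return list;
    }

    public static void main(String[] args) {
        List<Integer> list = fillPast(2, 5);
        System.out.println("size after adding past capacity: " + list.size());
    }
}
```

This ties back to the garbage collection theme: each resize copies the elements into a new, larger backing array and leaves the old one as garbage, and `list.add(i)` boxes the int into an `Integer`.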
