Quant development: Are neural networks and genetic algorithms interesting research fields for financial modelling next time?

(Last Updated On: May 4, 2011)

Are neural networks and genetic algorithms interesting research fields for financial modelling next time?

• I think you’ll get a wide variety of opinions on that, with responses ranging from complete trash to marvelous. The main problem is that it is tremendously easy to overfit the models, so you get wonderful in-sample behavior but horrible out-of-sample prediction.

There are definitely people using neural networks to good effect, but that does not happen automatically.

Neural networks are OK, but you’ll have a hard job understanding what they’ve discovered (if anything). Genetic algorithms are an efficient optimisation technique, but you need rules to optimise to start with. Particle swarm optimisation and differential evolution are both better, IMHO. If you need trading rules rather than just optimisation, then genetic programming is probably the best tool. As Patrick says, though, all of these very powerful techniques can quickly lead to over-fitted results that have no predictive ability going forward, and there is considerable skill, and an art, to preventing that from happening.

Most people who try this don’t know anything about time series analysis, finance or, for that matter, neural nets; as a result, their models generally overfit. Try fitting to noisy chaotic data for practice. It’s easily generated, obviously deterministic and has many of the properties of financial data. Better yet, try fitting to noisy chaotic data with the Markov property.
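As a rough illustration of what such practice data could look like, here is a minimal sketch. It assumes the logistic map with r = 4 as the chaotic system and Gaussian observation noise; neither choice comes from the post, and the function name is hypothetical. The underlying state is first-order Markov, since each value depends only on the previous one.

```python
import random

def noisy_logistic_series(n, r=4.0, x0=0.2, noise_sd=0.05, seed=0):
    """Logistic map x[t+1] = r*x[t]*(1-x[t]) observed with Gaussian noise.

    The map and the noise model are illustrative assumptions.
    """
    rng = random.Random(seed)
    x = x0
    clean, noisy = [], []
    for _ in range(n):
        clean.append(x)                              # deterministic state
        noisy.append(x + rng.gauss(0.0, noise_sd))   # what the modeller observes
        x = r * x * (1.0 - x)                        # next state depends only on x
    return clean, noisy

clean, noisy = noisy_logistic_series(500)

# Supervised pairs (observation at t, observation at t+1) for a one-step forecaster.
pairs = list(zip(noisy[:-1], noisy[1:]))
```

Because the noise-free rule is known, a fitted model can be scored against `clean` to see whether it learned the dynamics or just the noise.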

For each model there is an optimal number N* of parameters. If we use a number N greater than N*, the cumulative calibration error grows with N. If we use a number N less than N*, the calibration error also grows, because the degree of non-linearity of the model does not match that of the real data. Hence the art of neural networks is to find a suitable number of neurones and a suitable architecture to fit the real data while at the same time preventing overfitting.
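The idea of an optimal N* can be sketched with a much simpler model than a neural network. The example below is an assumption-laden toy: a piecewise-constant model with one parameter per bin stands in for the network, the data are a noisy sine wave, and all names are hypothetical. Error on held-out data is used to pick the number of parameters.

```python
import math
import random

def make_data(n, rng, noise_sd=0.3):
    # Synthetic target (assumption): sine wave plus Gaussian noise.
    xs = [rng.random() for _ in range(n)]
    ys = [math.sin(2 * math.pi * x) + rng.gauss(0.0, noise_sd) for x in xs]
    return xs, ys

def fit_bins(xs, ys, n_bins):
    """Piecewise-constant model: one parameter (the mean of y) per bin."""
    sums = [0.0] * n_bins
    counts = [0] * n_bins
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        sums[b] += y
        counts[b] += 1
    overall = sum(ys) / len(ys)
    # Bins with no training points fall back to the overall mean.
    return [s / c if c else overall for s, c in zip(sums, counts)]

def mse(params, xs, ys):
    n_bins = len(params)
    err = 0.0
    for x, y in zip(xs, ys):
        b = min(int(x * n_bins), n_bins - 1)
        err += (params[b] - y) ** 2
    return err / len(xs)

rng = random.Random(1)
train_x, train_y = make_data(100, rng)
valid_x, valid_y = make_data(1000, rng)

candidates = [1, 2, 4, 8, 16, 32, 64]
results = {}
for n in candidates:
    params = fit_bins(train_x, train_y, n)
    results[n] = (mse(params, train_x, train_y), mse(params, valid_x, valid_y))

# The validation-optimal N plays the role of N* in this toy setting.
best_n = min(candidates, key=lambda n: results[n][1])
for n in candidates:
    print(n, round(results[n][0], 3), round(results[n][1], 3))
print("validation-optimal number of parameters:", best_n)
```

The in-sample error keeps falling as bins are added, so only the held-out column says anything about where N* lies.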


As in all matters, you need the right tool for the job. In selecting a model, assumptions can increase efficiency at the cost of potential bias; and within any model, there is a balance between too much omitted-parameter bias and too much included-parameter inefficiency. And of course, you want quality data, accurately measured, for independent variables that relevantly affect the dependent variables, since independent variables irrelevantly correlated with dependent variables result in overfitting. Given enough quality data, non-parametric methods such as ANNs can be terrific; however, the costs are increased inefficiency due to the lack of assumptions, and increased inefficiency due to using too many parameters/neurons and too many training epochs, since both excess parameters and excess training length again result in overfitting. Hence the importance of getting your data and model-based assumptions right and achieving the right balance between bias and efficiency (e.g. via mean square error). Simply put, as in any aspect of life, you want to choose the right tool and wield it well.

A genetic algorithm has four elements that make it effective:
1. Selection for survival based on the fitness value.
2. Selection for crossover based on the fitness value.
3. Crossover
4. Mutation (important to jump out of the local minimum trap, or even the initialization range)
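The four elements above can be sketched in a few lines. This is a minimal illustration, not a tuned implementation: the Rastrigin objective, elitism as the survival rule, tournament selection, uniform crossover, Gaussian mutation and every parameter value are assumptions chosen for brevity.

```python
import math
import random

def fitness(ind):
    # Toy objective (assumption): negative Rastrigin function, maximum 0 at the origin.
    return -sum(x * x - 10 * math.cos(2 * math.pi * x) + 10 for x in ind)

def tournament(pop, rng, k=3):
    # Element 2: fitter individuals are more likely to be picked as parents.
    return max(rng.sample(pop, k), key=fitness)

def crossover(a, b, rng):
    # Element 3: uniform crossover mixes genes from both parents.
    return [x if rng.random() < 0.5 else y for x, y in zip(a, b)]

def mutate(ind, rng, rate=0.2, scale=0.5):
    # Element 4: random perturbation, which can also leave the initial range.
    return [x + rng.gauss(0.0, scale) if rng.random() < rate else x for x in ind]

def run_ga(dim=2, pop_size=40, generations=60, n_elite=4, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-5, 5) for _ in range(dim)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        history.append(fitness(pop[0]))
        # Element 1: only the fittest survive unchanged into the next generation.
        survivors = pop[:n_elite]
        children = []
        while len(children) < pop_size - n_elite:
            child = crossover(tournament(pop, rng), tournament(pop, rng), rng)
            children.append(mutate(child, rng))
        pop = survivors + children
    best = max(pop, key=fitness)
    return best, history + [fitness(best)]

best, history = run_ga()
print(best, history[-1])
```

Because the elite individuals are carried over unchanged, the best fitness in `history` can never get worse from one generation to the next.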

It seems to me that Particle Swarm Optimization has #2 and #3, but not #1 and #4, and that Differential Evolution has #3 only. These two may work better when the search space is relatively regular. But if we want to be “surprised” by the findings, maybe a genetic algorithm is better, and more suitable for completely random and more complex search spaces, which is the case for real evolution.

That said, I guess Particle Swarm Optimization has the benefit of keeping multiple local minima alive, and therefore prevents fast convergence to a single local minimum.


NNs lack a good mathematical basis: they were inspired by biological arguments made by computer scientists half a century ago, not built on mathematical/statistical principles.

There have been great advances in supervised learning since then:
* Kernel machines / support vector machines
* Gaussian processes for machine learning
... and many more.

Regarding GAs: operators like crossover and mutation are arbitrary and don’t have a mathematical basis. Sometimes they are effective, sometimes they are ineffective; it all depends on the shape of the optimization surface. The operators are a set of rules that tell you how to search and sample the space. They aren’t special in any way (in their mathematical properties); they just result in a certain type of search diffusion/sampling:
* If it’s smooth, you can use gradient methods
* If it’s rough with local maxima, you could try simulated annealing
* If it’s fractal, like Cantor dust, you’re in trouble!
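For the rough-surface case, a bare-bones simulated annealing loop might look like the sketch below. It is written for minimisation (maximising is the same with the sign flipped). The test function, the linear cooling schedule and all parameter values are illustrative assumptions.

```python
import math
import random

def rough(x):
    # Toy rough surface (assumption): a smooth bowl plus ripples with many local minima.
    return x * x + 10 * math.sin(3 * x) ** 2

def simulated_annealing(f, x0, steps=5000, t0=5.0, step_sd=0.5, seed=0):
    rng = random.Random(seed)
    x, fx = x0, f(x0)
    best_x, best_fx = x, fx
    for i in range(steps):
        temp = t0 * (1.0 - i / steps) + 1e-9   # linear cooling schedule
        cand = x + rng.gauss(0.0, step_sd)     # random local move
        fc = f(cand)
        # Always accept improvements; accept worse moves with Boltzmann probability.
        if fc <= fx or rng.random() < math.exp((fx - fc) / temp):
            x, fx = cand, fc
            if fx < best_fx:
                best_x, best_fx = x, fx
    return best_x, best_fx

best_x, best_fx = simulated_annealing(rough, x0=4.0)
print(best_x, best_fx)
```

Occasionally accepting worse moves is what lets the search climb out of a local minimum while the temperature is still high, something a pure gradient method cannot do on a surface like this.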


Gentlemen, rather than bore you with my minuscule knowledge of the mathematical basis, I will let the individuals at IEEE show you their knowledge of neural networks in “IEEE Transactions on Neural Networks.” One has to be careful from whom advice is taken. First, we have both supervised and unsupervised learning. Second, the IEEE folks will give you a more complete and sound description of genetic algorithms. Third, a difference exists between a non-linear space and the linear space on which multivariate statistics is based. Fourth, if you are afraid of technology that is based on non-linear mathematics, you might want to think twice about driving some vehicles, because I am willing to bet 10-1 that some of them use algorithms that reside in a non-linear space. In case you are wondering, I have over a decade of experience using computational intelligence technologies from non-linear spaces, and I got more than I wanted to know about multivariate statistical theory at GT. Best wishes.


In trading everything is non-linear, that’s the whole issue :-). A good example is optimizing the parameters of an algorithmic trading model.

I would be interested to know what type of non-linear problems you’ve spent the last decade on, and what types of tools (other than NN, GA?) you’ve used.


I believe that most events in this universe are nonlinear. We have chosen to pose them in a linear framework so that we can use our linear tools. Coming from a statistics background, I have found that modeling with the current nonlinear tools has an analogous basis to the generalized linear models of statistics. Also, the nonlinear tools, such as ANNs, genetic algorithms (GAs, GPs, and so forth) and other evolutionary algorithms, have their strengths and weaknesses; for example, one would not use the same NN architecture for all event types. Second, NNs do what they were designed to do quite well, which is to make generalized functional approximations, although some of the search algorithms are better than others. Clearly, advances have been made since the BackProp algorithm was developed. For example, genetic programming is an advance on genetic algorithms, and advances have been made beyond genetic programming. All of these models are subject to the Garbage In-Garbage Out effect. When children are designing algorithmic models, you get child-like results. I mean children with respect to technical maturity, such as statistical reasoning. This is akin to me driving an F1 vehicle in a race. Just because I can drive, I am not qualified to race an F1 car. Developing statistical intuition takes time, and this intuition transfers directly to the computational intelligence framework. When I initially address a project, I start with my linear tools of statistics and then go to the nonlinear tools. Computational intelligence includes neural networks, fuzzy logic and evolutionary algorithms, and the combined application of these tools in the same model. Once an individual understands the tools and has developed the necessary intuition to know which tool to use, they can be used for any problem. This is what the folks at Ward Systems and BioComp have espoused for the past decade.

Myself, I have concentrated on the relationships between the S&P500 and Cronus, a deterministic, astrophysical relationship that I discovered in 2001. For example, Linear Discriminant Analysis can be used to find patterns, or a SOM can be used to discover them. Again, once someone truly understands the tools, they have a myriad of uses. As a side benefit, I have found that the S&P Cronus calibration is directly applicable to natural gas. For example, I published the intraweek future of the SPX for the week beginning 4/26/2010. When I say publish, I mean a USA court-admissible copyright. In retrospect, I found that the Cronus signals also directly applied to natural gas. I had an idea that this would be true, since previous research years ago had shown a relationship to heating oil. Since most folks consider the SPX to be the most difficult market to forecast, I started with it, so the others can only get easier. I have developed working models on multiple time-frames for the SPX. Working models as in real money used with them. For example, I returned a 300% plus total account profit in 4Q 2008; furthermore, I published the Cronus signals weeks beforehand and the calls in real time. Meaning, I turned a one dollar account into 4 dollars in 4Q 2008 with minimal VaR. I publish non-commercially, so I can make bold statements such as these. On 5/31/2010, I stated that the SPX would experience an acceleration downward starting on 6/3/10, and that the final weekly downtrend would end by the end of June 2010 and no later than the first week of July 2010. Also, I published the appropriate Cronus measures with these statements in the 5/31/2010 letter. On 1.1.2011, I published both daily and weekly Cronus measures, and a daily measure caused the SPX high to the day on 2.18.2011, and the weekly measure caused the high; in addition, I published my real-time interpretation with a weekly short for the week the SPX closed at 1322. With real money, I went short at 1341 on 2.17.2011. Without being nauseating about the ease of forecasting extremal events and change-in-trend time points for the S&P500, the SPX model is basically done, except for a continuing improvement process.

I am not familiar with the LinkedIn site. If you desire to contact me, you can send an email through here. If you cannot send email to me on here, write again, and we can make other arrangements. In your email, include some pertinent background information about yourself. Thank you and best wishes.


I’ve had good success using simulated annealing, genetic algorithms, particle swarm, tabu search, etc., when I apply them to the specific and narrow task of (adaptive) searching. In cases where I know in advance that a search is the best and only way to get the result I seek, I deploy GA et al. and am usually quite pleased with the output.

The trick seems to be identifying problems for which “search” is an optimum solution strategy. Selecting mechanical trading rules, and/or selecting parameter values for mechanical trading rules, is perhaps NOT a problem whose optimal solution is searching. On the other hand, maybe problems such as “Given this 10,000 by 10,000 covariance matrix, find a 75 by 75 submatrix having maximum value of property P1 and minimum value of property P2” are wonderful candidates for raw searching. Your opinions may differ, of course.

By the way I haven’t gotten eye-popping speedups when porting these codes to huge arrays of Graphics Processors (CUDA et al). The ratio of (memory accesses) / (CPU instructions) is far higher than Graphics Processors and their puny L2 caches are optimized to deliver, or so it appears on the problems I’ve tried. Too bad, it would have been great for marketing.


It looks like your Cronus system is working well for you.

I am, however, not very interested in (other people’s) trading system performance at the moment. I like discussing model risk, though!

I personally don’t trust “external, deterministic” market drivers; I’m even beyond sceptical 🙂 One reason is information content. If a purely external information source determines markets, and if I can influence the market (by trading your system with $1B), how can both be true? The only way I can see is if my actions were anticipated and deterministic too.
The other reason is this: one can easily start a search for a good trading system by examining lots of functions (or random sequences!) and at some point find one that does indeed fit historically. The future performance might or might not be good, depending on whether you over-fit the historical data or truly generalize. Even random trading rules would give some good signals some of the time. I’m not saying that *your* system is like this; I haven’t contacted you, so I can’t judge in any way, and I’m sure that you could back your system with good statistics if I asked for them. I’m just saying how *I* think about this in general.

If I had to validate a system like your Cronus, I would need to benchmark it extensively. I would want statistical evidence for the assumption that your system actually finds and exploits some non-random market structure, instead of over-fitting historical data or being lucky; in other words, significance tests. One approach would be to construct 10,000 new random models (in your case it would be fun to sum random sin/cos terms extracted from the JPL ephemeris models) and pick the best-performing 5% of systems from that random set. Is your system better than those (“better” by comparing lots of measures: Sharpe ratio, maximum draw-down, best runs, etc.)? If so, then there is less than a 5% chance that the system’s performance is purely random.
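A stripped-down version of this random-model benchmark could look like the following sketch. It makes several simplifying assumptions that are not in the post: random models are coin-flip long/short positions rather than sums of sin/cos terms, only a per-period Sharpe-like ratio is compared rather than several measures, and the market returns are synthetic. All function names are hypothetical.

```python
import math
import random

def sharpe(returns):
    # Per-period Sharpe-like ratio (mean / standard deviation), not annualised.
    n = len(returns)
    mean = sum(returns) / n
    var = sum((r - mean) ** 2 for r in returns) / (n - 1)
    return mean / math.sqrt(var) if var > 0 else 0.0

def strategy_returns(positions, market):
    # Position +1 is long, -1 is short, applied to each period's market return.
    return [p * r for p, r in zip(positions, market)]

def random_benchmark(candidate_positions, market, n_random=10_000, seed=0):
    rng = random.Random(seed)
    candidate = sharpe(strategy_returns(candidate_positions, market))
    random_scores = []
    for _ in range(n_random):
        positions = [rng.choice((-1, 1)) for _ in market]
        random_scores.append(sharpe(strategy_returns(positions, market)))
    random_scores.sort()
    threshold = random_scores[int(0.95 * n_random)]   # 95th percentile of random models
    # Empirical p-value: share of random models at least as good as the candidate.
    p_value = (1 + sum(s >= candidate for s in random_scores)) / (n_random + 1)
    return candidate, threshold, p_value

rng = random.Random(42)
market = [rng.gauss(0.0, 0.01) for _ in range(250)]   # synthetic daily returns

# Deliberately unrealistic candidate with perfect foresight, for illustration only.
foresight = [1 if r > 0 else -1 for r in market]
score, threshold, p_value = random_benchmark(foresight, market, n_random=2000)
print(score > threshold, p_value)
```

The demo uses 2,000 random models to keep the run short; the function defaults to the 10,000 suggested above. A real validation would also need to account for how many candidate systems were tried before the one being tested was chosen.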

I currently have a strong focus on other quant work and can’t spend time on evaluating a system, but I’m sure that you’ll get some questions from other people about your system based on the info you’ve provided. Good luck.


The fundamental problem is the time span. If the test is long, you will have chi-square problems and therefore meaningless forecasts. Vice versa, if the test is too short, you will have meaningless forecasts anyway... I spent years establishing the correct period, and I concluded the following:
Test the system for three months, e.g. Jan, Feb and March. Now apply the model to April, May and June (and so on). Repeat the process over the last two or three years. If the model is good, you will have something good. Otherwise change the model, but do not change the time spans. There are a lot of reasons to do this; we can discuss and elaborate on them. Per aspera ad astra, but health before wealth.
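The rolling scheme described here resembles what is often called walk-forward testing. The sketch below only generates the window pairs. It assumes the first three months are used to fit or test the model and the next three to apply it, and that the windows roll forward three months at a time; the function name and the month labels are hypothetical.

```python
def walk_forward_windows(months, fit_len=3, apply_len=3):
    """Yield (fit_months, apply_months) pairs, rolling forward by apply_len."""
    start = 0
    while start + fit_len + apply_len <= len(months):
        fit = months[start:start + fit_len]
        apply_ = months[start + fit_len:start + fit_len + apply_len]
        yield fit, apply_
        start += apply_len

# Two years of monthly labels as an example span.
months = [f"{year}-{m:02d}" for year in (2009, 2010) for m in range(1, 13)]
windows = list(walk_forward_windows(months))
for fit, apply_ in windows:
    print(fit, "->", apply_)
```

Keeping `fit_len` and `apply_len` fixed across model changes follows the advice above: change the model, not the time spans.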


In my experience as a quant, I experimented with neural networks for volatility modeling and with genetic algorithms for optimization of portfolio risk.
We had results comparable to state-of-the-art techniques, e.g. GARCH modeling for volatility and the mean-variance algorithm for portfolio risk.

Now my Aspirant PhD scholars are focussing on these techniques. Shortly we will be able to write more about these things.


I agree with Patrick: from trash to marvelous.
The tasks can be so different.
You might want to directly extract “models” from data (supervised or unsupervised), or extract “models” that help you narrow the gap between analytic-model results and reality, or you might want to identify parameters of more or less complex models. In parameter identification (say, for model calibration), genetic-type algorithms in a hybrid optimization approach are powerful.

In a naive application (on data) you might miss the fact that “machine learning” approaches with, say, ANNs often need clever domain partitioning, and then the generalization step (transfer learning?) is difficult or impossible.


I’ve been using NNs with optimization through genetic algorithms with good results for some time. I guess it is important to follow the fundamentals and search for plausible results. It is a different approach; with NNs and optimization we have to be very cautious about data mining and overfitting. If we have a good understanding of the fundamentals of the market, it is a great start…


Great research topic & initiative! Just want to add that you don’t necessarily need to *forecast* to trade profitably. Forecasting is one approach; having profitable trading rules is the more standard approach on high-frequency, small timescales that focus on micro market dynamics (order books, fee structures, etc.). I like your fundamental, broad, scientific approach, though!



