Quant analytics: Polynomial fitting algorithm?
Would someone point me to the algorithm for fitting a polynomial
to stock data? If I have 20 bars of PRICE, how do I find out what
polynomial of degree N fits the given 20 bars of data? I expect the
algorithm will return the coefficients of the N degree polynomial.
What I understand about the polynomial fitting is that I need to
have x and y coordinates. The y is PRICE, but the x is the abstract
time unit for the bar (i.e. minutes, days, weeks). Any value for x
would be extremely abstract and seems to be useless.
s far as I know, the usual technique to interpolate involves sampling of your data (price) at a certain frequency (generally constant).
So basically you decide your time-length a priori and sample data, then try with a certain basis of functions to fit your data. However, there are some major issues in doing that.
* If you use raw polynomials and use, e.g. a Lagrange multipliers’ approach, you will end up matching perfectly your data. That is a problem: since the order of your polynomial will be #of samples-1. If you decide that you need 100 observations, then you will need a polynomial of the 99th order. Now, to find the coefficient for such a thing, it is trivial even for an old laptop; but it is a non-sense. Imagine that you just use an extremely flexible rod to connect some points in time: you extract no information whatsoever.
* your data are noisy and dependent on your sampling frequency. Again: perfect matching interpolates noise and you accommodate extra information using extra degrees of freedom. Having too many degrees of freedom implies to introduce dynamics which are not simply there (very poor extrapolation and very poor approximation in between samples. Please see below).
* you are likely (with probability one, if you choose “many” samples:i.e.: >10, even less) to end up with very high oscillations. Since you are perfectly matching your samples, your polynomial is not constrained in between your sample data, at all. Result: it is likely you would get even negative or extremely high prices, in between your sampling points.
My humble suggestion: try to experiment with several basis functions (not only polynomials) and with several choice of free parameters (trying to keep them as few as possible).
Also, it is highly advisable to pre-process your raw data (for example normalizing them) and use some kind of regularization (to get rid of the noise or part of it).
An excellent reference about time series prediction is Dr Bishop’s “Neural Networks for Pattern Recognition”. You may want to look into chapter 8 and 9.
“Neural Networks” is just a fancy name; basically you can use polynomials (in fact it would be a specific kind of feed-forward-NN), but the techniques and discussions in those chapters are a real gem. Although the book is not intended to treat time-series analysis in depth, it gives many general ideas, useful in extrapolating information from time-series.
Just my two cents. Good luck with your numerical experiments.
IMHO, predicting short-term prices using polynomial fitting is not a good idea. Depending on the order of your polynomial you might predict that prices would go up or down. Most important to me however is that a system is transparant and thus the people using it know how to interpret the data. The system should not try to interpret the data himself using a very simplified model.
How about using Livenberg- Marquardt method http://en.m.wikipedia.org/wiki/Levenberg%E2%80%93Marquardt_algorithm?
‘d say interpolation in general doesn’t work for this kind of predictions, your prediction would just be following the direction established by the last two sample points.
I’d say that moving averages have probably similar if not better chances to predict prices than interpolation. At least the moving averages are less influenced by the random walk.
‘ve ordered a copy and look forward to doing some math. And if you have any “favorite” basis functions for this sort of thing, I’d be curious to know them.
If there’s a good reason that the price is a very smooth function of time (I’m stressing very), then and only then, polynomials could be useful for extrapolation
You can use Lagrange polynomial or resolve the following system of linear equations instead:
y[i] = sum (a[j]*x[i]^j) , j =0,n and i=0,n
where a[j] are the variables.
Both methods will “identify” the same polynomial (as it is expected to be unique). I guess with the latter form and CUDA (with nVidia GPU) you can achieve a good speed (by operating with matrixes and parallel computing methods). Or use MatLab as it was suggested above.
>>I’m curious if I can fit stock market price data with polynomials for generating very short term price predictions (one day).
I don’t think this is a good idea, at least this method doesn’t consider “human factors” (war, takeover, crisis, greediness, etc.). Even polynomial fitting (least squares http://math.fullerton.edu/mathews/n2003/leastsquarespoly/LeastSqPolyProof.pdf) is used mainly for synthetic calculations like calculating the Price Limits Reference Prices which is more of the regulation part.
Statistic arbitrage could be a better solution. For instance, http://stockchartist.blogspot.com/2011/03/gold-value-of-commodities.html, gold value of the oil seems to be nearly constant or if you know that GBP/EUR goes up, EUR/USD goes up then almost certainly USD/GBP goes down (GBP/EUR * EUR/USD * USD/GBP ~= 1).
can statiscal arbitrage be used for forex, as you described above?
in the way I described – probably no, though it depends on interpretation. One interpretation of the statistical arbitrage is “A leads to B (with a probability)” or “A and B are correlated” (e.g. if the price for oil goes up, this affects transportation which affects prices for commodities).
GBP/EUR * EUR/USD * USD/GBP = 1 is actually a rule. If it is not preserved, this leads to arbitrage situation anyway. From this rule, things like:
– GBP/EUR – up & EUR/USD – up => USD/GBP – down
– GBP/EUR – down & EUR/USD – down => USD/GBP – up
In this case, statistical arbitrage will/can be used to predict things like “GBP/EUR – up & EUR/USD – up” by “monitoring” the spot rates of other currency pairs involving GBP, EUR and USD or finding any other utile property related to spreads (all the currencies are indicated purely as an example, any other pairs with good liquidity may also work). Generally, it is hard to find/Google anything useful on internet at this chapter. Either this is a “money making ” secret (e.g. quite useful for spread betting as I imagine) or … pointless exercise. I planned to investigate this with more details, but I have other priorities at the moment and lack of spare time.
While I am not totally sure what your data look like or how you are restricting yourself in terms of degrees of polynomials I will offer that fitting cubic splines can be easily done even in Excel using Solver. This is done by the U.S. Treasury and other fixed income operations. Of course, ‘cubic’ is restrictive and requires that you chose ‘knot points’ and restrict slopes and changes of slopes to be equal between adjacent segments.
However, I agree that there is no particular reason to believe that fitting a polynomial curve to data translates into a good forecast. If you were willing to assume that price is only a function of time I would start with neural nets rather than polynomial functions, particularly since you are dealing with stock prices and not yields.
The thing is that if you fit an order 20 polynomial to any data set you will get something that intersects all points and at the end zooms off to positive or negative infinity, hitting floating point overflow on the way.
I would certainly hope that no one would try ot fit an order 20 polynomial!
I think the bigger question is whether polynomial curves will help you predict better than other techniques. I would think that this could be addressed with some Monte Carlo techniques using data with similar characteristics.
Also, you could use your splines approximation for short-term data prediction (extrapolation), you may add an additional condition at the last knot of your data – namely set second derivative of the “original” function there to null.
I would not try to fit the 20 pts exactly, you will fit any noise as well. Even if you do get a good looking picture, to derive meaning you must understand the assumptions of your model.
The domain of fitting of noisy data is quite rich, be prepared to invest some serious time & effort. I myself am out of form in this, but in the past I used various statistical fitting methods as well as Fourier transform (FFT). Wikipedia is a good start
• If you fit a polynomial of order k to the data, you are assuming that the underlying market moves in this fashion. why should the market move in this fashion. it won’t. at least not over longer periods of time. you may want to just use linear, quadratic, or perhaps cubic forecasts for the next time step to get some direction which way things would go but not much more than that. also, why would you want to use 20 price points for fitting, why not 10 or 30 or a variable lookback period.
• I expect that your points will be modelled by a trend plus a noise. By its nature it is not possible to forecast noise, expecially with a polynomial because of the oscillating nature of this kind of interpolation. Instead I think you can try to guess the future trend – average value – by using a smoothing function. Fft others have mentioned is one powerful way to do it – thats filtering out high frequency noise exactly. The point is that predictions will not always be correct because of the noise, but if you are performing correct predictions *on average*, by betting on many of them you will be winning on average.
think you can try to use Genetic Algorithms (GA), see polynomial chromosome. The advantage of GA is that it will find a sollution to your problem, but then you will have to interpret the result, in order to know WHY. It works great when you have complex data, where the conventional methods do not work.
Hope it helps. By the way, if you google for Polynomial GA, you will find many resources, however, the implementation of a Polynomial GA is trivial.
From a Linked In group discussionNOTE I now post my TRADING ALERTS into my personal FACEBOOK ACCOUNT and TWITTER. Don't worry as I don't post stupid cat videos or what I eat!