Posts Tagged ‘High frequency trading’

New R blog created on how to build a model, algo or strategy for HFT (High Frequency Trading)

A new R blog has been created on how to build a model, algo or strategy for HFT (high frequency trading). See it here: http://quantlabs.net/r-blog/

Get our FREE Open Source Historical Database by answering the 2 WORLD'S FASTEST TRADER/QUANT QUESTIONS


Multi Agents Systems (MAS) for quant development and high frequency trading HFT

Hi, I am looking for an interesting theme for my MSc, and I would like to combine MAS and algorithmic trading. Can anyone suggest some ideas, or even papers on the same topic? Thanks.

== Have a look at Altreva. It is an application of agent-based models to algorithmic trading. http://www.altreva.com

== There is a program called NetLogo that allows developers to create multi-agent models. A good paper that simulated the May 2010 market crash is here: http://papers.ssrn.com/sol3/papers.cfm?abstract_id=1932152 Many ideas were discussed at the High Frequency Data Analysis conference; check out the abstracts: http://kolmogorov.math.stevens.edu/conference2011/index.php/abstracts-of-the-talks

== Thank you for posting. I didn't know Altreva and am getting to know it now. I already knew NetLogo, and I know it is used a lot in the social sciences because it is easier to program than Java-based toolkits (Repast, for instance). But I will dig in a bit more. If you have additional information you would like to share, I would welcome it. Thanks.


Thanks to R-Bloggers.com: finding ways to use R for HFT (high frequency trading), algos, quant models and strategy development in finance

First, I have to say thanks to R-Bloggers.com for adding this site. I am really new to R, coming from a year of Matlab. I really like Matlab and love all the high-quality toolboxes from MathWorks; they made my life much easier. So far with R I have been hitting walls of frustration with some badly documented packages, or rather with a lack of documentation on how to install packages from outside CRAN. Anyhow, I am trying to figure out how to accomplish in R the equivalent of what I did in Matlab. We shall see, but I hope the R community can help with that.

So this blog post is dedicated to R-Bloggers.com. The site has helped me understand people's results and reviews of just about anything around R. I like that, and nothing like it really exists in the world of Matlab, so I am glad R-Bloggers.com exists.

P.S. I am sure methods for HFT in R exist; I just have to learn them or find them.


GPGPU/OpenCL/CUDA and Hadoop for HFT High Frequency Trading

After reading http://hgpu.org/?p=7413 (and having been interested in Hadoop for quite some time), I got curious whether more efforts have been made or are under development. It is clear that workers can be sped up a lot with OpenCL and similar techniques, increasing the speed of a cluster. Do note that I am an OpenCL specialist, so somewhat biased. Do you know of any project where Big Data has been combined with OpenCL, CUDA, Aparapi, etc.?

== I attended the event that @Andrew put together in NY. @Andrew presented an overview of the state of play (citing some of the papers and observations made above) and also talked a little about Aparapi (which obviously made me happy :) ). There was some discussion about the use of Thrust (for those prepared to make the Java->JNI->Thrust->JNI->Java round trip) and, of course, the option of using JOCL/JCUDA for those who want to write their host code in Java but still use CUDA/OpenCL. There was also a session from Jack Papas (one of the TidePowerd inventors) discussing his work. The third session was from Tim Childs, discussing Map-Reduce + GPU from a database perspective. I also note that Andrew is presenting a session at AFDS (www.amd.com/afds) entitled 'CC-4344 - Hadoop and GPU Compute'. AFDS should be fun this year; we have two Aparapi sessions and one hands-on lab.


Does anyone work with High Frequency Trading?

I work with high frequency trading in Brazil. The market here is small and there are few managers who work with HFT. I'd like to meet people who work with HFT in major financial centers and global markets.

== I am working with HFT in the Brazilian market. We can share knowledge.

== We have done a couple of interviews with people who do quant trading in Brazil. The people I recommend you start with are Christian Zimmer and Hellinton Hatsuo Takada. They trade for Itau AM, and I believe they teach quant strategies as well. You can read the articles they wrote for us here (http://fixglobal.com/content/high-frequency-trading-brazil-mirage-or-miracle) and here (http://fixglobal.com/content/refined-product-more-raw-fix-brazil).


How to get started in HFT high frequency trading?

Hello, I am a fresh graduate from the engineering field and have discovered that I have a huge interest in algorithmic trading. I wanted to ask people in the industry for advice on what my next step should be. My interest lies mainly in software development. I have experience using C and Matlab. Also, if anyone happens to know of any entry-level or internship positions, please let me know. That would be awesome :)

== Look into the programs these guys have: http://www.xyber9trends.com/x9t/4371.html They are some of the best out there.

== Just start coding and designing systems. After 10,000 hours you will be ready.

== Some interesting books to get you started:
1. The Trade Lifecycle (http://www.amazon.co.uk/The-Trade-Lifecycle-Trading-Process/dp/0470685913/ref=sr_1_1?ie=UTF8&qid=1336688190&sr=8-1)
2. Inside the Black Box (http://www.amazon.co.uk/Inside-Black-Box-Quantitative-Trading/dp/0470432063/ref=sr_1_4?s=books&ie=UTF8&qid=1336688295&sr=1-4)
3. All About High-Frequency Trading (http://www.amazon.co.uk/All-About-High-Frequency-Trading-Series/dp/0071743448/ref=sr_1_1?s=books&ie=UTF8&qid=1336688449&sr=1-1)
Also, as Jay says, start coding. Write a couple of HTML scrapers to get stock prices from Google, BBC, Bloomberg or Reuters and test your algorithms on those. Good luck.

== I'd suggest you check out the NinjaTrader software. Its scripting is C#-based, it is free (the non-live version), and it is quick to learn.

== I highly recommend the following book: http://www.amazon.com/Way-Turtle-Methods-Ordinary-Legendary/dp/007148664X See if you can implement the strategy to trade commodity futures and make it profitable. You will thoroughly enjoy the experience and learn a lot along the way. If you are comfortable with C, you might consider MQL4 and perhaps MQL5, provided by the MetaTrader 4 and 5 platforms respectively.
== Having said that, I would treat MetaTrader more as an educational / prototyping tool (great charting capabilities and a fairly expressive language). You would have to be VERY thorough in selecting the MetaTrader broker(s), though, if you ever get to that stage.

== I wholeheartedly agree with this comment: you can use MQL4/MQL5 as a learning tool, but do not use it in production. Furthermore, I recommend trading only instruments that are protected by MiFID or the American equivalent (I cannot remember the name of the act), so that you are guaranteed the NBBO (national best bid/offer). Basically, this means do not trade FX and CFDs at the beginning of your "career".
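As a starting point for the suggestion above to implement the strategy from Way of the Turtle, here is a minimal sketch of a long-only channel breakout rule on closing prices. It is a simplification under stated assumptions, not the book's full rule set: the 20-bar entry and 10-bar exit lookbacks are defaults chosen for illustration, the function name `breakout_positions` is hypothetical, and position sizing and stops are left out.

```python
def breakout_positions(closes, entry_lookback=20, exit_lookback=10):
    """Return a list of positions (0 = flat, 1 = long), one per bar.

    Go long when the close exceeds the highest close of the previous
    `entry_lookback` bars; go flat when the close drops below the lowest
    close of the previous `exit_lookback` bars.
    """
    position = 0
    positions = []
    for i, price in enumerate(closes):
        if position == 0 and i >= entry_lookback:
            # Entry: breakout above the prior channel high.
            if price > max(closes[i - entry_lookback:i]):
                position = 1
        elif position == 1 and i >= exit_lookback:
            # Exit: breakdown below the prior (shorter) channel low.
            if price < min(closes[i - exit_lookback:i]):
                position = 0
        positions.append(position)
    return positions


# Hypothetical closing prices, with short lookbacks so the example is small.
closes = [1, 2, 3, 2, 1, 5, 6, 4, 3]
print(breakout_positions(closes, entry_lookback=3, exit_lookback=2))
```

Testing a rule like this on historical prices before touching a live account is in the spirit of the "start coding" advice in the thread.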


Open Source Trading Platforms List for HFT High frequency trading


algotradingindia.blogspot.in

Open Source Trading Platforms List

tradelink (http://code.google.com/p/tradelink/): Write automated trading systems, connect with 17+ broker APIs

AIOTrade (http://sourceforge.net/projects/humaitrader): AIOTrade (formerly Humai Trader...


Learn how to build profitable High Frequency Trading using market inefficiencies, order types and HFT opportunities

Hi there,

Readers have been asking me about the HFT course I've been creating these last few weeks. I've been posting videos and tutorials regularly, and the 72 algos are already available to Premium members. But exactly what does the course teach? Here are the first 4 major lessons you'll learn about High Frequency Trading:

1. Evaluating Performance and HFT Strategies
Which one's the best? And when's the best time to use it? Learn how to evaluate the performance of comparable HFT strategies. Understand the reasoning behind which ones to keep, what to change and what to throw out.

2. Order Types For HFT
Get that order in now! Do you know which types of orders work best for HFT? I'll show you which ones are optimal under which market conditions.

3. Market Inefficiency and Profit Opportunities In Different Frequencies
Understand how to look at the market from the top down with an HFT lens. Understand how market inefficiency can be your best friend in HFT. And learn how to zero in on the best chances of making money before they're gone.

4. Searching For HFT Opportunities
Spotting those profit opportunities is both harder and easier than you think. I'll show you how to identify potential profits across different frequencies to ensure you don't miss a thing!

And that's just the first third of the course.

Join now to become a QuantLabs.net Premium Member: http://quantlabs.net/dlg/sell.php?prodData=m%2C3
Click here to find out more about everything available to Premium members: http://quantlabs.net/quant-member-benefits/slash-your-quant-learning-curve/

Good trading,
Bryan
Quantlabs.net


GPU, OpenCL, CUDA and Hadoop in quant analytics and high frequency trading HFT

After reading http://hgpu.org/?p=7413 (and having been interested in Hadoop for quite some time), I got curious whether more efforts have been made or are under development. It is clear that workers can be sped up a lot with OpenCL and similar techniques, increasing the speed of a cluster. Do note that I am an OpenCL specialist, so somewhat biased. Do you know of any project where Big Data has been combined with OpenCL, CUDA, Aparapi, etc.?

== An overview I found is from 2010: http://jimmyraywv.blogspot.com/2010/12/java-parallelization-options.html

== To answer your first question: have there been any projects with Hadoop and GPUs? Yes. Can I talk about the ones I know? No. Sorry. Is it possible? Sure, there are a couple of ways of using GPUs in Hadoop; however, there are some issues that you have to consider. Recommendations? Yes. JNI is your friend. Does using a GPU make sense? It depends. You have to consider what you are doing and how much of a performance boost you gain. You have to weigh this against the cost of the GPU, assuming it fits in your chassis, versus the cost of just expanding your cluster. Here, as with other advanced concepts, YMMV based on the quality of your code and the approach of your solution.

== Thanks for the cliffhanger. :) One of the things I have questions about is how best to do the double map: first from the total to a node, and then from the node to the NDRange. I would prefer to do it in just one step and send the data packed per X items to each node, which in turn only needs to unpack. Also, JavaCL, JOCL and Aparapi are my friends; JNI is their friend in turn. If a solution to the problem exists in both Hadoop and GPGPU, combining them makes a lot of sense.

== Cliffhanger? Sorry, no. You asked a set of questions where 1) it depends on what you are doing, 2) the question is whether it is cheaper to bulk up or grow out your cluster, and 3) giving you a full answer could mean violating an NDA or two...
Sorry to be cryptic, I'll try harder. CUDA: Java API or C API? (I would suggest C, hence JNI.) Note that you mention a couple of 'friends'. Another free clue: KISS. You want to keep your code small and tight. Problem: one CUDA GPU, multiple map/reduce slots. How are you going to solve that? (Again, there are multiple solutions; YMMV.) Still cryptic? Sorry. Maybe take it offline? I am not sure what you mean by double map.

== Hadoop is not particularly impressive when it comes to performance. One can look at the TPC-H Hive benchmarks for a 100 GB file: https://issues.apache.org/jira/secure/attachment/12416257/TPC-H_on_Hive_2009-08-11.pdf Basically, on an 11-node cluster of IBM eServers, each with 8 GB of memory and 4 hard drives, the first TPC-H query (Q1) takes 500 seconds when using Hadoop. When using a GPU on a single box with 8 GB of memory, query Q1 runs in just 18 seconds (using some home-written software to run SQL on the GPU). So yes, there is definitely room for improvement in Hadoop.

== I think you need to learn more about Hadoop...

== So I take it that your NDA is so restrictive that you cannot even tell us what you are disagreeing with?

== We are all curious about what you do, but an NDA is an NDA. There are loads of hints, but it is always the description of the problem field itself that explains why something works, not the techniques used. I just wanted to know what types of problems have shown good results, or to hear in what cases MPI, for example, is better. By double map I meant that mapping needs to be done twice (sorry, a translation problem from Dutch): first mapping from data to nodes, then mapping from node to GPU cores. The reason for this question is that I don't want to tackle problems that can be solved in under a minute on one GPU-powered machine, but ones that need to be distributed over several GPU-powered machines. Hadoop is interesting for doing that distribution.
You were comparing the distribution system (local vs. distributed via Hadoop) with the hardware (CPU vs. GPU), hence that remark.

== What would you like me to tell you? Sorry, but as a consultant, I end up signing NDAs all the time. So when you ask a question, the answers tend to be incomplete. Why? Because there are some things I can talk about, some things I can probably talk about but that are in a grey area, and then there are things I really can't talk about. The hard part is trying to figure out what my client thinks I can and can't talk about, so the easiest thing to do is not to talk about it, period. But you asked a question about people using GPUs, which is an interesting problem. Here's what I can probably say to help point you in the right direction:

1) Solve the problem using Java and no GPUs. Now, if you want to speed things up, you can just add more nodes.

2) You can also see if you can use a GPU. Note: not all problems work well with GPUs. (More on that in a second.) If you use a GPU, you have two choices: the Java API or the C API. We chose C because it is a more robust API for our problem. If you choose C, then you will want to wrap your CUDA code in JNI. Note that you want to avoid using frameworks, because you end up increasing the size of your executable jar, and you also need to use the distributed cache to move the CUDA object code around the cluster. So now, in each Mapper.map() method, you set up and send your code to the GPU, get your result and clean up the GPU connection. Not terribly efficient, but depending on the problem, faster than not using a GPU.

3) You look for ways to speed that up. Now you may also have an additional problem: you can have N map slots on a node in the cluster, and each slot will want to talk to the GPU at roughly the same time. That could become an issue.

As to your double mapper problem, it is more an issue of allowing concurrent access to the GPU from multiple copies of the code running at the same time.
If you go back to my earlier post, I said that the effectiveness of the GPU will depend on what you want to do and how you can use it in the problem you want to solve. You also have to consider that the GPU adds significantly to the cost of your node. So you have to ask yourself whether you would be better off trying to get the GPU to work or just adding nodes to the cluster. There are other issues. At one client, we had 1U boxes that the GPU cards would not fit in, so we had to use a vendor-specific approach. Unfortunately, the vendor's solution wasn't ready for prime time and we had issues. The big thing you have to realize is that there is a cost in moving data to and from the GPU. (OK, you probably already know this.) So while the GPU is blazingly fast, the speed improvement you actually get may be much less than you expect. Does that help? Bottom line: if you have a small cluster with GPUs and you're doing something fairly simple, like applying a complex equation to each tuple in a large data set, you may find some value. If you have to move a lot of data in and out of the GPU? Not so much.

Seems you've got a chip on your shoulder when it comes to Hadoop. Hive is a subset of Hadoop. 100 GB would be considered small by some who post here. The whole argument that you could write custom code to do the same thing that the TPC-H query does is a bit of a fallacy, since it would fall outside the scope of what is permitted by the TPC org. Maybe you don't remember when Oracle gamed the system with their TPC-C benchmarks or

== Some misunderstanding has crept into this conversation. Sorry! My two questions are now:
* Are there examples of Hadoop with GPGPU? I mean specifically the Hadoop part, as I already know GPGPU.
* If a distribution technique other than Hadoop is used, do you know why?
For the GPGPU part, I know something about that already. http://www.streamcomputing.eu/blog/ is my company's blog.
== 100 GB is a lot of data for a GPU. It greatly exceeds the memory of any existing GPU, so it is a good indicator of a GPU's ability to handle large amounts of data. About custom code: I believe that people care about correct and fast results, not about whether the code is permitted by some organization.
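The point made in this thread about the cost of moving data to and from the GPU can be put as rough arithmetic. The sketch below is only a back-of-the-envelope model under stated assumptions: the function name `effective_speedup` and all the numbers are hypothetical, and it ignores setup cost, overlap of transfer with compute, and contention between map slots sharing one GPU.

```python
def effective_speedup(cpu_seconds, gpu_kernel_seconds, bytes_moved, bus_bytes_per_second):
    """Speedup of a GPU offload once host-to-device transfer time is included."""
    transfer_seconds = bytes_moved / bus_bytes_per_second
    return cpu_seconds / (gpu_kernel_seconds + transfer_seconds)


# Hypothetical task: 10 s on the CPU, 0.5 s of GPU kernel time, no data movement.
raw = effective_speedup(10.0, 0.5, 0, 4e9)

# Same task, but 8 GB must cross a bus assumed to sustain 4 GB/s (2 s of transfer).
with_transfer = effective_speedup(10.0, 0.5, 8e9, 4e9)

print(raw, with_transfer)  # 20.0 4.0
```

With these made-up numbers, a kernel that is 20 times faster than the CPU yields only a 4 times overall speedup, which is the kind of figure to weigh against the cost of simply adding nodes to the cluster.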


How to compare two systems in high frequency trading HFT and quant analytics?

Hi all, suppose you have two different systems that generate two different sets of return numbers. How do you usually decide which one is better? I am assuming that it is not just about taking the absolute value at the end of the testing period. Is there something like a common "quality test" where you take the variance etc. into account? Cheers,

== Well, the classical measure is the Sharpe ratio: mean % return minus the risk-free rate, divided by the standard deviation of returns. Be careful to use the same time frame for returns and variance. Basically, what matters is how much return you made, over what time frame, and with how much risk. There are other ways, of course, but the Sharpe ratio has good theory behind it and is optimal under some conditions. For example, if somebody claims that they made an 80% return on something, ask them for the Sharpe ratio. Such returns are made by taking excessive leverage. The Sharpe ratio brushes all that away and gives you an honest, deleveraged picture.

== The Sortino ratio (based on the semideviation of returns) is a better metric than Sharpe. However, both can be completely misleading. For example, Berkshire Hathaway's Sharpe ratio since 1990 is below 0.6, much less than LTCM used to have. Where is LTCM? :-) Comparing two systems is not a straightforward task.

== Well, if you calculated LTCM's Sharpe ratio up to the time they vanished, it would probably be quite low as well. Further, I think it is not exactly correct to compare a hedge fund to Berkshire Hathaway. Apart from ratios, there are many other things you can take into consideration. I feel that the comparison makes sense largely if the two systems are correlated. If there is no correlation, a Sharpe/Sortino comparison would probably be fine. In the case of a correlation, depending on your requirements, you go for the one that suits your risk appetite.
I am sure backtesting will have been done, so look for days/trades where you lost a significant amount of money. The general Value at Risk metrics can be used. See whether the trading system is sustainable if such a loss occurs. Choose the one with the risk that you can afford to tolerate, while making sure that the risk-return trade-off is right. Further, for two systems giving similar results, ALWAYS choose the simpler model. Complicated models with heavy calculations are more likely to be prone to overfitting, i.e. on historical data they give amazing results, but on unseen data they are terrible. To address this, we generally use cross validation: for example, we get our optimized parameters/ratios for the trading from 2/3 of the dates, and the returns are then computed for the remaining 1/3. The system with better returns on this 1/3 of the dates is the better system.

== They had a decent Sharpe ratio for some time. There is nothing wrong with comparing the two, as both employ(ed) systems. Yes, there are a lot of things to consider. Correlation numbers between the two systems (or the lack thereof) can be misleading. In real life, correlation tends to let you down when you need it most (the beauty of tail risk). It is important to make a thorough assessment of the robustness of the systems being compared. Robustness is often more important than the observed quality of returns expressed by Sharpe/Sortino. Yes, it is worth applying Occam's razor if the models are similar. Cross validation is a common technique, but quite often the sample space is not big enough. There are other alternatives available.

== If you have 2 uncorrelated systems both making money, then trading both is better than trading either one by itself. Replace 2 by 10, and you have found the holy grail.

== I am pretty much a beginner, so I don't know much. Could you elaborate more on the alternatives to cross validation?
Also, I did not understand why correlation can be misleading. Or is it a way of saying that historical data need not be a good predictor of the future? I would appreciate your comments.

== Bootstrapping can be a good alternative, given that your sample size is likely to be limited. The correlation can be misleading when the sample of returns is not representative. Consider a robust strategy and an overfitted one; both will exhibit similar behaviour (correlate) in stationary market conditions (which the latter is optimised for), but diverge when a regime shift occurs.

== Holy grail or not, it is definitely the way to go. Two things that can never be overrated are stop losses and diversification.

== The problem when comparing two systems is that it is easy to end up comparing apples and oranges. For example, any trading strategy can be represented using a payoff matrix, Σ(H.*ΔP), where a strategy H is applied to a price differential matrix over the whole portfolio history. If you compare 2 strategies H1 and H2 over the same portfolio (same stock selection), then you are really comparing two strategies. However, if you make changes to the stock selection process, you are no longer testing only 2 strategies; you are also testing the stock selection process. Picking 100 stocks from the S&P 500 represents only one selection, with its own signature, out of many googols (10 to the power of 100) of possibilities. So, really, the question should be: what is being compared? The trading method or the stock selection process? Even comparing two systems over past market data can only give an indication, an analysis of what was, not of what is to be!
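The Sharpe and Sortino ratios and the 2/3 versus 1/3 date split discussed in this thread can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are hypothetical, the return series are made up, returns and the risk-free rate are assumed to be per period on the same time frame, and the Sortino denominator here is the downside deviation taken over all observations.

```python
import math
from statistics import mean, stdev


def sharpe_ratio(returns, risk_free=0.0):
    """Mean excess return divided by the sample standard deviation."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / stdev(excess)


def sortino_ratio(returns, target=0.0):
    """Mean excess return divided by the downside deviation below target.

    Raises ZeroDivisionError if no return falls below the target.
    """
    excess = [r - target for r in returns]
    downside = math.sqrt(sum(min(e, 0.0) ** 2 for e in excess) / len(excess))
    return mean(excess) / downside


def split_in_out(returns):
    """Split into the first 2/3 (for fitting) and the last 1/3 (held out)."""
    cut = (2 * len(returns)) // 3
    return returns[:cut], returns[cut:]


# Hypothetical per-period returns for two systems over the same periods.
# Both have the same mean return, but system_b swings much more widely.
system_a = [0.01, -0.02, 0.03, 0.01, -0.01, 0.02]
system_b = [0.05, -0.06, 0.07, -0.04, 0.06, -0.04]

for name, rets in (("A", system_a), ("B", system_b)):
    fit, held_out = split_in_out(rets)
    print(name, round(sharpe_ratio(rets), 3), round(sortino_ratio(rets), 3), held_out)
```

Because the two made-up series have the same mean return, the ratios differ only through the risk term, so the steadier system A scores higher on both. As the thread points out, neither number says anything about robustness or tail risk.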

