Quant analytics: Is GPU good for large vector addition?

No. CPU is about 4 times faster for very large vector addition. Here is my experiment. http://lnkd.in/bTzxge


Is a GPU good for large vector addition?
Yes! A GPU is much faster than a CPU, since we are talking about large vectors here.

Your assumption is that the input and output data must reside in CPU memory rather than in GPU memory. IMO it is better to separate out the time required to copy the data. So we are talking about the same thing, but drawing different conclusions.
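To make the two viewpoints concrete, here is a rough back-of-the-envelope sketch in Python. It assumes vector addition is purely bandwidth-bound, so time is roughly bytes moved divided by bandwidth. The function name and all three bandwidth figures are placeholders I made up for illustration, not measurements from either experiment; plug in the numbers for your own hardware.

```python
def vector_add_times(n, elem_bytes=4, cpu_bw=20e9, gpu_bw=150e9, link_bw=6e9):
    """Estimate times in seconds for c = a + b on vectors of n elements.

    All bandwidths are in bytes per second and are ASSUMED placeholder
    values, not measurements:
      cpu_bw  - CPU to host RAM
      gpu_bw  - GPU to its own device memory
      link_bw - host to device copies (e.g. over PCIe)
    Returns (cpu, gpu_kernel_only, gpu_with_copies).
    """
    traffic = 3 * n * elem_bytes      # read a, read b, write c
    cpu = traffic / cpu_bw            # everything stays in host memory
    gpu_kernel = traffic / gpu_bw     # data already resident on the device
    copies = traffic / link_bw        # a and b to the device, c back to the host
    return cpu, gpu_kernel, gpu_kernel + copies


if __name__ == "__main__":
    cpu, kernel, total = vector_add_times(100_000_000)
    print(f"CPU:               {cpu:.4f} s")
    print(f"GPU (kernel only): {kernel:.4f} s")
    print(f"GPU (with copies): {total:.4f} s")
```

With placeholder numbers like these, the GPU wins when only the kernel is timed and loses once the copies are included, which is exactly why the same kind of measurement can support both answers.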

For more CPU and GPU timings, have a look at www.tivipe.com


And just how is a 4 times improvement not good?


Your results are probably correct, but meaningless: no sane use case would consist only of using a GPU to add vectors. SAXPY would certainly be just one operation in a sequence, and the relevant comparison would be, as Tino mentioned above, with host/device memory transfer timings taken out of the picture. Besides, there was really not much need to write a paper on the whole issue: almost all GPU programming guides already mention (often pointing to exactly SAXPY as an example of an operation that is unsuitable in that regard) that utilizing a GPU at full speed requires a high ratio of arithmetic operations to memory accesses.
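For anyone who wants to see how low that ratio is, here is a small Python sketch. It assumes single-precision (4-byte) elements, counts SAXPY's multiply and add as two separate operations, and ignores caching; the helper name is made up for illustration.

```python
def arithmetic_intensity(flops_per_elem, loads_per_elem, stores_per_elem, elem_bytes=4):
    """Arithmetic operations per byte of memory traffic, per vector element."""
    bytes_per_elem = (loads_per_elem + stores_per_elem) * elem_bytes
    return flops_per_elem / bytes_per_elem


# c[i] = a[i] + b[i]: one add, two loads, one store
vector_add = arithmetic_intensity(1, 2, 1)

# y[i] = alpha * x[i] + y[i]: one multiply and one add, two loads, one store
saxpy = arithmetic_intensity(2, 2, 1)

if __name__ == "__main__":
    print(f"vector add: {vector_add:.3f} ops/byte")
    print(f"SAXPY:      {saxpy:.3f} ops/byte")
```

Both come out well below one operation per byte, so under these assumptions the time goes into moving data rather than into arithmetic.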


Vector addition is always presented as a naive example because it is very easily understood. With current memory transfer overheads, CPUs are better at it. But the situation would be very interesting for APUs, where the transfer bandwidth from RAM to the CPU and to the GPU would be comparable.

Have you done any investigation in this regard?



