HFT architecture considerations with Fastflow in C++

(Last Updated On: November 21, 2015)


View the video to see my logic instead of potentially bad typing

NOTE: In my video, I address FPGA/GPU-like performance through a software accelerator. I also address the secret sauce tricks of Goldman Sachs' system-wide secDB high-performance risk management.

My notes from Fastflow tutorial with Redis

Download the package from http://sourceforge.net/projects/mc-fastflow/

This is from the fftutorial.pdf

P14 for node management

Figure 3.3 shows farms with feedback (collection). Note: W is worker, E is emitter, and C is collector.

Input stream pg 20 (hello_farm2.cpp) i.e. stage 1 for input from Redis, stage 2 for algo, stage 3 for trading decision
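To make the three-stage idea concrete, here is a minimal sketch in plain standard C++ threads. It is not FastFlow's API and not the tutorial's hello_farm2.cpp. The names Channel, Tick, Signal and run_pipeline are my own assumptions, the "Redis" stage just walks a vector, and the buy rule is a toy.

```cpp
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <queue>
#include <string>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical blocking queue used as the channel between two stages.
template <typename T>
class Channel {
public:
    void push(T v) {
        { std::lock_guard<std::mutex> lock(m_); q_.push(std::move(v)); }
        cv_.notify_one();
    }
    // Signals end-of-stream to the consumer.
    void close() {
        { std::lock_guard<std::mutex> lock(m_); closed_ = true; }
        cv_.notify_all();
    }
    // Returns false once the channel is closed and fully drained.
    bool pop(T& out) {
        std::unique_lock<std::mutex> lock(m_);
        cv_.wait(lock, [this] { return !q_.empty() || closed_; });
        if (q_.empty()) return false;
        out = std::move(q_.front());
        q_.pop();
        return true;
    }
private:
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<T> q_;
    bool closed_ = false;
};

struct Tick { double price; };              // hypothetical market data item
struct Signal { double price; bool buy; };  // hypothetical algo output

// Runs the three stages on their own threads and returns the decisions.
std::vector<std::string> run_pipeline(const std::vector<double>& prices,
                                      double threshold) {
    Channel<Tick> ticks;
    Channel<Signal> signals;
    std::vector<std::string> decisions;

    // Stage 1: input. A real system would read from Redis here (assumption).
    std::thread input([&] {
        for (double p : prices) ticks.push(Tick{p});
        ticks.close();
    });
    // Stage 2: algo. Toy rule: flag a buy when the price is below threshold.
    std::thread algo([&] {
        Tick t;
        while (ticks.pop(t)) signals.push(Signal{t.price, t.price < threshold});
        signals.close();
    });
    // Stage 3: trading decision.
    std::thread decide([&] {
        Signal s;
        while (signals.pop(s)) decisions.push_back(s.buy ? "BUY" : "HOLD");
    });

    input.join();
    algo.join();
    decide.join();
    return decisions;
}
```

Calling run_pipeline({101.0, 99.0, 100.5}, 100.0) gives HOLD, BUY, HOLD in input order, because each stage is a single thread reading a FIFO channel.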

Or pg 21 with emitter and collector defined

No collector with the main memory or send them to the next stage (in case the farm is in a pipeline stage) provided that the next stage is defined as ff_minode (i.e. multi-input node).  Pg 22 hello_farm4.cpp


3.5 Feedback channels p 27


3.6 Mixing farms pipelines and feedbacks

FastFlow pipeline, task-farm skeletons and the feedback pattern modifier can be nested and combined in many different ways. Figure 3.4 sketches some of the possible combinations that can be realised in an easy way.


**** 3.7 Software accelerators like an FPGA

Using FastFlow accelerator mode is not that different from using FastFlow to write an application only using skeletons (see Fig. 3.5). The skeletons must be started as a software accelerator, and tasks have to be offloaded from the main program. A simple program using the FastFlow accelerator mode is shown below: see pg 30 accelerator.cpp
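The offloading idea can be sketched without FastFlow at all. What follows is a rough standard-library illustration, not the tutorial's accelerator.cpp: a hypothetical Accelerator class owns a few worker threads, the main program offloads tasks to it and collects the results later. The class name, offload and sum_of_squares are all my own assumptions.

```cpp
#include <cassert>
#include <condition_variable>
#include <functional>
#include <future>
#include <mutex>
#include <queue>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical software accelerator: worker threads fed by the main program.
class Accelerator {
public:
    explicit Accelerator(unsigned workers) {
        if (workers == 0) workers = 1;
        for (unsigned i = 0; i < workers; ++i)
            pool_.emplace_back([this] { work(); });
    }
    ~Accelerator() {
        { std::lock_guard<std::mutex> lock(m_); done_ = true; }
        cv_.notify_all();
        for (auto& t : pool_) t.join();
    }
    // Offload one task; the returned future delivers its result later.
    std::future<long> offload(std::function<long()> fn) {
        std::packaged_task<long()> task(std::move(fn));
        std::future<long> result = task.get_future();
        { std::lock_guard<std::mutex> lock(m_); tasks_.push(std::move(task)); }
        cv_.notify_one();
        return result;
    }
private:
    void work() {
        for (;;) {
            std::packaged_task<long()> task;
            {
                std::unique_lock<std::mutex> lock(m_);
                cv_.wait(lock, [this] { return done_ || !tasks_.empty(); });
                if (tasks_.empty()) return;  // shutting down, nothing left to run
                task = std::move(tasks_.front());
                tasks_.pop();
            }
            task();  // run outside the lock so workers overlap
        }
    }
    std::mutex m_;
    std::condition_variable cv_;
    std::queue<std::packaged_task<long()>> tasks_;
    std::vector<std::thread> pool_;
    bool done_ = false;
};

// Main-program side: offload n tasks, then gather the results.
long sum_of_squares(long n, unsigned workers) {
    Accelerator acc(workers);
    std::vector<std::future<long>> pending;
    for (long i = 1; i <= n; ++i)
        pending.push_back(acc.offload([i] { return i * i; }));
    long total = 0;
    for (auto& f : pending) total += f.get();
    return total;
}
```

The main thread stays free between the offload loop and the gather loop, which is the point of accelerator mode: the sequential program keeps its shape and only the hot tasks move to the workers.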

Could use img_farm+pipe.cpp or img_pipe+farm.cpp from figure 3.6


On pg 40:

The next step is to reduce the number of resources used. For example the farm Emitter can be used to read files from the disk, whereas the farm Collector for writing files to the disk. Furthermore, the blur and emboss filters may be computed sequentially using a single worker. This is the so called “normal form” obtained optimising the resource usage. img_farm.cpp
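A rough picture of that normal form in plain standard C++ (not FastFlow, and not the tutorial's code): a shared counter plays the emitter, each worker runs both filters back to back on one task, and an output vector indexed by task plays the collector. The blur and emboss functions here are hypothetical string stand-ins, not real image filters.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <string>
#include <thread>
#include <vector>

// Hypothetical stand-ins for the two image filters.
std::string blur(const std::string& img) { return "blur(" + img + ")"; }
std::string emboss(const std::string& img) { return "emboss(" + img + ")"; }

std::vector<std::string> farm_normal_form(const std::vector<std::string>& files,
                                          unsigned nworkers) {
    if (nworkers == 0) nworkers = 1;
    std::vector<std::string> out(files.size());
    std::atomic<std::size_t> next{0};  // emitter role: hands out the next task index
    std::vector<std::thread> workers;
    for (unsigned w = 0; w < nworkers; ++w) {
        workers.emplace_back([&] {
            for (;;) {
                const std::size_t i = next.fetch_add(1);
                if (i >= files.size()) return;    // no tasks left
                out[i] = emboss(blur(files[i]));  // both filters in one worker
            }
        });
    }
    for (auto& t : workers) t.join();
    return out;  // collector role: results gathered in input order
}
```

Each worker writes only to its own slot of out, so no locking is needed beyond the atomic counter.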
*** For the fastest processing, focus on patterns that are stateless, as the map section on pg 44 explains

Parallel_for may be more powerful than how Matlab does it, with more options

Pg 48 with ParallelForReduce shows how to use math routines like summing an array
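The reduce idea behind that example, sketched with plain std::thread rather than FastFlow's ParallelForReduce (whose exact call signature I am not reproducing here): split the array into contiguous chunks, sum each chunk on its own thread, then combine the partial sums. The function name parallel_sum is my own.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <numeric>
#include <thread>
#include <vector>

// Sum an array by reducing chunks in parallel, then combining the partials.
long parallel_sum(const std::vector<long>& data, unsigned nthreads) {
    if (nthreads == 0) nthreads = 1;
    std::vector<long> partial(nthreads, 0);
    std::vector<std::thread> pool;
    // Ceiling division so every element lands in exactly one chunk.
    const std::size_t chunk = (data.size() + nthreads - 1) / nthreads;
    for (unsigned t = 0; t < nthreads; ++t) {
        const std::size_t begin = std::min(data.size(), t * chunk);
        const std::size_t end = std::min(data.size(), begin + chunk);
        pool.emplace_back([&data, &partial, t, begin, end] {
            // Each thread writes only its own slot, so no lock is needed.
            partial[t] = std::accumulate(data.begin() + begin,
                                         data.begin() + end, 0L);
        });
    }
    for (auto& th : pool) th.join();
    // Final sequential reduction over the per-thread partial sums.
    return std::accumulate(partial.begin(), partial.end(), 0L);
}
```

For the numbers 1 to 100 this returns 5050 regardless of the thread count, since integer addition is associative and the chunks do not overlap.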

ff_Map for FPGA-like processing

pg 52 Why use the ff_Map instead of using directly a ParallelFor in a sequential


Pg 52 uses matrix multiplication matmul.cpp

P 43 mandel.cpp has image processing

P 57 sobel.cpp also does image processing


******* P60 ff_mdf uses graph instructions (just FYI: Goldman Sachs' system-wide enterprise secDB works the same way (hm...) as in figure 5.1) → creating graph tasks on p61 hello_mdf.cpp
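To show what "graph tasks" means in the simplest terms, here is a tiny data-flow sketch using std::async and std::shared_future. It is not the ff_mdf API and not hello_mdf.cpp, and it says nothing about how secDB is actually built. The graph and the name run_graph are made up for illustration: two independent nodes feed a third that can only fire once both inputs are ready.

```cpp
#include <cassert>
#include <future>

// Tiny dependency graph:
//   a = x + 1      (independent)
//   b = x * 2      (independent)
//   c = a * b      (needs both a and b)
long run_graph(long x) {
    std::shared_future<long> a =
        std::async(std::launch::async, [x] { return x + 1; }).share();
    std::shared_future<long> b =
        std::async(std::launch::async, [x] { return x * 2; }).share();
    // c blocks on its inputs, so it effectively fires when both are available.
    std::future<long> c =
        std::async(std::launch::async, [a, b] { return a.get() * b.get(); });
    return c.get();
}
```

With x = 3 the graph computes a = 4, b = 6 and c = 24. The scheduling order of a and b is left to the runtime, only the dependency on c is fixed, which is the essence of the macro data-flow style.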

P63 block-based matrix multiplication could be used for complex matrices with linear algebra techniques (????)
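A minimal sketch of the block-based idea in standard C++, assuming square row-major matrices. This is my own illustration, not the tutorial's code: Matrix, block_matmul, the block size bs and the way block rows are dealt out to threads are all assumptions.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <thread>
#include <vector>

using Matrix = std::vector<double>;  // row-major, n x n

// C = A * B computed block by block. Thread t owns block rows t, t + nthreads,
// and so on, which means no two threads ever write the same element of C.
Matrix block_matmul(const Matrix& A, const Matrix& B, std::size_t n,
                    std::size_t bs, unsigned nthreads) {
    Matrix C(n * n, 0.0);
    if (bs == 0) bs = 1;
    if (nthreads == 0) nthreads = 1;
    const std::size_t nblocks = (n + bs - 1) / bs;
    std::vector<std::thread> pool;
    for (unsigned t = 0; t < nthreads; ++t) {
        pool.emplace_back([&A, &B, &C, n, bs, nblocks, nthreads, t] {
            for (std::size_t bi = t; bi < nblocks; bi += nthreads)
                for (std::size_t bj = 0; bj < nblocks; ++bj)
                    for (std::size_t bk = 0; bk < nblocks; ++bk)
                        // Multiply block (bi, bk) of A by block (bk, bj) of B.
                        for (std::size_t i = bi * bs; i < std::min(n, (bi + 1) * bs); ++i)
                            for (std::size_t k = bk * bs; k < std::min(n, (bk + 1) * bs); ++k)
                                for (std::size_t j = bj * bs; j < std::min(n, (bj + 1) * bs); ++j)
                                    C[i * n + j] += A[i * n + k] * B[k * n + j];
        });
    }
    for (auto& th : pool) th.join();
    return C;
}
```

Working on small blocks keeps the operands in cache, and each block product is an independent task, which is what makes the scheme a natural fit for a task graph.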







