Tag Archives: scientists

Introduction to Python: For Scientists and Engineers


Another recommended book from someone on Facebook



NOTE I now post my TRADING ALERTS into my personal FACEBOOK ACCOUNT and TWITTER. Don't worry as I don't post stupid cat videos or what I eat!

Quant opinion: The New Einsteins Will Be Scientists Who Share



From cancer to cosmology, researchers could race ahead by working together—online and in the open. Adapted from Reinventing Discovery: The New Era of Networked Science.

I couldn’t help but add to the interesting discussion over on WSJ: “For some, (scientific) data equates to intellectual property: MY data, MY IP. Although the Web (including Web 2.0) has had some impact, I believe that the most-effective approach for promoting the sharing of data is to ‘legislate it’ – i.e., ensure project sponsors (e.g., granting agencies) make the sharing of data a policy, as alluded to above. Also, as alluded to above, there needs to be some way of recognizing such efforts, so the merit of sharing data is seen to be of value in and of itself. With such efforts, the data-IP equation can be recontextualized as: OUR data, OUR IP.”

Let me play devil’s advocate for a moment here: when U. of Michigan copyrights the scanning of a book in the public domain written in 1898 and when Google (and others) gets to be co-copyright owner to everything you write online, it’s laudable but naive to think about wikis and open science, without thinking first about the server/hardware/place where that science sits.


It’s not about sharing; it’s about access to the data, and who gets it. If it’s your tax money at work (i.e., a gov’t grant), THEN it should be OPEN!


It won’t be open because the system promotes those who publish first. If you publish first, you get a grant sooner than your competitors. This is what keeps people from disclosing their data. Whether the system will be changed is a different topic for a different discussion.


We have a conflict of two or more competing concepts here. Yes, people thinking along similar lines can often come up with greater insights than a person working alone because they can bring more views and backgrounds to the forefront. But publicly revealing ideas as they evolve doesn’t pay the bills, buy laboratory equipment, or pay the taxes. With patent laws that allow the “first to file” to own the rights, a lurker can run off and patent the “cooperative ideas”, leaving the collective vulnerable to lawsuits if they attempt to commercialize their ideas. This is why most companies expect non-disclosure agreements to be signed before outsiders are allowed to be briefed on promising projects.

Research without hope of financial return is called a “hobby”, and if it requires substantial capital investment to advance it often will go nowhere. Commercialization or sales of “owned” IP or products are the only way to build the financial means to advance the ideas.

The solution is often a single company hiring the interested parties and assuring the IP ownership stays with the group. Find a way around the financial needs and incentives and we will get more than we see with “open source” software in other fields.



I fully agree that nowadays securing IP rights is a prerequisite for a successful business.

On the other hand, take a historical trip to England in the 19th and early 20th centuries and think how many innovations were created at that time, and how many striking ideas came up in electricity, physics, chemistry and biology. The light bulb, the electric motor and radio are just a few examples of breakthrough innovations that still affect our daily lives.

Those innovative ideas were created by a league of gentlemen who did science in their spare time. They were rich enough to fund their laboratories on their own, with a little help from leading universities such as Oxford or Cambridge. The ideas came up through unbiased discussions and interactions that happened mostly in the pub after working hours.
Do you think they thought of IP protection or funding when making discoveries? No, they did it because they found it interesting. They did it for a greater cause than just earning money. That’s why I called them “gentlemen”.

Second remark: what would have happened if there had been a patent on the semiconductor? How fast would electronics have developed? What if there were a patent on Wikipedia?

My point is that perhaps funding policies, grants, and experts deciding which projects get funded are not obligatory for science to advance. I am from Europe (Poland) and I really cannot think of ANY innovation funded by EU programmes that affects my life as much as the light bulb, the engine, radio, the internet and, most of all, the WC 🙂




Quant development: What languages, CUDA, and C++ libraries do scientists use for parallel computing?


Which parallel programming language should be preferred for parallel computing in molecular dynamics?

I am working on a supercomputer with molecular dynamics tools. I want to choose the parallel programming language best suited to molecular dynamics and molecular simulation. It should improve scalability and performance and minimize communication latency.

MPI is now and ever shall be.
Seriously: most used, most tested, most widely understood.
OpenMP is workable IF you plan to stay really small.

How about GPGPU (CUDA) programming on NVIDIA Fermi cards? I am currently using it for a CFD (Computational Fluid Dynamics) problem and it does a great job.

If he’s going small then GPGPU (CUDA) programming may be enough.
OpenCL gives more GPGPU portability. Would you agree that if Sagar is going to be using large distributed machines, then MPI surely ought to be at least a “gluing” component?

Sagar, what size machines do you work on?
Will the code be home grown, open source or proprietary? Combination?
Scale of computations?

Neither “MPI” nor “GPGPU (CUDA)” is a programming language.

MPI is a programming library whose interface and standards are defined by the MPI Forum, and it therefore has multiple implementations (MPICH, Open MPI, etc.). MPI supports C, C++, and Fortran, and I believe most implementations provide bindings for other languages (Perl, Python, etc.) too.

CUDA is a combination of libraries and extensions to the C language and a few other languages (C++, for example). Using CUDA (or OpenCL) to enable GPU processing only makes sense if you already have access to GPU hardware, or know you will in the future. There is no Fortran support for CUDA right now.

MPI and CUDA do not compete with each other. They are separate, complementary technologies.

I would recommend avoiding C++ for MPI programming. It’s hard to transmit C++ objects in MPI messages. The Boost library makes it easier, and you can create your own MPI types, but I think it’s much easier to stick with plain C.
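To illustrate why plain data is easier to send than objects, here is a minimal sketch in Python rather than C. It uses the standard `struct` module to flatten a record into a fixed-layout byte buffer, which is roughly what a plain C struct already looks like in memory. The particle layout and the names `PARTICLE_FORMAT`, `pack_particle` and `unpack_particle` are assumptions made for the example, not anything from the discussion, and no MPI call is involved.

```python
import struct

# Hypothetical fixed layout for one particle: an int32 id followed by three
# float64 coordinates, little-endian, with no padding between fields.
PARTICLE_FORMAT = "<i3d"
PARTICLE_SIZE = struct.calcsize(PARTICLE_FORMAT)  # 4 + 3 * 8 bytes


def pack_particle(particle_id, x, y, z):
    """Flatten one particle into a contiguous byte string."""
    return struct.pack(PARTICLE_FORMAT, particle_id, x, y, z)


def unpack_particle(buffer):
    """Rebuild the (id, x, y, z) tuple on the receiving side."""
    return struct.unpack(PARTICLE_FORMAT, buffer)
```

A buffer like this can be handed to any byte-oriented send routine. An object with pointers, virtual methods or nested containers has no such fixed layout, which is the difficulty described above.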

Since CUDA doesn’t support Fortran, and MPI programming is difficult with C++, I would recommend C if you plan on using CUDA. I would go even further and recommend strict ANSI C for maximum portability.

If you really want to shoot for the moon and write code that will be usable and runnable on clusters *and* be able to take advantage of GPUs, it is possible to write code in C that uses MPI and use CUDA to perform vector operations on the node. This is essentially how LANL’s IBM Roadrunner is programmed (but using Cell processors instead of NVIDIA GPUs and CUDA), as well as Tianhe-1 in China.

For max portability, at the start of your code you can check for CUDA hardware, and then write your functions so that they use CUDA functions if CUDA was detected and regular implementations of those functions if it was not. This is more work, but will allow max performance and flexibility.
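The detect-once-then-dispatch idea can be sketched roughly as follows. This is in Python rather than C, and it assumes the CuPy library as a stand-in for "CUDA functions". CuPy is not mentioned in the discussion, and the names `detect_gpu_backend`, `make_vector_add` and `vector_add` are hypothetical. If CuPy or a CUDA device is missing, the sketch falls back to a plain CPU implementation.

```python
def detect_gpu_backend():
    """Return the CuPy module if a CUDA device is usable, else None."""
    try:
        import cupy  # assumption: CuPy stands in for "CUDA functions"
        if cupy.cuda.runtime.getDeviceCount() > 0:
            return cupy
    except Exception:
        # CuPy not installed, driver missing, or no device present.
        pass
    return None


def _add_cpu(a, b):
    """Regular implementation: element-wise sum of two sequences."""
    return [x + y for x, y in zip(a, b)]


def make_vector_add(gpu):
    """Pick the implementation once, at start-up."""
    if gpu is None:
        return _add_cpu

    def _add_gpu(a, b):
        # Copy to the device, add there, copy the result back as a list.
        return (gpu.asarray(a) + gpu.asarray(b)).tolist()

    return _add_gpu


# Callers use vector_add without knowing which backend was selected.
vector_add = make_vector_add(detect_gpu_backend())
```

The rest of the program calls `vector_add` and never checks for hardware again, so the extra work stays confined to start-up.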

I am going to work on a teraflops machine and will try to develop code that scales up to thousands of processors.
The code will be either open source or proprietary.

Currently we are working with MPI. But there are some limitations of MPI with C++, as Prentice says, and dynamic load balancing is also difficult. But it is well known, and most of the code is written in MPI. MPI is also good for parallel programming in MD simulation.
And it is really good for large-scale computation.
I don’t have that much experience with GPGPU programming.

Also, I read about one more parallel programming language, CHARM++. It is an object-oriented, message-driven language. NAMD is written in CHARM++. CHARM++ is written in C++, and programming with C++ objects is easy. It provides features like dynamic load balancing, object migration, virtual processors, etc.
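To give a feel for what dynamic load balancing decides, here is a minimal sketch in Python of a greedy heuristic: the heaviest work objects are placed first, each on the currently least-loaded processor. This is an illustration only, not CHARM++'s actual implementation or API. The function name `greedy_balance` and the cost table are hypothetical.

```python
import heapq


def greedy_balance(costs, n_procs):
    """Assign work objects to processors, heaviest first.

    `costs` maps a hypothetical object id to its measured cost.
    Returns (assignment, loads), where assignment maps object id to
    processor index and loads[p] is the total cost placed on processor p.
    """
    # Min-heap of (current load, processor index).
    heap = [(0, proc) for proc in range(n_procs)]
    heapq.heapify(heap)
    assignment = {}
    loads = [0] * n_procs
    for obj, cost in sorted(costs.items(), key=lambda kv: -kv[1]):
        load, proc = heapq.heappop(heap)  # least-loaded processor
        assignment[obj] = proc
        loads[proc] = load + cost
        heapq.heappush(heap, (loads[proc], proc))
    return assignment, loads
```

In a running simulation the costs would be re-measured periodically and objects migrated when the assignment changes. That is the part a runtime with object migration takes off the programmer's hands.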

If anyone has some information about it, please share.

MPI is a set of libraries to encapsulate the communication between machines. If you use a programming language like C, you can then have multithreading and parallelism inside a node and all the advantages of a distributed memory mechanism using MPI. I don’t have experience with other programming languages, but MPI is very easy to use with both C and Fortran.
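In a distributed-memory program, one of the first decisions is which slice of the data each process owns. A minimal sketch in Python, assuming a simple block decomposition (the discussion does not specify one): `rank` and `size` play the role of the values that `MPI_Comm_rank` and `MPI_Comm_size` return in a real MPI program, and `block_range` is a hypothetical helper name.

```python
def block_range(n_items, size, rank):
    """Return the half-open [start, stop) slice owned by `rank`.

    The first `n_items % size` ranks get one extra item, so the
    per-rank counts differ by at most one.
    """
    if not 0 <= rank < size:
        raise ValueError("rank must satisfy 0 <= rank < size")
    base, extra = divmod(n_items, size)
    start = rank * base + min(rank, extra)
    stop = start + base + (1 if rank < extra else 0)
    return start, stop
```

Each process computes its own range locally without any communication. Messages are then needed only for data that crosses slice boundaries.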

I said that they can all use an MPI library for parallel programming, but I recommend C since MPI programming with C++ can be difficult, and CUDA does not support Fortran.

CHARM++ is also not a language. It is a parallel programming library for C++.

What’s wrong with Fortran? CHARMm is written and runs under Fortran, many use it in single or parallel CPU mode.

Unified Parallel C (http://upc.lbl.gov/) is perhaps another option: a distributed shared memory parallel programming extension to C, which attempts to abstract away the complexity of MPI programming by making the compiler do the hard work. But as Richard says, Fortran may be the best option depending on what you are attempting to implement.

I never said there was anything wrong with Fortran, other than that it’s not supported by CUDA right now, so if you plan on programming for GPUs, it’s not an option.

I guess I misunderstood. It seems that he wants to write his own MD code that can be parallelized rather than use something that already exists? Let’s reinvent the wheel?

Don’t be so quick to judge. It could be that he’s working on a new algorithm for academic research (MS or PhD thesis). Or maybe he’s working on a new problem in MD that no one else has addressed, yet, and therefore existing tools won’t work.

If you want to develop new algorithms or work on unusual problems, you may need to write a code from scratch, but you should think carefully about whether you can accomplish your goals by modifying an existing package. Especially in parallel computing, you have to write a lot of code that isn’t about the solver algorithm to make the program work (and it can be very difficult to get it right) so if you can reuse somebody else’s code for this part, your life will be easier and you’ll be able to spend more time working on innovative code and problems.

Working on something that no one has done is real practical, now isn’t it?

I guess someone here hasn’t stuck a toe in the employment waters recently (if ever).

That was mostly meant to be tongue in cheek.

From a LinkedIn group discussion
