Meetup video on Use of R core scripting to eliminate ‘NA’ and other common issues

(Last Updated On: June 10, 2014)


Join my FREE newsletter to learn about similar Meetups in the future 

Detail of Meetup from:

Use of R core scripting to eliminate ‘NA’ and other common issue

Tuesday, Jun 10, 2014, 6:00 PM

GotoMeeting Webinar online, Toronto, ON

9 Researching Traders Went

Meetup Webinar Tues Jun 10 at 6 PM EST: Use of R core scripting to eliminate ‘NA’ (“and other common recycled value problems”?)

Body of presentation:

I. Use of rm() inside of source code

This following portion is still under construction as I haven’t gotten as much feedback as would be helpful from core R team yet…

II. Manually coding a ‘divisor proc…



10 Members Went


Presentation material:

Manuscript of Intended Presentation:

 

 

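The use of a <- a[-(i)] can lead to NAs, for example when later code still indexes by positions that were computed before the vector was shortened.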

 

The argument is that a <- a[!is.na(a)] would then suffice to clean this up (note the logical negation: a[-is.na(a)] does not remove the NAs), but what are the costs if, say, a is the result vector from a sorting algorithm which recursively shortens the vector?
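A minimal sketch of the clean-up step in base R, with a made-up example vector. It also shows why the minus-sign form is not equivalent: `-is.na(a)` turns the logical mask into the indices 0 and -1, so at most the first element gets dropped.

```r
# Hypothetical example vector containing NAs
a <- c(5, NA, 3, NA, 8)

# Logical negation keeps every element that is not NA
cleaned <- a[!is.na(a)]

# Arithmetic negation converts TRUE/FALSE to -1/0, so this is a[c(0, -1, 0, -1, 0)],
# which only drops the first element and leaves the NAs in place
wrong <- a[-is.na(a)]

print(cleaned)
print(wrong)
```

The logical form is also safe when there are no NAs at all, in which case it simply returns the vector unchanged.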

 

The reality is that removing individual elements by their index can be hard on data integrity, because the indices of the remaining elements are renumbered after each removal. Perhaps this also depends on the cluster or R environment you are loading from. Either way, NAs are a commonly recurring problem in R.
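The renumbering problem can be sketched in a few lines of base R. The vector and the positions to drop are made up for illustration; the point is only that positions computed against the original vector go stale once an element has been removed.

```r
# Hypothetical data and positions, chosen against the ORIGINAL vector
a <- c(10, 20, 30, 40, 50)
drop_positions <- c(2, 4)

# Removing one element at a time shifts everything after it,
# so the second removal hits the wrong element
one_by_one <- a
for (i in drop_positions) {
  one_by_one <- one_by_one[-i]
}

# Removing all positions in a single step avoids the shift
all_at_once <- a[-drop_positions]

# Reading past the end of a shortened vector silently yields NA
shortened <- a[-1]
past_end <- shortened[length(a)]

print(one_by_one)
print(all_at_once)
print(past_end)
```

Here the loop removes 20 and then 50 (instead of 40), while the single-step removal drops the intended elements.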

 

Since there are many precompiled functions in R, it seems logical to make use of them. What isn’t so obvious is their usage for non-vector arguments. For example, rm() is typically used to clean objects out of the workspace before reading in or after writing out a file from a program. However, rm() can also serve the same purpose as a <- a[-(i)] when the values are held as separately named objects rather than as elements of one vector. That bypasses the need to call a <- a[!is.na(a)] afterwards, and with it the risk of losing data integrity.
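One way this might look in practice is sketched below. Keeping the values as named objects inside an environment is an assumption made for illustration (the names `store`, `x1`, `x2`, `x3` are hypothetical), since rm() removes whole objects and cannot remove a single element of a vector.

```r
# Hypothetical store: each value lives under its own name instead of a position
store <- new.env()
assign("x1", 10, envir = store)
assign("x2", 20, envir = store)
assign("x3", 30, envir = store)

# Remove one value by name; nothing is renumbered, so no stale index can appear
rm(list = "x2", envir = store)

# Collect what is left back into an ordinary vector
remaining_names <- ls(store)
remaining_values <- unlist(mget(remaining_names, envir = store))

print(remaining_names)
print(remaining_values)
```

Because the remaining values are looked up by name, there is no position that can fall past the end of the data, which is where the NAs came from in the index-based version.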

 

 

More along the lines of data integrity is the loss of precision in arithmetic operations as you get close to your machine precision. What then happens depends, again, on your own system and which version of R you are using. Apparently 3.0.0 is now set up with the idea of allowing data to just drop digits as precision is maxed out. To quote the current developers' blog:

The following function is due for release:

 

digitloss = c("allow", "warn", "forbid")
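The loss of precision near machine precision described above can be observed directly in base R. This is a small sketch that assumes standard IEEE double precision, which is what R uses on common platforms; it does not use the setting quoted above.

```r
# Smallest relative spacing between doubles on this machine
eps <- .Machine$double.eps

# Decimal fractions are not stored exactly, so exact comparison fails
sum_is_exact <- (0.1 + 0.2) == 0.3

# Comparison with a tolerance is the usual workaround
sum_is_close <- isTRUE(all.equal(0.1 + 0.2, 0.3))

# Beyond 2^53 a double can no longer represent every whole number,
# so adding 1 is silently absorbed
big <- 2^53
absorbed <- (big + 1) == big

print(eps)
print(c(sum_is_exact, sum_is_close, absorbed))
```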

 

 

C developers can deal with this by implementing their own arithmetic procedures, keeping in mind the underlying algorithm of each. For example, division can be viewed as the inverse operation of multiplication, which in turn can be viewed as a “convolution” of the digit sequences of the two numbers being multiplied.

 

So what does this mean? To speed up your system and avoid the data loss mentioned above, you might convert your division problem into a multiplication by the inverse of your divisor. Then, in order to convert your base 10 number to decimal formatting, you would either call strtoll() (in C) or incorporate your own division algorithm.
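As a rough sketch of "multiplication by the inverse of your divisor", the function below computes a reciprocal with Newton-Raphson iteration, which is one well-known way to build your own division routine. The choice of Newton-Raphson, the function name `newton_reciprocal`, and the iteration count are assumptions for illustration, not something taken from the presentation.

```r
# Approximate 1/d for a positive number d without dividing by d itself.
newton_reciprocal <- function(d, iterations = 6) {
  stopifnot(is.numeric(d), length(d) == 1, d > 0)
  # Scale d by a power of two so the scaled divisor m lies in about [0.5, 1)
  e <- floor(log2(d)) + 1
  m <- d / 2^e
  # Standard linear starting guess for 1/m on [0.5, 1]
  r <- 48 / 17 - (32 / 17) * m
  # Each step roughly doubles the number of correct digits
  for (k in seq_len(iterations)) {
    r <- r * (2 - m * r)
  }
  # Undo the power-of-two scaling: 1/d = (1/m) / 2^e
  r / 2^e
}

# Division rewritten as multiplication by the reciprocal
x <- c(10, 7, 3)
d <- 7
quotient <- x * newton_reciprocal(d)
print(quotient)
```

Inside the loop only multiplication and subtraction are used. Note that multiplying by a reciprocal can differ from true division in the last bit, so it trades a little accuracy for speed rather than removing round-off.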

 

At this point you would be ready to perform the “convolution” portion of your multiplication formula. Warning: convolve() in R (as in Numerical Recipes in C) uses the Fourier transform, which brings its own N*logN cost and floating point round-off to your computation. So it may be best to code up your own if you think time is of importance.
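A hand-coded version could look like the sketch below: a direct (non-FFT) convolution of two digit vectors followed by carry propagation, which is schoolbook multiplication. The function names `direct_convolve` and `multiply_digits` are hypothetical, and the inputs are assumed to be non-negative whole numbers written as digit strings.

```r
# Direct convolution: out[k] collects every product x[i] * y[j] with i + j - 1 == k
direct_convolve <- function(x, y) {
  out <- numeric(length(x) + length(y) - 1)
  for (i in seq_along(x)) {
    for (j in seq_along(y)) {
      out[i + j - 1] <- out[i + j - 1] + x[i] * y[j]
    }
  }
  out
}

# Multiply two non-negative whole numbers given as strings of decimal digits
multiply_digits <- function(a, b) {
  # Split into digits, least significant digit first
  xa <- rev(as.integer(strsplit(a, "")[[1]]))
  xb <- rev(as.integer(strsplit(b, "")[[1]]))
  raw <- direct_convolve(xa, xb)
  # Propagate carries so that every position holds a single digit
  digits <- numeric(0)
  carry <- 0
  for (v in raw) {
    total <- v + carry
    digits <- c(digits, total %% 10)
    carry <- total %/% 10
  }
  while (carry > 0) {
    digits <- c(digits, carry %% 10)
    carry <- carry %/% 10
  }
  result <- paste(rev(digits), collapse = "")
  # Strip leading zeros but keep a lone "0"
  sub("^0+(?=.)", "", result, perl = TRUE)
}

product <- multiply_digits("123", "45")
print(product)
```

The direct version costs on the order of N*M digit products, so it pays off for short inputs; for very long digit vectors an FFT-based routine is usually the faster choice despite its round-off.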

 

 

 

 

 

Examples of code demonstrating the above topics are available upon request. Thanks for your attendance.

 


 

 

 


About caustic

Hi there. My name is Bryan Downing. I am part of a company called QuantLabs.Net. This is specifically a company with a high profile blog about technology, trading, financial, investment, quant, etc. It posts things on how to do job interviews with large companies like Morgan Stanley, Bloomberg, Citibank, and IBM. It also posts different unique tips and tricks on Java, C++, or C programming. It posts about different techniques in learning about Matlab and building models or strategies. There is a lot here if you are into venturing into the financial world like quant or technical analysis. It also discusses the future generation of trading and programming. Specialties: C++, Java, C#, Matlab, quant, models, strategies, technical analysis, linux, windows. P.S. I have been known to be the worst typist. Do not be offended by it, as I like to bang stuff out and put priority on what I do over typing. Maybe one day I can get a full time copy editor to help out. Do note I prefer videos as they are much easier to produce, so check out my many videos at youtube.com/quantlabs