
How to insert, update, and delete in a MongoDB NoSQL database with Python


Watch this 30-minute video, which walks through the process using the database for my trading system.

 


 

Source code:

Query

#https://docs.mongodb.org/getting-started/python/query/

from pymongo import MongoClient

client = MongoClient()
# db = client.qln
# cursor = db.qln.find({"positions.shortsymbol": "ibm"})
#
# for document in cursor:
#     print(document)


db = client.qln
collection = db.equity

stocks = collection.find({"shortsymbol": "k"})

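# Cursor.count() was removed in PyMongo 4; count_documents() (PyMongo 3.7+) replaces it
print(collection.count_documents({"shortsymbol": "k"}))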
for stock in stocks:
    print(stock)
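
The lookup above can be wrapped in a small helper so other scripts can reuse it. This is only a minimal sketch, not the code from the video: `find_by_short_symbol` and its `fields` parameter are hypothetical names, and `collection` is assumed to be a PyMongo collection such as `MongoClient().qln.equity`.

```python
def find_by_short_symbol(collection, symbol, fields=("shortsymbol", "shortentryprice")):
    """Return the documents whose shortsymbol equals `symbol` as a list.

    `collection` is assumed to be a PyMongo collection; `fields` is a
    hypothetical convenience parameter naming the keys to keep.
    """
    # Projection: 1 keeps a field, and _id has to be switched off explicitly.
    projection = {name: 1 for name in fields}
    projection["_id"] = 0
    # find() returns a lazy cursor; list() pulls every matching document.
    return list(collection.find({"shortsymbol": symbol}, projection))
```

Calling `find_by_short_symbol(db.equity, "k")` should then give a plain list that can be printed or handed to other code without holding a cursor open.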

Insert

#https://docs.mongodb.org/getting-started/python/insert/
from pymongo import MongoClient

client = MongoClient()
db = client.qln

#db.equity.insert( { "_id": 5, item: "box", qty: 20 } )

# from datetime import datetime
result = db.equity.insert_one(
    { "idx" : 2,
      "shortsymbol" : "k",
      "shortentryts" : 1000,
      "shortentryprice" : 100,
      "shortstoplossper" : 0.08,
      "shortupstop" : 105.1,
      "shortlowstop" : 95.1,
      "shortsofttargperc" : 0.08,
      "shortuptarget" : 105.1,
      "shortlowtarget" : 95.1,
      "longsymbol" : "pep",
      "longentryts" : 1000,
      "longentryprice" : 100,
      "longstoplossper" : 0.08,
      "longupstop" : 105.1,
      "longlowstop" : 95.1,
      "longsofttarget" : 0.08,
      "longuptarget" : 105.1,
      "longdowntarget" : 95.1,
      "shortexitts" : 1000,
      "shortexitprice" : 95.1,
      "longexitprice" : 105.1,
      "shortqty" : 1000,
      "longqty" : 1000,
      "shortbeta" : 0.95,
      "longbeta" : 1.01
    }

)
print(result.inserted_id)

#https://docs.mongodb.org/getting-started/python/query/
cursor = db.equity.find()

for document in cursor:
    print(document)
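
Unless validation rules are configured, MongoDB stores whatever types it is given, so a price typed as a string goes in without complaint. A rough sketch of a check to run before `insert_one` follows. The helper name `non_numeric_fields` and the `SYMBOL_FIELDS` set are assumptions made for this example, not part of PyMongo.

```python
from numbers import Real

# Fields that are expected to hold text; everything else should be a number.
# This set is an assumption based on the position document shown above.
SYMBOL_FIELDS = {"shortsymbol", "longsymbol"}

def non_numeric_fields(document, text_fields=SYMBOL_FIELDS):
    """Return the sorted names of fields that should be numeric but are not."""
    bad = []
    for name, value in document.items():
        if name in text_fields or name == "_id":
            continue
        # bool is a subclass of int in Python, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, Real):
            bad.append(name)
    return sorted(bad)
```

If the returned list is not empty, the document can be fixed before it ever reaches the database.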


Update

 

#https://docs.mongodb.org/manual/reference/method/db.collection.update/
from pymongo import MongoClient

client = MongoClient()
db = client.qln

result = db.equity.update_one(
    {"shortsymbol": "k"},
    {"$set": {"shortentryprice": 107.5, "shortstoplossper": 1.40}}
)

print(result.matched_count)
print(result.modified_count)
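
The two counts answer different questions: `matched_count` says whether the filter found a document, while `modified_count` stays at 0 if the stored values already equal the new ones. A small sketch of how they might be read together follows; `describe_update` is a hypothetical helper, and `result` is assumed to be the object returned by `update_one`.

```python
def describe_update(result):
    """Turn the counts on an update result into a short status string."""
    if result.matched_count == 0:
        return "no document matched the filter"
    if result.modified_count == 0:
        return "matched, but the stored values were already the same"
    return "updated {} document(s)".format(result.modified_count)
```

Running the same update twice in a row should produce the "updated" message the first time and the "already the same" message the second time.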

Delete

 

#https://docs.mongodb.org/getting-started/python/remove/
from pymongo import MongoClient

client = MongoClient()
db = client.qln

result = db.equity.delete_many({"shortsymbol": "k"})

print(result.deleted_count)
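
Keep in mind that `delete_many` removes every document that matches, and an empty filter `{}` matches the whole collection. One possible guard is sketched below; `delete_by_short_symbol` is a hypothetical wrapper and `collection` is assumed to be a PyMongo collection such as `db.equity`.

```python
def delete_by_short_symbol(collection, symbol):
    """Delete all documents for one short symbol and return how many went."""
    # Refuse blank input so a bad variable can never widen the delete.
    if not isinstance(symbol, str) or not symbol.strip():
        raise ValueError("refusing to delete without a symbol")
    result = collection.delete_many({"shortsymbol": symbol})
    return result.deleted_count
```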

For R connectivity, using NoSQL options for clustering and parallelization: Redis, Cassandra, Couch, MongoDB, MySQL, Hadoop with HBase


I have completed my R source code walkthroughs of 14 popular forecasting models for my membership. Now I am focusing on my cluster to speed up the simulations of the algos. As a result, it comes down to how R talks to the popular NoSQL options out there. I seem to have narrowed it down to MongoDB and Redis. There are really no decent R client code examples for Hadoop, Couch, or Cassandra. Here are some links that are making me lean towards Redis.
http://stackoverflow.com/questions/10696463/mongodb-with-redis

Comparing MongoDB and Redis, Part 1

http://openmymind.net/2011/5/8/Practical-NoSQL-Solving-a-Real-Problem-w-Mongo-Red/

http://www.quora.com/What-are-the-advantages-and-disadvantages-of-using-MongoDB-vs-CouchDB-vs-Cassandra-vs-Redis

http://java.dzone.com/articles/should-i-use-mongodb-couchdb

http://stackoverflow.com/questions/5252577/how-much-faster-is-redis-than-mongodb

Plus, the client coding examples for Redis are much more helpful.

Update: It looks like I am going with MongoDB, as I have three 32-bit Macs. 32-bit builds of MongoDB are limited to about 2 GB of data, but at least the machines can be used. MySQL does not support older versions of OS X, and Redis is really Linux only. Too bad on the Redis side, because it looked awesome!

 

Which RDBMS or NoSQL database do you use for R? MySQL, Cassandra, HBase, MongoDB, Oracle, PostgreSQL, CouchDB, SQLite?


This R survey is kind of important. It will show a few things:

  1. Which database most R users use, regardless of whether it is commercial, open source, or NoSQL.
  2. Which database is best for R in terms of scalability and speed, depending on the requirements. These include multiple writes of market tick data from a C++ or Java application and access by various R algorithms for analytics purposes.
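
To make the write side of that requirement concrete, here is a rough Python sketch of pushing tick data into MongoDB in batches with `insert_many`. Everything specific in it is an assumption: `insert_ticks` is a hypothetical helper, the batch size is arbitrary, and `collection` is assumed to be a PyMongo collection.

```python
def insert_ticks(collection, ticks, batch_size=1000):
    """Insert tick documents in batches and return how many were sent."""
    inserted = 0
    batch = []
    for tick in ticks:
        batch.append(tick)
        if len(batch) == batch_size:
            # ordered=False lets the server keep going past a failed document.
            collection.insert_many(batch, ordered=False)
            inserted += len(batch)
            batch = []
    if batch:
        # Flush whatever is left over after the loop.
        collection.insert_many(batch, ordered=False)
        inserted += len(batch)
    return inserted
```

Batching keeps the number of round trips down, which matters more than anything else when ticks arrive faster than single inserts can be acknowledged.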

Go here for the poll.

Here are some reasonable options with reasons:

MySQL

I would assume this is the number one choice since it is open source (or so they say). It also covers sharding and other scalability needs through clustering. Is this something people are using for their trading platform requirements? This includes using MySQL as a tick data repository.

PostgreSQL

Is anyone actually using this open source database for their R needs?

Oracle

This is easily the most popular commercial RDBMS for both Linux/Unix and Windows. As Oracle has brought R into its ecosystem with a connector, I wondered if people are actually using this.

SQL Server/DB2/Sybase

I am unsure if there are any R package connectors to any of these databases. I was just curious as I am really not interested in these as a real option.

Cassandra

There seems to be no R package support for this. I once posted something about it on R-Bloggers.com and it lit up the site, which made me wonder if it is actually more popular than people think. It seems to meet the needs of both quick write and read access.

Redis

Now the doRedis R package looked really hot. It even showcased how to use it with a potential financial analytics system. I even saw Java sharding examples, which left me excited about the capabilities of this database.

MongoDB

This seems, strangely, to be the most popular of all. I also found various R packages which seem to support it as well.

HBase, which is part of Hadoop

Eh. No real support here, even from Revolution Analytics, whose R packages lack install guides. I gave up pretty quickly on these R packages.

All other database options seem fine, but the ones listed above seem the most viable for any R user as a repository for scaling and clustering.

Go here for the poll.

http://quantlabs.net/surveys/2012/06/19/what-rdms-or-nosql-database-should-a-r-user-focus-on/

YouTube video: Proper steps to integrate MongoDB NoSQL database with R on Ubuntu Linux with RStudio


Why NoSQL?

When comparing RDBMSs such as Oracle, MySQL, or DB2 against NoSQL, I was one of those old geezers who thought NoSQL was another ‘go fast nowhere’ technology trend.

Boy, was I wrong! I looked at all the options including HBase/Hadoop, Cassandra, Redis, and MongoDB. I found HBase and Hadoop difficult to install. The R packages for these were useless as there were few installation documents. Cassandra was easy, but the client support for R either didn’t exist or didn’t work with the RCassandra package. Redis looked awesome and is still one of the best for R thanks to the doRedis R package. The Java sharding examples looked promising too.

Then I revisited everything and came across MongoDB. It seems to have all the advantages of Cassandra and Redis with decent R package support.

NoSQL options and R

After looking at all the options, my recommended NoSQL databases are Redis and MongoDB. Redis was easy to install, but MongoDB was tricky. My recommendation is to install MongoDB on a clean, current Ubuntu Linux Desktop version. All you need to do is type the following on the command line:

sudo apt-get install mongodb

… and you’re ready to go. It takes care of all the permissions, users, etc. It was so easy that I could have saved a day’s work without going through the hoops I did. A lesson learned, and the time I lost is time you can save!

MongoDB install obstacles and tricks

Here are some current tricks to save you even more time:

Install OpenJDK 6, not 7, as 7 will mess up the rJava install with R

If this happens, keep an eye out for a message telling you to check your Linux Java environment. It will say you need to run something like R CMD javareconf as root. This message appears in your R console during the rJava package install. My problem came down to Java 7 versus 6; switching to 6 resolved it.

Create your MongoDB mydb test data by following:

http://www.mongodb.org/display/DOCS/Tutorial
http://pseudofish.com/blog/2011/05/25/analysis-of-data-with-mongodb-and-r/

Once installed, you should be able to do the following within your R shell or RStudio, using RMongo:

> library("RMongo")

Loading required package: RUnit

> mongo <- mongoDbConnect("mydb")
> dbShowCollections(mongo)
[1] "system.indexes" "things"
> results <- dbGetQuery(mongo, "nutrient_metadatas", "{}", 0, 2)
> results <- dbGetQuery(mongo, "things", "{}", 0, 2)
> names(results)
[1] "X_id" "name"
> dbDisconnect(mongo)

Video: http://www.youtube.com/watch?v=DbhdKBx-lK0
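
For comparison with the Python examples earlier, the RMongo session above can be approximated with PyMongo. This is a sketch under assumptions: `peek` is a hypothetical helper, `db` is assumed to be a PyMongo database such as `MongoClient().mydb`, and the last two arguments of `dbGetQuery` are taken to be skip and limit.

```python
def peek(db, collection_name, skip=0, limit=2):
    """List the collections in `db` and fetch a few documents from one.

    Roughly mirrors dbShowCollections() followed by dbGetQuery() in the
    R session; returns (collection names, list of documents).
    """
    names = db.list_collection_names()
    if collection_name not in names:
        # Unknown collection: return the names so the caller can pick another.
        return names, []
    # An empty filter {} matches everything; skip and limit page the result.
    docs = list(db[collection_name].find({}, skip=skip, limit=limit))
    return names, docs
```

With the tutorial data loaded, `peek(MongoClient().mydb, "things")` should return the collection names along with the first two documents of `things`.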