Connecting to a DB2 database from R
Posted by The Average Investor on Jan 16, 2010
Unlike RMySQL and RSQLite there is no RDB2. However, I found it pretty straightforward connecting to a DB2 database using the JDBC driver and the RJDBC package.
For all this to work, DB2 should be setup to use TCPIP, which is not used by default. You need to set the DB2COMM DB2 environment variable to TCPIP:
db2set DB2COMM=TCPIP db2stop db2start
You also need to make sure that the dbm configuration parameter SVCENAME is set, either to a port number like 50000 or to a string like db2c_DB2.
C:\installed\sqllib\BIN>db2 get dbm cfg | grep SVCENAME TCP/IP Service name (SVCENAME) = db2c_DB2 SSL service name (SSL_SVCENAME) =
If SVCENAME (the line starting with “TCP/IP Service name” is not set you can set it to use port 50000, which is the default used by DB2.
db2 update dbm cfg using SVCENAME 50000 db2stop db2start
You also need to install the RJDBC package.
Then you have to determine the location of the DB2 JDBC driver. If DB2 is installed locally, the driver is already installed and it is located under sqllib\java. For example, I installed DB2 in C:\installed\sqllib, thus the driver I am going to use is C:\installed\sqllib\java\db2jcc4.jar.
Now it’s time to startup R, once the GUI comes up, we can load the package:
library(RJDBC)
Then we need to load the DB2 JDBC driver:
jcc = JDBC("com.ibm.db2.jcc.DB2Driver",
"c:/installed/sqllib/java/db2jcc4.jar")
At this point we can establish a database connection:
conn = dbConnect(jcc,
"jdbc:db2://localhost:50000/SAMPLE",
user="yourUserName",
password="yourPassword")
Let’s run a query (to select all records from the employee table) and store the results into a data frame:
rs = dbSendQuery(conn, "select * from employee") df = fetch(rs, -1)
Until now we were assuming that DB2 is installed. The alternative is to connect to a server running DB2. In this case you only need to install the JDBC driver from the IBM web site – the following video outlines the steps.
Stephen said
Hi,
Thank you for the code. I manage to connect to my DB2.
but I have a problem when executing the request. see below
can you help me?
> #load the RJDBC library
> library(RJDBC)
>
> #Then we need to load the DB2 JDBC driver:
> jcc = JDBC(“com.ibm.db2.jcc.DB2Driver”,”C:/Program Files/IBM/SQLLIB/java/db2jcc4.jar”)
> #connect to the Sample database
> conn = dbConnect(jcc,
+ “jdbc:db2://localhost:50000/SAMPLE”,
+ user=”db2admin”,
+ password=”db2admin”)
>
> #fetch the data from employee
> rs = dbSendQuery( conn, “select * from EMPLOYEE”)
Error in .verify.JDBC.result(r, “Unable to retrieve JDBC result set for “, :
Unable to retrieve JDBC result set for select * from EMPLOYEE (DB2 SQL Error: SQLCODE=-204, SQLSTATE=42704, SQLERRMC=DB2ADMIN.EMPLOYEE, DRIVER=4.13.80)
> df = fetch(rs, -1)
Error in is(object, Cl) :
error in evaluating the argument ‘res’ in selecting a method for function ‘fetch’: Error: object ‘rs’ not found
>
The Average Investor said
SQL204 (from the command line run db2 “? sql204″) is undefined object. Either there is no EMPLOYEE table, or it’s not in the DB2ADMIN schema.
Stephen said
I have look at that problem of Schema and I can’t find the solution.
DB2ADMIN is my username and password to my DB2 instance.
I looked at my database SAMPLE and found that EMPLOYEE is in the Schema US54321.
So I tried bSendQuery( conn, “select * from US54321.EMPLOYEE”)
and bSendQuery( conn, “select * from “US54321″.EMPLOYEE”)
but it doesn’t work. Do you have any suggestion?
My SAMPLE database is the demo from DB2 and I run on Windows 7 64bits
The Average Investor said
Not sure what is wrong, may be able to take a look over the weekend. Is the error code helpful in any way?
I found a small type in the steps in the post:
db2 set dbm cfg using SVCENAME 50000
should be
db2 update dbm cfg using SVCENAME 50000
Stephen said
Hi,
“db2 update dbm cfg using SVCENAME 50000″ didn’t help.
The problem come from the CURRENT SCHEMA.
If I connect with the user name and password that created the table, then I can fetch. But I can’t switch CURRENT SCHEMA??
I tried dbSendQuery(conn,”SET CURRENT SCHEMA US54321″) but I have the error
Error in .verify.JDBC.result(r, “Unable to retrieve JDBC result set for “, :
Unable to retrieve JDBC result set for SET CURRENT SCHEMA US54321([jcc][10103][10941][4.13.80] Method executeQuery cannot be used for update. ERRORCODE=-4476, SQLSTATE=null)
Instead i use RODBC package and it work with that code
library(RODBC)
#establish the connection
con<-odbcConnect("DB2SAMPLE",uid="db2admin",pwd="db2admin")
#search the table accessible
sqlTables(con)
sqlTables(con, tableType = "TABLE")
#put the Schema DK22551 in the serach path
a=odbcQuery(con,"SET CURRENT SCHEMA DK22551")
a #display the result of the query
res <- sqlFetch(con, "EMPLOYEE")
res #display the result of the query
#close the connection
odbcClose(con)
I would be interested to know if yo find a solution for the RJDBC package.
Cheers and thanks for the help
The Average Investor said
To set the schema, you need to use:
rs = dbSendUpdate(conn, “set current schema US54321″)
I don’t expect that this will solve the problem, because as you indicated your select (“select * from US54321.EMPLOYEE”) also failed. Might be some issues with security credentials.
Why don’t you try the following commands:
rs = dbSendUpdate(conn, “create table sss.ttt(c1 int, c2 varchar(50))”)
rs = dbSendUpdate(conn, “insert into sss.ttt values (1, ‘aaa’),(2, ‘bbb’)”)
rs = dbSendUpdate(conn, “set current schema sss”)
rs = dbSendQuery(conn, “select * from ttt”)
df = fetch(rs,-1)
df
Stephen said
Hi,
you were right, it was a user privilege issue.
1) I needed to add db2admin user into the user group DB2ADMNS: procedure here: http://publib.boulder.ibm.com/infocenter/db2luw/v9r5/index.jsp?topic=%2Fcom.ibm.db2.luw.qb.server.doc%2Fdoc%2Ft0011930.html
2) I needed to add DB2ADMNS group to the database SAMPLE and put the right plivileges. I have done it though “DB2 Control center software”
So now my code works with:
#Then we need to load the DB2 JDBC driver:
jcc = JDBC(“com.ibm.db2.jcc.DB2Driver”,”C:/Program Files/IBM/SQLLIB/java/db2jcc4.jar”)
#connect to the Sample database
conn = dbConnect(jcc,
“jdbc:db2://localhost:50000/SAMPLE”,
user=”db2admin”,
password=”db2admin”)
#fetch the data from employee
sqlst=”select * from US54321.EMPLOYEE”
rs = dbSendQuery( conn, sqlst)
df = fetch(rs, -1)
it works also with:
dbSendUpdate( conn, “set current schema US5431″)
sqlst=”select * from EMPLOYEE”
rs = dbSendQuery( conn, sqlst)
df = fetch(rs, -1)
Thank you for your help and I hope the comments will save time to others
Stephen said
Hi Again,
I discover that if you want to access to a schema that is not your loggin name, then you need to specify the shema in an SQL statement.
So if your connection is done with DB2ADMIN and with dbSendUpdate( conn, “set current schema MYSCHEMA″)
”select * from MYSCHEMA.EMPLOYEE” is working
”select * from EMPLOYEE” is NOT working