Is Python the new R?
Maybe, maybe not, but that I'm finding in recent months is more companies are asking me to use Python instead of R for some of my work.
In this blog post I will walk through the steps of setting up the Oracle driver for Python, called
cx_Oracle. The
documentation for this drive is good and detailed with plenty of examples available on
GitHub. Hopefully there isn't anything new in this post, but it is my experiences and what I did.
1. Install Oracle Client
The Python driver requires Oracle Client software to be installed. Go here, download and install. It's a straightforward install. Make sure the directories are added to the search path.
2. Download and install cx_Oracle
You can use pip3 to do this.
pip3 install cx_Oracle
Collecting cx_Oracle
Downloading cx_Oracle-6.1.tar.gz (232kB)
100% |████████████████████████████████| 235kB 679kB/s
Building wheels for collected packages: cx-Oracle
Running setup.py bdist_wheel for cx-Oracle ... done
Stored in directory: /Users/brendan.tierney/Library/Caches/pip/wheels/0d/c4/b5/5a4d976432f3b045c3f019cbf6b5ba202b1cc4a36406c6c453
Successfully built cx-Oracle
Installing collected packages: cx-Oracle
Successfully installed cx-Oracle-6.1
3. Create a connection in Python
Now we can create a connection. When you see some text enclosed in angled brackets <>, you will need to enter your detailed for your schema and database server.
# import the Oracle Python library
import cx_Oracle
# define the login details
p_username = ""
p_password = ""
p_host = ""
p_service = ""
p_port = "1521"
# create the connection
con = cx_Oracle.connect(user=p_username, password=p_password, dsn=p_host+"/"+p_service+":"+p_port)
# an alternative way to create the connection
# con = cx_Oracle.connect('/@/:1521')
# print some details about the connection and the library
print("Database version:", con.version)
print("Oracle Python version:", cx_Oracle.version)
Database version: 12.1.0.1.0
Oracle Python version: 6.1
4. Query some data and return results to Python
In this example the query returns the list of tables in the schema.
# define a cursor to use with the connection
cur = con.cursor()
# execute a query returning the results to the cursor
cur.execute('select table_name from user_tables')
# for each row returned to the cursor, print the record
for row in cur:
print("Table: ", row)
Table: ('DECISION_TREE_MODEL_SETTINGS',)
Table: ('INSUR_CUST_LTV_SAMPLE',)
Table: ('ODMR_CARS_DATA',)
Now list the Views available in the schema.
# define a second cursor
cur2 = con.cursor()
# return the list of Views in the schema to the cursor
cur2.execute('select view_name from user_views')
# display the list of Views
for result_name in cur2:
print("View: ", result_name)
View: ('MINING_DATA_APPLY_V',)
View: ('MINING_DATA_BUILD_V',)
View: ('MINING_DATA_TEST_V',)
View: ('MINING_DATA_TEXT_APPLY_V',)
View: ('MINING_DATA_TEXT_BUILD_V',)
View: ('MINING_DATA_TEXT_TEST_V',)
5. Query some data and return to a Panda in Python
Pandas are commonly used for storing, structuring and processing data in Python, using a data frame format. The following returns the results from a query and stores the results in a panda.
# in this example the results of a query are loaded into a Panda
# load the pandas library
import pandas as pd
# execute the query and return results into the panda called df
df = pd.read_sql_query("SELECT * from INSUR_CUST_LTV_SAMPLE", con)
# print the records returned by query and stored in panda
print(df.head())
CUSTOMER_ID LAST FIRST STATE REGION SEX PROFESSION \
0 CU13388 LEIF ARNOLD MI Midwest M PROF-2
1 CU13386 ALVA VERNON OK Midwest M PROF-18
2 CU6607 HECTOR SUMMERS MI Midwest M Veterinarian
3 CU7331 PATRICK GARRETT CA West M PROF-46
4 CU2624 CAITLYN LOVE NY NorthEast F Clerical
BUY_INSURANCE AGE HAS_CHILDREN ... MONTHLY_CHECKS_WRITTEN \
0 No 70 0 ... 0
1 No 24 0 ... 9
2 No 30 1 ... 2
3 No 43 0 ... 4
4 No 27 1 ... 4
MORTGAGE_AMOUNT N_TRANS_ATM N_MORTGAGES N_TRANS_TELLER \
0 0 3 0 0
1 3000 4 1 1
2 980 4 1 3
3 0 2 0 1
4 5000 4 1 2
CREDIT_CARD_LIMITS N_TRANS_KIOSK N_TRANS_WEB_BANK LTV LTV_BIN
0 2500 1 0 17621.00 MEDIUM
1 2500 1 450 22183.00 HIGH
2 500 1 250 18805.25 MEDIUM
3 800 1 0 22574.75 HIGH
4 3000 2 1500 17217.25 MEDIUM
[5 rows x 31 columns]
6. Wrapping it up and closing things
Finally we need to wrap thing up and close our cursors and our connection to the database.
# close the cursors
cur2.close()
cur.close()
# close the connection to the database
con.close()
Useful links
cx_Oracle website
cx_Oracle documentation
cx_Oracle examples on GitHub
Watch out for more blog posts on using Python with Oracle, Oracle Data Mining and Oracle R Enterprise.