In this post I want to show you one way of taking a JSON file of data and loading it into your Oracle schema using ROracle. The JSON data will then be used to create a table in your schema. Yes you could use other methods to connect to the database and to create the table. But ROracle is by far the fastest method of connecting, selecting and processing data.
1. Necessary R Packages
You will need two R library. The first of these is the ROracle package. This gives us all the connection and data processing commands to work with the Oracle database. The second package is the jsonlite R package. This package allows us to open, read and process a file that has JSON data.
> install.package("ROracle")
> install.package("jsonlite")
After you have installed the packages you can now load them into your R environment so that you can use them in your current session.
> library(ROracle)
> library(jsonlite)
Depending on your version of R you may get some working messages about the libraries being built under a different version of R. Then again maybe you won't get these :-)
2. Open & Read the JSON file in R
Now you are ready to name and open the file that contains your JSON data. In my case the file is called 'demo_json_data.json'
> jsonFile <- "c:/app/demo_json_data.json"
> jsonData <- fromJSON(jsonFile)
We now have the JSON data loaded into R. We can now look at the attributes of each JSON record and the number of records that was in the JSON file.
> names(jsonData$items)
[1] "cust_id" "cust_gender" "age"
[4] "cust_marital_status" "country_name" "cust_income_level"
[7] "education" "occupation" "household_size"
[10] "yrs_residence" "affinity_card" "bulk_pack_diskettes"
[13] "flat_panel_monitor" "home_theater_package" "bookkeeping_application"
[16] "printer_supplies" "y_box_games" "os_doc_set_kanji"
> nrow(jsonData$items)
[1] 1500
As you can see the records are grouped under a higher label of 'items'. You might want to extract these records into a new data frame.
> data <- jsonData$items
>
Now we have our data ready in a data frame and we can use this data frame to create a table and insert the data.
3. Create the connection to the Oracle Schema
I have a previous post on connecting to an Oracle Schema using ROracle. That was connecting to an 11g Oracle Database.
JSON is a new feature in Oracle 12c and the connection details are a little bit different because we are now having to deal with connection to a pluggable database. The following illustrates connecting to a 12c database and assumes you have Oracle Client already installed and configured with your tnsnames.ora entry.
# Create the connection string
> host <- "localhost"
> port <- 1521
> service <- "pdb12c"
> connect.string <- paste(
"(DESCRIPTION=",
"(ADDRESS=(PROTOCOL=tcp)(HOST=", host, ")(PORT=", port, "))",
"(CONNECT_DATA=(SERVICE_NAME=", service, ")))", sep = "")
> con2 <- dbConnect(drv, username = "dmuser", password = "dmuser",dbname=connect.string)
>
4. Create the table in your Oracle Schema
At this point we have our connection to our Oracle Schema setup and connected, we have read in the JSON file and we have the JSON data in a data frame. We are now ready to push the JSON data to a table in our schema.
> dbWriteTable(con, "JSON_DATA", overwrite=TRUE, value=data)
Job done :-)
The table JSON_DATA has been created and the data is stored in the table in typical table attributes and rows format.
One thing to watch our for with the above command is with the overwrite=TRUE parameter setting. This replaces a table if it already exists. So your old data will be gone. Be careful.
5. View and Query the data using SQL
When you now log into your schema in the 12c Database, you can now query the data in the JSON_DATA table. (Yes I know it is not in JSON format in this table).
How did I get/generate my JSON data?
I generated the JSON file using a table that I already had in one of my schemas. This table is part of the sample data set that is built on top of the Oracle sample schemas.
The image below shows the steps involved in generating the data in JSON format. I used SQL Developer and set the SQLFORMAT to be JSON. I then ran the query to select the data. You will need to run this as a script. Then copy the JSON data and paste it into a file.
The SQL FORMAT command sets the output format for a query back to the default query output format that we are well use to.
A nice little JSON viewer can be found at http://jsonviewer.stack.hu/
Copy and paste your JSON data into this and you can view the structure of the data. Check it out.