Friday, May 8, 2015

Loading JSON data into Oracle using ROracle and jsonlite

In this post I want to show you one way of taking a JSON file of data and loading it into your Oracle schema using ROracle. The JSON data will then be used to create a table in your schema. Yes you could use other methods to connect to the database and to create the table. But ROracle is by far the fastest method of connecting, selecting and processing data.

1. Necessary R Packages

You will need two R library. The first of these is the ROracle package. This gives us all the connection and data processing commands to work with the Oracle database. The second package is the jsonlite R package. This package allows us to open, read and process a file that has JSON data.

> install.package("ROracle")

> install.package("jsonlite")

After you have installed the packages you can now load them into your R environment so that you can use them in your current session.

> library(ROracle)

> library(jsonlite)

Depending on your version of R you may get some working messages about the libraries being built under a different version of R. Then again maybe you won't get these :-)

2. Open & Read the JSON file in R

Now you are ready to name and open the file that contains your JSON data. In my case the file is called 'demo_json_data.json'

> jsonFile <- "c:/app/demo_json_data.json"

> jsonData <- fromJSON(jsonFile)

We now have the JSON data loaded into R. We can now look at the attributes of each JSON record and the number of records that was in the JSON file.

> names(jsonData$items)

[1] "cust_id" "cust_gender" "age"

[4] "cust_marital_status" "country_name" "cust_income_level"

[7] "education" "occupation" "household_size"

[10] "yrs_residence" "affinity_card" "bulk_pack_diskettes"

[13] "flat_panel_monitor" "home_theater_package" "bookkeeping_application"

[16] "printer_supplies" "y_box_games" "os_doc_set_kanji"

> nrow(jsonData$items)

[1] 1500

As you can see the records are grouped under a higher label of 'items'. You might want to extract these records into a new data frame.

> data <- jsonData$items

>

Now we have our data ready in a data frame and we can use this data frame to create a table and insert the data.

3. Create the connection to the Oracle Schema

I have a previous post on connecting to an Oracle Schema using ROracle. That was connecting to an 11g Oracle Database.

JSON is a new feature in Oracle 12c and the connection details are a little bit different because we are now having to deal with connection to a pluggable database. The following illustrates connecting to a 12c database and assumes you have Oracle Client already installed and configured with your tnsnames.ora entry.

# Create the connection string

> host <- "localhost"

> port <- 1521

> service <- "pdb12c"

> connect.string <- paste(

"(DESCRIPTION=",

"(ADDRESS=(PROTOCOL=tcp)(HOST=", host, ")(PORT=", port, "))",

"(CONNECT_DATA=(SERVICE_NAME=", service, ")))", sep = "")

> con2 <- dbConnect(drv, username = "dmuser", password = "dmuser",dbname=connect.string)

>

4. Create the table in your Oracle Schema

At this point we have our connection to our Oracle Schema setup and connected, we have read in the JSON file and we have the JSON data in a data frame. We are now ready to push the JSON data to a table in our schema.

> dbWriteTable(con, "JSON_DATA", overwrite=TRUE, value=data)

Job done :-)

The table JSON_DATA has been created and the data is stored in the table in typical table attributes and rows format.

One thing to watch our for with the above command is with the overwrite=TRUE parameter setting. This replaces a table if it already exists. So your old data will be gone. Be careful.

5. View and Query the data using SQL

When you now log into your schema in the 12c Database, you can now query the data in the JSON_DATA table. (Yes I know it is not in JSON format in this table).

NewImage

How did I get/generate my JSON data?

I generated the JSON file using a table that I already had in one of my schemas. This table is part of the sample data set that is built on top of the Oracle sample schemas.

The image below shows the steps involved in generating the data in JSON format. I used SQL Developer and set the SQLFORMAT to be JSON. I then ran the query to select the data. You will need to run this as a script. Then copy the JSON data and paste it into a file.

NewImage

The SQL FORMAT command sets the output format for a query back to the default query output format that we are well use to.

A nice little JSON viewer can be found at http://jsonviewer.stack.hu/

Copy and paste your JSON data into this and you can view the structure of the data. Check it out.

4 comments:

  1. Thanks Brendan the same steps I followed it works for Emp table extracted to Json but when I tried to use a new json file this steps does not seem to work, Does this json data set need to be extracted from Oracle database in order for this to work? or Any json file should be fine to load into Oracle tables

    Thanks
    Allwyn

    ReplyDelete
    Replies
    1. Hi You will need to check the structure of the json file. If it isn't complete then you will get an error, like what you are reporting. For example you might need to add [ to the beginning of the file and ] at the end. Only if these are missing. Or you might need a label too {"items":[ ........ ]} It all depends on how the json data was generated, so you will need to verify that the contents is correct. You can use the JSON viewer (see link above) to view the contents of the json file before trying to load it into your database. By using a viewer like this you can quickly see if there are any issues with the file and if anything additional is needed Brendan

      Delete
  2. Thanks Brendan, I used http://jsonlint.com/ to validate the json structure and it is valid Json file, since it is valid JSON file it has matching curly braces {} and square brackets as well, these json files are generated Maximo or other applications instance, I am sure it does not have "items" but Attributes instead.

    I am new to R so this package Roracle supports only json that is generated from relational tables or should it support any well formed Json files.

    Thanks
    Allwyn

    ReplyDelete
    Replies
    1. You need to be careful here. ROracle does not support JSON. What the article talks about is opening and reading a JSON file. This is all done using R. The JSON file will be loaded into an R data frame. From there we can use ROracle to write the R data frame to and Oracle table.

      If you are getting errors then you need to investigate what is causing this error. After opening and reading the file, has the R data frame been created correctly, is the data in the correct format, at the data types correct, are there any potential issues with the data, data types, data sizes, etc etc. before you try to write it to the database.

      If you get an error during the writing to the database using ROracle, then their is probably something wrong before you get to this point and you will need you investigate this.

      Delete