Wednesday, May 27, 2015

R (ROracle) and Oracle DATE formats

When it comes to working with R to access and process your data, there are a number of little features and behaviours you need to look out for.

One of these is the DATE datatype.

The main issue that you have to look out for is the TIMEZONE conversion that happens when you extract the data from the database into your R environment.

There is a datatype conversion from the Oracle DATE into the POSIXct format. The POSIXct datatype also includes a timezone, but the Oracle DATE datatype does not have a timezone component.

When you look into this a bit more you will see that the main issue is what timezone your R session has. By default your R session will inherit the OS session timezone. For me here in Ireland we have the same timezone as the UK. You would think that the timezone would therefore be GMT. But this is not the case. What we have for a timezone is BST (British Summer Time) and this takes daylight saving time into account. So on the 26th May, BST is one hour ahead of GMT.
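You can check what timezone your R session has picked up using a couple of base R calls (a minimal sketch; the values returned will depend on your own machine and OS settings):

> Sys.timezone()       # the timezone R has inherited from the OS
> Sys.getenv("TZ")     # the TZ environment variable, often empty by default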

OK. Let's have a look at a sample scenario.

The Problem

As mentioned above, when I select data of type DATE from Oracle into R, using ROracle, I end up getting a different date value than what was in the database. Similarly when I process and store the data.

The following outlines the data setup and some of the R code that was used to generate the issue/problem.

Data Set-up
Create a table that contains a DATE field and insert some records.

CREATE TABLE STAFF
    (STAFF_NUMBER VARCHAR2(20),
       FIRST_NAME VARCHAR2(20),
       SURNAME VARCHAR2(20),
       DOB DATE,
       PROG_CODE VARCHAR2(6 BYTE),
       PRIMARY KEY (STAFF_NUMBER));

insert into staff values (123456789, 'Brendan', 'Tierney', to_date('01/06/1975', 'DD/MM/YYYY'), 'DEPT_1');
insert into staff values (234567890, 'Sean', 'Reilly', to_date('21/10/1980', 'DD/MM/YYYY'), 'DEPT_2');
insert into staff values (345678901, 'John', 'Smith', to_date('12/03/1973', 'DD/MM/YYYY'), 'DEPT_3');
insert into staff values (456789012, 'Barry', 'Connolly', to_date('25/01/1970', 'DD/MM/YYYY'), 'DEPT_4');


You can query this data in SQL without any problems. As you can see there is no timezone element to these dates.

Selecting the data
I now establish my connection to my schema in my 12c database using ROracle. I won't bore you with the details here of how to do it but check out point 3 on this post for some details.

When I select the data I get the following.

> res <- dbSendQuery(con, "select * from staff")
> data <- fetch(res)
> data$DOB
[1] "1975-06-01 01:00:00 BST" "1980-10-21 01:00:00 BST" "1973-03-12 00:00:00 BST"
[4] "1970-01-25 01:00:00 BST"


As you can see, two things have happened to my date data when it was extracted from Oracle. Firstly, a timezone has been assigned to the data, even though there was no timezone part of the original data. Secondly, some sort of timezone conversion has been performed from GMT to BST. The difference between GMT and BST is daylight saving time. Hence the 01:00:00 being added to the time element that was extracted. This time should have been 00:00:00. You can see we have a mixture of times!

So there appears to be some difference between the date and timezone used by R and what is being used in Oracle.

To add to this problem I was playing around with some dates and different records. I kept on getting this scenario but I also got the following, where we have a mixture of GMT and BST times and timezones. I'm not sure why we would get this mixture.

> data$DOB
[1] "1995-01-19 00:00:00 GMT" "1965-06-20 01:00:00 BST" "1973-10-20 01:00:00 BST"
[4] "2000-12-28 00:00:00 GMT"


This is all a bit confusing and annoying. So let us look at how you can now fix this.

The Solution
Fixing the problem : Setting Session variables
To fix this, and to ensure that there is consistency between what is in Oracle and what is read out and converted into R (POSIXct) format, you need to define two session variables. These session variables are used to ensure consistency in the date and time conversions.

These session variables are TZ, for the R session timezone setting, and the Oracle ORA_SDTZ setting, for specifying the timezone to be used for your Oracle connections.

The trick here is that these session variables need to be set before you create your ROracle connection. The following is the R code to set these session variables.

> Sys.setenv(TZ = "GMT")
> Sys.setenv(ORA_SDTZ = "GMT")

So you really need to have some knowledge of what kind of dates you are working with in the database and whether a timezone is part of them or is important. Alternatively you could set the above variables to UTC.
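For completeness, here is a minimal sketch of the full sequence, with the session variables set before the ROracle connection is created. The username, password and service name are just examples, and the service name is assumed to be an entry in your tnsnames.ora.

> Sys.setenv(TZ = "GMT")
> Sys.setenv(ORA_SDTZ = "GMT")
> drv <- dbDriver("Oracle")
> con <- dbConnect(drv, username = "dmuser", password = "dmuser", dbname = "pdb12c")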

Selecting the data (correctly this time)
After reconnecting, or creating a new connection to your Oracle schema, when we select our data from our table we now get the following.

> data$DOB
[1] "1975-06-01 GMT" "1980-10-21 GMT" "1973-03-12 GMT" "1970-01-25 GMT"
Now you can see we do not have any time element to the dates and this is correct in this example. So all is good.

We can now update the data and do whatever processing we want with the data in our R script.

But what happens when we save the data back to our Oracle schema? In the following R code we will add 2 days to the DOB attribute and then create a new table in our schema to save the updated data.

> data$DOB
[1] "1975-06-01 GMT" "1980-10-21 GMT" "1973-03-12 GMT" "1970-01-25 GMT"

> data$DOB <- data$DOB + days(2)
> data$DOB
[1] "1975-06-03 GMT" "1980-10-23 GMT" "1973-03-14 GMT" "1970-01-27 GMT"


> dbWriteTable(con, "STAFF_2", data, overwrite = TRUE, row.names = FALSE)
[1] TRUE


I've used the R package lubridate to do the date and time processing.
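If you are following along, the package needs to be loaded before the days() function is available. A minimal sketch:

> library(lubridate)                 # provides days(), months(), years() etc.
> data$DOB <- data$DOB + days(2)     # add 2 days to each date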

When we look at this newly created table in our Oracle schema we will see that we don't have a DATE datatype for DOB; instead it has been created using a TIMESTAMP data type.







If you are working with TIMESTAMP and similar data types (i.e. data types that have a timezone element as part of them) then that is a slightly different problem. Perhaps one that I'll look at soonish.

Thursday, May 14, 2015

Extracting Oracle data & Generating JSON data file using ROracle

In a previous blog post I showed you how to take a JSON data file and to load it into your Oracle Schema using R. To do this I used ROracle to connect to the database and jsonlite to do the JSON processing of the data.
Alternatives to using ROracle would be RODBC, RJDBC and DBI. So you could use one of these to connect to the database.
In this post I want to show you how to extract data from an Oracle table (or view) and to output it to a file in JSON format. Again I will be using the jsonlite R package to perform all the JSON formatting work for me.
1. Connect to the Database
This is the same connect setup that I used in the previous post.
# initialise the packages
> library(ROracle)
> library(jsonlite)
# Create the connection string
> drv <- dbDriver("Oracle")
> host <- "localhost"
> port <- 1521
> service <- "pdb12c"
> connect.string <- paste(
"(DESCRIPTION=",
"(ADDRESS=(PROTOCOL=tcp)(HOST=", host, ")(PORT=", port, "))",
"(CONNECT_DATA=(SERVICE_NAME=", service, ")))", sep = "")
# establish the connection
> con <- dbConnect(drv, username = "dmuser", password = "dmuser", dbname = connect.string)





2. Read the data from the table/view
Read the data from the table into a R data frame.
> rs <- dbSendQuery(con, "select * from mining_data_build_v")
> data <- fetch(rs)
> dim(data)
[1] 1500 18
> head(data, 3)
CUST_ID CUST_GENDER AGE CUST_MARITAL_STATUS COUNTRY_NAME
1 101501 F 41 NeverM United States of America
2 101502 M 27 NeverM United States of America
3 101503 F 20 NeverM United States of America
CUST_INCOME_LEVEL EDUCATION OCCUPATION HOUSEHOLD_SIZE YRS_RESIDENCE AFFINITY_CARD
1 J: 190,000 - 249,999 Masters Prof. 2 4 0
2 I: 170,000 - 189,999 Bach. Sales 2 3 0
3 H: 150,000 - 169,999 HS-grad Cleric. 2 2 0
BULK_PACK_DISKETTES FLAT_PANEL_MONITOR HOME_THEATER_PACKAGE BOOKKEEPING_APPLICATION
1 1 1 1 1
2 1 1 0 1
3 1 0 0 1
PRINTER_SUPPLIES Y_BOX_GAMES OS_DOC_SET_KANJI
1 1 0 0
2 1 1 0
3 1 1 0


We now have the data from the table in a data frame called data. We can now use this data frame to convert the data into JSON.
3. Convert into JSON format
To produce the JSON formatted output of the data in our table (or view) we can use the toJSON function, which produces the JSON data as an R string.
> jsonData <- toJSON(data)
> jsonData
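By default toJSON produces a compact single string. If you want something that is easier to read you can ask jsonlite to indent the output, which is a small optional tweak:

> jsonData <- toJSON(data, pretty = TRUE)    # indented, human readable JSON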

4. Create the JSON file
We are now ready to output the formatted JSON data out to file. We can use the R function 'write' to write the JSON data out to a file.
> write(jsonData, file="c:/app/demo_json_data2.json")
Job Done!
5. Verify the JSON data was created correctly
To verify that the JSON data file was created correctly, we can use the steps outlined in my previous post to read in the file. If all is correct then we should get no errors.
> jsonFile <- "c:/app/demo_json_data2.json"
> jsonData <- fromJSON(jsonFile)
> str(jsonData)
> names(jsonData)
> nrow(jsonData)


You will notice that there is one difference between the code shown above and what I showed in my previous example/blog post. This time we don't have an extra wrapper class of Items.
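If you did want the records wrapped in an items element, like in the file from my previous post, one way (a minimal sketch) is to wrap the data frame in a named list before converting it:

> jsonData <- toJSON(list(items = data))     # produces {"items": [ ... ]}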
Generating JSON data - Using SQL Developer
In my previous post I showed you one way of generating a JSON file based on the data in a table. You could do that using SQL Developer and SQLcl.
An alternative is to use the Table Export feature to export the data in JSON format.
To do this right click on the table (or view) and select Export from the drop down menu.
The Export Wizard will open. De-select the Export DDL tick box. In the export data section change the format drop-down to JSON. Then enter the location and file name for the JSON file. Then click the next buttons until you are finished.
Blog json

Friday, May 8, 2015

Loading JSON data into Oracle using ROracle and jsonlite

In this post I want to show you one way of taking a JSON file of data and loading it into your Oracle schema using ROracle. The JSON data will then be used to create a table in your schema. Yes you could use other methods to connect to the database and to create the table. But ROracle is by far the fastest method of connecting, selecting and processing data.

1. Necessary R Packages

You will need two R packages. The first of these is the ROracle package. This gives us all the connection and data processing commands to work with the Oracle database. The second package is the jsonlite R package. This package allows us to open, read and process a file that has JSON data.

> install.packages("ROracle")

> install.packages("jsonlite")

After you have installed the packages you can now load them into your R environment so that you can use them in your current session.

> library(ROracle)

> library(jsonlite)

Depending on your version of R you may get some working messages about the libraries being built under a different version of R. Then again maybe you won't get these :-)

2. Open & Read the JSON file in R

Now you are ready to name and open the file that contains your JSON data. In my case the file is called 'demo_json_data.json'

> jsonFile <- "c:/app/demo_json_data.json"

> jsonData <- fromJSON(jsonFile)

We now have the JSON data loaded into R. We can now look at the attributes of each JSON record and the number of records that were in the JSON file.

> names(jsonData$items)

[1] "cust_id" "cust_gender" "age"

[4] "cust_marital_status" "country_name" "cust_income_level"

[7] "education" "occupation" "household_size"

[10] "yrs_residence" "affinity_card" "bulk_pack_diskettes"

[13] "flat_panel_monitor" "home_theater_package" "bookkeeping_application"

[16] "printer_supplies" "y_box_games" "os_doc_set_kanji"

> nrow(jsonData$items)

[1] 1500

As you can see the records are grouped under a higher label of 'items'. You might want to extract these records into a new data frame.

> data <- jsonData$items

>

Now we have our data ready in a data frame and we can use this data frame to create a table and insert the data.

3. Create the connection to the Oracle Schema

I have a previous post on connecting to an Oracle Schema using ROracle. That was connecting to an 11g Oracle Database.

JSON is a new feature in Oracle 12c and the connection details are a little bit different because we now have to deal with connecting to a pluggable database. The following illustrates connecting to a 12c database and assumes you have the Oracle Client already installed and configured with your tnsnames.ora entry.

# Create the connection string

> host <- "localhost"

> port <- 1521

> service <- "pdb12c"

> connect.string <- paste(

"(DESCRIPTION=",

"(ADDRESS=(PROTOCOL=tcp)(HOST=", host, ")(PORT=", port, "))",

"(CONNECT_DATA=(SERVICE_NAME=", service, ")))", sep = "")

> drv <- dbDriver("Oracle")

> con <- dbConnect(drv, username = "dmuser", password = "dmuser", dbname = connect.string)

>

4. Create the table in your Oracle Schema

At this point we have our connection to our Oracle Schema setup and connected, we have read in the JSON file and we have the JSON data in a data frame. We are now ready to push the JSON data to a table in our schema.

> dbWriteTable(con, "JSON_DATA", overwrite=TRUE, value=data)

Job done :-)

The table JSON_DATA has been created and the data is stored in the table in typical table attributes and rows format.

One thing to watch out for with the above command is the overwrite=TRUE parameter setting. This replaces the table if it already exists, so your old data will be gone. Be careful.
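If you are worried about clobbering an existing table, a more defensive sketch is to check for the table first and append to it instead of overwriting. dbExistsTable and the append parameter are standard DBI-style calls supported by ROracle.

> if (dbExistsTable(con, "JSON_DATA")) {
+     dbWriteTable(con, "JSON_DATA", value = data, append = TRUE)   # add rows to the existing table
+ } else {
+     dbWriteTable(con, "JSON_DATA", value = data)                  # create the table the first time
+ }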

5. View and Query the data using SQL

When you now log into your schema in the 12c Database, you can now query the data in the JSON_DATA table. (Yes I know it is not in JSON format in this table).

NewImage

How did I get/generate my JSON data?

I generated the JSON file using a table that I already had in one of my schemas. This table is part of the sample data set that is built on top of the Oracle sample schemas.

The image below shows the steps involved in generating the data in JSON format. I used SQL Developer and set the SQLFORMAT to be JSON. I then ran the query to select the data. You will need to run this as a script. Then copy the JSON data and paste it into a file.
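As a rough sketch of those steps (run it as a script in the SQL Developer worksheet or in SQLcl; the view name is just the one used elsewhere in these posts):

set sqlformat json
select * from mining_data_build_v;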

NewImage

The SET SQLFORMAT command can also be used to set the output format for a query back to the default query output format that we are well used to.

A nice little JSON viewer can be found at http://jsonviewer.stack.hu/

Copy and paste your JSON data into this and you can view the structure of the data. Check it out.

Monday, May 4, 2015

Oracle Data Miner (ODM 4.1) New Features

With the release of SQL Developer 4.1 we also get a number of new features with Oracle Data Miner (ODMr). These include:

  • Data Source node can now include data sources that contain JSON data, can generate a JSON schema, and has a JSON viewer
  • Create Table can now create data in JSON
  • JSON Query Node allows you to view, query and process JSON data, combine it with relational data, generate sub-group by columns, and allow nested columns to be part of the input to algorithms
  • New PL/SQL APIs for managing Data Miner projects and workflows. This includes run, cancel, rename, delete, import and export of workflows using PL/SQL.
  • New ODMr Repository views that allow us to query and monitor our workflows.
  • Transformation Node now allows you different ways of handling NULLS.
  • Transformation Node now allows us to create Custom Bins, define bin labels and bin values
  • Overall Workflow and ODMr environment improvements to allow for greater efficiency in workflow behaviour and interactions with the database. So using ODMr should feel quicker and more responsive.

Watch out for the gotchas: Although support for JSON has been added to ODMr, as outlined above, you are still a bit limited in what else you can do with your JSON data. Based on the documentation you can use JSON data in the Association and Classification build nodes.

I'm not sure about the other nodes and this will need a bit of investigation to see what nodes can and cannot use JSON data. I'm sure this will all be sorted out in the next release.

Keep an eye out for some blog posts over the coming weeks on how to explore and use these new features of Oracle Data Miner.

SQL Developer 4.1 : ODM Repository upgrade

Earlier today (4th May) SQL Developer 4.1 was released :-)

For those of you who use the Oracle Data Miner tool (that is part of SQL Developer) you will need to upgrade your repository. The following steps will walk you through the process.

1. Download SQL Developer (you do need to have Java 8 installed). This download does not come with the JRE built into it; that version usually comes a few days after the release.

2. Unzip the downloaded file and copy the extracted directory to where you like to keep your applications etc.

3. Start up SQL Developer by running the sqldeveloper.exe file. This will be located in the extracted folder \sqldeveloper-4.1.0.19.07-no-jre\sqldeveloper

4. If you have had a previous install of SQL Developer you will be asked if you want to migrate your current settings. Click on the Yes button and all your connections and settings will be migrated.

Blog odm4 1 1

5. To upgrade your Oracle Data Miner (ODMr) repository, you will need to open one of your ODMr connections. When you do this ODMr will check to see if the repository in your database needs to be updated. If it does you will get the following window.

NewImage

6. Enter the password for SYS

NewImage

7. When you get the following window you can click on the Start button to begin the Oracle Data Miner repository upgrade.

NewImage

NewImage

8. After a couple of minutes (depending on the number of ODM workflows and ODM schemas you have) you will get the following window.

NewImage

Congratulations. You have now upgraded your Oracle Data Miner repository.

If you do encounter any errors during the upgrade of the repository then you should get onto the OTN Forum for Oracle Data Miner and report the errors. The Oracle Data Miner team monitor this forum and will get back to you quickly with a response.

Thursday, April 30, 2015

Turning down the always on aspects of Life

There have been various ongoing discussions about the ever growing “always on” or “always available” aspect of modern life.

Most of us do something like the following each day:

  • check email, twitter, Facebook, etc before getting out of bed
  • checking and responding to emails and various social media before you leave the house
  • spending a long day in the office or moving between multiple clients
  • spending your lunch time on email and social media
  • go home after a long day and open the laptop/tablet etc to answer more emails.
  • final check of emails and social media before you go to sleep

Can you remember the last time when you finished work for the day and went home that you were actually finished with work for the day?

A couple of years ago I did a bit of an experiment that I called ‘Anti Social Wednesdays'

That experiment was actually very successful but after a few months I did end up going back online. But the amount of work I achieved was huge and during some of that time I started to write my first book.

I’m now going to start out on a new experiment. I’m going to turn off/delete all work related email accounts from my phone, tablet etc. I’ll leave them on my work laptops (obviously).

What this means is that when I want to look up something on the internet or use an app on my phone, I’m no longer going to be tempted into checking emails and doing one of those quick replies.

So now I won’t be getting any of those “annoying" emails just before you go to sleep, or when you are on a day out with the family, etc.

If something is really, really important and you need a response really quickly then you can phone me. Otherwise you are going to have to wait until I open my laptop to find your email.

Anyone else interested in joining in on this little experiment? Let me know.

Viewing Models Details for Decision Trees using SQL

When you are working with and developing Decision Trees by far the easiest way to visualise these is by using the Oracle Data Miner (ODMr) tool that is part of SQL Developer.
Developing your Decision Tree models using the ODMr allows you to explore the decision tree produced, to drill in on each of the nodes of the tree and to see all the statistics etc that relate to each node and branch of the tree.
But when you are working with the DBMS_DATA_MINING PL/SQL package and with the SQL commands for Oracle Data Mining you don't have the same luxury of the graphical tool that we have in ODMr. For example, here is an image of part of a Decision Tree that was developed using ODMr.
Blog dt 1
What if we are not using the ODMr tool? In that case you will be using SQL and PL/SQL. When using these you do not have the luxury of viewing the Decision Tree.
So what can you see of the Decision Tree? Most of the model details can be used by a variety of functions that can apply the model to your data. I've covered many of these over the years on this blog.
For most of the data mining algorithms there is a PL/SQL function available in the DBMS_DATA_MINING package that allows you to see inside the models to find out the settings, rules, etc. Most of these functions have a name something like GET_MODEL_DETAILS_XXXX, where XXXX relates to the algorithm. For example GET_MODEL_DETAILS_NB will get the details of a Naive Bayes model. But when you look through the list there doesn't seem to be one for Decision Trees.
Actually there is, and it is called GET_MODEL_DETAILS_XML. This function takes one parameter, the name of the Decision Tree model, and produces an XML formatted output that contains the attributes used by the model, the overall model settings, and then, for each node and branch, the attributes and values used along with the other statistical measures for that node/branch.
The following SQL uses this PL/SQL function to get the Decision Tree details for model called CLAS_DT_1_59.
SELECT dbms_data_mining.get_model_details_xml('CLAS_DT_1_59')
FROM dual;

If you are using SQL Developer you will need to double click on the output column and click on the pencil icon to view the full listing.
Blog dt 2
Nothing too fancy like what we get in ODMr, but it is something that we can work with.
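If you want the XML in a more readable, indented form outside of SQL Developer, one option (a sketch, assuming the standard XMLSERIALIZE function, since GET_MODEL_DETAILS_XML returns an XMLType) is to serialise the result to a CLOB:

SELECT XMLSERIALIZE(DOCUMENT
         dbms_data_mining.get_model_details_xml('CLAS_DT_1_59')
         AS CLOB INDENT) AS dt_details
FROM dual;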
If you examine the XML output you will see references to PMML. This refers to the Predictive Model Markup Language (PMML) and this is defined by the Data Mining Group (www.dmg.org). I will discuss the PMML in another blog post and how you can use it with Oracle Data Mining.

Friday, April 24, 2015

Changing REVERSE Transformations in Oracle Data Miner

In my previous blog post I showed you how you can have a look at the transformations that the Automatic Data Preparation (ADP) feature of Oracle Data Mining produces. I also gave some examples of the different types of ADP transformations that are performed for different algorithms.

One of the features of the transformations produced is that it will generate a REVERSE_EXPRESSION. This will take the scored results and apply the inverse of the transformation that was performed when the data was being prepared for input to the algorithm.

Sometimes you may want to have the scored data returned or labeled in a slightly different way.

In this blog post I will show you how to define an alternative REVERSE_EXPRESSION for an attribute.

The function we need to use for this is the ALTER_REVERSE_EXPRESSION procedure that is part of the DBMS_DATA_MINING package.

When we score data for a typical classification problem we typically use 0 (zero) and 1 to be the target variable values. But what if we wanted the output from our classification model to label the scored data slightly differently?

We can use the ALTER_REVERSE_EXPRESSION procedure to define the new values. What if we wanted the zero to be labeled as NO and the 1 as YES? Then we can use the following.

BEGIN

    dbms_data_mining.alter_reverse_expression(

       model_name => 'CLAS_NB_1_59',

       expression => 'decode(affinity_card, ''1'', ''YES'', ''NO'')',

       attribute_name => 'AFFINITY_CARD');

END;

When we view the transformations for our data mining model we can now see the transformation.
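We can view this using the same GET_MODEL_TRANSFORMATIONS function from my previous blog post, passing in the model name used above:

SELECT * FROM TABLE(DBMS_DATA_MINING.GET_MODEL_TRANSFORMATIONS('CLAS_NB_1_59'));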

Blog dat trans 3

Now when we score our data the predicted target variable will now have our newly defined values.

SELECT cust_id,

        PREDICTION(CLAS_NB_1_59 USING *) PRED

FROM mining_data_apply_v

FETCH FIRST 5 ROWS ONLY;

Blog dat trans 4

You can see that this is a very powerful feature and allows us to return the scored data values in a different way to make them more useful. This is particularly the case as we work towards a more automatic type of Predictive Analytics.

Saturday, April 18, 2015

ODM : View Transformations generated by Automatic Data Preparation

A very powerful feature of Oracle Data Mining and one that I think does not get enough notice is called Automatic Data Preparation.

Data Preparation is one of the most time consuming, repetitive and boring parts of the work that a Data Miner or Data Scientist performs as part of their daily tasks. Apart from gathering the data, integrating the data, and getting the data into the required format, the most interesting part of the work is feature engineering.

Then you have all the other boring data preparation tasks of how to handle missing data, type conversion, binning, normalization, outlier treatment etc.

With Automatic Data Preparation (ADP) in Oracle Data Mining you can let Oracle work all of these things out for you, perform all the necessary coding, and store all of this coding as part of the in-database data mining model.

This is fantastic. This ADP feature can save you hours, and in some cases days, of effort.

But (there is always a but :-) ) what if you are a bit unsure that the transformations being performed are exactly what you would want? Maybe you would like to see what Oracle is doing and, depending on this, do it a different way.

The first step is to examine the transformations that are generated and stored as part of the in-database data mining model. The DBMS_DATA_MINING package has a function called GET_MODEL_TRANSFORMATIONS. When you query this function, passing in the name of the data mining model, you get back the list of transformations that have been applied to that model.

In the following example a GLM model was created using the Oracle Data Miner tool (that is part of SQL Developer). When you use Oracle Data Miner, ADP is automatically turned on.
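If you want to confirm that ADP was turned on for a given model, one way (a sketch using the standard mining model settings data dictionary view) is to look for the PREP_AUTO setting:

SELECT setting_name, setting_value
FROM   all_mining_model_settings
WHERE  model_name = 'CLAS_GLM_1_59'
AND    setting_name = 'PREP_AUTO';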

The following query calls the GET_MODEL_TRANSFORMATIONS function with the data mining model called CLAS_GLM_1_59.

SELECT * FROM TABLE(DBMS_DATA_MINING.GET_MODEL_TRANSFORMATIONS('CLAS_GLM_1_59'));

The following image contains the output generated by this query.

Blog dat trans 1

When you look at the data under the EXPRESSION column we get to see what the ADP did to the data. In most cases there is just some simple data clean-up and formatting being performed to get the data ready for input into the algorithm.

If we now look at the Naive Bayes model for the same data set we get a very different set of transformations listed under the EXPRESSION column.

SELECT * FROM TABLE(DBMS_DATA_MINING.GET_MODEL_TRANSFORMATIONS('CLAS_NB_1_59'));

Blog dat trans 2

Now we get to see some of the data binning that ADP performs and that is required for input to the Naive Bayes algorithm. You will also notice that we have some transformations in the REVERSE_EXPRESSION column. These are the inverse or reverse of the transformations generated in the EXPRESSION column.

I will let you explore the data transformations that are produced by ADP for the SVM and Decision Tree algorithms.

I will show you how you can change the reverse expression in my next blog post, as there are times when you might want the data to be presented slightly differently after the model has been run to score your data.

To get more details of what Automatic Data Preparation performs for each data mining algorithm you can check out this link in the 11g documentation. This section seems to be missing from the online 12c documentation.

Friday, April 3, 2015

Oracle Magazine - Fall 1987

The headline articles of Oracle Magazine for Fall 1987 (the Second issue) were on Enhancing cellular communications for Canadian Cellular, using Oracle to combat predatory starfish on Australia's Great Barrier Reef and breaking the 640K barrier using powerful microcomputers.
This was a bumper issue in comparison to the very first edition.

OM Fall 1987
Other articles included:
  • Oracle (International) User Week was held during the week of the 27th September. This coincided with Oracle's 10th anniversary, and had over 1,000 attendees. That is a bit of a difference from the numbers that Oracle Open World now gets!
  • Oracle posted record revenues and earnings. Fiscal year 1986 had revenue of $55.4 million and fiscal year 1987 had revenue of $131.3 million, with fourth quarter revenue of $50 million.
  • AHOLD, a leading Dutch supermarket chain is using the Oracle database to automate 500 of its Albert Heijn (NL) supermarkets.
  • Finnish Defense Ministry deploys the Oracle database system.
  • SQL*Menu was released. Its main aim was to allow application designers to unify applications built with SQL*Forms, SQL*Report, SQL*Plus and other applications.
  • Loews Anatole Hotel in Dallas had recently checked in over 3,100 guests and checked out 2,900 guests in one day. All of this processing was done using an Oracle Database.
  • Over 120 competitions in 16 days. More than 2,600 athletes from 51 countries. An estimated 1.6 million spectators. Volunteers, scorers and worldwide media. Organising and managing the 1988 Winter Olympics in Calgary was all done using an Oracle database.
  • There was an article on writing SQL by Richard Finkelstein, using some of the set operators and how to update data using a nested query.
To view the cover page and the table of contents click on the image at the top of this post or click here.
My Oracle Magazine Collection can be found here. You will find links to my blog posts on previous editions and a PDF for the very first Oracle Magazine from June 1987.

Tuesday, March 31, 2015

OTech article on Predictive Queries

Last week the Spring 2015 edition of OTech Magazine was published.

Check out the link to it here.

Otech 1

I was lucky to have an article accepted and published in this edition and the topic of the article was on Predictive Queries.

I've given a presentation on Predictive Queries at a few Oracle User Group conferences over the past 6 months or so, and this article covers what I talk about in that presentation.

The article covers what Predictive Queries are about and goes through some examples of how you can use them. Again, I give some of these examples in my presentation.
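To give a flavour of what a Predictive Query looks like, here is a minimal sketch using the sample view from my other posts. The PREDICTION analytic function with an OVER clause builds the transient model for you; the TO_CHAR is there so the target is treated as a classification target, and the column alias is just illustrative.

SELECT cust_id,
       PREDICTION(FOR TO_CHAR(affinity_card) USING *) OVER () pred_affinity_card
FROM   mining_data_apply_v;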

Now is your chance to try out Predictive Queries using the examples in the articles.

Otech 2

Recently I recorded a very short video with Bob Hubbard of OTN on this topic as part of his 2 Minute Tech Tips. Check out my blog post about this video and view the video.

OTN tech tip

Thursday, March 12, 2015

Automatic Analytics is so mainstream. Not something new.

Everyone is doing advanced analytics. Right? Hmm

Everyone is talking about advanced analytics? Yes that is true.

Everyone is an expert in advanced analytics? This is so not true. Watch out for these Great Pretenders. You know what I mean! You know who I mean! Maybe you know some of them already? If not, watch out for these Great Pretenders!!!

Some people are going around talking about data mining, predictive analytics, advanced analytics, machine learning etc as if this is some new topic. Well it isn't. It isn't anything new and most of the techniques have been around for 10, 20, 30+ years.

Some people are saying you should only use language X or tool Y, because everything else is basically rubbish.

What we do have is a wider understanding of how to use these techniques on our various data sources.

What we have is a lot more tools that allow us to perform these tasks a lot easier, at greater speed, with more functionality and without the need to fully understand the hard core maths that is going on behind the scenes.

What we have is a lot more languages to perform these tasks and to support the vast amount of work that goes into understanding the data and preparing the data.

Something for all of us to watch out for, when we read about these topics, is what kind of problem area they are addressing. The following table illustrates the three main types or categories of analytics. These categories are Descriptive Analytics, Predictive Analytics and Prescriptive Analytics. I think most people would agree that the Descriptive and Predictive Analytics categories are very mature at this stage. With Prescriptive Analytics we are perhaps still evolving and a lot more work needs to be done before it becomes widespread.

Blog 1

Some people talk as if Predictive Analytics is some new and exciting topic. But it isn't all that new. It has been around for the past 30+ years. If you go back over the Gartner Hype Cycle that comes out every September, Predictive Analytics is no longer shown on this graph. The last time it appeared on the Gartner Hype Cycle was back in 2013 and it was positioned on the far right of the graph in the section called the Plateau of Productivity.

So Predictive Analytics is very mature and mainstream. Part of the reason that it is mainstream is that Predictive Analytics has allowed a new category of Analytics to evolve, and this is Automatic Analytics.

Automatic Analytics is where Advanced and Predictive Analytics have been built into our day to day applications that are used to run our business. We do not need the hard core type of data scientist to perform various analytics on our data. Instead these tasks, once they have been defined, can be added to our applications to process, evaluate and make decisions all automatically. This is where we need data scientists who are able to communicate with the business and work with them to solve real world business projects. This is a different type of data scientist to the "hard core" data scientist who delves into the various statistical methods, machine learning methods, data management methods, etc.

The following table extends the table given above to include Automatic Analytics, and is my own take on how and where Automatic Analytics fits.

Blog 2

Every time we get an insurance quote or a health insurance quote, get a "random" call from our Telco offering a free upgrade, get our loyalty card statements, get a loan from the bank, or look at or buy a book on Amazon (the list could go on and on), these are all examples of how predictive analytics has been automated into our everyday business applications.

But this is nothing new. When I first got into data mining/predictive analytics over 16 years ago, it was considered a common thing that certain types of companies did. What has happened in the time since and particularly in the past few years is that a lot more people are seeing the value in using it.

Before I finish off this post we can have a quick look at what Oracle has been doing in this area. They have their Advanced Analytics Option and Real-Time Decisions tools to allow data scientists to do their magic. But over the past X years (nobody can give me an exact number) they have been very, very active in building lots and lots of predictive analytics into their various business applications, particularly into Fusion Apps and BI Apps.

Blog 3

A recent quote from Oracle highlights their aim with this,

" ... products designed to close the gap between data scientists and businesses."

Now with Oracle making a big push to the cloud, they are busy adding more and more Automatic (Predictive) Analytics into their Cloud Applications. What we need from Oracle is a clearer identification of where they have done this. Plus, with the migration of their Apps to the cloud, their Advanced Analytics Option is a core part of their Cloud platform. As they upgrade or add new features into their Cloud Apps, you will now be able to get the benefit of these Automatic (Predictive) Analytics as they become available.

Blog 5