Monday, September 30, 2013

Roundup of OOW13–Part 1

Oracle Open World 13 finished up last Thursday and I’ve spent most of the time since either traveling back to Ireland, re-adjusting to Irish time or, most importantly, spending some time with the family.

I want to use this blog post to say ‘thank you’ to a large number of people. I will have another blog post on some of the important announcements (for me) at Oracle Open World.

I had 2 presentations at OOW and I also assisted at a hands-on lab. These were spread out over Monday to Wednesday, with one presentation/lab on each day. It was fun and many thanks to the people who came to these sessions. I had many conversations during the remainder of OOW with some of the people who attended my sessions.

I would like to thank Lillian, Vikki and all the OTN team for making all the arrangements, organizing events and lots and lots of things that happened at OOW. The Oracle ACE Programme paid for my flights and accommodation while I was at the ACE Directors briefing at Oracle HQ before OOW and during OOW.

I’ve heard stories about the ACE Directors briefing over the years. This year was my first one and all I can say is that it lived up to the stories and more. Over 2 (long-ish) days we had a number of SVPs, VPs, and other very senior people from various divisions of Oracle come into the briefing and openly tell us what is happening with the products, what will be announced at OOW and what will be coming over the next 6 to 12 months. They told us all of this under NDA, so we cannot talk about any of it until Oracle has publicly announced it. We also had EVP Thomas Kurian talk to us for just over an hour. He was very impressive. In my next blog post I will talk about some of the announcements that were made during OOW.

Thank you to DIT for making the necessary arrangements and making it possible for me to be away on this prolonged trip.

There were many social events during OOW and on some evenings there were 3 different events for me to go to. Unfortunately I wasn’t able to get to all of them. I met up with lots of people from countries like Norway, Belgium, Holland, UK, Austria, Australia, Canada, Brazil, Ireland and the USA. Many thanks to RittmanMead and Peak Indicators for the meals, and to the many user groups and the various Oracle country teams for the drinks.

On my last night at OOW I went to my first baseball game between the Giants and the Dodgers, with Roel Hartman from Holland. This was a great night of entertainment and I would highly recommend going to a baseball game. The Giants won.

As always I’m sure that I have missed some people off this list. I do apologise and I’ll blame the jet lag for it :-)

OOW is great for all the conference swag you can get. I tried not to get too carried away with it, but the photographs below will give you some idea of what I collected.

image

Tuesday, September 24, 2013

Adding Oracle Data Miner to OBIEE

Oracle Data Miner is a very powerful tool that provides advanced machine learning algorithms embedded in the Oracle database. By using Oracle Data Miner you do not have to use another tool, from another vendor, to do your data mining. You can do everything in the database, ensuring that the security of your data is maintained, and you can take advantage of all the performance functionality that comes with the database.
To add to the advanced insights that you can get from using ODM, you can combine ODM with your OBIEE dashboards to gain a deeper level of insight into your data. This is the combining of data mining techniques and visualization techniques.
The purpose of this blog post is to show you the steps involved in adding an ODM model to your OBIEE dashboards. Lots of people have been asking for the details of how to do it, so here they are.
The following example is based on a presentation that I have given a few times (OUG Ireland, UKOUG, OOW) with Antony Heljula.
1. Export & Import the ODM model
If your data mining analysis and development was completed in a different database to the one where your OBIEE data resides, then you will need to move the ODM model from the ODM/development database to the OBIEE database.
ODM provides two PL/SQL procedures to allow you to easily move your ODM model. These procedures are part of the DBMS_DATA_MINING package. To export a model you will need to use the DBMS_DATA_MINING.EXPORT_MODEL procedure. Similarly to import your (exported) ODM model you will use the DBMS_DATA_MINING.IMPORT_MODEL procedure.
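The following is a minimal sketch of this. The DATA_PUMP_DIR directory object and the model name CLAS_DECISION_TREE are assumptions for illustration, and note that EXPORT_MODEL appends a sequence number (typically 01) to the dump file name.

-- On the ODM/development database: export the model to a dump file
BEGIN
   DBMS_DATA_MINING.EXPORT_MODEL(
      filename     => 'clas_dt_exp',
      directory    => 'DATA_PUMP_DIR',
      model_filter => 'name = ''CLAS_DECISION_TREE''');
END;
/

-- Copy the generated file (clas_dt_exp01.dmp) to a directory visible to
-- the OBIEE database, then import the model there.
BEGIN
   DBMS_DATA_MINING.IMPORT_MODEL(
      filename  => 'clas_dt_exp01.dmp',
      directory => 'DATA_PUMP_DIR');
END;
/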
2. Create a view that uses the ODM model
You can create a view that uses the PREDICTION and PREDICTION_PROBABILITY functions to apply the imported ODM model to your data. For example, the following view is used to score our customer data, giving a prediction of whether each customer is going to churn and the probability that this prediction is correct.
-- CUSTOMER_CHURN_SCORES is a hypothetical name for the scoring view
CREATE OR REPLACE VIEW customer_churn_scores AS
SELECT st_pk,
       prediction(clas_decision_tree using *) WITHDRAW_PREDICTION,
       prediction_probability(clas_decision_tree using *) WITHDRAW_PROBABILITY
FROM   CUSTOMER_DATA;

clip_image002
3. Import the view into the Physical layer of the BI Repository (RPD)
The view was then imported into the Physical layer of the BI Repository (RPD), where it was joined on the primary key to the other customer tables (we had one record per customer in the view). With the tables joined, we can use the prediction columns to filter the customer data, for example to filter all the customers who are likely to churn: WITHDRAW_PREDICTION = ‘N’.
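As a quick illustration, using the hypothetical view name from step 2, the kind of filter applied in step 5 below would look something like:

SELECT st_pk
FROM   customer_churn_scores
WHERE  WITHDRAW_PREDICTION = 'N'
AND    WITHDRAW_PROBABILITY > 0.7;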
clip_image002[11]
clip_image002[13]
4. Add the new columns to the Business Model layer
The new prediction columns were then mapped into the Business Model layer, where they could be incorporated into various relevant calculations, e.g. % Withdrawals Predicted, and then subsequently presented to the end users for reporting.
clip_image002[9]
5. Add to your Dashboards
The Withdraw prediction columns could then be published on the BI Dashboards, where they could be used to filter the data content. In the example below, the user has chosen to show data for only those customers who are predicted to Withdraw with a probability rating of >70%.
clip_image002[5]

Sunday, September 22, 2013

Oracle Magazine review–May/June 2000

The headline articles of Oracle Magazine for May/June 2000 were on the evolution of organisations into adopting an e-business model. They included 8 steps on evolving to e-business, how to use the Oracle Internet Platform, Web Portal and Java technology.
image
Other articles included:
  • Tom Kyte has an article on Oracle Availability Options and explains when to implement Oracle Parallel Server or replication or a standby database.
  • There is a new release of Oracle Discoverer 3i and Oracle Reports 6i that support XML and are part of the Oracle Intelligent WebHouse initiative.
  • Oracle licensed the mobile middleware developed by Nettech Systems, to support Oracle’s steps into this field.
  • There is an overview of the IOUG-A Live! 2000 conference, which was being held 7-11 May at the Anaheim Convention Center, with over 4,000 attendees expected.
  • Kelli Wiseth gives an overview of Java 2, explaining the differences between J2SE and J2EE. The article also discusses how Java is part of the Oracle Internet Platform.
  • Steven Feuerstein gives the second part of his article on using Java Classes and Objects in the Oracle 8i database.
  • Richard Niemiec has an article on Fundamental Tuning Goals and details the following:
    • Allocate the right amount of memory for the Oracle instance.
    • Keep the right data in memory.
    • Find problem queries.
  • Kevin Loney had an article on how to protect your database from security threats. These included:
    • Guard your backups and development environments
    • Know your default user and applications accounts
    • Control the distribution of database names and locations
    • Use auditing effectively
    • Make password changes mandatory yet simple
    • Isolate your production database
  • Venkat Devraj discusses six storage tips for 24x7 availability:
    • Know and understand RAID options
    • Choose your disk-array size with caution
    • Do not use read ahead caches for online transaction processing applications
    • Do not rely on write caches to eliminate I/O hot spots
    • Consider using multilevel RAIDs
    • Ensure that your stripe sizes are consistent with your OS and database block sizes
To view the cover page and the table of contents click on the image at the top of this post or click here.
My Oracle Magazine Collection can be found here. You will find links to my blog posts on previous editions and a PDF for the very first Oracle Magazine from June 1987.

Thursday, September 19, 2013

Nested Tables (and Data) in Oracle & ODM

Oracle Data Mining uses Nested data types/tables to store some of its data. Oracle Data Mining creates a number of tables/objects that contain nested data when it is preparing data for input to the data mining algorithms and when outputting certain results from the algorithms. In Oracle 11.2g there are two nested data types used, and in Oracle 12.1c we get an additional two. These are set up when you install the Oracle Data Miner Repository. If you log into SQL*Plus or SQL Developer you can describe them like any other table or object.

DM_NESTED_NUMERICALS

image
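In case the image does not display, the structure is essentially an attribute-name/value pair; a describe gives output along these lines (sketched from the documented definition):

SQL> describe DM_NESTED_NUMERICALS
 DM_NESTED_NUMERICALS TABLE OF SYS.DM_NESTED_NUMERICAL
 Name                          Null?    Type
 ----------------------------- -------- --------------
 ATTRIBUTE_NAME                         VARCHAR2(4000)
 VALUE                                  NUMBER

DM_NESTED_CATEGORICALS (below) has the same shape, with the VALUE column being a VARCHAR2(4000).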

DM_NESTED_CATEGORICALS

image

The following two Nested data types are only available in 12.1c.

DM_NESTED_BINARY_DOUBLES

image

DM_NESTED_BINARY_FLOATS

image

These Nested data types are used by Oracle Data Miner in preparing data for input to the data mining algorithms and for producing some of the outputs from the algorithms.

Creating your own Nested Tables

To create your own Nested Data Types and Nested Tables you need to perform steps similar to the following. These steps show you how to define a data type, create a nested table, insert data into the nested table and select the data from it.

1. Set up the Object Type

Create a Type object that defines the structure of the data. In these examples we want to capture the products and quantity purchased by a customer.

-- object type that holds one order line: a product and its quantity
create type CUST_ORDER as object
  (product_id     varchar2(6),
   quantity_sold  number(6));
/

2. Create a Type as a Table

Now you need to create a Type as a table.

-- collection type: a table of CUST_ORDER objects
create type cust_orders_type as table of CUST_ORDER;
/

3. Create the table using the Nested Data

Now you can create the nested table.

-- the nested column c_order holds all the order lines for a customer
create table customer_orders_nested (
  cust_id       number(6) primary key,
  order_date    date,
  sales_person  varchar2(30),
  c_order       CUST_ORDERS_TYPE)
NESTED TABLE c_order STORE AS c_order_table;

4. Insert a Record and Query

This insert statement shows you how to insert one record into the nested column.

insert into customer_orders_nested
values (1, sysdate, 'BT', CUST_ORDERS_TYPE(cust_order('P1', 2)) );

When we select the data from the table we get

select * from customer_orders_nested;

   CUST_ID ORDER_DAT SALES_PERSON
---------- --------- ------------------------------
C_ORDER(PRODUCT_ID, QUANTITY_SOLD)
-----------------------------------------------------
         1 19-SEP-13 BT
CUST_ORDERS_TYPE(CUST_ORDER('P1', 2))

It can be a bit difficult to read the data in the nested column, so we can convert the nested column into a table to display the results in a more readable way.

select cust_id, order_date, sales_person, product_id, quantity_sold
from customer_orders_nested, table(c_order);


   CUST_ID ORDER_DAT SALES_PERSON                   PRODUC QUANTITY_SOLD
---------- --------- ------------------------------ ------ -------------
         1 19-SEP-13 BT                             P1                 2


5. Insert many Nested Data items & Query

To insert many entries into the nested column you can do this

insert into customer_orders_nested
values (2, sysdate, 'BT2', CUST_ORDERS_TYPE(CUST_ORDER('P2', 2), CUST_ORDER('P3',3)));

When we do a Select * we get

   CUST_ID ORDER_DAT SALES_PERSON
---------- --------- ------------------------------
C_ORDER(PRODUCT_ID, QUANTITY_SOLD)
-------------------------------------------------------------
         1 19-SEP-13 BT
CUST_ORDERS_TYPE(CUST_ORDER('P1', 2))

         2 19-SEP-13 BT2
CUST_ORDERS_TYPE(CUST_ORDER('P2', 2), CUST_ORDER('P3', 3))

Again it is not easy to read the data in the nested column, so if we convert it to a table again we now get a row displayed for each entry in the nested column.

select cust_id, order_date, sales_person, product_id, quantity_sold
from customer_orders_nested, table(c_order);

   CUST_ID ORDER_DAT SALES_PERSON                   PRODUC QUANTITY_SOLD
---------- --------- ------------------------------ ------ -------------
         1 19-SEP-13 BT                             P1                 2
         2 19-SEP-13 BT2                            P2                 2
         2 19-SEP-13 BT2                            P3                 3

Friday, September 13, 2013

Upgrading ODMr and SQL Dev for EA2

Early Adopter 2 (EA2) of Oracle SQL Developer was released yesterday (Thursday 12th Sept). To install this new version of the tool, including Oracle Data Miner, you can follow the instructions below.

  • Go to the EA2 download page and download the EA2 release
  • Unzip the EA2 download
  • Create a new shortcut that points to sqldeveloper.exe
  • Start SQL Developer EA2
  • You will be prompted for the location of the Java JDK. On my VM it was C:\Program Files\Java\jdk1.7.0_25.
  • Next you are prompted about importing your settings from the previous version. Select Yes.
  • After the settings have been imported, SQL Developer will open and you are ready to go.

Oracle Data Miner

  • For Oracle Data Miner you need to make the option visible by selecting Tools->Data Miner->Make Visible. This will open the ODM connection tabs along with a couple of others. I’m running the following on a 12.1c database.
  • Open one of your ODM connections by double clicking on it.
  • ODM will check the version of the ODM repository in the DB. You will be prompted to upgrade the ODM repository to the latest version. Click on the Yes button.

image

  • Enter the SYS password, or talk nicely to your DBA.

image

  • Then click the Start button to start the ODM repository upgrade.

image

  • This will take anything from a minute to 10 minutes, depending on the location of the DB and your network.
  • When everything is finished you can close the window and start using Oracle Data Miner by opening an existing workflow or by creating a new one.

Friday, August 30, 2013

Oracle Magazine-March/April 2000

The headline articles of Oracle Magazine for March/April 2000 were focused on e-business. There were articles covering the typical issues in setting up an e-business, the technical environment and some reports from organisations that have used the Oracle tools.

image

Other articles included:

  • Oracle releases their Oracle XML Developer’s Kit (Oracle XDK), with support for a variety of programming languages. It included XML parsers for Java, C, C++ and PL/SQL, an XSL processor, an XML Class Generator, and XML Transviewer Java Beans.
  • Oracle 8i Lite for the Palm Computing and Psion EPOC operating systems is available.
  • Oracle acquires Carleton, who were innovators of data quality and mainframe data extraction software for customer focused data warehousing applications.
  • Oracle releases Oracle Fail Safe 3.0 which was used to protect Microsoft Windows NT applications and databases, and supported Oracle 7, 8 and 8i, Oracle Developer Server 6.0 Forms and Reports Servers, Oracle Application Server 4.0 and Microsoft Internet Information Server 4.0.
  • Steven Feuerstein has an article about getting started with calling Java from PL/SQL and gives a simple example to illustrate how to do this. The necessary system privileges included JAVASYSPRIV for the DBA and JAVAUSERPRIV for those schemas that want to call the Java code.
  • Graham Wood and Connie Dialeris give an overview of Statspack, which was released with Oracle 8.1.6. The article covered the various features, how to install it and how to configure the Snapshot Level & SQL Thresholds. It also gave an example of how to use DBMS_JOB to automate the collection of the statistics (a sketch of this is given after this list).
  • A step-by-step guide on how to use RMAN (which most of us know and love!), including the RMAN architecture, how to set up a backup, starting a backup and the all-important step of recovering from a backup.
  • With Oracle 7 came the ability to clone a database. This article goes through the steps required to set up and clone a production database.
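For the Statspack article above, the DBMS_JOB automation it described looks roughly like the following (the hourly interval is an assumption for illustration):

-- submit a job that takes a Statspack snapshot every hour
VARIABLE jobno NUMBER;
BEGIN
   DBMS_JOB.SUBMIT(:jobno,
                   'statspack.snap;',
                   SYSDATE,
                   'SYSDATE + 1/24');
   COMMIT;
END;
/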

To view the cover page and the table of contents click on the image at the top of this post or click here.
My Oracle Magazine Collection can be found here. You will find links to my blog posts on previous editions and a PDF for the very first Oracle Magazine from June 1987.

Thursday, August 22, 2013

The 2013 Gartner Hype Cycle

The 2013 Gartner Hype Cycle is out and it can be interesting to compare the new graph with the ones from previous years, particularly for my interests in Data Science, Big Data, Data Mining, Predictive Analytics and of course the Oracle Database.

Gartner Hype Cycles 2013

from 2012

from 2011

from 2010

from 2009

Tuesday, August 20, 2013

Schema Table Filtering for Oracle Data Miner

If you have been using Oracle Data Miner, that is part of SQL Developer 4 or SQL Developer 3, you will notice that your schema can get filled up with various tables that are created by your workflows. The following image gives an example.

image

These tables can include details of the various algorithms used and their settings, sample tables that were created by the various nodes, etc. Basically they contain all the information that was set up by each node. Not every node in your workflow will create a table, but a lot do, in particular if you have set Cache or Sample in the Properties tab.

In most cases you do not need to be aware of or use most of these tables.

So how do I hide them, so that my schema table listing only shows the main tables in my schema? By main tables, I mean the tables that you would expect to have in your schema before you started using Oracle Data Miner.

The answer to this question is to apply filters to your tables in SQL Developer. To do this go to your schema in the Connections tab. Expand to get the full list of schema objects and then right click on Tables. You should get a menu like the following.

image

Select Apply Filter from the menu and the Apply Filter window will open. Here you can create filters to apply to the tables in your schema.

To filter out the Oracle Data Miner related tables you will need to exclude the tables that begin with DM$ and ODMR$. The following image shows these filters.

image

When these filters are applied we only get our schema tables.

image

There are two additional filters you may want to consider. The first is for the tables that begin with OUTPUT. These are created when a node sends the output from running a model to a table, or in some other scenario where the output is spooled to a table. In reality this is bad naming and we should use a name that is more meaningful and reflects the contents of the table. But sometimes you just want to spool the outputs to a table and the name is not important. I have an additional filter to hide these tables (see below).

With SQL Developer 4, Oracle Data Miner also seems to generate IOTs, as we can see in the above image. Another filter can be created to exclude these from the list.

Here is the full list of filters.

image
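If you want the equivalent listing from SQL*Plus or a script, a simple query along these lines has the same effect as the filters (the OUTPUT pattern being the optional one discussed above):

SELECT table_name
FROM   user_tables
WHERE  table_name NOT LIKE 'DM$%'
AND    table_name NOT LIKE 'ODMR$%'
AND    table_name NOT LIKE 'OUTPUT%'
ORDER BY table_name;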

Tuesday, August 6, 2013

Deprecated ODM features in 12c

With the release of the Oracle 12c Database there have been some changes to the Advanced Analytics Option. Most of the new features, both in the database and in the ODM tool, have been documented in previous blog posts.
But what has been removed from the Advanced Analytics Option, and what is no longer supported?
The first of these is the Java API that Oracle supplied many, many years ago. Oracle has been saying since the release of 11.2g that these Java APIs are no longer supported. The documentation states this and the demo scripts are not included in the latest SQL Developer 4. Instead of using the Java APIs you can use the in-database SQL functions and procedures.
One of the in-database DM algorithms was the Adaptive Bayes Network (ABN). Although this was de-supported in 11.2g, it was still in the database. This was to give customers who were still using it time to migrate to the other algorithms. In 12c the ABN algorithm is not in the database. Before you upgrade your 11.2.x Oracle database to 12c you will need to drop any ABN models that you have in your database.
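A minimal sketch of finding and dropping any ABN models before the upgrade (the algorithm name is as listed in the 11.2 data dictionary views):

-- drop every Adaptive Bayes Network model in the current schema
BEGIN
   FOR m IN (SELECT model_name
             FROM   user_mining_models
             WHERE  algorithm = 'ADAPTIVE_BAYES_NETWORK')
   LOOP
      DBMS_DATA_MINING.DROP_MODEL(m.model_name);
   END LOOP;
END;
/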

Thursday, July 25, 2013

12c New Data Mining functions

With the release of Oracle 12c we get new functions/procedures and some updated ones for Oracle Data Miner that is part of the Advanced Analytics option.

The following are the new functions/procedures and the functions/procedures that have been updated in 12c, with a link to the 12c Documentation that explains what they do. A short example of the new dynamic capability is given after the list.

  • CLUSTER_DETAILS is a new function that predicts cluster membership for each row. It can use a pre-defined clustering model or perform dynamic clustering. The function returns an XML string that describes the predicted cluster or a specified cluster.

  • CLUSTER_DISTANCE is a new function that predicts cluster membership for each row. It can use a pre-defined clustering model or perform dynamic clustering. The function returns the raw distance between each row and the centroid of either the predicted cluster or a specified cluster.

  • CLUSTER_ID has been enhanced so that it can either use a pre-defined clustering model or perform dynamic clustering.

  • CLUSTER_PROBABILITY has been enhanced so that it can either use a pre-defined clustering model or perform dynamic clustering. The data type of the return value has been changed from NUMBER to BINARY_DOUBLE.

  • CLUSTER_SET has been enhanced so that it can either use a pre-defined clustering model or perform dynamic clustering. The data type of the returned probability has been changed from NUMBER to BINARY_DOUBLE.

  • FEATURE_DETAILS is a new function that predicts feature matches for each row. It can use a pre-defined feature extraction model or perform dynamic feature extraction. The function returns an XML string that describes the predicted feature or a specified feature.

  • FEATURE_ID has been enhanced so that it can either use a pre-defined feature extraction model or perform dynamic feature extraction.

  • FEATURE_SET has been enhanced so that it can either use a pre-defined feature extraction model or perform dynamic feature extraction. The data type of the returned probability has been changed from NUMBER to BINARY_DOUBLE.

  • FEATURE_VALUE has been enhanced so that it can either use a pre-defined feature extraction model or perform dynamic feature extraction. The data type of the return value has been changed from NUMBER to BINARY_DOUBLE.

  • PREDICTION has been enhanced so that it can either use a pre-defined predictive model or perform dynamic prediction.

  • PREDICTION_BOUNDS now returns the upper and lower bounds of the prediction as the BINARY_DOUBLE data type. It previously returned these values as the NUMBER data type.

  • PREDICTION_COST has been enhanced so that it can either use a pre-defined predictive model or perform dynamic prediction. The data type of the returned cost has been changed from NUMBER to BINARY_DOUBLE.

  • PREDICTION_DETAILS has been enhanced so that it can either use a pre-defined predictive model or perform dynamic prediction.

  • PREDICTION_PROBABILITY has been enhanced so that it can either use a pre-defined predictive model or perform dynamic prediction. The data type of the returned probability has been changed from NUMBER to BINARY_DOUBLE.

  • PREDICTION_SET has been enhanced so that it can either use a pre-defined predictive model or perform dynamic prediction. The data type of the returned probability has been changed from NUMBER to BINARY_DOUBLE.
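As a quick illustration of the dynamic behaviour: the enhanced CLUSTER_ID can build a transient clustering model on the fly, with no pre-built model referenced. The table and column names here are hypothetical, and the OVER () clause is what triggers the dynamic clustering:

SELECT cust_id,
       CLUSTER_ID(INTO 4 USING age, income) OVER () AS segment
FROM   customers;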

Tuesday, July 23, 2013

Oracle Data Miner New Features (SQL Dev 4)

With the release of the new Oracle 12c Database and SQL Developer 4 we have a range of new Oracle Data Miner features. Some of these are embedded in the database and are only available in 12c. Check out my previous blog post on those new features.

In this blog post I will look at the new Oracle Data Miner features that come with the ODM tool in SQL Developer 4.

The new features of the Oracle Data Miner tool can be grouped into 2 categories. The first category contains the new features that are available to all users of the tool (11.2g and 12c). The second category contains the new features that are only available in 12c. The new features in each of these categories are explained below.

Category 1 – Common new features for 11.2g and 12c Database users

There is a new View Data feature that allows you to drill down to view the customer object and to view nested tables.

A new Graph Node allows you to create graphs such as line, bar, scatter and boxplots of the data at any stage of a workflow. You can specify any of the attributes from the data source for the graphs, and you don’t seem to be limited in the number of graphs you can create.

image

A new SQL Node. This is a welcome addition, as there have been many times that I’ve needed to write some SQL or PL/SQL to do a specific piece of processing on the data that was not possible with the other nodes. There are really 2 important elements to this SQL Node. The first is that you can write SQL and PL/SQL code to do whatever processing you want, but you can only apply it to the Data node you are connected to.

image

The second is that you can use it to call some ORE code. This allows you to use the power of R and the extensive range of packages that are available, to expand the analytic functionality of the database. If there is some particular function that you cannot do in Oracle and it is available in R, you can now embed this function/code as an ORE object in the database. It can then be called using SQL. A rough sketch of what such a call looks like is given after the warning below.

WARNING: this particular feature will only work if you have ORE installed on your 11.2.0.3g or 12.1c database
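As a rough sketch, assuming ORE is installed and an R function has been saved in the R script repository under the hypothetical name 'MyRFunction', the SQL call to the embedded R code looks something like:

-- rqEval runs a stored R script; the second argument describes the
-- shape of the rows the script returns ('MyRFunction' is hypothetical)
SELECT *
FROM   table(rqEval(NULL,
                    'select 1 result_val from dual',
                    'MyRFunction'));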

New Model Build Node features include node-level text specifications for text transformations, displaying the heuristic rules responsible for excluding predictor columns, and being able to control the amount of classification and regression test results that are generated. I’ll be covering these in later blog posts.

New Workflow SQL Script Deployment features. Up to now I found the workflow SQL script to be of limited use. The development team have put a lot of work into generating a proper script that can be used by developers and DBAs, but there are still some limitations. The script will run the workflow automatically in the database, without having to use the ODM tool, but it can only be run in the schema that the workflow was generated in. You will still have to do a lot of coding (although a lot less than you used to) to get your ODM models and workflows to run in another schema or database.

image

This will output the script to a file buried deep somewhere inside your SQL Developer directory. Unfortunately in the EA1 release, the size of this location field is small and scrolling has not been enabled, so you cannot (currently) scroll to the end of the field to see the actual location. You can edit this location to use a different, shorter location.

image

Maybe this will be fixed for the official release.

Category 2 – New features for 12c Database users.

Now for the new features that are only visible when you are running ODM / SQL Dev 4 against a 12c database. No configuration changes are needed: the ODM tool checks the version of the database you are logging into and presents the available features based on that version.

New Predictive Query nodes allow you to build a node based on the new feature in 12c called Predictive Queries (PQs), which use transient models. In SQL Developer we get 3 additional types of Predictive Queries. These can be used for Anomaly Detection, Clustering and Feature Extraction.

image

It is important to remember that the underlying model produced by these PQs does not exist in the database after the query has executed. The model is created, used on the data and then deleted.
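At the SQL level a predictive query can be sketched along the following lines (table and column names are hypothetical, and the syntax is the dynamic form of the 12c PREDICTION function; the OVER () clause is what makes the model transient):

-- dynamic anomaly detection: no persisted model is used;
-- the prediction is 1 for typical rows and 0 for anomalous ones
SELECT cust_id,
       PREDICTION(OF ANOMALY USING *) OVER () AS is_typical
FROM   customers;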

The Clustering node has the new algorithm Expectation Maximization in addition to the existing algorithms of K-Means and O-Cluster.

image

The Feature Extraction node has the new algorithm called Principal Component Analysis in addition to the existing Non-Negative Matrix Factorization algorithm.

image

Text Transformations are now built into the model build nodes. These text transformations will be part of the Automatic Data Preparation steps for the model build nodes. This is illustrated in the above images.

The Generalized Linear Model that is part of the Classification Node now has a Feature Selection option in the Algorithm Settings. Previously Ridge Regression was the only setting; now there is the additional option of using Feature Selection.

image

Prediction Result Explanations give the scoring details used to explain why the prediction was made.


Look out for blog posts on each of these new features.

Friday, July 19, 2013

Oracle 12c Books

Oracle 12c is only a few days old and there are already books available on Amazon. The carousels below list some of the books available on Amazon.com and Amazon.co.uk.