Showing posts with label SQL Developer. Show all posts
Showing posts with label SQL Developer. Show all posts

Monday, August 28, 2017

How to speed up your Oracle Data Mining with in-memory and parallel

Have you have found running a workflow in Oracle Data Miner slow or running the scripts in the database slow ?

No. Good, because I haven't found it slow.

But (there is always a but) it really depends on the volume of data your are dealing with. For the vast majority of us who aren't of the size of google, amazon, etc have data volumes that are not that large really and a basic server can process many millions of records extremely quickly using Oracle Data Mining.

But what if we have a large volume of data. In one recent project I had a data set containing over 3.5 billion records. Now that is big data. All of this data sitting in an Oracle Database.

So how can we process over 3.5 billion records in a couple of seconds, building 4 machine learning models in that time? Is that really possible with just using an Oracle Database? Yes is the answer and very easily. (Surely I needed Hadoop and Spark to process this data? Nope!)

The Oracle Data Miner (ODMr) tool comes with a new feature in SQL Developer 4 (and higer) that allows you to manage using Parallel execution and the in-memory DB features. These can be accessed on the ODMr Worksheet tool bar.

NewImage

The best time to look at these setting is when you have created your workflow and are ready to run it for the first time. When you click on the 'Performance Options' link, you will get the following window. It will display the list of nodes you have in the workflow and will then indicate if the Degree of Parallel and the In-Memory options can be set for each of the nodes.

NewImage

The default values are shown and you can changes these. For example, in a lot of scenarios you might prefer to leave the Degree of Parallel as System Determined. This will then use whatever the the default is for the database and controlled by the DBA, but if you want to specify a particular value then you can, for example setting the degree of parallel to 4 for the 'Class Build' node, in the above image. Similarly for the in-memory option, this will only be available for nodes where the in-memory option would be applicable. This will be where there is a lot of data processing (preparing data, transforming data, performing specific statistics, etc) and for storing any data that is generated by Oracle Data Mining.

But what if you want to change the default values. You can change these at a global level within the SQL Developer Preferences. Here you can set the default to be used for each of the different types of Oracle Data Mining nodes.

NewImage

I mentioned at the start that I've been able to build 4 machine learning models using Oracle Data Mining on a data set of over 3.5 billion records, all in a couple of seconds. In my scenario Parallel was set to 16 and we didn't use in-memory as we didn't have the licence for it. You can see that machine learning at lighting speed (ish) is possible. This timing is only for building the models, which is the step that consumes the most about of resources and time. When it comes to scoring the data, that is lighting fast. In may scenario, scoring over 300,000 was less than a second, and I didn't use parallel or anything else to speed things up. Because we didn't need to.

Go give it a try!

Monday, August 21, 2017

Scheduling ODMr Workflows in SQL Developer 4.2+

A new feature for Oracle Data Mining (ODM) (part of SQL Developer 4.2) is the ability to schedule an ODM workflow to run a defined time or frequency. This blog post will bring you through the steps need to schedule an ODM workflow using this new feature. The first thing that you need is an ODMr workflow. The following image is a familiar looking one that I typically use to get a very quick demo of how easy it is to build a machine learning workflow. NewImage Just above the workflow worksheet we have a row of icon buttons. In the above image one of these is highlighted by a red box. This is the workflow scheduler. So go ahead on click on it. NewImage In most cases you will want to run the entire workflow. The default option presented to is 'All Nodes'. If you would only like a subset of the nodes to run, you can click-on or select the node in the workflow and then click on the scheduler icon. In our example we are going to run the entire workflow, so select 'All Nodes' from the menu. NewImage The main scheduler window will open. Here you can set the Start Date and time of the first run, what the Repeat frequency is (none, every day, every week or custom) and to End the Repeat (Never, After, On Date). To schedule a once off run of the workflow just set the Date and Time, set the Repeat to 'None' and End Repeat should disappear in this instance. If Repeat was set to another value then you can set a value for End Repeat. Go ahead and run the scheduler by clicking on the OK button. NewImage A Scheduled Jobs window should open that will display the details of the scheduled job. When this job is run in the database, this will be shown in the Workflow Jobs window. Here you can see and monitor the progress of the of the workflow. NewImage and that's it. Nice an simple. But there is a something you needed to be WARNED about. When you schedule a workflow, Oracle Data Miner will lock the workflow. This is to ensure that no changes can be made to the scheduled workflow. This is indicated with the Locked button appearing on the icon menu. If you click on this button to unlock the workflow, it will also cancel your scheduled jobs associated with this workflow. NewImage Also when the scheduled workflow is finished, the workflow will remain locked. So you will have to click on this Locked button to unlock the workflow. There are a few additional advanced features. These can be found by clicking on the 'Advanced...' button in the main scheduler window. The first table displayed allows you to specify if you want an email sent for the different stages of the scheduled job. The second tab allows you to set the Job Priority, Max Failures, Max Run Duration and Schedule Limits. NewImage

Wednesday, August 26, 2015

Enabling Autotrace in SQL Developer

This is mainly a note to myself to save me from looking up the details very time I need it. I can just go to my own post.

To use Autotrace in SQL you need to grant the schema the PLUSTRACE role.

But in SQL Developer you need the PLUSTRACE role that has a few additional privileges.

Connect to SYS as sysdba and run the following.

grant select on v_$session to plustrace;
grant select on v_$sql_plan to plustrace;
grant select on v_$sql_plan_statistics to plustrace;
grant select on v_$sql to plustrace;

Now you will be able to run Autotrace in SQL Developer.

(Make sure to reconnect your SQL Developer connection so that the privileges can be picked up).

Monday, May 4, 2015

Oracle Data Miner (ODM 4.1) New Features

With the release of SQL Developer 4.1 we also get a number of new features with Oracle Data Miner (ODMr). These include:

  • Data Source node can now include data sources that contain JSON data, generating JSON schema and has a JSON viewer
  • Create Table can now create data in JSON
  • JSON Query Node allows you to view, query and process JSON data, combine it with relational data, generate sub-group by, and nested columns to be part of input to algorithms
  • New PL/SQL APIs for managing Data Miner projects and workflows. This includes run, cancel, rename, delete, import and export of workflows using PL/SQL.
  • New ODMr Repository views that allows us to query and monitor our workflows.
  • Transformation Node now allows you different ways of handling NULLS.
  • Transformation Node now allows us to create Custom Bins, define bin labels and bin values
  • Overall Workflow and ODMr environment improvements to allow for greater efficiency in workflow behaviour and interactions with the database. So using ODMr should feel quicker and more responsive.

What out for the Gotchas: Although support for JSON has been added to ODMr, as outlined above, you are still a bit limited to what else you can do with your JSON data. Based on the documentation you can use JSON data in the Association and Classification build nodes.

I'm not sure about the other nodes and this will need a bit of investigation to see what nodes can and cannot use JSON data. I'm sure this will all be sorted out in the next release.

Keep an eye out for some blog posts over the coming weeks on how to explore and use these new features of Oracle Data Miner.

Friday, March 6, 2015

RIP SQL*Plus & hello SQL Command Line

Over the past couple of months Oracle has been releasing some EA (Early Adopter) versions of a new tool that is currently called SQL Command Line.

The team behind this new tool is the SQL Developer development team and they have been working on creating a new command line SQL tool that is based on some of the technology that is included in SQL Developer.

SQL Command Line in an stand alone tool and all you need to do is to download and un-zip the tile.

What I want to show in this blog post is some of new features that are available and that I have found particularly useful. But before we get onto those commands let us first have a look at how you can get setup and running with SQL Command Line.

Download & Setup

The current download of SQL Command Line can be found under the SQL Developer 4.1 EA Download page. I'm assuming when 4.1 is formally released the download for SQL Command line will be on the main SQL Developer Download web page.

SQL CL 1

After you have downloaded the file, all you need to do is to unzip the file and then copy the unzipped directory to where you want the software to be located on your client.

Now you are ready to get started with using SQL Command Line.

Connecting to your Oracle Schema

(That) Jeff Smith and Barry McGillin have a couple of good blog posts on the different connection methods and some setup or configuration you might need to consider. Check out these links for more details.

For me I did not have to do any additional setup or configuration. I was able to use the TNS Names and the EZConnect methods without any problems.

The following how to connect to my (DMUSER) schema using the EZConnect method. With this method we pass in the username, password, the host name, port number and the service name. Just like this

> sql dmuser/dmuser@localhost:1521/pdb12c

We can not have a look at the JDBC connection details.

SQL> show jdbc

-- Database Info --

Database Product Name: Oracle

Database Product Version: Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

With the Partitioning, OLAP, Advanced Analytics and Real Application Testing options

Database Major Version: 12

Database Minor Version: 1

-- Driver Info --

Driver Name: Oracle JDBC driver

Driver Version: 12.1.0.2.0

Driver Major Version: 12

Driver Minor Version: 1

Driver URL: jdbc:oracle:thin:@localhost:1521/pdb12c

SQL>


If we have a TNSNAMES.ORA file on our computer and the directory that it is in, is on the search PATH, then we can use the service names defined in the TNSNAMES.ORA file. The following example shows you how to use this in two ways. The first shows how to enter all the details when you are starting SQL CL and the other is when SQL CL prompts you for each parameter.

> sql dmuser/dmuser@pdb12c

and when we are prompted to enter the parameters, we get the following.

> sql

SQLcl: Release 4.1.0 Beta on Thu Mar 05 15:16:12 2015

Copyright (c) 1982, 2015, Oracle. All rights reserved.

SQLcl: Release 4.1.0 Beta on Thu Mar 05 15:16:14 2015

Copyright (c) 1982, 2015, Oracle. All rights reserved.

Username? (''?) dmuser

Password? (**********?) ******

Database? (''?) pdb12c

Connected to:

Oracle Database 12c Enterprise Edition Release 12.1.0.2.0 - 64bit Production

SQL>


As you can see these work in the same way as when we use SQL*Plus.


Now that you are connected to your schema, what else can you do? The following sections are some useful commands.

Commands & Help

The following list of commands is by no means a complete list of commands available in SQL Command Line. Theoretically everything you can currently do in SQL*Plus you can also do in SQL Command Line (theoretically) But the commands I give examples of below are some of my favourites (so far).

You can get the list of commands by typing help at the SQL prompt.

SQL> help

Then to get help on a specific command you can just add the command after the help.

SQL> help cd

CD

---

Changes path to look for script at after startup.

(show SQLPATH shows the full search path currently:

- CD current directory setting set by last cd command

- baseURL (url for subscripts)

- topURL (top most url when starting script)

- Last Node opened (i.e. file in worksheet)

- Where last script started

- Last opened on sqlplus path related file chooser

- SQLPATH setting

- "." if in SQLDeveloper UI (included in SQLPATH in command line (sdsql))

).

SQL>

Some work is still needed on the help documentation and what is listed for each command, as the current version is missing some important details.

Alais

This is by far my favourite new feature. This allows us to take some of our most common SQL statements and to create a shortcut for it.

Very soon I will not be using Oracle SQL but I will be using My SQL, as I will have created my own personalised version of SQL.

To list what aliases you have defined in your schema you can type

SQL > alais

Oracle will have a few aliases already defined in SQL CL. By having a look at some of these you can see some of what you want they can do and get ideas for what you might want to do with them. To list the contents of an alias, you can use the following command.

alias list {alias name}

for example

SQL > alias list tables

This command lists the query that is used for the 'tables' alias that comes with SQL CL.

I use Oracle Data Miner a lot and when you use this tool it can create a number of tables with a variety of names in your schema. Most of these you will never need to look at. So what I do is create an alias that excludes these from the list of tables in my schema.

SQL> alias tables2=select table_name from user_tables where table_name not like 'ODMR$%' and table_name not like 'DM$%' and table_name not like 'SYS_IOT%';

So now all I need to do to list my important data only tables (and exclude all the Oracle Data Miner tables) I can run my alias 'table2'.

SQL> tables2

You will quickly build up a suite of commands using aliases.

info and >info+

info and info+ are the new commands to replace the DESC command.

The difference between info and info+ is that info+ gives you some statistical information about the table and the attributes in the table. This is illustrated in the following examples.

Example using 'info'

Sqlcl 2

Example using 'info+'

Sqlcl 3

CTAS & DDL

If you want to get the DDL script to create a copy of a table you have two options open to you. The first of these is the DDL command. This creates a DDL statement based on the meta data for the table, just like in the following

Sqlcl 4

An alternative to this is to use the CTAS command that will give a slightly different output to DDL command. With the CTAS we also get the CREATE TABLE .. AS SELECT ...

History

In SQL*Plus we had a limited scroll through our previous commands. The same kind of scrolling is available in SQL CL, but we can get to see all our previous commands using the 'history' command. The following illustrates how you can list all you previous commands, I'm sure it is limited to a certain number or will be otherwise it will become a very long, long list.

SQL> history

To find out how often each command has been run you can run

SQL> history usage

and to find out how long the query took to run the last time it was run

SQL> history time


There are lots more that I could show, but this post is way, way to long as it is. What I suggest you do is go and download SQL CL (Command Line) and start using it today.

Friday, November 7, 2014

ODMr : Graph Node: Zooming in on Graphs

When Oracle Data Miner (ODMr) 4.0 (which is part of SQL Developer) came out back in late 2013 there was a number of new features added to the tool. One of these was a Graph node that allows us to create various graphs and charts that include Line, Scatter, Bar, Histogram and Box plot.

I've been using this node recently to produce graphs and particularly scatter plots. I've been using the scatter plots to graph the Actual values in a data set against the Predicted values that was generated by ODMr. In this scenario I had a separate data set for training my ODM data mining models and another testing data set for, well testing how well the model performed against an unseen data set.

In general the graphs produced by the Graph node look good and gives you the information that you need. But what I found was that as you increased the size of the data set, the scatter plot can look a messy. This was in part due to the size of the square used to represent a data point. As the volume of data increased then your scatter plot could just look like a coloured in area of blue squares. This is illustrated in the following image.

Graph node 1

What I discovered today is that you can zoom in on this graph to explore different regions and data point on it. This do this you need to select an data that is within the x-axis and y-axis area. When you do this you will see a box form on your graph that selects the area that you indicate by moving your mouse. After you have finished selecting the area, the Graph Node will zooms into this part of the graph and shows the data points. For example if I select the area from about 1000 on the x-axis and 1000 on the y-axis, I will get the following.

Graph node 2

Again if I select a similar are area of 350 on the x-axis and 400 on the y-axis I get the following zoomed area.

Graph node 3

You can keep zooming in on various areas.

At some point you will have finished zooming in and you will want to return to the original graph. To zoom back outward all you need to do in the graph is to click on it. When you do this you will go back to the previous step or image of the graph. You can keep doing this until you get back to the original graph. Alternatively you can zoom in and out on various parts of the graph.

Hopefully you will find this feature useful.

Thursday, March 20, 2014

Issues with using latest release of ODM

The title of this blog post makes it sound more dramatic than it actually is.

The reason for this blog post is down to me receiving a recent comment on the blog, plus having received numerous emails and a recent OTN Discussion Forum topic for Oracle Data Mining.

The main thing that they have in common is that if I use the latest version of Oracle Data Mining (ODM) it tells me that I need to upgrade my ODM Repository. What impact will this have?

The ODM Repository stores lots of information about the workflows you create using the (free) Oracle Data Mining tool that comes as part of SQL Developer. Yes you do have to pay for the OAA option, so is it really free? Well some part are like the explore node and the graph node.

If you download and want to use the latest version of the ODM tool or you want to try it out before rolling it out to others then you will need to upgrade your ODM repository.

And this the problem that people are facing.

If you upgrade then the ODM Repository it is updated to work with the latest version of the ODM tool. But what happens to everyone else who is using the previous release of the tool? The answer to that is they can no longer use ODM against their database.

Why is that? Well the version of the tool is tied to a version of the Repository. If you upgrade to the newer tool and repository then your older versions of the ODM tool no longer work.

The result of all of this is that you cannot have a mixture of versions of the ODM tool (SQL Developer) being used in your team/company.

There is a very simple solution to all of this. Everyone uses the same version of the ODM tool (i.e. the same version of SQL Developer). For example your team might be using SQL Dev 4 that was released last December. But in early March there was a new patch release 4.1. In order to use this new version of the tool all of your team needs to start using it at the same time. The first person to use it will be prompted to migrate the ODM repository. This is automatically done once you enter the password for SYS.

But in some teams this is not possible to do, you want to try out the tool to see that it works correctly before getting others to use it. The way around this is to have a separate database and use it for your testing. You can easily copy across your workflows and ODM objects to the test database.

This might not be possible for everyone, so what can you do. Create a Virtual Machine and try it out on your own desktop is one way.

The answer to this problem is not ideal, but hopefully you have a better idea of why things are happening this way and what you can or cannot do about it.

Like I said at the topic of this blog post that the title is a bit more dramatic than is really the case :-)


My next blog post will be on another question I've been asked a few times and this is 'When I go to use the ODM tool it tells me that the Oracle Text feature of Oracle needs to be enabled'

Wednesday, March 5, 2014

ODM Graph Node new feature

With the recent release of SQL Developer 4.0.1 there has been some very minor bug fixes for Oracle Data Miner. But there has been one particular enhancement that I wanted to have a look at. This blog post will look at this new feature and how you can use it too. In the previously released version of the Oracle Data Miner tool we had a Graph Node. This is really a new feature that came with SQL Developer 4 and was available in the Early Adopter releases since July 2013. During the Early Adopter releases and with the official release of SQL Dev 4, the Line Graph feature of the Graph Node only allowed you to have one line drawn i.e. for a single Y attribute. Now in SQL Dev/ODM 4.0.1 the Graph Node has been updated to allow you to have multiple Line Graphs that are stacked. To illustrate this we will need to define what data source we are going to use and to create a shell of a Graph Node in Oracle Data Miner. The following diagram illustrates this. In this example I'm using the CARS data set that is available on OTN.

New graph

When you have these nodes created you are now ready to create your graphs. To do this double click on the Graph Node. You can now set the attribute to use for the X-axis, in my example this will be MODEL_YEAR. Then for the Y-Axis select the attributes you want to include in the stacked graph by holding down the control key as you select each attribute

New graph2

You are now ready to create the graph. To do this click on the OK button and you will have your stacked line graph.

New graph3

Wednesday, October 9, 2013

How to Fix Odd Layout Behaviour in ODM v4 EA1 & EA2

If you have been using the last versions of the Oracle Data Miner tool that have been released as part of SQL Developer 4 EA1 and EA2, you may have noticed that the layout of your worksheets and other areas are inconsistent each time you open ODM.

This can be very frustrating as you have to rearrange the layout of the worksheets, property inspection, the ODM connections tab, etc, etc, etc.

This “feature” seems to be linked to when you installed the new version of the software. When you open SQL Dev 4 EA1 & EA2 you are asked if you would like to migrate your settings.  If you selected Yes for this then it is this that is causing the project.

How do you fix this?

1. Export your SQL Developer Connections. To do this go to the Connections tab and right click on Connections and select export from the menu

image

This will create a XML file.  Save this to your desktop so that you can find it easily later.

2. Locate the AppData directory for SQL Developer.  This really depends on your environment. If you are using Windows for your client then the AppData directory will be located somewhere live the following

C:\Users\<your username>\AppData\Roaming\SQL Developer

3. Rename the System4.0… Directory.  Located in the AppData/SQL Developer directory called some like  ‘system4.0.0.12.84’  if you have installed and migrated to SQL Dev 4 EA2.  Rename this directory to another name e.g. ‘system4.0.0.12.84_old’.  This effectively deletes all your setting for SQL Developer 4

4. Start SQL Dev 4.  When you start SQL Developer 4 EA1 or EA2 it will be like you are running the software for the first time.  But this time, when you are asked do you want to migrate your setting from an earlier version, select No.  This will create a clean system folder.

5. Import your Connections.  When SQL Developer opens you can right click on the connection link and select Import Connections (see image above). Then enter the name and location of the file you create in step 1 above.

 

After completing all of the above steps your unusual layout when you open ODM should now be fixed.

Friday, September 13, 2013

Upgrading ODMr and SQL Dev forEA2

The Early Adopter 2 of Oracle SQL Developer was released yesterday (Thursday 12th Sept). To install this new version of the Tool, including Oracle Data Miner, you can follow the instructions below

  • Go to the EA2 download page and download the EA2 release
  • Unzip the EA2 download
  • Create a new shortcut that point to the sqldeveloper.exe
  • Start SQL Developer EA2
  • You will be prompted for the location of the Java JDK. On my VM it was  C:\Program Files\Java\jdk1.7.0_25. 
  • Next you are prompted about importing your setting from the previous version. Select Yes.
  • After the setting have been imported SQL Developer will open and you are now able to enjoy

Oracle Data Miner

  • For Oracle Data Miner you need to make the option visible by selecting Tools->Data Miner->Make Visible. This will open the ODM connection tabs along with a couple of others. I’m running the following on a 12.1c database.
  • Open one of your ODM connections by double clicking on it.
  • ODM will check the version of the ODM repository in the DB. You will be prompted to upgrade the ODM repository to the latest version. Click on the Yes button

image

  • Enter the SYS password, or talk nicely to your DBA.

image

  • Then click the Start button to start the ODM repository upgrade

image

  • This will take anything from a minute to 10 minutes, depending on the location of the DB and your network.
  • When everything is finished you can close the window and start using Oracle Data Miner by opening an existing workflow or by creating a new one.

Tuesday, August 20, 2013

Schema Table Filtering for Oracle Data Miner

If you have been using Oracle Data Miner, that is part of SQL Developer 4 or SQL Developer 3, you will notice that your schema can get filled up with various tables that are created by your workflows. The following image gives an example.

image

These tables can include details of the various algorithms used and their settings, sample tables that were created using the various nodes, etc.  Basically they contain all the information that was setup by each node. Not every node in your workflow will create a table, but a lot do in particular if you have set the Cache or Sample in the Properties tab.

In most cases you do not need to be aware of or use most of these tables.

So How do I hide them, so that my schema table listing only shows me the main tables in my schema ?  By main tables, I mean the tables that you would expect to have in your schema before you started using Oracle Data Miner.

The answer to this question is to apply filters to your tables in SQL Developer. To do this go to your schema in the Connections tab. Expand to get the full list of schema objects and then right click on Tables. You should get a menu like the following.

image

Select Apply Filter from the menu and the Apply Filter window will open. Here you can create filters to apply to the tables in your schema.

To restrict Oracle Data Miner related table you will need to exclude tables that begin with, DM$ and ODMR$. The following image shows these filters.

image

When these filters are applied we only get our schema tables.

image

There are two additional filters you may want to consider. The first of these is for the tables that begin with OUTPUT. These are tables that are created when you build a node sends the outputs from running a model to a table, or some other scenario where the output is sent to a table. In reality this is bad naming and we should use a name that is more meaningful, and reflects the contents of the table. But sometimes you just want to spool the outputs to a table and the name is not important. I have an additional filter to not show these tables (see below).

With SQL Developer 4, Oracle Data Miner seems to generate IOTs, as we can see in the above image. Again another filter can be created to exclude these from the list.

Here is the full list of filters.

image

Tuesday, July 23, 2013

Oracle Data Miner New Features (SQL Dev 4)

With the release of the new Oracle 12c database and SQL Developer 4 we have a range of Oracle Data Miner new features . Some of these are embedded into the database and are only available in 12c. Check out my previous blog post on these new features.

In this blog post I will look at the new Oracle Data Miner features that come with the ODM tool in SQL Developer4.

The new features of the Oracle Data Miner tool can be grouped into 2 categories. The first category contains the new features that are available to all user of the tool (11.2g and 12c). The second category contains the new features that are only available in 12c. The new features of each of these categories will be explained below.

Category 1 – Common new features for 11.2g and 12c Database users

There is a new View Data feature that allows you to drill down to view the customer object and to view nested tables.

A new Graph Node that allows you to create graphs such as line, bar, scatter and boxplots for data at any stage of a workflow. You can specify any of the attributes from the data source for the graphs. You don’t seem to be limited to the number of graphs you can create.

image

A new SQL Node. This is welcome addition, as there has been many times that I’ve need to write some SQL or PL/SQL to do a specific piece of processing on the data that was not available with the other nodes. There are 2 important elements to this SQL node really. The first is that you can write SQL and PL/SQL code to do whatever processing you want to do. But you can only do it on the Data node you are connected to.

image

The second is that you can use it to call some ORE code. This allows you to use the power of R and extensive range of packages that are available to expand the analytic functionality that is available in the database. If there is some particular function that you cannot do in Oracle and it is available in R, you can now embed this function/code as an ORE object in the database. You can then called using SQL. 

WARNING: this particular feature will only work if you have ORE installed on your 11.2.0.3g or 12.1c database

New Model Build Node features, include node level text specifications for text transformations, displays the heuristic rules responsible for excluding predictor columns and being able to control the amount of classification and regression test results that are generated.  I’ll be covering these in later blog posts.

New Workflow SQL Script Deployment features. Up to now the workflow SQL script, I found to be of limited use. The development team have put a lot of work into generating a proper script that can be used by developers and DBA. But there are some limitations still. You can use the script will run the workflow automatically in the database without having the use the ODM tool. But it can only be run the in the schema that the workflow was generated. You will still have to do a lot of coding (although a lot less than you used to) to get your ODM models and workflows to run in another schema or database.

image

This will output the script to a file buried deep somewhere inside you SQL  Developer directory.  Unfortunately in the EA1 release, the size of this location field is small and scrolling has not been enabled. So you cannot (currently) scroll to the end of the field to see the actual location.  You can edit this location to have a different shorter location.

image

Maybe this will be fixed for the official release.

Category 2 – New features for 12c Database users.

Now for the new features that are only visible when you are running ODM / SQL Dev 4 against a 12c database. No configuration changes are needed. The ODM tool checks to see what version of the database you are logging into. It will then present the available features based on the version of the database.

New Predictive Query nodes allows you to build a node based on the new non-transient feature in 12c called Predictive Queries (PQs). In SQL Developer we get 3 addition types of Predictive Queries. These can be used for Anomaly Detection, Clustering and Feature Extraction

image

It is important to remember that underlying model produced by these PQs to not exist in the database after the query has executed. The model is created, used on the data and then the model deleted.

The Clustering node has the new algorithm Expectation Maximization in addition to the existing algorithms of K-Means and O-Cluster.

image

The Feature Extraction node has the new algorithm called Principal Component Analysis in addition to the existing Non-Negative Matrix Factorization algorithm.

image

Text Transformations are now built into the model build nodes. These text transformations will be part of the Automatic Data Processing steps for the model build nodes. This is illustrated in the above images.

The Generalized Linear Model that is part of the Classification Node has a Feature Selection option in the Algorithm Settings. The default setting is Ridge Regression. Now there is an additional option of using Feature Selection.

image

Prediction Result Explanations gives the scoring details used to to explain why the prediction was made.

 

Look out for blog post on each of these new features.

Tuesday, July 16, 2013

Installing & Setting up SQL Developer 4

The EA1 (early adopter) release of SQL Developer is now available. The main reason that I’m interested in this tools is that it has the upgraded Oracle Data Mining workflow tool. I’ve been using SQL Developer for a long, long time.  I was lucky enough to see a demo of it before it was ever released, back ……(well a long, long time ago) when Barry McGillin gave a demo of what they called Project Raptor, to a small group of (12) Oracle users in the Oracle East Point office, Dublin, Ireland. Barry was one of a couple of developers who were developing Project Raptor.
The EA1 release of SQL Developer 4 comes without the JDK install. For SQL Developer 4 you will need to install JDK 1.7.  There is a link from the SQL Developer 4 download page.
image
After installing JDK 1.7 or maybe you have it installed already, you are ready to setup SQL Developer 4. The following instructions are for installing SQL Developer 4 on Windows.
After downloading it from the download page, all you have to do is to unzip the download. There is no install program. You are almost ready to start using SQL Developer.
There are 2 types of setup for SQL Developer. The first is where you have not used SQL Developer before. Point 1 below shows what is involved with this scenario.  Point 2 below shows what is involved if you have used previous releases of SQL Developer.
0.   Common steps to installing and setting up SQL Developer
  • Unzip the SQL Developer 4 download file to a location where you want the software to be located.
  • Go down the directories to where the sqldeveloper.exe is located.
  • Create a shortcut on your desktop for this file.
  • Double click on the shortcut on your desktop
  • Enter the location where JDK 1.7 was installed
    • C:\Program Files\Java\jdk1.7.0_25
  • SQL Developer will start
image
1.   Scenario: Env. that has not used SQL Dev before
image
  • Double click on the connection to open the SQL Worksheet
  • Finally enjoy 12c Smile
2. Scenario: Previous releases of SQL Developer exist
  • When asked about importing preferences from your previous SQL Developer installation, say Yes. This will take the connections from the most recent version of SQL Developer that you have installed. If you want to change this click on the button and select the version from the list
  • The install will progress updating everything and pull in your connects.
  • When finished SQL Developer 4 will open
  • But before you get going you should test that your connections work. An easy way of doing this is to use the pingall command. Open a SQL worksheet, connect to one of your schemas (this will test that your connection works), type pingall and press F5. This will test all of your connections and tell you which ones are currently working and which connections are not (you will see a –1ms).
  • You can now enjoy SQL Developer 4.
During the install of SQL Developer 4 I had an error. After inserting the directory for Java, the progress bar of the loading window got to about 1cm, displaying Registering Extensions above it, and then the loading window closed. SQL Dev 4 did not open.  After various attempts at investigating the problem, it looks like the directory created in AppData (Windows 7) was corrupted in some way. The solution to this problem is to rename or remove the directory.
\AppData\Roaming\SQL Developer\system4.0.0.12.27
When you have renamed or removed this directory, try starting SQL Dev 4 again. Everything should work now. Well it did for me.
Many thanks to Turloch in Oracle for his help.

Thursday, November 24, 2011

Applying an ODM Model to new data in Oracle – Part 2

This is the second of a two part blog posting on using an Oracle Data Mining model to apply it to or score new data. The first part looked at how you can score data the DBMS_DATA_MINING.APPLY procedure for scoring data batch type process.

This second part looks at how you can apply or score the new data, using our ODM model, in a real-time mode, scoring a single record at a time.

PREDICTION Function

The PREDICTION SQL function can be used in many different ways. The following examples illustrate the main ways of using it. Again we will be using the same data set with data in our (NEW_DATA_TO_SCORE) table.

The syntax of the function is

PREDICTION ( model_name, USING attribute_list);

Example 1 – Real-time Prediction Calculation

In this example we will select a record and calculate its predicted value. The function will return the predicted value with the highest probability

SELECT cust_id, prediction(clas_decision_tree using *)
FROM   NEW_DATA_TO_SCORE
WHERE cust_id = 103001;

   CUST_ID PREDICTION(CLAS_DECISION_TREEUSING*)
---------- ------------------------------------
    103001                                    0

So a predicted class value is 0 (zero) and this has a higher probability than a class value of 1.

We can compare and check this results with the result that was produced using the DBMS_DATA_MINING.APPLY function (see previous blog post).

SQL> select * from new_data_scored
  2  where cust_id = 103001;

   CUST_ID PREDICTION PROBABILITY
---------- ---------- -----------
    103001          0           1
    103001          1           0

Here we can see that the class value of 0 has a probability of 1 (100%) and the class value of 1 has a probability of 0 (0%).

Example 2 – Selecting top 10 Customers with Class value of 1

For this we are selecting from our NEW_DATA_TO_SCORE table. We want to find the records that have a class value of 1 and has the highest probability. We only want to return the first 10 of these

SELECT cust_id
FROM    NEW_DATA_TO_SCORE
WHERE PREDICTION(clas_decision_tree using *) = 1
AND       rownum <=10;

   CUST_ID
----------
    103005
    103007
    103010
    103014
    103016
    103018
    103020
    103029
    103031
    103036

Example 3 – Selecting records based on Prediction value and Probability

For this example we want to find our from what Countries do the customer come from where the Prediction is 0 (wont take up offer) and the Probability of this occurring being 1 (100%). This example introduces the PREDICTION_PROBABILITY function. This function allows use to use the probability strength of the prediction.

select country_name, count(*)
from   new_data_to_score
where  prediction(clas_decision_tree using *) = 0
and    prediction_probability (clas_decision_tree using *) = 1
group by country_name
order by count(*) asc;

COUNTRY_NAME                               COUNT(*)
---------------------------------------- ----------
Brazil                                            1
China                                             1
Saudi Arabia                                      1
Australia                                         1
Turkey                                            1
New Zealand                                       1
Italy                                             5
Argentina                                        12
United States of America                        293

The examples that I have give above are only the basic examples of using the PREDICTION function. There are a number of other uses that include using the PREDICTION_COST, PREDICTION_SET, PREDICTION_DETAILS. Examples of these will be covered in a later blog post

Monday, October 31, 2011

ODM 11.2–Data Mining PL/SQL Packages

The Oracle 11.2 database contains 3 PL/SQL packages that allow you to perform all (well almost all) of your data mining functions.

So instead of using the Oracle Data Miner tool you can write some PL/SQL code that will you to do the same things.

Before you can start using these PL/SQL packages you need to ensure that the schema that you are going to use has been setup with the following:

  • Create a schema or use and existing one
  • Grant the schema all the data mining privileges: see my earlier posting on how to setup an Oracle schema for data mining – Click here and YouTube video
  • Grant all necessary privileges to the data that you will be using for data mining

The first PL/SQL package that you will use is the DBMS_DATA_MINING_TRANSFORM. This PL/SQL package allows you to transform the data to make it suitable for data mining. There are a number of functions in this package that allows you to transform the data, but depending on the data you may need to write your own code to perform the transformations. When you apply your data model to the test or the apply data sets, ODM will automatically take the transformation functions defined using this package and apply them to the new data sets.

The second PL/SQL package is DBMS_DATA_MINING. This is the main data mining PL/SQL package. It contains functions to allow you to:

  • To create a Model
  • Describe the Model
  • Exploring and importing of Models
  • Computing costs and text metrics for classification Models
  • Applying the Model to new data
  • Administration of Models, like dropping, renaming, etc

The next (and last) PL/SQL package is DBMS_PREDICTIVE_ANALYTICS.The routines included in this package allows you to prepare data, build a model, score a model and return results of model scoring. The routines include EXPLAIN which ranks attributes in order of influence in explaining a target column. PREDICT which predicts the value of a target attribute based on values in the input. PROFILE which generates rules that describe the cases from the input data.

Over the coming weeks I will have separate blog posts on each of these PL/SQL packages. These will cover the functions that are part of each packages and will include some examples of using the package and functions.

Saturday, October 29, 2011

ODM PL/SQL API 11.2 New Features

The PL/SQL API interface for Oracle Data Miner has had a number of new features. These are listed below along with the new API features added with the 11.1 release.

  • Support for Native Transactional Data with Association Rules: you can build association rule models without first transforming the transactional data.
  • SVM class weights specified with CLAS_WEIGHTS_TABLE_NAME: including the GLM class weights
  • FORCE argument to DROP_MODEL: you can now force a drop model operation even if a serious system error has interrupted the model build process
  • GET_MODEL_DETAILS_SVM has a new REVERSE_COEF parameter: you can obtain the transformed attribute coefficients used internally by an SVM model by setting the new REVERSE_COEF parameter to 1

11.1g API New Features

  • Mining Model schema objects: previous releases, DM models were implemented as a collection of tables and metadata within the DMSYS schema. in 11.1 models are implemented as data dictionary objects in the SYS schema. A new set of DD views present DM models and their properties
  • Automatic and Embedded Data Preparation: previously data preparation was the responsibility of the user. Now it can be automated
  • Scoping of Nested Data: supports nested data types for both categorical and numerical data. Most algorithms require multi-record case data to the presented as columns of nested rows, each containing an attribute name/value pair. ODM processes each nested row as a separate attribute.
  • Standardised Handling of Sparse Data & Missing Values: standardised across all algorithms.
  • Generalised Linear Models: has a new algorithm and supports classification (logistic regression) and regression (linear regression)
  • New SQL Data Mining Function: PREDICTION_BOUNDS has been introduced for Generalised Linear Models. This returns the confidence bounds on predicted values (regression models) or predicted probabilities (classification)
  • Enhanced Support for Cost-Sensitive Decision Making: can be added or removed using DATA_MINING.ADD_COST_MATRIX and DBMS_DATA_MINING_REMOVE_COST_MATRIX.

Wednesday, October 12, 2011

SQL Developer 3.1 EA & Bug

The new/updated SQL Developer 3.1 Early Adopter has just been released.

For the Data Miner, there are no major changes and it appears that there has been some bug fixes and some minor enhancements to so parts.

The main ODM features, apart from bug fixes, in this release include:

  • Globalization support, including translated error messages and GUI for all languages supported by SQL Developer
  • Improved accessibility features including the addition of a Structure navigator that lists all the nodes and links displayed in a workflow

Bug / Feature

After unzipping the download I opened SQL Developer. With each new release you will have to upgrade the existing ODM repository. The easiest way of doing this is to open the ODM connections pane and double click on one of your ODM schemas. SQL Developer will then run the necessary scripts to upgrade the repository.

I discovered a bug/feature with SQL Developer 3.1 EA1  upgrade script. The repository upgrade does not complete and an error is report.

I logged this error on the ODM forum on OTN. Mark Kelly who is the Development Manager for ODM and monitors the ODM forum, and his team, were quickly onto investigating the error. Mark has posted an update on the ODM form and give a script that needs to be run before you upgrade your existing repository.

You can download the pre-upgrade script from here.

If you don’t have an existing repository then you don’t have to run the script.

Check out the message on the ODM forum.

https://forums.oracle.com/forums/ann.jspa?annID=1678

https://forums.oracle.com/forums/thread.jspa?threadID=2296374&tstart=0

 

How to Upgrade SQL Developer & ODM

You will have to download the new SQL Developer 3.1 EA install files.

http://www.oracle.com/technetwork/developer-tools/sql-developer/sqldev-ea-download-486950.html

  • Unzip this into your SQL Developer directory
  • Create a shortcut for  sqldeveloper.exe on your desktop and relabel it SQL Developer 3.1 EA
  • Double-click this short cut

SNAGHTML137e3ee9

  • You should be presented with the above window. Select the Yes button to migrate you previous install settings
  • SQL Developer should now open and contains all your previous connections

If you have an existing ODM repository, you need to run the pre-upgrade script (see above) at this point 

  • You will now have to upgrade the ODM repository in the database. The simplest way of doing this is to allow SQL Developer to run the necessary scripts.
  • From the View Menu, select Oracle Data Miner –> Connections
  • In the ODM Connections pane double click one of your ODM schemas. Enter the username and password and click OK

SNAGHTML1383bc81[4]

  • You will then be prompted to migrate/update the ODM repository to the new version. Click Yes.
  • Enter the SYS username and Password

SNAGHTML1385934f

  • Click Start button, to start the migrate/upgrade scripts
  • On my laptop this migrate/upgrade step took less than 1 minute
  • The upgrade is now finished and you can start using ODM.

ODM – SQL Developer 3.1 EA – Release Notes

The ODM release notes can be found at

http://www.oracle.com/technetwork/database/options/odm/dataminer-31-relnotes-489144.html

Saturday, August 6, 2011

New Frontiers for Oracle Data Miner

Oracle Data Miner functionality is now well established and proven over the years. In particular with the release of the ODM 11gR2 version of the tool. But how will Oracle Data Miner develop into the future.

There are 4 main paths or Frontiers for future developments for Oracle Data Miner:

Oracle Data Miner Tool

The new ODM 11gR2 tool is a major development over the previous version of the tool. With the introduction of workflows and some added functionality for some of the features. the tool is now comparable with the likes of SAS Enterprise Miner and SPSS.

But the new tool is not complete and still needs a bit of fine tuning of most of the features. In particular with the usability and interactions. Some of the colour schemes needs to be looked at or to allow users to select their own colours.

Apart from the usability improvements for the tool another major development that is needed, is the ability to translate the workflow and the underlying database objects into usable code. This code can then be incorporated into our applications and other tools. The tool does allow you to produce shell code of the nodes, but there is still a lot of effort needed to make this usable.  Under the previous version of the tool there was features available in JDeveloper and SQL Developer to produced packaged code that was easy to include in our applications.

“A lot done – More to do”

Oracle Applications

Over the past couple of months there has been a few postings on how Oracle Data Miner (11gR2) has been, or will be, incorporated in various Oracle Applications. For example Oracle Fusion Human Capital Management and Oracle Real Time Decision (RTD). Watch out of other applications that will be including Oracle Data Miner.

“A bit done – Lots more to do”

Oracle Business Intelligence

One of the most common places where ODM can be used is with OBIEE. OBIEE is the core engine for the delivery of the BI needs for an organisation. OBIEE coordinates the gathering of data from various sources, the defining of the business measures and then the delivery of this information in various forms to the users. Oracle Data Miner can be included in this process and can add significant value to the BI needs and report.

“A lot done – Need to publicise more”

Customized Projects

Most data mining projects are independent of various Applications and BI requirements. They are projects that are hoping to achieve a competitive insight into their organisational data. Over time as the success of some pilot projects become know they need for more data mining projects will increase. This will lead to organisations have a core data mining team to support these project. With this, the team will need tools to support them in the delivery of their project and with the delivery. This is were OBIEE and Oracle Fusion Apps will come increasingly important.

“A lot done – more to do”

Monday, July 18, 2011

VirtaThon Presentation

Today I gave my VirtaThon presentation on the new Oracle Data Miner 11gR2 tool.

It was an interesting experience as VirtaThon was a virtual conference. The organisation and administration of the conference was excellent.

I had over 25 participants for my presentation, including Carolyn Hamm who has written a book on using Oracle Data Miner 10g.  She seemed to enjoy my presentation as she was asking for more at the end, but we had run out of time.

The presentation was an unusual but interesting experience. All the participants were muted, so I could not hear anyone or be asked questions as the presentation progressed. I was not able to judge the body language or facial expressions, for me to work out how the presentation was going.

I was sitting in my living room when giving the presentation and spent almost an hour talking to myself. At time the concentration levels dipped and I have to refocus and used some visualisation to help me concentrate.

The presentation was divided into 2 parts. The first part was a presentation consisting of some background to ODM, how to get setup and running with ODM, and finally a discussion of some of the new features. This first part took approx. 30 minutes which surprised me as during my rehearsals it was talking 16 minutes. The second part of the presentation was a demo of using ODM to create a workflow, generating a classification model and then applying this model to some new data. During my rehearsals this was taking approx. 40 minutes.

I only had 50-55 minutes for my VirtaThon presentation so after my presentation I had 20-25 minutes for the demo. So I had to get through the demo quickly and I had to cut out a discussion of how the data exploration functionality in ODM can be used to get an insight into the data before you start using the data mining features. I will put together a blog post and video of this in a couple of weeks time that will explain it in more detail.

I managed to finish at 49 minutes, which left 6 minutes for questions. There was only a couple of questions, plenty of Thank You’s along with Good Presentation, which is always good to hear.

Thank you to everyone who attended the presentation and to the organisers of VirtaThon.

Brendan Tierney

Friday, May 27, 2011

Creating ODM Schemas & Repository for ODM 11g R2

Before you can start using the Oracle Data Miner features that are now available in SQL Developer 3, there are a few steps you need to perform. This post will walk you through these steps and I have put together a video which goes into more detail. The video is available on my YouTube channel.

http://www.youtube.com/user/btierney70

Oracle Data Miner 11g R2 : Creating ODM User & Repository video

I will be posting more How To type videos over the coming weeks and months. Each video will focus in one one particular feature within the new Oracle Data Mining tool.

So following steps are necessary before you can start using the ODM tool

Set up of Oracle Data Miner tabs

To get the ODM tabs to display in SQL Developer, you need to go to the View menu and select the following from the Data Miner submenu

  • Data Miner Connections
  • Workflow Jobs
  • Property Inspector

image

Create an ODM Schema

There are two main ways to create a Schema. The first and simplest way is to use SQL Developer. To do this you need to create a connection to SYS. Right click on the Other Users option and select Create User.

The second option is to use SQL*Plus to create the user. Using both methods you need to grant Connect & Resource privileges to the user.

Create the Repository

Before you can start using Oracle Data Mining, you need to create an Oracle Data Miner Repository in the database. Again there are two ways to do this. The simplest is to use the inbuilt functionality in SQL Developer. In the Oracle Data Miner Connections tab, double click on the ODM schema you have just created. SQL Developer will check the database to see if the ODM Repository exists in the database. If it will create the repository for you. But you will need to provide the SYS password.

The other way to create the repository is to run the installodmr.sql script that in available in the ‘datamining’ directory.

@installodmr.sql <default tablespace> <temp tablespace>

example:   @installodmr.sql USER TEMP

Create another ODM Schema

It is typical that you would need to have more than one schema for your data mining work. After creating the default Oracle schema, the next step is to grant the schema the privileges to use the Data Mining Repository. This script is called

usergrants.sql <DM Schema>

example:    @usergrants.sql DMUSER

Hint: The schema name needs to be in upper case. 

IMPORTANT: The last grant statement in the script may give an error. If this occurs then it is due to an invalid hidden character on the line. If you do a cut and paste of the grant statement and execute this statement, everything should run fine.

If you want to demo data to be created for this new ODM schema then you need to run

@instdemodata.sql <DM Schema>

example:    @instdemodata.sql DMUSER

All of these scripts can be found in SQL developer directories

\sqldeveloper\dataminer\scripts