Brendan Tierney - Oralytics Blog

Friday, February 3, 2012

ODM 11gR2–Attribute Importance

I had a previous blog post on Data Exploration using Oracle Data Miner 11gR2. This blog post builds on the steps illustrated in that blog post.

After we have explored the data we can identity some attributes/features that have just one value or mainly one value, etc. In most of these cases we know that these attributes will not contribute to the model build process.

In our example data set we have a small number of attributes. So it is easy to work through the data and get a good understanding of some of the underlying information that exists in the data. Some of these were pointed out in my previous blog post.

The reality is that our data sets can have a large number of attributes/features. So it will be very difficult or nearly impossible to work through all of these to get a good understanding of what is a good attribute to use, and keep in our data set, or what attribute does not contribute and should be removed from the data set.

Plus as our data evolves over time, the importance of the attributes will evolve with some becoming less important and some becoming more important.

The Attribute Importance node in Oracle Data Miner allows use to automate this work for us and can save us many hours or even days, in our work on this task.

The Attribute Importance node using the Minimum Description Length algorithm.

The following steps, builds on our work in my previous post, and shows how we can perform Attribute Importance on our data.

1. In the Component Palette, select Filter Columns from the Transforms list

2. Click on the workflow beside the data node.

3. Link the Data Node to the Filter Columns node. Righ-click on the data node, select Connect, move the mouse to the Filter Columns node and click. the link will be created

4. Now we can configure the Attribute Importance settings.Click on the Filter Columns node. In the Property Inspector, click on the Filters tab.

- Click on the Attribute Importance Checkbox

- Set the Target Attribute from the drop down list. In our data set this is Affinity Card

5. Right click the Filter Columns node and select Run from the menu

After everything has run, we get the little green box with the tick mark on the Filter Column node. To view the results we right clicking on the Filter Columns node and select View Data from the menu. We get the list of attributes listed in order of importance and their Importance measure.

We see that there are a number of attributes that have a zero value. It algorithm has worked out that these attributes would not be used in the model build step. If we look back to the previous blog post, some of the attributes we identified in it have also been listed here with a zero value.

Wednesday, February 1, 2012

Oracle Ireland Security Seminar–Dublin 8th March 2012

Oracle Ireland will be hosting an security event on Thursday 8th March, between 9:30-13:30, in their East Point offices.

The seminar is titled Defending Against Hackers and Hacking.

There will be a couple of demos what show some of the simple and not so simple techniques that hack and compromise your systems. This can lead to loss of confidential information.

To register for the event go to

http://www.oracle.com/webapps/events/ns/EventsDetail.jsp?p_eventId=143168&src=7302155&src=7302155&Act=58

Wednesday, January 25, 2012

Update on Exalytics Pricing

In my previous blog post (Exalytics : How much will it cost me ?) I gave an outline of the pricing you might expect for an Exalytics machine.

The final pricing that I gave of approx $3+M was based on the per processor licencing.

Yesterday (24th Jan) the Oracle Business Intelligence blog by Manan, included the pricing based on the per user licences.

The following is a breakdown of the Exalytics pricing based on the minimum 100 user licencing.

Licence Costs (100 users)

Exalytics machine = $135,000

TimesTen = $300 x 100 users = $30,000

BI Foundation Suite = $3,675 x 100 users = $367,500

Giving a grand total of $532,500.

Support Costs (100 users)

But we need to add the annual support costs to this.

Exalytics machine support = $29,700.

TimesTen support = $66 x 100 users = $6,600

BI Foundations suite = $809 x 100 users = $80,900

Total support costs (100 users) = $116,500

First year & on-going costs costs

Total first year cost for an Exalytics machine = $532,500 + $117,200 = $649,700

Plus on going annual support costs of $117,200 in year 2 and subsequent years.

Discounted Costs

If you are one of the lucky customer who can If I use the same discounts, as I did in my previous blog post, of 25% discount on hardware and 60% discount on the software, we get:

Year 1 cost of : ($135,000*0.75) + ($397,500*0.40) = $260,250

So it might be possible to get an Exalytics machine for $260+K, plus annual support costs.

Monday, January 23, 2012

Exalytics : How much will it cost me ?

Over the past couple of weeks the costing for the Oracle Exalytics machine has been made public by Oracle and there has been a number of articles. What I’ve done in this blog post is to collate this information. I give what I understand to be the cost of purchasing an Exalytic machine and to get setup and running.

The pricing structure starts at

Exalytics machine + cost of BI Foundation Suite + TimesTen licences

Exalytics machine = $135,000

TimesTen = $34,500 per processor licence or $300 per named user(min 100 users)

BI Foundation Suite = $450,000 per processor licence or $3,675 per named user (same number of users as for TimesTen = min 100 users)

Annual Support Costs

Exalytics machine = $29,700

TimesTen = 22% of software licence – $7,590 per processor licence or $66 per named user (min 100 users)

BI Foundation Suite = $99,000 per processor licence or $809 per named user(min 100 users)

The Exalytics machine consists of a single server with 1TB of RAM and 4 Intel Xeon E7-4800 processors, with 10 cores each.

So the total cost of an Exalytics machine based on the processor licence will be something towards the $10M. Now this is before the discounts that you can negotiate. There are reports of discounts ranging up to 25% on hardware and 60% on software. The size of the discount is depended on your size etc. So this initial $10M cost could be reduced to $3M+.

Please note that I may have gotten some or all of this pricing wrong. If I have then forgive me and let me know what is wrong. I can correct it to ensure that we have the correct costs.

Friday, January 20, 2012

OUG Ireland 2012 Conference & Re-launching the SIGs in Ireland

The schedule of presentations for the 2012 Oracle User Group Ireland conference has just been released.

This year the conference will be in the Dublin Convention Centre.

There are 5 parallel streams with sessions running all day. Key notes presentations will be given by Eileen O’Mara from Oracle Ireland (given in the morning) and the key note at the end of the day will be given by Mogen Norgaard from Miracle AS in Denmark and founder of the OakTable Network.

There will be 7 Oracle ACE Directors and 3 Oracle ACEs presenting at the conference in addition to other speakers from Ireland, UK and Netherlands. Here is the full list of Oracle ACEs that are presenting

Debra Lilley, ACE Director
Mark Rittman, ACE Director
Roel Hartman, ACE Director
Simon Haslam, ACE Director
Frits Hoogland, ACE Director
Mogens Norgaard, ACE Director
Lonneke Dikmans, ACE Director
Brendan Tierney, ACE
Marcin Przepiorowski, ACE
Grant Ronald, ACE

It is also FREE to attend. So go ahead an book yourself a place for a full day Oracle training.

To register your place – Click here

The agenda for the conference can be found here – Click here

There are a few important web and social media links that might be of interest for all Oracle people in Ireland.

Web :       www.oug.og/ireland
Twitter :   @oug_ire
                @oug_ire2012 For the 2012 conference related tweets
LinkedIn : http://www.linkedin.com/groups?home=&gid=3441701

Re-launching the SIGS in Ireland

Over the past few years the User Group SIGs have been a bit quiet here in Ireland. Except for the BI (&EPM) SIG, which has been getting a good attendance since it launched back in 2009.

In particular the Oracle Technology SIG basically ceased to exist over the past 5 or 6 years, since the days when John Knox and Tagdh Cashman were running it.

At the conference it is hoped that to get some people to volunteer to get this SIG back up and running again. Maybe we could get 3 SIG events each year. What do you think ?

Drop me an email if you have any questions regarding the SIGs in Ireland (I’m deputy chair of the BI & EPM SIG).Hopefully

I’ll see you there on the 21st March.

Monday, January 9, 2012

New additions to my Oracle Magazine Collection

I put out a call a few months ago looking for help in completing my Oracle Magazine collection. I also had a letter in the Nov/Dec 2011 edition of Oracle Magazine looking for help.

The following people contacted me in December offering to send me some of my missing editions.

Lisa Dobson - Oracle DBA at Durham University and Vice President of UKOUG
Debra Lilley - President of UKOUG
Christian Antognini - Trivadis AG, Switzerland
Dan Vlamis - Vlamis Software Solutions, MO, USA

Many thanks for your donations.

It seems that people don’t keep their Oracle Magazines!

I’m now just missing a very small number of editions since 1992. Can anyone else help ?

Does anyone have any Oracle Magazines from before 1992 ?

The following table gives the current collection (printed editions). The grey boxes are the editions that I’m still missing, and keep an eye on my Oracle Magazine page for updates.

Friday, January 6, 2012

ODM 11gR2–Real-time scoring of data

In my previous posts I gave sample code of how you can use your ODM model to score new data.

Applying an ODM Model to new data in Oracle – Part 2

Applying an ODM Model to new data in Oracle – Part 1

The examples given in this previous post were based on the new data being in a table.

In some scenarios you may not have the data you want to score in table. For example you want to score data as it is being recorded and before it gets committed to the database.

The format of the command to use is

prediction(ODM_MODEL_NAME USING <list of values to be used and what the mode attribute they map to>)

prediction_probability(ODM_Model_Name, Target Value, USING <list of values to be used and what model attribute they map to>)

So we can list the model attributes we want to use instead of using the USING * as we did in the previous blog posts

Using the same sample data that I used in my previous posts the command would be:

Select prediction(clas_decision_tree
USING
20 as age,
'NeverM' as cust_marital_status,
'HS-grad' as education,
1 as household_size,
2 as yrs_residence,
1 as y_box_games) as scored_value
from dual;

SCORED_VALUE
------------
0

Select prediction_probability(clas_decision_tree, 0
USING
20 as age,
'NeverM' as cust_marital_status,
'HS-grad' as education,
1 as household_size,
2 as yrs_residence,
1 as y_box_games) as probability_value
from dual;

PROBABILITY_VALUE
-----------------
1

So we get the same result as we got in our previous examples.

Depending of what data we have gathered we may or may not have all the values for each of the attributes used in the model. In this case we can submit a subset of the values to the function and still get a result.

Select prediction(clas_decision_tree
USING
20 as age,
'NeverM' as cust_marital_status,
'HS-grad' as education) as scored_value2
from dual;

SCORED_VALUE2
-------------
0

Select prediction_probability(clas_decision_tree, 0
USING
20 as age,
'NeverM' as cust_marital_status,
'HS-grad' as education) as probability_value2
from dual;

PROBABILITY_VALUE2
------------------
1

Again we get the same results.

Tuesday, January 3, 2012

ODM 11gR2–Using different data sources for Build and Testing a Model

There are 2 ways to connect a data source to the Model build node in Oracle Data Miner.

The typical method is to use a single data source that contains the data for the build and testing stages of the Model Build node. Using this method you can specify what percentage of the data, in the data source, to use for the Build step and the remaining records will be used for testing the model. The default is a 50:50 split but you can change this to what ever percentage that you think is appropriate (e.g. 60:40). The records will be split randomly into the Built and Test data sets.

The second way to specify the data sources is to use a separate data source for the Build and a separate data source for the Testing of the model.

To do this you add a new data source (containing the test data set) to the Model Build node. ODM will assign a label (Test) to the connector for the second data source.

If the label was assigned incorrectly you can swap what data sources. To do this right click on the Model Build node and select Swap Data Sources from the menu.

Saturday, December 31, 2011

My first set of Oracle Products

I started working with Oracle back in 1993 and my first project involved working with Oracle 5, Forms 2.3 and for reports RPT.

The Oracle Database and tools were very simple back then, but there was lots of “features” to work around.

Check out this video, for a short demo of Oracle 5 and Forms 2

What version of the Oracle Database do I need ?

Oracle has recently made available a very useful webpage that that lists the functionality available for each version of the 11g Database. So before you decide which version of the database to purchase, check out this webpage.

http://www.oracle.com/us/products/database/product-editions-066501.html

Friday, December 23, 2011

i-BI : A new name for real BI

I’ve been working in the BI and related fields since the mid 90s. Over the past number of years I’ve gotten a little bit confused about what Business Intelligence (BI) really means. Maybe it’s just a bit of old age kicking in way too early.

It seems to me that the term Business Intelligence has been hijacked by a large number of companies and software vendors. It seems that every “reporting tool” has been re-labelled into a Business Intelligence tool, without providing any really intelligence features. You are still just a reporting tool with no real intelligence features. Yes you do have some nice graphics that can be used instead of just listing numbers. But that is not Business Intelligence.

Business Intelligence is going beyond what these tools are capable off. Most of the skills and abilities for BI comes from the people who are doing it, not the tools. In reality you will need to use a number of tools or to write some custom code to help you gain the extra bit of insight into your data. The “reporting tools” can then deliver the results.

Also Ralph Kimball said a long time ago that the skills of someone working in the DW/BI area was that they needed to be half-DBA and half-MBA.

A quote that I heard recently from the Predictive Analytics World Conference, was “You need to be able to ask the right question”. This is to ensure that you can frame your analytics projects correctly and be able to measure the results.

I think that this question was key back in the mid 90s when I started out in the BI field and I still think it applies to all areas of BI. The thing that we have lost in BI is the real intelligence part of it.

So I’m proposing a new name for really BI. It is intelligent-Business Intelligence (i-BI).

Lets differentiate between BI and the real intelligent BI work.

What do I mean by intelligent BI (i-BI) ? What I mean area skills in Data Warehousing, Time Series Analysis, Advanced Analytics, Data Mining, Predictive Analysis, solving or addressing real business problems, etc.

Or maybe I’m just wrong and have missed some developments in BI over the past 16+ years. Or maybe I’m becoming a bit too cynical.

What do you think ?

Wednesday, December 21, 2011

Article for Oracle Scene–Due 13th Jan

As we approach Christmas, many of us will be looking forward to a few days holidays/vacation. During this period we may start thinking about some techniques or methods that we discovered over the past 12 months or about things we need to find out more on, over the coming months.

One thing to consider is to write an article on these techniques or methods, for Oracle Scene. The next due date for submitting articles is 13th January.

http://www.ukoug.org/what-we-offer/oracle-scene/editorial-calendar/

For more details and ideas check out my webpage Oracle Scene

Make sure you check out the Article Guidelines and Submission Details

http://www.ukoug.org/what-we-offer/oracle-scene/article-submissions/

I hope to write an article based on the presentation I gave at the UKOUG Conference in Birmingham.

The most common question that I get asked is ‘how long should it be?’. The length of an article can be anything from half a page, up to 4 or 5 pages long.

Pages