Brendan Tierney - Oralytics Blog

Wednesday, February 6, 2013

Oracle ACE Director

Towards the end of last week I received an email from Oracle saying that I had been nominated and accepted, by Oracle, to become an Oracle ACE Director.

This is something that makes me very proud and honours the work I have been doing over the past few years on Data Mining in Oracle (Advanced Analytics Option, Data Science/Predictive Analytics, or whatever you want to call it). Thank you to everyone who nominated me.

If you are not familiar with the Oracle ACE Program, it is a way for Oracle to acknowledge not only technical skills but also personal engagement with the Oracle Community and Technology overall. There is even a FAQ that explains how this program works.

There are a few perks that come with the title, and Oracle have a few expectations too. Most of these expectations I’m already meeting!! What I’m looking forward to later this year is my first Oracle ACE Director briefing at Oracle Open World (22-26 September).

Wednesday, January 30, 2013

Oracle Magazine-Nov/Dec. 1998

The headline articles for the Nov/Dec 1998 edition of Oracle Magazine were on building web-based applications and thin-client computing. A large part of the magazine was dedicated to these topics. This was a bumper edition with a total of 152 pages of content.

Monday, January 28, 2013

Rexer Analytics 2013 Data Miner Survey

Rexer Analytics has been conducting the Data Miner Survey since 2007. Each survey explores the analytic behaviors, views and preferences of data miners and analytic professionals. Over 1300 people from around the globe participated in the 2011 survey. Summary reports (PDFs of about 40 pages) from previous surveys are available FREE to everyone who requests them by emailing us at DataMinerSurvey@RexerAnalytics.com. Also, highlights of earlier Data Miner Surveys are available online, including best practices shared by respondents on analytic success measurement, overcoming data mining challenges, and other topics. The FREE Summary Report for this 2013 Data Miner Survey will be available to everyone in the fall of 2013.

Your survey responses are completely confidential. This research is not being conducted on behalf of any third party, but is solely for Rexer Analytics to disseminate the findings throughout the data mining and analytics community.

To participate, please click on the link below, then click on the “Start Survey” link on the bottom of the webpage. Please enter the access code in the space provided. The survey should take approximately 15-20 minutes to complete. Anyone who has had this email forwarded to them should use the access code in the forwarded email.

Survey Link: www.RexerAnalytics.com/Data-Miner-Survey-2013-Intro.html

Access Code: UL3X7

Friday, January 25, 2013

OUG Norway Agenda is now live

The OUG Norway spring conference (17th April – 19th April) agenda is now live and is open for registrations.

Click here for the Conference Agenda

Click here for the Conference Registration

This is a 3 day conference. The first day (17th April) will be held in the Radisson BLU Scandinavia (Holbergsplass) and the next two (and a bit) days will be on the Color Magic boat that will be travelling between Oslo and Kiel in Germany and back to Oslo. The boat will be arriving back in Oslo on the Saturday morning (20th April).

There will be some presentations in Norwegian, but it looks like most of the presentations will be in English. There will also be some well known names from the Oracle world presenting at this conference.

In addition to these people, I will be giving two presentations on using Predictive Analytics in Oracle using the Oracle Data Miner tool and in-database functionality.

My first presentation will be an overview of the advanced analytics option and a demonstration of what you can do using the Oracle Data Miner tool (part of SQL Developer). This presentation is currently scheduled for Thursday (18th April) at 5pm.

My second presentation will be at 9:30am on the Friday morning (19th April). In this presentation we will look at the in-database features and what we can do in SQL and PL/SQL, and we will look at what you need to do to deploy your Oracle Data Mining models in a production environment.

If possible, we might be able to review some new 12c features for Oracle Data Miner :-)

Tuesday, January 22, 2013

Agenda Planning for OUG Ireland Annual event 2013

Over the past week there have been a number of meetings of the organising committee of the Annual OUG Ireland event, to arrange the agenda for the 2013 event. This will be held on the 12th March in the DCC (same as last year).

We have had a large number of submitted presentations from experts from around the world. The choices we have had to make were very difficult.

The agenda is almost complete. A few finishing touches and we should have it all sorted out by the end of the week.

This year we will have the largest conference/annual event yet. There will be a good mixture of presentations from Oracle, Customers, Partners, Oracle ACEs and other people (with an interesting story to tell) from around the world.

At the moment it looks like we will have tracks on

Oracle Database
Tech
Development
Fusion
EBS
Product & JDE
BI & EPM (this may have 2 parallel tracks)

When the agenda is available and live, I will put up a new post.

Registration is now live and by all reports a lot of people have already registered.

OUG Ireland Annual event – online Registrations

In previous years this event used to be called the Annual OUG Ireland Conference, but the Conference part has been dropped. I’ll try to explain it over a drink some time.

Monday, January 21, 2013

Oracle Magazine-Sept/Oct 1998

The headline articles for the Sept/Oct 1998 edition of Oracle Magazine were all on how to build and deploy intranets within an organisation, using Oracle products. There were a few case studies illustrating the benefits that using intranets can bring to an organisation.

Saturday, January 19, 2013

BIWA Oracle Data Scientist Certificate

Last week I had the opportunity to present at the BIWA Summit conference. This was held in the Sofitel Hotel beside the Oracle HQ buildings at Redwood Shores, just outside of San Francisco.

This conference was a busy 2 days with 4 parallel streams of presentations and another stream for Hands-on Labs. The streams covered Big Data, Advanced Analytics, Business Intelligence and Data Warehousing. There were lots of great presentations from well known names in the subject areas.

The BIWA Oracle Data Scientist Certificate was launched at the summit. The requirements for this certificate were to attend my presentation on ‘The Oracle Data Scientist’ (this was compulsory) and then to attend a number of other data science related presentations and hands-on labs. In addition to these presentations there is a short exam to take. This consists of some 30-ish questions, which are based on my presentation and some of the other presentations and hands-on labs. The main topic areas covered in the exam include what data science is about, Oracle Data Miner, Oracle R Enterprise and then some questions based on the keynotes, in particular the keynote by Ari Kaplan.

There are a few days left to take the exam. Your answers to the questions will be reviewed and you should receive an email within a couple of days with your result and hopefully your certificate.

https://www.surveymonkey.com/s/BiwaSummitDataScientistCertificate

This was my first trip to Redwood Shores and I had some time to go for a walk around the Oracle HQ campus. Hopefully it won’t be my last. Here is a photo I took of some of the Oracle buildings.

The BIWA Summit conference returns to Redwood Shores again in 2014 around the 14th and 15th January. It will be in the Oracle Conference centre that is part of the Oracle HQ campus.

Maybe I’ll see you there in 2014.

Thursday, January 17, 2013

The ‘Oh No You Don’t’ of (Oracle) Data Science

Over the past couple of weeks I’ve had conversations with a large number of people about Data Science in the Oracle arena.

A few things have stood out. The first and perhaps the most important of these is that there is confusion about what Data Science actually means. Some think it is just another name for Statistics or Advanced Statistics, some Predictive Analytics or Data Mining, or Data Analysis, Data Architecture, etc. The reality is that it is not. It is more than what these terms mean and this is a topic for discussion for another day.

During these conversations the same questions or topics keep coming up and the simplest answer to all of these is taken from a Pantomime (Panto).

We need to have lots of statisticians
       'Oh No You Don't !'
We can only do Data Science if we have Big Data
        'Oh No You Don't !'
We can only do data mining/data science if we have tens or hundreds of millions of records
        'Oh No You Don't !'
We need to have an Exadata machine
        'Oh No You Don't !'
We need to have an Exalytics machine
        'Oh No You Don't !'
We need extra servers to process the data
        'Oh No You Don't !'
We need to buy lots of Statistical and Predictive Analytics software
        'Oh No You Don't !'
We need to spend weeks statistically analysing a predictive model
        'Oh No You Don't !'
We need to have unstructured data to do Data Science
        'Oh No You Don't !'
Data Science is only for large companies
        'Oh No You Don't !'
Data Science is very complex, I can not do it
        'Oh No You Don't !'

Let us all say it together for one last time ‘Oh No You Don’t’

In its simplest form, performing Data Science using the Oracle stack, just involves learning and using some simple SQL and PL/SQL functions in the database.

Maybe we (in the Oracle Data Science world and those looking to get into it) need to adopt a phrase that is used by Barack Obama of ‘Yes We Can’, or as he said it in Irish when he visited Ireland back in 2011, ‘Is Féidir Linn’.

Remember it is just SQL.

Friday, January 4, 2013

My Blog Stats for 2012

Here are the stats from my blog for 2012.

In total I’ve had almost 28,000 blog post views. This is a 7-fold increase on the number of blog post views I had in 2011.

I had 92 blog posts in 2012 and the most popular blog posts were

Top search keywords used to find my blog

exalytics pricing
oracle data mining
oracle data miner
data science
brendan tierney

Top Countries

United States 52%
Ireland 8%
United Kingdom 8%
India 4%
Russia 4%
Germany 3%
France 3%
Netherlands 1%
Canada 1%
Turkey 1%

Top OS

Windows 59%
Macintosh 28%
Linux 5%
iPhone 2%
iPad 1%

Top Browsers

Firefox 47%
Internet Explorer 26%
Chrome 15%
Safari 4%

Wednesday, January 2, 2013

OUG Norway April 2013 - New Year’s News

I received an email at 23:24 on the 1st January from the OUG in Norway telling me that I’ve had two presentations accepted for the Annual OUG Norway seminar event. This will run from the 17th to the 19th April.

The first day of this event (17th April) will be held in a hotel in Oslo. Then on the morning of 18th April we board the Color Magic cruise for the next two days of the conference. The ferry/cruise will go from Oslo to Kiel in Germany and then back again to Oslo, returning around 10am on Saturday 20th April.

I will be giving two presentations on the Oracle Advanced Analytics Option. The first presentation, ‘Using Predictive Analytics in Oracle’, will give an overview of the Oracle Advanced Analytics Option and will then focus on the Oracle Data Miner work-flow tool. This presentation will include a live demo of using Oracle Data Miner to create some data mining models.

The second presentation, ‘How to Deploy and Use your Oracle Data Miner Models in Production’, builds on the examples given in the first presentation and will show how you can migrate, use and update your Oracle Data Miner models using the features available in SQL and PL/SQL. Again a demo will be given.

Thursday, December 20, 2012

Articles wanted for Oracle Scene–Spring 2013

The Call for Articles is now open for the Spring edition of Oracle Scene magazine. This is a publication of the UKOUG.

We are looking for technical articles covering all product offerings from Oracle.

Typically articles will range from 3 pages to 8 pages (MS Word format). These will convert into 2 to 5 page articles in Oracle Scene.

Check out the Article Formatting Guidelines before submitting.
All pictures and images should be 300dpi.
Include a 100(max) word Bio and your photo
Email your article and images to

articles@ukoug.org.uk

For more details about submitting an article, check out
http://www.ukoug.org/what-we-offer/oracle-scene/article-submissions/

Wednesday, December 19, 2012

Association Rules in ODM-Part 4

This is the final part of a four part blog post on building and using Association Rules in the Oracle Database using Oracle Data Miner. The following outlines the contents of each post in the series on Association Rules.

The first post focuses on how to build an Association Rule model.
The second post is on examining the Association Rules produced by ODM.
The third post focuses on using the Association Rules on your data.
The final post looks at how you can do some of the above steps using the ODM SQL and PL/SQL functions (this blog post).

In my previous posts I showed how you can go about setting up for Association Rule analysis in Oracle Data Miner and how to examine the rules that are generated.

This post will focus on how we build and use association rules using the functionality that is available in SQL and PL/SQL.

Step 1 – Build the Settings Table

As with all Oracle Data Mining functions in SQL and PL/SQL you will need to setup or build a settings table. This table contains all the settings necessary to run the model build functions. It is a good idea to create a separate settings table for each model build that you complete.

CREATE TABLE assoc_sample_settings (
setting_name VARCHAR2(30),
setting_value VARCHAR2(4000));

Step 2 – Define the Settings for the Model

Before you generate your model you need to set some of the parameters for the algorithm. To start with, you need to define that we are going to generate an Association Rules model, turn off the Automatic Data Preparation, and name the column (PROD_ID) that identifies the items in each basket.

We can also set 3 additional settings for Association Rules.

The ASSO_MIN_SUPPORT setting has a default of 0.1 or 10%. That means that only rules that exist in 10% or more of the cases will be generated. This figure is really too high. In the code below we will set it to 1%. This matches the settings that we used in SQL Developer in my previous posts.

BEGIN
   -- Build an Association Rules model using the Apriori algorithm
   INSERT INTO assoc_sample_settings (setting_name, setting_value) VALUES
   (dbms_data_mining.algo_name, dbms_data_mining.algo_apriori_association_rules);

   -- Turn off Automatic Data Preparation
   INSERT INTO assoc_sample_settings (setting_name, setting_value) VALUES
   (dbms_data_mining.prep_auto, dbms_data_mining.prep_auto_off);

   -- The column that identifies each item in a basket
   INSERT INTO assoc_sample_settings (setting_name, setting_value) VALUES
   (dbms_data_mining.odms_item_id_column_name, 'PROD_ID');

   -- Lower the minimum support from the default of 10% to 1%
   INSERT INTO assoc_sample_settings (setting_name, setting_value) VALUES
   (dbms_data_mining.asso_min_support, 0.01);

   COMMIT;
END;
/
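The text above mentions three additional settings but only describes the minimum support. As a hedged sketch, assuming the other two are the minimum confidence and maximum rule length settings exposed as constants in DBMS_DATA_MINING (ASSO_MIN_CONFIDENCE and ASSO_MAX_RULE_LENGTH), they could be added in the same way. These inserts are not part of the original example, and the values shown are what are believed to be the documented defaults (0.1 and 4), so check the documentation for your database version before relying on them. The final query simply lists what is in the settings table before the model is built.

```sql
-- Optional and illustrative only: not part of the original example.
-- The values are assumed to match the documented defaults.
BEGIN
   INSERT INTO assoc_sample_settings (setting_name, setting_value) VALUES
   (dbms_data_mining.asso_min_confidence, 0.1);

   INSERT INTO assoc_sample_settings (setting_name, setting_value) VALUES
   (dbms_data_mining.asso_max_rule_length, 4);

   COMMIT;
END;
/

-- Review the settings that the model build will pick up
SELECT setting_name,
       setting_value
FROM   assoc_sample_settings
ORDER BY setting_name;
```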

Step 3 – Prepare the Data

In our example scenario we are using the SALES data that is part of the SH schema. The CREATE_MODEL procedure needs to have an attribute (CASE_ID) that identifies the key of the shopping basket. In our case we have two attributes, so we will need to use a combined key. This combined key consists of the CUST_ID and the TIME_ID. This links all the transaction records related to the one shopping event together.

We also just need the attribute that has the information that we need. In our Association Rules (Market Basket Analysis) scenario, we will need to include the PROD_ID attribute. This contains the product key of each product that was included in the basket.

CREATE VIEW ASSOC_DATA_V AS (
SELECT RANK() OVER (ORDER BY CUST_ID, TIME_ID) CASE_ID,
       t.PROD_ID
FROM SH.SALES t );
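Before building the model it can be worth checking that the view groups the transactions as expected. The query below is a minimal sketch that counts the records and the distinct baskets in ASSOC_DATA_V; the column aliases are just illustrative names. Because RANK() gives the same value to every row with the same CUST_ID and TIME_ID, each distinct CASE_ID corresponds to one basket.

```sql
-- Count the transaction records and the distinct baskets (cases) in the view
SELECT COUNT(*)                AS num_records,
       COUNT(DISTINCT case_id) AS num_baskets
FROM   assoc_data_v;
```

On the SH sample data these two figures should correspond to the record and basket counts quoted for the model build in Step 4.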

Step 4 – Create the Model

We will need to use the DBMS_DATA_MINING.CREATE_MODEL procedure. This will use the settings in our ASSOC_SAMPLE_SETTINGS table. We will use the view created in Step 3 above and use the CASE_ID attribute we created as the Case ID in the procedure call.

BEGIN
   DBMS_DATA_MINING.CREATE_MODEL(
     model_name          => 'ASSOC_MODEL_2',
     mining_function     => DBMS_DATA_MINING.ASSOCIATION,
     data_table_name     => 'ASSOC_DATA_V',
     case_id_column_name => 'CASE_ID',
     target_column_name  => null,
     settings_table_name => 'assoc_sample_settings');
END;
/

On my laptop this took approximately 5 seconds to run on just over 918K records involving just over 143K cases or baskets.

Now that is quick!!!
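To confirm that the model was created, and with which settings, the data dictionary views for mining models can be queried. This is a sketch that assumes the model was built in your own schema, so the USER_ views apply, and that the name is stored in upper case as ASSOC_MODEL_2.

```sql
-- Check that the model exists and see how long the build took
SELECT model_name,
       mining_function,
       algorithm,
       build_duration
FROM   user_mining_models
WHERE  model_name = 'ASSOC_MODEL_2';

-- List the settings that were actually used for the model
SELECT setting_name,
       setting_value,
       setting_type
FROM   user_mining_model_settings
WHERE  model_name = 'ASSOC_MODEL_2'
ORDER BY setting_name;
```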

Step 5 – View the Model Outputs

There are a couple of functions that can be used to extract the rules produced in our previous step. These include:

GET_ASSOCIATION_RULES : This returns the rules from an association model.

SELECT rule_id,
       antecedent,
       consequent,
       rule_support,
       rule_confidence
FROM TABLE(DBMS_DATA_MINING.GET_ASSOCIATION_RULES('assoc_model_2', 10));

The 10 here returns the top 10 records or rules.

GET_FREQUENT_ITEMSETS : This returns a set of rows that represent the frequent item sets from an association model. In the following code we want the top 30 item sets to be returned, but filtered to only display item sets where there are 2 or more items.

SELECT itemset_id,
       items,
       support,
       number_of_items
FROM TABLE(DBMS_DATA_MINING.GET_FREQUENT_ITEMSETS('assoc_model_2', 30))
WHERE number_of_items >= 2;
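The ANTECEDENT and CONSEQUENT columns returned by GET_ASSOCIATION_RULES are nested collections, so they can be awkward to read in some tools. One possible way to flatten them into one row per item pairing is sketched below. It assumes that, for transactional data like ours, the item identifier is held in the ATTRIBUTE_SUBNAME attribute of each predicate; depending on your database version and how the data was presented, it may instead appear in ATTRIBUTE_NAME or ATTRIBUTE_STR_VALUE, so treat this as a starting point.

```sql
-- Flatten the antecedent and consequent collections of the top 10 rules
-- (assumes the item id is in ATTRIBUTE_SUBNAME; adjust if it is not)
SELECT r.rule_id,
       a.attribute_subname         AS antecedent_item,
       c.attribute_subname         AS consequent_item,
       ROUND(r.rule_support, 4)    AS rule_support,
       ROUND(r.rule_confidence, 4) AS rule_confidence
FROM   TABLE(DBMS_DATA_MINING.GET_ASSOCIATION_RULES('ASSOC_MODEL_2', 10)) r,
       TABLE(r.antecedent) a,
       TABLE(r.consequent) c
ORDER BY r.rule_confidence DESC, r.rule_id;
```

A rule with more than one item in its antecedent will appear as several rows with the same RULE_ID.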
