Friday, March 4, 2011

2010 Rexer Analytics Data Mining Survey

The 4th Annual Rexer Analytics Data Miner Survey for 2010 is now available. A total of 735 data miners participated in the 2010 survey. The main highlights of the survey are:

FIELDS & GOALS:  Data miners work in a diverse set of fields.  CRM / Marketing has been the #1 field in each of the past four years.  Fittingly, “improving the understanding of customers”, “retaining customers” and other CRM goals are also the goals identified by the most data miners surveyed.

ALGORITHMS:  Decision trees, regression, and cluster analysis continue to form a triad of core algorithms for most data miners.  However, a wide variety of algorithms are being used.  This year, for the first time, the survey asked about Ensemble Models, and 22% of data miners report using them. 
A third of data miners currently use text mining and another third plan to in the future.

MODELS:  About one-third of data miners typically build final models with 10 or fewer variables, while about 28% generally construct models with more than 45 variables.

TOOLS:  After a steady rise across the past few years, the open source data mining software R overtook other tools to become the tool used by more data miners (43%) than any other.  STATISTICA, which has also been climbing in the rankings, is selected as the primary data mining tool by the most data miners (18%).  Data miners report using an average of 4.6 software tools overall.  STATISTICA, IBM SPSS Modeler, and R received the strongest satisfaction ratings in both 2010 and 2009.

TECHNOLOGY:  Data Mining most often occurs on a desktop or laptop computer, and frequently the data is stored locally.  Model scoring typically happens using the same software used to develop models.  STATISTICA users are more likely than other tool users to deploy models using PMML.

CHALLENGES: As in previous years, dirty data, explaining data mining to others, and difficult access to data are the top challenges data miners face.  This year data miners also shared best practices for overcoming these challenges.  

FUTURE:  Data miners are optimistic about continued growth in the number of projects they will be conducting, and growth in data mining adoption is the number one “future trend” identified.  There is room to improve:  only 13% of data miners rate their company’s analytic capabilities as “excellent” and only 8% rate their data quality as “very strong”.

You can request a copy of the full report by going to the Rexer Analytics data mining survey webpage.

Tuesday, March 1, 2011

Changing the Domain of Oracle Database Server

I recently had the task of moving our Database server onto a new domain. The following steps outline what was involved in performing this task.

1. Change the Domain of the server

  • Take the server off the current domain
  • Reboot
  • Change the domain to the new one
  • Reboot

2. Update the listener.ora and tnsnames.ora files

  • Change the host entries to the new domain name

3. Make sure the instance is running

  • sqlplus / as sysdba

4. Drop the Enterprise Manager Console

  • emca -deconfig dbcontrol db -repos drop
  • Enter the SID name (ORA11gDB)
  • Listener port number = 1521
  • Password for SYS user
  • Password for SYSMAN user
  • Do you wish to continue = Y
  • Depending on the size of the DB, this can take some minutes to complete (10 minutes)

5. Reinstall the Enterprise Manager Console

  • emca -config dbcontrol db -repos create
  • Enter the SID name (ORA11gDB)
  • Listener port number = 1521
  • Password for SYS
  • Password for DBSNMP
  • Password for SYSMAN
  • Email address for notification
  • Outgoing mail (SMTP)
  • Do you wish to continue = Y
  • Again, this can take some minutes to complete (20 minutes)

6. Restart the database

7. Test connections

8. All should be OK
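
The edit in step 2 can also be scripted. What follows is only a minimal Python sketch and was not part of the procedure I ran. The function name swap_domain, the host name dbserver and the domains olddomain.local and newdomain.local are all made up for illustration. It assumes the host is written as a fully qualified name inside a HOST = entry, and it works on the file contents as text, so back up listener.ora and tnsnames.ora before writing anything back.

```python
import re

# Hypothetical tnsnames.ora entry, used only to illustrate the edit.
SAMPLE = """ORA11GDB =
  (DESCRIPTION =
    (ADDRESS = (PROTOCOL = TCP)(HOST = dbserver.olddomain.local)(PORT = 1521))
    (CONNECT_DATA = (SERVICE_NAME = ORA11gDB))
  )
"""


def swap_domain(config_text: str, old_domain: str, new_domain: str) -> str:
    """Replace old_domain with new_domain, but only inside HOST = entries."""
    pattern = re.compile(
        # Group 1 keeps "HOST = <hostname>." unchanged; the domain after it is swapped.
        r"(HOST\s*=\s*[\w-]+\.)" + re.escape(old_domain) + r"(?=\s*\))",
        re.IGNORECASE,
    )
    # A function is used as the replacement so new_domain is inserted literally.
    return pattern.sub(lambda m: m.group(1) + new_domain, config_text)


if __name__ == "__main__":
    print(swap_domain(SAMPLE, "olddomain.local", "newdomain.local"))
```

Only the HOST value is touched. Anything else that carries the old domain, such as a service name, is left alone and needs to be checked by hand.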

Wednesday, February 23, 2011

New Oracle Data Miner tool is now Available

Today the new Oracle Data Miner tool has been made available as part of SQL Developer 3.0 (Early Adopter Release 4).


The new ODM tool has been significantly redeveloped, with a new workflow interface and new graphical outputs. These include graphical representations of the decision trees and clustering.

To download the tool and to read the release documentation go to
http://tinyurl.com/62u3m4y

If you download and use the new tool, let me know what you think of it.

Tuesday, February 22, 2011

Data Analytics Videos–CNBC–Big Brother–Big Business

The following videos are available on YouTube from the CNBC program Big Brother – Big Business. Each video is between 8 and 10 minutes long.

They give a good insight into how data analytics can be used, and is currently being used, by organisations to gain new information and knowledge about what is going on in their business.

Most of the examples given in the videos do not use any complex techniques, but they show how a business can use its data to gain an insight into what is really going on in the business.

Video1, Video2, Video3, Video4, Video5, Video6, Video7, Video8, Video9, Video10

Let me know what you think of these videos.

If you come across any other interesting data analysis videos, let me know and I can add them to the list above.

Brendan Tierney

Wednesday, February 16, 2011

New Oracle Data Mining tool video

Charlie Berger has recently put together a video demonstrating the new Oracle Data Mining tool.
The link to this video is
http://tinyurl.com/6jhsth4

The video gives a demonstration of some of the main steps in building and applying a classification model. He also demonstrates applying classification to the same data.

The new ODM interface is due to be made available within the next month or two, on a limited basis initially, and will be part of an Early Adopter (EA) release of SQL Developer 3.

Brendan


Friday, November 13, 2009

Friday, November 6, 2009

How to Prepare for Data Mining

There is a new article by Eric King of The Modeling Agency on How to Prepare for Data Mining. It has some interesting points on various aspects that you would need to look out for.

I think the important points are that you need to plan such a project carefully, understand what you can get out of a data mining project, and have an appreciation of the types of techniques and technologies involved.

Brendan Tierney

Data Modelling for MDM course

This one-day seminar provides a systematic approach to the special data modeling challenges in Master Data Management (MDM). In particular it will focus on:

  • What master data is and its special management needs
  • Higher-level and lower-level data models for MDM
  • How to deal with history in master data
  • Designing for embedded and discrete metadata to support MDM
  • The effects of architecture on data models for MDM
  • The challenge of hidden subtypes in MDM data models
  • Designing models for master data production versus distribution
  • SOA and canonical data models for MDM
  • Levels of abstraction in master data
  • Incorporating data standards
  • Designing for identity cross-referencing
  • Modeling for the integration of master data
  • Change control among data models for MDM
  • Incorporating data governance requirements
  • Data models and information knowledge management
  • Incorporating vendor supplied master data content

By
Malcolm Chisholm
February 1, 2010
Crowne Plaza Hotel, Clark, NJ

Seven Most Commonly Asked Questions about MDM

Master data management (MDM) enables organizations to maintain a single, clean, and consistent set of reference data on common business entities that can be used by any individual or application that requires it. Wayne Eckerson, director of TDWI Research, has written an excellent article that answers the seven most commonly asked questions about MDM, including:
- What's the best place to start with MDM?
- How do you fund MDM?
- What technical pitfalls will I encounter?
To read the article, please visit the link below. Note that you may now leave comments at the conclusion of the article.


Brendan Tierney

Monday, November 2, 2009

BI Quotient

I've joined up with Uli Bethke of BI Quotient
http://www.business-intelligence-quotient.com/

Together we are offering a number of training courses on Data Warehousing, Data Mining, Oracle Data Mining and Master Data Management.

I will also be regularly posting on their blog on various topics relating to Data Mining.

Welcome to my Data Mining Blog

I will repost here what I write on various other blogs and sites. The posts will all be based on my view of data mining and of performing data mining in Oracle.