Wednesday, March 26, 2014

Predicting using ORE package

In a previous post I gave a an overview of the various in-database data mining algorithms that you can use in your Oracle R Enterprise scripts.

To create data mining models based on those algorithms you need to use the ore.odm functions.

After you have developed and tested your models you will select one of these to score your new data.

How can you do this using ORE? There is a suite of ORE functions called ore.predict that you can use to apply your data mining model to score or label new data.

The following table lists the ore.predict functions:

ORE Predict Function Description
ore.predict-glm Generalized linear model
ore.predict-kmeans k-Means clustering mode
ore.predict-lm Linear regression model
ore.predict-matrix A matrix with no more than 1000 rows
ore.predict-multinom Multinomial log-linear model
ore.predict-nnet Neural network models
ore.predict-ore.model An Oracle R Enterprise model
ore.predict-prcomp Principal components analysis on a matrix
ore.predict-princomp Principal components analysis on a numeric matrix
ore.predict-rpart Recursive partitioning and regression tree model


As you will see from the above table there are more ore.predict functions than there are ore.odm functions. The reason for this is that ORE comes with some additional data mining algorithms. These are in addition to the sub-set of Oracle Data Mining algorithms that it uses. These include the ore.glm, ore.lm, ore.neural and ore.stepwise.

You also need to watch out for the data mining algorithms that are not used in prediction. These include the Minimum Description Length, Apriori and Non-Negative Matrix Factorization.

Remember that these ore.predict functions are run inside the Oracle Database. No data is extracted to the data analyst laptop or desktop. All the data stays in the database. The ORE functions are run in the database on the data in the database

No comments:

Post a Comment