The following is the fifth and final set of demo code from my video on using R in the Oracle Database. Check out the video before using the following code. The blog post for the video will be updated to contain links to all blog posts that have the various demo code.
The following examples illustrate how you can use the Oracle R Enterprise capabilities within SQL and PL/SQL. The following illustrate building a GLM model using the glm algorithm that comes with the R language, and then uses this mode to score or label new data that is stored in a table. The last part of the example illustrates how you can perform What-If analysis using this ORE model
-- Build & save the R script, called Demo_GLM in the DB
-- This builds a GLM DM model in the DB
--
Begin
sys.rqScriptDrop('Demo_GLM');
sys.rqScriptCreate('Demo_GLM',
'function(dat,datastore_name) {
mod <- glm(AFFINITY_CARD ~ CUST_GENDER + AGE + CUST_MARITAL_STATUS + COUNTRY_NAME + CUST_INCOME_LEVEL + EDUCATION + HOUSEHOLD_SIZE + YRS_RESIDENCE, dat, family = binomial())
ore.save(mod, name=datastore_name, overwrite=TRUE) }');
end;
/
--
-- After creating the script you need to run it to create the GLM model
--
select *
from table(rqTableEval(
cursor(select CUST_GENDER,
AGE,
CUST_MARITAL_STATUS,
COUNTRY_NAME,
CUST_INCOME_LEVEL,
EDUCATION,
HOUSEHOLD_SIZE,
YRS_RESIDENCE,
AFFINITY_CARD
from mining_data_build_v),
cursor(select 1 as "ore.connect", 'myDatastore' as "datastore_name” from dual),
'XML', 'Demo_GLM' ));
--
-- There are 2 ways to use the GLM model : in Batch and in Real-Time mode
--
-- First Step : Build the in-database R script to score you new data
--
Begin
sys.rqScriptDrop('Demo_GLM_Batch');
sys.rqScriptCreate('Demo_GLM_Batch',
'function(dat, datastore_name) {
ore.load(datastore_name)
prd <- predict(mod, newdata=dat)
prd[as.integer(rownames(prd))] <- prd
res <- cbind(dat, PRED = prd)
res}');
end;
/
--
-- Now you can run the script to score the new data in Batch model
-- The data is located in the table MINING_DATA_APPLY
--
select * from table(rqTableEval(
cursor(select CUST_GENDER, AGE, CUST_MARITAL_STATUS, COUNTRY_NAME, CUST_INCOME_LEVEL, EDUCATION, HOUSEHOLD_SIZE, YRS_RESIDENCE
from MINING_DATA_APPLY_V
where rownum <= 10),
cursor(select 1 as "ore.connect", 'myDatastore' as "datastore_name" from dual),
'select CUST_GENDER, AGE, CUST_MARITAL_STATUS, COUNTRY_NAME, CUST_INCOME_LEVEL, EDUCATION, HOUSEHOLD_SIZE, YRS_RESIDENCE, 1 PRED from MINING_DATA_APPLY_V','Demo_GLM_Batch'))
order by 1, 2, 3;
--
-- Now let us use the Demo_GLM_Batch script to score data in Real-Time
-- The data values are passed to the GLM model
--
select * from table(rqTableEval(
cursor(select 'M' CUST_GENDER,
23 AGE,
'Married' CUST_MARITAL_STATUS,
'United States of America' COUNTRY_NAME,
'B: 30,000 - 49,999' CUST_INCOME_LEVEL,
'Assoc-A' EDUCATION,
'3' HOUSEHOLD_SIZE,
5 YRS_RESIDENCE
from dual),
cursor(select 'myDatastore' "datastore_name", 1 "ore.connect" from dual),
'select CUST_GENDER, AGE, CUST_MARITAL_STATUS, COUNTRY_NAME, CUST_INCOME_LEVEL, EDUCATION, HOUSEHOLD_SIZE, YRS_RESIDENCE, 1 PRED from MINING_DATA_APPLY',
'Demo_GLM_Batch')) order by 1, 2, 3;