Thursday, June 27, 2013

Oracle Magazine-Jan/Feb 2000

The headline articles of Oracle Magazine for January/February 2000 were focused on looking forward to what is to come, now that the year 2000 bomb. These articles include large scale, 24x7 data warehouses and marts, more development using Java, more and better B2B with XML.

This issue of Oracle Magazine introduced a new layout and design.

image

Other articles included:

  • an Introduction of XML was given, including some basics, the differences between XSL, XSLT and XCETERA, how XML can be used with Oracle, and in the Oracle Internet Platform
  • Building mobile apps with Oracle 8i Lite’s Web-to-go. Oracle 8i Lite includes, Oracle Lite DBMS, iConnect and Web-to-go. An example was given on how to create a Web-to-go Servlet. An overview of the development process was given that included, create the database tables, compile the java servlet code, register the application and then test drive the application.
  • Creating packages and procedures by using invoker rights can increase code reuse, simplify maintenance and exercise more control over security. Some examples were given to illustrate how this can be all be done.
  • Oracle WebDB 2.1 allows everyone to build website without having to rely on a webmaster.
  • Experienced Oracle DBAs know that I/O is the single greatest component of response time. When Oracle retrieves a block from a data file on disk, the reading process must wait for the physical I/O operation to complete. Consequently anything you can do to minimize I/O or reduce bottlenecks can greatly improve the performance of an Oracle system.

To view the cover page and the table of contents click on the image at the top of this post or click here.
My Oracle Magazine Collection can be found here. You will find links to my blog posts on previous editions and a PDF for the very first Oracle Magazine from June 1987.

Monday, June 17, 2013

Oracle Magazine-Nov/Dec 1999

The headline articles of Oracle Magazine for November/December 1999 were E-Business and how you can use the Oracle product set to put your business online. These articles included features on companies such as AMR, Fogdog, Cognitiative, Drug Emporium, Click-fil-A, Living, CD Now, Trilux and Lycos Networks.
image
Other articles included:
  • Oracle Developer and Developer Server 6i was released. These new tools also form the underlying technology for the Oracle Applications release 11i.
  • We also have the launch of Oracle Designer 6i with new features including: Repository based configuration management, support for files and folders, detailed dependency analysis, enhanced support for Oracle 8i server generation, enhanced generation of Oracle Developer Forms and visual repository extensibility.
  • Oracle releases the Oracle Discoverer Y2K Assistance. This was workbook identifies possible Y2K errors in the end user layer
  • Oracle Express 6.3 is released.
  • Oracle 8i is released for the Apple Macintosh
  • Oracle JDeveloper Modeling Tools are scheduled for release in early 2000 and will provide an single integrated toolset that will include: the Unified Modelling Language (UML) support, Java editing, compiling and debugging, Java runtime component framework for persistence and transactions, Multi-user repository for managing models as well as files, and Deployment to choice of servers in an n-tier environment.
  • The Business and Accounting Software Developers Association and the German Association for Technical Inspection have certified Oracle Financials Release 11 for Euro (€) compliance.
  • Donald Burleson has an article on Tuning Disk I/O in Oracle 8. Be sure to tune your SQL before you start to reorganise your disks.The article looks at how you can investigate if a disk becomes stalled while handling simultaneous I/O request and proposes a couple of ways you can address these issues.
  • Joe Johnson’s article on Using Oracle Database Auditing to Tune Performance looks at how you can tune the components of the SGA, in particular the shared pool and the database buffer cache.
As this was the Oracle Open World edition you can imagine that there was a large number of advertisements in the magazine.
To view the cover page and the table of contents click on the image at the top of this post or click here.
My Oracle Magazine Collection can be found here. You will find links to my blog posts on previous editions and a PDF for the very first Oracle Magazine from June 1987.

Wednesday, June 12, 2013

Part 3–Getting start with Statistics for Oracle Data Science projects

This is the Part 3 blog post on getting started with Statistics for Oracle Data Science projects.

The table below is a collection of most of the statistical functions in Oracle 11.2. The links in the table bring you to the relevant section of the Oracle documentation where you will find a description of each function, the syntax and some examples of each.

ABS

LENGTH2

REGR_AVGX

ACOS

LENGTH4

REGR_ACGY

Aggregrate functions

LENGTHB

REGR_COUNT

Analytic functions

LENGTHC

REGR_INTERCEPT

Arithmetic operators

LN

REGR_R2

ASIN

LNNVL

REGR_SLOPE

ATAN

LOG

REGR_SXX

ATAN2

LOWER

REGR_SXY

AVG

LPAD

REGR_SYY

CAST

LTRIM

ROLLUP clause

Comparison functions

MAX

ROUND

CONCAT

MEDIAN

SAMPLE

CORR

MIN

SIN

CORR_K

MOD

SINH

CORR_S

MODEL clause

SQRT

COS

NTH_VALUE

STATS_BINOMIAL_TEST

COSH

Numeric Functions

STATS_CROSSTAB

COUNT

PERCENT_RANK

STATS_F_TEST

COVAR_POP

PERCENTILE_CONT

STATS_KS_TEST

COVAR_SAMP

PERCENTILE_DISC

STATS_MODE

CUBE clause

Pivot operations

STATS_MW_TEST

CUME_DIST

POWER

STATS_ONE_WAY_ANOVA

CV

PREDICTION

STATS_T_TEST_INDEP

Data functions

PREDICTION_BOUNDS

STATS_T_TEST_INDEPU

DENSE_RANK

PREDICTION_COST

STATS_T_TEST_ONE

EXP

PREDICTION_PROBABILITY

STATS_T_TEST_PAIRED

FLOOR

PREDICTION_SET

STATS_WSR_TEST

GREATEST

PRESENTNNV

STDDEV

Grouping Sets

PRESENTNTV

STDEEV_POP

INTERSECT

Prior clause

STDDEV_SAMP

Interval arithmetic

PRIOR

SUM

INTERVAL

RANK

TAN

Julian dates

RAWTOHEX

TANH

LAG

REGEXP_COUNT

t-test

LAST

REGEXP_INSTR

VAR_POP

LEAD

REGEXP_LIKE

VAR_SAMP

LEAST

REGEXP_REPLACE

VARIANCE

LENGTH

REGEXP_SUBSTR

WIDTH_BUCKET

The list about may not be complete (I’m sure it is not), but it will cover most of what you will need to use in your Oracle projects.

If you come across or know of other useful statistical functions in Oracle let me know the details and I will update the table above to include them.

Friday, June 7, 2013

DBMS_PREDICTIVE_ANALYTICS & Predict

In this blog post I will look at the PREDICT procedure that is part of the DBMS_PREDICTIVE_ANALTYICS package. This package allows you to perform data mining in an automated way without having to go through the steps of building, testing and scoring data.

I had a previous blog post that showed how to use the EXPLAIN function to create an Attribute Importance model.

The predictive analytics procedures analyze and prepare the input data, create and test mining models using the input data, and then use the input data for scoring. The results of scoring are returned to the user. The models and supporting objects are not persisted and are removed from the database when the procedure is finished.

The PREDICT procedure should only be used for a Classification problem and data set.

The PREDICT procedure create a model based on the supplied data (out input table) and a target value,  and returns scored data set in a new table. When using PREDICT you do not get to select an algorithm to use.

The input data source should contain records that already have the target value populated.  It can also contain records where you do not have the target value. In this case the PREDICT function will use the records that have a target value to generate the model. This model will then score all records a the predicted target value

The syntax of the PREDICT procedure is:

DBMS_PREDICTIVE_ANALYTICS.PREDICT (
   accuracy OUT NUMBER,
   data_table_name IN VARCHAR2,
   case_id_column_name IN VARCHAR2,
   target_column_name IN VARCHAR2,
   result_table_name IN VARCHAR2,
   data_schema_name IN VARCHAR2 DEFAULT NULL);

Where

Parameter Name Description
accuracy This output parameter from the procedure. You do not pass anything into this parameter. The Accuracy value returned is the predictive confidence of the model generated/used by the PREDICT procedure
data_table_name The name of the table that contains the data you want to use
case_id_column_name The case id for each record. This is unique for each record/case.
target_column_name The name of the column that contains the target column to be predicted
result_table_name The name of the table that will contain the results. This table should not exist in your schema, otherwise an error will occur.
data_schema_name The name of the schema where the table containing the input data is located. This is probably in your current schema, so you can leave this parameter NULL.

The PREDICT procedure will produce an output tables (result_table_name parameter) and will contain 3 attributes.

CASE_ID This is the Case Id of the record from the original data_table_name. This will allow you to link up the data in the source table to the prediction in the result_table_name
PREDICTION This will be the predicted value of the target attribute
PROBABILITY This is the probability of the prediction being correct

Using the sample example data set that I have given in previous blog posts and in the blog post on the EXPLAIN procedure, the following code illustrates how to use the PREDICT procedure.

set serveroutput on

DECLARE
   v_accuracy NUMBER(10,9);
BEGIN
   DBMS_PREDICTIVE_ANALYTICS.PREDICT(
      accuracy => v_accuracy,
      data_table_name => 'mining_data_build_v',
      case_id_column_name => 'cust_id',
      target_column_name => 'affinity_card',
      result_table_name => 'PA_PREDICT');
   DBMS_OUTPUT.PUT_LINE('Accuracy of model = ' || v_accuracy);
END;

image

This took about 15 seconds to run on my laptop, which is surprisingly quick given all the work that is doing internally. To see the predictions and the results from the PREDICT procedure, you will need to query the PA_PREDICT table.

image

The final step that you might be interested in is to compare the original target value with the prediction value.

SELECT v.cust_id,
       v.affinity_card,
       p.prediction,
       p.probability
FROM   mining_data_build_v  v,
       pa_predict p
WHERE  v.cust_id = p.cust_id
AND    rownum <= 12;

image

Remember we do not get to see how or what Oracle did to generate these results. We do not get the opportunity to tune the process and the model.

So you have to be careful when you use the PREDICT function and on what data. Would you use this as a way to explore your data and to see if predictive analytics/data mining might be useful for your? Yes it would. Would you use it in a production scenario? the answer is maybe but it depends on the scenario. In reality if you want to do this in a production environment you will put some work into developing data mining models that best fit your data. To do this you will need to move onto the ODM tool and the DBMS_DATA_MINING package. But the PREDICT function is a quick way to get some small data scored (in some way) based on your existing data. If your marketing department says they want to start a tele marketing campaign in a couple of hours then PREDICT is what you need to use. It may not give you the most accurate of results, but it does give you results that you can start using quickly.