Friday, July 12, 2013

DBMS_PREDICTIVE_ANALYTICS & Profile

In this blog post I will look at the PROFILE procedure that is part of the DBMS_PREDICTIVE_ANALYTICS package. The PROFILE procedure generates rules that identify the records that have the same target value.

Like the EXPLAIN procedure, the PROFILE procedure only works with classification type of problems. What the PROFILE procedure does is it works out some rules that determine a particular target value. For example, what rules determine if a customer will take up an affinity card and the rules for those who do not take up an affinity card. So you will need a pre-labelled data set with the value of the target attribute already determined.

Oracle does not tell us what algorithm that they use to calculate these rules, but they are similar to the rules that are produced by some of the classification algorithms that are in the database (and can be used by ODM).

The syntax of the PROFILE procedure is

DBMS_PREDICTIVE_ANALYTICS.PROFILE (
     data_table_name           IN VARCHAR2,
     target_column_name        IN VARCHAR2,
     result_table_name         IN VARCHAR2,
     data_schema_name          IN VARCHAR2 DEFAULT NULL);

Where

Parameter Name Description
data_table_name Name of the table that contains the data that you want to analyze.
target_column_name The name of the target attribute.
result_table_name The name of the table that will contain the results. This table should not exist in your schema, otherwise an error will occur
data_schema_name The name of the schema where the table containing the input data is located. This is probably in your current schema, so you can leave this parameter NULL.

The PROFILE procedure will produce an output table called ‘result_table_name) in your schema and this table will contain 3 attributes.

PROFILE_ID This is the PK/unique identifier for the profile/rule
RECORD_COUNT This is the number of records that are described by the profile/rule
DESCRIPTION This is the profile rule and it is in XML format and has the following XSD

<xs:element name="SimpleRule">
  <xs:complexType>
    <xs:sequence>
      <xs:group ref="PREDICATE"/>
      <xs:element ref="ScoreDistribution" minOccurs="0" maxOccurs="unbounded"/>
    </xs:sequence>
    <xs:attribute name="id" type="xs:string" use="optional"/>
    <xs:attribute name="score" type="xs:string" use="required"/>
    <xs:attribute name="recordCount" type="NUMBER" use="optional"/>
  </xs:complexType>
</xs:element>

Using the examples I have used in my previous blog posts, the following illustrates how to use the PROFILE procedure.

BEGIN
   DBMS_PREDICTIVE_ANALYTICS.PROFILE(
      DATA_TABLE_NAME    => 'mining_data_build_V',
      TARGET_COLUMN_NAME => 'affinity_card',
      RESULT_TABLE_NAME  => 'PA_PROFILE');
END;

image

image

 

NOTE: For the above examples I used and 11.2.0.3 database.

No comments:

Post a Comment