BUCKET_WIDTH: Calculating the size of the bucket

Monday, July 28, 2014

BUCKET_WIDTH: Calculating the size of the bucket

Some time ago I had some blog posts introducing some of the basic Statistical function available in Oracle. Here are the links to these.

The first blog post in the series looked at the DBMS_STAT_FUNCS PL/SQL package, what it can be used for and I give some sample code on how to use it in your data science projects. I also give some sample code that I typically run to gather some additional stats.
The second blog post looks at some of the other statistical functions that exist in SQL that you will/may use regularly in your data science projects.
The third blog post provides a summary of the other statistical functions that exist in the database.

Most people do not realise that Oracle has over 250+ statistical functions that are available (no addition cost) in all the database versions.

I've had a query about one of the functions BUCKET_WIDTH. The question was wondering if it was possible to get the width of the bucket in each case. There does not seem to be a build in feature to get this value, so we have to calculate this ourselves.

Here is an example of how to calculate the bucket width, as on the example I used in my previous blog post.

SELECT bucket, max(age)-min(age) BUCKET_WIDTH, count(*)

FROM (SELECT cust_id,

age,

width_bucket(age,

(SELECT min(age) from mining_data_build_v),

(select max(age)+1 from mining_data_build_v),

10) bucket

FROM mining_data_build_v

GROUP BY cust_id, age )

GROUP BY bucket

ORDER BY bucket;

What this query gives is an approximate value of the size of the Bucket Width based on the values/records that are in a bucket. The actual values used cannot be determined exactly as there is not function/value in SQL that tells us the actual value.

Brendan Tierney - Oralytics Blog

Pages

Monday, July 28, 2014

BUCKET_WIDTH: Calculating the size of the bucket

No comments:

Post a Comment