Some time ago I had some blog posts introducing some of the basic Statistical function available in Oracle. Here are the links to these.
- The first blog post in the series looked at the DBMS_STAT_FUNCS PL/SQL package, what it can be used for and I give some sample code on how to use it in your data science projects. I also give some sample code that I typically run to gather some additional stats.
- The second blog post looks at some of the other statistical functions that exist in SQL that you will/may use regularly in your data science projects.
- The third blog post provides a summary of the other statistical functions that exist in the database.
Most people do not realise that Oracle has over 250+ statistical functions that are available (no addition cost) in all the database versions.
I've had a query about one of the functions BUCKET_WIDTH. The question was wondering if it was possible to get the width of the bucket in each case. There does not seem to be a build in feature to get this value, so we have to calculate this ourselves.
Here is an example of how to calculate the bucket width, as on the example I used in my previous blog post.
SELECT bucket, max(age)-min(age) BUCKET_WIDTH, count(*)
FROM (SELECT cust_id,
  age,
width_bucket(age,
(SELECT min(age) from mining_data_build_v),
(select max(age)+1 from mining_data_build_v),
10) bucket
FROM mining_data_build_v
GROUP BY cust_id, age )
GROUP BY bucket
ORDER BY bucket;
What this query gives is an approximate value of the size of the Bucket Width based on the values/records that are in a bucket. The actual values used cannot be determined exactly as there is not function/value in SQL that tells us the actual value.
No comments:
Post a Comment