Friday, April 19, 2019

Data Sets for Analytics

When working with analytics, in whatever flavor, one of the key things you need is data. Data comes in many different shapes and sizes, so where can you get some useful data, be it transactional, time-series, meta-data, analytical, master, categorical, numeric, regression, clustering, etc.?

Many of the popular analytics languages have some data sets built into them. For example, the R language comes pre-loaded with data sets, and these can be listed using
data()

but many of the R packages also come with data sets.

Similarly, if you are using Python, many of the Python libraries have data sets built into them. For example, scikit-learn:
from sklearn import datasets
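As a minimal sketch of what that import gives you, the following loads one of the small data sets bundled with scikit-learn and prints a few of its properties. It assumes scikit-learn is installed; the choice of the Iris data set is only for illustration.

```python
from sklearn import datasets

# Load the Iris data set that ships with scikit-learn (no download needed).
iris = datasets.load_iris()

# The returned object bundles the feature matrix, the target labels and metadata.
print(iris.data.shape)      # rows and columns of the feature matrix
print(iris.feature_names)   # names of the feature columns
print(iris.target_names)    # names of the target classes
```

Other bundled data sets follow the same pattern through their own load_* functions in the same module.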

But where else can you get data sets? There are lots and lots of websites with data sets available, and the list could be very long. The following is a list of what I consider to be the websites with the best data sets.

Kaggle
Amazon Open Data
UCI Machine Learning Repository
Google Search Engine
Google Open Images Data
Google Finance
Microsoft Open Data
Awesome Public Datasets Collection
EU Open Data
US Government Data
US Census Bureau
Ireland Open Data
Northern Ireland Public Open Data
UK Open Data
Image Processing Data
Carnegie Mellon University Data Sets
World Bank Open Data
IMF Open Data
Movie Reviews Data Set
Amazon Reviews
Amazon public data sets
IMDb Datasets
