Monday, May 28, 2018

Twitter Analytics using Python - Part 1

(This is probably the first part of, probably, a five part blog series on twitter analytics using Python. Make sure to check out the other posts and I'll post a wrap up blog post that will point to all the posts in the series)

(Yes there are lots of other examples out there, but I've put these notes together as a reminder for myself and a particular project I'm testing)

In this first blog post I will look at what you need to do get get your self setup for analysing Tweets, to harvest tweets and to do some basics. These are covered in the following five steps.

Step 1 - Setup your Twitter Developer Account & Codes

Before you can start writing code you need need to get yourself setup with Twitter to allow you to download their data using the Twitter API.

To do this you need to register with Twitter. To do this go to apps.twitter.com. Log in using your twitter account if you have one. If not then you need to go create an account.

Next click on the Create New App button.

Twitter app1

Then give the Name of your app (Twitter Analytics using Python), a description, a webpage link (eg your blog or something else), click on the 'add a Callback URL' button and finally click the check box to agree with the Developer Agreement. Then click the 'Create your Twitter Application' button.

You will then get a web page like the following that contains lots of very important information. Keep the information on this page safe as you will need it later when creating your connection to Twitter.

Twitter app2

The details contained on this web page (and below what is shown in the above image) will allow you to use the Twitter REST APIs to interact with the Twitter service.

Step 2 - Install libraries for processing Twitter Data

As with most languages there is a bunch of code and libraries available for you to use. Similarly for Python and Twitter. There is the Tweepy library that is very popular. Make sure to check out the Tweepy web site for full details of what it will allow you to do.

To install Tweepy, run the following.

pip3 install tweepy

It will download and install tweepy and any dependencies.

Step 3 - Initial Python code and connecting to Twitter

You are all set to start writing Python code to access, process and analyse Tweets.

The first thing you need to do is to import the tweepy library. After that you will need to use the important codes that were defined on the Twitter webpage produced in Step 1 above, to create an authorised connection to the Twitter API.

Twitter app3

After you have filled in your consumer and access token values and run this code, you will not get any response.

Step 4 - Get User Twitter information

The easiest way to start exploring twitter is to find out information about your own twitter account. There is a API function called 'me' that gathers are the user object details from Twitter and from there you can print these out to screen or do some other things with them. The following is an example about my Twitter account.

#Get twitter information about my twitter account
user = api.me()

print('Name: ' + user.name)
print('Twitter Name: ' + user.screen_name)
print('Location: ' + user.location)
print('Friends: ' + str(user.friends_count))
print('Followers: ' + str(user.followers_count))
print('Listed: ' + str(user.listed_count))

Twitter app4

You can also start listing the last X number of tweets from your timeline. The following will take the last 10 tweets.

for tweets in tweepy.Cursor(api.home_timeline).items(10):
    # Process a single status
    print(tweets.text)
Twitter app5

An alternative is, that returns only 20 records, where the example above can return X number of tweets.

public_tweets = api.home_timeline()
for tweet in public_tweets:
    print(tweet.text)

Step 5 - Get Tweets based on a condition

Tweepy comes with a Search function that allows you to specify some text you want to search for. This can be hash tags, particular phrases, users, etc. The following is an example of searching for a hash tag.

for tweet in tweepy.Cursor(api.search,q="#machinelearning",
                           lang="en",
                           since="2018-05-01").items(10):
    print(tweet.created_at, tweet.text)

Twitter app7

You can apply additional search criteria to include restricting to a date range, number of tweets to return, etc


Check out the other blog posts in this series of Twitter Analytics using Python.

No comments:

Post a Comment