By: Raul Pefaur

08.07.2014

Harnessing the Twitter API for Sentiment Strategies

In this project we will write an application that downloads tweets from Twitter. This continues our journey learning C#, which started with our Yahoo Finance data downloader.

Twitter has a REST API that lets us search for tweets, users and timelines, or even post new messages. We will use a C# Twitter library called Tweetinvi, which has everything you need to start building your own program. There are alternatives, but we found Tweetinvi the easiest and most complete. To run this program you need a Twitter developer account and your own credentials.

Application Structure

The program keeps the Twitter interactions, file management and application logic separate. These are split into four files: Program.cs, FileManagement.cs, Twitter.cs and Tweet.cs.

  • Program.cs – Loop over the Twitter usernames, download their tweets and send them to files to be saved.
  • FileManagement.cs – Load usernames and append new tweets to the end of each tweet file.
  • Twitter.cs – Download tweets, manage the rate limit constraints and log in to the Twitter API.
  • Tweet.cs – Define the format of the downloaded tweets.

Download C# Source Code for this Article

This separation lets you change the application logic, the location where files are saved, or even the Twitter library without affecting other parts of your code. The program starts by logging in to Twitter with the SetCredentials function, which requires four keys from the Twitter developer website.

Twitter.SetCredentials(accessToken:"xxxxx", accessTokenSecret:"xxxxx", consumerKey:"xxxxx", consumerSecret:"xxxxx");
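
If you would rather not hard-code the four keys, one option is to read them from environment variables. The following is only a sketch: the CredentialLoader class and the variable names are our own illustrative choices, not part of the article's source code or of Tweetinvi.

```csharp
using System;
using System.Collections.Generic;

public static class CredentialLoader
{
    // Hypothetical variable names; pick whatever suits your environment.
    public static readonly string[] KeyNames =
    {
        "TWITTER_ACCESS_TOKEN",
        "TWITTER_ACCESS_TOKEN_SECRET",
        "TWITTER_CONSUMER_KEY",
        "TWITTER_CONSUMER_SECRET"
    };

    // Reads the four keys through the supplied lookup function and
    // fails fast with a clear message if any of them is missing.
    public static Dictionary<string, string> Load(Func<string, string> lookup)
    {
        var keys = new Dictionary<string, string>();
        foreach (var name in KeyNames)
        {
            var value = lookup(name);
            if (string.IsNullOrWhiteSpace(value))
            {
                throw new InvalidOperationException("Missing credential: " + name);
            }
            keys[name] = value;
        }
        return keys;
    }
}
```

Calling CredentialLoader.Load(Environment.GetEnvironmentVariable) returns the four values, which can then be passed to SetCredentials. Taking the lookup as a parameter keeps the helper easy to test without touching the real environment.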

We retrieve the Twitter usernames from a file, twitterUsernames.csv, which contains the list of accounts to download. For this project we collected a list of 3000 financial and news accounts, scraped from Twitter lists and search results. We also estimate the next best time to update each user's tweets, based on how frequently they tweet.

var usernames = FileManagement.GetUsernames();
var nextUpdateTime = FileManagement.GetNextUpdateTime();

The tweet downloading and rate limiting are managed entirely by the Twitter class. Its GetTimeline function either downloads all the historical tweets available or downloads only the updates.

var tweetList = Twitter.GetTimeline(userName, lastTweet, ref tweetCount);

The freshly downloaded tweets are serialized to JSON by Json.NET and then written to a file, one file per Twitter username.

Optimizing Tweet Downloads

We want our program to download as many historical tweets as possible per user, and then recheck accounts for new tweets. Twitter limits API requests to 300 per 15 minutes and allows access to a maximum of 3200 historical tweets per user. Additionally, each request can download at most 200 tweets. To make productive use of our requests, we continually calculate the average time span between a user's latest tweets, and schedule the next check so that we're confident there will be at least one new tweet. The following code takes the tweets read from the user's file and calculates the average gap between them:

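// Requires: using System; using System.Collections.Generic; using System.Linq;
public static TimeSpan GetAverageTimeSpan(List<Tweet> tweets)
{
     // With fewer than two tweets there is no gap to measure, so fall back to a default.
     if (tweets.Count < 2)
     {
         return TimeSpan.FromSeconds(500);
     }
     else
     {
         List<DateTime> dates = new List<DateTime>();
         foreach (var line in tweets)
         {
             dates.Add(line.Time);
         }
         // n tweets span n - 1 gaps between the oldest and the newest.
         var difference = dates.Max().Subtract(dates.Min());
         var averageTimes = TimeSpan.FromMilliseconds(difference.TotalMilliseconds / (dates.Count - 1));
         return averageTimes;
     }
}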
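
To make the numbers above concrete, here is a small sketch of the request budget and of how an average gap could be turned into a recheck time. The DownloadPlanner class is hypothetical, and adding one average gap to the latest tweet time is our assumption about the scheduling rule; the article's source may do it differently.

```csharp
using System;

public static class DownloadPlanner
{
    // Requests needed to page through a full history:
    // ceil(maxTweets / tweetsPerRequest), computed with integer arithmetic.
    public static int RequestsForFullHistory(int maxTweets, int tweetsPerRequest)
    {
        return (maxTweets + tweetsPerRequest - 1) / tweetsPerRequest;
    }

    // How many users can have a full history downloaded in one rate-limit window.
    public static int FullHistoriesPerWindow(int requestsPerWindow, int requestsPerUser)
    {
        return requestsPerWindow / requestsPerUser;
    }

    // Assumed rule: recheck one average gap after the user's latest tweet.
    public static DateTime NextCheckTime(DateTime latestTweet, TimeSpan averageGap)
    {
        return latestTweet + averageGap;
    }
}
```

With the limits quoted above, RequestsForFullHistory(3200, 200) gives 16 requests per user, and FullHistoriesPerWindow(300, 16) gives 18 full histories per 15-minute window. This is why spending requests only on accounts that are likely to have new tweets matters.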

Downloading Tweets

When downloading tweets, we first check whether we have already downloaded tweets for this user. If we have historical tweets for the user, we only download the updates. Twitter's API offers two parameters for this, since_id and max_id, which rely on the fact that every tweet in the tweetosphere has a unique ID number. To download updates, we request every tweet after a given ID (since_id). This means "download all tweets since the last tweet we got".

public static long LastSavedTweetID(List<Tweet> getTweets)
{
     // The most recently saved tweet is the first entry in the list.
     var lastLine = getTweets.First();
     long lastTweetID = lastLine.ID;
     return lastTweetID;
}
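
The effect of since_id can be illustrated without calling the API at all. The sketch below mimics its semantics on an in-memory list of tweet IDs: only IDs strictly greater than the given one are returned. The SinceIdDemo class is purely illustrative and is not part of the article's source.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class SinceIdDemo
{
    // Mimics since_id on a list of tweet IDs: keep only IDs strictly
    // greater than sinceId, returned newest (largest ID) first.
    public static List<long> NewerThan(IEnumerable<long> timelineIds, long sinceId)
    {
        return timelineIds.Where(id => id > sinceId)
                          .OrderByDescending(id => id)
                          .ToList();
    }
}
```

For a timeline containing IDs 101, 103, 105 and 110, asking for everything since 103 yields 110 and 105. The tweet with ID 103 itself is not returned, so the last saved tweet is never downloaded twice.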

If we don’t have any historical tweets for this user, the program downloads all the historical tweets available. With each request we attempt to download 200 tweets, and max_id specifies the ID of the most recent tweet we want in that request, so each request continues where the previous one stopped.

public static List<Tweet> GetTimeline(string userName, List<Tweet> getTweets, ref int tweetCount)
{
   List<Tweet> tweets;
   if (getTweets.Count == 0)
   {
         // No saved tweets yet: download the full available history.
         Console.WriteLine(" First time downloading " + userName + ", creating new file.");
         tweets = TweetsDownload(true, userName, getTweets, ref tweetCount);
   }
   else
   {
         // Saved tweets exist: download only the updates.
         tweets = TweetsDownload(false, userName, getTweets, ref tweetCount);
   }
   return tweets;
}
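
The paging with max_id described above can be sketched as a simple loop. This is a minimal illustration rather than the article's TweetsDownload implementation: fetchPage is a hypothetical stand-in for one API request, and tweets are represented by their IDs only.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TimelinePager
{
    // fetchPage(maxId, count) is a hypothetical stand-in for one API request.
    // It should return up to `count` tweet IDs that are <= maxId (or the
    // newest tweets when maxId is null), ordered newest first.
    public static List<long> DownloadHistory(
        Func<long?, int, List<long>> fetchPage,
        int pageSize = 200,
        int maxTweets = 3200)
    {
        var all = new List<long>();
        long? maxId = null;

        while (all.Count < maxTweets)
        {
            var page = fetchPage(maxId, pageSize);
            if (page.Count == 0)
            {
                break; // no older tweets left
            }
            all.AddRange(page);
            // max_id is inclusive, so step just below the oldest tweet seen
            // to avoid downloading it again in the next request.
            maxId = page.Min() - 1;
        }
        return all.Take(maxTweets).ToList();
    }
}
```

Subtracting one from the oldest ID is the important detail: because max_id includes the tweet with that exact ID, reusing the oldest ID unchanged would return a duplicate at the start of every page.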

Encoding and Saving Tweets

Each tweet arrives as an object containing a lot of information (ID, language, message, date, etc.). We save a subset of this information in our own Tweet class:

/// Create a new tweet from an original Tweetinvi object
public Tweet(Tweetinvi.Core.Interfaces.ITweet original)
{
    this.ID = original.Id;
    this.Text = original.Text.Replace(",", "");
    this.Time = original.CreatedAt;
    this.Retweets = original.RetweetCount;
    this.Favourites = original.FavouriteCount;
    this.User = original.Creator.Name;
    this.Followers = original.Creator.FollowersCount;
}

The newly encoded tweets are added to a list, which is then written to the user's “username.txt” file.

// Encode each tweet and add it to a list
public static List<string> Serializer(List<Tweet> tweetList)
{
    var encodedList = new List<string>();
    foreach (var line in tweetList)
    {
       var encodedTweet = Tweet.Serializer(line);
       encodedList.Add(encodedTweet);
    }
    return encodedList;
}

// Open and write to the file only if there are new tweets
FileManagement.Writer(encodedList, file);
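
The "write only if there are new tweets" step might look something like the sketch below. TweetFileWriter is a hypothetical helper written for illustration; the actual FileManagement.Writer in the source download may differ.

```csharp
using System.Collections.Generic;
using System.IO;

public static class TweetFileWriter
{
    // Appends the encoded tweets to the file, one per line, and touches
    // the disk only when the list is non-empty.
    // Returns the number of lines written.
    public static int AppendIfAny(IReadOnlyList<string> encodedTweets, string path)
    {
        if (encodedTweets == null || encodedTweets.Count == 0)
        {
            return 0;
        }
        // AppendAllLines creates the file if it does not exist yet.
        File.AppendAllLines(path, encodedTweets);
        return encodedTweets.Count;
    }
}
```

Skipping the write for an empty list means a user with no new tweets never causes a file to be opened or created.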

API Restrictions Management

Finally, we need to rate limit the requests we make to the API. The WaitForAPIReady function makes the program sleep until new requests are available.

/// Check if API is ready for new request
private static int WaitForAPIReady()
{
      int count = 0;
      do
      {
          DateTime currentTime = DateTime.Now;
          currentTime = currentTime.AddMinutes(-15);

          count = (from time in timeStamps
                   where time > currentTime
                   select time).Count();
          if (count > 290)
          {
              Console.WriteLine(" Twitter downloading limit reached. Waiting...");
              Thread.Sleep(50000);            
          }
      } while (count > 290);
      return (300 - count);
}
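
A variation on the same sliding-window idea is to compute exactly how long to wait instead of sleeping for a fixed interval. The following sketch is our own illustration, not the article's code: SlidingWindowLimiter is a hypothetical class, and the current time is passed in by the caller so the logic can be tested without real delays.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class SlidingWindowLimiter
{
    private readonly Queue<DateTime> _timeStamps = new Queue<DateTime>();
    private readonly int _limit;
    private readonly TimeSpan _window;

    public SlidingWindowLimiter(int limit, TimeSpan window)
    {
        _limit = limit;
        _window = window;
    }

    // Record that a request was sent at the given time.
    public void Record(DateTime now)
    {
        _timeStamps.Enqueue(now);
    }

    // How long to wait before another request fits inside the window.
    // Returns TimeSpan.Zero when a request can be sent right away.
    public TimeSpan TimeUntilReady(DateTime now)
    {
        // Drop timestamps that have already left the window.
        while (_timeStamps.Count > 0 && now - _timeStamps.Peek() >= _window)
        {
            _timeStamps.Dequeue();
        }
        if (_timeStamps.Count < _limit)
        {
            return TimeSpan.Zero;
        }
        // Enough old requests must expire to bring the count below the limit.
        int excess = _timeStamps.Count - _limit;
        DateTime blocker = _timeStamps.ElementAt(excess);
        return blocker + _window - now;
    }
}
```

A caller would create the limiter with a limit such as 290 and a 15-minute window, sleep for the returned time span when it is non-zero, and call Record after each request.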

That was a brief explanation of how we handled Twitter’s API limitations using Tweetinvi. The downloader is built, and now the fun part begins: which accounts shall we scan, and what can we do with the downloaded data? It would be fun to see an algorithm that uses Twitter sentiment data to make investing decisions!

P.S.: The libraries needed (Json.NET, Tweetinvi) are not included in the source code download. You can install them through NuGet in Visual Studio, or download them from the developers’ websites.
