Introducing Trends: How we're tracking the conversation this Fringe
Fringebiscuit is dedicated to producing quality, instantly digestible theatre news and reviews. But there's a problem – no matter how hard our reviewers work (and boy, do they work hard) there will always be some shows we can't see in time. So instead, Fringebiscuit is harnessing the power of Twitter to bring you the freshest possible analysis of what's hot at this Edfringe. Here's a more detailed – but not too technical – overview of how we do this:
Tracking the Twitter conversation
Fringebiscuit Trends works by collecting a large sample of daily tweets using Twitter's built-in API. We select these tweets using keywords relating to the Edinburgh Festival Fringe. Then we use each show and performer's official Twitter handle and hashtag to see who's talking about what.
We use Natural Language Processing techniques to gather and analyse many more conversations than we could ever hope to actually read the old-fashioned way. Based solely on what you are talking about, we'll be bringing you the results of these analyses in easy-to-digest visualisations and blog posts.
Crunching the numbers — the gory details
Here's a slightly more in-depth overview of how we do what we do.
Every day, we start with a very large dataset containing all of the tweets our search routine has found from the previous 24 hours. Some of these relate to shows, but many are just background noise. We filter out retweets right away, as these hugely increase the size of our dataset without adding anything new or interesting. Once we have our sample, we do a number of things with it:
- We extract hashtag and handle mentions — this is a great way for us to tell who's talking about what.
- We use a technique called Natural Language Processing (NLP) to clean the tweet text itself, removing commonly used words and phrases and other 'noise'.
- We use an analytical approach called Sentiment Analysis to figure out whether the tweet is saying mostly positive or negative things. This is a tricky call to make, since people express themselves in many different ways, so we're constantly training this part of the program to do better on our data.
- Finally, we analyse the handles, hashtags and the tweet text itself to try and figure out which (if any) Fringe show each tweet relates to. Each day we use this link to build our dataset for each and every show in the Fringe, so we know who is talking about what, and why.
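For the curious, the first three steps above can be sketched in a few lines of Python. This is only an illustration, not our production code: the stop-word list, the word lists used for scoring and all the function names are made up for the example, the "RT @" test for retweets is a simplification, and the word-counting score is a crude stand-in for a trained Sentiment Analysis model.

```python
import re

# Hypothetical word lists for illustration; real lists would be far larger.
STOP_WORDS = {"a", "an", "the", "is", "was", "at", "to", "of", "and", "in", "it", "this"}
POSITIVE_WORDS = {"loved", "brilliant", "hilarious", "great"}
NEGATIVE_WORDS = {"boring", "awful", "dull"}

HASHTAG_RE = re.compile(r"#(\w+)")
HANDLE_RE = re.compile(r"@(\w+)")
URL_RE = re.compile(r"https?://\S+")


def is_retweet(text):
    """Treat tweets starting with 'RT @' as retweets (a simplifying assumption)."""
    return text.startswith("RT @")


def extract_mentions(text):
    """Return the lower-cased hashtags and handles found in the tweet text."""
    hashtags = [h.lower() for h in HASHTAG_RE.findall(text)]
    handles = [h.lower() for h in HANDLE_RE.findall(text)]
    return hashtags, handles


def clean_text(text):
    """Strip URLs, hashtags, handles, punctuation and stop words."""
    for pattern in (URL_RE, HASHTAG_RE, HANDLE_RE):
        text = pattern.sub(" ", text)
    words = re.findall(r"[a-z']+", text.lower())
    return [w for w in words if w not in STOP_WORDS]


def positivity_score(tokens):
    """Crude stand-in for sentiment: positive word count minus negative word count."""
    positive = sum(1 for t in tokens if t in POSITIVE_WORDS)
    negative = sum(1 for t in tokens if t in NEGATIVE_WORDS)
    return positive - negative


def preprocess(tweets):
    """Drop retweets, then attach mentions, cleaned tokens and a score to each tweet."""
    results = []
    for text in tweets:
        if is_retweet(text):
            continue
        hashtags, handles = extract_mentions(text)
        tokens = clean_text(text)
        results.append({
            "hashtags": hashtags,
            "handles": handles,
            "tokens": tokens,
            "positivity": positivity_score(tokens),
        })
    return results
```

Feeding in a retweet and an original tweet such as "Loved @JoeBloggs at the #Edfringe tonight!" would leave just the original, tagged with the handle joebloggs and the hashtag edfringe.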
We try and keep things as simple as possible in terms of what we present for each show. Here's a quick run-through of the daily metrics we calculate:
Daily Mentions: The simplest way to keep track of what's trending in Fringe Twitter-space, this is the number of Fringe-goers who mention a show on a given day. If ten users see a performance and tweet about it afterwards, that's ten Daily Mentions.
Mention Share: Things are getting competitive! This metric describes the proportion of the Fringe conversation (within each genre) which is about a particular show. If 100 users tweet about comedy, and one show is mentioned in 10 of those tweets, it has a 10% Mention Share in the conversation.
Positivity Share: This is where we see who's actually making Fringe-goers happy. After testing positivity in each tweet with our Sentiment Analysis, we calculate a total Positivity Score for each show indicating Fringe-goers' response. For all the 'positivity' surrounding a particular genre (e.g. Comedy), if 10% of it is associated with a certain show, that's a Positivity Share of 10%.
We'll be calculating these results daily for every single show in the Fringe, and letting you know who's at the top.
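If you like to see the arithmetic, here is a rough sketch of how the three metrics could be computed. It rests on some assumptions of ours for the example: each record is a hypothetical (user, show, genre, positivity) tuple, the genre total is taken as the sum of Daily Mentions across the shows in that genre, and negative sentiment scores are counted as zero positivity.

```python
from collections import defaultdict


def daily_metrics(mentions):
    """Compute per-show metrics from (user, show, genre, positivity) records."""
    users_by_show = defaultdict(set)
    positivity_by_show = defaultdict(float)
    genre_of = {}
    for user, show, genre, positivity in mentions:
        users_by_show[show].add(user)  # each user counts once per show
        positivity_by_show[show] += max(positivity, 0.0)  # ignore negative scores
        genre_of[show] = genre

    # Genre totals are built from the per-show figures (an assumption).
    mentions_by_genre = defaultdict(int)
    positivity_by_genre = defaultdict(float)
    for show, users in users_by_show.items():
        mentions_by_genre[genre_of[show]] += len(users)
        positivity_by_genre[genre_of[show]] += positivity_by_show[show]

    metrics = {}
    for show, users in users_by_show.items():
        genre = genre_of[show]
        genre_positivity = positivity_by_genre[genre]
        metrics[show] = {
            "daily_mentions": len(users),
            "mention_share": len(users) / mentions_by_genre[genre],
            "positivity_share": (
                positivity_by_show[show] / genre_positivity if genre_positivity else 0.0
            ),
        }
    return metrics
```

So a comedy show mentioned by one user out of four across the genre would come out with a Mention Share of 25%.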
Fringe-goers — Want to make sure your Tweets are included?
Chances are, if you've been tweeting about the Fringe, your tweets have already made their way into our analyses. We search for a range of keywords like "Fringe", "Edfringe" and "Edinburgh Festival" to pull our daily sample. We also limit our results by show and user, so if you tweet ten times in a day about a single show, they won't all make it into the sample (though if you see ten different shows and tweet about them all, we'll use all ten of your tweets).
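Limiting by show and user could look something like the sketch below. The tweet dictionaries with "user" and "show" keys, the function name and the cap of one tweet per user per show are all assumptions for illustration; the real limit may differ.

```python
from collections import Counter


def limit_by_show_and_user(tweets, max_per_pair=1):
    """Keep at most max_per_pair tweets for each (user, show) pair, in order."""
    counts = Counter()
    kept = []
    for tweet in tweets:
        key = (tweet["user"], tweet["show"])
        if counts[key] >= max_per_pair:
            continue  # this user has already been counted for this show
        counts[key] += 1
        kept.append(tweet)
    return kept
```

With this cap, ten tweets from one user about one show shrink to a single tweet, while ten tweets about ten different shows all survive.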
If you want to make absolutely sure we get your input, use #FringebiscuitTrends to put your tweets at the top of our search list.
Performers — want to make sure we can find your show?
We're constantly developing our algorithms to get better and better at linking conversations out there in Twitter-space to specific shows here at the Fringe. It's as hard as it sounds, but the low-down is this:
- If you have a unique Twitter handle for your show (it doesn't matter whether it's a personal one, as long as it is used for only one show), we'll use this to identify tweets relating to your show. If you've published it on your show's Edfringe.com entry, we are already using it to link tweets to your show. If you use one handle for multiple shows, we can't use this to identify your show.
- We also make a best-guess at the hashtag users are likely to use to refer to your show. For example, for a show called Joe Bloggs Does Stand-Up, we would look for #JoeBloggsDoesStandUp. We look for abbreviations of this too. So, we would find tweets with #JoeBloggs, but not #StandUpJoeBloggs or #JoeBloggsComedy. Shows with short or common names (e.g. a show simply called "Comedy") won't have a best-guess hashtag: #Comedy is likely to turn up in a wide range of tweets, so we can't reliably take it to refer to that specific show.
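As a rough illustration of the best-guess hashtag idea, the sketch below builds candidates from a show title. The exact abbreviation rule here is our guess for the example: we take the full title plus shorter runs of its leading words, skip a lone first word of a longer title, and drop anything found in a hypothetical list of common words or shorter than an assumed minimum length.

```python
import re

# Hypothetical list of words too generic to identify a single show.
COMMON_WORDS = {"comedy", "theatre", "fringe", "show", "standup", "edinburgh"}
MIN_LENGTH = 5  # assumed minimum hashtag length, chosen for the example


def candidate_hashtags(title):
    """Return lower-cased hashtag guesses: the full title plus leading-word prefixes."""
    words = re.findall(r"[a-z0-9]+", title.lower())
    candidates = []
    for end in range(len(words), 0, -1):
        if end == 1 and len(words) > 1:
            continue  # a lone first word of a longer title is too ambiguous
        tag = "".join(words[:end])
        if tag in COMMON_WORDS or len(tag) < MIN_LENGTH:
            continue
        candidates.append(tag)
    return candidates


def matches_show(tweet_hashtags, title):
    """True if any hashtag in the tweet is one of the show's candidate hashtags."""
    candidates = set(candidate_hashtags(title))
    return any(tag.lower() in candidates for tag in tweet_hashtags)
```

Under these assumptions, Joe Bloggs Does Stand-Up is matched by #JoeBloggs but not by #StandUpJoeBloggs or #JoeBloggsComedy, and a show called "Comedy" gets no candidates at all.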
We're constantly adding new search terms to our database, so we can adapt as the Fringe progresses. It'll never be perfect, but we're getting as close as we can. If your show has a hashtag or handle which you think we've missed, let us know at: