Author: Giacomo Boscaini-Gilroy

Reflections after Four Weeks at Full Fact.

When I started at Full Fact, I wanted to build a program to carry out their first ever automated factcheck.

Four weeks later, that’s what I finished with: so far, the automated factchecker can check ‘Employment is rising’. It reads the word ‘employment’, goes to the Office for National Statistics’ labour force data, and runs a couple of simple tests to get an idea of whether the number of people in work really is rising.
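To give a flavour of what "a couple of simple tests" could look like, here is a minimal sketch. It is an illustration under my own assumptions, not the code I wrote at Full Fact: the function name `is_rising`, the `min_share_up` threshold and the figures are all made up for the example, and no real ONS data is used.

```python
def is_rising(values, min_share_up=0.6):
    """Rough check of whether a series of figures is rising.

    Hypothetical helper: `values` is a list of numbers in time order.
    """
    if len(values) < 2:
        raise ValueError("need at least two observations")

    # Test 1: the latest figure is higher than the earliest one.
    net_increase = values[-1] > values[0]

    # Test 2: most period-to-period changes are increases.
    changes = [later - earlier for earlier, later in zip(values, values[1:])]
    share_up = sum(1 for change in changes if change > 0) / len(changes)

    return net_increase and share_up >= min_share_up


# Illustrative figures only (millions of people), NOT real ONS data.
employment = [31.0, 31.2, 31.1, 31.4, 31.6]
print(is_rising(employment))
```

Using two tests rather than one means a single dip in an otherwise upward series does not flip the answer, while a series that merely ends higher after mostly falling is not counted as rising.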

Beyond that one example, the code I wrote shows real promise. It understands what kind of data needs to be looked up when presented with an example sentence.

I spent four weeks on a problem that will take months to solve, so I was only creating a basic framework. Although I did not have enough time to get deep into the part of the software that analyses ONS data, I was able to find a good approach for the part that understands what data to retrieve.

There are still many challenges. When you get into all the different ways sentences are formulated, it seems impossible that there might be one way to process them all. How do you work out that ‘since 2010’, ‘recently’ and ‘last month’ all give you information about time periods, while ‘to Europe’ and ‘in my constituency’ are both location-related?
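One naive way to start on that question is a handful of hand-written patterns. The sketch below is only an assumption about how such rules might look, not the structure I ended up with; the pattern lists and the `classify_phrase` name are hypothetical, and they cover little beyond the five example phrases.

```python
import re

# Hypothetical, deliberately tiny rule sets for illustration.
TIME_PATTERNS = [
    r"\bsince \d{4}\b",                       # 'since 2010'
    r"\brecently\b",                          # 'recently'
    r"\b(last|this|next) (week|month|year)\b",  # 'last month'
]
LOCATION_PATTERNS = [
    r"\bto [A-Z][a-z]+\b",       # 'to Europe' (relies on the capital letter)
    r"\bin my constituency\b",   # 'in my constituency'
]


def classify_phrase(phrase):
    """Label a phrase as 'time', 'location' or 'unknown'."""
    if any(re.search(p, phrase, re.IGNORECASE) for p in TIME_PATTERNS):
        return "time"
    # Location patterns stay case-sensitive so that 'to Europe' matches
    # but 'to increase' does not.
    if any(re.search(p, phrase) for p in LOCATION_PATTERNS):
        return "location"
    return "unknown"


for phrase in ["since 2010", "recently", "last month",
               "to Europe", "in my constituency"]:
    print(phrase, "->", classify_phrase(phrase))
```

Rules like these break quickly ("to Parliament" is not really a location in the same sense), which is exactly why one set of rules for every sentence seems out of reach.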

By looking at phrases from different points of view, and discussing them with others, I found and was able to implement a structure for the code that will form a good basis for future work.

It turned out my time at Full Fact was about more than just coding. Along with experience in programming and communicating, I learnt about linguistics, an unexpected but interesting field. To make sure that the tool picked up all relevant claims, and excluded irrelevant ones, I had to think about the construction of speech in some detail.

I also learnt it’s difficult to get tired of £2.50 falafel wraps at lunchtime.

Weeks 2-3 at Full Fact.

The automated factchecking project is split into two parts: scanning text and checking its validity. When I started a few weeks ago, I intended to spend equal amounts of time on each. However, I spent the first week using some very rudimentary programming tools, and it became clear that it would be much more worthwhile to explore new avenues and come up with new ideas in order to produce work that will be useful in the long term. And the long term is important, because I am kicking off something that will be built on in the coming years.

So I got stuck into the first part, learning about natural language processing and understanding how to tease out the important information from sentences. It’s interesting to spend your time thinking about how language is formulated, and when you see how complicated it is, it makes you wonder how the tech giants have built intelligent personal assistants like Siri and Cortana.

Having found some much more useful analytical tools, I came up with ways to decide what data is required to check the claim that is fed in. There are some immense difficulties and limitations. For instance, how can you tell that “this government has reduced spending on new housing” is a factual claim, but “this government might well reduce its infrastructure investment” is just speculation, and that only one of these should be factchecked?
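A crude first attempt at that distinction is to look for hedging words. The sketch below assumes a small hand-picked word list (`HEDGES`) and a hypothetical `is_speculative` function; it is an illustration of the idea rather than the approach used in the project, and it is easy to fool.

```python
import re

# Hypothetical list of words that often signal speculation.
HEDGES = {"might", "may", "could", "would", "should",
          "possibly", "perhaps", "probably"}


def is_speculative(sentence):
    """Return True if the sentence contains any hedging word."""
    words = re.findall(r"[a-z']+", sentence.lower())
    return any(word in HEDGES for word in words)


claims = [
    "this government has reduced spending on new housing",
    "this government might well reduce its infrastructure investment",
]
for claim in claims:
    label = "speculation" if is_speculative(claim) else "factual claim"
    print(label, "->", claim)
```

The limits show up straight away: because the text is lowercased, "spending rose in May" would be flagged as speculation, and a sentence can speculate without using any of these words.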

I saw an opportunity to take a diversion from these thoughts last week, when Will Moy, Director of Full Fact, was on BBC Radio 4’s Moral Maze. Giles Fraser, one of the panellists, explored the conflict between technology and humanity in the context of automated factchecking, saying that once a computer algorithm decides what is right and wrong, “the truth” has been dehumanised.

This gives an opportunity to get further into what automated factchecking really aims to do. It does not ask computer software to make a moral judgement like a human can. Full Fact provides people with the tools they need to check things by themselves and come to an informed decision. The factchecks don’t just give a yes or no answer; they also point out the shades of grey. In the same way, my code will not tell you what to believe and what is right, but will allow people to confidently decide for themselves.

Importantly, rather than replacing the factchecker, the software’s role is to make their work easier. Every time a simple phrase like “unemployment stands at 5%” appears, a person should not have to take up their time retrieving statistics from the ONS website, when a computer could do that instead.
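As a rough sketch of that kind of lookup, the example below pulls the topic and the percentage out of a sentence and compares it with a published figure. Everything in it is an assumption made for illustration: the function names, the `published_figures` dictionary standing in for a real ONS lookup, the placeholder value in it, and the half-point tolerance.

```python
import re


def parse_percentage_claim(sentence):
    """Extract (topic, percentage) from e.g. 'unemployment stands at 5%'."""
    match = re.search(r"(\w+) stands at (\d+(?:\.\d+)?)%", sentence.lower())
    if match is None:
        return None
    return match.group(1), float(match.group(2))


def check_claim(sentence, published_figures, tolerance=0.5):
    """Compare a claimed percentage with a published one.

    `published_figures` is a hypothetical stand-in for data retrieved
    from a statistics provider; `tolerance` allows for rounding.
    """
    parsed = parse_percentage_claim(sentence)
    if parsed is None or parsed[0] not in published_figures:
        return "cannot check"
    topic, claimed = parsed
    actual = published_figures[topic]
    return "consistent" if abs(actual - claimed) <= tolerance else "inconsistent"


# Placeholder value for the example, NOT a real ONS figure.
figures = {"unemployment": 4.9}
print(check_claim("Unemployment stands at 5%", figures))
```

Returning "cannot check" rather than guessing keeps the person in charge: anything the pattern does not recognise is simply left for the factchecker.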

Week 1 at Full Fact.

My internship is at Full Fact, the UK’s independent factchecking organisation. Ahead of this year’s referendum, they worked with ITV and Sky News to correct factual errors made in live debates, and they have asked for and got corrections in all the national newspapers. They play an ever-growing role in the effort to hold the media and politicians accountable for their claims.

Many assertions made in public debate come up again and again; at Full Fact they call them “zombie claims” (because they just don’t die). These are claims like ‘poverty increased in the past six months’ or ‘unemployment decreased last year’. Factcheckers spend valuable time finding and interpreting government data for poverty or unemployment every time new datasets are released. In order for Full Fact to spend more time getting into deeper questions, and for journalists to have faster access to the truth, the charity is aiming to automate the most repetitive parts of its work.

This is what I’m working on: automated factchecking. The code I write will hopefully lead to the first ever factcheck carried out by Full Fact with a computer program. My project has quickly taken shape, and is divided into two parts. The first is natural language processing (NLP), which interprets, for example, Jeremy Corbyn’s claim at Prime Minister’s Questions that more people than ever live on the streets. The second does the statistical work, checking the claim against government data for homelessness.

So far I have been using NLP and have discovered that the problems involved in teaching a computer how to read are both interesting to solve and a massive challenge. Sentences are constructed with linguistic rules, but they are nowhere near as logical as the instructions a computer understands. Every rule that I come up with that can interpret text seems to require a hundred exceptions.

More broadly, the office experience is great. Having been provided with a good starting point, I am working on the code itself on my own. I am even helping shape what direction to take it in, so there is a certain amount of responsibility I have never felt before in the workplace. I am doing this because I like politics and this is something that can make a difference to democracy. I love having a TV on the wall showing the BBC news channel, and receiving emails with information on the day’s political happenings! To top it off, sunny lunchtimes with a falafel wrap in Gray’s Inn Gardens are beautiful.