Sentence structure is central to human language. We understand the difference between the sentences “Sam is happy because he won the lottery.” and “Won the lottery, Sam is happy.” The former follows the rules of the English language; the latter is more likely to be spoken by Yoda in Star Wars.
We are able to understand simple sentences like these, as well as far more complicated ones. But how do we ensure that a computer (or Skynet) can do the same?
Well, this is the job of Natural Language Processing, or NLP for short. My job at Full Fact involves improving their automated factchecking process, and this entails using NLP to process whatever claims politicians, journalists and others might make.
If you haven’t read my previous post, our factchecking process can be narrowed down to three stages: the first involves using NLP to process the claim, while the second involves going to the relevant websites, such as the Office for National Statistics, to get the relevant data. At the last stage, we present the simplified data in a way that is easy for all of mankind to understand.
This week, much of our focus was on the bridge between the first and the second stage. While this might seem straightforward on paper, NLP has yet to reach the capabilities of Jarvis, Tony Stark’s ultracapable artificial intelligence in Iron Man. We focused on extracting the keywords from sentences such as ‘GDP rose in 2015’ and then linking those keywords to claims of a certain type. Knowing the claim type gives us an idea of what data to obtain from the ONS website and then present.
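To give a flavour of what keyword extraction and claim-typing might look like, here is a toy sketch in Python. The claim types, metric names and pattern rules below are all illustrative assumptions for this post, not Full Fact’s actual pipeline, which is considerably more sophisticated than a handful of regular expressions.

```python
import re

# Hypothetical claim types mapped to simple patterns (illustrative only).
CLAIM_PATTERNS = {
    "statistic_change": re.compile(
        r"(?P<metric>GDP|unemployment|inflation)\s+"   # which statistic
        r"(?P<direction>rose|fell|increased|decreased)\s+"  # which direction
        r"in\s+(?P<year>\d{4})",                       # which year
        re.IGNORECASE,
    ),
}

def extract_claim(sentence):
    """Return (claim_type, keywords) for a recognised claim, else None."""
    for claim_type, pattern in CLAIM_PATTERNS.items():
        match = pattern.search(sentence)
        if match:
            return claim_type, match.groupdict()
    return None

print(extract_claim("GDP rose in 2015"))
# -> ('statistic_change', {'metric': 'GDP', 'direction': 'rose', 'year': '2015'})
```

A rule like this handles the simple case, but it says nothing about a sentence such as ‘GDP has been rising consistently from 2010 to 2015’, which is exactly why more capable NLP is needed.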
We are still working on this ‘bridge’. While our NLP programme understands a simple sentence like ‘GDP rose in 2015’, it struggles with more complicated sentences like ‘GDP has been rising consistently from 2010 to 2015’. Hopefully, in the future, this ‘bridge’ will be as strong and stable as London Bridge.