Welcome to issue 20 of the Big Data Innovation magazine.
When we look at what big data can do, it is almost always in the context of how the technology helps us perform analysis and reach conclusions.
Collecting data, forming analyzable datasets, analyzing them and finding correlations are all relatively simple tasks with effective software. What the software cannot do, however, is explain why a correlation exists or what it means in a wider context.
This is something the FTC recently highlighted in its report ‘Big Data: A Tool for Inclusion or Exclusion?’ One of the most important findings was that ‘Companies should remember that while big data is very good for detecting correlations, it does not explain which correlations are meaningful.’
To be useful, data needs human input and interpretation. To a database or computer, the number of people entering a shop is just a number; the context in which it was gathered is not understood. It could just as easily be the number of apples grown in an orchard or crimes in a neighbourhood. The correlations found between datasets are therefore just numerical patterns, stripped of context.
Without this kind of human interpretation, data can be detrimental to a company and can introduce bias and discrimination. The same report gives an example: ‘one company determined that employees who live closer to their jobs stay at these jobs longer than those who live farther away. However, another company decided to exclude this factor from its hiring algorithm because of concerns about racial discrimination, particularly since different neighbourhoods can have different racial compositions.’
So although big data has a significant part to play in businesses today, companies that rely on it too readily and without human context can end up shooting themselves in the foot.