Amid all the wonderful things we keep hearing about big data, it is tempting to believe that business will simply take off once you implement a good data strategy. For companies new to it, though, big data does not magically deliver spectacular results to the bottom line. It takes time and perseverance to collect data.
Then you need to separate the wheat from the chaff, prioritize, optimize, visualize and extract the right insights. In the process, errors invariably creep in. Businesses need to learn from their mistakes and sharpen their big data strategy and implementation as they go.
To prepare you, here are some of the common problems that cause errors and can potentially derail your project. Knowing them can help you prevent them and achieve the best results.
Too much data
Today, data is pouring in from everywhere – apps and websites, desktops and mobiles, even watches. Smart devices, connected cars – the sources of data are bottomless. To collect everything you can and then begin to mine it for something meaningful is a recipe for disaster.
Yes, it is called big data for a reason. The whole idea is to analyze very large datasets to uncover patterns and glean insights into how different aspects of your business are functioning, how your customers are interacting, or how your campaigns are performing. However, according to the Big Data Executive Survey, 85% of organizations aim to be data-driven, but only 37% report success in this area. Collecting everything indiscriminately is, if nothing else, a huge waste of investment.
A report by Workfront.com indicated that 13% of survey respondents had so much data that their work became more, not less, confusing. While that may not seem like a huge number compared to the other factors ailing data analysts, it is not an insignificant portion.
Even in the world of these humongous datasets, there is data you need and data you do not. A glut of data leads to complexity and a host of problems. Often termed data saturation, collecting too much data simply because the option is available usually results in heaps of unstructured data that are difficult to sort through and draw meaningful insights from.
It is like walking into a rug shop and having the shopkeeper unroll every carpet, when you already know you want one in blue. You think that more is better, and the fear of missing out (FOMO) makes you examine every option. But each new roll only adds to the confusion, and by the end you are buried in fabric.
A good data scientist begins with a structured plan. Start with business objectives and think about the key questions you want to answer. Purposeful data collection that is strategic rather than voluminous is the way to efficient discoveries. Consider forming a data governance council in your company that works to eliminate redundancy in data collection and helps identify key objectives and the corresponding datasets.
Poor data quality
According to a 2018 study by Gartner, poor data quality costs companies a whopping $15m per year. That is a heavy price to pay for messy, unstructured data collection practices. Gartner also observes that the situation could worsen given the complexity of data sources and the massive volumes being collected. Poor-quality data has a detrimental effect on business value: it leads to an informational crisis, not to mention the time and resources wasted organizing all that useless data.
Advanced data integration tools require structured data; poor-quality data must instead be entered manually, which seems primitive today. Manual entry also carries the risk of typographic and other human errors. You might end up with embarrassing mistakes like a 47-year layover between flights.
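As a sketch of how such errors can be caught programmatically, the hypothetical check below flags itinerary records whose layover duration is implausible. The field names, records, and threshold are all invented for illustration, not taken from any real system:

```python
from datetime import datetime

# Hypothetical flight-itinerary records; field names are illustrative.
itineraries = [
    {"arrive": "2018-06-01 14:30", "depart": "2018-06-01 16:45"},  # normal layover
    {"arrive": "2018-06-01 14:30", "depart": "2065-06-01 16:45"},  # typo: year 2065
]

FMT = "%Y-%m-%d %H:%M"
MAX_LAYOVER_HOURS = 48  # sanity threshold; tune for your own domain


def flag_bad_layovers(records):
    """Return the records whose layover is negative or implausibly long."""
    bad = []
    for rec in records:
        arrive = datetime.strptime(rec["arrive"], FMT)
        depart = datetime.strptime(rec["depart"], FMT)
        hours = (depart - arrive).total_seconds() / 3600
        if hours < 0 or hours > MAX_LAYOVER_HOURS:
            bad.append(rec)
    return bad


print(flag_bad_layovers(itineraries))  # only the 2065 record is flagged
```

A simple plausibility rule like this will not catch every typo, but it costs almost nothing and stops the most embarrassing errors before they reach a report.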
Invest in collecting just enough data and make sure it is of good quality. Gartner recommends that marketers create compelling business cases that connect data quality improvement to key business priorities. Isolate the business performance metric before beginning data collection, and describe the target state before commencing a data project, to ensure efficiency and quality. Big data is the future of smart businesses, but only when it starts with good-quality data.
Overestimating predictive analysis
One of the most alluring promises of big data is predictive analytics. It is an exciting possibility when you factor in the internet of things. One example that illustrates the risk of blind reliance on predictive analysis is the May 6, 2010, stock market "flash crash," in which the Dow Jones Industrial Average plunged nearly 1,000 points in a matter of minutes:
"According to Vuorenmaa and Wang, when a major firm sold an abnormally large number of futures contracts over a short period of time, a “feedback loop” arose between high-frequency traders. Algorithms blindly passed the same “hot potato” shares back and forth between high-frequency trading firms until the whole stock market had been severely disrupted."
Likewise, predictive analytics is a wonderful way for businesses to identify patterns and build models that support better customer segmentation and highly personalized services. But it cannot tell you the future: data analysis still has no idea whether you will have a profitable holiday season or whether a blizzard will derail your Christmas collections.
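To make that limitation concrete, here is a toy sketch with entirely invented numbers: a least-squares trend fitted to six months of steady sales extrapolates smoothly into month seven, but an external shock such as a blizzard lands nowhere near the forecast.

```python
# Toy sketch: a linear trend fitted to past sales cannot anticipate a shock.
# All figures are invented for illustration.

def fit_line(xs, ys):
    """Ordinary least-squares fit of y = a + b*x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    b = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / \
        sum((x - mean_x) ** 2 for x in xs)
    a = mean_y - b * mean_x
    return a, b


months = [1, 2, 3, 4, 5, 6]
sales = [100, 110, 121, 130, 141, 150]  # steady growth of roughly 10 per month

a, b = fit_line(months, sales)
forecast = a + b * 7  # the model extrapolates the trend into month 7
actual = 45           # a blizzard keeps shoppers home; the model never saw it coming
print(forecast, actual)
```

The model is doing exactly what it was built to do, projecting the past forward; the gap between forecast and actual is the context that only human judgment can supply.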
The idea is to set realistic expectations from your big data project and always use it with a pinch of human intellect and business knowledge. Every insight you draw comes with a context and that context must be factored into your plan of action.
Data science is a potent methodology that helps businesses correctly interpret mathematical analyses of past outcomes to prepare for future ones. It is not, however, a silver bullet that answers every question. It needs to be used as a tool, with the appropriate context and business cases, to become a worthwhile asset to your strategy. Whenever a business reports that its big data project is failing to deliver the desired results, it is time to introspect and find out whether it has fallen into any of these three errors – fixing them will bring the work back on track.