Top Big Data Challenges Revisited

We’ve now been wrestling with Enterprise Big Data for a few years. Here’s a summary of our progress so far.

There’s still a lot of technology to learn

There’s a lot of new technology to master, and it’s changing fast. There are new in-memory platforms like SAP HANA and open source projects like Hadoop and Spark. There are new techniques such as graph databases, text analytics, spatial processing, machine learning, and many more.

The good news is we’re now past the peak of the hype cycle. In other words, people have had the chance to experiment and carry out pilots. They now have a better idea of what the new technologies can and can’t do.

When people adopt new technology, they tend to do old things in new ways. But now that people have had a closer look, they’re finding new things to do.

For example, there’s a lot less talk about ripping-and-replacing traditional analytics and more about wrapping-and-renewing: using Big Data to extend and innovate existing solutions.

The right people are still hard to find

Implementing Big Data means getting the right people with the right skills. It’s still hard to find people who really know these systems.

The good news is that companies have been able to turn to existing analytics staff, who typically jump at the opportunity to learn new marketable skills. This is helped by the quantity of free training resources available online, such as open.sap.com and the SAP HANA Academy. These courses explain the detailed technical steps to get value out of both in-memory platforms and Hadoop.

There’s also a shortage of people who can get the most out of all this new data. Harvard Business Review called Data Scientist the sexiest job of the 21st century (it certainly sounds sexier than “actuary,” which is perhaps the closest 20th-century equivalent!). Data Scientists have deep analytic and statistical skills combined with knowledge of the business. They are in high demand and they command high salaries.

What’s new is that technology is helping to remove some of the bottlenecks. For example, there are now easier-to-use, more automated predictive analytics tools that can be deployed by, say, marketing staff looking to optimize campaigns. This frees up overloaded data scientists for the most strategic projects. And companies are starting to offer “data science as a service.” For example, SAP has teams of people with advanced degrees in statistics who work on co-innovation projects with customers.

The right business case is key

One of the biggest challenges is still building the initial business case for Big Data projects in enterprise environments. And as usual there are two different approaches: either cutting costs, or creating new opportunities.

One of my favorite cost-saving examples is a large European airline. They needed to be able to store and query large amounts of detailed historical information for legal discovery reasons. Initially, they would back it up to tape, and reload it to the data warehouse only when there was a court case. But that required a lot of work, so they ended up leaving it in the data warehouse even though it was seldom used. By moving that data to a Hadoop platform, they were able to save a lot of money, and the relatively slow query speed wasn’t a concern for this use case.

Another example is Molson Coors in Canada. They moved their BW system to HANA, not to speed up queries (they were already using the in-memory BW accelerator) but because they had increasing analytic demands, yet limited BI resources. Using SAP HANA vastly simplified the system, resulting in less data storage, lower maintenance, and reduced development effort.

The second approach is to go for something high-value that the company has always wanted to do, but it just wasn’t feasible — and a lot of these have to do with improving customer service.

For example, CenterPoint Energy in Houston used SAP HANA to create a “mind-reading” call center application. It identifies callers by their caller ID, then runs a predictive algorithm using two years of customer history to choose among 40 different reasons the customer could be calling, all in less than a second. It then directs them to the most appropriate service, and provides all the information the operators need to handle the call. The result is that customers get better, faster service at a lower cost to the company.
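To make the idea of a call-reason predictor a little more concrete, here is a minimal sketch in plain Python. It is not the algorithm used in the project described above, which isn’t specified here; it assumes a simple naive Bayes classifier, and the feature names, call reasons, and history rows are all invented for illustration.

```python
import math
from collections import Counter, defaultdict

def train(history):
    """Count how often each (hypothetical) account event appears per call reason."""
    reason_counts = Counter()
    feature_counts = defaultdict(Counter)
    vocabulary = set()
    for features, reason in history:
        reason_counts[reason] += 1
        for feature in features:
            feature_counts[reason][feature] += 1
            vocabulary.add(feature)
    return reason_counts, feature_counts, vocabulary

def predict(model, features):
    """Return the most likely call reason for a caller's recent account events."""
    reason_counts, feature_counts, vocabulary = model
    total = sum(reason_counts.values())
    best_reason, best_score = None, -math.inf
    for reason, n in reason_counts.items():
        score = math.log(n / total)  # prior: how common this reason is overall
        for feature in vocabulary:
            # Laplace-smoothed probability that this event is present for this reason
            p = (feature_counts[reason][feature] + 1) / (n + 2)
            score += math.log(p if feature in features else 1 - p)
        if score > best_score:
            best_reason, best_score = reason, score
    return best_reason

# Invented sample history: (recent account events, reason the customer called)
history = [
    ({"payment_attempt_last_hour"}, "billing"),
    ({"payment_attempt_last_hour", "high_bill"}, "billing"),
    ({"outage_in_area"}, "outage"),
    ({"outage_in_area", "high_bill"}, "outage"),
    ({"move_request_open"}, "move_service"),
]
model = train(history)
print(predict(model, {"outage_in_area"}))
```

A real system would score far more history across all 40 reasons, and it would have to do so against live data, since the caller may have acted on their account only minutes earlier.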

The project wasn’t new — it was something they had wanted to do for a long time, but previous attempts had failed. In order to be feasible, the results had to be available in less than five seconds. Using traditional systems it took over a minute and a half — far too slow to be useable on the call. And precalculating the information didn’t work either — not only was all that processing and storage expensive, customers were often calling about a bill they had tried to pay just five minutes before.

In all these cases, the initial projects generated buy-in from the business users, and the initial investments were then leveraged for other applications at a lower marginal cost.

Integrating with existing systems is more important

Most enterprise Big Data projects today are in standalone silos, with limited links to the existing corporate infrastructure. For example, while it’s clear that Hadoop is enterprise ready in that it can be used for valuable projects, it’s also clear that it has to be a lot enterprise-readier. In order to get the full value of these new technologies, it’s vital to connect them to other corporate data. For example, it’s pointless to track your Facebook likes without accounting for your marketing spend and whether they actually have any effect on sales.
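As a rough illustration of what connecting social data to corporate data can mean in practice, the sketch below joins weekly likes with marketing spend and sales, then checks how likes relate to each. All figures are made up, and the three dictionaries stand in for what would really be separate systems (a social feed, a marketing system, and a sales ledger).

```python
from math import sqrt

# Made-up weekly figures, keyed by ISO week, standing in for three separate systems
likes = {"2015-W01": 120, "2015-W02": 340, "2015-W03": 210, "2015-W04": 400}
spend = {"2015-W01": 1000, "2015-W02": 3000, "2015-W03": 2000, "2015-W04": 3500}
sales = {"2015-W01": 50_000, "2015-W02": 51_000, "2015-W03": 49_500}  # W04 not yet booked

def join_by_week(*sources):
    """Keep only the weeks present in every source, one row per week."""
    weeks = set(sources[0]).intersection(*sources[1:])
    return [(week, *(source[week] for source in sources)) for week in sorted(weeks)]

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)

rows = join_by_week(likes, spend, sales)
_, like_col, spend_col, sales_col = zip(*rows)
print("likes vs spend:", round(pearson(like_col, spend_col), 2))
print("likes vs sales:", round(pearson(like_col, sales_col), 2))
```

With so few data points the numbers prove nothing, but the shape of the exercise is the point: likes only become meaningful once they sit next to spend and sales for the same period.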

Truly integrated Big Data is still at the planning stage in most organizations. Recently, I’ve been in a lot of meetings with customers who are debating their strategic information roadmaps. They’re looking for new best-practice architecture blueprints that combine the new big data systems not only with traditional analytics, but also with their core business applications.

New business models are the next big opportunity

To get the full value of big data, you have to change your business processes, and change management is notoriously difficult. But of course, it’s also the big opportunity. Because you can now measure new things in new ways, you can create new products and services.

For example, Vodafone Netherlands was able to use predictive analytics to figure out the right people to target for discounted roaming during the ski season. And Kaeser Compressor used the data coming from their equipment’s sensors to create new online predictive maintenance services.

Conclusion

Big Data has come a long way. The fact that we’re now in the trough of disillusionment shows we’re one big step closer to the slope of enlightenment and the plateau of productivity.

Posted

May 22, 2015

Best practice

Timo Elliott

Tags:

#bigdata, Analytics, Big Data, Challenges, Predictive

Comments

2 responses to “Top Big Data Challenges Revisited”

Neil Raden

July 8, 2015

I resent that insinuation about “actuary” not being sexy! LOL But seriously, I do have a problem with the idea that more “automated predictive analytic tools” are ready for prime time. I dealt with this in a paper http://www.slideshare.net/NeilRaden/pervasive-analytics-needs-organizational-change-better-software-and-training . It’s unquestionably a good idea, but it takes more than dragging a Naive Bayes icon onto a desktop, it takes a restructured organization to make it work and it takes a long-term commitment to training and mentoring, not an on-line class.

In our engagements, we find that enterprise big data initiatives are still thinking about Hadoop in its earliest stages – a big, cheap data dump to be used by a handful of people writing code for batch queries. To work in the enterprise, they need to think about how Hadoop can address metadata (it basically has none), dynamic load balancing, real-time, security, reliability. Part of the calculus that made Hadoop “cheap” was the complete lack of these features. So, the cheap thing is marginalized now and Hadoop for the enterprise has to stand on its own two feet. It’s getting there.

But overall, I agree with your position that big data should be pointed toward new opportunities, not just re-platforming old apps.