Between answering questions this week at BI2014 in Orlando and preparing to ask some at Hadoop Summit in Amsterdam next week, here are some thoughts on where we are with Hadoop for the enterprise.
The elephant in the room. It’s now clear that Hadoop will play a key part in your future enterprise architecture. But how will it integrate with existing infrastructures and other new technologies?
Growing pains. It’s teenagers vs parents. Teenagers are justifiably proud of disrupting the way their parents did things. But they have a tendency to underestimate the value of their parents’ experience. As they get older, they’re forced to deal with more of the boring necessities of life — and realize their parents knew a thing or two after all.
As Hadoop matures beyond pixel-only business models (social websites, online video games) and inexpensive data storage, it is being forced to embrace the enterprise systems that drive real-world business, and to learn a few lessons from them.
Old dogs are learning new tricks. Even without Hadoop, enterprise systems are poised for big disruptive changes. New in-memory systems eliminate the need for separate operational and analytic environments, combining transaction integrity with breakthrough performance. The data is stored once, and every transaction is available for analysis the instant it is written. The increased simplicity and agility of the system means it’s both faster and — despite the higher costs of memory — cheaper to run than traditional architectures.
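To make the "stored once, analyzed instantly" idea concrete, here is a minimal sketch. It uses Python's built-in SQLite in-memory database purely as a stand-in; it is not an enterprise in-memory platform, and the `orders` table and the `place_order` and `revenue_by_region` helpers are hypothetical names invented for illustration. The point is only that the same single copy of the data serves both the transactional write and the analytic query, with no separate load step in between.

```python
import sqlite3

# Stand-in only: SQLite's in-memory mode, not a real enterprise in-memory system.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)

def place_order(region, amount):
    """Operational side: record one order inside a transaction."""
    # 'with conn' commits on success and rolls back if an exception is raised.
    with conn:
        conn.execute(
            "INSERT INTO orders (region, amount) VALUES (?, ?)",
            (region, amount),
        )

def revenue_by_region():
    """Analytic side: aggregate directly over the same table, no copy or batch load."""
    rows = conn.execute(
        "SELECT region, SUM(amount) FROM orders GROUP BY region ORDER BY region"
    )
    return dict(rows.fetchall())

place_order("EMEA", 100.0)
place_order("APJ", 40.0)
snapshot_1 = revenue_by_region()

# A new transaction is visible to analysis as soon as it is committed.
place_order("EMEA", 25.0)
snapshot_2 = revenue_by_region()
```

In a traditional setup the second snapshot would only reflect the new order after the next warehouse load; here both snapshots come straight from the one store.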
Simplicity and agility. Both Hadoop and in-memory deliver enterprise architects the agility that has been sorely missing in the past. Hadoop doesn’t require upfront definition of data schemas. In-memory systems offer fast analysis without complex caching and aggregate tables.
Changes to support the business can be made in metadata, with less need to physically move data around. Updates can be made faster and more iteratively. The future is uncertain: above all, you need an architecture that can change fluidly with new business opportunities.
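A small sketch may help show what "no upfront schema" and "changes made in metadata" can look like in practice. This is plain Python rather than Hadoop itself, and the sample events, the `read_with_schema` helper, and the two schema dictionaries are all assumptions made up for illustration. The raw records are stored untouched; the schema is just metadata applied when the data is read.

```python
import json

# Raw events kept exactly as they arrived, with no schema declared up front
# (hypothetical sample data).
RAW_EVENTS = [
    '{"user": "ann", "amount": "19.99", "channel": "web"}',
    '{"user": "bob", "amount": "5.00"}',
    '{"user": "cy", "amount": "12.50", "channel": "mobile", "coupon": "SPRING"}',
]

def read_with_schema(lines, schema):
    """Apply a schema at read time: select fields, cast types, fill defaults."""
    rows = []
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, (cast, default) in schema.items():
            value = record.get(field)
            row[field] = cast(value) if value is not None else default
        rows.append(row)
    return rows

# Today's view of the data: only the fields the business asks about now.
schema_v1 = {"user": (str, ""), "amount": (float, 0.0)}

# Later a new question comes up. Only the metadata changes; the stored
# records are neither rewritten nor moved.
schema_v2 = {**schema_v1, "channel": (str, "unknown")}

rows_v1 = read_with_schema(RAW_EVENTS, schema_v1)
rows_v2 = read_with_schema(RAW_EVENTS, schema_v2)
```

The trade-off is that type problems and missing fields surface at read time rather than at load time, which is one reason the choice depends on the application.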
Not just new tech in old ways. Information (not your transactional systems) is the new foundation of your future architecture. The data you are storing now you will still have in 20 years, even though your applications will long since have changed.
This means that it’s about more than using new technology in old ways, such as replacing part of enterprise data warehouses with more flexible Hadoop “data lakes” or adding faster in-memory data marts. It has to be about supporting the business needs of the future.
Enterprises are looking for an “innovation platform” with real-time analysis tightly connected to operations in order to power more personalized customer experiences and flexible business models.
Today’s complex choices. Companies want to benefit from the cost advantages of Hadoop systems, but they realize that Hadoop doesn’t yet do everything they need (for example, Gartner surveys show a steady decline in the proportion of CIOs who believe that NoSQL will replace existing data warehousing rather than augment it; the figure is now just 3%). And companies see the performance advantages of in-memory processing, but aren’t sure how it can make a difference to their business.
The new technologies confound easy classification and the boundaries continue to blur. The elephant is starting to wear a tie, with projects to introduce transaction integrity to Hadoop. Enterprise systems are providing support for things like in-database MapReduce and text analysis.
There’s no easy one-size-fits-all answer today. Different “innovation applications” require different tradeoffs, based on the types of data, the ratio of analysis to action, and the need for speed. Here are some examples of how Hadoop can fit in with enterprise architectures today.
It’s not just about technology. Organizations shouldn’t work on architecture without also thinking hard about how their business models may look in the future. And the success of Hadoop in the enterprise space depends as much on ecosystems of enterprise-savvy vendors and partners as it does on technology.
The future seamless experience. Companies are looking for the “new best practice” of how to put together an end-to-end, enterprise-strength information architecture.
Vendors are racing to support that architecture vision with a combination of new and tried-and-true technologies. The goal is to hide complexity by automating as many of the handoffs between the different data systems as possible, providing a coherent system so that companies don’t have to duct-tape everything together themselves. This will require enterprise vendors to embrace Hadoop as if it were their own technology, which may require culture changes for some.
Vendors are also working on packaged “next generation” applications that combine operations, analytics, enterprise-ready mobile interfaces, links to third-party data, and integration with industry networks.
Trust no one. The technology continues to evolve at a rapid rate. There has been lots of enterprise experimentation with Hadoop, but few organizations have yet declared Hadoop a first-tier element of their enterprise architecture. In-memory processing systems are also in their infancy, just now making the transition from fast analytics to all-around business platforms.
Best practice is being discovered as we speak, and may change fast as the technology changes. Don’t trust anybody who claims they can tell you the right way to do things unless they’ve first spent a lot of time understanding your business.
Good luck, and join us in Amsterdam for more discussion! (SAP/Hadoop presentation by John Schitka at 15h10 on Wednesday April 2nd)
Other Links
- Modern Hadoop architectures
- SAP and Hortonworks partnership
- Connecting SAP products to the Hortonworks sandbox
- SAP HANA and HADOOP
- The benefits of in-memory for startup developers