Why In-Memory Analytics is Like Digital Photography: An Industry Transformation
When was the last time you had a roll of film developed?
If you’re reading this, you’re no technology luddite, so it was probably many years ago. Film – a technology that had been slowly improved and perfected for well over a century – was replaced in less than a decade by a much faster, cheaper, and more convenient technology.
In an earlier post, I outlined why I think we’re on the brink of a real revolution in business analytics infrastructures. This post draws out some of the parallels with the shift from analog to digital image processing, and the effects it had on the industry as a whole.
Film Cameras Compared to Today’s Data Warehouses
Data warehousing today has a lot in common with the first photos I took fifteen years ago with my then-brand-new Canon Elan II camera:
Buying film and getting pictures processed got very expensive very quickly. Only a tiny minority of professionals could take as many shots as they wanted – the rest of us had to be choosy about our shots.
Today’s data warehousing is complex and expensive, and organizations have to ruthlessly prioritize which projects will be undertaken. Meanwhile, other worthy projects have to wait, and business users have to rely on gut feel.
Upfront Planning Required
In order to get the best results, you had to know in advance what kind of pictures you were going to take. Want to take a landscape shot? You needed 100 ISO and daylight film. An action shot in a hockey rink? 1600 ISO and tungsten lighting. A colorful tropical scene? High-saturation Fujifilm Velvia. A moody night shot of Paris? High-grain Kodak black-and-white xyz…. And the film only came in rolls of 24 or 36 pictures. You just changed your mind about what you wanted to take pictures of? Tough.
Today’s data warehouses require you to decide in advance what data you’ll want to access later – and if you change your mind, you have to go back to the source data and reload it with different transformations.
Slow Feedback Loop
What you saw through the eyepiece was very different from the resulting picture. By the time you got the film back and realized that the exposure was wrong, or the framing a little off, there wasn’t much you could do about it. The slow feedback loop meant that lots of good pictures were missed, and that you had to take more pictures than you needed. Want a great shot of a high-contrast night scene? You needed to take multiple long exposures (with the associated high costs) in order to have a hope of getting what you wanted.
By the time business people have had a chance to access the data, reports, and dashboards in your business intelligence solutions, you’ve already put in a huge amount of effort – and the business has moved on, and now has different needs. To avoid this, you try to add all the data you might need into the data warehouse, but this increases the project cost and complexity.
With enough training, skilled photographers understood all the variables involved in getting the right picture, and were able to get more consistent, reliable results. But even they had to rely on specialized film processors to actually get the results.
Because of all the factors mentioned above, data warehousing today requires lots of arcane knowledge to be successful, and the people who know how to do it are in high demand and short supply.
Digital Cameras Compared to New Analytics Platforms Powered by In-Memory and Other Technologies*
I purchased one of the first consumer digital cameras, the Sony DSC-F1. It transformed the way I took pictures, just as in-memory computing is transforming analytics.
Cheap, Easy, Iterative Learning and Experimentation
The screen shows you exactly the shot that you’re going to take, and digital cameras don’t require film or processing, so the marginal cost of another picture is effectively zero. You can try things, and if they don’t work out, you can instantly adjust your approach and try again. Things like f-stops still make a difference to the end result, but you can learn how they work in an intuitive way, rather than having to wade through mathematics. Previously essential accessories like light meters become unnecessary – you can get the effect you need through trial-and-error in the same time it took you to work out the light readings. We are all better photographers now – because we can experiment, and throw away 90% of our pictures without worrying about it.
Once you have the row-level data in-memory, you can easily change your analytic view on the fly. You want to use a different attribute, rolled-up in a different way? Rather than having to reload and retransform the data, you can simply make a change to the metadata, and the users get the results they want. You can quickly and iteratively prototype your business intelligence solutions, rather than having to try to rigorously plan everything in advance. Faster, more convenient analytics results in better analytics.
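To make that concrete, here is a minimal sketch in Python of what "changing the metadata instead of reloading the data" might look like. Everything in it is an assumption for illustration: the sample rows, the `roll_up` function, and the shape of the view definitions are hypothetical, and a real in-memory platform works at a very different scale and with its own interfaces. The point is only that the row-level data is loaded once, and each new analytic view is just a small description of how to group and aggregate it.

```python
from collections import defaultdict

# Hypothetical row-level data, loaded into memory once and never reloaded.
ROWS = [
    {"region": "EMEA", "product": "Camera", "quarter": "Q1", "revenue": 120.0},
    {"region": "EMEA", "product": "Lens", "quarter": "Q1", "revenue": 80.0},
    {"region": "APAC", "product": "Camera", "quarter": "Q2", "revenue": 200.0},
    {"region": "APAC", "product": "Lens", "quarter": "Q2", "revenue": 50.0},
]

# The aggregation names a view definition is allowed to use.
AGGREGATIONS = {
    "sum": sum,
    "max": max,
    "count": len,
    "avg": lambda values: sum(values) / len(values),
}


def roll_up(rows, view):
    """Aggregate the in-memory rows as described by a view (metadata only)."""
    groups = defaultdict(list)
    for row in rows:
        # Build the grouping key from whichever attributes the view names.
        key = tuple(row[attr] for attr in view["group_by"])
        groups[key].append(row[view["measure"]])
    aggregate = AGGREGATIONS[view["aggregation"]]
    return {key: aggregate(values) for key, values in groups.items()}


# Two different analytic views over the same rows: only the metadata differs.
by_region = {"group_by": ["region"], "measure": "revenue", "aggregation": "sum"}
by_product = {"group_by": ["product"], "measure": "revenue", "aggregation": "avg"}

print(roll_up(ROWS, by_region))
print(roll_up(ROWS, by_product))
```

Switching from total revenue by region to average revenue by product means editing a three-entry dictionary, not going back to the source systems. In a pre-aggregated warehouse, the same change could mean reloading and retransforming the data.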
The combination of photography and computing has transformed what you can do with photos. Here are just some of the things that were almost unimaginable with analog film:
- Different camera angles. You used to have to view everything through the eyepiece. Replacing it with a screen made it easy to take pictures at previously-awkward angles (on the ground, or to take pictures over the heads of a crowd).
- Taking a picture of something that happened in the past. This seems like science fiction, but several digital cameras use a buffer to store the last few seconds of whatever you’re looking at, so even if you pressed the shutter too late when your daughter scored the winning goal, you can back up to the right shot.
- High dynamic range. The human eye sees more shades than any current camera. The latest cameras automatically take several shots and combine them to create a full range of shades, and you can choose the exposure afterwards.
- Selective focus. Today, some celebrity photographers only take pictures fully in-focus, and rely on Photoshop to introduce focus wherever they need it. And a brand-new camera promises to use some cool multi-lens technology and digital processing to let the rest of us do this very easily.
We’re just at the very start of what we’ll be able to do with in-memory systems. For example, it seems that in-memory column stores are well-adapted to extending enterprise analytics to unstructured data, real-time data, social data, and so on: all things we’ve been struggling with using traditional data warehouse approaches.
Integration With Other Systems
My main “camera” is now my iPhone, where the camera has become a “feature” in a larger system. The image quality of the iPhone camera isn’t great, but combined with its convenience, the flexibility of the apps, HD video, and the ability to instantly share the results, it’s replaced my larger Canon Digital EOS camera for most things.
In-memory analytics isn’t just about analytics – these technologies will be integrated directly into operational systems. There will be “one version of the truth”, because we’re doing everything with one set of data…
Morals of the story
- Digital photography transformed an industry by eliminating obsolete layers. In-memory analytics and related technologies* will do the same.
- The change from analog to digital photography didn’t happen overnight. Early digital cameras were relatively expensive compared to film, film still made more sense for some kinds of pictures, and in particular it took time for the new digital cameras to rival the effective picture resolution of larger-format films. “Old-style” data warehousing won’t vanish overnight, but it will inevitably be relegated to particular types of tasks as in-memory analytics becomes more robust and takes on larger volumes of data.
- Today, many BI projects end up in failure, just like most of your old photographs. In-memory analytics will improve the quality and success of analytics projects.
- Some people jumped on the early limitations of digital cameras and insisted that the answer was to tweak the existing methods (buy scanners, etc.) – which missed the bigger picture. Today, some people try to insist that in-memory is “just a memory cache”, and that incremental technologies like Flash Disk / SSDs are the answer. Don’t be in denial.
- If your job relies on your existing data warehousing skills, better get used to the new world, or move to another role…
For more about SAP HANA and in-memory technology, visit http://experiencesaphana.com. And if you are interested in photography, you can find my photos here: http://blog.timoelliott.com (daily updates) and http://timoelliott.com/personal
*Updated from original post: although I say “in-memory”, it’s really about a collection of various technologies including in-memory, massively parallel hardware, column stores, and in-database analytics – please see the link at the start of the post for more details.