Hasso Plattner, SAP’s co-founder, chairman, and “Chief Software Advisor”, has been giving a series of talks, including his keynote at SAPPHIRE 2009, on why, in his view, “disk has become yesterday’s tape” and why column storage and in-memory techniques are the future for both data warehousing AND enterprise applications, displacing the 20-year reign of relational databases.
Experts have advocated column databases such as Sybase IQ and LucidDB for data warehousing for many years.
But now, based on research done at his institute (and a lifetime of trying to get the best performance possible for enterprise applications), Plattner is advocating basing both OLAP and OLTP systems on column storage in order to eliminate the need for the cumbersome ETL process and to reduce system complexity and the number of database tables.
More details are available in his white paper “A Common Database Approach for OLAP and OLTP Using an In-Memory Column Database” and a related presentation. The research was conducted using SAP’s TREX technology and real customer data (to see what the technology can do in a business intelligence context, check out SAP BusinessObjects Explorer).
Given that the vast majority of relational databases are used for OLTP transactions today, this would represent a radical change for the enterprise application market.
And although data warehousing will still need to exist, because of master-data management, data synchronization, data quality, etc., it would certainly be a lot simpler, and a lot faster, since much of it could be done directly on the “transactional” data source.
While nobody has promised that SAP will develop the concept as a product, Plattner has presented his views widely to SAP employees, and leaves no doubt that he thinks this is the way ahead.
Here’s the last slide of Hasso’s presentation at the recent SIGMOD 2009 conference for database researchers – this is certainly something that we’ve been looking forward to for a while…
Comments
7 responses to “The End of Relational Databases?”
Timo,
If this is the end of relational databases, then how would the BI star and snowflake schemas work? I’m currently working on a project to transform numerous tables held in column storage into a more relational model, because Cognos had a hard time converting the column storage into a star schema. That’s why we had to convert them.
Here’s my question – with a column storage DB, will OLAP and OLTP become one and the same or will a conversion process be needed to transform the OLTP to OLAP?
Thanks,
Abhishek
Chris,
The school of thought of not doing updates is gaining momentum. A version of this school is called “Data Vault”, and is the brainchild of Dan Linstedt. You will notice that Data Vault works very well with column stores such as Teradata.
Many of the column store databases are relational databases. It is a storage model we are talking about here. There are operations that are best handled by row-at-a-time behaviors and there are operations that are best handled by column-oriented systems. Is either superior as a general model? I doubt it.
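To make the storage-model point concrete, here is a minimal Python sketch of the same logical table held in a row layout and in a column layout. The table, its field names, and the helper functions are all made up for illustration; real engines add compression, indexing, and much more. It only shows that the two layouts answer the same questions while touching different amounts of data.

```python
# Illustrative only: one logical table, two physical layouts.
# The table contents and all function names are hypothetical.

rows = [
    {"id": 1, "customer": "A", "amount": 10.0},
    {"id": 2, "customer": "B", "amount": 25.5},
    {"id": 3, "customer": "A", "amount": 4.5},
]


def to_columns(row_store):
    """Pivot a list of row dicts into a dict of column lists."""
    column_store = {}
    for row in row_store:
        for name, value in row.items():
            column_store.setdefault(name, []).append(value)
    return column_store


columns = to_columns(rows)


def total_amount_row_store(row_store):
    # Visits every row record, although only one field is needed.
    return sum(row["amount"] for row in row_store)


def total_amount_column_store(column_store):
    # Scans a single contiguous column and ignores the others.
    return sum(column_store["amount"])


def fetch_row_row_store(row_store, position):
    # One lookup returns the whole record.
    return row_store[position]


def fetch_row_column_store(column_store, position):
    # The record has to be reassembled from every column.
    return {name: values[position] for name, values in column_store.items()}
```

The aggregate favors the column layout and the single-record fetch favors the row layout, while the logical table and the query results are identical in both cases.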
Most of the column store models (and I make this statement as a bit of a guess) fall into the WORM (Write Once Read Many) category. It is devilishly hard to do updates in systems like Vertica. There is a school of thought that we shouldn’t be doing updates – that each attempted update creates a new representation of the underlying resource. For example, the change of state of a passenger reservation doesn’t update the current reservation, it creates a new resource called the “checked-in reservation.” Should we implement systems that way? It depends what we are after. For example, if we want to do analysis of how many times check-in was performed, it might be handy to create instances of checked-in reservation and count them. I am not advocating this – and actually that state transition is probably wrong – as anyone who has tried to check in twice on some airlines may have discovered.
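The “no updates” idea can be sketched in a few lines of Python. This is a hypothetical, minimal insert-only log, not how Vertica or any other product works: the class name, the state names, and the reservation IDs are invented for the example. Every state change appends a new version instead of overwriting the old one.

```python
# Hypothetical insert-only store: a state change never overwrites anything,
# it appends a new version of the reservation.


class ReservationLog:
    def __init__(self):
        # Append-only list of (reservation_id, state) versions.
        self._versions = []

    def record(self, reservation_id, state):
        """Append a new version; existing versions are never modified."""
        self._versions.append((reservation_id, state))

    def current_state(self, reservation_id):
        """Return the most recently appended state, or None if unknown."""
        for rid, state in reversed(self._versions):
            if rid == reservation_id:
                return state
        return None

    def count_state(self, state):
        """Count how many versions were ever recorded with this state."""
        return sum(1 for _, s in self._versions if s == state)


log = ReservationLog()
log.record("R1", "booked")
log.record("R2", "booked")
log.record("R1", "checked_in")
log.record("R1", "checked_in")  # a second check-in attempt stays visible
```

Counting check-ins is then a simple scan (`log.count_state("checked_in")`), and the duplicate check-in for R1 is preserved in the history rather than silently overwritten, which is exactly the kind of case the paragraph above warns about.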
I don’t think that the whole relational model is dead. I do see opportunities, however, for different storage models and a move away from the shared storage/slow disk models that we have lived with for the last 30+ years.
To put it in IBM parlance, “System R is dead, long live relational databases”
Note: System R was IBM’s research prototype for relational databases in the 1970s and early 80s. It formed the basic model from which many other relational databases evolved.
Every data store technology, be it row, column or domain oriented, has its place. Storage availability (memory, disk, SSD) has some impact on this, but usage has too. So long as DBMSs cannot change their internal representations independently from their logical representation (and very few can), vendors are barking up the wrong tree IMO.
Agree with M. Evers, the RDBMS is far from doomed. However, it is an old philosophy which is due a competing idea, and column store DBs certainly have their place – they will replace a portion of the RDBMS market, but they will not eliminate it.
However his final slide is the same target the BI industry has been trying to hit for 20 years. Why a new database storage method will suddenly solve this escapes me!
I agree relational databases are not doomed (it’s rare to find a technology that becomes completely obsolete), but we shouldn’t just blithely ignore what has changed, either. I’m guessing that Codd and Date might well have agreed with the analysis. What changed is the ability to use memory instead of disk (and moving to 64-bit operating systems makes the easily addressable memory 1000x larger), fundamentally changing the (inevitable and previously identified) tradeoffs.
Codd will turn in his grave for this nonsense!
The relational model is not dead, nor are relational databases. Direct Image databases (row store databases) are not optimal, but Codd and Date knew this a long time ago!!