The 2Q Digitalist Magazine Executive Quarterly is now out! It includes an article, “Data Lakes: Insight from the Deep,” that SAP big data expert John Schitka and I helped write. It covers how organizations are using data lakes to store large amounts of information and find deep correlation patterns.
The Climate Corporation uses a data lake to collect massive amounts of agricultural data and applies machine-learning techniques to help farmers optimize their planting.
Predictive maintenance for trains
SAP customer CSX is a transportation company based in Florida. Previously, raw data from wheel-bearing sensors could be kept for only 10 days because of volume restrictions. Using a data lake based on SAP HANA and Hadoop, the company can now keep that data as long as it likes and look for deeper correlations with information from other sensors.
Combatting insider trading
The Financial Industry Regulatory Authority (FINRA) regulates broker behavior in the United States. It uses a data lake and algorithms to find patterns of fraud that human analysts might miss.
How not to drown in the data lake!
The article also includes best practices for implementing data lakes in your organization. For a start, don’t expect a data lake to replace data warehousing any time soon:
“There have been so many millions of dollars going to data warehousing over the last two decades. The idea that you’re just going to move it all into a data lake isn’t going to happen” — Mike Ferguson, managing director of Intelligent Business Strategies, a UK analyst firm.
Yes, SAP does Hadoop, Spark & more!
You’ve probably heard a lot about SAP HANA’s big data capabilities, but that’s only one part of a bigger data lake strategy. That strategy also includes SAP Vora, which provides business analysis and more natively on Spark, and an elastic cloud platform using Hadoop, Spark, and related technologies, based on last year’s acquisition of Altiscale.