{"id":12366,"date":"2013-11-28T19:02:48","date_gmt":"2013-11-28T18:02:48","guid":{"rendered":"http:\/\/timoelliott.com\/blog\/?p=5886"},"modified":"2013-11-28T19:02:48","modified_gmt":"2013-11-28T18:02:48","slug":"why-enterprises-should-be-more-interested-in-hadoop","status":"publish","type":"post","link":"https:\/\/timoelliott.com\/blog\/2013\/11\/why-enterprises-should-be-more-interested-in-hadoop.html","title":{"rendered":"Why Enterprises Should Be More Interested in Hadoop"},"content":{"rendered":"<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;\" title=\"hanadoop-example\" alt=\"hanadoop-example\" src=\"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2013\/11\/hanadoop-example.jpg?resize=690%2C310&#038;ssl=1\" width=\"690\" height=\"310\" border=\"0\" \/><\/p>\n<p><em>An example use case of Hadoop and in-memory systems extracted from the <a href=\"http:\/\/www.saphana.com\/docs\/DOC-3777\" target=\"_blank\">CIO guide to using Hadoop and SAP systems<\/a>.\u00a0<\/em><\/p>\n<h3>It\u2019s Time For Two Worlds To Come Together<\/h3>\n<p>Earlier this year, I attended the <a href=\"http:\/\/hortonworks.com\/blog\/hadoop-summit-2013-amsterdam-its-a-wrap\/\" target=\"_blank\">Hadoop Summit in Europe<\/a>, sponsored by Hortonworks. There were many excellent presentations at the conference, but the divide between \u201cold\u201d and \u201cnew\u201d analytics was very clear. There were relatively few \u201ctraditional\u201d companies presenting sessions, and those that were seemed faintly embarrassed to mention that they still had data warehouses.<\/p>\n<p>The Hadoop use cases discussed were mainly new, standalone systems rather than integrations with more traditional systems or analytic architectures. 
There were two notable exceptions.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" style=\"background-image: none; margin: 0px 5px 0px 0px; padding-left: 0px; padding-right: 0px; display: inline; float: left; padding-top: 0px; border: 0px;\" title=\"alasdair_anderson\" alt=\"alasdair_anderson\" src=\"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2013\/11\/alasdair_anderson.jpg?resize=150%2C191&#038;ssl=1\" width=\"150\" height=\"191\" align=\"left\" border=\"0\" \/>The first was by <a href=\"http:\/\/hadoopsummit.org\/amsterdam-blog\/meet-the-presenters-alasdair-anderson-of-hsbc\/\" target=\"_blank\">Alasdair Anderson<\/a>, Global Head of Architecture for HSBC Global Banking and Markets, who presented on the theme of \u201cEnterprise Integration of Disruptive Technologies.\u201d<\/p>\n<p>The bank needed a single data platform that could provide 360-degree views of clients, operations and products. To provide this, the team had been struggling with a complex, \u201cbrittle\u201d architecture based on over 150 source systems, 900 ETL jobs, 3 data warehouses, and 15 data marts.<\/p>\n<p>The resulting system was expensive, and too slow to meet the business needs: it took months or years to make changes. The team concluded that they needed a different way of doing things, one that would support more agile, parallel streams of development, without being disruptive.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;\" title=\"hsbc\" alt=\"hsbc\" src=\"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2013\/11\/hsbc.jpg?resize=690%2C393&#038;ssl=1\" width=\"690\" height=\"393\" border=\"0\" \/><\/p>\n<p>HSBC decided to try using Hadoop, with the work done in Guangzhou, China. 
The project was a big success:<\/p>\n<ul>\n<li>Hadoop was installed and operational in a single week<\/li>\n<li>The 18 RDBMS data warehouses and marts were ported to Hadoop in 4 weeks<\/li>\n<li>The time it took to run an existing batch job dropped from 3 hours to 10 minutes<\/li>\n<li>New data sources could be included, such as information about financial derivatives stored in .pdf format<\/li>\n<\/ul>\n<p>At the same time, however, Anderson explained how his analytics needs were maybe a little different from more traditional data warehousing. The focus of the project was fast-moving, \u201cagile information\u201d typically requiring several different iterations of analysis &#8212; and he explained that other parts of the business, such as the retail banking division, did not have the same \u201cfunky needs.\u201d<\/p>\n<p>He admitted that HSBC &#8220;genuinely doesn&#8217;t know yet how the new architecture will combine with existing data warehouses in other regions.&#8221;<\/p>\n<p>The second notable presentation was by Deutsche Telekom\u2019s J\u00fcrgen Urbanski on how to determine the right technical solutions for different types of enterprise data usage. He presented an overall view of different architectures and suggested questions that should be asked of the business in order to determine which technology was the best fit.<\/p>\n<p>Hadoop was generally positioned as the \u201cbetter\u201d choice, although some of the comparisons with in-memory systems already seemed out of date (e.g. 
see slide below), and there was little discussion of how to integrate transaction systems (other than as simple data sources).<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;\" title=\"deutsche telekom data storage\" alt=\"deutsche telekom data storage\" src=\"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2013\/11\/deutsche-telekom-data-storage.jpg?resize=690%2C514&#038;ssl=1\" width=\"690\" height=\"514\" border=\"0\" \/><\/p>\n<p><span style=\"font-size: 1.17em;\">Integrating Hadoop With Existing Systems<\/span><\/p>\n<p>The unspoken assumption of many of the delegates seemed to be that it was just a question of time before Hadoop gained the extra features that would enable it to take over all enterprise needs. Some seemed almost proud to ignore existing enterprise data architectures and any best practice learned over the previous decades (information governance springs to mind \u2013 this is still a very new concept for many organizations using Hadoop).<\/p>\n<p>Many users of enterprise systems, on the other hand, seemed to have decided that Hadoop only applies to web companies, or is restricted to refining semi-structured data before putting it into a &#8220;normal&#8221; data warehouse.<\/p>\n<p>I believe that Hadoop is an incredible opportunity for most enterprises, both large and small. 
But I also believe that the big changes in enterprise architecture driven by in-memory systems, and the need for analytics close to transactions, mean that the ultimate best practice architecture will be one based on a combination of existing approaches, not Hadoop alone.<\/p>\n<p>With this in mind, <img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px; margin-left: 5px; margin-right: 5px;\" title=\"hadoop-logo\" alt=\"hadoop-logo\" src=\"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2013\/11\/hadoop-logo.jpg?resize=300%2C71&#038;ssl=1\" width=\"300\" height=\"71\" align=\"right\" border=\"0\" \/>SAP has <a href=\"http:\/\/www.sap.com\/news-reader\/index.epx?articleID=21527\" target=\"_blank\">teamed up<\/a> with major Hadoop providers to combine\u00a0the speed of in-memory computing with the storage power and flexibility of Hadoop.<\/p>\n<p>SAP will redistribute and support the <a href=\"http:\/\/hadoop.intel.com\/\" target=\"_blank\">Intel Distribution for Apache Hadoop<\/a> and the <a href=\"http:\/\/hortonworks.com\/products\/hdp\">Hortonworks Data Platform<\/a> &#8212; InformationWeek journalist <a href=\"http:\/\/www.informationweek.com\/big-data\/authors\/Doug-Henschen\" target=\"_blank\">Doug Henschen<\/a> explains the background to the deals in his article <a href=\"http:\/\/www.informationweek.com\/big-data\/commentary\/software\/information-management\/sap-expands-big-data-push\/240161134\" target=\"_blank\">SAP Expands Big Data Push<\/a>.<\/p>\n<p>As part of the series of SAP \u201cBig Data\u201d discussions, Hortonworks CTO Ari Zilka explained why he felt that <a href=\"https:\/\/timoelliott.com\/blog\/2013\/11\/sap-hortonworks-hangout-big-data-meets-enterprise-systems.html\" target=\"_blank\">combining Hadoop and enterprise systems was the best of both worlds<\/a>. And for more information 
about how real-life organizations are using Hadoop, in fields as diverse as <a href=\"http:\/\/www.insideworldfootball.com\/world-football\/europe\/13610-sap-signs-bierhoff-and-signals-intention-to-drive-deeper-into-football\" target=\"_blank\">football<\/a> and <a href=\"http:\/\/www.forbes.com\/sites\/sap\/2013\/11\/14\/the-white-house-honors-sap-stanford-and-nct\/\" target=\"_blank\">genetics<\/a>, visit the <a href=\"http:\/\/www.sapbigdata.com\/\" target=\"_blank\">SAP Big Data<\/a> website.<\/p>\n<p>For more detailed technical information about how Hadoop can be integrated with traditional information architectures, check out the <a href=\"http:\/\/www.saphana.com\/docs\/DOC-3777\" target=\"_blank\">CIO Guide on Big Data: How to Use Hadoop With Your SAP Software Landscape<\/a>.<\/p>\n<h3>Hadoop Summit Europe 2014<\/h3>\n<p>I\u2019m looking forward to next year\u2019s <a href=\"http:\/\/hadoopsummit.org\/amsterdam\/\" target=\"_blank\">Hadoop Summit Europe in April<\/a>, and thoroughly recommend you attend. It\u2019s a great venue and a great crowd. And please vote for my presentation on \u201c<a href=\"https:\/\/hadoopsummit.uservoice.com\/forums\/196821-hadoop-for-business-applications-and-development\/suggestions\/5059444-real-life-examples-of-using-hadoop-to-drive-busine\" target=\"_blank\">Real-Life Examples of Using Hadoop to Drive Business Innovation<\/a>\u201d!<\/p>\n<p>&nbsp;<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Enterprises should be more interested in Hadoop &#8212; and the Hadoop community should be more interested in helping them integrate Hadoop into their existing architectures. 
<\/p>\n","protected":false},"author":2,"featured_media":5881,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2},"jetpack_post_was_ever_published":false},"categories":[14],"tags":[27,55,93,173,431,556,557,558,560,576,625,911],"class_list":["post-12366","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-thoughts","tag-bigdata","tag-55","tag-amsterdam","tag-big-data","tag-enterprise","tag-hadoop","tag-hadoop-summit","tag-hadoopsummit","tag-hana","tag-hortonworks","tag-intel","tag-sap"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2013\/11\/hanadoop-example-1.jpg?fit=690%2C310&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p3X9RF-3ds","_links":{"self":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts\/12366","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/comments?post=12366"}],"version-history":[{"count":0,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts\/12366\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/timo
elliott.com\/blog\/wp-json\/wp\/v2\/media\/5881"}],"wp:attachment":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/media?parent=12366"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/categories?post=12366"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/tags?post=12366"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}