{"id":12392,"date":"2014-02-05T14:54:38","date_gmt":"2014-02-05T13:54:38","guid":{"rendered":"http:\/\/timoelliott.com\/blog\/?p=6372"},"modified":"2014-02-05T14:54:38","modified_gmt":"2014-02-05T13:54:38","slug":"etl-and-information-governance-discussions-from-dslayer","status":"publish","type":"post","link":"https:\/\/timoelliott.com\/blog\/2014\/02\/etl-and-information-governance-discussions-from-dslayer.html","title":{"rendered":"ETL and Information Governance Discussions From DSLayer"},"content":{"rendered":"<p><a href=\"https:\/\/dslayer.net\/\" target=\"_blank\"><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border-width: 0px;\" title=\"dslayer\" alt=\"dslayer\" src=\"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2014\/02\/dslayer.jpg?resize=640%2C162&#038;ssl=1\" width=\"640\" height=\"162\" border=\"0\" \/><\/a><\/p>\n<p>If you\u2019re interested in SAP, BusinessObjects, and everything to do with Analytics, I encourage you to check out the <a href=\"https:\/\/dslayer.net\/\" target=\"_blank\">Diversified Semantic Layer<\/a> site (tagline: \u201cunprofessional journalism at its finest\u201d!). It\u2019s run by a bunch of analytics veterans from around the English-speaking globe, featuring quirky (i.e. sometimes completely off-topic) discussions of the latest and greatest (and somewhat geeky) events in the analytics world.<\/p>\n<p>I subscribe to the interviews as an <a href=\"https:\/\/dslayer.net\/feed\/podcast\/\" target=\"_blank\">audio podcast<\/a> and listen to them on my morning run or evening shopping errands (full disclosure: they bought my goodwill with a DSLayered Bowling Shirt). 
I often find myself, somewhat inappropriately, arguing with them out loud and startling passers-by.<\/p>\n<p>In the event you are passionate enough about analytics to hear even more about it in your spare time, here are a couple of recent shows that I found interesting:<\/p>\n<h3>Is ETL Dead?<\/h3>\n<p>The DS Layer crew <a href=\"https:\/\/twitter.com\/ericvallo\" target=\"_blank\">Eric Vallo<\/a>, <a href=\"https:\/\/twitter.com\/gpmyers\" target=\"_blank\">Greg Myers<\/a>, <a href=\"https:\/\/twitter.com\/oswaldxxl\" target=\"_blank\">Jamie Oswald<\/a>, <a href=\"https:\/\/twitter.com\/vosloo777\" target=\"_blank\">Clint Vosloo<\/a>, and <a href=\"https:\/\/twitter.com\/josh_fletcher\" target=\"_blank\">Josh Fletcher<\/a> (<a href=\"http:\/\/twitter.com\/dallasmarks\" target=\"_blank\">Dallas Marks<\/a> couldn\u2019t make it) discussed the topic of \u201cdoesn\u2019t there seem to be less need for ETL these days?\u201d Sadly, Clint Vosloo had the most to say \u2013 and the worst bandwidth, so his points kept getting cut off.<\/p>\n<p>Topics covered:<\/p>\n<ul>\n<li>ETL is by far the most arduous, long, costly process involved in business intelligence.<\/li>\n<li>With new technologies like Hadoop and in-memory there\u2019s less need to move data around.<\/li>\n<li>But you\u2019ll never find a company that has just one system, so some data movement is always required.<\/li>\n<li>ETL is not \u201cdying\u201d \u2013 but instead \u201cevolving\u201d [or transforming?]<\/li>\n<li><a href=\"http:\/\/www.sap.com\/pc\/tech\/enterprise-information-management\/software\/data-services\/index.html\" target=\"_blank\">SAP Data Services<\/a> is very flexible and fast \u2013 any delays are because of the source or target system.<\/li>\n<li>In the future, the \u201ctransform\u201d can be pushed into in-memory systems like HANA. 
Faster, but you could still end up with a very complex, unmanageable HANA view that only one person understands.<\/li>\n<li>Real change will come when apps are built from the ground up to optimize HANA use (no data copying needed \u2013 e.g. if you have a brand new ERP on HANA, maybe you don\u2019t need BW or a data warehouse, and the associated data movements, at least initially)<\/li>\n<li>The problem is always the quality of the source system. If that\u2019s fine, then replication may be enough.<\/li>\n<li>SAP seems to be the only vendor pitching in-memory apps for both operations and analytics (as opposed to in-memory for analytics only)<\/li>\n<li>One thing is clearly going to change: the notion of a once-a-day ETL load. Loads can now run more continuously.<\/li>\n<li>Federation (e.g. <a href=\"http:\/\/www.saphana.com\/community\/blogs\/blog\/2013\/07\/22\/smart-data-access-data-virtualization-with-sap-hana\" target=\"_blank\">smart data access in HANA<\/a>) is a great option for flexibility \u2013 but you\u2019ll want to make a physical copy sometimes.<\/li>\n<li>Conclusion: ETL isn\u2019t dead \u2013 but it\u2019s going to change a lot, especially for SAP customers.<\/li>\n<\/ul>\n<p><iframe loading=\"lazy\" src=\"\/\/www.youtube.com\/embed\/hlQprOMACjk?feature=player_embedded\" height=\"360\" width=\"640\" allowfullscreen=\"allowfullscreen\" frameborder=\"0\"><\/iframe><\/p>\n<h3>The Need for Information Governance<\/h3>\n<p>This show featured \u201cI see Data Quality People\u201d <a href=\"https:\/\/twitter.com\/vosloo777\" target=\"_blank\">Clint Vosloo<\/a> and \u201cI\u2019m all about the Quality\u201d <a href=\"https:\/\/twitter.com\/josh_fletcher\" target=\"_blank\">Josh Fletcher<\/a> following up the ETL show with a discussion of data quality and information governance. 
Warning: you will need a high tolerance for kids-in-the-background noise.<\/p>\n<p>Topics covered:<\/p>\n<ul>\n<li>Data quality is a business issue \u2013 IT can only help show the problem<\/li>\n<li>Funding is often the biggest issue in a data quality project. Using fear can be a useful tactic \u2013 reminding people that the financial results are based on dubious figures, or that figures reported to government may be incorrect. If there are doubts about a figure, copying the CFO on the discussion may help (but may lose you some friends)<\/li>\n<li>Using a tool like SAP Explorer to quickly show incorrect or null values \u2013 and how much money is attached to them \u2013 can be powerful.<\/li>\n<li>Business people tend to be oblivious to data quality issues, because they only see a small subset of the data, unlike IT.<\/li>\n<li>Humans create bad data. One way to get better data is to improve incentives: pay people for the quality of data, not just the quantity.<\/li>\n<li>A real information governance strategy takes experts<\/li>\n<li>Organizations like retailers are realizing that the ability to cross-sell requires better quality information<\/li>\n<li>SAP <a href=\"http:\/\/www54.sap.com\/solutions\/tech\/enterprise-information-management\/software\/data-integrity-steward\/index.html\" target=\"_blank\">Information Steward<\/a> makes it easy to track data quality metrics over time and identify areas where there are big business benefits, which makes it much easier to get the momentum to fix the problem.<\/li>\n<\/ul>\n<p><iframe loading=\"lazy\" src=\"\/\/www.youtube.com\/embed\/xS9XWABYZZw?feature=player_embedded\" height=\"360\" width=\"640\" allowfullscreen=\"allowfullscreen\" frameborder=\"0\"><\/iframe><\/p>\n<p>I hope you enjoy the shows. You can follow <a href=\"https:\/\/twitter.com\/dslayered\" target=\"_blank\">DSLayer on Twitter<\/a> and ask questions using the <a href=\"https:\/\/twitter.com\/search?q=%23askdslayer&amp;src=typd\" 
target=\"_blank\">#AskDSL hashtag<\/a>\u2026<\/p>\n","protected":false},"excerpt":{"rendered":"<p>If you\u2019re interested in SAP, BusinessObjects, and everything to do with Analytics, I encourage you to check out the Diversified Semantic Layer site. Here are notes on a couple of recent shows. <\/p>\n","protected":false},"author":2,"featured_media":6371,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[14],"tags":[100,253,317,342,347,404,409,447,455,537,550,560,614,648,665],"class_list":["post-12392","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-thoughts","tag-analytics","tag-clint-vosloo","tag-dallas-marks","tag-data-quality","tag-data-services","tag-dq","tag-dslayer","tag-eric-vallo","tag-etl","tag-governance","tag-greg-myers","tag-hana","tag-information-steward","tag-jamie-oswald","tag-josh-fletcher"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2014\/02\/dslayer-1.jpg?fit=640%2C162&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p3X9RF-3dS","_links":{"self":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts\/12392","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"aut
hor":[{"embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/comments?post=12392"}],"version-history":[{"count":0,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts\/12392\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/media\/6371"}],"wp:attachment":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/media?parent=12392"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/categories?post=12392"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/tags?post=12392"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}