{"id":12268,"date":"2012-09-21T17:40:42","date_gmt":"2012-09-21T16:40:42","guid":{"rendered":"http:\/\/timoelliott.com\/blog\/?p=4212"},"modified":"2012-09-21T17:40:42","modified_gmt":"2012-09-21T16:40:42","slug":"dancing-with-dirty-data-thanks-to-sap-visual-intelligence","status":"publish","type":"post","link":"https:\/\/timoelliott.com\/blog\/2012\/09\/dancing-with-dirty-data-thanks-to-sap-visual-intelligence.html","title":{"rendered":"Dancing With Dirty Data Thanks to SAP Visual Intelligence"},"content":{"rendered":"<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;\" title=\"bad data fixing\" src=\"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2012\/09\/bad-data-fixing.jpg?resize=690%2C310&#038;ssl=1\" alt=\"bad data fixing\" width=\"690\" height=\"310\" border=\"0\" \/><\/p>\n<p>Here\u2019s my entry for the <a href=\"http:\/\/blogs.sap.com\/analytics\/2012\/09\/10\/the-ultimate-data-geek-challenge\" target=\"_blank\">SAP Ultimate Data Geek Challenge<\/a>, a contest designed to \u201cshow off your inner geek and let the rest of world know your data skills are second to none.\u201d There have already been <a href=\"http:\/\/scn.sap.com\/docs\/DOC-31772\" target=\"_blank\">lots of great submissions<\/a> with people using the new <a href=\"http:\/\/www12.sap.com\/solutions\/analytics\/business-intelligence\/visual-intelligence\/index.epx\" target=\"_blank\">SAP Visual Intelligence<\/a> data discovery product.<\/p>\n<p>I thought I\u2019d focus on one of the things I find most powerful: the ability to create visualizations quickly and easily even from real-life, messy data sources. 
Since it\u2019s election season in the US, I thought I\u2019d use some polling data on whether voters believe the country is \u201cheaded in the right direction.\u201d Plenty of polling data on this (and other topics) is available at <a href=\"http:\/\/www.pollingreport.com\/right.htm\" target=\"_blank\">pollingreport.com<\/a>.<\/p>\n<p>Below is the data set I grabbed. The polling date field is particularly messy: it has extra letters (e.g. RV for \u201cregistered voter\u201d), includes polls that were carried out over several days, and is not consistent (the month is not always included, and there are sometimes spaces around the middle dash, sometimes not\u2026).<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-left: 0px; padding-right: 0px; display: inline; padding-top: 0px; border: 0px;\" title=\"poll data sample\" src=\"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2012\/09\/poll-data-sample.jpg?resize=689%2C406&#038;ssl=1\" alt=\"poll data sample\" width=\"689\" height=\"406\" border=\"0\" \/><\/p>\n<p>If you paste this data into Excel, Excel automatically converts values like \u201c6\/02\u201d into the 2nd of June, further scrambling the analysis, so instead I put the data directly into a text file.<\/p>\n<p><img data-recalc-dims=\"1\" loading=\"lazy\" decoding=\"async\" style=\"background-image: none; padding-left: 0px; padding-right: 0px; display: block; float: none; margin-left: auto; margin-right: auto; padding-top: 0px; border: 0px;\" title=\"excel scramble\" src=\"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2012\/09\/excel-scramble.jpg?resize=450%2C212&#038;ssl=1\" alt=\"excel scramble\" width=\"450\" height=\"212\" border=\"0\" \/><\/p>\n<p>To show how easily you can turn the messy data into shareable analysis, I recorded a short demonstration of the steps involved:<\/p>\n<p><iframe 
loading=\"lazy\" src=\"http:\/\/www.youtube.com\/embed\/1AT34Hto-QQ\" frameborder=\"0\" width=\"690\" height=\"518\"><\/iframe><\/p>\n<p>If you\u2019d like to try the product, you can download it for a free trial at <a href=\"http:\/\/sap.com\/tryvisualintelligence\" target=\"_blank\">sap.com\/tryvisualintelligence<\/a>. The product is undergoing very rapid iteration cycles, so please give your feedback and feature requests at the SAP Community Network <a href=\"https:\/\/cw.sdn.sap.com\/cw\/community\/ideas\/businessanalytics\/sap_visual_intelligence\" target=\"_blank\">Ideas Place<\/a>.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>How to visualize real-life, messy data sets using SAP Visual Intelligence<\/p>\n","protected":false},"author":2,"featured_media":4209,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":false,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[1],"tags":[],"class_list":["post-12268","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-uncategorized"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2012\/09\/bad-data-fixing.jpg?fit=690%2C310&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p3X9RF-3bS","_links":{"self":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts\/12268","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/timoelliott.com
\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/comments?post=12268"}],"version-history":[{"count":0,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts\/12268\/revisions"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/media\/4209"}],"wp:attachment":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/media?parent=12268"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/categories?post=12268"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/tags?post=12268"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}