{"id":20771,"date":"2022-01-03T10:52:10","date_gmt":"2022-01-03T09:52:10","guid":{"rendered":"https:\/\/timoelliott.com\/blog\/?p=20771"},"modified":"2022-01-03T10:52:10","modified_gmt":"2022-01-03T09:52:10","slug":"what-is-dark-data-why-does-it-matter-and-why-are-humans-still-needed","status":"publish","type":"post","link":"https:\/\/timoelliott.com\/blog\/2022\/01\/what-is-dark-data-why-does-it-matter-and-why-are-humans-still-needed.html","title":{"rendered":"What is Dark Data, Why Does it Matter, and Why Are Humans Still Needed?"},"content":{"rendered":"<p>Back in the 1960s, a <a href=\"https:\/\/www.bell-labs.com\/about\/awards\/1978-nobel-prize-physics\/\">pair of radio astronomers<\/a> were busily collecting data on distant galaxies. They had been doing this for years. Elsewhere, other astronomers had been doing the same.<\/p>\n<p>But what set these astronomers apart \u2013 and eventually earned them a Nobel Prize \u2013 was what they eventually found in the data. Like other radio astronomers, they had long detected a consistent noise pattern. But unlike others, they persisted in trying to understand where the noise was coming from and eventually realized that it wasn\u2019t a defect in their equipment as they initially suspected. Instead, it was an echo of the Big Bang, still emitting cosmic microwaves billions of years later.<\/p>\n<p>This discovery helped prove the Big Bang theory \u2013 which, at the time, was not yet fully accepted by the scientific community. Other astronomers had collected similar data but had failed to recognize the full value of what they had observed \u2013 and today\u2019s organizations are grappling with a similar dilemma. 
Opportunities for key insights are often buried in a vast universe of dormant information known as \u201cdark data.\u201d<\/p>\n<h3>It\u2019s easy to collect information, but it\u2019s hard to turn it into insights.<\/h3>\n<p>Vast swathes of information are generated every day \u2013 everything from corporate financial figures to teenage social media videos. It\u2019s stored in corporate data warehouses, data lakes, and a myriad of other locations \u2013 and while some of it is put to good use, it\u2019s estimated that around <a href=\"https:\/\/www.inc.com\/jeff-barrett\/misusing-data-could-be-costing-your-business-heres-how.html\">73%<\/a> of this data remains unexplored.<\/p>\n<p>Just like dark matter in astrophysics, this unexplored data can\u2019t be observed directly by standard analytics tools, and so has been largely wasted.<\/p>\n<h3>So how can organizations find data in their own universes?<\/h3>\n<p>Every data point stored has potential value. But to extract it, the data typically needs to be translated into other forms, reanalyzed, and turned into action. This is where new technologies and new opportunities come into play.<\/p>\n<p>Today\u2019s data volumes have long since exceeded the capacities of straightforward human analysis, and so-called \u201cunstructured\u201d data, not stored in simple tables and columns, has required new tools and techniques. But the latest machine learning algorithms can help us detect and identify patterns in the data \u2013 once some common problems are addressed.<\/p>\n<h3>Improving data quality<\/h3>\n<p>Unexamined and unused data is often of poor quality. This can be because it\u2019s intrinsically noisy, due to inaccurate signals from cheap sensors or the linguistic ambiguities of social media sentiment analysis (\u201cit\u2019s wicked!\u201d). 
Or it can simply be because there\u2019s been little incentive to improve it.<\/p>\n<p>Today\u2019s data quality solutions, augmented by machine learning capabilities, can help sift through the noise, identify the patterns of bad data quality, and help fix the problem.<\/p>\n<h3>Data augmentation<\/h3>\n<p>New technologies make it easier than ever to bring together information from sources both inside and outside the organization. Sometimes this can provide the missing key to unlock new value from the data you already have.<\/p>\n<p>Weather radar data, for example, must be filtered to remove various sources of background noise before it can support more accurate predictions. But as we\u2019ve seen, one person\u2019s noise is another\u2019s data gold mine. It turns out that weather radar can be an invaluable source of information about bird migrations.<\/p>\n<p>Ornithologists have been able to <a href=\"https:\/\/onlinelibrary.wiley.com\/doi\/10.1111\/ibi.12906\">augment and unlock<\/a> the value of the radar information by mixing it with data stored in \u201ccitizen science repositories.\u201d These repositories, containing observations from amateur birdwatchers, provide a detailed, three-dimensional view of migrations for different bird species at little cost. With this data, ornithologists can better analyze the loss of biodiversity and the <a href=\"https:\/\/pubs.er.usgs.gov\/publication\/70046384\">effects of climate change<\/a>.<\/p>\n<p>Or take the city of Venice \u2013 which seeks to minimize the potentially damaging impact of millions of yearly visitors. With anonymized information from cell phone operators, the city has been able to analyze the <a href=\"https:\/\/www.lonelyplanet.com\/articles\/venice-is-tracking-tourists\">flows of tourists<\/a> throughout the city to better manage congestion and facilitate smarter municipal planning.<\/p>\n<p>Another example is the city of Brussels, where authorities sought to improve the lives of citizens with disabilities. 
Using a <a href=\"https:\/\/www.forbes.com\/sites\/sap\/2018\/01\/05\/how-business-driven-analytics-make-public-transportation-smarter-in-brussels\/?sh=2ab21f016809\">municipal transport database<\/a> that stored time and location data for when wheelchair ramps were used on buses, the city was able to optimize the allocation of funds to provide better access and a better experience for disabled citizens.<\/p>\n<h3>Dark variables<\/h3>\n<p>The problems of dark data are confounded by dark variables \u2013 the \u201cblack holes\u201d of the dark data universe, invisible to the naked eye, but whose gravitational pull affects other objects.<\/p>\n<p>For example: did you know that children with big feet have better handwriting? At first glance this may seem surprising \u2013 but correlation is not causation. In this case, the dark variable is \u201cage.\u201d Children with bigger feet have better handwriting because they\u2019re older. Without understanding this dark variable, one can imagine executives immediately rushing off to create a feet-stretching taskforce. But, as always, it\u2019s best to get the full picture before taking action \u2013 which is why humans are needed.<\/p>\n<h3>The human factor: shining a light into dark data<\/h3>\n<p>Untapped dark data represents opportunities to get new insights into aspects of your business that have previously been invisible. Such insights can help you increase efficiencies, spot new customer opportunities, or improve your carbon footprint.<\/p>\n<p>But doing this requires an approach based on both machines and humans.<\/p>\n<p>On the machines side of the equation, SAP and Intel have been co-innovating to help organizations move forward. SAP Business Technology Platform, for example, provides a full, cloud-native suite of solutions to integrate, improve, analyze, and act on data. 
At the core of this platform is the SAP HANA database, which runs in memory.<\/p>\n<p>\u201cIntel helps make SAP\u2019s in-memory approach viable for real-scenarios,\u201d says Jeremy Rader, General Manager, Enterprise Strategy &amp; Solutions at Intel. \u201cWith technologies that speed processing, drive performance, enable memory persistence, and support security, we\u2019re helping organizations get the most out of all their data \u2013 including dark data.\u201d<\/p>\n<p>But as powerful as SAP and Intel technologies may be, ultimately making sense of dark data takes people. Only humans can understand the context of how the data is stored, what data might be inaccurate or missing, and how it can be used to deliver greater value to customers and the business.<\/p>\n<p>The best way forward is to bring together experts on data with experts on the underlying business processes being studied. In this way, you can turn dark data into insights and help drive business improvements.<\/p>\n<h3>Learn More<\/h3>\n<p>To learn more about dark data and how businesses can realize the true value of their unstructured data, have a look at <a href=\"https:\/\/www.vox.com\/ad\/22692233\/universe-dark-data-value-opportunity-intel-sap-esa-space\">this explainer video<\/a> at Vox.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Untapped dark data represents opportunities to get new insights into aspects of your business that have previously been invisible. Such insights can help you increase efficiencies, spot new customer opportunities, or improve your carbon footprint. 
But doing this requires an approach based on both machines and humans.<\/p>\n","protected":false},"author":2,"featured_media":20744,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"jetpack_post_was_ever_published":false,"_jetpack_newsletter_access":"","_jetpack_dont_email_post_to_subs":false,"_jetpack_newsletter_tier_id":0,"_jetpack_memberships_contains_paywalled_content":false,"_jetpack_memberships_contains_paid_content":false,"footnotes":"","jetpack_publicize_message":"","jetpack_publicize_feature_enabled":true,"jetpack_social_post_already_shared":true,"jetpack_social_options":{"image_generator_settings":{"template":"highway","default_image_id":0,"font":"","enabled":false},"version":2}},"categories":[14],"tags":[],"class_list":["post-20771","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-thoughts"],"jetpack_publicize_connections":[],"jetpack_featured_media_url":"https:\/\/i0.wp.com\/timoelliott.com\/blog\/wp-content\/uploads\/2021\/10\/dark-data-2021.jpg?fit=1857%2C1383&ssl=1","jetpack_sharing_enabled":true,"jetpack_shortlink":"https:\/\/wp.me\/p3X9RF-5p1","_links":{"self":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts\/20771","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/comments?post=20771"}],"version-history":[{"count":2,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts\/20771\/revisions"}],"predecessor-version":[{"id":20774,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/posts\/20771\/revisions\/20774"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-
json\/wp\/v2\/media\/20744"}],"wp:attachment":[{"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/media?parent=20771"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/categories?post=20771"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/timoelliott.com\/blog\/wp-json\/wp\/v2\/tags?post=20771"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}