

Illuminating Dark Data and Managing Risk

01 Feb 2018


We are living in a new data-driven world and economy. The large, well-capitalised companies of today risk being replaced by smaller start-ups that are rapidly leveraging data to create new markets they can dominate, forcing the ‘old order’ to rethink or diminish. Uber is more valuable than most traditional car makers, but does not manufacture cars. Airbnb is approaching its first decade as an operating company, yet it already finds more of us holiday and business accommodation than Hilton or Hyatt, even though it owns no hotels. Netflix, Apple and Amazon are threatening the old guard in Hollywood, and Spotify and iTunes are doing the same to traditional record labels. New data-driven companies are threatening the old order!

Tony Muraki-Hart

Data is more valuable than gold or oil – how are you mining yours?

The new data world is unique, personal, live and accurate. Thanks to the falling cost of storing and transporting data via the Cloud, it can be traded and leveraged cheaply. Another fundamental factor in the value of data is that, unlike gold or oil, it will never run out. In this sense, it becomes increasingly valuable as organisations begin to exploit it.

The new wave of data-driven services argues that Big Data will make our lives better: it will help predict our everyday needs, and those of companies, and serve up new products or services using artificial intelligence (AI).

What is dark data, and how do you understand the risk level?

Data is your competitive currency. In a business context, ‘dark’ data describes the data elements that are undiscovered, hidden or undigested. An estimated 90% of all data in existence today was generated in the past 5 years. According to IDC and Gartner, the digital universe we operate in today is expected to double in size every 12 months.

The new world entrants like Uber and Airbnb are enabling (or forcing) CIOs, business leaders and data scientists to use technology to define new ways of looking at their data. The quest is to unearth valuable business, customer and operational insights – across all industry sectors, not just consumer environments.

Traditionally, when firms think of data and analytics, they focus on structured data that exists in systems, servers and databases. Dark data analytics seeks to remove those boundaries by casting a much wider net and capturing a large volume of currently untapped signals.

3 dimensions you should focus on to illuminate your dark data world:

  1. Untapped data already in your possession – create connections between disparate data sets, whether they are structured or unstructured. For example, emails, notes, messages, documents, logs and sensor notifications;
  2. Non-traditional unstructured data – audio, video and images cannot be mined using traditional reporting and analytics techniques. However, applying the latest analytics pattern recognition to audio and video feeds in real time is opening profound new opportunities. This could have a significant impact in all sectors, whether you’re a retailer, oil and gas company, hospital, or an amusement park;
  3. Extracting data in the deep web – the deep web (including the dark web, which Jamie Bartlett has discussed in his recent books and television shows) may contain the largest body of untapped information. That is, data curated by academics, consortia, government agencies, communities, and other third-party domains, as well as the activities, both illegal and legitimate, that exist on the web.

Indeed, dark data analytics efforts that are “surgically” precise in both intent and scope often deliver the greatest value. Like every analytics journey, successful efforts begin with a series of specific questions. What problem are we solving? What would we do differently if we could solve that problem? Finally, what data sources and analytics capabilities will help us answer the first two questions?

Data security should never be forgotten either. This is where GDPR begins to play a key role in governing the control of, and authority over, data sharing.
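To make the first of the three dimensions above more concrete, the following is a minimal Python sketch of connecting unstructured text to structured records. Everything in it is an assumption made for illustration: the account table, the `AC-1234` identifier format, the sample notes and the `link_notes_to_accounts` helper are all hypothetical, and a real project would need far more robust matching.

```python
import re
from collections import defaultdict

# Hypothetical structured records, e.g. rows exported from a CRM table.
accounts = {
    "AC-1001": {"name": "Northside Clinic", "segment": "Health"},
    "AC-1002": {"name": "Harbour Retail", "segment": "Retail"},
}

# Hypothetical unstructured text: free-form notes, emails or log lines.
notes = [
    "Call with AC-1001: complained about slow reporting dashboards.",
    "Sensor log: gateway restart at 02:14, site linked to AC-1002.",
    "Email thread re AC-1001 renewal, wants forecasting features.",
    "Internal memo with no account reference.",
]

# Assumed identifier format: "AC-" followed by four digits.
ACCOUNT_ID = re.compile(r"\bAC-\d{4}\b")


def link_notes_to_accounts(accounts, notes):
    """Group free-text notes under the structured account they mention."""
    linked = defaultdict(list)
    unmatched = []
    for note in notes:
        # Keep only identifiers that exist in the structured data set.
        ids = [i for i in ACCOUNT_ID.findall(note) if i in accounts]
        if not ids:
            unmatched.append(note)
        for account_id in ids:
            linked[account_id].append(note)
    return dict(linked), unmatched


linked, unmatched = link_notes_to_accounts(accounts, notes)
for account_id, items in linked.items():
    print(accounts[account_id]["name"], "->", len(items), "linked notes")
print("Notes that remain unlinked:", len(unmatched))
```

Notes that cannot be linked stay ‘dark’ in this sketch. Counting them is a simple way to gauge how much untapped data is left.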

Understanding membership trends using an Azure Data Lake to mine Twitter

One of OCSL’s Public Sector Health clients was experiencing a decline in membership numbers. They were struggling to understand why. Was it because of NHS funding, salary reductions, increased competition from overseas staff, unsatisfactory working environments, Brexit, or something else?

The client had noticed comments on Twitter that they felt might provide some insight into why numbers were falling, but they were unsure how to leverage this information. OCSL’s Enablement Framework identified a way to use the Data Lake capabilities within Azure to connect to Twitter. It then extracted, indexed, mined and analysed any comments related to the health sector, hospitals and staff. This eliminated the need for lengthy infrastructure investment in on-premises capabilities. Sentiment analysis was applied to the Twitter comments, providing real-world insight into the decline in membership. This approach enabled the Health organisation to develop future strategies to maintain and grow its membership numbers.
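As a rough illustration of the sentiment analysis step, here is a minimal Python sketch that scores comments against small word lists. It is not the Azure Data Lake pipeline described above, nor OCSL’s implementation: the word lists, the sample comments and the `score_comment` and `label` helpers are all invented for the example, and a production system would normally use a trained model or a published sentiment lexicon.

```python
import re
from collections import Counter

# Hypothetical word lists, chosen only for this example.
POSITIVE = {"proud", "great", "support", "love", "rewarding"}
NEGATIVE = {"underpaid", "exhausted", "leaving", "understaffed", "stress"}


def score_comment(text):
    """Return positive-word count minus negative-word count for one comment."""
    words = re.findall(r"[a-z']+", text.lower())
    positives = sum(w in POSITIVE for w in words)
    negatives = sum(w in NEGATIVE for w in words)
    return positives - negatives


def label(score):
    """Map a numeric score to a coarse sentiment label."""
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


# Invented sample comments standing in for collected tweets.
comments = [
    "Exhausted and underpaid after another understaffed shift",
    "Proud of my ward team, great support this week",
    "Thinking of leaving the profession, too much stress",
    "Conference starts on Monday",
]

# Tally how many comments fall under each label.
summary = Counter(label(score_comment(c)) for c in comments)
print(summary)
```

Aggregating the labels over time, or by topic, is what turns individual comments into a trend that can be set against membership figures.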

Dark data analytics: where to start?

Organisations should act now to tackle their data challenges. A delayed response will leave them with immeasurable volumes of new structured and unstructured data to manage, coming from endless data sources, both known and unknown. Acting now to develop the discipline and tools you’ll need to manage and mine all this dark data will help your organisation today, while preparing it for future opportunities.

It is important to be practical and not overcomplicate your approach:

  • Ask the right questions of the business – identify the data sources that are vital to your organisation;
  • Look beyond your 4 walls – understand how to augment your data with publicly available data (see our Twitter/Azure example above). This will help your analytics teams develop targeted, more expansive and detailed reporting;
  • Enrich your data resource pool – identify those who can artfully combine deep modelling and statistical techniques with your company’s industry or function-specific insights. Aim to create teams that can effectively balance programming with data science;
  • Visualise, look beyond the core data – infographics and/or data dashboards, such as those available in Microsoft’s Power BI, will help many in your organisation fully understand the ‘so what’ and the ‘why’ of complex analytical insights, so that practical actions can be taken;
  • This is a business-driven effort – analytics is more than an IT function, it should be core to your business strategy. Not only will it showcase the true value of your IT service capabilities, but it will help gain funding for ever-more sophisticated techniques of understanding your data landscape. Ultimately, it helps drive accountability for your underlying assets. It’s critical to understand how to harness the available data to generate the required answers;
  • Expand your vision – as you develop new capabilities and approaches, consider how these can be extended across your organisation, both internally, and externally with customers and partners. This defined data strategy will become part of your reference architecture and target operating model.

Grow your capability to learn from your data

Core business value, competitive insight and differentiation can be gained by those who take advantage of data.

Data aggregation, enrichment, analysis and storage will remain high on the data agenda. But OCSL expects the focus to shift slightly to include illuminating the data elements, currently hidden in non-traditional and ‘dark’ data sources, that will promote powerful strategic, customer and operational insights.

Organisations need to focus on the ‘what’, ‘why’ and ‘how’ questions that will deliver measurable value back into the business. Answering these questions will focus dark data analytics on the business areas that matter, unlock hidden potential, drive competitive insight and differentiation, and help you avoid drowning in your sea of data.

OCSL believes that everyone should have the capability to learn from data, whether you’re a data scientist, healthcare professional or car salesperson. We can help organisations benefit from a flexible cloud-based infrastructure. Combined with the right applications, this can convert vast amounts of internal and external data into predictive and analytical toolsets, making the concepts of advanced analytics, machine learning and artificial intelligence simpler and more achievable.

The following topics and approaches will be discussed in future data-themed articles:

  • Driving a 360° view of data that delivers business value and insight
  • Risk analytics and managing compliance confidently
  • Determining the value of Robotics and Automation in the enterprise
  • Toolsets and procedures to make exploration into your data landscape simple
  • How to test to ensure you’re prepared for any non-compliance or data breaches
  • How the OCSL Enablement Framework can help you achieve the data panacea in your organisation

In the meantime, if you have any questions related to Dark Data Analytics, or your Data Strategy in general, please get in touch.


About the author

Tony Muraki-Hart

Enterprise Architect



Get in touch

0845 605 2100
