The phrase “dark data” might sound ominous, but it’s anything but. In fact, dark data is one of the most promising frontiers in data analysis.
What is dark data? Simply put, it’s data that’s stored but isn’t being used for anything. In an era in which big data and data analytics are creating new business insights, new efficiencies, and new opportunities to serve customers and make money, dark data is tantamount to an unopened safe.
For example, think of all the scanned-in medical charts from before electronic health records that are sitting unexamined on servers. They contain huge amounts of data that, if analyzed, could advance our understanding of diseases and the effectiveness of treatments, as well as the efficiencies and inefficiencies of medical practice. Or think of all the environmental reports buried as PDFs on government and nonprofit websites. They, too, contain data that, if analyzed with modern methods, could yield useful information for businesses and environmental groups alike.
The problem is that most dark data is unstructured. That means it isn’t organized into the defined fields, rows, and columns that analysis tools expect. The data contained in scanned-in medical charts cannot be easily extracted. Nor can the data scattered throughout an environmental report saved as a PDF, deep inside a website. E-mails, word processing documents, photographs, video, audio, and most of the data transmitted by the Internet of Things are also difficult to pull apart and manipulate.
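To make the difference between unstructured and structured data concrete, here is a minimal sketch in Python. It assumes a made-up free-text note, standing in for the text of a scanned chart after it has been converted to characters, and pulls a few values into named fields. The note, the field names, and the patterns are all hypothetical and chosen only for illustration; this is not a real method for processing medical records.

```python
import re

# Hypothetical free-text note, standing in for the text of a scanned chart.
# The wording and values are invented for illustration.
NOTE = "Patient seen 2009-03-14. BP 142/91, weight 81 kg. Reports mild headache."


def extract_fields(note: str) -> dict:
    """Pull a few structured fields out of free text with regular expressions."""
    record = {}

    # A date written as YYYY-MM-DD.
    date = re.search(r"\b(\d{4}-\d{2}-\d{2})\b", note)
    if date:
        record["visit_date"] = date.group(1)

    # Blood pressure written as "BP systolic/diastolic".
    bp = re.search(r"\bBP\s+(\d{2,3})/(\d{2,3})\b", note)
    if bp:
        record["systolic"] = int(bp.group(1))
        record["diastolic"] = int(bp.group(2))

    # Weight written as "weight <number> kg".
    weight = re.search(r"\bweight\s+(\d+(?:\.\d+)?)\s*kg\b", note, re.IGNORECASE)
    if weight:
        record["weight_kg"] = float(weight.group(1))

    return record


record = extract_fields(NOTE)
print(record)
```

Once the values sit in named fields, they can be counted, averaged, and compared across thousands of records. The sketch works only because the sample note happens to follow a tidy pattern. Real dark data rarely does, which is a large part of why it stays dark.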
The solution is to develop methods of sorting through unstructured data and to find data analysts capable of identifying which kinds of dark data can yield the most insights. If you’re intrigued by the opportunities dark data presents, you can learn more in our article “5 Things Every IT Manager Should Know About Dark Data.” You can also position yourself to become a dark data pioneer or expert with our MS in Information Technology program, particularly if you want to specialize in Big Data Analytics.