In a 2005 article entitled “What is Web 2.0?”, Tim O’Reilly first used the term “Big Data” to refer to data sets so large that they are almost impossible to manage and process using traditional business intelligence tools. Use of the term peaked between 2013 and 2015, and it even found a place in the Oxford English Dictionary in 2013. Gartner tracked big data as an emerging technology for many years, but removed it from that list in 2015, arguing that it had become part of many other emerging technologies such as machine learning and advanced analytics.
Given this context, what’s on the horizon for big data in 2020? When examining big data trends, we see new developments that will lead to substantial improvements in capabilities, speed, understanding and cost optimization.
What are these big data trends for the future and how will they impact businesses and governments? Let’s take a look below:
- IoT data analytics will make strides
Over the past few years, the adoption of Internet of Things (IoT) technology has been growing in many sectors such as manufacturing, agriculture, utilities, healthcare, and governance. Connected devices are gathering large volumes of data, usually stored in data lakes. This is definitely big data, constantly growing in volume, variety, and velocity.
IoT sensors gather heterogeneous data from many different sources such as video feeds, geolocations, log files, and equipment readings. The real value of IoT systems will be felt when decisions can be made based on the insights and trends that the data reveals. So far, the unstructured nature of the data and its sheer volume have made it quite difficult to process and analyze.
Going forward, it is clear that there will be significant developments in our ability to analyze IoT data in all three areas — storage, stream processing, and analytics platforms.
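One of the core building blocks of IoT stream processing is windowed aggregation: raw readings arrive continuously, and the platform reduces them into per-window statistics per sensor. The sketch below is a minimal, self-contained illustration of that idea; the sensor names and readings are hypothetical, and a production system would use a streaming engine rather than an in-memory dictionary.

```python
from collections import defaultdict

def tumbling_window_averages(readings, window_seconds=60):
    """Group (timestamp, sensor_id, value) readings into fixed-size
    windows and compute the mean value per sensor per window."""
    windows = defaultdict(list)
    for ts, sensor_id, value in readings:
        window_start = ts - (ts % window_seconds)  # align to window boundary
        windows[(window_start, sensor_id)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in windows.items()}

# Hypothetical equipment readings: (unix timestamp, sensor id, temperature)
readings = [
    (0, "pump-1", 20.0), (30, "pump-1", 22.0),
    (61, "pump-1", 25.0), (10, "valve-7", 5.0),
]
print(tumbling_window_averages(readings))
# {(0, 'pump-1'): 21.0, (60, 'pump-1'): 25.0, (0, 'valve-7'): 5.0}
```

The same pattern scales up in stream processors, where windows are computed incrementally instead of after the fact.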
- Digital twins will make more powerful analytics possible
A digital twin is a digital representation of a physical object or system. Digital twins have been created for buildings, factories, and even cities. This idea originated at NASA, where full-scale mockups of space capsules were used to model and diagnose possible problems in orbit.
The digital twin receives data about the actual object from sensors and simulates the object in real-time. A digital twin based on a prototype created before the actual product is made can be used to refine the design of the final product.
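At its simplest, a digital twin is an object that mirrors the physical asset's state from incoming sensor readings and checks it against the design envelope. The sketch below illustrates that loop under stated assumptions: the `PumpTwin` class, its temperature limit, and the readings are all hypothetical, and a real twin would run a physics-based simulation rather than simply copying values.

```python
class PumpTwin:
    """Minimal sketch of a digital twin: mirrors a physical pump's state
    from sensor readings and flags values outside the design envelope."""
    def __init__(self, max_temp_c=80.0):
        self.max_temp_c = max_temp_c   # assumed design limit
        self.temp_c = None
        self.alerts = []

    def ingest(self, reading):
        # A real twin would update a physics model; here we just mirror state.
        self.temp_c = reading["temp_c"]
        if self.temp_c > self.max_temp_c:
            self.alerts.append(f"overheat: {self.temp_c}°C")

twin = PumpTwin()
for r in [{"temp_c": 65.0}, {"temp_c": 91.5}]:
    twin.ingest(r)
print(twin.alerts)  # ['overheat: 91.5°C']
```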
Digital twins enable predictive analytics and what-if scenarios, helping to optimize systems for maximum efficiency. Digital twins of large, complex systems, such as ships or factories, generate large volumes of data for big data analytics. One big data trend that we will see is the merging of artificial intelligence, big data analytics, IoT, and predictive analytics in order to make the best use of the data generated by digital twins.
Digital twins are already being used in different sectors, including oil and gas, healthcare, Formula 1 racing, smart city management, and agriculture.
- Dark data will be mined for insights
While many organizations invest in technology to capture data, an estimated 60–73 percent of enterprise data goes unused. Organizations may not even be aware of all the data they are collecting. With IoT sensors gathering data every moment, we can safely assume that the share of data that is not being effectively harnessed will grow even further.
Such unutilized data is commonly referred to as “dark data.” Gartner defines it as “the information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes.”
“Dark data” may be collected and stored purely for compliance purposes, or possibly generated by systems or sensors without a plan in place for further processing and analysis. The non-utilization or underutilization of data is a cost and opportunity loss, as it costs money to store large volumes of data and the potential insights gained from that data are lost.
A few examples of dark data that could reveal valuable business insights are website log files, which help us to understand visitor behavior; call center audio recordings, which indicate consumer sentiment; and mobile geolocation data that reveal traffic patterns.
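The first of those examples is easy to make concrete: website access logs sit unused on most servers, yet a few lines of code can turn them into visitor-behavior counts. The sketch below is a toy illustration with hypothetical log lines; a real pipeline would handle many request methods, malformed lines, and far larger volumes.

```python
import re
from collections import Counter

# Hypothetical access-log lines in a common log format
LOG_LINES = [
    '203.0.113.5 - - [10/Oct/2019:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '203.0.113.9 - - [10/Oct/2019:13:56:01 +0000] "GET /pricing HTTP/1.1" 200 2326',
    '198.51.100.2 - - [10/Oct/2019:13:57:12 +0000] "GET /docs HTTP/1.1" 200 512',
]

REQUEST = re.compile(r'"GET (\S+) HTTP')

def page_visits(lines):
    """Count visits per page from raw access-log lines."""
    return Counter(m.group(1) for line in lines if (m := REQUEST.search(line)))

print(page_visits(LOG_LINES))  # Counter({'/pricing': 2, '/docs': 1})
```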
We will see organizations become more conscious of the business value of data and take specific action for dark data analytics. The availability of more powerful hardware and software systems will enable the processing of big data which is still “dark.” Experts estimate that enterprises that keep pace with this big data trend and effectively mine dark data will achieve substantial productivity gains compared to their peers.
- Cold storage on the cloud will drive further cost optimization
Cold storage refers to methods of storing data that is no longer in active use and does not need to be accessed frequently. Organizations retain such data either to meet legal requirements or because they believe they may need to access it in the future. Traditionally, low-cost media such as tape were used for cold storage, until cloud storage providers also started offering cold storage tiers.
More recent offerings, such as Google’s cold storage services, provide highly economical storage with millisecond access latency. This is a big data trend that will enable organizations to further optimize data storage costs and divert their investments to more effective analytics and insights.
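The cost argument behind tiering is simple arithmetic. The sketch below illustrates it with placeholder per-GB prices; these figures are purely hypothetical and do not reflect any provider's actual rates.

```python
# Illustrative monthly cost of tiering rarely accessed data.
# Prices are HYPOTHETICAL placeholders, not any provider's actual rates.
PRICE_PER_GB = {"standard": 0.020, "cold": 0.004}

def monthly_cost(gb_by_tier):
    """Sum storage cost across tiers for a given data layout (GB per tier)."""
    return sum(PRICE_PER_GB[tier] * gb for tier, gb in gb_by_tier.items())

all_hot = monthly_cost({"standard": 50_000})                  # everything hot
tiered = monthly_cost({"standard": 10_000, "cold": 40_000})   # 80% moved cold
print(round(all_hot, 2), round(tiered, 2))  # 1000.0 360.0
```

Under these assumed prices, moving the rarely accessed 80% of the data to a cold tier cuts the monthly bill by almost two-thirds.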
- Edge computing will enhance the speed of large systems
Edge computing is a new way of handling computation and storage for systems with large geographical spreads, such as self-driving cars, CCTV cameras, or transportation monitoring systems.
Edge computing decentralizes the storage and processing of data, bringing it closer to the devices where data is being collected. The objective is to ensure that data, especially data required in real time, does not suffer latency issues that hamper the system’s performance. Edge computing also brings cost savings: processing is done locally, reducing the amount of data that needs to be processed in a centralized location.
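A common form of that local processing is summarization: the edge node reduces a batch of raw samples to a compact digest and ships only the digest upstream. The sketch below is a minimal illustration; the vibration readings are hypothetical, and real edge workloads also include filtering, inference, and local alerting.

```python
def summarize_at_edge(raw_readings):
    """Reduce a batch of raw sensor readings to a compact summary so only
    the summary, not every sample, travels to the central data platform."""
    return {
        "count": len(raw_readings),
        "min": min(raw_readings),
        "max": max(raw_readings),
        "mean": round(sum(raw_readings) / len(raw_readings), 2),
    }

raw = [12.1, 12.4, 12.0, 35.9, 12.2]   # e.g. one second of vibration samples
summary = summarize_at_edge(raw)
print(summary)  # {'count': 5, 'min': 12.0, 'max': 35.9, 'mean': 16.92}
```

Five raw samples become one four-field record; at millions of samples per day, that reduction is what keeps bandwidth and central processing costs down.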
Edge computing is a big data trend that will enhance the speed of large, complex and geographically dispersed systems. Edge computing has powerful applications in smart grid management, remote monitoring of oil and gas operations, and traffic management, to name a few.
- Augmented analytics will improve data management
Data scientists are highly sought-after professionals. Yet most spend about 80% of their time on the collection and preparation of data and only 20% on discovering insights. Augmented analytics aims to change this by automating data collection and data preparation, freeing up much of that time for analysis.
Augmented analytics uses statistical and linguistic technologies to improve data management performance, from data analysis to data sharing and business intelligence.
Data analytics software with augmented analytics uses machine learning and NLP to understand and interpret data. This is a big data trend that will improve data management and make analytics more effective.
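One small piece of that automated preparation can be sketched directly: inferring which columns are numeric and imputing their missing values. The example below is a toy version under stated assumptions; the `auto_prepare` function and the revenue records are hypothetical, and real augmented analytics tools apply far richer profiling and ML-driven cleaning.

```python
def auto_prepare(rows):
    """Toy automated data preparation: infer which columns are numeric,
    then fill missing numeric values with the column mean."""
    columns = rows[0].keys()
    prepared = [dict(row) for row in rows]   # work on a copy
    for col in columns:
        values = [r[col] for r in prepared if r[col] is not None]
        if all(isinstance(v, (int, float)) for v in values):
            mean = sum(values) / len(values)
            for r in prepared:
                if r[col] is None:
                    r[col] = mean            # impute the gap
    return prepared

rows = [
    {"region": "north", "revenue": 120.0},
    {"region": "south", "revenue": None},    # gap to be imputed
    {"region": "east", "revenue": 80.0},
]
print(auto_prepare(rows)[1]["revenue"])  # 100.0
```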
- Graph analytics will further our understanding of relationships within data
With the ever-increasing volume of data generated and the desire to understand more and more complex phenomena, traditional analytical methods cannot meet the big data requirements of the future. Graph analytics is a new approach that helps understand connections between data points and identify clusters of related data points based on influence, frequency of interaction, and probability.
Graph analytics helps us explore the relationships between entities, such as companies or people. Entities and relationships are mapped as graphs consisting of nodes, edges, and properties.
Graph analytics helps find patterns among the relationships between nodes that would be extremely cumbersome to uncover with traditional analytical methods. Some of the applications where graph analytics is currently being used are social media network analysis, fraud detection, and system load optimization.
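The node-and-edge model above can be sketched in a few lines: build an adjacency list from relationship edges, find clusters as connected components, and use a node's degree as a crude influence measure. The entities and edges below are hypothetical, and production graph analytics would use a dedicated graph engine and richer centrality measures.

```python
from collections import defaultdict

def build_graph(edges):
    """Store relationships as an adjacency list: node -> set of neighbours."""
    graph = defaultdict(set)
    for a, b in edges:
        graph[a].add(b)
        graph[b].add(a)
    return graph

def connected_components(graph):
    """Find clusters of related nodes via depth-first traversal."""
    seen, components = set(), []
    for start in graph:
        if start in seen:
            continue
        stack, component = [start], set()
        while stack:
            node = stack.pop()
            if node not in seen:
                seen.add(node)
                component.add(node)
                stack.extend(graph[node] - seen)
        components.append(component)
    return components

# Hypothetical "who transacts with whom" edges, e.g. for fraud detection
edges = [("alice", "bob"), ("bob", "carol"), ("dave", "erin")]
graph = build_graph(edges)
print([sorted(c) for c in connected_components(graph)])
# [['alice', 'bob', 'carol'], ['dave', 'erin']]
print(max(graph, key=lambda n: len(graph[n])))  # 'bob' has the most connections
```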
The seven big data trends discussed above give us a view of a future in which data analytics will have capabilities, applications, and benefits far greater than anything we have experienced before.