Statista forecast that the global big data and business analytics market would grow to 215.7 billion U.S. dollars by 2021, and players such as Databricks, Fivetran, Cribl, and dbt Labs raised millions in 2022.
Chris Gladwin, CEO and co-founder of Ocient, tells Digital Journal about his predictions for 2023. These centre on the data world growing ever larger and more complex, and on using AI to fix the issues that arise from that rapid growth.
#1 – Hyperscale Will Become Mainstream
Gladwin predicts that data-intensive businesses are moving beyond big data into the realm of hyperscale data, which is exponentially greater and requires a re-evaluation of data infrastructure.
According to Gladwin: “In 2023, data warehouse vendors will develop new ways to build and expand systems and services.”
He adds: “It’s not just the overall volume of data that technologists must plan for, but also the burgeoning data sets and workloads to be processed. Some leading-edge IT organizations are now working with data sets that comprise billions and trillions of records. In 2023, we could even see data sets of a quadrillion rows in data-intensive industries such as adtech, telecommunications, and geospatial.”
Hyperscale data sets will become more common as organizations leverage increasing data volumes in near real-time from operations, customers, and on-the-move devices and objects.
#2 – Data Complexity Will Increase
The nature of data is changing. There are both more data types and more complex data types, and the lines between structured and semi-structured data continue to blur.
Gladwin notes: “The software and platforms used to manage and analyze data are evolving. A new class of purpose-built databases specialize in different data types—graphs, vectors, spatial, documents, lists, video, and many others.”
He predicts: “Next-generation cloud data warehouses must be versatile—able to support multimodal data natively, to ensure performance and flexibility in the workloads they handle. The Ocient Hyperscale Data Warehouse, for example, supports arrays, tuples, matrixes, lines, polygons, geospatial data, IP addresses, and large variable-length character fields, or VARCHARs.”
Furthermore, Gladwin assesses: “The need to analyze new and more complex data types, including semi-structured data, will gain strength in the years ahead, driven by digital transformation and global business requirements. For example, a telecommunications network operator may look to analyze network metadata for visibility into the health of its switches and routers. Or an ocean shipping company may want to run geospatial analysis for logistics and route optimization.”
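The shipping example above hinges on basic geospatial computation. As a minimal illustration (not Ocient's implementation), the great-circle distance between two ports can be computed with the standard haversine formula; the port coordinates below are approximate and chosen purely for the sketch:

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points."""
    r = 6371.0  # mean Earth radius in km
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

# Approximate coordinates for Rotterdam and Singapore (illustrative only)
rotterdam = (51.95, 4.14)
singapore = (1.26, 103.84)
print(f"{haversine_km(*rotterdam, *singapore):.0f} km")  # roughly 10,500 km
```

A warehouse with native geospatial types can evaluate this kind of calculation across billions of position records as part of a query, rather than exporting data to a separate tool.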
#3 – Data Analysis Will Be Continuous
Data warehouses are becoming “always on” analytics environments. In the years ahead, the flow of data into and out of data warehouses will be not only faster, but continuous.
Here Gladwin finds: “Technology strategists have long sought to utilize real-time data for business decision-making, but architectural and system limitations have made that a challenge, if not impossible. Also, consumption-based pricing could make continuous data cost prohibitive.”
In terms of change, Gladwin puts forward: “Increasingly, however, data warehouses and other infrastructure are offering new ways to stream data for real-time applications and use cases.”
In addition, Gladwin predicts: “Popular examples of real-time data in action include stock-ticker feeds, ATM transactions, and interactive games. Now, emerging use cases such as IoT sensor networks, robotic automation, and self-driving vehicles are generating ever more real-time data, which needs to be monitored, analyzed, and utilized.”
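The sensor-network use case above can be sketched in miniature. The following is an assumed illustration (the sensor values and window size are hypothetical) of analyzing readings continuously as they arrive, rather than waiting for a batch load:

```python
import random
from collections import deque

def sensor_readings(n):
    """Simulate a hypothetical IoT temperature feed arriving one value at a time."""
    for _ in range(n):
        yield 20.0 + random.gauss(0, 1.5)

def rolling_mean(stream, window=10):
    """Analyze each reading on arrival, keeping only a bounded window in memory."""
    buf = deque(maxlen=window)
    for reading in stream:
        buf.append(reading)
        yield sum(buf) / len(buf)

for avg in rolling_mean(sensor_readings(50)):
    pass  # in a real system each value would feed a dashboard or alert rule
print(f"latest rolling mean: {avg:.2f}")
```

The generator pattern mirrors the streaming model: nothing is accumulated into a full data set first, and memory use stays constant no matter how long the feed runs.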
#4 – Pipelines Will Get More Sophisticated
A data pipeline is how data gets from its original source into the data warehouse. With so many new data types—and data pouring in continuously—these pipelines are becoming not only more essential, but potentially more complex.
Gladwin predicts: “In 2023, users should expect data warehouse vendors to offer new and better ways to extract, transform, load, model, test, and deploy data. And vendors will do so with a focus on integration and ease of use.”
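The extract, transform, and load steps Gladwin lists can be sketched as a minimal pipeline. This is an assumed illustration, not any vendor's API: the record layout, field names, and in-memory "warehouse" table are all hypothetical:

```python
def extract(source):
    """Pull raw records from the source system."""
    return list(source)

def transform(records):
    """Normalize field names and drop records that fail validation."""
    out = []
    for r in records:
        if r.get("user_id") is None:
            continue  # reject malformed rows before they reach the warehouse
        out.append({"user_id": int(r["user_id"]), "event": r.get("event", "").lower()})
    return out

def load(rows, warehouse):
    """Append cleaned rows to the warehouse table; return the count loaded."""
    warehouse.extend(rows)
    return len(rows)

raw = [{"user_id": "42", "event": "CLICK"}, {"event": "view"}]  # hypothetical feed
warehouse = []
loaded = load(transform(extract(raw)), warehouse)
print(loaded, warehouse)  # 1 [{'user_id': 42, 'event': 'click'}]
```

Real pipelines add the modeling, testing, and deployment stages Gladwin mentions on top of this skeleton, which is where vendor integration and ease of use matter.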
#5 – The Economics Will Change
In relation to economics, Gladwin finds: “Sheer performance will always be a critical differentiator, but some customers do not want to pay a premium for split-second speed alone. They want superior data warehouse performance at a sustainable price.”
From this, he makes the prediction: “In 2023, it will be imperative for data warehouse providers to not only drive improvements in all aspects of performance—data ingress, indexing, I/O, transformation, and query speed—but in cost of ownership too.”
Following this, Gladwin says: “The consumption-based cloud model has been a boon in some respects because businesses can start new initiatives without an upfront investment, paying only for the compute and storage resources they use. But costs escalate as adoption grows. To avoid budget overruns, CIOs are sometimes forced to put a ‘cap’ on access to the data warehouse. Ironically, the success of the data warehouse then becomes a limitation.”
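The escalation Gladwin describes is simple arithmetic: under consumption pricing, cost scales linearly with usage, so a successful rollout multiplies the bill. The rates and volumes below are entirely hypothetical, chosen only to make the shape of the problem concrete:

```python
# Illustrative arithmetic only: hypothetical rates, not any vendor's pricing.
rate_per_tb_scanned = 5.00       # assumed $/TB scanned per query
queries_per_day = 2_000          # assumed query volume after initial rollout
tb_scanned_per_query = 0.25      # assumed average scan size

daily_cost = rate_per_tb_scanned * queries_per_day * tb_scanned_per_query
monthly_cost = daily_cost * 30
print(f"${monthly_cost:,.0f} per month")            # $75,000 per month

# If adoption doubles query volume, the bill doubles with it:
print(f"${monthly_cost * 2:,.0f} at 2x adoption")   # $150,000 at 2x adoption
```

That linear scaling is exactly why a popular warehouse can be "capped": the only levers under pure consumption pricing are usage itself or a different pricing model.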
Moreover: “In 2023, business and technology leaders will look to data warehouse providers to help them solve this dilemma. Potential solutions include flexible licensing terms and multi-cloud deployment options, including on premises. But there’s more that data warehouse vendors can and will do to lower the cost of ownership. For example, adding pre-integrated capabilities such as ETL and ML within the data warehouse adds speed, efficiency, and cost savings to data preparation.”
Gladwin closes with: “Additional savings can come through advances in system architecture that exploit industry-standard hardware, translating into lower costs in areas such as data ingress, compression, networking, I/O, and queries.”