
How well are firms navigating the AI infrastructure market?

The survey explores organizations' current AI workloads and how their ambitious plans for the future signal a need for highly performant, cost-effective ways to optimize GPU utilization.

Despite fears of the dangers of artificial intelligence, investors are focusing on the potential rewards of the technology. — © AFP Kirill KUDRYAVTSEV

As companies navigate the AI infrastructure market, many will be seeking clarity: peer insights and reviews, as well as industry benchmarks on suitable platforms.

What are executives’ biggest pain points in moving AI to production? A new survey commissioned by the firm ClearML and conducted by the AI Infrastructure Alliance has examined not only model training, but also model serving and inference.

Commenting on the survey, Noam Harel, ClearML’s CMO and GM, North America, says: “Our research shows that while most organizations are planning to expand their AI infrastructure, they can’t afford to move too fast in deploying Generative AI at scale at the cost of not prioritizing the right use cases.”

Harel adds: “We also explore the myriad challenges organizations face in their current AI workloads and how their ambitious plans for the future signal a need for highly performant, cost-effective ways to optimize GPU utilization (or find alternatives to GPUs), and harness seamless, end-to-end AI/ML platforms to drive effective, self-serve compute orchestration and scheduling with maximum utilization.”

The report finds that 96 percent of respondents plan to expand their AI compute infrastructure, with availability, cost, and infrastructure challenges weighing on their minds. Of these, 40 percent are considering more on-premises capacity and 60 percent more cloud, and both groups are looking for flexibility and speed. The top concern for cloud compute is wastage from idle costs.

In addition, 95 percent of executives report that having and using open source technology is important for their organization. A similarly high proportion, 96 percent, indicate that they are focused on customizing open source models, with PyTorch as their framework of choice.
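To make that concrete, here is a minimal, hypothetical sketch of the kind of customization respondents describe: fine-tuning an open source BERT checkpoint in PyTorch by freezing the pretrained encoder and training a new task head. The model name, task, and hyperparameters are illustrative assumptions, not details from the survey.

```python
# Illustrative sketch of customizing an open source model with PyTorch.
# The checkpoint, task head, and learning rate are assumptions for the example.
import torch
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2
)

# Freeze the pretrained encoder and train only the new classification head,
# a common low-cost way to adapt an open source model to a custom task.
for param in model.bert.parameters():
    param.requires_grad = False

optimizer = torch.optim.AdamW(
    (p for p in model.parameters() if p.requires_grad), lr=2e-5
)

batch = tokenizer(["example input"], return_tensors="pt")
labels = torch.tensor([1])

model.train()
outputs = model(**batch, labels=labels)  # the loss is computed internally
outputs.loss.backward()
optimizer.step()
```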

On the less optimistic side, 74 percent of companies are dissatisfied with their current job scheduling and orchestration tools, facing constraints on on-demand allocation of compute resources and on team productivity.

The same proportion of respondents see value in having compute and scheduling functionality as part of a single, unified AI/ML platform (instead of cobbling together an AI infrastructure tech stack of stand-alone point solutions), yet only 19 percent of respondents actually have a scheduling tool that supports the ability to view and manage jobs within queues and effectively optimize GPU utilization.
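For illustration only, here is a toy sketch of the queue visibility most tools reportedly lack: an in-memory scheduler in which queued GPU jobs can be listed in dispatch order and reprioritized. Every name here is hypothetical and does not reflect any vendor's API.

```python
# Toy job queue illustrating "view and manage jobs within queues".
import heapq
from dataclasses import dataclass, field

@dataclass(order=True)
class Job:
    priority: int                       # lower value dispatches first
    name: str = field(compare=False)
    gpus: int = field(compare=False)

queue = []
heapq.heappush(queue, Job(2, "train-bert", gpus=4))
heapq.heappush(queue, Job(1, "nightly-eval", gpus=1))

# "View jobs within queues": inspect what is waiting, in dispatch order.
for job in sorted(queue):
    print(f"{job.name}: priority={job.priority}, gpus={job.gpus}")

# Dispatch the highest-priority job when GPUs free up.
next_job = heapq.heappop(queue)
print("dispatching", next_job.name)
```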

Furthermore, 93 percent of surveyed executives believe that AI team productivity would substantially increase if compute resources could be self-served.

In terms of the greatest challenges, optimizing GPU utilization and GPU partitioning are the major concerns, with the majority of GPUs reported as being underutilized during peak times.
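Measuring that underutilization is itself straightforward. As a hedged sketch, NVIDIA's NVML Python bindings (the nvidia-ml-py package) can poll per-GPU utilization; the sampling loop below is an assumption for the example, not anything prescribed by the report.

```python
# Minimal GPU utilization monitor using NVIDIA's NVML bindings
# (pip install nvidia-ml-py). Sample count and interval are arbitrary.
import time
from pynvml import (
    nvmlInit, nvmlShutdown, nvmlDeviceGetCount,
    nvmlDeviceGetHandleByIndex, nvmlDeviceGetUtilizationRates,
)

nvmlInit()
try:
    for _ in range(5):  # sample a few times; real monitors run continuously
        for i in range(nvmlDeviceGetCount()):
            handle = nvmlDeviceGetHandleByIndex(i)
            util = nvmlDeviceGetUtilizationRates(handle)
            # util.gpu is the percentage of time a kernel was executing
            print(f"GPU {i}: compute {util.gpu}%, memory {util.memory}%")
        time.sleep(1)
finally:
    nvmlShutdown()
```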

In this regard, 40 percent of respondents, regardless of company size, are planning to use orchestration and scheduling technology to maximize the return on their existing AI infrastructure investments. Only 42 percent of companies have the ability to manage dynamic MiG/GPU partitioning to optimize GPU utilization.
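As a rough sketch of what MiG partitioning involves in practice, the nvidia-smi commands below split an A100-class GPU into isolated instances. The GPU index and profile IDs are illustrative and vary by hardware, and the steps typically require root access on an idle GPU.

```python
# Sketch of Multi-Instance GPU (MiG) partitioning via nvidia-smi, wrapped in
# Python for readability. Profile ID 9 (3g.20gb on A100) is an assumption.
import subprocess

def run(cmd):
    print("$", " ".join(cmd))
    subprocess.run(cmd, check=True)

run(["nvidia-smi", "-i", "0", "-mig", "1"])  # enable MiG mode on GPU 0
run(["nvidia-smi", "mig", "-lgip"])          # list available instance profiles
# Create two GPU instances from profile 9, plus matching compute
# instances (-C), so two jobs can share the card in isolation.
run(["nvidia-smi", "mig", "-cgi", "9,9", "-C"])
run(["nvidia-smi", "mig", "-lgi"])           # verify the partitions
```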

Cost is the key buying factor for inference compute. To address GPU scarcity, 52 percent of respondents reported that they are actively looking for cost-effective alternatives to GPUs for inference in 2024, compared with 27 percent for training. A further 20 percent were interested in cost-effective alternatives to GPUs but were not aware of existing options. While industries are still in the early days of inference, demand for cost-efficient inference compute is expected to grow.

The biggest challenges for compute were found to be latency, followed by access to compute and power consumption. Over half of respondents plan to use language models (such as Llama) in their commercial deployments, followed by embedding models such as BERT and its variants (26 percent). Mitigating compute challenges will be essential to these plans.

Written By

Dr. Tim Sandle is Digital Journal's Editor-at-Large for science news. Tim specializes in science, technology, environmental, business, and health journalism. He is additionally a practising microbiologist and an author. He is also interested in history, politics and current affairs.
