
The state of AI infrastructure at scale: Exposing GPU utilization challenges

On the issue of latency, over half of respondents plan to use language models.

Generative AI apps like ChatGPT are raising concerns about the impact of artificial intelligence on a range of issues including disinformation as well as copyright over images, sound and text - © AFP Julio Cesar AGUILAR

The AI Infrastructure Alliance, MLOps company ClearML and chip firm FuriosaAI have teamed up to assess what business executives think about artificial intelligence. The output is a new report titled “The State of AI Infrastructure at Scale 2024: Unveiling Future Landscapes, Key Insights, and Business Benchmarks”. The report includes responses from AI/ML and technology leaders across North America, Europe, and Asia Pacific, addressing the issues and obstacles they face in scaling up AI.

Many executives reported that having and using open source technology is important for their organization, with most focused on customizing open source models. PyTorch, a machine learning library used for applications such as computer vision and natural language processing, is their framework of choice.

The assessment revealed that the biggest challenge in scaling AI is compute limitations (an issue of both availability and cost). The next biggest challenge was infrastructure issues.

Central concerns are with:

  • How executives are building their AI infrastructure.
  • The critical benchmarks and key challenges they face.
  • How they rank priorities when evaluating AI infrastructure solutions against their business use cases.

More specifically, in relation to the compute concerns, latency was top-ranked, followed by power consumption. To address this, the majority of executives plan to use more cloud compute and many will buy more GPU machines on-premises in 2024. A graphics processing unit (GPU) is an electronic circuit that can perform mathematical calculations at high speed. Computing tasks like graphics rendering, machine learning, and video editing require the application of similar mathematical operations to a large dataset, which is the kind of work GPUs are built for.

On the issue of latency, over half of respondents plan to use language models (like Llama), followed by embedding models (BERT and family), in their commercial deployments. Mitigating compute challenges will be essential to those plans.

One challenge is the limited global supply of GPUs. A global chip shortage, triggered by the COVID-19 pandemic in 2020, severely hampered the production of GPUs. The pandemic disrupted the global supply chain, causing delays in chip production and delivery. To counter GPU scarcity, most businesses are looking for, or are interested in, cost-effective alternatives to GPUs.

The main challenge in operating GPUs is job scheduling and management. This applies especially to coordinating tasks and workflows within the AI/ML technology stack, which is necessary in order to optimize GPU and compute resource allocation.
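To give a rough sense of what job scheduling involves, here is a minimal Python sketch of one simple approach: sort jobs from longest to shortest and hand each one to whichever GPU currently has the least work queued. The job names, durations and the `assign_jobs` helper are hypothetical illustrations, not anything taken from the report, and real schedulers also have to handle priorities, memory limits and jobs that arrive over time.

```python
import heapq


def assign_jobs(job_hours, num_gpus):
    """Greedy longest-job-first assignment to the least-loaded GPU.

    job_hours: dict mapping a (hypothetical) job name to its estimated hours.
    Returns (assignment, loads): jobs per GPU and total hours per GPU.
    """
    # Min-heap of (hours already assigned, gpu index).
    heap = [(0.0, gpu) for gpu in range(num_gpus)]
    heapq.heapify(heap)
    assignment = {gpu: [] for gpu in range(num_gpus)}

    # Place the longest jobs first so short jobs can fill the gaps.
    ordered = sorted(job_hours.items(), key=lambda kv: kv[1], reverse=True)
    for name, hours in ordered:
        load, gpu = heapq.heappop(heap)  # least-loaded GPU so far
        assignment[gpu].append(name)
        heapq.heappush(heap, (load + hours, gpu))

    loads = {gpu: load for load, gpu in heap}
    return assignment, loads


# Illustrative (made-up) workload on two GPUs.
jobs = {"train-a": 8.0, "train-b": 5.0, "eval-c": 3.0, "embed-d": 2.0}
assignment, loads = assign_jobs(jobs, num_gpus=2)
print(assignment)
print(loads)
```

Even in this toy case the two GPUs do not finish at the same time, so one of them sits idle for part of the run. That is the kind of gap a scheduler tries to shrink.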

For those who already operate cloud compute systems, the main concerns are around wastage and idle costs. In addition, there are misgivings about the overall cost of compute power consumption.
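The idle-cost concern comes down to simple arithmetic, sketched below in Python. The hourly rate, provisioned hours and utilization figure are assumed values chosen for illustration only; none of them come from the report, and `idle_cost` is a hypothetical helper.

```python
def idle_cost(hourly_rate, hours_provisioned, utilization):
    """Cost of provisioned compute time that did no useful work.

    utilization is the fraction of provisioned time actually used (0.0 to 1.0).
    """
    if not 0.0 <= utilization <= 1.0:
        raise ValueError("utilization must be between 0.0 and 1.0")
    return hourly_rate * hours_provisioned * (1.0 - utilization)


# Assumed figures: one cloud GPU at 2.00 per hour, provisioned for a
# 30-day month (720 hours) and busy 25% of the time.
cost = idle_cost(hourly_rate=2.0, hours_provisioned=720, utilization=0.25)
print(f"Idle spend for the month: {cost:.2f}")
```

Under these assumptions, three quarters of the monthly bill pays for time in which the GPU was doing nothing.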

Written By

Dr. Tim Sandle is Digital Journal's Editor-at-Large for science news. Tim specializes in science, technology, environmental, business, and health journalism. He is additionally a practising microbiologist; and an author. He is also interested in history, politics and current affairs.
