How are AI and associated technologies set to further transform businesses in 2025? What new insights will data science deliver? Do we need a new governance structure? To find out what the coming months have in store for workplace technology, Digital Journal spoke with Michael Berthold, CEO of KNIME.
Digital Journal: What are the key ways that business intelligence and data science have evolved in the past year, and what are the key drivers behind this change?
Michael Berthold: Almost all software for performing analytics or data science has incorporated experimental GenAI assistants; some tools have gone further and integrated LLM technology directly. Nearly every tool in our space is responding to industries’ desire to automate some or all data work. Practically speaking, however, few organizations have taken advantage of all the opportunities GenAI seems to promise, either because they can’t control the technology or because they lack transparency into how it works.
Simultaneously, the gap between data skills and data technology is widening: LLMs and other data technologies are maturing quickly, while businesses struggle to achieve even basic, enterprise-wide data literacy.
Tools that improve the accessibility and time-to-value of data work are going to be critical in the next several years.
DJ: Broadly, what are your predictions for the key trends and challenges in AI development and usage in 2025?
Berthold: 2025 will be the year of getting AI under control—or, in other words, AI governance. Now that organizations know the value of getting AI into production, they realize that controlling cost, quality, and access is critical for making the most of the technology.
People will also be reminded, over and over again, that AI doesn’t know what it’s talking about – or, as Stefan Wrobel put it, “AI states the likely, not the truth – but does so remarkably well.” For some applications this is fine, but for most it is a fundamental problem. How to make AI reliable will be a key focus of 2025.
And finally, 2025 will be the year AI is overused. As the expression goes, to someone holding a hammer, everything looks like a nail. But as the limitations and costs of AI become better understood, I expect people will often return to classic analytics and text analysis methods that are cheaper, easier to control, and more reliable. The most powerful and innovative approaches will combine those classics with new techniques.
DJ: In 2025, what are the essential components of effective AI governance policies, and how can companies stay ahead of potential regulatory shifts?
Berthold: First, it’s essential to be able to decide where you do and don’t have tolerance for hallucinations or inaccuracies. Organizations that want to innovate can’t impose blanket bans on all AI, or even mandate in-house AI only. Your policies need the flexibility to define where AI can be used and to what extent.
For the cases where tolerance can be higher than zero, you’ll need policies that minimize risk to the extent possible. That means policies that focus on the following:
- Transparency – a clear trail of which decisions an AI made
- Reproducibility – when does the result need to be consistent?
- Data security – what kind of data can be shared and how? (e.g., anonymized?)
- Accuracy tolerance – in what use cases can you tolerate mistakes and to what extent? Which AI decisions are required to be checked by humans?
- Access – who can and should build AI products, and who should access them?
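As a rough, hypothetical illustration (none of these names come from the interview or from any specific product), such a policy can be written down as configuration that tooling checks automatically before a model call is allowed:

```python
# Hypothetical sketch of an AI usage policy captured as data, so it can be
# checked automatically before a model call runs. All names are illustrative.
from dataclasses import dataclass


@dataclass
class AIUsagePolicy:
    use_case: str
    audit_trail_required: bool = True           # transparency: log every AI-made decision
    deterministic_output: bool = False          # reproducibility: must the same input give the same result?
    allowed_data: str = "anonymized"            # data security: what kind of data may be shared
    error_tolerance: float = 0.0                # accuracy: acceptable error rate for this use case
    human_review_required: bool = True          # which AI decisions a person must check
    allowed_roles: tuple = ("data_scientist",)  # access: who may build or call this AI


def is_request_allowed(policy: AIUsagePolicy, role: str, data_class: str) -> bool:
    """Gate a model call against the policy before it runs."""
    return role in policy.allowed_roles and data_class == policy.allowed_data


# Example: a customer-facing use case with zero tolerance for unreviewed mistakes.
policy = AIUsagePolicy(use_case="support_ticket_summaries")
print(is_request_allowed(policy, role="data_scientist", data_class="anonymized"))  # True
print(is_request_allowed(policy, role="marketing_analyst", data_class="raw_pii"))  # False
```

Encoded this way, every item on the list above becomes an explicit, auditable field rather than an unwritten convention.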
DJ: What is agentic AI and how do you foresee the role of this technology evolving in 2025, especially in terms of supporting data teams with autonomous decision-making and enhancing overall data science capabilities?
Berthold: Agentic AI is a rapidly emerging concept in technology and data science, though there is still some disagreement over its exact definition. Ultimately, agentic AIs are artificial intelligence systems that can operate and act with a level of autonomy. These systems make decisions to achieve specific goals based on input data, adapt to shifting circumstances, and determine the best course of action with minimal or no human oversight. Core features include autonomy, goal focus, and adaptability, making them highly versatile for dynamic environments.
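To make that concrete, the core of most agentic systems boils down to an observe-decide-act loop. The toy sketch below is purely illustrative – every function is a placeholder standing in for a model call or an integration, not part of any real agent framework:

```python
# Toy sketch of the loop at the core of most agentic AI systems: observe the
# environment, let a model choose the next action toward a goal, act, and
# adapt to the new state. Every function here is a placeholder.
def observe(environment: dict) -> dict:
    """Collect the current state the agent can see."""
    return dict(environment)


def choose_action(state: dict, goal: str) -> str:
    """Stand-in for a model call that picks the next step toward the goal."""
    return "query_database" if state.get("data_missing") else "write_report"


def act(action: str, environment: dict) -> dict:
    """Apply the chosen action and return the updated environment."""
    if action == "query_database":
        environment["data_missing"] = False
    else:
        environment["report_done"] = True
    return environment


def run_agent(goal: str, environment: dict, max_steps: int = 10) -> dict:
    """Autonomy with a budget: stop at the goal or after max_steps."""
    for _ in range(max_steps):
        state = observe(environment)
        if state.get("report_done"):
            break
        environment = act(choose_action(state, goal), environment)
    return environment


print(run_agent("summarize last quarter's sales", {"data_missing": True}))
# {'data_missing': False, 'report_done': True}
```

Even in this stripped-down form, the three core features are visible: the loop runs without a human in it (autonomy), it stops when the report is done (goal focus), and each action depends on the state the agent just observed (adaptability).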
While these systems hold immense potential to drive innovation across industries, their adoption introduces governance challenges that demand attention. For starters, transparency is critical – users need to understand how the system works and makes decisions. Data governance is equally important to ensure the data used by these systems is managed to prevent bias or discriminatory outcomes. Also, ethical safeguards must be in place to ensure every decision made by agentic AI systems is traceable and accountable, especially when errors occur.

Beyond legal considerations, organizations must also implement internal governance frameworks that maintain human oversight while still allowing the system to operate independently as designed. By combining these governance strategies with the technology’s advanced capabilities, data teams will be able to harness its full potential, driving innovation and efficiency across industries.
Agentic AI will make fewer headlines, but it won’t disappear. The big, bold vision – cleverly interacting AIs making completely autonomous, complex decisions – will not materialize anytime soon, but we will see more practical setups that use elements of AI agents under the hood. One reason is that specialized AIs perform better for more specialized tasks. A second is that specialized AIs tend to be smaller and less expensive to run. Expect agentic AI to continue gradually improving, and one day – but likely not next year – we will have that assistant who knows and does it all.
DJ: How do you see the role of citizen data scientists and data democratization evolving in 2025? What impact will this have on the demand for specialized data science skills within organizations?
Berthold: The key factors are:
- The line between citizen data scientists and data scientists will increasingly blur, as more data workers in the organization get access to advanced data technology through intuitive, easy-to-learn environments.
- Much more data work will be partially or fully automated, thanks to advancements in AI.
- However, those AI systems need to be transparent and well-governed (see above).
- The three points above do not mean the roles of data scientists will go away, but rather that advanced data experts will spend less time on mundane tasks – such as accessing and cleaning data – and much more on interesting, innovative analysis. Overall, this will drastically increase how much value-generating insight teams can produce.
DJ: How can organizations best leverage open-source platforms to remain competitive, and what role will these tools play in expanding access to data science?
Berthold: In an environment changing as rapidly as the data and AI space, organizations simply can’t afford not to leverage open-source technology.
First and foremost, open-source tools are the only technologies that are immediately compatible with all the new technology – GenAI and otherwise – appearing in the space. They’re also frequently free or, even combined with commercial controls or add-ons, cheaper than proprietary solutions, meaning organizations can adopt the technology without a large upfront investment. And lastly, they offer transparency – an essential component that is coming up more and more in AI and data regulations.
Businesses can rely on open-source technology to innovate with data and AI solutions while using commercial solutions to balance innovation with control and governance.
DJ: Looking ahead, what trends in data science do you think will be most impactful for businesses by 2025?
Berthold: These are:
- AI will be better controlled. Organizations will develop more sophisticated governance policies to more granularly control what is being done with AI, and by whom, trying to strike the delicate balance between innovation and safety.
- A resurgence of predictive AI. Now that “AI” is making headlines, people will be reminded of classical machine learning technology that we may not have leveraged to its full potential. We’ll see more focus on predictive AI, and interesting mixes of predictive and GenAI use cases.
- Agentic AI will remain (but won’t be such a hot topic). People will quickly see that fully autonomous AI agents aren’t realistic yet – however, some elements of agentic AI are possible in specific environments: first, because specialized AIs perform better for more specialized tasks, and second, because more specialized AIs tend to be smaller and less expensive to run. Expect agentic AI to continue to gradually improve.
- A focus on value might lose some (long-term) opportunities. Now that the initial hype has died down, businesses will ask the critical question: is this bringing business value? The risk is that some of those experiments will be stopped too early and teams will revert to old, less sophisticated tools. Picking the areas where it’s worth investing a bit more energy and patience will be an interesting challenge for many organizations.
DJ: How will small language models (SLMs) vs. large language models (LLMs) be used by data organizations in 2025? Do you see smaller, purpose-built models becoming more widely adopted next year?
Berthold: There are two reasons why one would prefer a smaller, more specialized model: quality and cost.
The quality argument is purely pragmatic, though. A large model can also be trained to provide the same quality for specialized tasks – however, we may need to make that model even larger (and more expensive). So this is more of a trade-off argument: do you really want to invest even more resources into a large model that, in the end, will only be used for a limited task?
This supports the cost argument: smaller models will become more attractive because they are cheaper to run—potentially even locally. This will allow embedding specialized AIs into programs and applications directly, without the need to rely on cloud resources for every interaction.
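As a minimal sketch of that idea – assuming the Hugging Face transformers library and an illustrative, publicly available small model, neither of which is named in the interview – a specialized model can be loaded once and called locally inside an application, with no cloud call per interaction:

```python
# Minimal sketch of embedding a small, locally run model directly in an
# application. Assumes the Hugging Face `transformers` library is installed;
# the model and task are illustrative, and any small, task-specific model
# could be substituted.
from transformers import pipeline

# Load the small model once at application start-up; it runs on local hardware.
classifier = pipeline(
    "text-classification",
    model="distilbert-base-uncased-finetuned-sst-2-english",  # small, task-specific model
)


def route_ticket(ticket_text: str) -> str:
    """Use the local model for one narrow task: routing a ticket by sentiment."""
    result = classifier(ticket_text)[0]
    return "priority_queue" if result["label"] == "NEGATIVE" else "standard_queue"


print(route_ticket("The export keeps failing and support has not responded."))
```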
Where we’ll really see the benefit of smaller, specialized models is in agentic setups: when many specialized AIs interact with each other, that cost-saving effect will become even more dramatic.
