Interview: How can organizations maintain data governance when using generative AI?

Generative AI platforms come with data governance risks for businesses due to unauthorized use or plugging private and sensitive data into the model, resulting in potential security breaches or non-compliance with data regulations.

The arrival of ChatGPT late last year shook up the EU's plans to regulate AI - Copyright AFP Karim SAHIB

Generative AI enables a faster, more thorough understanding of complex topics, and the large language models (LLMs) that power it are a business productivity gold mine, helping teams create stronger deliverables and execute labour-heavy tasks more efficiently. But at present, public platforms like OpenAI’s ChatGPT and Google’s Bard are the most common way for users to access these models – presenting an exposure risk for sensitive information that is driving many enterprises to ban their use outright on company networks and hardware.

Daniel Fallmann, CEO at Mindbreeze, shares insights with Digital Journal on the exposure risks these platforms present, the pitfalls leadership must consider when identifying tools to fit their needs, how businesses can get more trustworthy and effective results from the tools, and what innovations will drive the next generation of LLMs for business.

Digital Journal: Why are generative AI platforms a data governance risk for businesses?

Daniel Fallmann: Generative AI platforms come with data governance risks for businesses due to unauthorized use or plugging private and sensitive data into the model, resulting in potential security breaches or non-compliance with data regulations. Similar to some of the pitfalls to avoid, public generative AI platforms may expose companies to possible data leakages, making the training data all the more critical. Data privacy is also a data governance risk if customer data or confidential business information is input into the platform, exposing proprietary information to the public. Many of these risks are associated with generative AI platforms trained on publicly available data. Encryption, access rights, and monitoring mechanisms are needed to protect businesses from data governance risks and establish proper data handling practices.

DJ: Enterprises are rushing to integrate generative AI tools with their internal and external workflows – what are some pitfalls to avoid when adopting this technology?

Fallmann: Although there are tons of benefits from integrating AI tools into internal and external workflows, there are some pitfalls companies must avoid before adopting innovative AI tech solutions. One pitfall is biases present in training data that may result in biased or unfitting outputs. To mitigate this issue, companies looking to adopt generative AI into their operations must guarantee that the training data is diverse. Additionally, enterprises cannot rush into integration before addressing potential legal and ethical concerns of generating content with a language model and other AI tools – privacy violations, intellectual property infringement, and plagiarism, to name a few. Adequate evaluation, testing, monitoring systems, and ongoing employee training are core ways to avoid these pitfalls.

DJ: Can public platforms for generative AI, like ChatGPT and Google Bard, be used securely for business?

Fallmann: Public platforms for generative AI are one pillar for safe and effective business use. However, these platforms alone are not the entire equation. For best protection and constructive business use, integration with cognitive enterprise search systems must be added to the equation – permitting the connection of enterprise facts and business-relevant information. Combining the two makes up a valuable and secure solution for enterprises with highly sensitive data and protects the data from being shared into any public model.

DJ: Are these tools reliable? What’s the key to making generated responses more trustworthy?

Fallmann: The key to the reliability and trust of generative AI responses is combining them with cognitive enterprise search technology. As mentioned, this combination generates responses from enterprise data, and users can validate the information source. Each answer is provided in the user’s context, always accounting for data permissions from the data source with full compliance. In addition, these tools ensure data is consistently up-to-date by delta crawling. Integrating generative AI tools into a trusted knowledge management solution allows employees to see which documents their information came from and even provide further explanations. Human review and oversight play a significant role in certifying the accuracy and reliability of the generated content. Incorporating user feedback will also enforce transparency and trust in the system’s natural language question-answering (NLQA) abilities.
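For readers who want a concrete picture of the approach Fallmann describes, the following is a minimal sketch, not Mindbreeze's implementation. It assumes a toy in-memory document store and simple keyword-overlap ranking; the names `Document`, `retrieve`, `build_prompt`, and the sample documents are all hypothetical. It illustrates two of the ideas in the answer above: documents are filtered by the user's permissions before anything reaches a language model, and each retrieved passage carries an identifier so the user can check the source.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class Document:
    """Hypothetical enterprise document with an access-control list."""
    doc_id: str
    text: str
    allowed_groups: frozenset[str]


def retrieve(query: str, documents: list[Document],
             user_groups: set[str], top_k: int = 2) -> list[Document]:
    """Return up to top_k documents the user may read, ranked by keyword overlap."""
    terms = set(query.lower().split())
    scored = []
    for doc in documents:
        # Permission check happens before ranking, so documents the user
        # cannot read never become candidates for the prompt.
        if not (doc.allowed_groups & user_groups):
            continue
        overlap = len(terms & set(doc.text.lower().split()))
        if overlap:
            scored.append((overlap, doc))
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:top_k]]


def build_prompt(query: str, sources: list[Document]) -> str:
    """Assemble a prompt that labels each passage with its document id."""
    context = "\n".join(f"[{doc.doc_id}] {doc.text}" for doc in sources)
    return (
        "Answer using only the sources below and cite their ids.\n"
        f"{context}\n"
        f"Question: {query}"
    )


# Sample data, invented for illustration only.
DOCS = [
    Document("hr-001", "Salary bands for engineering staff", frozenset({"hr"})),
    Document("eng-007", "Engineering onboarding guide for new staff",
             frozenset({"engineering", "hr"})),
]

if __name__ == "__main__":
    hits = retrieve("engineering staff salary", DOCS, {"engineering"})
    print(build_prompt("engineering staff salary", hits))
```

In this sketch, a user in the `engineering` group never sees `hr-001`, even though it is the best keyword match, because the filter runs ahead of the ranking. A production system would replace the keyword overlap with a real search index and would send the assembled prompt to a model, neither of which is shown here.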

DJ: What should leadership consider when approving or denying use cases for generative AI?

Fallmann: Leadership must address various factors when contemplating approving or dismissing specific use cases. Firstly, leadership must evaluate the potential impact of the generated content on the organization’s reputation, brand image, and the effectiveness it will have on the specific business unit. Legal and ethical implications and ensuring compliance with regulations and guidelines are necessary considerations, just like any other deployed technology. Assessing the reliability and accuracy of outputs is crucial to ensure responses are actually helping their employees complete their tasks, as well as analysis of risks and benefits associated with the specific use case. Collaboration with legal, compliance, and data privacy experts provides valuable insights for leaders to make informed decisions about generative AI adoption.

DJ: Large language models are notably cost ineffective to operate on a small scale. How realistic are models for business that operate inside a “walled garden”?

Fallmann: Large language models for businesses operating inside a “walled garden” are absolutely realistic and cost-effective. Companies can achieve better performance and tailor the AI system to their specific needs by training and fine-tuning models specific to their domain or industry. Integration into existing systems allows more control over the training data to improve the accuracy and reliability of the generated content. Operating within a “walled garden” provides stronger data privacy, as sensitive information remains within the organization’s infrastructure.

Written By

Dr. Tim Sandle is Digital Journal's Editor-at-Large for science news. Tim specializes in science, technology, environmental, business, and health journalism. He is additionally a practising microbiologist; and an author. He is also interested in history, politics and current affairs.
