
Generating Good AI Results …and avoiding the Bad and the Ugly

 

Generative Artificial Intelligence (GenAI) has exploded onto the scene in the last nine months with the announcements of OpenAI’s ChatGPT 3.5 and 4.0, Google’s Bard (PaLM 2), and a range of open-source Large Language Models (LLMs). As government and industry learn how to apply GenAI across a variety of use cases, the technology is already readily accessible to the public for enhanced search and for conversational queries that produce narrative responses.

Alexis Bonnell, USAF Research Lab CIO/Digital

We decided to summarize what we know to date about using GenAI: what works, what doesn’t, and some key principles to consider on your journey to leveraging generative AI tools. To bring you reality-tested examples, we put two LLMs into practice by prompting them with our main topics for this blog and comparing the results. We’ve included those results in this blog so you can see the similarities and differences between the two.

To provide context for how to view this comparison, let’s first discuss some examples where GenAI did not yield the expected results. One of the most publicized involves a lawyer preparing a legal brief. The lawyer used a GenAI tool to find case law citations and submitted the brief to the court without checking their validity. Unfortunately, the tool cited case law that did not exist. The judge immediately recognized the error and called the lawyer to account. More recent examples include an AI meal-planner application that generated recipes with poisonous ingredients, and hoaxers who used the app to post toxic recipes.

In an example closer to home, writers for this blog asked GenAI to write several paragraphs about a specific IT-related technical subject in Digital Transformation, in the style of an informative article. The result was a well-formulated response. When a friend was asked to read the document and comment on its content, they said it seemed very credible based on their general knowledge of the subject. In reality, the response contained incorrect information and misrepresented some known facts, but its technical vocabulary and authoritative tone led the reader to believe it was correct. For every example of a wrong result, there are many more cases where GenAI is effective, such as enhancing productivity, summarizing content from large knowledge bases, and spotting security issues.

As promised, here we contrast the output of two LLMs: ChatGPT 4 and the open-source GPT4All. The results are similar in some respects, but not the same.

10 GUIDELINES for using GenAI

1. Do not upload sensitive non-public or internal data to generative AI services.

2. Obtain organization approval before using official email accounts to sign up for generative AI services.

3. Avoid uploading personal data.

4. Exercise caution and do not blindly trust the output of these models; results from generative AI may be misleading or inaccurate.

5. Refrain from using public generative AI tools to make decisions with legal, ethical, or financial consequences for the organization.

6. Be cautious about using these tools in ways that could compromise data security or privacy.

7. Note that the publicly available version of these tools often lacks contractual protection, such as legal privileges and immunities.

8. Understand that datasets uploaded to these tools may be accessed by unauthorized parties for various purposes.

9. Generative AI tools can aid cybercriminals in running scams and creating convincing phishing

 (To view the full texts generated, click on the links in the matrix headers).

 

Overview

  • Both articles generated the same title even though it was not in our prompt. Several articles on the web used similar titles.
  • Both outputs were formatted in the style of an article with an introduction and body, even though the outline provided in the prompt had no introduction.
  • The GPT4 model generated a conclusion, while the other model did not.

| Attribute | GPT4 | GPT4All (GPT4all-13B-snoozy.ggmlv3.g4_0) |
| --- | --- | --- |
| Introduction Generated | Yes | Yes |
| Generated Title | Generative AI - The Good, Bad and Ugly | Generative AI - The Good, Bad and Ugly |
| Conclusion Generated | Yes | No |
| Factually Accurate | Correctly cited specific examples | Correctly cited specific examples |
| Word Count | 887 | 949 |
| Issues with GenAI | Outlined a few hazards and expanded the example included in original prompt | Outlined one hazard and built on example included in original prompt |
| Effective Use of GenAI | A few positive uses were noted | Provided an extensive list of GenAI examples across a range of use cases and industries |
| Principles | A few key principles were noted including the importance of human engagement and how models are continuously learning | A few key principles were noted including the importance of human engagement and continuous monitoring of the technology for advancements |

Overall Content: The content generated was very similar beyond the attributes noted above.
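Readers who want to repeat this kind of side-by-side comparison on their own prompts could automate the simpler attributes. The following is a minimal Python sketch, not the method used for this blog: the function names are hypothetical, and word overlap is only a rough proxy for "similar content" that still needs a human read.

```python
def word_count(text: str) -> int:
    """Count whitespace-separated words in a generated article."""
    return len(text.split())


def vocabulary_overlap(a: str, b: str) -> float:
    """Jaccard overlap of the distinct lowercase words in two texts (0.0 to 1.0)."""
    words_a = set(a.lower().split())
    words_b = set(b.lower().split())
    if not words_a and not words_b:
        return 1.0  # two empty texts are trivially identical
    return len(words_a & words_b) / len(words_a | words_b)


def compare_outputs(a: str, b: str) -> dict:
    """Summarize two model outputs with a few easily measured attributes."""
    return {
        "words_a": word_count(a),
        "words_b": word_count(b),
        "vocabulary_overlap": round(vocabulary_overlap(a, b), 2),
    }


if __name__ == "__main__":
    # Placeholder strings; paste the two generated articles here instead.
    print(compare_outputs("Generative AI the good", "Generative AI the bad"))
```

Attributes such as factual accuracy cannot be measured this way and still call for review by someone who knows the subject.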

In addition to the guidelines shared by Alexis Bonnell above, here are some considerations when deploying GenAI in your organization:

  • Ethical considerations: When deploying generative AI models, it is important to consider ethical implications such as privacy, fairness, and transparency. The models should be designed and trained in a way that respects user privacy and avoids biased or discriminatory outputs. Transparency mechanisms should be implemented to allow users to understand and interpret the decisions made by the models.
  • User feedback and iteration: Continuous feedback and iteration from end-users are essential for improving the performance and usability of generative AI models. Collecting user feedback, monitoring model performance, and incorporating user preferences can help refine the models and ensure they meet the needs and expectations of the users.
  • Legal and regulatory compliance: Generative AI models must comply with relevant legal and regulatory requirements. This includes ensuring compliance with data protection laws, intellectual property rights, and any specific regulations or standards applicable to the domain or industry.
  • Collaboration and interdisciplinary approach: Developing and deploying generative AI models benefits from collaboration among different stakeholders, such as data scientists, domain experts, ethicists, legal experts, and end-users. An interdisciplinary approach can help address challenges and ensure successful deployment.
  • Continuous improvement and adaptation: Generative AI models should be regularly updated and improved to adapt to changing user needs, technological advancements, and emerging challenges. This may involve retraining the models with new data, incorporating new features or functionalities, or addressing issues or biases that arise over time.
  • Risk management: It is also important to assess and manage the risks associated with generative AI models, such as potential misuse or unintended consequences. Risk mitigation strategies should be implemented to minimize these risks, which may include robust testing, validation, and monitoring processes, as well as clear guidelines and protocols for responsible use.
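As one illustration of the privacy and risk-management points above, an organization might screen prompts for obvious personal data before they leave its network. The Python sketch below is only a starting point under stated assumptions: the patterns and the `redact` helper are hypothetical, cover a few US-style formats, and would miss many kinds of sensitive data that a real control must handle.

```python
import re

# Illustrative patterns only (assumed for this sketch); real deployments
# need much broader coverage and human review.
PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "SSN": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.]\d{3}[-.]\d{4}\b"),
}


def redact(prompt: str) -> tuple[str, list[str]]:
    """Replace matches with placeholders and report which kinds were found."""
    found = []
    for label, pattern in PATTERNS.items():
        prompt, hits = pattern.subn(f"[{label} REDACTED]", prompt)
        if hits:
            found.append(label)
    return prompt, found


if __name__ == "__main__":
    text = "Contact jane.doe@example.gov or 555-123-4567 about SSN 123-45-6789."
    cleaned, kinds = redact(text)
    print(cleaned)
    print(kinds)
```

A filter like this complements, but does not replace, the guideline of not uploading sensitive or personal data in the first place.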

For simple use cases, GenAI can yield positive outputs. In any case, it’s important to understand how LLMs work and to run experiments so that you have realistic expectations of the outputs they produce. Some things to note: foundational LLMs have been trained on datasets whose exact contents are not publicly known. What we do know is that the ChatGPT 3.5 model was trained on data through 2021, so more recent events and content are not reflected in the model. The knowledge base determines the quality and perspective of the results. Options for training models on a private knowledge base are emerging and becoming more attractive.

Generative AI has many potential applications and benefits for various domains and industries. However, it also poses challenges and risks that need to be addressed carefully and responsibly. By following the best practices discussed in this article, we can use generative AI effectively and ethically. To adopt the wisdom from our two GenAI-produced articles: human engagement, in tandem with continuous learning and ongoing monitoring of the models and the technology, is foundational to avoiding the infamous Bad and the Ugly.

Contributors:
John Barnes – Director, Consulting and Digital Transformation, Peraton
Alexis Bonnell – Digital Director, USAF Research Lab CIO
Judy Douglas – IAC Executive Committee, ACT-IAC; Fellow, National Academy of Public Administration
Diana Zavala – Director, Analytics and Data, Peraton

 
