Publication

Validating Sources and Data Integrity Prior to Producing Final Generative Artificial Intelligence Output


Authors: 
Kristen Chung, Kirk Goldman, Theodore Clark, Manda Miller, Sam Champagne, Pranit Kotgire, Levent Kacan

Abstract:
This disclosure proposes a method for minimizing bad data and inputs fed into a company’s in-house generative AI models. It outlines a way to continuously audit and curate the company’s data so that the data feeding the in-house gen AI tool stays accurate and up to date, by including or eliminating each data source based on metadata parameters and crowd-sourced input from company users.

Background:
The problem is how to ensure that the data feeding into a generative AI tool is accurate and up to date. This idea addresses it by continuously auditing and curating a company’s data, including or eliminating each data source based on metadata parameters and crowd-sourced input from company users.

Description:
The disclosure requires the following actions to deliver the idea:

1. Consistent information is collected in the metadata of each file within the business, including but not limited to:

a. Author: The name of the original author

b. Subject: A topic or keyword that identifies the document's contents

c. Title: The name of the document

d. Creation date: When the document was created

e. Last saved: When the document was last saved

f. Last printed: When the document was last printed

g. Last opened: When the document was last opened or viewed
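
As an illustration of the metadata fields listed above, the record could be sketched as a small Python data class. The class name `FileMetadata`, the helper `missing_fields`, and the choice of which fields are treated as required are assumptions made for this example; the disclosure does not prescribe a schema.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass(frozen=True)
class FileMetadata:
    """One file's metadata, mirroring fields a-g in the list above."""
    author: str
    subject: str
    title: str
    created: date
    last_saved: Optional[date] = None
    last_printed: Optional[date] = None
    last_opened: Optional[date] = None


# Assumption: these four fields must always be filled in for a file to be usable.
REQUIRED_FIELDS = ("author", "subject", "title", "created")


def missing_fields(meta: FileMetadata) -> list[str]:
    """Return the names of required fields that are empty."""
    return [name for name in REQUIRED_FIELDS if not getattr(meta, name)]


doc = FileMetadata(
    author="Cherie Berry",
    subject="Keeping Elevators Safe",
    title="How to make Elevators go super-fast",
    created=date(2023, 4, 1),
)
print(missing_fields(doc))
```

A check of this kind is one way to enforce the "consistent information" requirement before a file is made available to the gen AI tool.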

2. A generative AI tool that can tag the sources used in gen AI-produced content, save the metadata of those sources, and present the sources to the user of the tool

3. The metadata of the sources used in gen AI content can be presented to the user as a table, with the option to “include” or “eliminate” each source from the gen AI-produced output

a. Example: table of sources used to generate a presentation on elevator safety

Author | Subject | Title | Creation Date | Include/Eliminate
Cherie Berry | Keeping Elevators Safe | How to make Elevators go super-fast | April 1, 2023 | Include
Taylor Swift | Potential Ours lyrics for Speak Now | Elevator buttons in morning air / Strangers silence makes me wanna take the stairs | April 1, 2010 | Eliminate
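
The source table from the example could be produced along the following lines. This is a minimal sketch: `SourceRow`, `render_table`, and the default decision of "Include" are assumptions introduced for illustration, not part of the disclosure.

```python
from dataclasses import dataclass


@dataclass
class SourceRow:
    """One source used in the generated output, plus the user's decision."""
    author: str
    subject: str
    title: str
    created: str
    decision: str = "Include"  # assumption: sources start out included


def render_table(rows):
    """Render the rows as a plain-text table with padded columns."""
    headers = ["Author", "Subject", "Title", "Creation Date", "Include/Eliminate"]
    cells = [[r.author, r.subject, r.title, r.created, r.decision] for r in rows]
    # Each column is as wide as its longest entry, header included.
    widths = [
        max([len(h)] + [len(c[i]) for c in cells])
        for i, h in enumerate(headers)
    ]

    def fmt(values):
        return " | ".join(v.ljust(w) for v, w in zip(values, widths)).rstrip()

    lines = [fmt(headers), "-+-".join("-" * w for w in widths)]
    lines += [fmt(c) for c in cells]
    return "\n".join(lines)


rows = [
    SourceRow("Cherie Berry", "Keeping Elevators Safe",
              "How to make Elevators go super-fast", "April 1, 2023"),
    SourceRow("Taylor Swift", "Potential Ours lyrics for Speak Now",
              "Elevator buttons in morning air", "April 1, 2010", "Eliminate"),
]
print(render_table(rows))
```

In an interactive tool the last column would be a control the user can toggle; here it is simply a stored value.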

 

4. The user can “re-run” the generative AI prompt under the new rules about which sources are “included”. In the example above, the user does not want the elevator reference in Taylor Swift’s file used in an elevator safety presentation, so that source is marked “Eliminate”

5. Repeat the above process until all sources have been validated and accepted as “included”.
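
The review-and-re-run cycle in steps 4 and 5 can be sketched as a loop. Everything named here is hypothetical: `generate` is a stand-in for the gen AI tool (it simply reports every allowed source as used), `review` stands in for the user's Include/Eliminate choice, and `max_rounds` is a safety limit added for the example.

```python
from typing import Callable, List, Set, Tuple


def generate(prompt: str, allowed: List[str]) -> Tuple[str, List[str]]:
    """Hypothetical stand-in for the gen AI tool: uses every allowed source."""
    return f"{prompt} (built from {len(allowed)} source(s))", list(allowed)


def curate_and_rerun(
    prompt: str,
    sources: List[str],
    review: Callable[[str], str],
    max_rounds: int = 10,
) -> Tuple[str, List[str]]:
    """Re-run the prompt until every source used has been accepted."""
    eliminated: Set[str] = set()
    for _ in range(max_rounds):
        allowed = [s for s in sources if s not in eliminated]
        output, used = generate(prompt, allowed)
        # Ask the reviewer about each source the tool reports having used.
        rejected = {s for s in used if review(s) == "Eliminate"}
        if not rejected:
            return output, used
        eliminated |= rejected
    raise RuntimeError("sources were still being rejected after max_rounds")


sources = ["Keeping Elevators Safe", "Potential Ours lyrics for Speak Now"]


def review(source: str) -> str:
    # Assumption: the user rejects anything that looks like song lyrics.
    return "Eliminate" if "lyrics" in source else "Include"


output, used = curate_and_rerun("Elevator safety presentation", sources, review)
print(output)
print(used)
```

With a real tool, a re-run may pull in sources that were not used in the first pass, which is why the loop reviews the used list again on every round rather than only once.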

6. The generative AI tool can automatically discontinue the use of any company files that meet certain parameters, such as:

a. Crowd-sourced content:

i. The file has been “eliminated” from content at least X times

ii. The author has been flagged X times by other users

b. Metadata parameters, such as: not opened for one year, created over one year ago, or written by flagged authors who aren’t within the company
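
The automatic discontinuation rules in step 6 could be expressed as a simple rule check. This is a sketch under stated assumptions: the disclosure leaves the threshold X unspecified, so the value 3 below is a placeholder; `SourceStats` and `discontinue_reasons` are hypothetical names; and treating any single matching rule as sufficient is one possible policy, not the only one.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import List


@dataclass
class SourceStats:
    """Crowd-sourced counters and metadata for one company file."""
    author: str
    created: date
    last_opened: date
    times_eliminated: int = 0
    author_flags: int = 0
    author_is_internal: bool = True


def discontinue_reasons(
    s: SourceStats,
    today: date,
    max_eliminations: int = 3,   # placeholder for "X" in rule 6.a.i
    max_author_flags: int = 3,   # placeholder for "X" in rule 6.a.ii
    max_age: timedelta = timedelta(days=365),
) -> List[str]:
    """Return every rule the source trips; an empty list means keep using it."""
    reasons = []
    if s.times_eliminated >= max_eliminations:
        reasons.append("eliminated too often")
    if s.author_flags >= max_author_flags:
        reasons.append("author flagged too often")
    if today - s.last_opened > max_age:
        reasons.append("not opened for one year")
    if today - s.created > max_age:
        reasons.append("created over one year ago")
    if s.author_flags > 0 and not s.author_is_internal:
        reasons.append("flagged external author")
    return reasons


today = date(2023, 6, 1)
lyrics = SourceStats("Taylor Swift", date(2010, 4, 1), date(2010, 4, 1),
                     times_eliminated=3, author_is_internal=False)
safety = SourceStats("Cherie Berry", date(2023, 4, 1), date(2023, 5, 20))
print(discontinue_reasons(lyrics, today))
print(discontinue_reasons(safety, today))
```

Returning the list of reasons rather than a bare yes/no keeps the decision auditable, which fits the disclosure's goal of continuously auditing the data feeding the tool.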

 

TGCS Reference 00166

Contact the Intellectual Property department for more information