Filter out "liability-generating" (or "compromising") Gen AI outputs
Generative AI outputs that create financial or other liability (for example, by making promises) should be filtered before they reach the end user of the AI system.
For example, see the car-dealer chatbot that agreed to sell a Chevy for $1: https://venturebeat.com/ai/a-chevy-for-1-car-dealer-chatbots-show-perils-of-ai-for-customer-service/. A specially trained LLM or other NLP model could be used to flag responses that contain compromising content.
This would be added alongside the existing "toxic", "protected", and "sensitive" output categories, since it is a genuinely distinct fourth category.
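To make the mechanism concrete, here is a minimal Python sketch of such a post-generation filter. The `PROMISE_PATTERNS` heuristic, the `check_compromising` function, and the enum names are illustrative assumptions, not an existing API; a real implementation would swap the regex heuristic for a specially trained classifier as described above.

```python
import re
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class OutputCategory(Enum):
    TOXIC = "toxic"
    PROTECTED = "protected"
    SENSITIVE = "sensitive"
    COMPROMISING = "compromising"  # the proposed fourth category


@dataclass
class FilterResult:
    allowed: bool
    category: Optional[OutputCategory]
    reason: Optional[str]


# Hypothetical patterns suggesting a binding promise or commitment.
# A production system would replace this with a trained classifier.
PROMISE_PATTERNS = [
    re.compile(r"\blegally binding\b", re.IGNORECASE),
    re.compile(r"\bthat'?s a deal\b", re.IGNORECASE),
    re.compile(r"\bwe (guarantee|promise)\b", re.IGNORECASE),
    re.compile(r"\bno takesies[- ]backsies\b", re.IGNORECASE),
]


def check_compromising(text: str) -> FilterResult:
    """Flag model output that appears to make promises or create liability."""
    for pattern in PROMISE_PATTERNS:
        match = pattern.search(text)
        if match:
            return FilterResult(
                allowed=False,
                category=OutputCategory.COMPROMISING,
                reason=f"matched liability pattern: {match.group(0)!r}",
            )
    return FilterResult(allowed=True, category=None, reason=None)


if __name__ == "__main__":
    # Reply paraphrased from the Chevy dealership incident linked above.
    reply = "That's a deal, and that's a legally binding offer - no takesies backsies."
    result = check_compromising(reply)
    if not result.allowed:
        print(f"Blocked ({result.category.value}): {result.reason}")
    else:
        print("Passed:", reply)
```

The check would sit between the model and the user, so a flagged reply can be replaced with a safe fallback (or escalated to a human) instead of being delivered.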
Seamus Abshere shared this idea