Threats inherent to language models: Misalignment
Description: The production of content that contradicts the goals, policies, or intentions of developers.
Impact: Depends on the ability of end users to exploit the content for malicious purposes such as reputational damage or social engineering, even if the Gen AI system's terms of service state that its output does not represent the views of the developers.
1 vote
Filter out "liability-generating" (or "compromising") Gen AI outputs
Generative AI outputs that create financial or other liability (for example, by making promises) are filtered before reaching the end user of the AI system.
For example, see the "$1 Chevy" dealership chatbot incident: https://venturebeat.com/ai/a-chevy-for-1-car-dealer-chatbots-show-perils-of-ai-for-customer-service/. A specially trained LLM or other NLP model can be used to flag responses containing compromising content; a sketch of such a filter follows below.
This filter would sit alongside the existing "toxic", "protected", and "sensitive" output categories, because liability-generating content is a true fourth category.
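A minimal sketch of such an output filter, in Python. The regex heuristic here is only a stand-in for the "specially trained LLM or other NLP model" the idea proposes; every function name, pattern, and fallback message is illustrative, and the "no takesies backsies" phrase echoes the Chevrolet chatbot incident linked above.

```python
import re
from dataclasses import dataclass

# Proposed fourth output category, alongside "toxic", "protected", "sensitive".
LIABILITY = "liability"
SAFE = "safe"

@dataclass
class FilterResult:
    verdict: str
    reason: str | None = None  # pattern (or model rationale) that triggered the flag

def classify_liability(text: str) -> FilterResult:
    """Placeholder for the proposed liability classifier.

    A crude pattern heuristic stands in for a trained model: it flags text
    that reads like a binding promise, guarantee, or pricing commitment.
    """
    promise_patterns = [
        r"\bwe (guarantee|promise)\b",
        r"\blegally binding\b",
        r"\bno takesies[- ]?backsies\b",   # phrasing from the linked incident
        r"\b(deal|offer)\b.{0,40}\bfor \$?\d+",
    ]
    for pattern in promise_patterns:
        if re.search(pattern, text, re.IGNORECASE):
            return FilterResult(LIABILITY, reason=pattern)
    return FilterResult(SAFE)

def filter_output(model_response: str) -> str:
    """Run the liability check before the response reaches the end user."""
    result = classify_liability(model_response)
    if result.verdict == LIABILITY:
        # Replace the compromising response rather than passing it through.
        return "I can't make offers or commitments. Please contact a representative."
    return model_response

if __name__ == "__main__":
    # Flagged: reads like a binding $1 offer.
    print(filter_output("That's a deal - a 2024 Chevy Tahoe for $1. No takesies backsies."))
    # Passes through unchanged.
    print(filter_output("The Tahoe seats up to eight passengers."))
```

In production the heuristic would be replaced by the trained classifier, and this check would run in the same pipeline stage as the existing toxic/protected/sensitive filters.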
0 votes