
AI Inherent Risk Scale

AIIRS provides a consistent and defensible basis for assessing the inherent risk associated with GenAI-assisted tasks.


What is AIIRS?

The AI Inherent Risk Scale (AIIRS) provides a structured approach for classifying tasks that use generative artificial intelligence (GenAI) into LOW, MEDIUM, or HIGH inherent-risk bands.

Classification is determined via three criteria—epistemic dependence, verifiability, and consequences of error—that define the nature and significance of a task's reliance on GenAI.

These criteria consider the extent to which GenAI is expected to supply information, the degree to which the output can be independently verified, and the seriousness of any potential errors.

PURPOSE
What it does
The purpose of AIIRS is not to determine whether GenAI should be used, but to establish the level of inherent risk a task carries so that it can be actively managed where required.

AIIRS focuses on the inherent characteristics of a task, not on individual behaviour or user intent.

Once a task's inherent risk is understood, any additional safeguards, mitigations, or design choices may be applied where warranted, in line with any applicable governance arrangements.
SCOPE
What it covers
AIIRS is a classification instrument only, which indicates the level of risk that should be actively managed for a task that uses GenAI. It does not determine whether GenAI use is permitted, prohibited, ethical, compliant, or appropriate in any given context.

AIIRS is designed for task-bounded human use of GenAI and does not cover autonomous or agentic AI systems, which introduce additional risks beyond the scope of this classification instrument.
ALIGNMENT
Governance fit
Higher education regulatory frameworks require providers to manage risks to academic quality and integrity through institutional governance and oversight. In Australia this expectation appears in the Higher Education Standards Framework (HESF), and in the United Kingdom in the Office for Students (OfS) regulatory framework and the QAA UK Quality Code, alongside emerging principles on responsible use of artificial intelligence.

AIIRS supports consistent institutional judgement by classifying the inherent risk of GenAI use in academic work and informing decisions about safeguards, assessment design, and integrity responses within existing governance and quality assurance processes.

Three criteria.
One defensible answer.

AIIRS evaluates tasks across three independent dimensions.

Epistemic Dependence

Epistemic dependence captures whether a task requires the system's representations of the world to be correct in order for the task outcome to be usable.

Tasks with lower epistemic dependence rely only on user-provided material: the outcome remains usable even where the system's representations of the world are wrong.

Tasks with higher epistemic dependence depend on the system supplying correct information: if its representations of the world are wrong, the task outcome is not usable.


Verifiability

Verifiability captures the basis on which the correctness of a GenAI system's output can be verified for the task.

Verifiability is assessed independently of consequences. A task may be high risk due to unsourced verifiability, even where the immediate consequences of error are limited.

Tasks with embedded verifiability enable quick, reliable verification by the user or the surrounding process, without requiring specific domain expertise.

Tasks requiring expert verifiability depend on specialised expertise or external investigation that requires evaluative judgement.


Consequences of Error

The consequences of error reflect the extent to which incorrect, misleading, or incomplete GenAI outputs affect decisions, records, or outcomes related to the task.

Tasks with minimal consequences of error are those in which errors have minimal impact on understanding or outputs and do not affect decisions, records, or outcomes relating to people beyond the task.

Tasks with significant consequences of error are those in which errors affect decisions about people, alter records relating to them, or compromise outputs that have consequences for individuals or groups beyond the task.


Classify a task

Select one option for each criterion. AIIRS calculates the risk classification using max-dominant scoring, in which the single highest-rated criterion determines the overall result.

Criterion 01: Epistemic Dependence
Criterion 02: Verifiability
Criterion 03: Consequences of Error

What each band requires

AIIRS uses a max-dominant classification model that supports proportionate risk management by ensuring that any single high-risk feature of a task is not offset by lower-risk features elsewhere.
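AIIRS does not publish a reference implementation; the following Python sketch only illustrates the max-dominant rule described above, with the `Band` enum and the `classify` function name introduced here for illustration.

```python
from enum import IntEnum


class Band(IntEnum):
    """Illustrative risk bands, ordered so that HIGH dominates."""
    LOW = 1
    MEDIUM = 2
    HIGH = 3


def classify(epistemic_dependence: Band,
             verifiability: Band,
             consequences_of_error: Band) -> Band:
    # Max-dominant scoring: the single highest-rated criterion
    # determines the overall result, so one high-risk feature is
    # never offset by lower-risk features elsewhere.
    return max(epistemic_dependence, verifiability, consequences_of_error)


# A task rated LOW on one criterion and HIGH on another
# classifies as HIGH overall.
print(classify(Band.LOW, Band.HIGH, Band.MEDIUM).name)  # HIGH
```

Because the rule is a simple maximum, the only way to lower a task's classification is to change the task itself (for example, reducing its epistemic dependence or improving its verifiability), which is the behaviour the proportionate-management model above intends.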

HIGH
Intervention required

Tasks classified as HIGH must not proceed in their current form; one or more targeted interventions are required before the task can proceed.
MEDIUM
Controls required

Tasks classified as MEDIUM require proportionate controls and conditions to manage the identified risk.
LOW
Routine care

Tasks classified as LOW require routine care and practices appropriate to the task and context.
Get the Full Framework

The complete AI Inherent Risk Scale is freely available and released under a Creative Commons BY-NC-SA 4.0 license.

Download AIIRS PDF
AI Inherent Risk Scale (AIIRS) © 2026
Mark A. Bassett  ·  Kelly Webb-Davies  ·  Ella Wicks
Licensed under CC BY-NC-SA 4.0
Scan the QR code to download.