AI & Machine Learning

AI Governance in Saudi Arabia: Building the Technical Foundations for Responsible AI at Scale

February 23, 2026
18 min read
By Aamir Faaiz

How PII redaction, document classification, and data governance are becoming critical capabilities for organisations operating under the Kingdom's rapidly evolving regulatory framework.

Introduction

As Saudi Arabia accelerates its transformation into a global AI hub under Vision 2030, the Kingdom's regulatory landscape is maturing at a pace that demands serious technical attention from every organisation deploying AI in the region. What was once a patchwork of soft guidelines is rapidly crystallising into binding legislation — and the organisations that build robust governance capabilities now will be the ones positioned to lead, not scramble to catch up.

At Bayseian, we've spent significant time on the ground in Riyadh working with organisations navigating this exact challenge. This article provides a comprehensive technical overview of what AI governance means in the Saudi context, the specific capabilities organisations need to invest in, and how frameworks like PII redaction and intelligent document classification aren't just compliance boxes to tick — they're foundational infrastructure for any serious AI deployment in the region.

The Regulatory Landscape: SDAIA, NDMO, and the PDPL

Saudi Arabia's AI governance architecture is anchored by three institutional pillars: the Saudi Data and Artificial Intelligence Authority (SDAIA), the National Data Management Office (NDMO), and the Personal Data Protection Law (PDPL). Understanding how these interact is essential before diving into technical implementation.

SDAIA, established by royal decree in 2019, serves as the Kingdom's apex body for all things data and AI. It operates with a mandate that few equivalent bodies globally possess — simultaneously functioning as a strategy-setting authority, a regulatory body, and an ecosystem enabler. Unlike many countries where AI governance is fragmented across multiple agencies, Saudi Arabia has concentrated authority in a single institution with both the mandate and the resources to enforce it.

NDMO, operating under SDAIA's umbrella, is the national regulator specifically for data governance and personal data protection. It is responsible for setting the policies, standards, and controls that govern data management across both public and private sectors. NDMO's standards are not aspirational — government entities must implement them, and the scope extends to any business partner handling government data.

The PDPL, which became fully enforceable on 14 September 2024 after a one-year grace period, represents the Kingdom's first comprehensive data protection legislation. Its scope is notably broader than even the EU's GDPR in one critical respect: while the GDPR limits its extraterritorial reach to specific activities such as offering goods or services to EU residents, the PDPL applies to any processing of personal data of individuals located in Saudi Arabia, regardless of the nature of that processing. For organisations deploying AI systems that touch Saudi data — whether they're based in Riyadh or London — this has profound implications.

The penalties are substantive: fines up to SAR 5 million (approximately USD 1.3 million), potential imprisonment for up to two years for unauthorised disclosure of sensitive data, and the prospect of doubled fines for repeat violations. These aren't theoretical deterrents. In its first year of enforcement, PDPL committees issued 48 decisions confirming violations and imposing penalties on data controllers. Separately, in January 2025, NDMO awarded AI service provider accreditation certificates to over 40 entities that met AI ethics maturity requirements — ranking them across five maturity levels from "Conscious" to "Pioneer" — signalling a shift from soft encouragement to formal compliance verification.

How Saudi Arabia's Approach Differs from the GDPR and the EU AI Act

It's worth pausing to note what makes the Saudi approach distinctive. The EU's regulatory architecture separates data protection (GDPR) from AI-specific regulation (EU AI Act) and distributes enforcement across 27 member states' Data Protection Authorities. Saudi Arabia, by contrast, concentrates data governance, personal data protection, and AI regulation under a single authority (SDAIA), creating a more unified — and arguably more agile — regulatory environment.

The PDPL's extraterritorial reach is also broader than the GDPR's. Where the GDPR requires a nexus to specific activities (offering goods/services or monitoring behaviour), the PDPL simply covers any processing of data belonging to individuals in Saudi Arabia. This means AI systems that even incidentally process Saudi data — for example, a global customer service model trained on multilingual datasets — may fall within the PDPL's scope.

The AI Adoption Framework, while currently non-binding, represents a risk-based approach conceptually similar to the EU AI Act but with Saudi-specific adaptations around sovereign compute, Arabic-language requirements, and alignment with Vision 2030's economic diversification goals.

Figure 1: The three-pillar regulatory architecture governing AI in Saudi Arabia, with SDAIA at the apex and a dedicated AI law on the horizon.


NDMO's 15-Domain Framework: The Technical Compliance Map

For technical teams, the most actionable component of the Saudi regulatory stack is NDMO's National Data Management and Personal Data Protection Standards. This framework spans 15 domains, encompassing 77 controls and 191 compliance specifications.

The 15 domains are organised into five pillars that map directly to the data lifecycle.

Pillar 1 — Data Governance establishes the overarching rules: governance structures, policies and procedures, and defined roles and responsibilities. For AI deployments, this translates directly to model governance — documented training data lineage, inference audit trails, clear ownership of model outputs, and accountability chains that connect AI system behaviour to named individuals and organisational structures.

Pillar 2 — Data Assetisation addresses how data is catalogued, its quality managed, and its metadata maintained. For AI systems, this pillar drives feature store management, training data quality assurance, data versioning, and — critically — bias detection. NDMO requires that data assets be valued based on their function and quality, which in the AI context means organisations must be able to demonstrate that their training data meets defined quality thresholds and is free from systematic biases that could produce discriminatory outcomes.

Pillar 3 — Data Usage governs how data is shared, analysed, and made available through open data standards. For AI, this pillar defines the rules for inter-agency AI model sharing, federated learning arrangements, model output distribution, and API governance. Organisations that want to share AI-derived insights across government entities must comply with these data sharing standards.

Pillar 4 — Data Classification and Availability requires formal classification of all data assets and ensures appropriate access management and high availability. This is where automated document classification becomes essential — no organisation dealing with large document volumes can manually classify every asset to NDMO standards. It also drives requirements for role-based access control (RBAC) for AI systems, high availability for inference endpoints, and disaster recovery for model registries.

Pillar 5 — Data Protection covers privacy controls, data residency requirements, and encryption standards. This is the pillar that makes PII redaction non-negotiable and sovereign compute architecturally mandatory. Personal and government data must remain within Saudi Arabia's national borders, which has immediate implications for cloud infrastructure choices, model training pipelines, and vector database deployments.

Impact Assessment and Classification Registers

Data Classification under NDMO standards is not merely a labelling exercise. It requires a formal impact assessment to determine the potential damage caused by data mishandling, followed by the assignment of classification levels. Organisations must maintain a register of all identified data assets containing: their assigned classification levels, the dates of assignment, the duration of those classifications, approved classification levels from review, and the dates of review. This is a continuous, auditable process — not a one-off project.
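As a minimal sketch of what such a register entry might look like in code, with the field names and review logic being our own illustration rather than an NDMO-prescribed schema:

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional

@dataclass
class ClassificationRecord:
    """One entry in the data-asset classification register.

    Field names are illustrative; NDMO does not prescribe a schema,
    only that these facts be recorded and remain auditable.
    """
    asset_id: str
    classification: str                    # e.g. "Public", "Internal", "Restricted", "Confidential"
    assigned_on: date                      # date the classification was assigned
    valid_for_days: int                    # duration of the classification
    approved_level: Optional[str] = None   # level approved at last review
    last_reviewed_on: Optional[date] = None

    def review_due(self, today: date) -> bool:
        """A classification past its duration is due for re-review."""
        anchor = self.last_reviewed_on or self.assigned_on
        return (today - anchor).days >= self.valid_for_days

# Example: a contract classified "Restricted" for one year
record = ClassificationRecord(
    asset_id="DOC-2024-00017",
    classification="Restricted",
    assigned_on=date(2024, 10, 1),
    valid_for_days=365,
)
```

A scheduled job over such records is what turns classification from a one-off project into the continuous, auditable process the standards require.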

Figure 2: The five pillars of NDMO's National Data Standards and their specific implications for AI system design.


PII Redaction: From Nice-to-Have to Infrastructure Requirement

The PDPL defines "Sensitive Data" broadly: racial or ethnic origin, religious or political beliefs, criminal records, biometric and genetic data, health data, and data indicating that one or both of an individual's parents are unknown. Any AI system that processes unstructured data — documents, communications, transcripts, support tickets — will inevitably encounter PII that falls into these categories.

PII redaction, therefore, isn't a peripheral feature. It's a critical infrastructure layer that must sit between raw data ingestion and any downstream AI processing.

The Hybrid Detection Architecture: NER + Regex + Small Language Models

A robust PII redaction pipeline in the Saudi context requires three complementary detection layers working in concert, not a single approach in isolation.

Arabic PII patterns differ substantially from English ones. Saudi national ID numbers follow a 10-digit format beginning with 1 (for citizens) or 2 (for residents). Iqama (residency permit) numbers have their own distinct patterns. Saudi phone numbers use the +966 country code with specific regional prefixes. Arabic name patterns — including patronymic chains — require morphological analysis that goes far beyond the tokenisation approaches used for English names.

Off-the-shelf NER solutions built for English-language markets will miss a significant portion of Arabic-language PII. This isn't a marginal gap — in our experience, English-optimised NER tools detect as few as 40–60% of Arabic PII entities in Saudi government documents. The NER layer requires dual-language pipelines: dedicated Arabic models (typically fine-tuned Arabic transformers such as AraBERT, or pipelines built with CAMeL Tools) running in parallel with English NER models, with a fusion layer that handles mixed-language documents — which are extremely common in Saudi business communications.

The regex layer handles structured identifiers where patterns are deterministic: Saudi national IDs (10 digits, starting with 1 or 2), Iqama numbers, phone numbers (+966 prefix), IBAN formats, and other machine-readable PII. Regex provides high-precision, zero-latency detection for these known patterns and serves as a safety net that catches entities the NER models might miss.
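A minimal sketch of the regex layer, using deliberately simplified patterns (production rules would add Arabic-Indic digit normalisation, checksum validation, and carefully tuned boundaries):

```python
import re

# Simplified, illustrative patterns -- not production-grade rules.
PII_PATTERNS = {
    # Saudi national ID / Iqama: 10 digits, leading 1 (citizen) or 2 (resident)
    "national_id": re.compile(r"\b[12]\d{9}\b"),
    # Saudi mobile numbers in international format (+966 then 9 digits)
    "phone": re.compile(r"\+966\s?5\d{8}\b"),
    # Saudi IBAN: "SA" followed by 22 digits
    "iban": re.compile(r"\bSA\d{22}\b"),
}

def detect_structured_pii(text: str) -> list[dict]:
    """Return (type, span, value) hits for every deterministic pattern."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for m in pattern.finditer(text):
            hits.append({"type": label, "start": m.start(),
                         "end": m.end(), "value": m.group()})
    return hits

sample = "Applicant ID 1023456789, contact +966 512345678."
hits = detect_structured_pii(sample)
```

Because these patterns are deterministic, this layer runs at effectively zero latency and provides a floor of detections that the probabilistic NER layer can never silently drop.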

The small language model (SLM) layer provides the contextual intelligence that neither NER nor regex can offer alone. A string of 10 digits might be a Saudi national ID in one document context and an invoice number in another. A name that appears in a public government directory should be treated differently from the same name appearing in a medical record. The SLM disambiguates these cases by understanding document structure, surrounding text, and domain-specific patterns.

This is where a hybrid approach becomes essential — combining NER models for entity detection, regex pattern engines for structured identifiers, and small language models for contextual disambiguation. The SLM layer doesn't just ask "is this a name?" but "is this a name in a context where the PDPL requires it to be protected?" SLMs are preferable to large language models in this pipeline because they can run on-premise within the sovereignty envelope, offer deterministic latency suitable for high-throughput document processing, and avoid the data residency complications of routing content through hosted LLM APIs.
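The fusion of the three layers can be sketched as follows, with the SLM call stubbed by a toy heuristic; in production it would be a classification call to an on-premise small model, and all names here are illustrative:

```python
def slm_is_protected(entity: dict, context: str) -> bool:
    """Stub for SLM disambiguation. In production this would query an
    on-premise small language model with the entity and its surrounding
    text; here a toy heuristic stands in for it."""
    # Toy rule: 10-digit numbers near the word "invoice" are not IDs.
    return "invoice" not in context.lower()

def fuse_detections(regex_hits: list[dict], ner_hits: list[dict],
                    text: str, window: int = 40) -> list[dict]:
    """Merge regex and NER detections, then ask the SLM layer whether
    each candidate is PII in this document's context."""
    merged, seen_spans = [], set()
    for hit in regex_hits + ner_hits:          # regex first: higher precision
        span = (hit["start"], hit["end"])
        if any(s < span[1] and span[0] < e for s, e in seen_spans):
            continue                           # overlapping span already handled
        context = text[max(0, span[0] - window): span[1] + window]
        if slm_is_protected(hit, context):
            merged.append(hit)
        seen_spans.add(span)
    return merged

text = "Invoice no. 1234567890 was paid by account holder 2098765432."
regex_hits = [
    {"type": "national_id", "start": 12, "end": 22, "value": "1234567890"},
    {"type": "national_id", "start": 50, "end": 60, "value": "2098765432"},
]
confirmed = fuse_detections(regex_hits, [], text)
```

In this example, both ten-digit numbers match the national ID pattern, but only the one that is not sitting next to "Invoice" survives contextual disambiguation.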

Reversible vs. Irreversible Redaction

The architectural choice between pseudonymisation and anonymisation has significant downstream implications. Pseudonymisation (reversible, with a secure mapping table stored in a KSA-resident key management system) enables downstream analytics while protecting identity — useful for internal research, trend analysis, and performance monitoring. Full anonymisation (irreversible, using techniques like token replacement, generalisation, or differential privacy) is required for model training datasets and any data that will be shared externally.

The system must support both modes with clear policy controls governing which mode applies to which data classification level. A document classified as "Confidential" under NDMO standards will typically require full anonymisation before any AI processing. An "Internal" document might permit pseudonymisation for authorised analytics use cases. These policy mappings must be codified, version-controlled, and auditable.
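One way to codify such a mapping, using NDMO-style level names; the policy table itself and the token scheme are our illustration of the pattern, not a prescribed design:

```python
import hashlib
import secrets

# Illustrative policy table: classification level -> redaction mode.
# In production this table is itself a version-controlled, auditable
# policy artefact.
REDACTION_POLICY = {
    "Public": "none",
    "Internal": "pseudonymise",
    "Restricted": "anonymise",
    "Confidential": "anonymise",
}

# Pseudonym mapping table. In a real deployment this lives in a
# KSA-resident key management / secure storage service, never alongside
# the data, and tokens would use a keyed hash (HMAC), not a bare digest.
_pseudonym_map: dict[str, str] = {}

def redact(value: str, classification: str) -> str:
    mode = REDACTION_POLICY[classification]
    if mode == "none":
        return value
    if mode == "pseudonymise":
        # Reversible: stable token, recoverable via the secure mapping table
        return _pseudonym_map.setdefault(
            value, "PSN-" + hashlib.sha256(value.encode()).hexdigest()[:12])
    # Irreversible: random token, no mapping retained
    return "REDACTED-" + secrets.token_hex(6)

internal = redact("1023456789", "Internal")          # same token on every call
confidential = redact("1023456789", "Confidential")  # fresh random token
```

The key property: pseudonymised values stay joinable for authorised analytics, while anonymised values carry no path back to the individual.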

Audit Logging

Every redaction decision — what was redacted, when, by what rule or model, and with what confidence score — must be logged and retained. NDMO standards require tamper-evident logging with minimum 5-year retention. This audit trail serves dual purposes: demonstrating compliance to regulators and enabling quality assurance of the redaction pipeline itself. Without it, organisations cannot prove that their PII handling meets PDPL requirements.
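Tamper evidence is typically achieved by hash-chaining entries so that any retroactive edit breaks every subsequent hash. A minimal sketch, with the caveat that production systems would add signed checkpoints and write-once storage:

```python
import hashlib
import json

class AuditLog:
    """Append-only log where each entry's hash covers the previous hash,
    so a retroactive edit breaks the chain. A sketch, not a product."""

    def __init__(self):
        self.entries: list[dict] = []
        self._last_hash = "0" * 64          # genesis value

    def append(self, event: dict) -> None:
        payload = json.dumps(event, sort_keys=True)
        entry_hash = hashlib.sha256(
            (self._last_hash + payload).encode()).hexdigest()
        self.entries.append({"event": event, "prev": self._last_hash,
                             "hash": entry_hash})
        self._last_hash = entry_hash

    def verify(self) -> bool:
        prev = "0" * 64
        for e in self.entries:
            payload = json.dumps(e["event"], sort_keys=True)
            if e["prev"] != prev or e["hash"] != hashlib.sha256(
                    (prev + payload).encode()).hexdigest():
                return False
            prev = e["hash"]
        return True

log = AuditLog()
log.append({"action": "redact", "rule": "national_id_regex",
            "doc": "DOC-17", "confidence": 0.99})
log.append({"action": "classify", "level": "Restricted", "doc": "DOC-17"})
```

Verification can run continuously; a single failed `verify()` is evidence that the log was altered after the fact.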

Figure 3: The end-to-end PII redaction pipeline, from multi-source ingestion through detection, classification, and policy-driven redaction to audit-logged output.


Document Classification: The Gateway to Compliant AI Processing

Under NDMO's data classification requirements, every data asset must be classified before it can be processed, shared, or stored. For organisations dealing with large volumes of unstructured documents — contracts, correspondence, reports, applications, complaints — manual classification is neither scalable nor reliable.

Intelligent document classification becomes the gateway capability that enables everything else. Without it, organisations cannot reliably apply the correct data handling rules, cannot enforce data residency requirements at the document level, and cannot ensure that PII redaction policies are applied proportionally to the sensitivity of the content.

Four Dimensions of Classification

An effective document classification system for the Saudi regulatory context must address four dimensions simultaneously.

Sensitivity classification must align with NDMO's defined levels: Public, Internal, Restricted, and Confidential. Each level triggers a cascade of handling requirements. Public data can be processed with minimal controls. Internal data requires access controls and encryption. Restricted data demands need-to-know access, full audit trails, and MFA for access. Confidential data must be processed exclusively on KSA-sovereign infrastructure with the most stringent controls.

Content-type classification identifies what kind of document is being processed — a contract, a government directive, a citizen complaint, a medical record, a financial statement. Each content type carries its own regulatory obligations. A medical record containing health data triggers PDPL sensitive data provisions. A government directive may trigger different classification and sharing rules under the National Data Governance Interim Regulations. A citizen complaint containing personal grievances requires different redaction treatment than a public tender document.

Language and script detection is foundational. Saudi organisations routinely produce and process documents in Arabic, in English, and in mixed-language form, where a single page might contain Arabic body text with English technical terms, or English templates filled with Arabic data. The classification system must handle all three scenarios accurately, including right-to-left text rendering, Arabic morphological complexity, and the code-switching patterns common in Saudi business communications.

Temporal classification — identifying when a document was created, when it was last modified, and what retention period applies — feeds directly into NDMO's requirements around data lifecycle management. Documents that have exceeded their retention period must be identified for disposition. Documents approaching regulatory deadlines must be flagged for action. AI-processed documents must carry metadata about when and how they were processed.
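A disposition sweep over a document register might look like the following sketch; the retention periods are invented for illustration, since real periods come from NDMO lifecycle requirements and sector-specific rules:

```python
from datetime import date, timedelta

# Illustrative retention periods by content type. Actual periods must be
# taken from NDMO lifecycle requirements and sector regulation.
RETENTION = {
    "contract": timedelta(days=365 * 10),
    "complaint": timedelta(days=365 * 3),
}

def disposition_due(doc: dict, today: date) -> bool:
    """Flag documents whose retention period has elapsed."""
    return today - doc["created"] > RETENTION[doc["content_type"]]

docs = [
    {"id": "D1", "content_type": "complaint", "created": date(2020, 1, 1)},
    {"id": "D2", "content_type": "contract", "created": date(2020, 1, 1)},
]
due = [d["id"] for d in docs if disposition_due(d, date(2026, 2, 1))]
```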

Architecture: Hybrid ML + Rule Engine

In practice, the most effective classification systems for the Saudi context use a hybrid architecture: machine learning classifiers (typically fine-tuned BERT variants for Arabic) for content-type and sensitivity detection, combined with a deterministic rule engine for regulatory mapping and policy enforcement. The ML layer handles the fuzzy, context-dependent decisions ("is this a medical record or a wellness brochure?"), while the rule engine ensures that classification outputs map deterministically to NDMO-compliant handling rules.

The system should target >95% classification accuracy on Arabic documents and >99% on English. When confidence falls below threshold, automatic escalation to review queues ensures that edge cases are handled appropriately — NDMO standards demand demonstrable accuracy in data handling.
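The seam between the ML layer and the rule engine reduces to a small piece of routing logic. A sketch with illustrative names, taking the confidence floor from the Arabic accuracy target and the handling rules as a stand-in for a real NDMO policy mapping:

```python
from dataclasses import dataclass

@dataclass
class Classification:
    content_type: str
    sensitivity: str
    confidence: float

# Confidence floor below which a document goes to human review.
REVIEW_THRESHOLD = 0.95

# Deterministic rule engine: sensitivity level -> handling rules.
# Values are illustrative stand-ins for the real policy mapping.
HANDLING_RULES = {
    "Confidential": {"infra": "ksa_sovereign_only", "redaction": "anonymise"},
    "Restricted":   {"infra": "ksa_sovereign_only", "redaction": "anonymise"},
    "Internal":     {"infra": "ksa_resident",       "redaction": "pseudonymise"},
    "Public":       {"infra": "ksa_resident",       "redaction": "none"},
}

def route(c: Classification) -> dict:
    """Apply deterministic policy, escalating low-confidence cases."""
    if c.confidence < REVIEW_THRESHOLD:
        return {"action": "escalate_to_review_queue",
                "reason": f"confidence {c.confidence:.2f} below threshold"}
    return {"action": "process", **HANDLING_RULES[c.sensitivity]}

high = route(Classification("contract", "Restricted", 0.98))
low = route(Classification("medical_record", "Confidential", 0.71))
```

The division of labour matters: the ML classifier may be probabilistic, but once its output clears the threshold, the handling decision is fully deterministic and therefore auditable.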

Figure 4: The four-dimensional classification engine and the downstream actions each classification triggers — from storage routing to encryption requirements.


The AI Adoption Framework: Risk-Based Governance in Practice

SDAIA's AI Adoption Framework, published in September 2024, introduced a structured maturity model with four levels — Emerging, Developed, Proficient, and Advanced — alongside a risk-based classification system for AI use cases.

While the Framework is currently non-binding, it carries significant weight, particularly in the public sector where government entities are mandated to follow it through internal directives. And the trajectory is clear: a dedicated AI law currently under development aims to codify many of these principles into binding legislation, introducing mandatory risk-based classification, registration requirements, and audit duties for AI providers.

Four Maturity Levels

The Framework defines organisational AI maturity across four levels. Emerging organisations have initial awareness and ad-hoc AI experimentation. Developed organisations have established basic AI governance structures and repeatable processes. Proficient organisations demonstrate integrated AI capabilities with formal risk management. Advanced organisations operate at the frontier with continuous improvement, proactive risk identification, and thought leadership.

Each level maps to expected capabilities across the Framework's four enablers: data readiness, technology infrastructure, human capabilities, and responsible use practices. Organisations are expected to self-assess and develop roadmaps for progression.

Risk-Based Classification

The Framework's risk-based approach mirrors the EU AI Act's conceptual structure but with Saudi-specific adaptations. High-risk AI systems — defined as those whose failure could materially affect safety, fundamental rights, or critical services — require:

  • Pre-deployment impact assessments documenting potential harms, affected populations, and mitigation strategies
  • Documented testing regimes including stress testing, edge-case analysis, and adversarial probing
  • Red-team exercises specifically targeting failure modes relevant to the Saudi context (Arabic language edge cases, cultural sensitivity, government process disruption)
  • Post-deployment monitoring with defined KPIs for accuracy, fairness, and reliability, and escalation procedures when thresholds are breached

AI Ethics Principles and Generative AI Guidelines

The AI Ethics Principles (2023) establish seven foundational values: integrity and fairness; privacy and security; reliability and safety; transparency and interpretability; accountability and responsibility; human-centric values; and social and environmental benefit. These aren't abstract aspirations — they translate to concrete technical requirements. Fairness demands bias testing across demographic groups relevant to the Saudi population. Transparency requires explainability mechanisms appropriate to the risk level of the system. Accountability demands clear ownership chains for AI outputs.

The Generative AI Guidelines (2024) address the specific challenges of large language models and generative systems. Key requirements include content authenticity verification, watermarking of AI-generated content, human oversight for high-stakes outputs, and clear disclosure when AI-generated content is being presented. The Deepfakes Guidelines (2024) further mandate disclosure and watermarking for synthetic media.

Cross-Border Data Transfers: The Sovereignty Imperative

The Regulation on Personal Data Transfer Outside the Kingdom, issued in August 2024, introduces structured mechanisms for cross-border transfers. These include adequacy assessments of recipient jurisdictions, standard contractual clauses, and binding corporate rules.

But the direction of travel is toward greater data sovereignty, not less. The Draft Global AI Hub Law, which underwent public consultation in mid-2025, proposes a revolutionary "data embassy" model — allowing foreign entities to operate under their own legal frameworks for certain data activities while physically located in Saudi Arabia. This "sovereign compute" concept positions the Kingdom as a destination where global AI workloads can run under controlled conditions, rather than requiring data to leave Saudi borders.

For AI practitioners, this means that cloud-native architectures must be designed with data residency as a core constraint, not an afterthought. Model training pipelines, inference endpoints, vector databases, and embedding stores all need to respect jurisdictional boundaries. This is architecturally non-trivial, particularly for organisations accustomed to centralised global cloud deployments.

Practical Sovereignty Architecture

In practice, data sovereignty for AI systems requires several architectural commitments:

  • Compute must be provisioned in KSA-resident cloud regions (GCP's Dammam region, AWS's planned KSA region, or Azure's planned KSA availability zones)
  • Encryption keys must be managed within KSA using customer-managed encryption keys (CMEK) stored in KSA-resident key management services
  • Network egress controls must prevent any data — including model weights, embeddings, and inference results derived from classified data — from leaving approved jurisdictional boundaries
  • Data processing agreements must explicitly address the Saudi regulatory context, not merely rely on generic international DPAs
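These commitments can be made machine-checkable at deployment time. A sketch in which the region identifier and config keys are examples only, to be verified against your provider's actual documentation:

```python
# Illustrative deployment-time sovereignty checks. The region name below
# is an example (GCP's Dammam region); verify identifiers and config keys
# against your own cloud provider and deployment tooling.
APPROVED_REGIONS = {"me-central2"}

def validate_sovereignty(config: dict) -> list[str]:
    """Return a list of violations; an empty list means the config passes."""
    violations = []
    if config.get("region") not in APPROVED_REGIONS:
        violations.append(f"region {config.get('region')!r} not KSA-resident")
    if config.get("cmek_key_region") not in APPROVED_REGIONS:
        violations.append("encryption keys must be KSA-resident CMEK")
    if config.get("allow_public_egress", True):   # fail closed if unspecified
        violations.append("network egress to unapproved destinations enabled")
    return violations

ok = validate_sovereignty({"region": "me-central2",
                           "cmek_key_region": "me-central2",
                           "allow_public_egress": False})
bad = validate_sovereignty({"region": "europe-west1",
                            "cmek_key_region": "europe-west1",
                            "allow_public_egress": True})
```

Run as a CI gate, a check like this turns data residency from a policy document into a property no deployment can silently violate.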

Building a Compliant AI Stack: Practical Architecture

Drawing these threads together, an AI system deployed in the Saudi context requires an integrated, six-layer governance stack.

Layer 1 — Data Ingestion handles multi-source data collection through APIs, file drops, email gateways, database synchronisation, and streaming pipelines. At this layer, format normalisation, deduplication, initial language detection, and schema validation occur. The key governance requirement is complete provenance tracking — every data item must be traceable to its source.

Layer 2 — Document Classification categorises incoming data across the four dimensions described above: sensitivity, content type, language, and applicable regulatory regime. The classified output drives routing decisions — determining which data can be processed by which systems, under which conditions, and with which controls.

Layer 3 — PII Redaction applies the appropriate redaction policy based on the classification output. Pseudonymisation for internal analytics, full anonymisation for model training, or no redaction for already-public data. Every decision is logged with confidence scores and policy references.

Layer 4 — Sovereign Processing is where AI inference and training occur within KSA-resident infrastructure. VPC isolation, network egress controls, and KSA-managed encryption keys ensure that no data leaves approved boundaries. This layer must support the full range of AI workloads — from lightweight inference to GPU-intensive fine-tuning — within the sovereignty envelope.

Layer 5 — Audit and Compliance captures every significant action across the stack into a tamper-evident log. Ingestion events, classification decisions, redaction actions, processing requests, model outputs, and access events are all recorded. Automated compliance dashboards provide real-time visibility into regulatory posture, and anomaly detection flags potential breaches before they become incidents.

Layer 6 — Human Governance provides the organisational oversight that ties the technical stack to accountability. This includes the AI Ethics Office (recommended by the AI Adoption Framework), review dashboards for human oversight of edge cases, exception handling workflows, periodic compliance assessments, and the model registry that tracks the lifecycle of every AI system from development through deployment to retirement.

Running vertically across all six layers are the cross-cutting regulatory requirements: PDPL compliance, NDMO standards adherence, data residency enforcement, AI ethics principles, cross-border transfer rules, and (looking ahead) the requirements of the forthcoming AI law.

Figure 5: The six-layer architecture for a fully compliant AI system in Saudi Arabia, with cross-cutting regulatory requirements running vertically across all layers.


Sector-Specific Considerations

Government and Public Sector

Government entities face the most stringent requirements under NDMO standards. All 15 domains must be implemented, and compliance is actively monitored. AI systems processing citizen data must meet the highest classification and redaction standards. Inter-agency data sharing for AI purposes must comply with specific data sharing standards that go beyond generic PDPL requirements.

Healthcare

Healthcare AI systems trigger both PDPL sensitive data provisions (health data is explicitly classified as sensitive) and sector-specific regulations. AI-assisted diagnostics, patient record processing, and clinical decision support systems require the highest redaction standards and must operate exclusively on sovereign infrastructure. The combination of Arabic-language medical records and English-language clinical terminology creates particularly challenging NER and classification requirements.

Financial Services

The Saudi Central Bank (SAMA) imposes additional data governance requirements on financial institutions that layer on top of the PDPL and NDMO standards. AI systems used for credit scoring, fraud detection, and customer analytics must navigate both the national governance framework and sector-specific prudential requirements. The explainability requirements are particularly stringent — financial AI decisions that affect consumers must be explainable in terms that satisfy both SAMA's regulatory expectations and the PDPL's transparency requirements.

Real Estate and Mega-Projects

Saudi Arabia's giga-projects (NEOM, The Red Sea, New Murabba, and others) generate enormous volumes of data — architectural plans, contracts, citizen engagement records, environmental assessments — that must be classified, protected, and governed. AI systems deployed in these contexts must handle multi-party data governance challenges where data ownership may span government entities, international contractors, and private operators, each with different regulatory obligations.

Implementation Roadmap: Where to Start

For organisations beginning their AI governance journey in the Saudi context, we recommend a phased approach:

Phase 1 — Assess and Map (Weeks 1–4): Conduct a comprehensive audit of all data assets that will be touched by AI systems. Map each asset to NDMO classification levels. Identify all PII categories present in your data. Document your current data residency posture — where is your data today, and where is it processed?

Phase 2 — Foundation (Months 2–3): Deploy the document classification engine and PII redaction pipeline. Establish the audit logging infrastructure. Migrate critical workloads to KSA-resident cloud infrastructure. Implement CMEK with KSA-resident key management.

Phase 3 — Integration (Months 3–5): Integrate the governance layers with your existing AI/ML pipelines. Implement automated compliance dashboards. Establish escalation workflows for edge cases. Begin formal compliance documentation.

Phase 4 — Maturity (Months 5–8): Advance toward the "Proficient" level on the AI Adoption Framework's maturity model. Implement proactive monitoring and anomaly detection. Establish periodic compliance review cycles. Build internal capabilities for ongoing governance.

Looking Ahead: The Convergence of AI Innovation and Governance

Saudi Arabia's regulatory approach is notable for its ambition. The Kingdom isn't merely trying to regulate AI — it's trying to build an environment where rigorous governance enables rather than constrains innovation. The SDAIA AI Ethics Principles, the Generative AI Guidelines, the AI Adoption Framework, and the forthcoming AI law collectively articulate a vision where governance is a competitive advantage, not a compliance burden.

The Draft Global AI Hub Law's "data embassy" concept is particularly significant. If enacted, it would create a unique regulatory environment where international AI companies can operate under their own legal frameworks while physically running compute in Saudi Arabia — a proposition designed to attract global AI investment while maintaining the Kingdom's data sovereignty objectives.

For organisations like Bayseian working at the intersection of AI delivery and regulatory compliance, this represents both a challenge and an opportunity. The technical bar is high — building systems that are simultaneously powerful, compliant, auditable, and sovereign is not trivial. But the organisations that invest in these capabilities now are building the foundations for long-term, sustainable AI deployment in one of the world's most dynamic and ambitious markets.

The message is clear: in Saudi Arabia's AI future, governance isn't a constraint on innovation. It's the prerequisite for it.

AI Governance · Saudi Arabia · PDPL · Data Protection · PII Redaction · Compliance
