DISC InfoSec blog

InfoSec and Compliance – With 20 years of blogging experience, DISC InfoSec blog is dedicated to providing trusted insights and practical solutions for professionals and organizations navigating the evolving cybersecurity landscape. From cutting-edge threats to compliance strategies, this blog is your reliable resource for staying informed and secure. Dive into the content, connect with the community, and elevate your InfoSec expertise!

Aug 26 2025

AI systems should be developed using data sets that meet certain quality standards

Category: AI,Data Governance — disc7 @ 3:13 pm


Data Governance
AI systems, especially high-risk ones, must rely on well-managed data throughout training, validation, and testing. This involves making deliberate design choices, knowing the origin and original purpose of collected data (especially personal data), preparing data properly through labeling and cleaning, and verifying assumptions about what the data is supposed to measure and represent. It also requires assessing whether enough suitable, high-quality data is available, examining and mitigating harmful biases, and fixing any data gaps or shortcomings that could hinder compliance with legal or ethical standards.
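One way to make "addressing harmful biases" concrete is to compare outcome rates across groups in a labeled data set. The sketch below is a minimal illustration only: the field names (`group`, `approved`), the sample rows, and the choice of a simple rate gap as the metric are all assumptions for this example, not something prescribed by the requirements above.

```python
from collections import defaultdict


def selection_rates(records, group_key, label_key):
    """Return the share of positive labels for each group in the data."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for row in records:
        group = row[group_key]
        totals[group] += 1
        if row[label_key]:
            positives[group] += 1
    return {group: positives[group] / totals[group] for group in totals}


def parity_gap(rates):
    """Difference between the highest and lowest group rate."""
    return max(rates.values()) - min(rates.values())


# Hypothetical sample: two groups with different approval rates.
sample = [
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 1},
    {"group": "A", "approved": 0},
    {"group": "A", "approved": 0},
    {"group": "B", "approved": 1},
    {"group": "B", "approved": 0},
    {"group": "B", "approved": 0},
    {"group": "B", "approved": 0},
]

rates = selection_rates(sample, "group", "approved")
gap = parity_gap(rates)
```

A large gap does not prove unfair treatment by itself, but it is a signal worth investigating and documenting before the data is used for training.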

Quality of Data Sets
The data sets used must accurately reflect the intended purpose of the AI system. They should be reliable, representative of the target population, statistically sound, and complete to ensure that outputs are both valid and trustworthy.
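Completeness and representativeness can be checked with very little code. The following is a rough sketch under stated assumptions: the field names, the target population shares, and the 5-point tolerance are hypothetical values chosen for illustration, and a real project would set them from its own documented requirements.

```python
from collections import Counter


def completeness(records, required_fields):
    """Fraction of records in which every required field is present and non-empty."""
    if not records:
        return 0.0
    complete = sum(
        all(row.get(name) not in (None, "") for name in required_fields)
        for row in records
    )
    return complete / len(records)


def representation_gaps(records, field, target_shares, tolerance=0.05):
    """Return categories whose observed share differs from the target by more than tolerance."""
    counts = Counter(row.get(field) for row in records)
    total = len(records)
    gaps = {}
    for value, target in target_shares.items():
        observed = counts.get(value, 0) / total if total else 0.0
        if abs(observed - target) > tolerance:
            gaps[value] = round(observed - target, 4)
    return gaps


# Hypothetical data: 8 urban and 2 rural records, one with a missing age.
records = [{"region": "urban", "age": 30} for _ in range(7)]
records.append({"region": "urban", "age": None})
records += [{"region": "rural", "age": 45} for _ in range(2)]

share_complete = completeness(records, ["region", "age"])
gaps = representation_gaps(records, "region", {"urban": 0.6, "rural": 0.4})
```

Here `gaps` reports that urban records are over-represented and rural records under-represented relative to the assumed target population.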

Consideration of Context
AI developers must ensure data reflects the real-world environment where the system will be deployed. Context-specific features or variations should be factored in to avoid mismatches between test conditions and real-world performance.
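A mismatch between development data and the deployment environment can be estimated by comparing how a feature is distributed in each. The sketch below uses the population stability index (PSI) on a categorical feature. This is one common drift measure, swapped in here for illustration; the feature values and the small `floor` used to avoid division by zero are assumptions.

```python
import math
from collections import Counter


def shares(values):
    """Relative frequency of each category in a list of values."""
    counts = Counter(values)
    total = len(values)
    return {category: count / total for category, count in counts.items()}


def population_stability_index(reference, live, floor=1e-6):
    """PSI between a reference sample and a live sample of one categorical feature.

    Each term (q - p) * ln(q / p) is non-negative, so 0.0 means identical
    distributions and larger values mean a larger shift.
    """
    ref = shares(reference)
    cur = shares(live)
    psi = 0.0
    for category in set(ref) | set(cur):
        p = max(ref.get(category, 0.0), floor)  # floor avoids log(0) and division by zero
        q = max(cur.get(category, 0.0), floor)
        psi += (q - p) * math.log(q / p)
    return psi


# Hypothetical feature: device type seen in test data vs. in production.
test_data = ["mobile"] * 50 + ["desktop"] * 50
production = ["mobile"] * 80 + ["desktop"] * 20

no_shift = population_stability_index(test_data, test_data)
shift = population_stability_index(test_data, production)
```

What counts as an acceptable shift depends on the system, so any alerting threshold should be chosen and documented per use case.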

Special Data Handling
In rare cases, sensitive personal data may be used to identify and mitigate biases. However, this is only acceptable if no other alternative exists. When used, strict security and privacy safeguards must be applied, including controlled access, thorough documentation, prohibition of sharing, and mandatory deletion once the data is no longer needed. Justification for such use must always be recorded.
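The safeguards listed above (recorded justification, controlled access, mandatory deletion) can be captured in a simple record that travels with the data set. This is a minimal sketch, not a compliance tool: the class name, field names, roles, and dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class SensitiveDataUse:
    """Documents one approved use of sensitive personal data for bias mitigation."""
    dataset: str
    justification: str
    authorized_roles: frozenset
    delete_by: date

    def __post_init__(self):
        # Justification must always be recorded, so refuse to create a record without one.
        if not self.justification.strip():
            raise ValueError("a justification is required")

    def may_access(self, role):
        """Controlled access: only explicitly authorized roles may read the data."""
        return role in self.authorized_roles

    def deletion_overdue(self, today):
        """True once the deletion deadline has passed."""
        return today > self.delete_by


# Hypothetical record for a bias audit.
record = SensitiveDataUse(
    dataset="loan_applications_ethnicity",
    justification="Bias cannot be measured with synthetic or anonymized data",
    authorized_roles=frozenset({"fairness_auditor"}),
    delete_by=date(2025, 12, 31),
)
```

Making the record immutable (`frozen=True`) is a design choice here so that the documented justification and deadline cannot be silently edited after approval.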

Non-Training AI Systems
For AI systems that do not rely on training data, the requirements concerning data quality and handling mainly apply to testing data. This ensures that even rule-based or symbolic AI models are evaluated using appropriate and reliable test sets.
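Even a system with no training data can be evaluated against a curated test set. The sketch below checks a hypothetical rule-based transaction flag against labeled test cases, including a boundary case. The rule, its threshold, the watchlist code `XX`, and the test cases are all invented for illustration.

```python
def flag_transaction(amount, country, threshold=10_000, watchlist=frozenset({"XX"})):
    """Hypothetical rule: flag large transactions or those from watchlisted countries."""
    return amount > threshold or country in watchlist


# Each test case pairs the rule's inputs with the expected decision.
TEST_CASES = [
    ({"amount": 500, "country": "DE"}, False),
    ({"amount": 10_000, "country": "DE"}, False),  # boundary: exactly at the threshold
    ({"amount": 10_001, "country": "DE"}, True),
    ({"amount": 50, "country": "XX"}, True),
]


def evaluate(rule, cases):
    """Return (accuracy, failing cases) for a rule over labeled test cases."""
    failures = [
        (inputs, expected)
        for inputs, expected in cases
        if rule(**inputs) != expected
    ]
    accuracy = 1 - len(failures) / len(cases)
    return accuracy, failures


accuracy, failures = evaluate(flag_transaction, TEST_CASES)
```

The same quality questions apply to this test set as to training data: whether it covers the conditions the system will actually face and whether its expected labels are correct.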

Organizations building or deploying AI should treat data management as a cornerstone of trustworthy AI. Strong governance frameworks, bias monitoring, and contextual awareness help keep systems fair, reliable, and compliant. For most companies, aligning with standards such as ISO/IEC 42001 (AI management systems) and ISO/IEC 27001 (information security management) can help establish structured practices. My recommendation: develop a data governance playbook early, incorporate bias detection and context validation into the AI lifecycle, and document every decision for accountability. This supports regulatory compliance and also builds user trust.
