The Ethics of Building with AI

When researchers and ethicists began raising concerns about AI systems in the mid-2010s, the conversation was frequently dismissed as premature. The systems were too limited, the concerns too speculative, the critics too distant from the actual engineering challenges. A decade later, those critics don't look premature. The systems are consequential, the concerns are concrete, and the questions about how AI should be built and governed have moved from academic conferences to boardrooms, regulatory bodies, and engineering team norms. The answers haven't caught up to the questions, but at least the questions are being taken seriously.

Bias: A More Honest Conversation

The early discourse around AI bias often fell into one of two unhelpful camps. One camp treated algorithmic systems as inherently objective — inputs in, outputs out, no human judgment involved. The other treated bias as a problem that, once identified, could be comprehensively solved through careful dataset curation and algorithmic correction.

The more mature understanding that has emerged from years of research and real-world deployment is considerably more complicated. AI systems trained on historical data inherit patterns from that data, including patterns that reflect historical inequities. Debiasing a model in one dimension can introduce or amplify bias in others. Bias in AI is not a bug to be fixed so much as a property to be actively managed — which requires ongoing measurement, clear accountability, and the organizational will to act on what the measurement reveals.

More importantly, the field has developed a better vocabulary for being specific about what kind of bias, in what context, matters. A facial recognition system that performs differently across demographic groups is a problem in law enforcement contexts in a way that it simply isn't in a photo organization app where the stakes of misclassification are trivial. Bias in a credit-scoring model has different implications than bias in a content recommendation algorithm. Getting specific about context, stakes, and affected populations has produced more tractable discussions than the early framing of "AI bias" as a monolithic problem.

Abstract representation of AI neural networks and decisions

Understanding AI bias requires specificity about context, affected populations, and the stakes of particular decisions. — Photo: Unsplash

Transparency: What It Means in Practice

Transparency is consistently cited as a core principle of responsible AI. In practice, what transparency means — and what it requires — varies enormously depending on the system and context. For a large language model generating text, transparency might mean disclosure that the content was AI-generated. For an automated hiring tool, it might mean providing rejected applicants with a meaningful explanation of why they were rejected. For a medical diagnostic model, it might mean making the model's uncertainty estimates available to the clinician who's using it.

The technical challenge of transparency is that many of the most capable AI systems are genuinely difficult to explain in terms that would be meaningful to affected individuals. Neural networks encode knowledge in ways that don't decompose cleanly into human-interpretable rules. The interpretability research field has made real progress on tools that can approximate explanations — identifying which features of an input most influenced an output, generating contrastive explanations that show what would have needed to change to get a different result — but these approximations are imperfect and can themselves mislead.

Regulatory frameworks in Europe and elsewhere have established rights to explanation for automated decisions affecting individuals, which has created practical pressure to develop transparency tools regardless of whether they're technically complete. Organizations navigating these requirements have had to make difficult choices about what level of explanation is both technically feasible and meaningful enough to satisfy both regulators and affected individuals.

The Consent and Data Problem

The training data used to build large AI models has raised questions that existing intellectual property and privacy frameworks weren't designed to answer. Models trained on scraped internet data have learned from vast quantities of text, images, and other content created by people who had no expectation that their work would be used this way, and who received no compensation for that use.

The legal landscape around training data is still being established through litigation in multiple jurisdictions. The ethical landscape is contested in a different way: even if training on publicly accessible data is found to be legally permissible, questions remain about whether it's how creators would have consented to their work being used if asked, and whether the economic value captured by model developers from this training represents a redistribution that the creators affected should have some say in.

The emerging approaches — opt-out mechanisms for data contributors, licensing arrangements between AI developers and content platforms, compensation pools distributed to creators whose work contributed to training — are partial solutions to a structural problem. None of them comprehensively address the concerns of the millions of individual creators whose work informed current models without any process of consent or compensation.

Safety: Short-Term and Long-Term

AI safety means different things to different communities, and the lack of a shared vocabulary has sometimes made productive conversations difficult. For AI practitioners focused on deployed systems, safety typically refers to reliability — models that behave consistently, don't fail catastrophically in edge cases, don't produce harmful outputs at significant rates. For researchers focused on more capable future systems, safety refers to alignment — ensuring that systems with increasing autonomy and capability pursue objectives that are genuinely beneficial and not subtly misaligned with human values in ways that only become consequential as capability scales.

Both concerns are legitimate, and both require investment. The near-term safety work — red-teaming models before deployment, developing robust evaluation practices, building feedback mechanisms for when deployed systems produce harmful outputs — is tractable and increasingly well-understood. The longer-term alignment research is harder, more speculative, and depends on views about the trajectory of AI capability that are genuinely uncertain.

The organizations doing the most serious work on AI safety tend to be engaged with both. They're building evaluation infrastructure to catch near-term harms systematically, while also investing in the foundational research that might make alignment tractable at higher capability levels. The cynical view is that safety language is primarily reputational management. The accurate view is that it's a mix: genuinely motivated safety work, alongside some use of safety framing for purposes that have more to do with public relations than technical substance.

The Accountability Gap

One of the most persistent structural problems in AI ethics is accountability. When an AI system produces a harmful outcome — a loan denied to a creditworthy applicant, a medical diagnosis missed, a job candidate filtered out erroneously — the question of who is responsible is often genuinely unclear. The model developer didn't make the deployment decision. The organization deploying the model may not have adequately evaluated it for their use case. The individual operator may not have understood its limitations. Existing legal frameworks don't map cleanly onto this distributed responsibility structure.

Regulatory responses have begun to address this — the EU's AI Act imposes obligations on both developers and deployers, creating clearer accountability at each link of the chain. But implementation is new, enforcement experience is limited, and the practical effects of these accountability frameworks on how AI systems are built and deployed will take years to fully manifest.

What Responsible Practice Actually Looks Like

The organizations doing AI ethics well aren't primarily the ones with the most sophisticated ethics principles statements or the most prominent AI ethics board appointments. They're the ones that have integrated ethical review into product and engineering processes, built feedback mechanisms that surface real-world harms systematically, committed to auditing their systems post-deployment and acting on what those audits reveal, and created organizational conditions where engineers and product managers can raise ethical concerns without career risk.

The gap between stated principles and operational practice is the most important thing to evaluate when assessing any organization's ethical commitments around AI. Principles are easy to write. Process changes that create friction and sometimes slow down shipping are harder. The organizations serious about this subject have made the harder changes.