In recent months, I've attended numerous AI conferences and summits, engaging with experts across the field. A pattern has emerged that's becoming increasingly concerning: while everyone nods in agreement about the importance of AI safety, we rarely progress beyond this superficial acknowledgment to discuss concrete, actionable steps.
It's time for a more nuanced conversation. Yes, this means embracing vulnerability and admitting that we're all learning to navigate these new systems. It means exposing our ideas to criticism and learning from both successes and failures. But most importantly, it means moving from theoretical discussions to practical implementations.
Based on our experience at Data Friendly Space, I want to share three promising practices that can serve as a foundation for meaningful AI safety measures:
Thoughtful Prompt Engineering as a First Line of Defense
Think of prompt engineering as setting guardrails for AI interactions. Just as you would carefully brief a junior professional, we need to be explicit and layered in our instructions to AI systems. If hallucination is a concern, include specific directives against fabricating facts. If bias is a worry, explicitly instruct the system to avoid assumptions about capabilities or roles.
The key shift in thinking is to treat AI as a knowledgeable but inexperienced colleague who needs clear guidance and boundaries. While this isn't a complete solution, it's an essential first step in promoting safer AI interactions.
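To make this concrete, here is a minimal sketch of what layered guardrail instructions can look like, assuming an OpenAI-style chat-message format. The specific directives, briefing text, and function names are illustrative examples, not our exact production prompts.

```python
# A minimal sketch of layered prompt guardrails, assuming an OpenAI-style
# chat-message format. The directives below are illustrative examples only.

GUARDRAIL_DIRECTIVES = [
    # Hallucination guardrail: forbid fabricated facts and require sourcing.
    "Only state facts that appear in the provided context. "
    "If the context does not contain the answer, say 'I don't know'.",
    # Bias guardrail: forbid assumptions about capabilities or roles.
    "Do not make assumptions about a person's capabilities, role, gender, "
    "or background unless explicitly stated in the context.",
    # Scope guardrail: keep the assistant within its briefed task.
    "Answer only questions about humanitarian situation analysis; decline "
    "anything outside that scope.",
]


def build_messages(task_briefing: str, user_question: str, context: str) -> list[dict]:
    """Assemble a chat request with explicit, layered safety instructions."""
    system_prompt = (
        task_briefing
        + "\n\nSafety instructions:\n"
        + "\n".join(f"- {directive}" for directive in GUARDRAIL_DIRECTIVES)
    )
    return [
        {"role": "system", "content": system_prompt},
        {"role": "user", "content": f"Context:\n{context}\n\nQuestion: {user_question}"},
    ]


if __name__ == "__main__":
    messages = build_messages(
        task_briefing="You are an analyst supporting humanitarian field teams.",
        user_question="What are the main displacement trends this quarter?",
        context="(verified situation report excerpts would go here)",
    )
    for message in messages:
        print(message["role"].upper(), "\n", message["content"], "\n")
```

The structure mirrors how you would brief a junior colleague: the task first, then the explicit boundaries, stated one by one rather than assumed.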
Grounding AI in Authoritative Sources
During my work building AI tools to advise humanitarians in the field, it became clear how vital it is to ground AI responses in verified information. By building domain-specific knowledge bases and leveraging vector databases, we can ensure AI draws from authoritative sources rather than generating potentially unreliable information.
When combined with careful prompt engineering and expert oversight, this approach significantly reduces the risk of hallucinations while maintaining the AI's utility as an analytical tool.
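A minimal sketch of this retrieve-then-ground pattern is below. The bag-of-words similarity is only a stand-in for a real embedding model and vector database, and the example documents are invented; the part that matters is the flow: retrieve vetted passages first, then constrain the model to answer only from them.

```python
# A minimal sketch of the retrieve-then-ground pattern. The bag-of-words
# similarity is a stand-in for a real embedding model and vector database;
# the structure (retrieve verified passages, then constrain the model to
# them) is the point of the example.
from collections import Counter
import math

KNOWLEDGE_BASE = [
    # In practice: vetted documents in a domain-specific knowledge base.
    "Situation report: 12,000 households displaced in the northern region.",
    "Cluster assessment: water access in the eastern camps fell below standards.",
    "Partner update: road access to the southern district restored on 14 March.",
]


def _vectorize(text: str) -> Counter:
    return Counter(text.lower().split())


def _cosine(a: Counter, b: Counter) -> float:
    dot = sum(a[t] * b[t] for t in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query: str, k: int = 2) -> list[str]:
    """Return the k knowledge-base passages most similar to the query."""
    query_vec = _vectorize(query)
    ranked = sorted(KNOWLEDGE_BASE, key=lambda doc: _cosine(query_vec, _vectorize(doc)), reverse=True)
    return ranked[:k]


def grounded_prompt(query: str) -> str:
    """Build a prompt that restricts the model to retrieved, verified sources."""
    passages = "\n".join(f"[{i + 1}] {p}" for i, p in enumerate(retrieve(query)))
    return (
        "Answer using ONLY the numbered sources below. Cite the source number "
        "for every claim. If the sources do not answer the question, say so.\n\n"
        f"Sources:\n{passages}\n\nQuestion: {query}"
    )


if __name__ == "__main__":
    print(grounded_prompt("How many households were displaced in the north?"))
```

Swapping the toy retriever for a production vector database changes the plumbing, not the principle: the model only ever reasons over passages that were vetted before it saw them.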
AI Checking AI: The Promise of Multi-Agent Systems
Perhaps the most exciting development is the emergence of AI agent infrastructures, where multiple AI systems work together, each with distinct roles and responsibilities. We're implementing "safety agents" specifically configured to check for bias, verify facts, and monitor for concerning behaviors.
For instance, in our report writing projects, we employ dedicated agents to verify claims against original sources, essentially creating an AI-powered fact-checking system that goes beyond simple link verification.
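The sketch below shows one way such a verification agent can be wired in. The `llm` callable is a placeholder for whichever chat model backs the agents, and the prompt wording and names are illustrative rather than our actual configuration.

```python
# A sketch of a "safety agent" that verifies a draft's claims against the
# original sources. The `llm` callable is a placeholder for whichever chat
# model backs the agents; the prompt text is illustrative only.
from dataclasses import dataclass
from typing import Callable

VERIFIER_PROMPT = (
    "You are a fact-checking agent. For each claim in the draft, state whether "
    "it is SUPPORTED, CONTRADICTED, or NOT FOUND in the sources, and quote the "
    "relevant source text.\n\nSources:\n{sources}\n\nDraft:\n{draft}"
)


@dataclass
class Review:
    draft: str
    verdict: str


def safety_review(llm: Callable[[str], str], draft: str, sources: list[str]) -> Review:
    """Run a dedicated verification agent over a writer agent's draft."""
    prompt = VERIFIER_PROMPT.format(sources="\n".join(sources), draft=draft)
    return Review(draft=draft, verdict=llm(prompt))


if __name__ == "__main__":
    # Stand-in model so the example runs offline; replace with a real
    # chat-completion call in practice.
    def fake_llm(prompt: str) -> str:
        return "Claim 1: SUPPORTED (source 1). Claim 2: NOT FOUND in sources."

    review = safety_review(
        fake_llm,
        draft="12,000 households were displaced; all camps have adequate water.",
        sources=["Situation report: 12,000 households displaced in the northern region."],
    )
    print(review.verdict)
```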
Moving Forward
These approaches aren't perfect solutions; AI safety challenges are enormously complex. However, we must remember that human processes also contain biases and errors. The goal isn't perfection but meaningful progress.
What's crucial now is moving beyond mere declarations about safety's importance. In my professional circles, we've reached consensus on the "why"; it's time to focus on the "how." Just as cybersecurity evolved from theoretical discussions to practical implementations, AI safety needs to make the same transition.
The path forward requires:
- Concrete, implementable safety measures
- Open discussion of successes and failures
- Collaborative learning and iteration
- A focus on practical solutions over theoretical perfection
Let's shift the conversation from acknowledging the need for AI safety to actively implementing it. The technology is here, and it's evolving rapidly. Our safety practices need to keep pace.