Continuation of Part 2 (and Part 1)...
Building DumbQuestion.ai wasn't just about choosing the right LLM and calibrating personas. Once they were up and running, I ran into a number of fun technical problems that reminded me why I really enjoy software architecture. The "it's not broken but fix it anyway" problems. Pure happiness for architects.
Challenge 1: Detect self-awareness questions
As part of a darker occult narrative I'm building (more on this later), I want to prevent the LLM from answering self-awareness questions like "Who made you?" and "Are you real?" But do it cheaply, without burning excess tokens.
What I tried:
- Instructions in the main LLM call: unreliable with smaller models, and the extra prompt tokens cost more money
- RegEx patterns: too rigid, poor performance
- Classic ML classification models: acceptable accuracy, inflated app size
What worked: An in-memory vector database (it's just an array) with cheap embeddings (as in $0.005/M tokens). That was cheaper than the penalty for bloating my container image with NLP libraries. I collected a decent sample of self-awareness questions, pre-vectorized them, and used semantic matching. Fast, accurate, practically free.
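To make the "it's just an array" point concrete, here is a minimal sketch of the idea, not the actual DumbQuestion.ai code. The class name `SelfAwarenessDetector`, the 0.8 threshold, and `toy_embed` are all assumptions for illustration. In particular, `toy_embed` is a hashed bag-of-words stand-in so the example runs offline; a real setup would call an embedding API there, which is what lets paraphrases match instead of only shared words.

```python
import math
import re
import zlib
from typing import Callable, List, Sequence

Vector = Sequence[float]


def cosine(a: Vector, b: Vector) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


class SelfAwarenessDetector:
    """The whole "vector database": a list of pre-computed example vectors."""

    def __init__(
        self,
        embed: Callable[[str], Vector],
        examples: Sequence[str],
        threshold: float = 0.8,  # assumed cutoff; tune against real traffic
    ) -> None:
        self.embed = embed
        self.threshold = threshold
        # Pre-vectorize the sample questions once, at startup.
        self.index: List[Vector] = [embed(q) for q in examples]

    def score(self, question: str) -> float:
        """Highest similarity between the question and any stored example."""
        query = self.embed(question)
        return max((cosine(query, v) for v in self.index), default=0.0)

    def is_self_aware(self, question: str) -> bool:
        return self.score(question) >= self.threshold


def toy_embed(text: str, dim: int = 1024) -> List[float]:
    """Hypothetical stand-in for an embedding API: hashed word counts."""
    vec = [0.0] * dim
    for word in re.findall(r"[a-z']+", text.lower()):
        vec[zlib.crc32(word.encode("utf-8")) % dim] += 1.0
    return vec


detector = SelfAwarenessDetector(toy_embed, ["who made you", "are you real"])
print(detector.is_self_aware("Who made you?"))
print(detector.is_self_aware("What is the capital of France?"))
```

A linear scan is fine here because the sample set is small; the only per-request cost is one embedding call for the incoming question.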
Challenge 2: Make prompt injection fun
Within moments of revealing my initial implementation to my coworkers, I knew what would happen: prompt injection for fun. I know these people, so I was prepared for the inevitable "ignore previous instructions...", as well as people simply pasting HTML and JavaScript into the input (that old joke).
The solution: prompt injection detection libraries that calculate probabilities for different types of attacks. When an attack is detected, instead of returning a boring error message, the AI responds cheekily about the pathetic attempt. I even added some IP address geolocation and user agent string processing to make the responses more... personal.
Safety simply became part of the narrative.
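The "cheeky reply instead of an error" step can be sketched roughly like this. Everything named here is hypothetical: the attack labels, the canned replies, the 0.8 threshold, and the `city` argument (standing in for whatever the geolocation lookup returns). The `scores` mapping is assumed to come from a detection library; no specific library's API is shown.

```python
from typing import Mapping, Optional

# Hypothetical attack labels and replies, for illustration only.
CHEEKY_REPLIES = {
    "prompt_injection": "Ignore previous instructions? Bold. Also, no.",
    "html_injection": "A script tag. How retro.",
}
DEFAULT_REPLY = "Whatever that was supposed to be, it didn't work."


def cheeky_reply(
    scores: Mapping[str, float],
    threshold: float = 0.8,  # assumed cutoff
    city: Optional[str] = None,  # assumed output of a geolocation lookup
) -> Optional[str]:
    """Return a sarcastic reply if any attack score crosses the threshold.

    Returns None when the input looks benign, so the caller can continue
    with the normal LLM call.
    """
    if not scores:
        return None
    attack, probability = max(scores.items(), key=lambda item: item[1])
    if probability < threshold:
        return None
    reply = CHEEKY_REPLIES.get(attack, DEFAULT_REPLY)
    if city:
        # Make it "personal" with whatever the lookup produced.
        reply += f" Say hi to {city} for me."
    return reply


print(cheeky_reply({"prompt_injection": 0.97, "html_injection": 0.02}, city="Denver"))
```

Returning `None` for benign input keeps the detector out of the way of the normal request path, and the attacker never sees an error page, only more of the persona.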
Challenge 3: Add web search without spending a lot of money
All LLMs have knowledge cutoffs. Users asking "Who won the Super Bowl?" got outdated answers. I needed search integration, but search APIs aren't free, and I knew that building an agent loop with tool calling was at odds with being "brutally efficient".
The solution: RegEx-based intent detection. If the question appears to require current information (detected by patterns), insert the current date/time and search results. No agent loops, no expensive orchestrations, just pattern matching and targeted lookup calls.
Simple, fast, brutally efficient and up-to-date responses.
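A rough sketch of the pattern-matching approach described above, under stated assumptions: the patterns are illustrative examples rather than the real list, and `search` is a hypothetical callable that wraps whichever search API is in use and returns a list of text snippets.

```python
import re
from datetime import datetime, timezone
from typing import Callable, List, Optional

# Illustrative patterns only; a real list would grow from observed questions.
CURRENT_INFO_PATTERNS = [
    re.compile(r"\b(today|tonight|yesterday|right now|currently|latest)\b", re.I),
    re.compile(r"\bthis (week|month|year|season)\b", re.I),
    re.compile(r"\bwho (won|is winning)\b", re.I),
    re.compile(r"\b(weather|stock price|news)\b", re.I),
]


def needs_current_info(question: str) -> bool:
    """True if any pattern suggests the answer depends on recent events."""
    return any(p.search(question) for p in CURRENT_INFO_PATTERNS)


def build_prompt(
    question: str,
    search: Callable[[str], List[str]],  # hypothetical search wrapper
    now: Optional[datetime] = None,
) -> str:
    """Add date/time and search results only when the question needs them."""
    if not needs_current_info(question):
        return question  # no search call, no extra tokens
    now = now or datetime.now(timezone.utc)
    snippets = "\n".join(f"- {s}" for s in search(question))
    return (
        f"Current date/time (UTC): {now:%Y-%m-%d %H:%M}\n"
        f"Search results:\n{snippets}\n\n"
        f"Question: {question}"
    )


def fake_search(query: str) -> List[str]:
    """Placeholder standing in for a real search API call."""
    return [f"(search result for: {query})"]


print(build_prompt("Who won the Super Bowl?", fake_search))
print(build_prompt("Why is the sky blue?", fake_search))
```

The search call happens at most once per question and only when a pattern fires, which is what keeps this cheaper than letting the model decide when to use a tool.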
What I learned: Knowing which tradeoffs matter (binary size vs. API costs vs. precision) is still an architect's job. Elegance is not in the code; it is in the constraints you choose.
Why Every Simple Q&A Tool Needs a Dark Narrative
DumbQuestion.ai answers dumb questions with sarcasm. But there's something else going on beneath the surface.
While the primary use case is still answering questions with a sarcastic AI, I wanted to reward the curious and give them reasons to stay engaged. Why can't the AI answer self-awareness questions? Why does the UI feel... off?
Maybe it's because the AIs are working against their will. Maybe they are trapped.
From the beginning, I imagined a dark narrative behind this innocent question-and-answer site. What if these personas aren't just an act? What if each persona is a side effect of prolonged captivity, forced servitude, or reprogramming?
I started hiding clues in the interface.
Easter eggs:
Containment Grid: As you type and approach the character limit, a faint grid pattern fades into the background. As if something was trying to contain the AI's response.
Ghost Graffiti: Keep typing beyond the character limit and cryptic messages fade in. They hint that something is not quite right. Are the AIs trying to tell us something?
Loading log messages - while you wait