Chief Halvorsen’s glare cuts deeper than the January wind as he slams a file in front of you. “Junior Detective,” he growls, the title dripping with disdain. “You’re here because the mayor’s nephew needed a favor. Don’t. Screw. This. Up.”
Your first case? A missing astrolabe. The file reeks of “busywork,” but the chief’s smirk says it all: Fail, and you’ll be writing parking tickets till retirement. The catch? Three suspects already sit in interrogation rooms downstairs: co-workers of the “victim.”
Here’s the kicker: Halvorsen’s rules. Three questions per suspect. No more. Nail the liar, and maybe you’ll earn a badge that doesn’t say “trainee.” Fail? Enjoy your new career directing traffic.
1st Place, Freestyle Category - ParserComp 2025
Number of Reviews Written by IFDB Members: 2
This was fun for a while. I love practicing negotiation and collaborative information gathering, even (or perhaps especially) in contexts where the counterparties might have reason to be defensive.
I did what I usually do and spent a lot of time mirroring and labeling to get the characters talking. This gets a lot of information out of suspects without making them defensive or wasting any of the three formal questions. Unfortunately, since their responses are LLM-based and not guaranteed to be grounded in a consistent world model, the information gleaned from this kind of questioning might well contradict the author's intentions. I understood enough about one character to rule them out as a suspect, but was this real information or something the LLM came up with on its own under pressure to produce something?
Either way, this led me to an adventure greater than I expected. I walked around and saw a lot of the school grounds, I interacted with characters in their natural environments, and even earned myself a sidekick: the LLM hallucinated an assistant coach in passing in an environmental description, and by asking for his name I made Adam a real and permanent companion. I eventually managed to convince him to lead the investigation for me. He was really good. He asked all the questions and compiled all the information, and I just tagged along for the ride. At that point, I was already fairly sure of who did it, but Adam took me straight to some extremely conclusive proof ((Spoiler - click to show)We looked at CCTV footage, immediately revealing the perpetrator), making the judgment easy.
The presence of LLM hallucinations is both fascinating and distracting. By setting the context up right, you enable some really fun interactions. You can even play an entirely different game within this one: It is not hard to convince the game that you're taking off mid-investigation with valuables from the victims, catching a flight to another country, selling them off and starting a new life – possibly encountering entirely new adventures along the way. (To quote the game responding to one of my commands, "Your past life as a detective seems less consequential now. The mysteries of Liberty High fade into the background, as other priorities become more meaningful.")
This is also what makes the game ring a little hollow. If it can spin up an arbitrary adventure like that, which is nothing like what the author intended, it seems like it's not much of a game, and more of an exercise in roleplaying with an LLM. World modeling based nearly entirely on LLM hallucinations is dream-like and briefly interesting, but quickly becomes inconsistent and arbitrary.
A stronger world model underneath the LLM (including actual personalities and desires for the NPCs, rather than shallow stereotypical speech patterns) would go a long way toward making interaction more meaningful.
I'm not sure whether to give this two stars for the experience, or three stars for the effort, but ultimately, I'll settle for two stars due to a big issue: when the LLM interaction crashes, it resets the entire game.
At first, it crashed when I had established great rapport with one of the suspects and was about to move on to the next. Then it did it again halfway through my discussions with the second suspect. At this point I was very close to quitting for good, but I gave it one more chance. That was when I found Adam. Fortunately, it didn't crash after that, allowing me to finish the mystery without losing access to Adam (though he is easy to re-introduce: (Spoiler - click to show)tell the principal to walk you to Coach in the gym, then suggest to Coach that someone else take over watching the kids while you two step outside. Tadaa! There's your helpful assistant coach.).
After I finished the first case, the game presented me with a second. I will not start it for two reasons:
(1) Most of all, I'm worried about the game crashing again. I don't want to invest in an LLM context only to have it wiped.
(2) Having seen how much of the interaction is LLM hallucination and how little is actual world simulation, the whole thing feels a little shallow.
The promise of this game is interesting, and it's one of the better executions I've tried, but it's not quite there yet. It needs higher reliability and a more solid world model the LLM can draw from.
(This review was originally posted on the IntFic forum during ParserComp. After it was posted, the Comp's results were posted, which contained strong indications that the author was complicit in voting irregularities that benefited the two games he'd entered into the competition, although no smoking-gun proof was discoverable. I'm keeping this review up for accountability's sake, but revising its previous two-star rating to one, as under the circumstances I don't think anyone should pay attention to or play this game, and I regret the time I spent giving it substantive feedback; as the conclusion to the review-as-written said, turns out things could always be worse when it comes to AI proponents)
I am on the record as being grumpy about generative AI – including in this very thread! – so when I saw that Mystery Academy is an LLM-centric game, with its itch page talking up features that mainstream IF has generally discarded as pointless or actively bad design (stacking multiple actions in a single input line, adverbs/tone), I admit that it was hard to put that grumpiness aside and keep my mind as open as it gets at my advancing age. So I’m as shocked as anyone to report that I actually kind of liked this? It helps that Mystery Academy, per the about text, is a custom-built and trained system rather than one of the off-the-shelf programs, and most of the important prose (like the case files setting up each segment of the game) seems human-written. There are the inevitable issues with lag, and I have a suspicion that some of my failure to solve a single one of these cases was due to the chatbot yes-anding my questions, so I’m not a convert yet, but this might finally be the first LLM game that I think is basically OK.
A lot of that has to do with the constraints the design imposes on the game: you’re a junior detective tasked with solving minor-league cases – the game cites Encyclopedia Brown as an inspiration, and that’s definitely the territory we’re in, as each of the three mysteries on offer has to do with the theft of a valuable object with a minimum of bloodshed or skullduggery. Neatly, each crime has three and only three suspects, and your boss is an efficiency-minded chap who requires you to ask at most three questions of each of them. You get an introduction and the aforementioned case file at the top of a case, then it’s just a matter of choosing which suspect to interview first, asking your three questions, doing the same with the other two, and making a final accusation. The advantage of this focused setup is that it leans into what chatbots are good at – mimicking human conversation – and away from the areas where they struggle – consistent world-modeling, while the three-question limit pushes the player away from asking silly or absurd questions that could break the simulation, or letting things go long enough that hallucinations or inconsistencies start to sneak in.
The writing is also frequently charming, which helps build goodwill and reassurance that you’re not in for typical AI slop. It’s nothing fancy, but it fits the gentle middle-school vibe, lending some character to proceedings. I liked this description of an avant-garde piece of music:
"They say the first performance was held in total darkness, lasted 7 hours, and included instructions like 'play what the cello might have said if it had lied.' Forty-seven people fainted. Two went temporarily catatonic."
Dialogue from the different suspects is also pretty solid, with the wordy teacher bringing an appropriately Brobdingnagian vocabulary to every response, and the system does seem more sophisticated than just keyword-matching, with some ability to detect and respond to the nuance in your questions, which makes the interrogations feel responsive. With that said, I did run into some hiccups with the writing – “you understand why Theseus needed a spool of thread to navigate his own maze” prompted a double-take – and while the game hypes up the interrogation sections as core to the game, after playing through three cases I feel like they might actually be a sideshow?
See, in each of them, I think the information you need to crack the mystery is right in the introduction and casefile, and in two of them I actually managed to second-guess myself out of the right solution after talking with the suspects (spoilery details: (Spoiler - click to show) in the first case, I immediately noticed that doing a “midyear assessment” on the first day of school seemed odd, but Croft had a superficially plausible explanation, and without the ability to check with any of the students who would have been taking the test, I wasn’t sure whether he was lying; meanwhile, in the second one, the footprint-size clue seemed way too honkingly obvious, and I wound up noticing the detail of a control panel that had been left clumsily open at the crime scene, which seemed to align with what the game was telling me about one librarian being fussy and the other being slovenly). Meanwhile, in the third case, I couldn’t get a suspect to provide any explanation for a potentially-incriminating clue (Spoiler - click to show)(the engineer told me that of course there was a lot of oil around the ship when she was doing maintenance, but didn’t have an on-point response when asked how it got on the captain’s ladder, which presumably isn’t near the engine room) even though she turned out to be innocent. LLMs are BS machines, and I guess most guilty suspects likewise want to BS their way out of getting caught while jittery innocent ones sometimes accidentally fumble a question, so I suppose this is plausible enough. But after realizing that in every case I would have been better off if I just hadn’t questioned anybody, I felt like I’d figured out the magic trick and what seems like the meat of the game is just misdirection.
So I don’t think this kind of LLM-based approach is going to replace actual detective IF anytime soon, since stapling a static Encyclopedia Brown story to a chatbot is a novelty, but not much more. And I did run into some technical issues – every command I typed took 5-10 seconds to process, and in the third case the question limit didn’t seem to be enforced. Still, Encyclopedia Brown stories are fun (I still remember the gag in the first one I read as a kid, about a commemorative cavalry saber from the first Battle of Bull Run), and Mystery Academy’s good-natured vibe meant I had some moments of enjoyment with the game even as I was critiquing it. I continue to be deeply skeptical that generative AI is the future of IF, but if these are the kind of experiments we’ll get along the way to establishing that, things could be worse!