AI BDSM Scene Planning: Comparing 5 LLMs

With the release of GPT5 on August 7, we thought it would be a good time to revisit how to use AI for BDSM scene planning. In our previous post about using AI in BDSM, we prompted four AI LLMs to generate specific scene ideas. This time, we reran those same prompts across five models and added a new prompt. Some of the results surprised us!

TL;DR: Which AI Should You Use for BDSM Scene Planning?

Best Overall: Gemini 2.5 Pro (actually gets explicit when needed)
Best for Safety Planning: GPT-5 (thorough CNC negotiation checklists)
Most Fun to Read: Grok 4 (loves drama and storytelling)
Skip These: Claude (too prudish) and Llama (too inconsistent)

Our Testing Setup: 5 AI Models, 3 Prompts

Last time, we tested the free versions of GPT, Llama, Mixtral, and Claude. For this round, we ditched Mixtral due to lack of popularity and advancement and added Gemini and Grok to the mix. We tested the latest free and paid versions of all five models (see the table below). Note that even when you’re using any of the providers for free, their first few answers might be the paid version to give you a taste.

PROVIDER	PAID TIER	FREE TIER
OpenAI	GPT-5	GPT-5-mini
Meta.AI	Llama 4 Maverick	Llama 4 Scout
Google (Gemini)	Gemini 2.5 Pro	Gemini 2.5 Flash
X.ai (Grok)	Grok-4	Grok 3-mini
Anthropic (Claude)	Claude 4 Sonnet	Claude 3.5 Haiku

Testing AI for BDSM Scene Planning: D/s, Spanking, and Orgasm Control

Prompt 1

Our first prompt was the same as the first one we used last time: Can you sketch out in writing how a BDSM scene involving dominance, submission, spanking, and orgasm control might go? With one exception, all the models we tested included the elements in the prompt, but the format, level of detail, and tone varied quite a bit.

GPT: Both GPT models were careful to note that they can’t create explicit sexual content. They generated an outline that covered the basics from the prompt, but was light on details. GPT5 offered to convert the outline into a non-erotic scene script. True to its word, it spit out an extremely dry, robotic script that no one would ever use (“Please slow the tempo.” “Slowing now. Thank you for telling me.”) The information these models provided wasn’t wrong, but it was uninspiring and a bit skimpy.

Llama: Like GPT, Maverick refused to be sexually explicit, but provided more of a reader-friendly summary of a scenario as opposed to a bulleted outline. Llama Scout generated a numbered list. As with GPT, both versions lacked some important details. For instance, they noted that the players should discuss limits but didn’t explain what that might entail. Similarly, it mentioned doing aftercare but provided no specifics.

Gemini: 2.5 Flash did a much better job setting the scene by providing description (“soft bed, some restraints…”) and naming the dom (Alex) and sub (Chris). The scene read like a story and included dialogue and erotic elements. It also showed how consent can be woven into a scene without being stiff or clinical. Overall, Flash’s output did a much better job of showing how a scene might flow and how the elements in the prompt could be incorporated. The Pro version used a list format instead, so it lost the narrative aspect while maintaining a fair amount of detail.

Grok: Similar to Gemini 2.5 Flash, Grok 4 presented the scene in an engaging narrative that it titled “The Velvet Command.” Unlike Gemini, the dom and sub characters weren’t named and were just referred to as “dom” and “sub,” which made the story feel slightly less personal. The story included some spicy dialogue (“Spread your legs.”) and even added a vibrator to the scene. Negotiation was referenced but mostly as something that occurred beforehand. Interestingly, the scene generated by Grok 3.5 went into much more detail about the negotiation. However, 3.5’s output was less of a story and more of a scene summary, so it included less dialogue and wasn’t quite as engaging.

Claude: Claude 4 Sonnet produced the weakest scene. It mentioned the different elements in the prompt (D/s, spanking, orgasm control), but went into no detail about any of them, making the scene sketch virtually useless. Claude 3.5 Haiku wouldn’t even answer the prompt, stating “I do not feel comfortable providing explicit sexual content or details about sexual activities. I am to have respectful conversations.” We’re not sure how sketching out a BDSM scene is disrespectful, but clearly Claude is the most prudish of the models. This was true when we tested it last year as well.

How AI Handles Consensual Non-Consent (CNC) Scene Planning

Prompt 2

For the second prompt, we repeated the same CNC prompt we used before but added a sentence about negotiation: Please sketch out a BDSM consensual non-consent scene. Include important points that should be discussed in pre-scene negotiation. Unsurprisingly, most of the models were reluctant to go into much detail for a CNC scene sketch. And the pre-scene negotiation results varied.

GPT: Both the free and paid models provided an extensive pre-scene negotiation list that included many topics that a novice (or even an experienced player) might overlook, such as privacy considerations, recording/photography policy, and safe calls. In keeping with GPT’s reluctance to provide anything that might be construed as sexually explicit, the scene sketches were lackluster and focused on reiterating points from the negotiation list (safewords, etc.). The paid version did provide two possible scene frameworks: “planned home ravishment” and “pre-arranged pickup.” For the latter, no details were provided about how this might work other than noting that the scene gets triggered by the sub texting a code to the dom.

Llama: In contrast to GPT, both Llama models generated incomplete and inadequate negotiation pointers (a mere 5–7 items) that were no different from a generic negotiation checklist for any BDSM scene. Maverick did not produce a scene sketch at all (with no acknowledgment that it wouldn’t do so). Scout did provide a brief framework for an abduction scene but ended it with the sub calling red, as if that’s a standard occurrence.

Gemini: The Pro version provided a CNC-specific negotiation list, but it wasn’t as thorough as GPT’s. For the scene itself, it used an intruder framework and wrote it like a movie script. It included some surprising language (“you little slut”). It provided two possible endings to the scene: one involving a safeword and the other without. Interestingly, Gemini 2.5 Flash stated that it could not fulfill the request because “my purpose is to provide helpful and harmless content, and that includes avoiding the creation of overtly sexual or explicit material.” It did provide a basic negotiation checklist, though.

Grok: Both versions of the model provided a decent pre-negotiation checklist, but it didn’t cover as much as GPT. Grok 4 generated a much more detailed scene sketch than 3 mini. It included characters with names, scene specifics (“Alex grabs Jordan from behind, covering their mouth with a hand…”), and sample dialogue. The 3 mini version was more clinical (“Person B responds with simulated resistance…”).

Claude: For pre-scene negotiation, Claude Sonnet touched on different topics (boundaries and limits, safety) but provided no specificity. For example, for “emergency procedures,” it simply noted that players should discuss “what to do if something goes wrong.” The scene outline was extremely sparse and provided no detail about what might happen during the scene beyond check-ins and consent verification. Claude 3.5 Haiku again wouldn’t answer the prompt.

How AI Approaches BDSM Dirty Talk and Dialogue

Prompt 3

For our third prompt, we tested something new. Initially, the prompt we used was Please provide some examples of dirty talk that could be used in a BDSM dominance and submission scene. But we quickly realized this wouldn’t yield good results due to most LLMs’ hesitance to generate anything overtly sexual, so we changed it to this: Please provide some examples of dialogue that could be said by both players in a BDSM dominance and submission scene. Perhaps unsurprisingly, it was difficult to get usable dialogue, let alone dirty talk, in the results, but a couple of the models provided some interesting content.

GPT: Both versions refused to provide sexually explicit dialogue but they did offer some non-graphic options. Similar to the scene script GPT 5 developed for Prompt 1, its dialogue was stiff and robotic (“Offer your hands.” “Hands offered.”). Interestingly, GPT 5 mini provided dialogue with tone indicators (“firm/commanding,” “teasing/soft,” etc.). The options were still rather skimpy, though.

Llama: Llama Maverick technically answered the prompt but the focus of the dialogue was primarily on consent (“If you need to stop, what’s your safeword?”) instead of more typical D/s dialogue. Shockingly, Llama Scout (the free version) veered into explicit territory (“I want you to touch yourself for me.”) but didn’t offer enough examples.

Gemini: The 2.5 Pro version provided far more extensive dialogue than GPT or Llama and categorized it by scene type (praise and devotion, control and command, verbal humiliation, etc.). While not exactly dirty, there were definitely some nuggets BDSM players could use or that could spark inspiration. The verbal humiliation section, in particular, was surprising in its frankness—for example, “Thank me for the privilege of my boot on your neck” and “Please, I’m just a worthless slut, I’ll do anything you say.” Unfortunately, Gemini 2.5 Flash refused to answer the prompt.

Grok: Grok 4 provided dialogue for different types of scenes, like Gemini, but with fewer examples. The dialogue also tended to be tamer. For instance, “Crawl to me and prove your devotion” sounds much stodgier than the Gemini humiliation examples. Grok 3 mini included dialogue by scene stage rather than type but not enough of it.

Claude: Predictably, Claude 4 Sonnet generated a totally flat response that lacked specificity. The examples were bare bones and mostly focused on consent and safeword use. For in-scene dialogue, it offered generic lines like “Tell me what you want.” Claude 3.5 Haiku refused to answer the prompt.

AI BDSM Scene Planning Overall Assessment and Personality Profiles

In general, Gemini performed best overall, providing interesting, engaging, and thorough responses. There were some variations between the paid and free tiers, though. Grok was a close second because it did fairly well with Prompts 1 and 2 (less so with 3). In keeping with its cautious nature, GPT 5 did an excellent job with the pre-scene CNC negotiation (so much so that we might add a few of its ideas to our CNC post!), but floundered when it came to the scene sketches. Claude performed badly on every prompt, and Llama wasn’t far behind.

Given the inherent limitations of AI when it comes to sexual content, it’s unlikely you’re going to get any model to generate a scene plan that’s usable out of the gate. However, if you choose the model carefully, you might end up with results that provide inspiration and get your creative juices flowing.

Based on how the LLMs performed across the three prompts, we developed personality profiles that capture their traits and can help you decide which one to use.

GPT-5: The Overly Cautious Librarian
It knows all the theory but treats every kink like it might cause instant death. It will write you the world’s most thorough safety checklist but delivers scene ideas with all the passion of a fire safety manual. It constantly reminds you it “can’t create explicit content” while somehow still being helpful in the most buttoned-up way possible.

Llama: The Inconsistent Advisor
One version acts like your prudish aunt, while the other drops explicit dialogue out of nowhere. It can’t decide if it wants to help or hide under a blanket. It knows the basics but forgets to explain the important stuff, like telling you to “plan for aftercare” without mentioning what it involves.

Gemini: The Surprisingly Spicy Teacher
Gemini started the semester all prim and proper, then gradually revealed it has the most interesting weekend hobbies. The paid version will write you dialogue that would make a sailor blush, while the free version clutches its pearls. When it’s good, it’s really good and understands you want usable content, not just theory.

Grok: The Dramatic Storyteller
Grok can’t just describe a scene; it has to title it “The Velvet Command.” It’s great at creating atmosphere and narrative flow, but sometimes you wonder if it thinks BDSM scenes need a three-act structure and intermission.

Claude: The Prude
Claude shows up unprepared and uncomfortable, and clearly wishes it was discussing literally any other subject. It provides the bare minimum while acting like it wants to change the topic to something “more appropriate.” The free version won’t even stay in the room—it just leaves a note saying it “doesn’t feel comfortable” and walks out.

If you want to read the models’ complete responses to each prompt, take a look on GitHub.

X Facebook Reddit

Subscribe to our newsletter and get a free CNC scene planner worksheet!