Letting Claude Ship a Real 1:1 DM Feature

Pickful shipped direct messages today. One-to-one chat, text plus images, real-time delivery, a 2-minute retract window, points-tier rate limiting, three privacy modes, block-aware — 38 files, +1376 lines, 41 specs, one PR.

This article isn't "Claude is amazing." It's about what actually happens when you ask Claude to add a feature to a year-old social product where users will absolutely try to harass each other: which calls you have to make yourself, what Claude misses on the first pass, and what I had to patch up afterward.

Scope the Feature Before Writing a Line

DMs aren't comments. Comments are a public square — anything you throw out is visible to everyone. DMs are a private room with exactly two people. That means every gap in the data model, the authorization policy, or the push channel translates directly into "a stranger can dump abuse onto someone's face." So I didn't say "build chat." I locked the scope down with Claude first:

1:1 only — group chat brings member management, @mentions, who-sees-what; complexity 3× minimum
No read receipts — privacy-sensitive, defer it, but leave room for an opt-in toggle later
No emoji reactions — in a 1:1 room people just send emoji, reactions are low-priority

Claude took to "sharpen the blade first, then decide how big a tree to fell" immediately. I came back to that principle a dozen times during the build: AI writes code fast, but product boundaries are a human call.

One Pair = One Conversation: Avoiding (A,B) and (B,A) Twins

The classic DM modeling trap: A messages B and creates (A→B); B replies and creates (B→A); now two rows exist for the same conversation and nothing is in sync.

Claude's solution I took straight in:

def self.between!(a, b)
  raise ArgumentError, "cannot create conversation with self" if a.id == b.id
  one_id, two_id = [a.id, b.id].sort
  find_or_create_by!(user_one_id: one_id, user_two_id: two_id)
end

[a.id, b.id].sort plus a unique index on (user_one_id, user_two_id) — no matter who speaks first, both sides land on the same row. Simple, no migration to reconcile twins, no application-layer joining of mirrored records.
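To see the canonical-ordering idea without a Rails app, here is a minimal plain-Ruby sketch. It assumes a Hash keyed by the sorted id pair as a stand-in for the unique index; ConversationStore and the Struct definitions are hypothetical names for illustration, not part of the Pickful code.

```ruby
# Plain-Ruby sketch (no Rails). The Hash key plays the role of the unique
# index on (user_one_id, user_two_id). ConversationStore is hypothetical.
User = Struct.new(:id)
Conversation = Struct.new(:user_one_id, :user_two_id)

class ConversationStore
  def initialize
    @rows = {} # in-memory stand-in for the conversations table
  end

  def between!(a, b)
    raise ArgumentError, "cannot create conversation with self" if a.id == b.id

    key = [a.id, b.id].sort # canonical order: smaller id first
    @rows[key] ||= Conversation.new(*key)
  end

  def count
    @rows.size
  end
end

store = ConversationStore.new
alice = User.new(7)
bob   = User.new(3)

first  = store.between!(alice, bob) # alice speaks first
second = store.between!(bob, alice) # bob replies

puts first.equal?(second) # => true
puts store.count          # => 1
```

One caveat for the real model: find_or_create_by! is a find followed by a create, so two simultaneous first messages can both miss the find. The unique index then rejects the second insert with ActiveRecord::RecordNotUnique, so a rescue-and-retry (or create_or_find_by!) may be worth considering.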

The peer(viewer) lookup falls out cleanly:

def peer(viewer)
  viewer.id == user_one_id ? user_two : user_one
end

"Who's the other party" is always a one-line ternary. Pushing the invariant into the database layer like this is far more robust than enforcing it in application code.

Split State Across Both Sides

Two pieces of conversation state are intrinsically per-user:

Read state — A reading the thread doesn't mean B read it
Hidden state — A deleting the conversation can't make it disappear for B

The first draft Claude wrote had a conversation_states table with one row per participant. I stopped it: that table is permanently capped at 2 rows per conversation, so a separate table plus a join isn't worth it. We put four columns directly on conversations instead:

t.datetime :user_one_last_read_at
t.datetime :user_two_last_read_at
t.datetime :user_one_hidden_at
t.datetime :user_two_hidden_at

Read and write dispatch on which side the viewer is:

def column_name(viewer, suffix)
  viewer.id == user_one_id ? :"user_one_#{suffix}" : :"user_two_#{suffix}"
end

A little viewer dispatch in code buys one less join table and index set in the schema. A relation that's permanently capped at two rows per conversation shouldn't get its own table. That's a database-design judgment call: Claude proposes, you decide.
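A rough plain-Ruby sketch of how that dispatch might be used for reads and writes. Only column_name comes from the code above; mark_read!, hide_for!, and hidden_for? are assumed helper names, and a Struct stands in for the conversations row.

```ruby
# Plain-Ruby sketch: a Struct stands in for one conversations row.
Viewer = Struct.new(:id)

Conversation = Struct.new(
  :user_one_id, :user_two_id,
  :user_one_last_read_at, :user_two_last_read_at,
  :user_one_hidden_at, :user_two_hidden_at,
  keyword_init: true
) do
  # Same dispatch as in the article: pick the column for the viewer's side.
  def column_name(viewer, suffix)
    viewer.id == user_one_id ? :"user_one_#{suffix}" : :"user_two_#{suffix}"
  end

  # Hypothetical helpers built on top of the dispatch.
  def mark_read!(viewer, at: Time.now)
    self[column_name(viewer, "last_read_at")] = at
  end

  def hide_for!(viewer, at: Time.now)
    self[column_name(viewer, "hidden_at")] = at
  end

  def hidden_for?(viewer)
    !self[column_name(viewer, "hidden_at")].nil?
  end
end

alice = Viewer.new(3)
bob   = Viewer.new(7)
convo = Conversation.new(user_one_id: 3, user_two_id: 7)

convo.hide_for!(alice)
puts convo.hidden_for?(alice) # => true
puts convo.hidden_for?(bob)   # => false
```

Hiding the thread for one side touches exactly one column, so the other participant's view is unaffected.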

Anti-Abuse Baked In From Day One

I put it in the prompt explicitly: "DMs only ship if abuse defenses are on by default, every toggle pointed at the safe direction." Claude came back with a three-layer defense whose structure I kept exactly as proposed:

Layer	Mechanism	Where it lives
User-initiated block	`blocked_users.exists?(id: other.id)` short-circuits	`can_be_dmed_by?`
Three privacy modes	`enum :dm_privacy, { everyone: 0, followers_only: 1, nobody: 2 }`	User concern
Points-tier rate limit	<50pts → 10/h, <500pts → 60/h, else 300/h	`dm_hourly_limit`

I asked for the third layer specifically. Spam from throwaway accounts is the guaranteed day-one disaster for DMs in any social product. Claude's first pass had a flat "60/h for everyone"; I had it tier the limit by reputation points: near-silent for zero-cost new accounts, generous for genuinely active users. That's a product judgment, not an engineering one: instead of enforcing one "fair" number in code, design fairness across user tiers.
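The tiers from the table can be sketched in a few lines of plain Ruby. The thresholds are the ones listed above; the method signatures and the sent_at_times input are assumptions, since the real methods live on the user model and presumably query the messages table.

```ruby
# Thresholds taken from the table; signatures are assumed for illustration.
def dm_hourly_limit(points)
  if points < 50
    10
  elsif points < 500
    60
  else
    300
  end
end

# sent_at_times: timestamps of the sender's recent DMs (hypothetical input).
def dm_rate_limited?(points, sent_at_times, now: Time.now)
  sent_last_hour = sent_at_times.count { |t| t > now - 3600 }
  sent_last_hour >= dm_hourly_limit(points)
end

puts dm_hourly_limit(0)   # => 10
puts dm_hourly_limit(50)  # => 60
puts dm_hourly_limit(500) # => 300
```

Under this sketch a brand-new account that has sent 10 messages in the last hour is blocked, while an account with 100 points and the same history is not.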

The full gate sits in a Pundit policy, no bypass:

def create?
  return false unless user
  return false unless record.conversation&.participant?(user)
  recipient = record.conversation.peer(user)
  return false unless recipient
  return false unless recipient.can_be_dmed_by?(user)
  return false if user.dm_rate_limited?
  true
end

Each return false corresponds to a concrete abuse scenario. The code reads like a checklist — which is what a policy should look like.
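The policy leans on recipient.can_be_dmed_by?(user), which isn't shown above. A plain-Ruby sketch of what it might look like follows. The Account struct, blocked_ids, and follower_ids are hypothetical, and reading followers_only as "only accounts that follow the recipient" is an assumption about the product's semantics.

```ruby
# Hypothetical stand-in for the User model. Arrays replace the
# blocked_users and followers associations.
Account = Struct.new(:id, :dm_privacy, :blocked_ids, :follower_ids, keyword_init: true) do
  def can_be_dmed_by?(sender)
    return false if sender.id == id                 # no self-DMs
    return false if blocked_ids.include?(sender.id) # layer 1: block list

    case dm_privacy                                 # layer 2: privacy mode
    when :everyone then true
    when :followers_only then follower_ids.include?(sender.id) # assumed meaning
    else false # :nobody, and any unknown mode, fails closed
    end
  end
end

recipient = Account.new(id: 1, dm_privacy: :followers_only, blocked_ids: [9], follower_ids: [2, 9])
follower  = Account.new(id: 2, dm_privacy: :everyone, blocked_ids: [], follower_ids: [])
stranger  = Account.new(id: 3, dm_privacy: :everyone, blocked_ids: [], follower_ids: [])
blocked   = Account.new(id: 9, dm_privacy: :everyone, blocked_ids: [], follower_ids: [])

puts recipient.can_be_dmed_by?(follower) # => true
puts recipient.can_be_dmed_by?(stranger) # => false
puts recipient.can_be_dmed_by?(blocked)  # => false
```

The block check runs before the privacy mode, so a blocked follower stays blocked regardless of the mode.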

2-Minute Retract, Soft Delete, Audit Trail

Retract is non-optional (typos, wrong image, wrong person), but the details are easy to get wrong:

RETRACT_WINDOW = 2.minutes

def retractable_by?(user)
  !deleted? && sender_id == user&.id && created_at >= RETRACT_WINDOW.ago
end

def retract!
  update!(deleted_at: Time.current, content: nil)
  image.purge_later if image.attached?
end

The pieces:

Soft delete — stamp deleted_at, blank the content, keep the row
Hard-delete the image — purge_later queues attachment cleanup, saves storage
2-minute hard window — no archaeology retracting messages from 6 months ago
Row stays — for compliance/audit; Turbo broadcast swaps the rendered message to a "retracted" placeholder
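The window check is easy to exercise outside Rails. This is a minimal sketch assuming a Struct in place of the model, 120 seconds in place of 2.minutes, and an injectable now: parameter (my addition) so the boundary can be tested without waiting. Image cleanup is left out.

```ruby
RETRACT_WINDOW = 120 # seconds; stands in for 2.minutes

# Hypothetical stand-in for the message model.
Message = Struct.new(:sender_id, :content, :created_at, :deleted_at, keyword_init: true) do
  def deleted?
    !deleted_at.nil?
  end

  # Same three conditions as the article's retractable_by?.
  def retractable_by?(user_id, now: Time.now)
    !deleted? && sender_id == user_id && created_at >= now - RETRACT_WINDOW
  end

  # Soft delete: stamp deleted_at, blank the content, keep the row.
  def retract!(now: Time.now)
    self.deleted_at = now
    self.content = nil
  end
end

sent_at = Time.at(1_000)
msg = Message.new(sender_id: 3, content: "oops", created_at: sent_at)

puts msg.retractable_by?(3, now: sent_at + 119) # => true
puts msg.retractable_by?(3, now: sent_at + 121) # => false
puts msg.retractable_by?(7, now: sent_at + 10)  # => false
```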

"Wipe the content, keep the record" is the standard pattern for private messaging. Claude got the logic right on the first try, but image.purge_later was my fix — its original image.purge would block the response.

Turbo Broadcasts Per Viewer, Not Per Conversation

For real-time I told Claude to use Hotwire/Turbo since the whole app is on that stack. The interesting choice is how to slice the broadcast channels:

def broadcast_to_thread
  [conversation.user_one, conversation.user_two].each do |viewer|
    broadcast_append_to(
      "conversation_#{conversation_id}_user_#{viewer.id}",
      target: "conversation_#{conversation_id}_messages",
      partial: "direct_messages/direct_message",
      locals: { message: self, viewer: viewer }
    )
  end
end

The channel name includes viewer.id — one message gets broadcast twice, on two separate streams, rendering with a different viewer each time.

Why? Because the message partial branches on "did I send this?" and "can I retract it?" — the rendered output is genuinely different per viewer. If we broadcast a single payload and let the client decide, we'd either ship sender_id to the front end or write conditional CSS — neither as clean as rendering twice on the server.
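To make "render twice on the server" concrete, here is a plain-Ruby sketch that builds one payload per viewer stream. render_message is a hypothetical stand-in for the partial, the CSS class names are invented, and retractable is simplified to a boolean on the message.

```ruby
require "erb"

# Hypothetical message; in the app, retractability depends on viewer and time.
Message = Struct.new(:id, :sender_id, :content, :retractable, keyword_init: true)

# Same naming scheme as the stream in broadcast_to_thread.
def stream_name(conversation_id, viewer_id)
  "conversation_#{conversation_id}_user_#{viewer_id}"
end

# Stand-in for the partial: output differs depending on who is looking.
def render_message(message, viewer_id)
  mine    = message.sender_id == viewer_id
  css     = mine ? "message message--mine" : "message message--theirs"
  retract = mine && message.retractable ? " <button>Retract</button>" : ""
  body    = ERB::Util.html_escape(message.content)
  %(<div class="#{css}">#{body}#{retract}</div>)
end

# One rendered payload per participant, keyed by that participant's stream.
def payloads_for(message, conversation_id, participant_ids)
  participant_ids.to_h do |viewer_id|
    [stream_name(conversation_id, viewer_id), render_message(message, viewer_id)]
  end
end

msg = Message.new(id: 1, sender_id: 3, content: "hi <3", retractable: true)
payloads = payloads_for(msg, 42, [3, 7])
payloads.each { |stream, html| puts "#{stream}: #{html}" }
```

Only the sender's stream carries the retract button, and the client never has to compare ids to decide what to show.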

The inbox badge uses the same model:

Turbo::StreamsChannel.broadcast_replace_to(
  "user_#{recipient_id}_inbox",
  target: "dm_inbox_badge",
  html: ApplicationController.render(
    partial: "shared/dm_inbox_badge",
    locals: { unread: recipient.total_unread_dm_count }
  )
)

One stream per user, each refreshing its own unread count.
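The badge depends on total_unread_dm_count, which isn't shown. One plausible definition, sketched in plain Ruby under the assumption that "unread" means messages from the other side newer than the viewer's own last_read_at column; Msg, Convo, and the method bodies are hypothetical.

```ruby
Msg = Struct.new(:sender_id, :created_at)

# Hypothetical stand-in for a conversations row plus its messages.
Convo = Struct.new(
  :user_one_id, :user_two_id,
  :user_one_last_read_at, :user_two_last_read_at,
  :messages,
  keyword_init: true
) do
  def last_read_at_for(viewer_id)
    viewer_id == user_one_id ? user_one_last_read_at : user_two_last_read_at
  end

  # Count messages from the peer that arrived after the viewer last read.
  def unread_count_for(viewer_id)
    last_read = last_read_at_for(viewer_id)
    messages.count do |m|
      m.sender_id != viewer_id && (last_read.nil? || m.created_at > last_read)
    end
  end
end

def total_unread_dm_count(viewer_id, conversations)
  conversations.sum { |c| c.unread_count_for(viewer_id) }
end

with_bob = Convo.new(
  user_one_id: 3, user_two_id: 7, user_one_last_read_at: Time.at(100),
  messages: [Msg.new(7, Time.at(50)), Msg.new(7, Time.at(150)), Msg.new(3, Time.at(160))]
)
with_eve = Convo.new(
  user_one_id: 3, user_two_id: 9,
  messages: [Msg.new(9, Time.at(10)), Msg.new(9, Time.at(20))]
)

puts total_unread_dm_count(3, [with_bob, with_eve]) # => 3
```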

What I Had to Patch After Claude's PR Merged

Once the PR was in I sat with the real product for a couple of hours and found three things Claude hadn't anticipated. Shipped them in a follow-up commit:

1. Empty conversations shouldn't show on either side

I clicked someone's "Send DM" button. A conversation row got created, but I hadn't actually written anything yet — and the other person could now see an empty conversation in their inbox. Socially weird: "if you didn't say anything, don't bug me."

Added a scope:

scope :visible_to, ->(user) {
  for_user(user)
    .where.not(last_message_at: nil)
    .where(...hidden_at IS NULL...)
}

Conversations with last_message_at IS NULL are invisible to both sides. The first real message calls bump_last_message!, which stamps the timestamp and the conversation pops into view. Claude didn't think of this — it was focused on "does it work," not "what if the user changes their mind mid-flow."
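The same visibility rule can be written as a per-record predicate, which is handy for reasoning about the cases. A plain-Ruby sketch, assuming the four per-side columns described earlier; Convo and visible_to? are hypothetical and not the actual scope.

```ruby
# Hypothetical stand-in for a conversations row.
Convo = Struct.new(
  :user_one_id, :user_two_id, :last_message_at,
  :user_one_hidden_at, :user_two_hidden_at,
  keyword_init: true
) do
  def participant?(user_id)
    [user_one_id, user_two_id].include?(user_id)
  end

  def hidden_at_for(user_id)
    user_id == user_one_id ? user_one_hidden_at : user_two_hidden_at
  end

  # Visible only to a participant, once a message exists, unless they hid it.
  def visible_to?(user_id)
    participant?(user_id) && !last_message_at.nil? && hidden_at_for(user_id).nil?
  end
end

convo = Convo.new(user_one_id: 3, user_two_id: 7)
puts convo.visible_to?(7) # => false (no message yet)

convo.last_message_at = Time.at(1) # first real message arrives
puts convo.visible_to?(7) # => true

convo.user_one_hidden_at = Time.at(2) # user 3 hides the thread
puts convo.visible_to?(3) # => false
puts convo.visible_to?(7) # => true
```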

2. The "Send DM" button on profiles was stacked under Follow and too cramped

Claude's layout stacked the buttons vertically, and the heroicon's name, "Chat-bubble-left-right", was leaking out as the tooltip. I moved the button alongside Follow, made it icon-only, and set an explicit title to override the leaked tooltip.

3. Avatars collapsed in inbox and thread header

.avatar-wrapper had a width lock fighting Tailwind's w-X h-X utilities. Matched the convention from the user_card partial: drop the wrapper, use w-X h-X rounded-full object-cover directly.

Each of these three is small in isolation. Together they're the gap between "works" and "actually feels right." Claude writes code fast, but the kind of polish you only spot by using the product can't be delegated — Claude doesn't have eyes, fingers, or muscle memory from other pages in the same app.

What This Build Taught Me

1. Scope is a human job, not an AI job.

"No groups, no read receipts, no reactions" was worth more than every line of code Claude wrote. Loosen any one and the timeline doubles. Claude won't volunteer to cut scope because it doesn't know your release schedule.

2. Push invariants into the database.

[a.id, b.id].sort plus the unique index, distinct_participants validation, foreign-key cascade — Claude proposed all of it, but only after I said "no mirrored conversations exist," "no self-DMs," "deletes cascade clean." The stricter the constraint, the fewer bugs survive.

3. Anti-abuse goes in the specs.

Of the 41 specs, ~60% are policy tests: block enforced, privacy mode enforced, rate limit enforced. These specs aren't "does the feature work" — they're "is abuse blocked." Each abuse scenario gets a spec. Claude can enumerate the list, but only after you tell it which scenarios you're defending against.

4. UX polish requires hands on the actual product.

Two hours of real interaction surfaced three issues that a code review or green test suite would never catch. There's no substitute.

From "I want this" to "users can use it": the main PR landed at 21:53 and the polish commit at 23:36, all in one evening. But the scarce resource wasn't writing code. It was the time spent deciding scope, authorization, and abuse defenses. Writing the code is the start; using the product yourself and grinding off the burrs is what makes it shippable.