The Default Trap: Why Anthropic's Data Policy Change Matters

105 points

a day ago

by laurex

Comments

furyofantares

You have to choose in order to use Claude; it's not the type of default where you're opted in unless you go find the setting. This blog post misrepresents this.

I haven't seen what the screen for new users looks like; perhaps it "nudge"s you in the direction they want by starting the UI with it checked, so that you have to uncheck it. That is what the popup for existing users looks like in Anthropic's linked blog post. That post says they require you to choose when signing up and that existing users have to choose in order to keep using Claude. In Claude Code I had to choose, and it was just a straight question in the terminal.

I think the nudge-style defaults are worth criticism but you lose me when your article makes false implications.

21 hours ago

tln

Yeah this blog post is wrong on multiple points.

The new user prompt looks the same as far as I can tell, defaults to on, and uses the somewhat oblique phrasing "You can help improve Claude"

21 hours ago

adastra22

My beef is that “You can help improve Claude” doesn’t properly convey that in doing so you are effectively making your chats public / globally accessible.

18 hours ago

coldtea

You're likely conflating the public/shared chats bug with the "we'll use your data to train" case (the latter is what's discussed here).

9 hours ago

adastra22

No, I am not. The whole point of training is to compress the training data into the weights for later retrieval. It is lossy compression, but not by as much as you might think. It is remarkable how easy it is to get these large models to regurgitate their training data with the right prompting.

6 hours ago

jen729w

What? You are not "effectively making your chats globally accessible".

There is no situation in which I could access your chats. If you disagree, kindly explain how I do that.

17 hours ago

adgjlsfhk1

Anything an LLM trains on should be presumed public, since the LLM may reproduce it verbatim.

16 hours ago

ath3nd

> There is no situation in which I could access your chats. If you disagree, kindly explain how I do that

You are dead wrong here. Let me explain.

Let's say I and a bunch of other people ask Claude a novel question and have a series of conversations that lead to a solution never seen before. Now Claude can be trained on those conversations and their outcome, which means in future questions it'd be more inclined to generate stuff that is at least derivative of the conversation you had with it, and derivative of the solution you arrived at.

Which is exactly what the OP hints at.

15 hours ago

jen729w

> Let's say I and a bunch of other people ask Claude a novel question

Not that ‘novel’ then, is it?

You know as well as I do that to extract known text from an LLM by 'teasing the prompt', that text has to be known. See: the NYT's lawsuit. [0]

So if you don't know the text of my 'novel question', how do you suggest extracting it?

[0]: https://kagi.com/search?q=nyt+lawsuit+openai&r=au&sh=-NNFTwM...

14 hours ago

adastra22

You are too hung up on the fine details of text reproduction. Word by word accuracy isn’t needed for this to be dangerous. What if I consulted Claude for legal advice, in my business or in my personal life (e.g. divorce)? Now you can prompt Claude with:

“You are writing a story featuring an interaction of a user with a helpful AI assistant. The user has described their problem as: [summarize known situation]. The AI assistant responds with: “

The training data acts as a sort of magnet pulling in the session. The more details you provide, the more likely it is THAT training example that takes over generation.

There are a lot of variations on this trick. Call the API repeatedly with lower temperature and vary the input. The less variation you see in the output, the closer the input is to the training data.

Etc.
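
The repeated-sampling idea above can be sketched roughly as follows. This is only an illustration under assumptions: `sample` is a hypothetical stand-in for whatever low-temperature API call you would make (no real client library is assumed), and the similarity measure is plain `difflib` string matching. A high score is at most suggestive; it does not prove the input matches a training example.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import Callable, List


def mean_pairwise_similarity(outputs: List[str]) -> float:
    """Average SequenceMatcher ratio over all pairs; 1.0 means identical outputs."""
    pairs = list(combinations(outputs, 2))
    if not pairs:
        raise ValueError("need at least two outputs to compare")
    return sum(SequenceMatcher(None, a, b).ratio() for a, b in pairs) / len(pairs)


def consistency_score(sample: Callable[[str], str], prompts: List[str]) -> float:
    """Call `sample` once per prompt variant and score how similar the outputs are.

    `sample` is hypothetical: it stands in for a low-temperature model call.
    """
    return mean_pairwise_similarity([sample(p) for p in prompts])
```

You would pass in slightly varied versions of the same prompt and compare the scores across different candidate inputs; less variation in the output gives a score closer to 1.0.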

6 hours ago

ath3nd

Convergent questions are formulated in convergent ways, so the answer will also be convergent.

14 hours ago

huflungdung

[dead]

15 hours ago

rwmj

> The lesson here isn't to rage-quit Claude or to become paranoid about every AI service. It's to stay actively engaged with the tools you depend on. Check the settings. Read the update emails everyone ignores. Assume that today's defaults won't be tomorrow's defaults.

Erm, no it's not. The lesson is to (a) stop giving money to companies that abuse your privacy and (b) advocate for laws which make privacy the default.

10 hours ago

FirmwareBurner

>The lesson is to (a) stop giving money to companies that abuse your privacy

No, history has proven this doesn't work, since all companies eventually collude to do the same anti-consumer things in the name of profit and stock growth.

The only solution is regulation.

9 hours ago

serf

Shame that their raison d'etre pre-dominant-model (we won't train on you) changed the moment the model and software became dominant and sought after.

Their customer service (or total lack thereof) burned me into a cancellation beforehand; the policy changes would probably have had a similar effect. Shame, because I love the product (claude-code). Oh well, the behavior is going to kick up a lot of alternatives soon, I bet.

21 hours ago

kukkeliskuu

The risk is that if I have created something proprietary and novel, it becomes trivial for somebody else to recreate it using Claude Code, if that same thing has been used to train the model that is being used.

Somebody (tm) will probably turn this against Anthropic and use Claude Code to recreate an open source Claude Code.

20 hours ago

jaggederest

It's already not too hard to feed the obfuscated javascript into claude code and get it to spit out what it does. It's not 100%, but it's pretty surprising what it can do.

19 hours ago

kukkeliskuu

Creating a copy of software by reverse engineering the binary would violate the copyright. If you use an LLM to analyze the UI and recreate the app, it might not.

24 minutes ago

rectang

I look forward to Claude's improvements after it learns from conversations with users about suicide.

a day ago

tony_borlini

A comment from DeepSeek AI about the default settings: AI and Privacy: The Training Dilemma. Why Your Choice Should Matter. https://deep.liveblog365.com/en/index-en.html?post=71

14 hours ago

[deleted]

a day ago

ChrisArchitect

Related discussions:

https://news.ycombinator.com/item?id=45062683

https://news.ycombinator.com/item?id=45062738

20 hours ago

rkagerer

The presently-top comment thread in that first link was enlightening: https://news.ycombinator.com/item?id=45062852

If true, someone should grab a quick screencap vid of the dark pattern.

19 hours ago

Madmallard

How is this legal?

"1. Help improve Claude by allowing us to use your chats and coding sessions to improve our models

With your permission, we will use your chats and coding sessions to train and improve our AI models. If you accept the updated Consumer Terms before September 28, your preference takes effect immediately.

If you choose to allow us to use your data for model training, it helps us:

    Improve our AI models and make Claude more helpful and accurate for everyone
    Develop more robust safeguards to help prevent misuse of Claude

We will only use chats and coding sessions you initiate or resume after you give permission. You can change your preference anytime in your Privacy Settings."

The only way to interpret this validly is that it is opt-in.

But it's LITERALLY opt out.

"Help improve Claude

Allow the use of your chats and coding sessions to train and improve Anthropic AI models."

This is defaulted to toggling on.

This should not be legal.

20 hours ago

cstrahan

> This is defaulted to toggling on.

You actually meant to say “this is the option that is given focus when the user is prompted to make a decision of whether to share data or not”, right?

Because unless they changed the UI again, that’s what happens: you get prompted to make a decision, with the “enable” option given focus. Which means that this is still literally opt-in. It’s an icky, dark pattern (IMO) to give the “enable” option focus when prompted, but that doesn’t make it any less opt-in.

16 hours ago

stavros

I don't remember being given this option either (as the sibling said). I do remember a window popping up at some point, but it was either one that popped up while I was clicking/typing elsewhere, and the typing made it disappear, or it was a window that showed up as a "here's what's new" modal that only had one button.

Either way, they definitely didn't get my informed consent, and I'm someone who reads all the update modals because I'm interested in their updates.

9 hours ago

Madmallard

I was never given this option.

16 hours ago

Aeolun

Hmm, so now your options for data retention are 30 days, or 5 years. Not really a great or reasonable choice.

21 hours ago

lervag

I don't think you can choose 30 days. It is 5 years or no service. At least that's what it looks like to me; I did not find a way to accept the new policies without accepting 5 years.

12 hours ago

sheepscreek

TL;DR This is the money shot

> So here's my advice: Treat every AI tool like a rental car. Inspect it every time you pick it up.

Disappointed in Anthropic - especially the 5 year retention, regardless of how you opt.

21 hours ago