How the AI Bubble Bursts

367 points
a day ago
by martinvol

Comments


joshstrange

> RAM prices are crashing because new models won’t need as much

Reality begs to differ [0]. Following the link for that text goes to an article [1] where they talk about Google's TurboQuant, which supposedly will lower RAM requirements. Whether that means RAM prices come down (as speculated, not reported on, in the link) or the AI companies just do more things with their extra RAM is yet to be determined. The fact that this article links there with the text "RAM prices are crashing" throws the entire rest of the article into doubt for me.

RAM prices are most certainly not crashing (yet), and treating it as a foregone conclusion because _one_ lab found gains could be made, and hasn't even reported on the efficiency of their method, is just irresponsible. It's almost as bad as when LLMs link things to prove their point, you visit the link, and find it says nothing of the sort or even the opposite.

[0] https://pcpartpicker.com/trends/price/memory/

[1] https://tech.sportskeeda.com/gaming-news/how-google-s-new-tu...

a day ago

amelius

> Now if that means RAM prices come down (as speculated, not reported on, in the link) or the AI companies just do more things with their extra ram is yet to be determined.

I think it is determined:

https://en.wikipedia.org/wiki/Jevons_paradox

a day ago

woadwarrior01

Yeah, even if one efficiency trick lands, people will end up spending the saved budget right back on bigger models, and/or more "thinking" tokens.

a day ago

EthanHeilman

Not if the bigger models have diminishing returns. Let's say you figure out a way to reduce RAM requirements by 100x, but increasing RAM usage by 2x only gets you a 1% increase in effectiveness, and 3x does not get you any noticeable increase over 2x at all. Sure, you can reduce the price per token, but you might have already saturated the market. Even if you haven't saturated the market, your hardware-based moat just got smaller, and this is going to reduce your margins even more.
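The diminishing-returns argument above can be sketched with toy numbers (the RAM multipliers and effectiveness figures are the comment's hypotheticals, not measurements):

```python
# Toy numbers only, to make the diminishing-returns argument concrete.
def effectiveness(ram_multiplier: float) -> float:
    """Hypothetical quality curve: baseline 1.0 at 1x RAM,
    +1% at 2x, and no noticeable gain beyond that."""
    if ram_multiplier >= 2:
        return 1.01
    return 1.00

# Even a 100x efficiency gain only matters up to the saturation point.
for m in (1, 2, 3, 100):
    print(f"{m:>3}x RAM -> effectiveness {effectiveness(m):.2f}")
```

Under this toy curve, any RAM budget beyond 2x is wasted, which is the saturation scenario being described.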

Just noticed that pydry made a similar point: https://news.ycombinator.com/item?id=47574216

a day ago

pydry

Jevons paradox only applies if demand hasn't already been saturated.

The fact that public LLM usage is leveling off at a price of $0 and Jensen "we make the shovels in this gold rush" Huang is rather desperately claiming that you need to spend $250k/year in tokens to be taken seriously suggests that demand saturation may not be that far off.

Whether Jevons' paradox applies to software engineers I think is another open question. I'm constantly being told that it doesn't and that LLMs make half of us redundant now, but I'm skeptical - so much automation I see is broken or badly done.

a day ago

raincole

It is quite hard to imagine how the demand is saturated now. I think any company that uses a sliver of AI will happily increase their token consumption 100x if it's free.

a day ago

flir

Are you assuming a brute force "burn tokens until it passes the tests" model, or is there a really sweet approach on the horizon that is impractical at current token costs?

I'm asking 'cos while I'm philosophically opposed to the first option, I'd love to hear about anything that resembles the second.

a day ago

SpicyLemonZest

One idea I've heard is prototype-first design reviews. If the cost of code genuinely trends to zero, there's no reason why most technical disagreements about product functionality couldn't come with prototypes to illustrate each side of the debate. Today, that's not always practical between token costs and usage limits.

a day ago

pydry

What if the agent fucks up the better approach but does a good job of the worse approach?

a day ago

SpicyLemonZest

Then hopefully the reviewers will notice that the first prototype's flaws are correctable. Sometimes they won't, and they'll end up making a bad decision, just as they sometimes make bad decisions today with no prototypes to look at. But having prototypes allows for a lot of debates that are today vague and meandering to be reduced to "which of these assertions at the end of this integration test do you think is the correct behavior?".

a day ago

pydry

Executive FOMO disease is being exploited by the model providers to push for maximal token usage even when it is pointless.

This includes encouraging people to build elaborate multi-model setups (e.g. "gas town") for coding that do not meaningfully improve productivity but which certainly do cause token usage to explode.

It also includes encouraging execs to use token consumption as a proxy for productivity - almost akin to SLOC.

AI has a halo right now and the managerial class seem to be willing to forgive almost any failure because the promise is so enticing. We're at peak expectations right now. They will soon start to be less forgiving when the warts which are intrinsic to LLMs remain unsolved.

a day ago

monknomo

nobody knows how to measure software productivity + AI is supposed to mean productivity goes up = more AI means more productivity

As best as I can tell, that's the thinking. It's one number, it's very easy to find and manage, and there is a belief that it directly measures productivity.

I disagree that it does; it seems to me the throughput of useful features is a better measure, but I'm not in the driver's seat on this one

a day ago

irke

Incremental revenue and cost savings, at least for enterprises, are where it would show up. There's also a present-value consideration: if LLMs make those dollars come into existence closer to the present, they are worth more.
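The present-value point can be sketched as simple discounting (the 10% discount rate and the dollar amounts are illustrative assumptions):

```python
# The same dollar of incremental revenue is worth more today if an LLM
# makes it arrive sooner. Rate and amounts are made up for illustration.
def present_value(cash_flow: float, years_out: float, rate: float = 0.10) -> float:
    """Discount a future cash flow back to today at a given annual rate."""
    return cash_flow / (1 + rate) ** years_out

# $1M arriving in 3 years vs. the same $1M pulled forward to 1 year.
print(present_value(1_000_000, 3))
print(present_value(1_000_000, 1))
```

The second figure is larger, which is the sense in which accelerating the same cash flows creates value.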

The personal use case stuff is messy and subjective.

a day ago

monknomo

attributing incremental revenue to gross engineering effort is challenging, imo.

Cost savings is primarily a function of headcount here. Which is also easy to measure, and so if we take my thesis that easy to measure stuff is prioritized...

a day ago

irke

Yep - it’s impossible to separate experimental tokens vs value creating ones.

Ultimately the performance will be assessed via the income statement and cash flows of customers of the model producers.

Frankly in the window pre-IPO it’s in the best interests of OAI et al to show a line going to the top-right in relation to tokens, in their prospectus. What does that mean?

Strategic manipulation.

a day ago

Marha01

Demand for top models is definitely not saturated, at least when it comes to programming. If I could afford to use 5x more Claude Opus 4.6 tokens, I would!

a day ago

hajile

Demand is relative. How many Claude tokens would you buy if they had a 10x price hike?

The market has achieved its current saturation level with loss-leader prices that remind me of the Chinese bike-share bubble [0]. Once those prices go up to break-even levels (let alone profitable levels), the number of people who can afford to pay will go down dramatically (and that's not even accounting for the bubble pop further constricting people's finances).

[0] https://www.youtube.com/watch?v=FQrEDq8KPiU

a day ago

pigpop

If they've already built themselves a loyal customer base (which is usually the point of fighting a price war) and the customers are happy with the technology they have, then if funding is tight and turning a profit is more important, why wouldn't they pivot to optimizing inference: stopping further training, freezing the model versions, burning the weights into silicon, building better caching strategies, and improving harnesses and tools that lower their cost and increase their margin?

If all they do is hike prices then they'll lose customers to competitors who don't or who find a way to serve a similar model cheaper.

The demand isn't going to go away purely through higher prices. Once people know something is possible they will demand it whether supply is constrained or not. That's a huge bounty for anyone who can figure out how to service that demand.

a day ago

philistine

Easier said than done. What you're describing can take years to implement. Can OpenAI et al. keep burning cash at the same rate for two years while they wait for the salvation of custom silicon if the investments dry up?

a day ago

eru

They could stop further training right this very second.

15 hours ago

HDThoreaun

There is no evidence that labs are losing money on inference subscriptions. The labs have massive fixed costs, but as long as inference revenue exceeds the cost of the datacenters they use for inference, all they need to do to become profitable is scale up. Right now software engineers are basically the only ones actually paying for inference; the labs just need to create assistants for everything that are good enough that every white-collar worker in the country (world?) is paying a $1,000/yr subscription. Certainly there's a lot of risk: will models become commoditized and everyone switches to open models? Can they actually get non-software-engineers to pay for inference en masse? But it's not like there's no path

a day ago

veunes

"Demand is stagnating" only applies to the B2C segment, where people are already bored of generating poems and funny pictures. In B2B, the demand hasn't even started yet because corporations are still terrified of shoving their NDA data into public APIs. The second local models and secure private clouds get cheaper, the enterprise is going to devour literally any amount of available compute just to automate internal document workflows

9 hours ago

zozbot234

> The fact that public LLM usage is leveling off at a price of $0

The price is very much not $0; even 'free' models have usage capacity limits that equate to a shadow price.

a day ago

adventured

LLMs haven't remotely begun to be integrated into the lives of the typical person. Not even close. The typical person is using LLMs not at all as it pertains to their daily life tasks. They're using them almost entirely for limited discussion matters (eg having a discussion with GPT about a medical issue, or a work related matter).

This is the first or second inning in the LLM rollout. It'll take 15-20 more years for full integration of AI agents into the life of the typical person.

The claw experiments for example can just barely be considered alpha stage. They're early AI garbage unfit for the average person to utilize safely. That new world hasn't gotten near the typical person yet.

The compute requirements to get to full integration of AI agents into the lives of average people - billions of them - are far beyond 10x where we're at now.

a day ago

pizlonator

> LLMs haven't remotely begun to be integrated into the lives of the typical person. Not even close. The typical person is using LLMs not at all as it pertains to their daily life tasks. They're using them almost entirely for limited discussion matters

This is an argument in favor of demand having leveled off.

a day ago

pigpop

Only if nothing changes. Right now, people are running agent frameworks like OpenClaw on their own hardware or a VPS and the frameworks are often single person projects. This results in all sorts of problems but you can pick an easy solution from history which is to create a walled garden service for running these agents where you can provide security and standardization. If that platform also allows trusted services to integrate then they can provide end to end security guarantees. They also benefit from improvements to the models themselves making them more difficult to subvert. Creating something that is secure enough for the average person to entrust their credit card to is not an impossible task.

a day ago

pydry

>The typical person is using LLMs not at all as it pertains to their daily life tasks.

This doesn't track at all with my experience. Everybody is using it everywhere.

Moreover people are using them for daily life tasks even when it is not an appropriate use of LLMs - e.g. getting medical advice as you referred to or writing emails which are clearly pissing off their coworkers.

In this respect I see it as akin to radium - a new technology that got a little too fashionable for its own good when it first emerged and which will likely have many use cases scaled back.

a day ago

TheScaryOne

>Everybody is using it everywhere.

No one in our Auto shop is using AI. One of the new diagnostic tools was demo'd with AI, and none of us were having it. It's about as accurate as Googling your symptoms.

My mother had an AI powered lung scan that came back with Stage 4 Cancer. The Oncologist got called in (for a fee!) to tell us it was just early stage COPD.

a day ago

user34283

In my experience people vastly overestimate the competence of doctors. Getting medical advice from LLMs could be life-saving.

Personally I experienced this when a specialized doctor believed a drug interaction to work the opposite way, thinking A hinders the absorption of B, when actually it hinders the clearance, tripling the concentration of B.

Without AI, I would have been clueless about this and could not have spotted the mistake. I don't know if it would truly have been critical, but it did shake my confidence in doctors.

a day ago

PAndreew

This^^ Use both, they have their own strengths and weaknesses.

16 hours ago

eru

And the AIs are still getting better at a good clip. I'm not so sure about (unassisted) doctors.

15 hours ago

HDThoreaun

> getting medical advice

I'd be careful stating this is an inappropriate use of LLMs. I'm semi tapped into the medical literature community, and there is a lot of serious discussion and research going into the use of LLMs for medical advice, and most of it is showing that LLMs are barely worse than doctors, and much, much cheaper/more convenient. They definitely aren't ready to completely replace doctors, but it seems they can provide competent medical advice in a pinch. Look out for the literature on this in the coming year; it's only in the last few months that researchers seem to be taking LLMs seriously.

a day ago

Delphiza

I am surprised that people are surprised by this finding, and support your position.

Anecdotally, doctors get things wrong quite frequently. Almost everybody has a bad medical diagnosis/advice story. The amount of reference material that a doctor needs to know off-hand and the data that they are given to make a diagnosis makes it a really difficult job. They also seldom have the ability to know whether their diagnosis/treatment worked, so have a limited ability to 'learn' from outcomes. (I did some work for cancer research and one of the most difficult problems was trying to get 'end of treatment' data because the end of treatment was often an unknown, to the researchers, death).

The ability to have a 'prompt' that includes lab data is likely to be better than the opinion of a doctor who has only one person's professional experience, limited ability to interpret 'prompts', and needs to map it all to an in-memory conditions database.

4 hours ago

checkyoursudo

This seems ripe for a joke akin to "how was the food?" "bad, but at least the portions were big!"

Like, "how was the medical advice" "worse than a doc's, but at least it was cheaper!"

a day ago

HDThoreaun

Well the thing is that it often isn't worse than a doctor's; that's the point of the research here. I get that sounds crazy, just watch out for the coming literature I guess.

A significant portion of Americans detest the medical industry and deeply dislike going to the doctor, so I don't even think the product needs to be very good to disrupt the way the system works; just different and accessible is likely enough. Funnily enough, restaurants where the food is bad but the portions are big are actually decently popular. Priorities can vary so widely that many people are unable to even comprehend the priorities a significant number of people truly hold.

a day ago

d2ssa

"deeply dislike going to the doctor"

No, you are not capturing the trade-off at all. And frankly you clearly have an inherent agenda implicit in your posts; that's clear to see.

a day ago

jrflowers

> barely worse than doctors

I like that this comment is below, and posted after, an example where somebody had to pay extra money to clear up a misdiagnosis of stage 4 cancer by the “barely worse” software

a day ago

HDThoreaun

There are many examples of doctors misdiagnosing a wide variety of things, which is largely the point here. People think of doctors as infallible when that is not even close to true.

I'm certainly not saying fire all the radiologists, just advising an open mind when the actual literature starts saying that LLMs are as good as doctors in some areas.

a day ago

pydry

There are many examples of people into homeopathy, chinese medicine and even witchcraft using an identical (not similar, identical) argument to the one you just used to push it.

a day ago

d2ssa

Legit that dude seems like a nutter. lol'd hard at "Im semi tapped in to the medical literature community."

a day ago

jrflowers

Yeah that’s the pitch for Dianetics

17 hours ago

vonneumannstan

Pretty sure the entire markets for Storage, HBM, DDR5, etc are completely sold out for next several years. How is that saturated?

a day ago

Analemma_

We’re not even close to demand saturation with tokens. Have you seen the people rending their garments with rage that Anthropic and Google won’t let them use their flat-rate subscriptions to burn millions of tokens per hour on OpenClaw? And that’s a tiny set of die-hard tinkerers.

The ceiling of token use when everyone has something akin to OpenClaw just running as a background process on their phone is way higher than there’s supply for right now. Jevons paradox is still in full force.

a day ago

Macha

Is that not appealing to those users _because_ it's a subsidised flat rate? Like, those users could go and swap to API pricing right now if they wanted to, but at API pricing they don't want to.

a day ago

Analemma_

Right, but that just proves there's tons of pent-up demand waiting in the wings as token prices fall.

6 hours ago

kmeisthax

I thought we were going to hit token saturation years ago, but they keep inventing new ways to use tokens. Like, instead of asking a chat model to write something and getting ~1000 tokens out of it, you now have an agent producing ~10,000 tokens - or, worse, spawning 10 subagents that collectively burn ~100,000 tokens. All for marginally better answers with significantly higher compute usage.

Personally, I would have used all those tokens to generate synthetic data for IDA (iterated distillation and amplification) so that the more efficient 1000 token/answer chat model can answer more questions, but apparently that doesn't justify an insane datacenter buildout.

a day ago

azinman2

Everyone is interested in using fewer tokens to accomplish the same task.

a day ago

user34283

Marginally better answers?

Claude Code and co. can now analyze an enterprise codebase to debug issues in a system with multiple services involved.

I don't see how that would have been possible at all in the past.

a day ago

fotcorn

Also, there is zero reason to think that the big labs did not have anything similar to TurboQuant for a long time already.

The recent blog post from Google announcing TurboQuant does not change anything regarding RAM planning for the big labs.

TurboQuant itself is already a year old! So even smaller labs have probably seen and implemented it.

a day ago

scw

TurboQuant has a specific benefit: compressing the KV cache at a negligible cost to quality. That mainly means context lengths can go up for the same amount of memory; however, the KV cache only accounts for something like 20% of the overall model memory footprint, so this will not dramatically decrease memory demands in the way that some of the more sensationalist reporting has stated.
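The bound described above can be sketched Amdahl-style (the ~20% KV-cache share and the compression ratios are illustrative assumptions):

```python
# If only the KV cache is compressed, total savings are capped by the
# KV cache's share of serving memory. The 20% share is an assumption.
def total_memory_fraction(kv_fraction: float, kv_compression: float) -> float:
    """Fraction of original memory still needed after compressing
    only the KV cache by the given ratio."""
    return (1 - kv_fraction) + kv_fraction / kv_compression

# Even infinite KV compression leaves ~80% of the memory bill intact.
for c in (2, 4, float("inf")):
    remaining = total_memory_fraction(0.20, c)
    print(f"KV compressed {c}x -> {remaining:.0%} of original memory")
```

This is why KV-cache quantization alone cannot crash RAM demand: the uncompressed 80% dominates.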

a day ago

lostmsu

In large providers KV caches are the main bottleneck, no?

a day ago

schmidtleonard

The open source tooling got quantization support 3 years ago! It was a lesser type of quantization, but more than enough to prove that the savings just go to bigger models.

a day ago

adjejmxbdjdn

I’m not disagreeing with you, but consumer RAM prices are lagging indicators. If commercial RAM prices are dropping then consumers will see those price drops last, especially given the fact that several consumer manufacturers turned to commercial only.

a day ago

drakythe

Is there a source that says commercial RAM prices are dropping? I was recently told (without a source, so I am not sure if it is true or not) that OpenAI never even bought any of the RAM they signed deals on last year, and that those deals were just letters of intent. So if prices are coming down I wouldn't be shocked but the economy is pretty well vibe coded these days so who even knows.

a day ago

ffsm8

Well, all manufacturers of RAM have publicly stated that they're sold out for 2026

RAM prices falling during 2026 is insanely unlikely unless AI crashes so hard it starts to actually kill companies. And not just any but big tech

I'm not seeing that in 2026. Maybe 2027 (I'd sincerely doubt that too, honestly), but definitely not within the next 9 months. Their runway is _way_ too large for things to spiral out of control within such a short period of time

a day ago

eru

If the spot market for RAM is reasonably efficient, prices should be about as likely to fall as to increase further.

Otherwise you could make a surefire profit by just buying some RAM and waiting a few months to re-sell.

(All of this is modulo interest rates etc to finance this.)
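A rough sketch of that no-arbitrage argument (all prices, rates, and storage costs below are made-up illustrative numbers):

```python
# Toy cash-and-carry check: buying RAM now and reselling later is only
# profitable if the expected price rise beats financing + storage costs.
def expected_profit(spot: float, expected_future: float, months: int,
                    annual_rate: float, storage_per_month: float) -> float:
    """Expected profit from buying at spot and reselling after `months`."""
    financing = spot * ((1 + annual_rate / 12) ** months - 1)
    storage = storage_per_month * months
    return expected_future - spot - financing - storage

# If the market expects a rise that beats carry costs, this is positive,
# and scalpers pile in until the expected edge is gone.
print(expected_profit(spot=100, expected_future=110, months=3,
                      annual_rate=0.06, storage_per_month=0.5))
```

In an efficient spot market this number gets arbitraged toward zero, which is the sense in which further rises and falls become roughly equally likely.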

15 hours ago

ffsm8

> Otherwise you could make a surefire profit by just buying some RAM and waiting a few months to re-sell.

Yes, that's why scalping is so widespread right now, because that's essentially what it is

12 hours ago

eru

And all morality aside: people will scalp ever harder, until prices are as likely to go down as up.

That's not just RAM, but pretty much any commodity or financial instrument.

12 hours ago

dylan604

If the claims the GP made about letters of intent to buy vs actual purchases are true, that brings additional questions. Like, if you send a letter of intent but do not follow through, are there financial penalties? How hard is it for the chip maker to sell the chips allotted based on that letter of intent? Would someone like Apple buy up the extra, or would they not need it as they've already bought enough for the units they expect to sell? If someone like Apple suddenly had an influx of RAM, that does not mean they would have extra CPU capacity to match. If the supply chain is this closely apportioned, what is the most likely result of a sudden surplus?

a day ago

citrin_ru

> unlikely unless AI crashes so hard it starts to actually kill companies. And not just any but big tech. I'm not seeing that in 2026

A month ago an AI crash was looking unlikely, but with the Strait of Hormuz being de facto blocked, many predict a global stagflation, which could affect AI too.

a day ago

sergiotapia

here you go: https://x.com/wccftech/status/2037921057097892018

RAM prices are dropping

a day ago

itintheory

Every response to the original post calls it out as being factually incorrect...

a day ago

sergiotapia

a day ago

eru

Why would they be lagging?

15 hours ago

ToucanLoucan

If they see them. Plenty of businesses are still charging pandemic prices for all kinds of goods and simply pocketing the difference.

Cars come to mind instantly. Prices exploded in 2020/1, due to legitimate shortages, most of which have been plus or minus resolved, but the prices for new (and used!) cars never came back down.

a day ago

mono442

Actually the prices for new cars seem to be now lower than in 2022 where I live in Europe. Though this could be attributed as well to the competition from Chinese manufacturers.

a day ago

busterarm

While the pandemic chip shortage resolved around 2024, a new chip shortage started in 2025 when the Dutch government took control of Nexperia (who are owned by China's Wingtech) and China retaliated by creating export restrictions. Honda, Nissan, Mercedes-Benz and others cut production. With less inventory, manufacturers and dealers are raising prices to compensate.

Also the cost of shipping never came down and lots of cars and/or their components need to cross oceans. Plus we have a new energy crisis...

a day ago

slfnflctd

> almost as bad as when LLMs link things to prove their point, you visit the link, and find it says nothing of the sort or even the opposite

To be fair, they got it from us. This happened to me plenty of times long before modern LLMs.

a day ago

throwup238

It learned by reading HackerNews, after all.

a day ago

h14h

I do wonder how closely prices for consumer RAM kits follow the wholesale prices for NAND chips that manufacturers see internally. The pcpartpicker graphs you linked show consumer prices have leveled out and may even be starting to fall. Depending on how the economics shake out, this could mean we've hit an inflection point.

My personal prediction is that once the VC bill comes due and prices for frontier models start to climb, competition for efficiency will heat up. The main AI use-cases seem to be falling into buckets, and I doubt serving gigantic, do-it-all general models for every use-case under the sun is remotely cost-effective.

If common use-cases start to be more efficiently served by smaller, more efficient purpose-built models (or systems thereof), it'd make the big frontier models increasingly niche. Cursor's Composer 2 model is a great example of this.

In any case, I think it's pretty fair to speculate we may be seeing RAM prices start falling sooner rather than later.

a day ago

joshstrange

Consumer vs NAND is an absolutely fair distinction to make; I'm not sure how to track those prices. My main issue is the article saying "RAM prices are crashing" (which I can't find any evidence of) and linking to an article that doesn't even repeat that claim; it instead just speculates that maybe RAM will come down in price due to this new idea.

> In any case, I think it's pretty fair to speculate we may be seeing RAM prices start falling sooner rather than later.

I sure hope so. RAM, HDDs, and SSDs are all crazy-high right now and I was in the market for literally all 3 but have paused all my buying because I can't justify the costs as they stand today.

a day ago

h14h

> My main issue the article saying "RAM prices are crashing" (which I can't find any evidence of) and linking to an article that doesn't even repeat that claim

That's totally fair. The article is written in a very odd way where it makes a bunch of authoritative, factual-sounding claims and then throws a "this is all very speculative" line right at the end.

It's very interesting speculation, but can't really be considered anything more than that, despite the prose it chose.

a day ago

layer8

I agree. The article they link to talks about memory company stocks crashing, not RAM prices crashing. There is some truth to the former: https://www.ft.com/content/e4e15692-187e-4466-832e-ec267e792...

a day ago

veunes

Bingo. Even if some magic drops tomorrow that compresses the KV cache down to literally zero bits, that saved VRAM will instantly get swallowed up by bumping the batch size or pushing the context window to 10 million tokens. There is no such thing as "excess memory" in ML, only under-trained models

9 hours ago

martinvol

RAM prices haven't crashed yet, and it'll take time because price changes have to propagate through the supply chain. Micron is -20% from the top already: https://www.investing.com/equities/micron-tech

Stock price is the best forward indicator I can think of

a day ago

cwillu

That might be true, but it's still straightforwardly wrong to say that RAM prices have crashed, and it calls into question everything else they write.

a day ago

martinvol

yeah good point, although it's just one of all the catalysts I mentioned. In fact I had written most of the post already before I saw the news about RAM.

a day ago

albinn

I would think that we are going to see RAM prices increase even more, given, among other things, pure-helium supply disruptions and increased electricity prices.

I haven't looked closely into TurboQuant, but perhaps it will revolutionize just as much as the 1-bit LLM did...

a day ago

aurareturn

Even if TurboQuant, which was released a year ago, drastically lowers RAM requirements, AI labs will just release bigger models.

Jevons paradox. When are we going to learn that efficiency gains in AI do not decrease hardware usage?

a day ago

functional_dev

valid point, it reminds me of video games. GPUs got faster, devs pushed higher resolutions, more complex lighting instead of saving power :)

a day ago

gmerc

Consumer RAM is starved by production capacity shifting to HBM. HBM dropping in price would not affect consumer RAM on any immediate timeline. Also, as pointed out by many: Jevons paradox

a day ago

Valakas_

This article and the title are total clickbait filled with emotional hooks. And it worked. You totally debunked it but look how it still became so popular.

15 hours ago

am17an

Thank you, there are two things I would like to point out:

1) Google releasing something probably means they don't see it as important. 4-bit KV-cache quantization has been known for a long time. The fact that there is almost a mass hysteria about this paper makes me think there is a lack of skepticism in this AI mania, even in a relatively tech-savvy crowd.

2) But prices for memory companies are crashing! Look around, the whole market is crashing.

a day ago

maeln

> > RAM prices are crashing because new models won’t need as much

> Reality begs to differ [0] and following the link for that text goes to an article [1] where they talk about Google's TurboQuant which supposedly will lower the RAM requirements. Now if that means RAM prices come down (as speculated, not reported on, in the link) or the AI companies just do more things with their extra ram is yet to be determined. The fact this article links there with text "RAM prices are crashing" throws the entire rest of the article into doubt for me.

I find it fascinating how extremely reactive things have become. One research paper which, to my knowledge, hasn't been externally replicated yet, nor implemented, generates tons of hyperbolic articles, tweets and such, and actually manages to move the market, at least temporarily. Not just this: a simple message in full caps lock by the president of the U.S., who is in the habit of lying through his teeth constantly, and the same thing happens. It's like there is a big bubble that threw any form of critical thinking out of the window and is in a hurry to react to anything, even if it is not even remotely believable. Now, I understand why it happens: there is a lot of money to be made by capitalizing on FOMO, either by driving traffic to a website, socials, etc., or by simple insider trading (which feels like it has been legalized these days). But I still find the proportions it has started to take incredible.

a day ago

JCTheDenthog

My favorite was when Google revealed Project Genie a month ago (which lets you generate video game worlds with AI, basically) and stocks for game companies immediately dropped. Anyone familiar with games and gaming knows that what Project Genie offers (essentially empty worlds with minimal interactivity that you can just kind of wander around in, and which struggle with simple things like object permanence if you look away) isn't real competition for actual games, but the markets reacted anyway.

a day ago

mcv

I've always seen the stock market as a mix of mass hysteria and pyramid scheme. With actual value underlying it of course, but actual stock values are frequently irrational.

a day ago

irke

Short term it’s mood and momentum

a day ago

incognition

You nailed it. It's algos and noise trading

a day ago

tracker1

Even worse, 3 memory companies control well over 90% of the international market, with a history of cartel collaboration that's going to be ever harder to prove with fewer companies.

a day ago

hintymad

Some also argue that the RAM price keeps rising because of the bullwhip effect. I was wondering if there's any way for us to differentiate sustained demand from the bullwhip effect.

a day ago

faangguyindia

If the gains are real, why are the limits so bad? Google can barely serve Antigravity.

a day ago

owlmirror

Isn't that at the moment still a free product? Of course they will not prioritize serving those requests. That tells you nothing.

a day ago

butlike

It tells you there's no clear path to monetization.

a day ago

adventured

They've all avoided loading up their LLMs with ads to this point. That is going to change dramatically over the next 2-3 years. All of them will be loaded with ads, and Google will partake as expected given their ad network & capabilities in that realm. They'll match GPT's ad roll-out.

a day ago

notatoad

it has a paid option. and the antigravity subreddit is full of people who claim to be paid users, complaining about constantly hitting limits.

a day ago

owlmirror

Where do I find the paid option? I cannot find it on their product page. There are only two options I can see: one "Available at no charge" and another "Coming soon - For organizations".

Can you upgrade in the IDE? It would be strange for Google to have a performance problem for paid users while I experience no such issues at all with Claude and Codex.

a day ago

ajam1507

Maybe it's unavailable in your region. Four options on this page for me.

https://antigravity.google/pricing

a day ago

BoredPositron

You get more Claude tokens from a Google subscription via Antigravity than from Anthropic. Especially if you use the 5 other "family" accounts you can share the subscription with...

a day ago

ajross

> Reality begs to differ

Honestly you're both wrong. RAM prices spiked speculatively, and they're going down for the same reason. Market people always want to argue in fundamentals, when in practice *ALL* the high frequency components of the signal are down to a bunch of traders trying to guess where it's going in the short term.

At best those guesses are informed by ground truth ("AI needs a lot of RAM!", "Sam cornered the market!", "TurboQuant needs less RAM!"), but they remain guesses, and even then you can't tell the difference between that and random motion.

a day ago

T-A

> RAM prices spiked speculatively, and they're going down for the same reason.

https://pcpartpicker.com/trends/price/memory/

Note how flat the black lines are.

Then note how wide the gray bands are. That makes it very easy to cherry-pick a few examples to present as "supporting evidence" that prices are doing whatever you want to believe they are doing.

a day ago

ajross

FWIW, you're misreading that chart. It shows a wild increase in memory prices, no matter how much you try to cherry pick.

An example might help: in July of last year I bought exactly this 2x32 DDR5 kit for $141: https://www.amazon.com/dp/B0DSR14511

It's showing $999 now, which seems about median for similarly-spec'd memory on Amazon. The cheapest slot-and-capacity-compatible equivalent I can find is around $570, even. So 3-5x increase, at minimum.

It's true that that's a high error bar. It's absolutely not true that the trend is ambiguous.

Can you cherry pick me a $141 kit, please? I mean, it's not an abstract question! I'd buy it from you right now if you had it or could get it, in whatever quantity you can source. No joke.

4 hours ago

cma

> RAM prices spiked speculatively

Didn't OpenAI buy up 40% of the capacity all at once?

a day ago

ajross

No, they signed a bunch of contracts for future deliveries. That's not a supply constraint. The factories making RAM continued operating and serving their existing deliveries, and in fact they still are.

Freshman economics would say that supply is fine and that prices shouldn't move. But they did anyway. And the reason is speculation.

a day ago

leoc

I don't get it tbh. Which market participants were speculating here? There aren't futures markets in RAM as far as I know, though I certainly don't know much. And the supply constraints appear to have been pretty real (though maybe not immediate) if e.g. Valve was begging publicly for RAM consignments. Were there pure-play speculators filling warehouses with DDR5?

a day ago

notatoad

>There aren't futures markets in RAM as far I know

sure there is. not formally, but if you hold a contract for x units of future production, you can sell that contract to somebody else who wants those units more than you do.

a day ago

irke

That’s a forward contract yeah. They def do exist.

Futures are standardised forward contracts traded on exchanges

a day ago

drakythe

The economy is vibe coded at this point.

Have we gotten any more word on the potential helium constraints that SK Hynix was making noise about after the strike on the helium plant in the Middle East that supplied 60% of South Korea's helium? Because that could definitely put a kink in things, since SKH is one of the 3 remaining big DRAM producers.

a day ago

cma

According to this he ordered them uncut and unfinished and may just warehouse until needed:

https://www.mooreslawisdead.com/post/sam-altman-s-dirty-dram...

It's still speculative whether OpenAI will go bankrupt and have to release it back to the market, but if it is holding them unfinished, it is a supply constraint on finished RAM chips even if not on wafer output.

a day ago

Forgeties79

I'll believe they're going down when the RAM I purchased a year ago for $105 doesn't cost $550. Yes, consumer prices lag commercial prices, yada yada, but I think any hot takes are pointless until we see lower prices or far more convincing evidence they're coming. When 32GB of DDR5 costs basically as much as a MacBook, it's hard to hear "RAM is coming down for sure."

a day ago

hirako2000

Not crashing yet. The article is looking 1 to 5 years out.

Given Nvidia's CEO's agitation I would give the prediction some credit, and if it's correct the price will go back to what it was, or even lower if investments in capacity are made today.

a day ago

michaelcampbell

My take is new capabilities will consume any price reductions, making them moot. At least in the medium term.

A RAM price drop due to some magic efficiencies assumes everything else doesn't change, which I doubt anyone honestly thinks will be the case.

a day ago

sigmoid10

Yeah, I also stopped reading at that point. If I want a bunch of random, made up facts to sell lukewarm opinions or steer the uneducated masses, I'll tune in on a Trump press conference. Why does this feel like someone is desperately trying to make reality mirror his flailing market bets?

a day ago

Forgeties79

Sometimes it's real easy to see who has risky short positions right?

a day ago

mNovak

This feels similar to when Deepseek first debuted with claims of ultra-low cost training, and all the pundits exclaimed that Nvidia was finished, the bubble had burst, etc.

a day ago

sandworm101

There is also demand for RAM in other areas of data centers. As we are all pushed deeper into clouds, I can see the rise of RAM for data storage (RAM drives) continuing to eat into the supply. A module of DDR5 will be more useful in a Netflix rack streaming movies 24/7 than in a gaming PC where it may only be used an hour or two every day.

a day ago

infecto

It's incredible how polarizing the AI rush is. I keep the perspective that the technology is an absolute step change, but I have no idea where the cards will fall. I take a lot of issue with this style of article; I get the sense that the authors are being overly defensive.

The cost to serve tokens is absolutely profitable today, and that's been true for at least a year. What's unclear is how R&D and capex fit into the picture. I am not that pessimistic on this front either, though. For the data center build-outs, demand for tokens is still exceeding supply. On the R&D front, well, most of us here on HN have benefited from decades of overinflated engineering salaries paid by companies that were often not just unprofitable but usually without a plan for success. In the current rush, supply cannot keep up with demand; it's a much easier math problem when you have something that people want (tokens) and you need to figure out profitability once R&D is included.

a day ago

Aperocky

Demand for tokens is absolutely skyrocketing.

And unlike the traditional "this will replace humans right away" story, I think what this introduces is a lot of incentive to spread those tokens into places where there was never any incentive to hire a software engineer before. In turn, that will drive a lot of business activity in those areas that will potentially fail given the current quality of the output.

This feels like a boom-before-bust scenario, and I'm not even sure it will bust.

a day ago

skeeter2020

Maybe we need to focus on a better definition of "bust" but we will surely see something along the lines of the hype-cycle graph in AI; what technology has not fallen into the trough before (best case) reaching a more steady-state of use and growth?

a day ago

Aperocky

It's also funny because a bust requires two quarters' time. The AI cycle can be faster than that? My workflow has been unrecognizable compared to 6 months prior, every 6 months, for the past 24 months.

Maybe we are close to the singularity, or maybe we'll just plateau somehow. But in either case there is so much work to support the breakneck change that isn't getting done, because the change takes priority every single time; there should be a lot of things to work on.

a day ago

ido

cars? airplanes?

a day ago

gdilla

the busting will come from the token consumers. so many disasters waiting to happen.

5 hours ago

WarmWash

>potentially fail given the current quality of the output.

The question is how big the fail is if you measure it in 3 month increments going back to late 2022.

a day ago

Aperocky

Failures are beneficial to an economy. If there are no failures, you end up with the Soviet Union.

As long as there are more successes, it should be net positive.

a day ago

hirako2000

Tulip sales also skyrocketed.

Seriously, what value are tokens providing other than justifying layoffs? Concretely. Today. Not in the speculative scenario where cardiologists could be replaced with models.

We see this new trend of agentic coding, again a promise that software will be written that way going forward, despite the number of fiascos already experienced when trusting a model turned bad. The use case may provide value, but right now all it does is fulfill the push for token consumption that all these AI leaders are advocating for.

a day ago

sempron64

It's ridiculous to call this tulips, in the sense of a speculative asset whose price depends on resale. A more similar recent example is the dotcom boom and bust based on building internet infrastructure, or the 2008 crash which was based on cyclical infrastructure overinvestment. These crashes were characterized by demand growth not keeping up with investment because the target markets were tapped out. Not clear when we'll get there with AI. The consumer market seems saturated on chatbots but we're not even close to saturated for b2b or self driving for example. And this discounts other new technological offerings which may unlock larger consumer markets (products where people are willing to pay $100 a month instead of 10 or 20)

All that said the dotcom boom is extremely analogous and that crash was quite bad.

a day ago

skeeter2020

dotcom was maybe 100B a year focused on the US and mostly VCs. AI is perhaps 250B global VC (with more than half of ALL VCs concentrated in one sector) and another 800B+ from non-VC. These numbers are basically a guess but structurally we are set up for something much, much worse.

a day ago

infecto

But unlike the dot-com boom, demand for tokens has not let up, and there is increasing demand. I don't know where it falls; certainly some companies won't get it right and will either over- or under-build. With the current rate of change in demand, it's hard to understand why you would stop building today.

a day ago

hirako2000

Demand for tokens exists, yes. On one side you have huge demand for infinitely subsidized tokens so that people can post a "unique" illustration when posting on social media, along with the text itself even.

On the other end we have professionals happy to pay a subscription for heavier use, to build something in the hope of selling it.

I figured I don't believe in value when my dad explained to me that his mate fired his team once he realised he could just pay 20 bucks for his Gemini account and run his business. I asked, do you call this value add? He said it must be, since he can produce the same output with no staff.

There is a confusion between profiting from a circumstance and value creation.

You create value if, say, you cure a disease. Whether it takes you an army of staff, or whether you extract maximum profit from it, is just a feasibility formula.

That you make the cure more affordable is value creation.

That you cure the same disease but increase your profit doesn't create any value, except to yourself, for a while.

a day ago

Aperocky

> I figured I don't believe in value

Maybe you don't, but it's fairly obvious that a lot of things are changing and moving.

Maybe your dad's mate didn't have to expand his business; good for him. Other businesses are expanding because they now can.

Will the positives outweigh the negatives? Not necessarily, but "it's tulips" is the kind of argument so devoid of nuance that we shouldn't be making it on HN.

The overwhelming demand for tokens would not come from people wanting a unique illustration; it would come from professional usage. In fact, I'm not even sure who is subsidized. The $20 subscription surely isn't being used fully across all members of that subscription.

a day ago

Throaway199999

^^^^^

a day ago

fsloth

Yeah, this is the difference.

The 2000s tech bubble was caused, among other things, by over-investment in infrastructure and technology that had no users yet.

Totally different setup.

Does not mean AI boom will not turn to bust, but weak analogues generally don't help with understanding complex systems.

a day ago

matheusmoreira

> Seriously, what value are tokens providing other than justifying layoffs. Concretely. Today.

Claude helped me implement a ridiculous number of features in my programming language. It's helped me migrate the heap to an easily moveable index-based object space. It's helped me implement generators. It's helped me implement a new memory allocator. It's helped me fix a ridiculous number of bugs and make a huge number of small improvements everywhere. Its ability to provide me repository-wide code review was a game changer for a solo developer like me. And it's doing so much more than that. I got more done in the past few weeks than in previous months, even though I'm evaluating, learning, understanding and rewriting the AI output.

It's actually addictive to build things with Claude. The usage limits are starting to make me anxious, just like withdrawal syndrome. I applied for their open source max subscription program even though I'm too small for it because who knows, I might get in anyway and it costs nothing.

AI is quite literally a world changing technology. I hope the open models keep steadily progressing and that hardware remains available to all so we can run our own models on our own computers one day.

a day ago

Aperocky

Just pay for it, think of it like college tuition.

Just far cheaper (if you are in USA) and probably more useful in terms of job prospects.

a day ago

matheusmoreira

I am paying for it. I subscribed to the Pro plan about two weeks ago. The Max plan is a significant expense I can't justify. I straight up cannot afford the API prices as an individual.

a day ago

rhaen

Tulip futures skyrocketed, it was economic speculation on a useless asset, not supply and demand. Crypto is the analogy, not AI. Given that the major AI labs other than GDM are private, this is even more true.

Agentic coding absolutely blew up from demand, users are not being tricked into paying $200 a month, and they’re not complaining about hitting rate limits because it’s useless.

a day ago

aurareturn

> users are not being tricked into paying $200 a month

I can't believe people actually believe that people and companies are tricked into paying for tokens. My $20 Codex subscription is so useful, I can easily see myself paying $200 for it.

This belief is so common amongst AI collapse people online. I'm guessing these people have only used free ChatGPT or worse, they use Windows and get Copilot shoved down their throats?

Meanwhile, I'm flying around with a $20 Codex subscription doing everything from writing code, analyzing stocks, coming up with ideas, etc.

a day ago

fsloth

I'm paying $20 for Codex and $90 for the Claude Max plan. They are a "pry from my cold dead fingers" product for me.

IMO if someone last tried this tech 6 months ago, or their only exposure is e.g. via MS Copilot, they have a rational reason for skepticism. No technology of this complexity has improved this rapidly in my memory (well, OK, we had the CPU speed races from the 90's to the early 2000's).

a day ago

BirAdam

The CPU speed race might be the most apt comparison I've yet heard.

From the 80486 to the AMD Athlon64 X2, much of that progress was enabled by better EDA run on the more powerful CPUs made with each improvement.

Now, we have better models helping to create even better models.

a day ago

tartoran

Would you still pay if prices were to increase, say to $1500-2000 monthly?

a day ago

aurareturn

Probably. I assume the value would drastically increase. Companies will definitely continue to pay for it. It's irreplaceable now.

a day ago

tartoran

How about if they plateau but prices skyrocket? Most companies would pay, but if you're not working for a company that does, what's the line beyond which you'd think twice about paying for it yourself? 500? 1000? 1500?

a day ago

aurareturn

Why would price skyrocket?

Let's say they have already plateaued. But hardware continues to get better, right? So tokens should go down in price, not up. Since they're already at 50%+ margins on inference today, better hardware would let them generate more tokens for less money.

I would pay $500 to start, build stuff with it, then keep going up the tiers as the stuff I'm building makes money.

a day ago

fsloth

Privately no. Professionally yes.

a day ago

gruez

>Seriously, what value are tokens providing other than justifying layoffs. Concretely. Today.

It's adding tests for me and doing medium complexity refactors that I'd otherwise have to spend hours on

a day ago

michaelcampbell

Same, and constructing at least drafts of huge documents that I can iteratively fine-tune, which has (at least last week) saved me tens of hours.

And based on reality (code) rather than my feelz about what I vaguely remember the code doing in some long past.

a day ago

datsci_est_2015

These examples put it in the category of "best IDE ever created, by a wide margin" - but not "replacement for the programming workforce".

a day ago

infecto

I know there is a large force on HN that wants to deny the value of tokens, and I know it's anecdotal, but the writing is on the wall. If it's not valuable to your workflow today, it will be soon. I already have tests being written, and automated hooks into bugs where an initial PR gets generated with a potential fix. It's far from perfect, but junior engineers are far less productive.

a day ago

rileymichael

> there is a large force on HN that want to deny the value of tokens

there is an even larger force on HN that financially _needs_ the value of tokens to be inflated (so much so that bots have overwhelmed the site)

a day ago

infecto

That’s not me. I am simply an engineer who gets a ton of value out of these tools.

a day ago

therealdrag0

Really? Do you think there are more bots and employees of AI stakeholder companies than there are vanilla engineers on the site?

a day ago

rileymichael

by far. at this point there are very few tech companies without exposure to AI

a day ago

jdmoreira

are you seriously comparing AI to tulips? I don't even know what to say. Even if you are very bearish about the technology certainly you can't be this detached from base reality. Yet here we are

a day ago

somewhatjustin

> Seriously, what value are tokens providing other than justifying layoffs.

Coding, writing, summarizing, translating, data analysis, customer support, test generation.

a day ago

senordevnyc

In the last eight months, my solo SaaS has gone from $0 to $325k ARR, and growth is accelerating. I run tens of billions of tokens per month through automated pipelines for the product itself (which replaces an ultra niche human-driven process in a very non-technical industry), plus probably low billions more per month for coding, systems operation and management, data analysis, etc. And I feel like I'm just barely scratching the surface of what today's models are capable of.

19 hours ago

vekker

> Seriously, what value are tokens providing other than justifying layoffs

Like the OP said, it's incredible how polarizing this debate is. When I read comments like yours, I feel like a significant part of the global workforce in IT must be living on another planet? Or they never really used Claude Code, Codex, OpenCode, ... intensively before because of company policies?

I legitimately am at least 10x more productive than a year ago, and I can prove it in number of commits and finished monetizable features developed per day. Obviously my workflows still very much require an active, constantly context-switching human-in-the-loop, but to me there's absolutely no question both output volume & quality have skyrocketed.

a day ago

classified

> 10x more productive

That claim is totally worthless without you providing concrete information how you measured that.

a day ago

hirako2000

And that's my point about value. That engineers can spit out far more code, or that they don't have to think as much, is surely a precious convenience.

Value add so far lacks evidence.

Layoffs: it justifies them to the public. I'm not certain it warrants them, as it contradicts a principle of enterprise: scale as much as you possibly can.

If tokens provided value today, we would be hiring more engineers to review their output and put things together.

a day ago

vekker

I literally wrote how I measure this in the post you are replying to: number of commits, which is admittedly a worthless proxy for productivity, so, more importantly, number of finished production-ready features delivered.

That number is at least tenfold what it was before, simply because I can run a lot of gruntwork in parallel now without wasting brainpower and focus on that stuff.

a day ago

Throaway1975123

I created 5 websites this year and am working on 3 prototype games. For free. Without any knowledge of coding beforehand.

a day ago

hirako2000

Value?

There are millions of other wannabe engineers doing exactly the same, assuming demand will scale as much as the supply.

What returns are you getting on those?

Let me create 500 websites, deployed for free, I hand that over to you by end of day. Will you give me a cent per piece? If so, happy to do business with you.

a day ago

Throaway1975123

The value is obviously to the people who will use this to replace engineers.

I would happily pay $200 a month for this. Luckily I dont need to, it's free.

Literally every game and website that I would have had to pay someone else to make I can now make myself. There's no value in that?

A year ago the best free LLM couldn't even give me a basic gridmap and collision. Now it can give me a full RCT style prototype & editor in 20 iterations.

I can only imagine what improvements we will have NEXT year!

a day ago

hirako2000

> luckily I don't have to. It's free.

Ponder that for a minute.

There are over 2 million games, for Android alone.

That you weren't making games before the advent of LLMs makes it cool for you to build, and at no cost. But people have been able to make games without them and already grew the market to saturation.

If the outcome of LLMs is that we get more games, it won't imply that people will consume more games. Most games never get played anyway.

a day ago

Throaway199999

There's nothing to "ponder" as you so patronizingly put it, and your stats on gaming are self-evident.

Op never said they're selling games. They said they're making their own games and websites for a fraction of the cost (even $0). That's amazing value. And it's just getting better.

a day ago

hirako2000

So that $0 is supposed to count toward the value add that justifies the sort of funding we're seeing?

I didn't mean to patronize; sometimes self-evidence isn't trivial to notice.

a day ago

Throaway199999

The funding is in anticipation of AI becoming so good that mistakes are only seen in the most complex output. In consumer applications, it's hard not to see that happening, given the exponential improvements of the past year. Whoever gets there first can capture the market.

21 hours ago

Throaway1975123

Tulips had literally no economic value. LLMs do.

a day ago

drakythe

I say this as someone who has used them to boilerplate/scaffold a bit of code by this point: Economic Value of LLMs is debatable, if only because they're being too broadly applied.

a day ago

Throaway1975123

Debatable, sure. Not 0. Tulips are 0; they add nothing to anyone's output. LLMs do. LLMs are not tulips.

a day ago

skeeter2020

This is changing the narrative. Nobody really cares about tulips and some dumb throwaway comparison. Unless LLMs are worth an awful lot, the math here does not make sense. That is both debatable and important.

a day ago

hirako2000

Since I brought up the tulips: People do care about Tulips. They do have value. So do LLMs. How many people will remain willing to pay for them, and how much, is what we call speculation.

a day ago

Throaway1975123

No, it isn't changing the narrative. Tulip bulbs were a huge bubble based completely on speculation. No one ever used a tulip to create a piece of software, or anything else; their economic value was precisely 0. The whole thing was a bubble. LLMs may be IN a bubble, but they aren't tulips.

a day ago

boriskourt

> The cost to serve tokens is absolutely profitable today and that’s been true for at least a year.

> For the data center build outs, demand for tokens is still exceeding supply.

Can you provide any numbers for this?

a day ago

wongarsu

I can get Kimi K2.5 inference on openrouter for about $0.5/MTok input + $2.5/MTok output, from six providers that have no moat besides efficiently selling GPU time. We can assume they are doing so at a profit (they have no incentive to do this at a loss), giving us those numbers as the cost to serve a 1T-a32b model at scale.

Now we don't know the true size of any of the proprietary models, but my educated guess is that Sonnet is in about the same parameter range, just with better training and much better fine tuning and RLHF. Yet API pricing for Sonnet is $3/MTok input + $15/MTok output, exactly six times as expensive. Even Haiku is twice as expensive as Kimi K2.5.

I find it difficult to believe in a world where those API prices aren't profitable. For subscription pricing it's harder to tell. We hear about those who get insane value out of their subscription, but there has to be a large mass who never reach their limits. With company-wide rollouts there may even be a lot of subscription users who consume virtually no tokens at all.
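The price gap in those numbers can be checked with back-of-envelope arithmetic. A minimal sketch, where the 3:1 input:output traffic mix is an illustrative assumption, not a measured workload:

```python
# Blended $/MTok comparison between an open-weight model on commodity
# providers and a proprietary frontier model, at the prices quoted above.

def blended_cost_per_mtok(input_price: float, output_price: float,
                          input_ratio: float = 0.75) -> float:
    """Weighted average price per million tokens for a given traffic mix."""
    return input_price * input_ratio + output_price * (1 - input_ratio)

kimi = blended_cost_per_mtok(0.5, 2.5)     # Kimi K2.5 via OpenRouter
sonnet = blended_cost_per_mtok(3.0, 15.0)  # Sonnet API pricing

print(f"Kimi blended:   ${kimi:.2f}/MTok")    # $1.00/MTok
print(f"Sonnet blended: ${sonnet:.2f}/MTok")  # $6.00/MTok
print(f"Ratio:          {sonnet / kimi:.0f}x")  # 6x
```

Because both the input and output prices differ by exactly 6x, the blended ratio comes out to 6x no matter what traffic mix you assume.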

a day ago

yobbo

> We can assume they are doing so at a profit

This is false. We may assume it's the most efficient way of generating revenue given their GPUs, but their overall profitability is just a guess. They would still have incentives to run hardware at maximum, even when it's uncertain they will eventually recoup costs.

> a world where those API prices aren't profitable

A lab with employees and models in training has other costs than the operating expenses of a GPU farm.

a day ago

aurareturn

Why would a company sell inference on OpenRouter if they're not profitable? Except for Groq/Cerebras and a few other hardware companies looking to showcase their new chips.

If they're losing money and have no VC backing, they'd just turn off the lights.

a day ago

financltravsty

The actual inference is operated at a 95%+ margin.

a day ago

FiberBundle

This is like saying that innovative medical drugs could be sold at a profit if only there were no patent protection and the innovative companies would still invest in R&D. Yes, on a token level pure inference might be profitable, but the frontier AI labs will surely have to recoup their R&D investments at some point.

a day ago

jerojero

Companies doing foundational models need to cover the cost of training which is much more expensive than training something like kimi.

a day ago

wongarsu

Yes. I would not consider Kimi a particularly good model relative to its size, and making a SotA model is a lot more expensive. But training costs are explicitly excluded when talking about the cost to serve tokens

a day ago

gruez

>Companies doing foundational models need to cover the cost of training [...]

But that's moving the goalposts? The original claim was on inference itself, not the whole company.

> The cost to serve tokens is absolutely profitable today and that’s been true for at least a year.

a day ago

lbreakjai

But that's the same as thinking "This bar is selling a cocktail for $15. I could make it at home for 30 cents. They're making $14.70 of profit per cocktail; the owner must be a millionaire by now!"

Everything is profitable if you ignore the costs.

a day ago

ZitchDog

> they have no incentive to do this at a loss

Are you sure? Surely there is a lot of interesting data in those LLM interactions.

a day ago

wongarsu

Many of them are promising not to store any of this. Of course we have to trust them, for all we know they are all funded by various spy agencies

a day ago

KallDrexx

The problem I have with this analysis is it's missing the multi-dimensional aspect of "is this profitable".

It's fair to say that if all these operators are competing for tokens, that the OpenRouter token operator (not sure the exact phrase but the people running the models) are accounting for some level of margin.

However, how many of these are running their own data centers and GPUs?

If they are running their own infrastructure, then it's not a simple equation of if each specific token set is profitable, since it needs to account for the cost of running the data center. It could be that they believe that it is profitable in the long term by utilizing the long tail of asset depreciation, but that isn't guaranteed.

IF they aren't running their own infrastructure, then it's much easier to claim that it's profitable and has a margin (outside of running their servers to manage the rented infrastructure).

HOWEVER, a lot of data centers have some pretty crazy low prices for GPUs that may be vying for user base and revenue over profitability. In these cases, if data center growth starts slowing due to slower buildout then it's very likely GPU prices go up and inference stops becoming profitable for the open router owners.

So long term it's not clear how profitable even these open models are.

OpenAI and Anthropic definitely fall into the latter category too. Their infrastructure requirements are much higher than the open models, and they are being given huge discounts so Microsoft/Amazon/Google can all claim revenue (since they have profitability coming from other parts). It's not clear if OpenAI and Anthropic models would be profitable at inference if they were paying rates that cloud hosts would make a profit from.

There are just way too many dimensions to this scenario to flatly state that OpenRouter proves inference is profitable at scale.

a day ago

ACCount37

Check the token prices for open weight LLMs at various independent inference providers.

That gives you a very good estimate of how much you can charge to serve tokens from a model of size N while still making a profit.

Now, keep in mind: Kimi K2.5 is a 1T-parameter MoE. Today's frontier LLMs are in the 1T to 5T range, also MoE. Make an estimate. Compare that estimate with the actual frontier lab prices.

a day ago

lolc

I don't think it's as easy as looking at open weight API prices. We don't know whether the operators are making a profit on all the hardware they bought. Maybe the prices we pay just cover electricity. And it's not even certain that running costs are covered by API prices: the operators may be siphoning content and subsidizing operations by selling that.

In the current volatile environment, the API prices are more of a baseline where we can assume it can't be much cheaper to operate these models.

a day ago

aurareturn

That doesn't make sense in this environment because everyone is compute constrained with huge backlogs they can't fulfill. If these inference providers aren't making any money, they'd simply sell their GPUs to those who are starved for compute.

a day ago

infecto

Most if not all private labs have stated that inference is profitable. This was happening before the big push to scrap plans and largely charge folks the underlying API rates. Second, take a look at the pricing of open models. Certainly it's not a direct 1-to-1 comparison, but we can use it as a baseline. Of course folks might not be telling the truth, but this is one of those situations where I see too many markers on the true side.

For supply look at outages and growth rates at companies like openrouter. The demand is growing every week.

a day ago

paulddraper

Anthropic has said inference is profitable. That’s a biased source, but the math pencils.

This is why switching to local open weight models saves a lot of money. (Even though it’s not apples to apples.)

a day ago

drakythe

Anthropic also recently tweaked their usage limits to discourage use during peak hours. Why would they do that if inference was profitable?

a day ago

infecto

Don’t confuse inference (api usage) with the consumer plan products. When people say inference is profitable they are referring to the cost to serve a token via the API. The consumer products are absolutely a question mark on profitability and as we see with most of the business and enterprise plans, going away for pure on demand use (api cost) full time.

a day ago

strangegecko

Profitability doesn't imply infinite ability to scale. Of course they will want to prioritize their most profitable customers when they hit capacity issues.

a day ago

aurareturn

They do it because their demand is higher than the compute that they have available to them. Their GPUs must be melting during peak hours so they're encouraging people who move their workload to off peak hours if possible.

This is the opposite of an AI bubble burst.

a day ago

paulddraper

Those are subscription plans. They tweaked the limits/periods included in the subscription. Having higher limits for subscription plans didn't give them any more revenue.

a day ago

financltravsty

Their infra team is very understaffed and they are reacting to the public backlash of "no 9s?"

a day ago

nyeah

Can you give a few penciled numbers?

a day ago

paulddraper

You can rent a H100 GPU for $4/hour. [1]

300k tokens for that hour.

OpenAI charges $6.

Those are pessimistic assumptions.

[1] https://lambda.ai/instances
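Penciled out (a rough back-of-envelope, taking the rental rate, throughput, and price above at face value):

```python
# Back-of-envelope inference margin, using the figures above.
GPU_COST_PER_HOUR = 4.00     # H100 rental, $/hour
TOKENS_PER_HOUR = 300_000    # ~83 tokens/sec, sustained single stream
API_REVENUE_PER_HOUR = 6.00  # what those 300k tokens sell for at API rates

margin_per_hour = API_REVENUE_PER_HOUR - GPU_COST_PER_HOUR
breakeven_utilization = GPU_COST_PER_HOUR / API_REVENUE_PER_HOUR

print(f"margin at full utilization: ${margin_per_hour:.2f}/hour")  # $2.00/hour
print(f"break-even utilization: {breakeven_utilization:.0%}")      # 67%
```

That ~67% break-even figure is why utilization, not raw price, is the real question.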

a day ago

hajile

Can you keep that GPU 100% saturated at least 16 hours per day every day of the week?

If not, you aren't breaking even.

a day ago

paulddraper

Note this is also assuming you

(1) Rent your GPUs.

(2) Pay list price, no volume breaks.

(3) Get only 85 tokens/sec. Realistically, frontier models would attain 200+ tokens/second amortized.

Inference is extremely profitable at scale.

a day ago

aurareturn

Assuming 80GB H100 and you inference a model that is MoE and close to the size of the 80GB VRAM, you're going to see around 10k tokens/second fully batched and saturated. An example here might be Mixtral 8x7B.

You're generating about 36 million tokens/hour. Cost of Mixtral 8x7b on Open router is $0.54/M input tokens. $0.54/M output tokens.

You're looking at potentially $38.88/hour return on that H100 GPU. This is probably the best case scenario.

In reality, inference providers will use multiple GPUs together to run bigger, smarter models for a higher price.
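The arithmetic behind those figures, as I read it (the $38.88 appears to assume an equal volume of billed input tokens alongside the output):

```python
# Fully-batched serving estimate using the numbers above:
# ~10k tokens/sec on one H100 for a Mixtral-8x7B-class MoE,
# priced at $0.54 per million tokens on OpenRouter.
TOKENS_PER_SEC = 10_000
PRICE_PER_MILLION = 0.54

tokens_per_hour = TOKENS_PER_SEC * 3600                      # 36,000,000
output_revenue = tokens_per_hour / 1e6 * PRICE_PER_MILLION   # $19.44 on output alone
with_matching_input = 2 * output_revenue                     # $38.88 if billed input == output

print(tokens_per_hour, round(output_revenue, 2), round(with_matching_input, 2))
```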

a day ago

drakythe

$3.99/hour at 8x instances, with a minimum 2-week commitment. Good luck getting a 70% usage average during that time. Useful when you're running a training round and can properly gauge demand, not so great when you're offering an API.

a day ago

infecto

Is it not a good penciled number? It helps set the directional tone that inference cost is being covered.

a day ago

drakythe

It says the numbers are theoretically possible. Requiring a 66% usage to break even when 100% usage will piss off customers by invoking a queue means it’s a balancing act.

“Technically correct. The best kind of correct.” So inference may technically be _capable_ of being profitable, but I have questions about it being profitable in _practice_.

a day ago

iterateoften

According to OpenRouter, token demand is growing at something like 10% a week.

It’s insane
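Taking that 10%-a-week figure at face value, the compounding is wild:

```python
# 10% weekly growth, compounded over a year.
weekly_growth = 0.10
annual_multiple = (1 + weekly_growth) ** 52
print(round(annual_multiple))  # ~142x if sustained for 52 weeks
```

No growth rate like that survives for long, but even a few months of it explains the scramble for capacity.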

a day ago

infecto

I wish this was higher up. I have been tracking the same since Thanksgiving ‘25 and the growth is unreal. Again, I don’t know where the cards will fall; maybe the industry overspent on capex. But it’s at least easier to see why they are spending, based on demand. The risk of being left out is greater than the risk of overbuilding.

a day ago

coldpie

I do wonder how much of the apparent demand is driven by companies automatically running these things when users didn't actually ask for it. For example every web search I make now has an AI response that I scroll right past. I'm sure that counts for someone's token usage data, but I got zero value from it. This is happening in almost every software product now.

a day ago

irke

Tokens as a metric is the analogue of users as a metric.

In the end value per user is what matters in relation to being a healthy going concern and valuation in relation to Meta for example. Value per token is what should matter too - after all that’s what people are paying for.

a day ago

chrisweekly

> "decades of overinflated engineering salaries"

'Overinflated' relative to what? You make some good points but I don't accept this as a premise.

a day ago

schmidtleonard

Overinflated relative to the wet dreams of the ownership class.

a day ago

gruez

It's not exactly stuff of "wet dreams of the ownership class" to say that of the possible white collar careers, software engineering is pretty hard to beat in terms of salary vs work you need to put in.

a day ago

schmidtleonard

This is a story of other careers having salaries pushed down relative to inflating essentials and the resulting economic surplus being squeezed into asset portfolios. It's a story of rich people getting paid for being rich in proportion to how rich they are and soaking up more than 100% of economic growth for the last 50 years. Not a story of software engineers working for a living and getting what would have been a blue-collar salary for it.

a day ago

mono442

Average salaries for software engineering seem higher compared to other professions because the jobs are mostly in the most expensive cities to live in. There are no SWE jobs in smaller towns, but there are jobs for e.g. accountants.

a day ago

financltravsty

Compensation is usually more tightly coupled with leverage, rather than "work."

a day ago

fcarraldo

Well, not GP, but I do. Let’s look at the numbers:

Median senior SWE salaries in SF: https://www.levels.fyi/t/software-engineer/levels/senior/loc...

Median income in metro areas: https://www.cnbc.com/2024/07/11/the-median-salary-for-the-25...

Engineering salaries are significantly higher than nearly every other industry on average and on median. Much of this is driven by VC funding rather than sound, profitable, bootstrapped businesses with sustainable profit margins.

Engineering salaries have also been driven upwards significantly the past ~10 years (since the post-2008 crash recovery), while wage growth in the US is mostly stagnant. I don’t have a source handy for that, but there are plentiful studies.

Outside of the US this may be less true, but I took GP’s “most of us on HN” to mean people who work in US tech companies, which are primarily concentrated in high-COL areas.

a day ago

marcyb5st

Isn't salary a proxy of how hard to replace one person or a group of persons is or how valuable they are?

There was a surge in demand for SWEs and scarcity brought salaries up. Are they too high? Hell no. On average, my colleagues and I generated ~$2M each in 2025 for our company, while we get paid a fraction of that (grants and bonuses included). If you look at net income per employee, we are at around $700k each in 2025.

Additionally, employers try their hardest to drive costs down (e.g. offshoring as much as possible, everyone doing layoffs at the same time, ...) and average/median salaries remained high. If the salaries were overinflated, those numbers should have come down, I believe. The fact that they didn't makes me think it is still a scarcity problem, not an overinflation one.

a day ago

gruez

>There was a surge in demand for SWEs and scarcity brought salaries up. Are they too high? Hell no. On average, my colleagues and I generated ~$2M each in 2025 for our company, while we get paid a fraction of that (grants and bonuses included). If you look at net income per employee, we are at around $700k each in 2025.

So by that logic, housing in coastal cities also isn't "overinflated"? After all, like SWEs, it's also scarce and in demand. It also provides enormous value to the people buying/living in it; otherwise they'd be living in Oklahoma or whatever and paying a fraction of the cost.

a day ago

marcyb5st

Maybe we give different meanings to the word "overinflation". I see it as something that is speculative/shady in nature. Is housing overinflated? Probably in some places, for sure, because those who already have a house or invested in real estate want to cut down supply to raise prices.

Is the same true of the job market? I don't think so. I never heard any SWE saying "let's scare people away from a CS career so we can bargain for higher salaries". The opposite is true though. Companies participate in career fairs and pre-uni events to make people gravitate towards CS careers, so with a higher supply each employee loses a bit of bargaining power.

Small excursus: this very fact was taken to the extreme in 2022, when everyone did layoffs at the same time despite the numbers still being great. If you put 300k people on the street at around the same time, you can hire some of them for way less money, as they have now lost all leverage (since there are 299,999 other people waiting in line for a job).

a day ago

B56b

Ya, that sounds right to me. Coastal city housing is very supply constrained, which is part of why it's so expensive, but it is hugely in demand and provides tons of value by letting many people live near high-paying companies. Unless by "overinflated" you mean a constrained supply/demand curve?

a day ago

rileymichael

> Engineering salaries are significantly higher than nearly every other industry on average and on median

now compare the profit per employee at tech (software engineering) companies and those industries..

a day ago

fcarraldo

At the top end (say, top 100 tech companies) it’s pretty high indeed. Public companies, for sure, as otherwise their stock price would tank. It’s not uncommon in this industry to have margins above 70-80%.

But there are thousands if not tens of thousands where the profit per employee is minimal or negative.

I can’t find a source for all of tech (the data wouldn’t exist for private firms anyway), but I think it’s telling to look at this list, scroll down to about the middle, and compare the revenue-per-employee figures there with the salaries you or your colleagues are pulling. Software revenues are certainly high, but the industry is afloat because these high-margin businesses create returns so that low-margin businesses can exist. Without the massive infusion of upfront capital, very uncommon in other industries, it’s simply not sustainable.

Typically a market that’s buoyed by its top performers but has significant amounts of capital tied up in under performers is called “a bubble”.

https://www.trueup.io/revenue-per-employee

a day ago

infecto

Thank you for saying it better than I could have. It’s probably an unnecessary jab, but I know how well I benefited financially in an industry where not much was expected in terms of output, with lavish perks and huge base salary and stock compensation. Absolutely, some companies are extremely profitable per headcount, but I look at the sea of failures and how well engineers have generally done. It sets the tone for this massive negativity I see around AI, when so many of us have benefited from VC money that failed.

a day ago

malfist

> The cost to serve tokens is absolutely profitable today

How can you possibly say that? Everyone knows that's not the case, these companies are losing money every day selling tokens. Revenue is not the same thing as profit.

a day ago

infecto

Don’t misread what I said. Bottom line, these companies are not profitable yet, but it is profitable to serve a token via the API. They have increasing demand, not enough supply, and models getting better on quick timelines. For sure there may be some losers, but it’s not hard to see that token serving can be a profitable activity.

a day ago

scrollop

This is a claim in the wind until you provide evidence.

a day ago

jeromegv

Yep, especially if we look at what happened just last week, both Google and Anthropic have dropped how much you get out of your existing plans.

a day ago

infecto

Don’t confuse plan changes with profitability of inference. When people talk about the cost to serve a token and it being profitable we are referring to API cost not the plans which absolutely subsidize some level of use. Hard to know what is breakeven on plan math.

a day ago

surajrmal

That's not necessarily a profitability thing as much as a demand thing. The only way to improve supply for those willing to pay more is to take it away from those paying less. Once supply catches up to demand, things will change.

a day ago

dist-epoch

There are private companies which rent/buy GPUs, run open-weight LLMs on them and sell the tokens. They absolutely make profit, and their clients think they get a good deal and are buying the tokens.

a day ago

naravara

I think they’re losing money because they have to amortize the costs of training the models in the first place, which is where most of the resource sink is.

This is why they were freaking out about DeepSeek just taking the trained model weights and slapping an interface on it.

a day ago

malfist

That's like saying a restaurant is profitable because it's making money selling meals, if you ignore the cost of ingredients.

Of course they are profitable if you ignore their cost to bring a product to market.

a day ago

infecto

The problem with that comparison is restaurants largely don’t have much room to adjust price or optimize cost. The AI industry is too new with many unknowns right now so investors are willing to take risk. For the hyperscalers the bet is that being left out is going to be a greater loss than overbuilding.

a day ago

naravara

That’s the wrong analogy. Model training is more like the setup costs of developing the menu and training staff. What’s driving the costs is important when talking about financial sustainability. If it’s mostly coming from optional R&D investments instead of the direct costs of producing the food then you can simply not exercise the option and be profitable. If it’s more coming as a variable cost that scales with each meal served that’s a very different situation.

Yeah it should be factored in, but it’s a different set of implications for long term sustainability. They don’t actually need to test and optimize a new menu every day or week. If they decide to just stick to the same one longer they can get way more return from each dollar spent on development. It’s just that right now the rate of improvement you get with training is really high and nobody can afford to fall behind their competition.

a day ago

malfist

These companies are continually training new models. This is not a long term amortized cost, it's actual COGS.

Yeah, sure you can ignore the cost of purchasing the building for the restaurant for most profitability calculations, but if every year or two you were tearing down your old building and building a new one, you better believe that has to be in your profitability calculations.

a day ago

s1artibartfast

I think the relevant question to define the counterfactual is what would happen if they stopped training.

If you can simply not remodel your restaurant and keep making money, then yeah, it makes sense to call it profitable.

a day ago

Tade0

My main worry is - once this is all over, the market consolidates and using LLMs will become a requirement in job listings, what's the highest price per million tokens companies will be able to charge us?

Currently on a given day I'm chewing through approximately the equivalent of my lunch money, but where there's opportunity to extract wealth, someone will find a way to do it.

a day ago

h14h

My (potentially naive) take is that open models will save us. The biggest markets for LLMs (e.g. coding) are narrow-enough to be served well by smaller models with proper RL. Cursor's Composer 2 (created from a Kimi K2.5 base) is a great example, and I expect it to be the first of many.

The wealth of great open models provides an excellent base for fine-tuning, distillation, and RL. I see a lot of untapped potential in the field of bespoke, purpose-built models that can be served far more cheaply than the frontier competition. I would not be surprised if we see frontier-adjacent experiences running comfortably on a Mac Mini by year end.

With frontier models seemingly hitting diminishing returns in quality, I struggle to see a world in which gigantic, expensive, general-purpose models don't become increasingly niche.

a day ago

bluedays

It's already a job requirement for a bunch of places, they're just not listing it. I lost out on a job recently because I haven't used cursor ai.

a day ago

Tade0

My friend has been looking for a job for the past few months and the other day he was given an HR LLM Agent to talk to.

He contacted the company saying he's not going to do that, to which they replied something along the lines of "sorry, that's our process".

13 hours ago

dist-epoch

Jensen is already talking about $1000/mil tokens soon.

But there is no real upper limit. Imagine an LLM which could answer the question "what does my company need to do to beat the competition?" Then realize that the competition asks their LLM the same question. So now everybody is bidding the price up, or using more tokens, to get a better answer.

a day ago

Tade0

This is the kind of bullshit I'm worried about the most.

Nvidia has a de facto monopoly in the datacenter-tier GPU market. He says this sort of stuff because he knows he can keep jacking up prices, because the cost will be transferred onto consumers - mainly software engineers.

13 hours ago

dist-epoch

If those software engineers keep buying it must mean they get more value out of it than they pay, right?

11 hours ago

irke

Complete nonsense.

In that world there’s no reason for a business enterprise to exist.

a day ago

SirensOfTitan

This is a classic HN mistaking the map for the territory. R&D and capex absolutely figure into de-facto profitability and sustainability for AI labs, despite their separate treatment in accounting.

> well most of us here on HN have benefited from decades of overinflated engineering salaries being paid by often companies that were not profitable and not only unprofitable

This is a really concerning perspective: people were paid what they were worth. Software is or was one of the few remaining arenas wherein a person can find a middle or upper middle class lifestyle consistently.

I will also note: a startup raising an $8M Series A and eventually fizzling out is not the same as the hundreds of billions invested in these AI companies without a path to profitability. It is utterly absurd to pretend these are the same thing: any company ingesting that much cash needs to justify its capacity to survive.

a day ago

fcarraldo

> Software is or was one of the few remaining arenas wherein a person can find a middle or upper middle class lifestyle consistently

Software salary inflation and expansion have made this the case. Tech’s accessibility to the educated has accelerated gentrification massively, driving up prices for rent and food. While the statement is correct, tech’s contribution to income inequality is part of the issue. If you lived in Austin or Chicago (especially Austin) prior to ~2010 you’ll have seen this first hand.

a day ago

reverius42

I don't think there are enough well-paid tech workers to affect things like the (national) market for food. Local rent markets are at least partly explained by this; I agree that the $3M houses near Palo Alto, CA are because of Big Tech salaries, but not the price of ground beef.

a day ago

fcarraldo

Not at the nationally owned chain grocery store, but at local establishments it’s certainly an issue and prices out longtime residents who don’t work in this industry.

Rent prices push everything up in the local market. Housing rents impact business rents in an area, as well as what the business’ owners and employees need to maintain their own lifestyles. People who live there now can pay more and will, so prices go up. But the local shop owner isn’t getting rich, they’re still struggling as everything around them rises.

a day ago

ForHackernews

> any company ingesting that much cash needs to justify its capacity to survive.

What, why? There are tons of low-margin, capex-intensive businesses out there.

I think AI will end up being like hosting. All the models will converge to being pretty decent, and the companies will have to compete on efficiency, since they are selling a generic commodity.

You can already see Anthropic fears this scenario since they try so hard to make people use their first-party tools rather than plugging Claude in as a generic part of a third-party stack.

LLM hosting is the next VPS.

a day ago

guzfip

> Software is or was one of the few remaining arenas wherein a person can find a middle or upper middle class lifestyle consistently.

I want to add something additional to this: it is one of the few fields that can afford middle or upper middle class lifestyle and is accessible.

I have no doubt that if I could redo my life with the necessary resources, I’d be more than capable of putting myself through med school and ending up with a secure career that paid more than I ever made in software.

But at this stage of life? I don’t have the time or money to spend a decade+ paying some institution tens of thousands of dollars to hopefully maybe have a real career.

Once software as a career dies, I suspect many will find themselves locked out of the middle class for generations.

a day ago

WarmWash

It was kind of a flash in the pan moment where you could leave your retail floor manager job, crash course this thing called "javascript" in a 3 month class, and then get hired for a six figure remote job if you could choke out a mildly competent github repo.

a day ago

infecto

Exactly. I don’t know why folks take so much offense to it. You could absolutely do just as you describe: spend 3-4 hours truly working while enjoying the lunch-and-learn sessions and the in-house lunch and barista. I definitely benefited from this and I am not ashamed of it, but it absolutely was this weird moment in time.

a day ago

9rx

> I suspect many will find themselves locked out the middle class for generations.

On the other hand, once software as a high paying career dies there will be nothing to prop up the status quo (high cost of housing, for example) so the middle class will return to being much more accessible to modestly paid jobs.

a day ago

infecto

Oh come on, there are no “classic HN mistakes” here. Inference is profitable, but the bottom line is not yet. This is a very young industry and, unlike those of the past, it’s much easier to picture a path to profitability. It’s absolutely different in that marginal cost scales linearly, but solving for the R&D portion of a product where supply cannot keep up is a lot easier than some SaaS where the underlying product is not being used.

The salary jab was probably a little harsh.

Your ending is a bit of a fizzle too. There are many capex intense businesses that do just fine.

a day ago

keybored

> This is a really concerning perspective: people were paid what they were worth.

Even interpreting what-they-were-worth in the usual sense, I’m not so sure about this. We have seen wage collusion reported among the usual US West Coast-based companies. And some news on here[1] has reported that some engineer with a salary of $100K[2] might be producing $1M of value. Even factoring in the usual “but benefits and overhead,” that comes out to a solid profit multiple per programmer/engineer.

Despite that, the sense I get (only from this site, since that is my only reference) is that the so-called overpaid engineers are incredibly content to just let this happen to them. As long as they are paid well compared to other workers, it’s fine, no matter the profit factor. In fact, the discourse is very much focused on how “privileged” they were if the tide ever changes, instead of on realizing how much value they provided, collectively.

The outlet for capturing more of the value they create is entrepreneurship (hello, HN). Never any collective organizing. And entrepreneurship is easily bought off via acquisition.

Collective bargaining would have been relevant in case they ever get automated... by the very software they co-created.

One could imagine that this “privileged” collection of programmers could have served as a vanguard for the collective good of programming professionals as well as collective ownership of software goods, using their privilege to that end. The former never happened, and the latter is partly realized in people’s free time (see the OSS maintainer in Nebraska meme).[3]

[1] All from recollection since this is just news from the Frontier to me

[2] Of course the pay might be much higher now; this might have been a while ago

[3] when it isn’t simply exploited by corporations just using OSS without giving any back; a logical turn of events when no license or law forces them to contribute back

a day ago

guzfip

> As long as they are paid well compared to other workers, it’s fine.

Well I’m sure they’ll be thrilled to know they can collect $100 a week more in unemployment benefits than their neighbor.

a day ago

keybored

I wasn’t alluding to them resigning or whatever this comment is referring to.

a day ago

reverius42

I don't think you get to collect any unemployment if you resign.

a day ago

keybored

Really.

4 hours ago

reverius42

Yes, really. At least in the USA, you only get to collect unemployment if you are laid off -- if you are fired for cause, or leave your job of your own volition, you are not eligible.

4 hours ago

keybored

Do you have a point that you want to make?

3 hours ago

9rx

> This is a really concerning perspective: people were paid what they were worth.

The parent comment doesn't discount that, only pointing out that "what they were worth" was inflated due to a speculative environment. Wherein lies your concern?

a day ago

lotsofpulp

That prices change from one point in time to another is a trivial fact.

“Inflated due to a speculative environment” is not an accurate way to frame labor prices that held for many years. At that point, the prices were simply high due to high demand relative to supply (compared to other types of labor).

a day ago

9rx

> At that point, the prices were simply high due to high demand relative to supply

That goes without saying. The investigation here is into demand. Which was said to be overinflated due to speculation. As noted, many of the companies hiring the developers did not have viable businesses.

a day ago

SirensOfTitan

I think calling it inflated is to play to a narrative that labor was overvalued broadly in tech.

Salaries across industries in the US have remained flat since the 1970s. Calling the one sector that can provide access to a middle class lifestyle "inflated" is to play into a narrative capital is eager to tell, even if OP didn't intend that.

a day ago

9rx

> Salaries across industries in the US have remained flat since the 1970s

What do you mean? The real (meaning adjusted for inflation) hourly wage in the US has increased by around 20% since 1970.

What has changed since the 1970s is that wages are no longer coupled to productivity. Perhaps that is what you are thinking of? But that should be an obvious truism for anyone in tech. We create the very things that cause that to be the case!

a day ago

keybored

> We create the very things that cause that to be the case!

What happened in the 1970’s was the NeoLiberal shift and wasn’t caused by software.

a day ago

win311fwg

That NeoLiberal shift did not take place in a vacuum. It was a product of the world around it. It absolutely was caused by tech.

If we — those with the power to build the productivity creators — took a stand and said "we refuse to create tech for the interests of the few" it would have never happened. But, instead, we welcomed it and are responsible for it.

a day ago

keybored

The corollary of “if we took a stand” is that Capital took a stand and collectively undid a lot of the gains of the post-WWII social democratic order.

So no. It wasn’t caused by tech beyond the uninteresting factors like modern society being complex and, of course, that tech developments influence things (pretty much all things).

a day ago

win311fwg

The productivity gains we've seen above the capacity of human productivity would have been impossible without tech. It absolutely was caused by tech.

The beneficiary of those gains was also entirely decided by those who created the tech. We could have given use of that tech to everyone. In some cases we actually did (e.g. open source), but in most cases we gained (at least partial) ownership of the capital, so it was in our best personal economic interest to keep it for ourselves and our close friends.

a day ago

keybored

> The productivity gains we've seen above the capacity of human productivity would have been impossible without tech. It absolutely was caused by tech.

Would have been impossible without and being caused by are different things.

The sense of being “caused by” in a political context are the people who have the power to direct things. Which are not necessarily the people who implement something.

> The beneficiary of those gains was also entirely decided by those who created the tech.

You assert that they were decided by. Based on what?

The vast majority of tech work was done in employment, either for some government or for private entities. The private entities were controlled by Capital. The governments were controlled by democratic forces and Capital.

> We could have given use of that tech to everyone. In some cases we actually did (e.g. open source),

Again I reference the meme of Overworked Nebraska OSS Maintainer.

The impressive OSS work done directly by tech workers has been done in their free time. The bulk of OSS work done by people for a living is probably through corporations, e.g. Intel working on the Linux kernel.[1]

That impressive free time work has gotten the reputation as a treasure trove for the highly motivated and tech literate. In contrast to something that regular people can plug-and-play as an alternative to Big Tech dominance.

> , but in most cases we gained (at least partial) ownership of the capital so it was in our best personal economic interest to keep it for ourselves and our close friends.

Yes, well played. For those that got away with their financial-independence millions. For the rest, well, I guess they never managed to learn the moral lesson of Monopoly.

[1] Or am I wrong here? I could be off-base.

a day ago

win311fwg

> in a political context

While you are right to recognize that there was some attempt to inject political context, it was not there originally, and is not the main discussion taking place. The fact that wages and productivity have become decoupled is not inherently political. It is but simple mathematics. Tech is the cause for the decoupling; it is why we have been able to become continually more productive and at an accelerating rate.

> The vast majority of tech work was done in employment

Yes, but generally even where employment is present tech workers also demand a share in ownership (e.g. stock). Tech doesn't invent itself. At least not yet. The workers have held the cards. Even those who haven't won the lottery are still in a pretty good economic position, relatively speaking.

a day ago

keybored

> While you are right to recognize that there was some attempt to inject political context, it was not there originally, and is not the main discussion taking place.

I don’t care if anyone wants political context to be there or not. Political context is not some subjective choice that the participants in a discussion can choose to be the case or not, like some alternative history exercise.

This political context (i.e. reality) called NeoLiberalism is so well-researched and argued that I can just call it NeoLiberalism and even a forum full of techheads don’t bat an eye. Which is more than can be said for your incoherent nuh-uh where both:

- Technology just determines things by itself

- And (also) the rank and file peons who implement technology could have forced something better on the world (than the pile of shit that we have)

6 hours ago

thereitgoes456

> The cost to serve tokens is absolutely profitable

Can you explain why you know better than the analyst at Cursor cited in this article?

a day ago

iterateoften

OpenRouter is an upper bound on compute cost for the open-source models. So people assume that Opus and Sonnet really aren’t sucking up 10x the resources, because open-source models aren’t 10x worse. Idk if it’s true or not, but Haiku is $5/M tokens and it is much worse than the $2-3/M models imo

a day ago

vbarrielle

Openrouter is a startup, what's the indication it serves token at a profit? It could be serving them at a loss to show growth.

a day ago

infecto

Can you cite your source for an analyst at Cursor? I read the article and looked through the boatload of links but struggled to find what you are referring to. Ty

a day ago

noelsusman

That analyst was talking about subsidizing tokens through the subscription plans, which is a different claim.

a day ago

infecto

Ty for sharing, and agreed. I think some folks in the comments for this post are confusing inference profitability with plan profitability. As far as we can tell, most plans are probably teetering on the line of profitability, and that’s why we have seen some, like Cursor, really tighten how many tokens you get.

a day ago

Eridrus

The article is just helpfully illustrating how artisanal you can make your slop if you really try!

a day ago

nickphx

step change? how? profitable? where did you read that? people want tokens? really? who are these people?

a day ago

elzbardico

Yeah, if we just ignore R&D, fixed costs, depreciation, and the fact that there's a high likelihood investors were expecting a return, and if we trust their numbers, then we may say inference turns a profit.

In accounting, almost anything you want can be true, at least for some time.

a day ago

Aurornis

This article tries to build upon a lot of half-truths or incorrect facts, like this:

> OpenAI is struggling to monetize. They turned to showing ads in ChatGPT,

The ads aren’t going into your paid plans (except maybe a highly discounted tier, depending on the market). The ads are a play to offer a free version. Having an ad-supported free tier isn’t new.

The discussion about being unprofitable also repeats the reductionist view that these companies are losing money and therefore the business model doesn’t work. This happens with every VC cycle where writers don’t understand that funded companies are supposed to lose money while they grow. That’s what the investment money is for.

We have very strong indicators that inference is not a money loser for these companies and is likely very profitable. They should be spending large amounts of money on R&D to get ahead and try new things while they’re serving up tokens.

The “but they’re losing money” argument never seems to be brought out against competitors that literally give away their models for free and for which we can calculate the cost of serving 400B-1T parameter open weight models.

a day ago

Izkata

> The ads aren’t going into your paid plans (except maybe a highly discounted tier, depending on the market). The ads are a play to offer a free version. Having an ad-supported free tier isn’t new.

Sounds like it is new for ChatGPT though. That's also how it started with TV and Youtube, first on the free tier then expanding to the paid ones.

a day ago

krferriter

I've never had ads in my Youtube Premium

2 hours ago

smt88

YouTube, Spotify, and most video streamers have zero ads on paid tiers. I never see video ads.

a day ago

PurpleRamen

Most services now have light premium tiers where they do show ads. And then there are the rats like Amazon, who just add ads to the normal tiers and offer an additional service to remove them.

a day ago

adjejmxbdjdn

YT has a Premium Lite paid tier (at least in the U.S.) that does show ads on music and in certain other areas of the app, such as shorts, searching, browsing, etc.

a day ago

Zardoz84

I don't pay anything to YouTube and I don't see any ads. Because I block ads.

a day ago

smt88

Do you feel good about YouTube spending money on hosting and video producers spending time/money on content that you're paying nothing for? How is that sustainable?

a day ago

joquarky

The categorical imperative has been put on life support since 2016 at the latest.

Everything is smash and dash now. And nobody with the means to change it cares about externalities anymore.

a day ago

smt88

I don't understand any of those sentences. How do they apply to YouTube?

18 hours ago

danaris

Frankly? That's Google's (well, Alphabet's, I guess) problem.

They're a multibillion-dollar international monopoly with absolutely staggering amounts of money and power, actively engaging in a wide variety of activities directly aimed at making the lives of every normal person on the planet worse so that they can have more power, more control, and more money. Me blocking ads on YouTube not only costs them effectively nothing, it's also the act of a flea against a polar bear.

If Alphabet showed any signs of actually wanting to create a sustainable alternative to the surveillance economy, I might have some sympathy for them. But not only do they not do this, they are the ones who created it in the first place.

10 hours ago

smt88

Doesn't your boycott of the ad-free model confirm to them that the only viable business model is ads?

3 hours ago

danaris

I'm not sure where you got the idea that I'm boycotting "the ad-free model".

I'm boycotting them. After all, every cent that goes their way supports surveillance advertising (among other unsavory things).

I have other subscriptions that support ad-free creators.

If they choose to misconstrue my refusal to support them with either money or ad views, that's also their problem. (Also, that's patently never going to happen, because my signal vanishes instantly into the noise.)

an hour ago

carlosjobim

YouTube doesn't show ads on the paid plan. If you're talking about sponsored segments those would be impossible to moderate, and YouTube does offer easy skipping of those.

a day ago

butlike

> The ads aren’t going into your paid plans (except maybe a highly discounted tier, depending on the market). The ads are a play to offer a free version. Having an ad-supported free tier isn’t new.

This statement doesn't discount the original statement: that ads are going into GPT, which Sam called a last resort.

> The discussion about being unprofitable also repeats the reductionist view that these companies are losing money and therefore the business model doesn’t work. This happens with every VC cycle where writers don’t understand that funded companies are supposed to lose money while they grow. That’s what the investment money is for.

Usually propped-up companies don't last in the long term once the VC subsidy runs out. There's a difference between getting VC money in order to buy rocket parts, and getting VC money in order to charge $7 when you would really need to charge $10. The latter problem never goes away.

a day ago

throwaway27448

> We have very strong indicators that inference is not a money loser for these companies and is likely very profitable.

Why is OpenAI specifically losing money hand over fist then?

a day ago

aurareturn

Training. But training costs are a smaller and smaller percentage of revenue as inference revenue grows faster than training costs.

a day ago

ainch

Do you have any evidence that inference revenue is growing faster than training costs? RLVR is significantly less compute-efficient than token-prediction pretraining - especially as labs are trying to train models to achieve agentic tasks which take tens of minutes per rollout.

a day ago

aurareturn

I don't have any evidence. You'll have to believe what Anthropic and OpenAI CEOs say publicly.

However, it seems to make a lot of sense. Anthropic literally added $6b ARR in February 2026 alone. I doubt training costs go up that fast.

a day ago

ainch

It's definitely true that they've increased their revenue rapidly. But at the same time the 'scaling laws' that the labs were first built around require exponentially-scaling cost (10x flops for a fixed reduction in training loss).
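The exponential cost curve can be sketched numerically. Under a power-law fit of the form loss ≈ a·C^(−α), even a small improvement in target loss multiplies the required compute (the constants below are made up for illustration, not taken from any published fit):

```python
# Hypothetical power-law scaling sketch: loss = a * C**(-alpha).
# Inverting gives the compute C needed to hit a target loss.
a, alpha = 10.0, 0.05  # illustrative constants, not real fitted values

def compute_for_loss(target_loss: float) -> float:
    # C = (a / loss) ** (1 / alpha)
    return (a / target_loss) ** (1 / alpha)

c1 = compute_for_loss(2.0)
c2 = compute_for_loss(1.9)  # a modest 5% improvement in loss...
print(round(c2 / c1, 1))    # ...needs ~2.8x the compute under these constants
```

With a smaller α the multiplier gets steeper, which is the intuition behind the "10x flops for a fixed reduction in training loss" framing.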

If anything, a better look at the economics is a reason to look forward to one of them IPO-ing. I suspect the labs probably could cut R&D and turn a profit, but that might only work for one generation, until they get superseded by the competition.

a day ago

aurareturn

There is no doubt that competition is what is driving unprofitability. So when people say AI can't be monetized, I laugh. Right now, foundational AI is unprofitable because of competition, not because they can't make money.

16 hours ago

arctic-true

But this is exactly the problem - we have to take it on faith that inference is profitable because nobody actually knows. It’s hard to even define what that would mean, and while I am suspicious of claims that frontier lab CEOs are just out-and-out liars or bad people, defining and calculating the real cost of inference would be time- and labor-intensive in its own right and there is no strong incentive to do it other than “tech reporters are curious.” Until the IPO, we just won’t know.

a day ago

aurareturn

A lot of people know. A lot of insiders have been saying tokens are profitable. Is there a conspiracy theory for everyone to lie? Including OpenAI, Anthropic CEOs, employees, Cursor management, inference providers of Chinese models?

a day ago

arctic-true

Profitable on what basis? They generate more revenue than the cost of electricity? Does that factor in the cost to service the massive, multi-layer cake of debt that was necessary to even begin to serve inference in the first place - not from a training perspective but from a hardware and facilities perspective?

a day ago

aurareturn

Profitable as in every token they generate, they make some money.

And it's already mentioned that the path to profitability is that inference revenue eclipses training costs. It's already happening rapidly.

a day ago

arctic-true

I’m not talking about training costs. I’m talking about startup costs. You have to pay for GPUs (or to rent data centers). You have to pay for the electricity that runs those data centers, and in a lot of cases these frontier labs are building the data centers on credit, so you need to pay for the construction, the materials, etc. If it was as simple as “running the GPUs costs less than we charge for it,” I might be inclined to agree. But the GPUs don’t just appear by magic.

a day ago

aurareturn

Right now, the demand is far more than supply for GPUs. Every cloud company is saying they're leaving money on the table because they don't have enough compute to serve the demand.

It seems like you're arguing that the bubble is going to collapse soon, like the author? How can it collapse when the demand is so much bigger than supply? Do you think the demand is fake? Or that AI will stop making progress from here on out?

a day ago

arctic-true

The demand is real. The tech is real. The economics are completely unsustainable. Switching costs and barriers to entry are too low, operating costs are too high. And if the tech improves, it actually makes it even easier for competitors to swoop in and take market share. Not long ago, an agent that was 80% as good as SOTA was not usable. A year from now, an agent that is 80% as good as SOTA will be better than the best agent is today. We have it on good authority that today’s agents are very good, very useful. Why bother paying full price?

This is deeply ironic in a way. Because the whole premise of AI labor replacement is that AI does not need to be better than human labor, it just needs to be cheaper with acceptable performance. But the same is true one step down: discount AI doesn’t need to be better than bleeding-edge AI, it just needs to be cheaper with acceptable performance.

a day ago

gruez

>The “but they’re losing money” argument never seems to be brought out against competitors that literally give away their models for free and for which we can calculate the cost of serving 400B-1T parameter open weight models.

To be fair, people aren't exactly bullish on the prospects of DeepSeek or z.ai either; it's just that they're below the radar, so they don't get mentioned.

a day ago

Kye

Z.ai is at least owned by a public company, so there might be something in the financials.

https://en.wikipedia.org/wiki/Z.ai

>> "On 8 January 2026, Z.ai held its initial public offering on the Hong Kong Stock Exchange to become a listed company.[24][25][26] It is considered to be China's first major LLM company that went through an IPO.[26] In February 2026, JPMorgan Chase recommended to investors of purchasing stocks of the company alongside MiniMax.[27]"

https://www.zhipuai.cn/investor_relations/

But I haven't looked into it.

a day ago

14113

> companies are supposed to lose money while they grow

At what point do we declare that a company has "grown" and now must make money? OpenAI is a multi-billion dollar company right now, surely that's a point at which they should be profitable, instead of propped up by further investment and borrowing.

> We have very strong indicators that inference is not a money loser for these companies

All of the economic analysis that I've read strongly states the opposite. Running a GPU is a net loss /even for the data centre operators/. To break even, they have to charge OpenAI/Anthropic/etc. more per token than those companies make per token.

a day ago

butlike

They clearly have some vested interest/skin in the game. Not sure it's worth retorting that one.

a day ago

Mentlo

We have strong indicators that inference is profitable on non-economically-valuable prompts. We don't have strong indicators that inference is profitable on economically valuable prompts.

As AI companies start extracting rent from the prompting, one of two things are going to collapse - either the long tail revenue base of low-value inference is going to collapse, because people won't be using Chat GPT to get a recipe if it costs them money or if it is ad-ridden; or the cost of economically-valuable inference is going to go up - and whether it goes up to economically stable positions is a toss-up.

And I say this as an AI enthusiast with <50% probability of a bubble burst in the short term.

7 hours ago

monegator

> Having an ad-supported free tier isn’t new

having ads shoved in paid tiers isn't new either

a day ago

raincole

And it usually doesn't result in a market crash.

a day ago

project2501a

> This article tries to build upon a lot of half-truths or incorrect facts, like this:

yeah i was wondering why my bullshit detector was going off. This feels as if someone who cooks for Ramsey's kitchen is trying to predict the end of the market hike.

a day ago

mcv

I've heard "They're losing money" since the 1990s. About Amazon and nearly every other tech company.

The strategy is always:

* Build something useful

* Give it away for free to get people excited

* Convince investors that this is going to rule the world

* Grow to dominate the world

* Enshittify

a day ago

arctic-true

The difference is switching costs and the viability of alternatives. Even open source models are only a few months behind the frontier labs, which is a long time in tech but practically no time at all in the eyes of a business consumer. At best, one of the frontier labs will survive and get to flex its hegemonic muscle. But billions and billions of dollars worth of investments still get wiped out in that scenario, which I would still qualify as the bubble popping. This would be doubly true if the winner winds up being Google or Microsoft.

a day ago

danaris

I don't know about others, but with Amazon specifically, it's always been very clear that their "losing money" in aggregate was purely on paper, for tax purposes: their ability to undercut everyone else was initially based on being online without the brick-and-mortar costs that other stores had, then on economies of scale, and now on being the 900kg juggernaut that just has more money than God and can blow it on running you out of business if they feel like it.

10 hours ago

logravia

The thing I am struggling with is: where is the impact of LLM tools, especially given the massive increase in token consumption from 2025 to now and the saturated presence of LLMs everywhere?

Naively speaking, I have so many expectations for the impact of this tech.

I'd expect a noticeable uptick in applications published on Google, Apple and Microsoft app stores. I'd also expect an uptick of games published to Steam. I'd expect an uptick in Github repos and libraries on PyPi.

I'd also expect some impact on the GDP ⸻ a non-negligible part of running a business is communication, planning, ads. Naively, I'd expect that LLMs should be able to both speed some of these things up and lubricate others.

I'd also expect that large corpos like Microsoft and Apple would have more resources to spare on the essential details of their OS like having a functioning taskbar or a predictable, consistent GUI.

I'd expect increased SAT scores or improved PISA results. Maybe even improved mental health, let's go wild.

It strikes me as a reasonably useful tool, personally.

Yet, where are the goods in the aggregate?

a day ago

atomicnumber3

Programming is a necessary but not sufficient condition for software products to exist. So while the programming has to be good, so too do many other things, like product vision, product management, project management, and of course there still needs to be feedback between all of the above so that engineering isn't implementing a misunderstood version of the product and that product isn't asking for 5 years and a PhD research team. And on and on and on. Typing the code is like 2-10% of actually ending up with a software project and it's more toward the 2% for a software business.

So while AI made coding maybe 110% faster, it has also made literally every other person in the process lose their gd minds and they're wanting to break or skip everything else in the process to just shit out code faster.

a day ago

atomicnumber3

I meant 10% faster btw, typo

7 hours ago

d2ssa

Going faster only works WHEN you know EXACTLY (or close to it) what you want.

Going faster when experimenting? Nah you actually need a mix of slow and fast, and mostly slow stuff up-front.

There's a fundamental misunderstanding of how people actually do stuff imo - it's akin to force-fitting a square peg into a round hole. I'm sure many are hoping it's just a 'your organisation is designed wrong' problem. I doubt it though.

a day ago

therealdrag0

The tech is still young and projects take time. And there are many slow parts of building that have not been accelerated (mythical man month).

I have started making an indie game, as one does, and it’s easily going 2-4x speed, but even still I’d predict a year of free time development with focus to ‘finish’ this thing. But the latest agentic tech is 3 months old.

a day ago

tasuki

> ⸻

Wow, I'm impressed at your usage of this. Apparently it's U+2E3B, named "three-em dash".

You must be human!

a day ago

logravia

Oh yeah, a month ago I was reading a comment section about LLM writing tendencies and someone humorously suggested using the loooooooong-em-dash to distinguish yourself from LLMs. I found it so charming that I made my keyboard output it when I double tap "-".

On Linux you press Ctrl+Shift+U and then type 2E3B, then press Enter.

a day ago

nopinsight

> nobody is sure if even their metered pricing is profitable

This is most likely wrong. Lab executives insist that serving tokens is profitable. It's the cost of training next-gen models that requires them to keep raising ever larger rounds. More importantly, many independent providers price tokens of open-weight models at a fraction of Anthropic's prices.

a day ago

atwrk

But are they actually profitable, or do they employ creative accounting where only parts of overhead expenses are counted against all of inference revenue, similar to what Uber did?

OpenAI's numbers show that they definitely are not profitable on inference, and even worse, revenue growth scaled linearly with inference cost from 2024 to 2025, which means they can't outgrow this problem. See https://www.wheresyoured.at/oai_docs/

a day ago

therealdrag0

Does it matter if it’s creative accounting? Uber is a great example of a company that everyone was certain would fail because it was unprofitable and now it succeeded and is profitable.

a day ago

armonster

Uber didn't have ever-increasing costs though.

a day ago

baq

If they shut down all training today they’d be absolutely printing money for the next couple quarters and then die with a bang once the other lab releases the next frontier to the public.

a day ago

shimman

How? They're already burning $2 to make $1, and court documents show that Anthropic has already been lying about revenue (claimed to have made $19 billion when it's actually $5 billion to date [1]).

Not hard to believe they're lying about other things when they've been lying about the capability of their products since inception.

[1] https://www.reuters.com/commentary/breakingviews/anthropic-g...

a day ago

thereitgoes456

That is not what the article says, it says $19B ARR.

I don’t necessarily see a contradiction. $19B run rate, achieved very recently, is actually consistent with $5B lifetime earnings, because their growth curve is so sharp. Zitron is not good at math.
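The arithmetic here checks out under a steep growth assumption. A toy model (all figures hypothetical, not Anthropic's actual monthly numbers) where monthly revenue grows geometrically shows how a $19B annualized run rate can coexist with only ~$5B of lifetime revenue:

```python
# Toy model: walk backwards from the current month, assuming revenue was
# smaller by a constant growth factor each prior month. Numbers are
# illustrative only.

def lifetime_revenue(current_monthly: float, monthly_growth: float, months: int) -> float:
    total, m = 0.0, current_monthly
    for _ in range(months):
        total += m
        m /= monthly_growth  # the previous month was smaller
    return total

current_monthly = 19.0 / 12  # $19B ARR => ~$1.58B in the latest month
print(round(lifetime_revenue(current_monthly, 1.46, 36), 1))  # ~5.0 ($B)
```

At roughly 46% month-over-month growth, almost all revenue is recent, so the run rate dwarfs the cumulative total.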

a day ago

shimman

Didn't link to Zitron's site, but if you can't see how dishonest it is to say you have $19b ARR when the reality is you have only a total of $5b, IDK what to tell you. Says more about how you think, and why you think it's okay for corporations to be misleading.

a day ago

s1artibartfast

Seems natural to me too. ARR is understood as the current rate. It would be more misleading to say 5b ARR.

Its like asking how fast a car is moving.

a day ago

MattRix

This is not lying, that is just what run rate revenue means! It makes sense to use as a metric when a company’s user base is growing as fast as Anthropic’s is.

a day ago

shimman

It makes sense to be extremely misleading about actual accounting figures? In what world is it okay to say you have $19b in ARR when you have only ever generated $5b for the entire duration of your company's existence?

Did Enron start a business school I'm unaware of or something?

a day ago

dragonwriter

> In what world is it okay to say you have $19b in ARR when you have only ever generated $5b for the entire duration of your company's existence?

In the same world that it makes sense to say that your current speed is 57mph when you've only driven 15 miles since starting the trip.

a day ago

baq

sir if you say a number is $19B and everyone who is invested knows what it means, is there a problem?

a day ago

B56b

So just ignoring the link entirely, cool cool cool

a day ago

mattmanser

Try doing some inference with local models.

I'd be surprised if they're making money on inference just from that. There's no way someone paying $20 p/m and using it all day isn't costing way more than that in electricity alone, let alone the capex.
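The electricity side of this is easy to estimate with a back-of-envelope. All figures below are guesses for a single local GPU serving one user, not measurements; batching on datacenter hardware would change the picture substantially:

```python
# Hypothetical single-GPU, single-user electricity cost per million tokens.
power_kw = 0.7         # assumed GPU board power under load
tokens_per_sec = 50    # assumed generation throughput
usd_per_kwh = 0.15     # assumed electricity price

seconds_per_million = 1_000_000 / tokens_per_sec
kwh_per_million = power_kw * seconds_per_million / 3600
print(round(kwh_per_million * usd_per_kwh, 2))  # ~0.58 USD per million tokens
```

Whether a heavy $20/month user exceeds the plan price in aggregate depends entirely on how many tokens they burn and how well the provider batches requests.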

a day ago

beepbooptheory

I don't really get the last bit. It's hard to imagine what a new fangled "frontier model" could do that would blow anyone out of the water. Like what does this look like? Really good benchmarks? Who cares about that anymore?

a day ago

layer8

Not hallucinating anymore would be a good start.

a day ago

martinald

Yes I wrote a detailed article about this Forbes claim. https://martinalderson.com/posts/no-it-doesnt-cost-anthropic...

Key point: if you compare it to OpenRouter costs for ~similar-sized models, it's ~90% gross margin.

And this claim came from Cursor - not Anthropic!
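The gross-margin comparison works like this: treat the marketplace price of a similarly sized open-weight model as an upper bound on serving cost (assuming the marketplace provider isn't selling below cost), then compare it to the frontier lab's list price. The prices below are illustrative placeholders, not current quotes:

```python
# Hypothetical margin estimate from marketplace prices.
openrouter_price = 2.0   # $/M output tokens for a similar-size open model
                         # (assumed to be at or above the provider's cost)
frontier_price = 15.0    # $/M output tokens charged by a frontier lab

gross_margin = (frontier_price - openrouter_price) / frontier_price
print(f"{gross_margin:.0%}")  # 87% with these placeholder prices
```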

a day ago

mrbungie

> Lab executives insist that serving tokens is profitable.

Maybe marginally profitable, but right now they need to give out subsidies for people to use their products (Antigravity, Codex, Claude Code et al) in an actually useful manner that prevents churn and at the scale they need to justify usage growth forecasts, which they need to keep the wheel turning.

Probably if you look at the users who exclusively use the simple chat box interfaces (i.e. ChatGPT, Gemini in UI, Claude in UI) plans it is actually profitable, but I'd also say that's not where most of the usage comes from.

I'd love to actually look at both usage + profitability from each user segment to see if their PxQ growth expectations from non-enterprise usage make any sense.

> Many independent providers price tokens of open-weight models at a fraction of Anthropic's prices.

Are those open-weight models as good as Anthropic? Are they the same parameter class?

a day ago

zozbot234

> Are those open-weight models as good as Anthropic? Are they the same parameter class?

Are they as good as Anthropic was one year ago? That's more like it. They don't have to be just as good, they just need to be the most worthwhile for the price. If frontier models are only providing a negligible advantage for what they charge, that absolutely matters.

a day ago

est31

It's a loss leader but this is normal. Same has happened with Uber, Airbnb, Amazon, etc. Using VC money to buy marketshare and once you have it, you can milk it.

The question is more around the moats that these companies have and it seems to me while their models are amazing technology, they don't really have a moat. The open/chinese models still continuously catch up to the american ones.

a day ago

hirako2000

And what possible moat? It isn't hard to foresee that in just a couple of years, models outpacing today's frontier tech will run on consumer hardware. With open-source workflows anyone can pull down and run, providers won't see a penny.

Another scenario is that dense models get replaced entirely, in which case the likelihood of OpenAI and co pioneering the concept is pretty slim. They will be left with billions' worth of infrastructure that cost them 10 times that 2 years earlier, faced with the reality touched on by the article: liquidate.

a day ago

sunaurus

The point is that you can’t just serve tokens without also training the next models. It’s an inseparable part of your costs, so naturally you can’t be profitable unless the price you are charging ALSO covers training.

a day ago

dash2

Is that right? I think that you can serve tokens without training the next models. It would be bad strategy, but it would work. So it's an important question, are they covering their operating expenditure? If they are the business has legs (and it will be worth spending a lot to train the next models). If not, maybe not.

a day ago

camdenreslink

If a major model provider were to just halt progress on developing new and improved models, the open weight alternatives would catch up in a couple years.

They would have a period of great margin, followed by possibly zero margin as enterprises move to free options.

They would have to come up with a lot of great products around the inferior models to justify charging at that point.

a day ago

leoc

Also, an out-of-date model which doesn't know about last year's world events, hit songs and new JS libraries is a depreciating asset even before you consider low-cost competitors catching up. So you'd presumably have to do some training just to keep the model up to date at the current quality level (unless you completely give up and just sweat the assets). And on the other side of that coin: over the next few years, do the latest, biggest models continue to generate user-perceived real-world improvements sufficient to keep users wanting the latest and greatest?

a day ago

dash2

> If a major model provider were to just halt progress on developing new and improved models, the open weight alternatives would catch up in a couple years.

That's why it would be bad strategy.

17 hours ago

yorwba

There are companies that already do nothing but serve tokens using models trained by others. Just running infrastructure and collecting a reasonable fee for their troubles. It's only a bad strategy if you want to claim to investors that you'll gain monopoly market share if only they could give you a few more billion dollars.

a day ago

chasd00

I don't think it will work; it's too easy to switch models. When Google comes out with a new model people will just switch. I think Google wins in the long run: they have the money to just wait until everyone else goes bankrupt, and they also have the Apple contract and therefore the mobile market.

a day ago

leoc

And apparently the most efficient training and inference thanks to their TPUs, IIUC?

a day ago

shafyy

Not counting training models as part of your gross margin is just creative accounting. It's an inherent part of being able to provide the service for OpenAI, Anthropic, etc.

Even so, their subscriptions are significantly cheaper than the token pricing via API. So at some point they will need to get rid of subscriptions or increase the subscription prices dramatically... And that's assuming their current token pricing is actually profitable. Which it probably isn't.

Lastly, I would not trust one word that comes out of an executive of an AI company (or any other large company, for that matter).

a day ago

phantom784

Do tokens just cover ongoing operating costs, or are they also able to pay back the cost of training that model originally?

a day ago

pier25

So these companies will be profitable if training stops? Is that even a real possibility?

a day ago

danaris

Any given company could stop training tomorrow, and, as some others have said here, they'd be generating quite a bit of profit until their models visibly fell behind, however long that ended up taking, at which point they'd probably just fall over completely.

Over the whole industry? No; they can never, ever stop training, or they'll cease to be useful at all very soon.

Training is what keeps the models up-to-date on current events, which includes new programming languages, frameworks, and techniques. It's already been observed that using LLM assistance on some types of programming is much more effective than others, based on how well-represented they are in the training data: if everyone stopped training tomorrow, and next month a new programming language came out, none of them would ever be able to help you program in that new language.

This can be extended to other aspects of programming, too. If training stopped, coding assistants would gradually start giving you wrong answers on how to implement code for APIs, frameworks, and languages that continued to evolve, as they will always do, in much subtler (and likely harder-to-debug) ways than how they'd deal with a new language whose existence they don't even know about.

10 hours ago

naravara

The impetus to continue training at the pace they are is driven by the competition. So if the money starts drying up, then they’ll naturally slow down because they’ll have to figure out how to do more with less.

I suspect that once the models hit a point of “good enough” for certain use cases companies will start putting R&D focus in other areas that may be less expensive. Like figuring out how to run more efficiently, UI/UX conventions that help users get what they’re trying to accomplish in fewer steps, various kinds of caching of requests, etc. So the cost to serve tokens over time should only come down, and will probably start coming down more rapidly as the returns to model training slow down.

That’ll probably be a while though, because each successive model tends to be a lot better than the last.
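The "caching of requests" idea mentioned above can be sketched in a few lines. This is a toy exact-match cache with invented names; real providers cache at the prompt-prefix/KV level, so this only illustrates the basic economics of not paying for the same request twice:

```python
import hashlib

class PromptCache:
    """Toy exact-match cache for model responses (illustrative only)."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    def _key(self, model, prompt):
        # Hash (model, prompt) so identical requests map to one entry.
        return hashlib.sha256(f"{model}\x00{prompt}".encode()).hexdigest()

    def get_or_compute(self, model, prompt, compute):
        key = self._key(model, prompt)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        result = compute(prompt)  # stand-in for a paid API call
        self._store[key] = result
        return result

cache = PromptCache()
calls = []
fake_model = lambda p: (calls.append(p) or f"answer:{p}")

cache.get_or_compute("m1", "What is 2+2?", fake_model)
cache.get_or_compute("m1", "What is 2+2?", fake_model)
# The second request is served from the cache; only one "paid" call is made.
```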

a day ago

WarmWash

What's interesting to note is that the "intelligence" labs can squeeze out of an H100, an almost 4 year old GPU, is dramatically higher than what they got out of it in 2022.

It hints that once these labs get a good enough "everyday model", they can work on efficiency so they can serve these models on old hardware. Which is almost certainly already happening.

a day ago

pier25

> So if the money starts drying up, then they’ll naturally slow down because they’ll have to figure out how to do more with less.

Meanwhile companies like Google will keep investing on training...

Anthropic's CEO has suggested all AI companies should slow down training but obviously this is only beneficial for companies that can't afford to keep training.

a day ago

hbn

> UI/UX conventions that help users get what they’re trying to accomplish in fewer steps

If we can expect the past 15 years of software UI/UX history to continue, it's more likely they'll spend the money on making the UI/UX more confusing, removing features, and making basic tasks take more steps than they do today.

a day ago

naravara

That’s because the past 15 years were dictated by Web 2.0 companies that make their money off keeping you glued to the screen.

An AI assistant would work more like Planet Fitness, where the goal is to figure out how to convince you to keep paying them while using the facilities as little as possible.

A big part of that might just be steering you towards repos of existing solutions to the problem you're trying to solve rather than helping you vibe code a solution yourself. Over time they'll be able to accrue a whole pile of canned functions that are all automatically documented and audited, and they'll be able to plug and play those rather than having to rewrite.

The security implications of this give me a headache to contemplate to be honest.

7 hours ago

techpression

I wouldn't trust those claims from any private companies, even public ones play the most insane tricks in earnings calls to inflate numbers or heck, just make up new ones.

I'm not saying they're wrong, but I don't put much stock in their words.

a day ago

gedy

Buying and driving a new car off the lot costs the manufacturer nothing at that moment, but what happens before that is important to account for.

a day ago

schnitzelstoat

It's a winner-takes-all market and everyone wants to be the next Google and not the next Lycos or AskJeeves etc.

It'd be interesting to see what they spend all the money on though as we seem to be hitting diminishing returns and I'm not sure if the typical enterprise user really cares about small improvements on benchmarks.

It seems like it'd probably be better to spend all that on marketing, free trials, exclusivity/bundle deals etc. ChatGPT already has a strong advantage there as it has so much brand recognition. I've seen lay people refer to all LLMs as ChatGPT, like my grandparents did with Nintendo and all video game consoles.

a day ago

joefourier

It’s absolutely not winner take all. LLMs have become a commodity and the cost of switching models is essentially nil.

Even if ChatGPT has brand recognition amongst lay people, your grandparents aren't the ones shelling out $200/mo for a Claude Code subscription and paying for extra Opus tokens on top of that. Anthropic's revenue is now neck and neck with OpenAI's, but if tomorrow they increased the price of Opus by 5x without increasing its capabilities, many would switch to Gemini, GPT 5.4, Cursor, or any cheap Chinese model. In fact I know many engineers that have multiple subscriptions active and switch when they hit the rate limits of one, precisely because the tools are so interchangeable.
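The switch-on-rate-limit pattern described here is simple enough to sketch. The provider names and the `RateLimited` error below are hypothetical stand-ins, not real APIs:

```python
# Hedged sketch: fall through an ordered list of interchangeable providers
# when one is rate-limited. Stub functions stand in for real API clients.

class RateLimited(Exception):
    pass

def complete_with_failover(prompt, providers):
    """Try each (name, call) pair in order; skip providers that rate-limit."""
    errors = []
    for name, call in providers:
        try:
            return name, call(prompt)
        except RateLimited as exc:
            errors.append((name, str(exc)))
    raise RuntimeError(f"all providers rate-limited: {errors}")

def provider_a(prompt):
    raise RateLimited("out of quota this month")

def provider_b(prompt):
    return f"response to {prompt!r}"

name, text = complete_with_failover("hello", [("a", provider_a), ("b", provider_b)])
# The request transparently lands on the second provider.
```

The point of the sketch is that nothing in the calling code cares which provider answered, which is exactly why per-token billing creates so little lock-in.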

At some point it could even become cheaper to just buy 8x H100s and host Qwen/Deepseek/Kimi/etc yourself if you're one of those companies paying $3k/mo per engineer in tokens.
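A rough break-even sketch for the self-hosting point: the $3k/mo figure comes from the comment, while the GPU price and operating overhead are assumptions for illustration only:

```python
# All numbers are assumptions, not figures from the thread, except the
# $3k/month-per-engineer token spend cited above.

H100_UNIT_COST = 30_000            # assumed purchase price per GPU, USD
GPUS = 8
TOKEN_SPEND_PER_ENGINEER = 3_000   # USD/month, as cited in the comment

def breakeven_months(engineers, monthly_overhead=2_000):
    """Months until buying 8 GPUs beats paying for tokens.

    monthly_overhead is an assumed figure for power, hosting and ops.
    Returns None if the team is too small for self-hosting to ever pay off.
    """
    capex = H100_UNIT_COST * GPUS
    monthly_savings = engineers * TOKEN_SPEND_PER_ENGINEER - monthly_overhead
    if monthly_savings <= 0:
        return None
    return capex / monthly_savings

months = breakeven_months(10)
# With these assumed numbers, a 10-engineer team: 240_000 / 28_000 ≈ 8.6 months
```

Under these (generous) assumptions the hardware pays for itself within a year, which is why the comment's scenario isn't far-fetched for larger teams.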

a day ago

mattmanser

I have non-tech friends telling me about preferring other models like Gemini. This feels like the early days of search engines, when people were willing to switch to find better results.

a day ago

youniverse

Yep, I have non-tech friends and even younger students talking about how Claude is better at certain tasks or types of homework problems lol.

If it's used as a tool, not just search, then people will definitely talk about the other stuff. Students who rely on free tiers will also definitely just have everything bookmarked.

a day ago

baq

> It's a winner-takes-all market and everyone wants to be the next Google

absolutely isn't! if billed per token, there is no reason to be married to a single model family provider at all. the models have very different strengths and weaknesses, you should be taking advantage of this at all times.

a day ago

wavemode

people used to say this about search engines and web browsers, as well

regardless, eventually Google became the universal default for both. When it comes to software, the average person doesn't shop around for the technologically optimal choice, they just use what everyone else is using.

a day ago

upcoming-sesame

Google search is free to use. If they spike the models' prices up, people will look for alternatives.

a day ago

wavemode

AI (that is, plain chat) is always going to be free to use as well. Google and Microsoft are going to keep it that way. And make the money back via ads.

That's why ChatGPT still has a free option. If they didn't, they would lose a billion users overnight to Gemini.

a day ago

baq

my point is today there is no clear winner. opus, gpt 5.4 and gemini have different strengths. google search was running circles around competition in basically all use cases.

a day ago

H8crilA

Where to go next? I don't think anyone has gotten close to automating everyday PC usage, likely via screen capture and raw keyboard+mouse inputs. Imagine how much bigger would that market be than vibecoding.

a day ago

wavemode

tbh I don't think this use case is going to be as big as people seem to think

there are a lot of reasons, but in brief - I think AI desktop use is a product that the average person isn't going to get much value out of. to make an analogy - the creators of Segway thought people would buy them in large numbers, but it turned out most people don't mind walking manually (or at least, don't mind it enough to spend money on a scooter). I think makers of AI Desktop Use products are going to find out the same thing as it relates to everyday tasks like checking email and shopping.

a day ago

H8crilA

I was thinking more remotely managing the computer in a warehouse, replacing the mouse of an architect, or some physical object engineer. That your grandma can finally find Discord by speaking to such a bot is just a nice side effect.

a day ago

wavemode

well yeah, I wasn't even talking about professional use, since I think in professional use cases it will turn out to make a lot more sense to set up APIs that AIs use than to set up screen scraping and mouse+keyboard use.

in fact even in rare cases where it's not possible to get an API or CLI to interface with some piece of software, I think people will find that their best bet is to first create a deterministic screen-scraping program for that specific software, then have that program serve an API for the AI to use. it would be so much cheaper to run (inference-wise) and so much more reliable, than having the AI itself perform the image interpretation and clicking.
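A minimal sketch of that deterministic-wrapper idea; the screen layout and field names below are invented for illustration, not taken from any real system:

```python
import re

# Hedged sketch: a deterministic parser for one known legacy screen,
# exposed as an ordinary function an AI agent can call as a tool, instead
# of having a model interpret pixels on every request.

INVOICE_SCREEN = """\
ACME LEGACY TERMINAL v2.1
Invoice: INV-00042
Customer: Jane Doe
Total: 1,234.56 USD
"""

def parse_invoice_screen(screen_text):
    """Extract fields from a fixed-format screen; fail loudly on drift."""
    fields = {}
    for label, pattern in [
        ("invoice_id", r"Invoice:\s*(\S+)"),
        ("customer", r"Customer:\s*(.+)"),
        ("total_usd", r"Total:\s*([\d,]+\.\d{2}) USD"),
    ]:
        m = re.search(pattern, screen_text)
        if not m:
            raise ValueError(f"screen layout changed: missing {label}")
        fields[label] = m.group(1).strip()
    fields["total_usd"] = float(fields["total_usd"].replace(",", ""))
    return fields

record = parse_invoice_screen(INVOICE_SCREEN)
# An agent then calls this like any other API endpoint and never sees pixels.
```

Every call costs a regex match instead of an inference pass, and a layout change raises an explicit error rather than producing a subtly wrong answer, which is the reliability argument above.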

I see AI desktop use as mainly a consumer product for that reason, since that's the situation where you have to react "on the fly" to whatever the user asks you to do and whatever program happens to be on their computer (versus professional cases which are more large-scale and repetitive, and where you can have a software developer on hand).

a day ago

zozbot234

Automating GUI use is a silly idea when the AI can do many of the same things by getting access to a *nix command line - which is how all coding models work. It matters when driving proprietary apps or browsing websites that don't provide a clean machine-readable API, not really otherwise.

a day ago

delecti

I don't think it's winner-takes-all. Google is Google in 2026 because Lycos and AskJeeves were bad in comparison. The average user doesn't care whose LLM they're using because they're all close enough. It's hard to see past the bubble bursting, but I expect most people will use multiple of them depending on context (Copilot via the integration in windows, Gemini via Siri on their phone, etc), likely without paying.

a day ago

piker

> They lose a big customer for their cloud services. Even worse considering that now, using the AI they helped fund, everyone can compete with their sub-par products. GitHub is a good candidate for disruption, and that’d be just the start.

Look, I'm a Microsoft hater like the rest of us, but calling Microsoft's products sub-par discredits the author a good bit. I invite anyone who thinks this to try and compete with them. Go after something like Word, for example. Then prepare to be awed by what some of the most brilliant programming minds ever can produce after grinding for four decades.

a day ago

hbn

If I saw a helicopter crashed into a tree, I don't have to be a helicopter pilot to know it's not an ideal state of a helicopter and something/some people failed.

When I'm using MS Word and it takes 20 seconds to cold launch on a machine that's orders of magnitude faster than any computer from 25 years ago, where it launched near instantly, I can tell something is going wrong. When all of their software is harassing me to use AI in ways I don't want to use it, I can tell something is going wrong.

a day ago

s1artibartfast

your comment sums up the conflict.

I don't know if you noticed, but there was a shifting of the goalposts from "sub-par" to something wrong/sub-optimal.

The best helicopter you can buy may in fact crash into trees sometimes.

a day ago

hbn

Microsoft's products do not occasionally fail, they're constantly going out of their way to block users from doing basic tasks through ads and dark patterns. It makes some KPI go up so some asshat product manager can get a promotion, and they never lose users because 99% of their users are hostages.

a day ago

s1artibartfast

I'm not saying they are great. I am pointing out the difference between absolute and relative performance.

You can have a shitty product as long as you are better than the next guy.

My Fortune 50 company is migrating to Microsoft because they don't like their current tools.

20 hours ago

Aperocky

Sub-par is not the right word; the right word is feature creep.

Markdown has much less of that brilliance, and thankfully I needed none of it.

The last time I authored a Word document was probably 2 years ago, for a government interaction.

a day ago

karolist

You can have an opinion about a tool as a user, without ever having ability to create such a tool yourself, that's literally what every tech and auto reviewer does.

a day ago

piker

Sure, and the less you understand about the tool’s fundamental capabilities, the less useful your opinion is. The best reviewers have deep knowledge.

a day ago

hbn

You can use this logic to say all products are perfect and any criticisms of them by users are moot because their creator knows them best.

a day ago

red_admiral

Microsoft's AI, on the other hand, is underwhelming at the moment and might well go the way of Windows Phone. Plus enough people hate the copilot icons everywhere that Microsoft is hinting at dialing down a bit.

MS Office should last a while if they stop calling it "Copilot 365 Office" or whatever it was.

a day ago

tapoxi

The state of GitHub and Windows 11 certainly qualify as sub-par.

a day ago

sooperserieous

I think Github represents 'par'. Plenty of stuff worse and plenty of stuff better. Overall it's what most people expect a coding social media site to be because it set those expectations. Those of us who are only looking for code management (including issues/PRs/etc) are easily satisfied elsewhere.

a day ago

camdenreslink

It has had some really bad reliability issues recently. In terms of uptime, it has been a poor product.

5 hours ago

piker

There are some frustrating parts, but subpar is an odd way to describe GitHub to me. I’m pretty happy with what they’re doing, and find the UX super helpful. I do agree Actions needs a debug mode but otherwise I get a ton of value out of the service for $20/month?

a day ago

tapoxi

Specifically their failure to meet reasonable uptime requirements. I can't run a 99.9 service on a 99.0 platform.

a day ago

curtisblaine

I'm sure Word is full of arcane backwards-compatible tricks that 20% of users use, but I find it hard to differentiate the Pareto 80% of the product from Google Docs or any other competitor (LibreOffice?). Adding rich text, tables, headings and colors is pretty much a solved problem for all of these programs. Adding images or handling more complex layouts sucks everywhere; it's not like Word has a great user experience and the others don't. All of them are bad. IMHO, if any of the competitors were the de-facto standard for word processing, the vast majority of users wouldn't feel the difference. Power users would for sure, but I'm not sure there are many of them or that they use essential features. If Word didn't have a near monopoly in office settings due to aggressive marketing, OS presence and a proprietary file format that constantly changes and never renders well outside of Microsoft products, it could disappear without anyone (save Microsoft) losing much.

a day ago

piker

Yes. That 80% you find useful is served fine by Google Docs, but there’s a good reason the enterprise overwhelmingly goes for Word, and it lives deep in that 20% and a lot of the time has zero overlap with others.

a day ago

qoez

History doesn't have to repeat. There's barely anything else going on in terms of innovation, and AI is a real step function technology. We might be overspending but there's no way we're getting another AI winter like last time (remember last time investment in 90s AI had to compete for resources with the internet boom).

a day ago

hk__2

Isn’t that covered at the top of the post?

> AI is here to stay. If used right, chances are it will make us all more productive. That, on the other hand, does not mean it will be a good investment.

a day ago

joefourier

The dotcom bubble burst and 26 years later we’re all hopelessly addicted to the internet and the top companies on the stock market are almost all what would have been called “dotcoms” then.

The railroad bubble burst in 1846 not because trains were a dead end - passenger numbers would increase more than 10x in the UK in the following 50 years.

a day ago

lionkor

> History doesn't have to repeat

This is high up there on the list of things people say before, you know, it does

a day ago

tracker1

Datacenters themselves are really weird... most of the data centers announced in 2024 are nowhere near completion, and most of Nvidia's production is taking longer to deploy than to produce; sometime in the next year, deployments will be upwards of 2+ years behind.

That doesn't even begin to cover the lack of actual electricity to power the data centers. We have more "dark silicon" sitting in boxes that isn't close to being deployed, while a lot of actual people can't manage to buy consumer products at anything resembling reasonable prices... it's kind of insane to say the least.

a day ago

hyperpape

> Magnificent 7 companies are increasing capex to their biggest ever to differentiate their tech from each other and the big AI labs, but the key realization is that they don’t have to spend it to win. It’s a defensive move for them, if they commit $50B, OpenAI and Anthropic need to go raise $100B each to stay competitive, which makes them reliant on investors’ money.

Stay competitive how? If the Magnificent 7 aren't spending the money, then how could it possibly hurt OpenAI/Anthropic to not raise equal amounts of money? Maybe you can pull together an explanation, but this author didn't even try to do so.

This piece seems poorly thought-out, but well designed to get shared.

Promote writers who will actually explain their claims carefully.

a day ago

martinvol

they have to fight to stay competitive because the Mag7 can outspend them, but my hypothesis is that they won't need to ultimately.

a day ago

agentultra

It sounds like most of the data centers promised in 2025 and 2026 are not even built yet and most of the GPUs bought haven't even been installed.

If it does all go down in flames, even the floor value is not going to be that high.

I can't predict the future but it's smelling a lot like a recession already under way that is bigger than the sub-prime crash.

a day ago

Chance-Device

From the beginning of this I’ve wondered the same question: how do these companies justify spending such massive amounts now (and 3 or 4 years ago) when software and hardware efficiencies will bring down the cost dramatically fairly soon?

They basically decided that scaling at any cost was the way to go. This only works as a strategy if efficiency can't work, not if you simply haven't tried. Otherwise, after a few breakthroughs and order-of-magnitude improvements, people will be running equivalent models on their desktops, then their laptops, then their phones.

Arguably the costs involved means that our existing hardware and software is simply non viable for what they were and are trying to do, and a few iterations later the money will simply have been wasted. If you consider funnelling everything to nvidia shareholders wasting it, which I do.

a day ago

Aperocky

The decision is the right one. Scaling at any cost is the right way to go.

You cannot find the efficiency if you haven't been experimenting at scale, this is true personally as well.

If someone hasn't been burning a few billion tokens per month, everything coming out of their mouth about AI is largely theory. It could be right or wrong, but they don't have the practice to validate what they're talking about.

Not everyone scaling to that degree would have the right answer or outcome, many would be wrong and go bust. But everyone who didn't will not have the right answer.

a day ago

raincole

Well said. Quantity itself is a quality.

In the worst of the worst case, they're building know-how of how to manage big datacenters, infra and data-labeling teams. These are incredibly valuable in the next few years. And no, no one, not even the AI companies' executives themselves, believes that you can delegate business know-how to LLMs.

a day ago

ap99

They're not just betting on the current tech; they're building out infra like this because probably any future tech currently being researched will also require massive data centers.

Like how the GPT LLMs were kind of a side project at OpenAI until someone showed how powerful they could be if you threw a lot more parameters at them.

There could be some other architecture in the works that makes GPTs look old - the first to build and train that new AI will be the winner.

a day ago

phito

I think their current goal is to capture as much of the market as they can while they still have the best models, their only moat. Look at Anthropic: they are clearly trying to lock their users into their ecosystem by refusing to follow conventions (AGENTS.md etc) and restricting their tools exclusively to their own services.

a day ago

mrob

Because whoever wins the AI race (assuming they don't overshoot and trigger the hard takeoff scenario) becomes a living god. Everybody else becomes their slave, to be killed or exploited as they please. It's a risky gamble, but in the eyes of the participants the upside justifies it. If they don't go all in they're still exposed to all the downside risk but have no chance of winning.

I don't expect hardware prices to go down unless the third option (economic collapse) happens before somebody triggers the dystopia/extinction option.

a day ago

WarmWash

Just to add some slight nuance but is an important distinction,

They aren't all necessarily racing to be "god", some are racing to make sure someone else is not "god".

If it weren't for Altman releasing ChatGPT, it's very likely that we would have markedly less powerful LLMs at our disposal right now. Deepmind and Anthropic were taking incredibly safe and conservative approaches towards transformers, but OAI broke the silent truce and forced a race.

a day ago

aurareturn

This is an awful article. I don't know how it reached #1 on HN.

Bottom line is that H100 prices are near 3 year highs, A100s are still profitable to run, B200 prices are increasing, no one has enough compute. Google, OpenAI, Anthropic, Meta, AWS, Azure are all compute constrained. Every single one of them said so publicly. Neo clouds are telling customers they're all sold out now and you even have to book compute in advance if you're an AI company.

> OpenAI is struggling to monetize. They turned to showing ads in ChatGPT, something Sam Altman once called a “last resort”, while Anthropic is crushing them with the more profitable corporate customers and software engineers.

So the AI bubble is bursting because OpenAI is trying to monetize free users on ChatGPT with ads, but Anthropic is kicking butt in AI? What kind of logic is that? It seems like AI can be monetized, as Anthropic shows. Is AI going to burst because OpenAI can't monetize but Anthropic can?

> I wouldn’t be surprised at all if in the next couple of quarters we see OpenAI looking for an exit. It will be interesting because the sizes are now so big that we will probably know all the details. The most likely buyer is Microsoft, they already own a lot of it, and because of that, they are the most interested in showing a win.

I'll take the opposite stance. I think OpenAI is going to be bigger than Microsoft in market cap within the next 3 years. I think Anthropic and OpenAI are going to run laps around current big tech except maybe Google. For example, in a few years, I think AI agents could completely replace Microsoft Office, Microsoft's cash cow.

> Independent reports state that Claude metered models are priced 5x more expensive than their subscribers pay

Already dispelled. It isn't 5x more expensive than what their subscribers pay. Inference has a gross margin of 50%+. It's been repeated over and over again by Anthropic's CEO, OpenAI's CEO, and just about anyone who's done deep analysis on token profitability. If you don't believe the OpenAI and Anthropic CEOs, just look at the inference providers on OpenRouter. They don't have VCs backing them selling tokens at a loss; they should be making margins on every token in order to keep the lights on.

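The margin claim is simple arithmetic; the per-million-token prices below are illustrative assumptions for the sketch, not published figures:

```python
# Illustrative token-margin arithmetic for the 50%+ gross margin claim.

def inference_gross_margin(price_per_m_tokens, cost_per_m_tokens):
    """Gross margin on inference, as a fraction of revenue."""
    return (price_per_m_tokens - cost_per_m_tokens) / price_per_m_tokens

# e.g. selling at an assumed $15 per million output tokens, serving at $7:
margin = inference_gross_margin(15.0, 7.0)
# margin ≈ 0.533, i.e. consistent with a "50%+" gross margin on inference
```

Note this is gross margin on serving only; whether it covers training capex is a separate question, which is exactly where the thread disagrees.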
a day ago

doom2

> Bottom line is that H100 prices are near 3 year highs, A100s are still profitable to run, B200 prices are increasing, no one has enough compute.

Then why aren't the hardware manufacturers of components needed by AI companies making plans yesterday to bring new fabs online to meet demand? That isn't a gotcha question, I genuinely want to know. The money involved isn't that much compared to the money changing hands between Nvidia, Microsoft, OpenAI, etc., and it's not like once in-progress data center construction is complete they won't need to buy more RAM and GPUs, especially with any new advances in technology that might happen.

Inevitably someone will reply that hardware manufacturers don't want to be stuck losing money on a facility because the bubble popped and demand disappeared, but if Anthropic and OpenAI are going to "run laps around current big tech", it should be a no-brainer to increase production capacity.

a day ago

jsnell

A new fab will need to be filled with advanced equipment like lithography machines. They are among the most complex things humanity has ever built.

There is one supplier of EUV lithography machines in the world, ASML. They are basically acting as an integrator for hundreds of highly specialized components manufactured to unimaginable levels of precision. Each of them has roughly one eligible supplier in the world, operating at full capacity. To expand, they'll need yet another set of specialized and almost impossible-to-build equipment.

So the supply chain moves incredibly slowly, and the slowness is intrinsic due to the complexity and depth of the supply chain. It can't be fixed with just money. IIRC ASML is aiming to merely double their production of EUV lithography machines by 2030.

a day ago

doom2

Sure, I didn't mean to suggest that it would be easy or fast to increase manufacturing capabilities, just that the confidence I'm seeing around AI should extend to the manufacturers (if that confidence for the future growth and success of OpenAI and Anthropic is warranted). That is, the business decision to increase RAM and GPU supply should be "easy".

a day ago

jsnell

Right, but the business decisions probably aren't the constraint at this point? (But were a year ago.)

Once the ability of the supply chain to grow has been saturated, no amount of extra confidence will make it grow faster.

a day ago

aurareturn

They are. They're making as many fabs as they can as fast as they can.

The bottleneck is ASML, who can only make so many EUV machines. No one else can make EUV machines.

Scaling chip fabs and chip equipment is much harder. And you have to understand that chip fabs go bankrupt if demand suddenly drops so they have to be more cautious by default.

a day ago

zozbot234

If you're really compute constrained do you really need EUV machines? You can make do with DUV fabrication nodes, albeit at somewhat higher cost. The trailing edge is where a lot of the mass impactful innovation is, e.g. trying to replicate more advanced EUV nodes with DUV multiple patterning.

a day ago

aurareturn

That’s what’s happening. Companies who were planning a move to advanced nodes for non AI chips are delaying it. All the advanced nodes are going to AI or smartphone chips only.

17 hours ago

senordevnyc

There was a good episode on Dwarkesh's podcast about this in the last few weeks, just a deep dive into the semiconductor industry and what the bottlenecks are.

19 hours ago

nunez

> I think AI agents could completely replace Microsoft Office

How? What do you think lawyers/government will use to write briefs?

a day ago

aurareturn

With ChatGPT

a day ago

the_gipsy

> but Anthropic is kicking butt in AI

that's not what the article said:

> They turned to showing ads in ChatGPT, something Sam Altman once called a “last resort”, while Anthropic is crushing them

a day ago

aurareturn

Yes, that's what he said.

He said AI is going to bust because OpenAI needs to put ads on free tier. Then he said Anthropic is doing great with enterprise customers.

So which is it? Is AI going to burst because OpenAI needs to put ads on ChatGPT? Or is AI not going to burst because Anthropic is doing great in enterprise?

The logic has glaring flaws.

a day ago

veunes

OpenAI overtaking Microsoft? Seriously? Microsoft has a massively diversified business spanning from gaming and cloud infra to B2B software that the entire world runs on. OpenAI has exactly one product (matrix weights), which is getting heavily commoditized by open-source models every single day. Once a theoretical Llama 4 catches up to GPT-5, an API price war is going to completely nuke their hyper-margins

9 hours ago

aurareturn

That one product can reproduce or replace nearly all of Microsoft's services. It's not OpenAI that is going to do it. It's people and other companies wielding OpenAI's model that will do it.

3 hours ago

HackerThemAll

> I think OpenAI is going to be bigger than Microsoft in market cap within the next 3 years.

I have yet to see how a one-legged business model with just a single product (that is not crude oil), without a plan and money, is going to become sustainable. Oh yeah, maybe they'll finally make money on those autonomous lethal weapons. That sounds the easiest.

a day ago

aurareturn

Sure. I'll give you a basic plan without any insider knowledge on OpenAI.

First, OpenAI and Anthropic are the leaders in model capabilities. Google is a close 3rd but 3rd nonetheless.

Second, ChatGPT likely has about 1 billion active users right now. I think ads on ChatGPT will surpass even Google search ads in the future. There will be a class of users who will never pay for ChatGPT subscriptions and that's ok. Meta and Google are two of the most profitable companies in history who almost rely solely on free users for their cash cows. "Ask ChatGPT" is already "google it" for the masses.

Third, there is so much untapped revenue potential in the science and medicine fields that OpenAI can eventually own with Anthropic. Microsoft stands no chance here since they can't build competing models.

Fourth, I can easily see ChatGPT morphing into agents for consumers, and people will pay for them. AI is moving up the value chain fast. I don't see any reason why consumers will pay for Netflix but won't pay for ChatGPT.

Just some basic ideas based on public knowledge. I'm sure there are plenty more.

I'm not going to bet my house that OpenAI will become bigger than Microsoft in 3 years, but I'll put down a few hundred dollars on this bet.

a day ago

niam

I don't discount this as a possibility but my impression is that the OpenAI brand isn't very sticky.

Internet Explorer being pre-installed on Windows devices didn't prevent it from being demolished by newcomer Chrome throughout the 2010s. Now we're looking at a product that's even less integrated, and whose value is exposed through universal interfaces (human language, images, etc.).

If OpenAI succeeds, I imagine that remarkably little of it will have come from the brand. But subtracting the first-mover brand advantage: they can either compete on the frontier, which seems difficult and bears potentially diminishing returns (particularly with respect to distillation); or compete as a commodity, which I imagine cannot justify their valuation/spend.

It seems like a very uphill battle.

a day ago

fragmede

For people that use ChatGPT the same way you do, yeah it's not. For people in the throes of AI psychosis who've named their ChatGPT and have a deep relationship with it, switching to a newer model from OpenAI is an issue, nevermind switching to a different model from a different company.

a day ago

niam

I considered that but I don't see it being very impactful. It presumes a user who cares enough about "their" ChatGPT that they can't move from a particular model provider, but simultaneously does not care enough that model providers themselves have a financial motivation to shoo users onto their newer and more efficient models.

The transition from GPT4 to GPT5 was not well received among this crowd -- nevermind that I think this crowd is pretty small (comparatively) to begin with. I just don't imagine you can build a business on that sliver of a sliver, much less one that justifies OpenAI's spending.

a day ago

d2ssa

Most people don't give a hoot about that; they have much more interesting stuff going on in life.

a day ago

d2ssa

"Third, there is so much untapped revenue potential from science, medicine field that OpenAI can eventually own with Anthropic. "

Lol... yeah. They don't even look like a going concern at this rate, let alone that.

a day ago

skeeter2020

>> Building a datacenter is supposed to be a “safe” investment in normal times, so banks give private credit and mortgages to finance them.

Except the investment is more like a railway or utility. It generates like 3% return, which is definitely not good enough for the people providing the money, or (in the case of the profitable companies) anywhere near the double-digit returns they make on their technology products. I won't be surprised when we see consolidation of marginal players and abandonment of the losers, just like you can find rail lines to nowhere, and fiber that's never been used.

a day ago

ajay-b

I would be very sad to lose services like ChatGPT. It has significantly improved my workflow by digesting and analyzing huge documents, and helping me to synthesize and respond better. Maybe I am part of a minority.

a day ago

coder68

The good news is local models have significantly improved. If it all goes down today, you can still run e.g. Qwen 3.5 at home, and it's "good enough" for most workloads.

With a gaming GPU you can run Qwen3.5-35B-A3B. I use 122B-A10B on my local rig (1x6000 Pro), and 397B-A17B on my 2x6000 Pro server (some spillover into CPU/RAM). It's pricey now but probably within a few years it'll become very affordable.
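As a rough, hypothetical back-of-envelope for why those parameter counts land on that hardware (the 0.5 bytes/parameter assumes ~4-bit quantization, and the 1.2 overhead factor for KV cache and runtime is my own guess, not anything from this thread):

```python
# Very rough VRAM estimate for running a quantized model locally.
# Bytes per parameter depends on quantization: fp16 ~2.0, 8-bit ~1.0, 4-bit ~0.5.
def vram_gb(params_billions, bytes_per_param=0.5, overhead=1.2):
    """Weights plus a fudge factor for KV cache and runtime overhead."""
    return params_billions * bytes_per_param * overhead

print(round(vram_gb(35), 1))   # ~21 GB: fits a 24 GB gaming GPU
print(round(vram_gb(122), 1))  # ~73 GB: fits a single 96 GB workstation card
print(round(vram_gb(397), 1))  # ~238 GB: two 96 GB cards plus CPU/RAM spillover
```

MoE models need all experts resident even though only a few activate per token, so total parameter count (not active parameters) is what drives the memory estimate.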

a day ago

raincole

Don't worry lol. It's not going anywhere. The article is just ragebaiting. Verbatim:

> Anthropic is already in a push to reduce costs and increase revenue

Yeah, it's totally a bad sign when a company tries to... reduce costs and increase revenue.

a day ago

mattmanser

Their point is that it's a bad sign at this stage of the game; there's still a lot of competition.

Usually in a land grab like this you spend, spend, spend.

Uber was still subsidizing customers' rides until fairly recently to kill off the competition.

a day ago

raincole

When AI companies spend a lot: a sign of bubble bursting.

When AI companies look to cut cost: a sign of bubble bursting.

When RAM price goes up: a sign of bubble bursting.

When RAM price goes down: a sign of bubble bursting.

a day ago

Lerc

A lot of this makes me imagine an aeroplane flown by a mad pilot, overloaded and running out of fuel. The passengers are all blaming the guy sitting in the back knitting a parachute, telling him that the chute will never work because the wool is the wrong colour.

The tragedy is that when it's all over, one of the surviving passengers will go, "See! I knew we were going to crash because of that knitter."

a day ago

EternalFury

If somehow recovering the capex is not counted, and if somehow the cost of developing future models is not counted, then yes, inference costs of current leading models allow a profit.

But those things are tied together.

Even xAI, which now has a reasonably competitive model, is struggling to achieve PMF. Meta is in shambles because their models have underperformed for years now.

a day ago

ethagnawl

> If investor money dries up, they will be forced to cut their losses and pass the true costs to their users.

I do not see this talked about often enough whilst everyone is in the process of introducing hard dependencies on these services into their workflows.

a day ago

senordevnyc

Really? Virtually every AI thread on HN has multiple people promising doom and gloom once the labs start passing the "true costs" onto the users. This very post has multiple deep comment chains arguing about this!

19 hours ago

titzer

The article says "...and RAM prices are crashing because new models won’t need as much," and I went and read the link. The link was a puff piece for a very specific compression mechanism that...no one is using?

I do hope that RAM prices come down but this was just wishful thinking.

a day ago

KaiserPro

The problem with these kinds of posts is that the "How" is almost useless. I can tell you how the bubble pops: the value of these AI companies crashes and takes out a lot of other stuff with it.

The interesting questions are: "What triggers it" and "what also goes tits up"?

The issue with high/international finance is that a good percentage of it (if not more) is fraudulent or semi-fraudulent bollocks.

"Here is a startup that is worth x million because y." Both of those statements are bollocks. However, it's in the interest of most people to agree with that bollocks to get money. If enough money is given, there is a chance that the startup will make money.

If we look a few years back, NFTs fulfilled that niche quite nicely. It was obviously bollocks, but a very convenient way to launder money, or run a series of rugpull operations.

The problem we have to contend with now is that the sheer amount of money that has been invested all disappearing at once would require 2007/8 levels of coordination to unfuck. The US government does not have the requisite number of admins to pull that off again, and no political will to ever have that expertise again. So if AI does go pop, and it takes a lot of money with it, I would put a guess on China doing the money lubrication and extracting a subtle but richly ironic level of control in exchange.

Also, there's no guarantee that AI will trigger the next bubble popping; my money is on Private Equity.

a day ago

martinvol

> The problem with these kind of posts is that "How" is almost useless, I can tell you how the bubble pops: The value of these AI companies crash and take out a lots of other stuff with it.

That's like saying "I know exactly how you're going to die, your heart will stop"

a day ago

NickNaraghi

> Taking this into account, Google is extremely well positioned to weather the storm. When they announce capex expenditure, they don’t spend it overnight. They can simply deploy month by month until their competitors struggle to raise and get forced to capitulate. At that point they can just ramp down the spending and declare victory in a cornered market. They don’t need capex, they just need to make it very clear for everyone that nobody can outspend them.

Have you tried Gemini 3.1 lately? It is not even close to Opus 4.6 never mind Claude 5.

This post, like many pessimistic takes, seriously discounts innovation and the exponential takeoff of recursive self-improvement.

a day ago

endymion-light

Exponential take-off is great until it stops. Genuinely, what are the signals showing any of the large models are undergoing exponential takeoff and recursive self-improvement?

Currently a lot of that appears to be marketing hype to drive up usage. Is it exponential, or are the labs spending exponentially more for smaller and smaller gains from LLMs?

a day ago

bogzz

What recursive self-improvement?

a day ago

shubhamjain

> OpenAI is struggling to monetize. They turned to showing ads in ChatGPT, something Sam Altman once called a “last resort”, while Anthropic is crushing them with the more profitable corporate customers and software engineers. Their shopping feature flopped and they shut down Sora, both supposed to be revenue drivers.

I don't think Sora was ever thought of as a "revenue driver," considering how notoriously expensive and unpredictable video generation via inference is. OpenAI is just a repeat of Uber—minus the scandals—in a different decade. Uber got itself into tons of businesses related to transportation on the assumption that it would all be viable "one day." Same stuff that OpenAI is doing.

I would say, once the bubble bursts—which is likely, considering the geopolitical environment—OpenAI, Anthropic, and Alphabet are likely to be the winners, with a lot of small players at the tail end. Anthropic won over programmers, and OpenAI everyone else. For millions of people, AI = ChatGPT, so I would bet that OpenAI can still become profitable once they cut down their expenses.

a day ago

JohnTHaller

> minus the scandals

Given the tech bros involved, we just don't know about them yet. Also was this comment generated using AI? Look at all the em dashes.

a day ago

thebeardredis

Hopefully soon. My new unwords include, e.g., "agentic".

a day ago

m12k

Remember, having the dot com bubble burst did not prevent the internet from being integrated more and more in society over the next couple decades. What it did was stop the headless investment where money was thrown at anything that tangentially could be called "online". We went from "nobody knows what this is, but everyone wants a piece of it" to "we know what it is, and we sure did pursue a lot of bad ideas when we didn't". Expect something similar to happen with AI - having the bubble burst will not stop it in its tracks, but it will change what gets invested in.

a day ago

hnthrow0287345

I don't see this bubble really popping, as in sinking the economy. Some circular investing and enough write-offs will happen to keep the largest recession indicators from informing the general population that there's actually a recession. You also have a government willing to do shady shit for its own benefit at the expense of responsible governing and ethics, and we have already seen the business leaders of the biggest tech companies cozy up to the administration.

My guess is that cloud companies will scoop up the data centers for pennies on the dollar, and the GPUs get written off or fire-sold to enthusiasts still wanting to run local models. Then they can offer exceptionally low initial prices to new customers and get more people locked in. Or maybe we see a couple of new cloud companies start up, but that would likely need lower interest rates.

a day ago

lstodd

DC infra will be scooped up by cloud guys, that's a given. As for GPUs.. well low-precision tflops have other uses besides inference. You can run Doom for example.

a day ago

Havoc

Gov bailout seems like the only way out.

a day ago

relation_al

RAM's dropping? Woohoo!

a day ago

256BitChris

I could see OpenAI hitting financial issues which triggers some media induced panic and for people to claim the AI bubble has popped.

However, the core utility of the best AI (read: Anthropic's, ATM, by miles) will still exist and be leveraged by those who have learned to use it well.

I could also see the exponentially declining power requirements offsetting the exponential-but-slower growth of AI compute demand, which would leave a lot of unused capacity in these massive data centers.

I think of it like the old mainframes in the 70s which would take an entire city block to run, and now we have the equivalent of millions, if not billions of them in our pockets.

a day ago

baq

Anthropic isn’t the best by any reasonable measure. They’re the best in some areas and get pwned in others.

In general AI is very much like human intelligence in the regard that no two models are the same just like no two people are the same. IOW if you are a single model shop you might even not have any idea that you’re falling behind.

a day ago

jqpabc123

> I think of it like the old mainframes in the 70s

I think this is a good comparison to current AI.

> billions of them in our pockets.

AI in your pocket (but first on the desktop) is a real possibility.

a day ago

_puk

A lot of Anthropic's recent improvements come from task focus and improved orchestration around the models, not purely from massive changes in the models themselves.

This bodes well: we're at a point where even if the bubble burst, we'd still have usable AI going forward.

a day ago

eieje

It’s pretty much undeniable at this point that the sentiment has changed.

About 2 months ago this place was unbearable - filled with doom and hype AI posts. I welcome the calming and eventual slow release of the bubble.

a day ago

cmrdporcupine

The coming months are the reckoning, in which the poor quality of the tooling and the safeguards around it becomes evident and is hopefully rectified.

By which I mean the competent organizations are the ones that will come up with cultural and technical solutions to manage the quantity and quality of the code better.

Others will suffer severe quality issues. Not because the "AI"s produce inherently inferior code, but because the volume of code is too high to manage review of, and too high to maintain the internal organizational knowledge needed to handle the pages in the middle of the night when servers go down because of code nobody really understood.

I produce masses of independent project work all day long in my spare time using these tools and they blow me away. But in the context of professional work on teams with other coworkers, the results are difficult to reason about and often impossible to competently review, and it's not clear the results are superior. IMHO companies that drink too deep from the well without caution could be burned badly.

Aside:

I hate to say it, but there is no sense in which Anthropic has the clearly better product than OpenAI at this point. I know Claude caught developers' hearts through the fall, but GPT5.4 is a more powerful, careful, and competent model for coding, and Codex is a far less buggy and more performant TUI. For the last 3 months I've gone back and forth between the two, and I always run anything written with Claude Opus 4.6 by myself and my coworkers through Codex for review; it is constantly finding severe correctness issues, to the point where I simply won't subscribe to Anthropic's product anymore.

On top of that, OpenAI provides far higher token limits. Even their $20 plan goes quite far.

If I was just building CRUD websites, Claude Code would probably be fine, and it does indeed show more "initiative" and "imagination", but I've seen it build way too many race conditions and correctness issues to trust it or the work my coworkers make with it.

a day ago

nexos

I think ultimately the AI bubble is bound to burst, solely based on the fact that no AI company has turned a profit. A business model consisting of pure speculation on profitability, when profit has not come in for 4 years now, indicates that the tech industry is over-betting on AI. That, plus consumer backlash at the way AI is jacking up consumer prices on RAM and other hardware, means the bubble is bound to burst. To paraphrase Linus Torvalds: AI is a helpful tool, but I look forward to the day it's a regular part of life and the hype cycle ends.

a day ago

lnfromx

Okay, let's suppose all those companies would be profitable if training stopped today. What if token demand is shrinking? I think big parts of the current demand are artificially built by FOMO and marketing, without real value generated by them. There is no indication in economic data of some productivity boom resulting from AI usage. Next thing is energy costs - those will soon eat into profitability too. I don't see how this bubble can't burst.

a day ago

martinvol

I don't think token demand will shrink because we're still just learning how to use it; demand will skyrocket. The problem is what price we'll be willing to pay for it, especially if competition keeps soaring.

a day ago

lnfromx

But are we really still learning? I feel like we have already converged on a set of use cases. Also, I am always wondering: if LLMs are what they promise to be, why is it so difficult to find sources of real (measurable) value? Wouldn't a disillusion with those overpromises trigger a reduction in demand?

a day ago

LarsDu88

The world has seen this play out before. Launch a service, sell it at a loss to achieve hypergrowth, raise prices, add ads, and enshittify.

The thing that is different is the scale and the hardware. When Britain underwent its rail-building boom in the 1850s, the bubble bursting left the kingdom with 150 years' worth of infrastructure. Unless we invest in energy buildouts, we will be left with billions in rapidly depreciating GPUs.

a day ago

beepbooptheory

Just checked, and my API bill for this stuff is about $2.50 this month. Am I really in the minority here? I know there are a lot of kids into the openclaw and paying for subscriptions and stuff, but beyond that literally no one I know (who isn't a developer) is paying for it, and seemingly would never dream of paying for it. It would be like paying for Gmail to them, I think.

I just don't understand why it justifies so much spending!

a day ago

nickcageinacage

AI is shit. I just want this to be over. Can we move on

a day ago

elorant

I feel that even if the bubble bursts hardware prices will still take years to normalize. So no clear benefit for the average consumer here.

a day ago

baggachipz

Consumers and retail investors will bear most of the brunt from this bubble. Even taxpayers, as the government will most likely bail out the "too big to fail" ai companies in the "race against China". All based on bullshit, hype, and greed.

a day ago

post-it

Cheaper hardware, discounts on stocks, and we keep AI itself? My flavour of hopium, sign me up.

a day ago

mvdtnz

> And independent of whether Microsoft makes money or not in their OpenAI endeavor, it kills the story: they were betting the whole growth story on AI, and if that doesn’t work out, then what’s left to justify a high stock price?

Microsoft's stock price today is the same as it was in late 2021 before anyone cared about AI. What would happen? Nothing. I don't think it's a significant revenue driver today. Microsoft, like everyone else, is speculating that AI will drive profits in the future. If it all fell apart there will certainly be losers but I don't see why it would bring down Microsoft.

a day ago

HackerThemAll

Excellent reading to realize how the rich, greedy investment monkeys with no plan other than "let's build a data center" will ultimately drag the market and the economy down. This time it may not explode as abruptly as in the dotcom era, but will slowly sink as the stupid US data center boom proves unprofitable. Billions burned for nothing more than a run for the money.

a day ago

positron26

When will this concern farm end? Internet is ant-milling harder than a model gone psychotic on synthetic data. Call me when it's over.

Back to the mines. The Vulkan only writes itself when prompted with well-conditioned problem statements.

a day ago

monegator

> How this affects you?

> checks list ...

nope, nothing will either directly or indirectly affect me. Let it happen sooner rather than later, and unleash the mobs on the tech bros that set the world on a course to make everybody's life more miserable. We'll still be here to pick up the scrapped RAM and GPUs to train and infer local models, thank you very much.

a day ago

coffeebeqn

The current best models are already very capable of disrupting the jobs of millions of people. I don't think a scenario where we just go back to pre-Claude Code exists, and I'm sure the same models can be tuned for much other white-collar work at similar capability.

a day ago

eieje

People keep saying this but nothing of the sort has happened.

People continue to work, and some proportion of those working use LLMs regularly.

Enough time has passed that subjective statements about the future don't pass muster. Look at the numbers - there have been no large-scale layoffs beyond correcting for over-hiring. Has hiring slowed down? Sure. However, I'd wager most firms are finding it pretty difficult to think of projects to take on that will generate positive NPV. If that's the case, why would they hire? Moreover, the focus has returned to cash flows - not product-based growth metrics. Which again reinforces the point about project selection.

Efficiency-generated growth does not continue forever - it's short-lived.

a day ago

monegator

Might be. There is too much busywork as it is, but we need people to work in order to make money, in order to spend it, in order to keep the circus from going under. It's the circle of life.

Let me remind you that you are not paying the full price for the service, and all the value of those companies is out of thin air. More or less the premise of the article. *When* you are asked the real price, we'll see whether companies prefer a human or a bot they can't pass blame to.

a day ago

jarek83

I wonder if AI labs could be bailed out - like banks.

See, they kind of became a national asset, and letting them go down will leave the USA watching China take the lead for a very long time. It just can't happen - right? So we'll all just fund it with taxes.

a day ago

franze

.... so what? The technology exists, the models exist. Even when the bubble bursts, things will not go back to the state "before AI". Even if model development stopped today (not the worst thing that could happen), it would still be the most impactful invention since the printing press.

a day ago

hk__2

Yes, that’s what the author wrote in the second sentence of the post: "AI is here to stay."

a day ago

irusensei

I guess the point is that once the hype subsides, enshittification will ensue.

a day ago

jqpabc123

Another possibility not really addressed here --- local LLMs.

AI on hardware you own and control --- instead of a metered service provider. In other words, a repeat of the "personal computing" revolution but this time focused on AI.

TurboQuant could be a key step in this direction.

a day ago

schnitzelstoat

Yeah, I don't think local LLMs will keep up with what the massive corporations put out. But they might get to a level of performance where it just doesn't matter for most users.

And people would prefer to run a model locally for 'free' (not counting the energy cost) rather than paying for an LLM subscription.

a day ago

zozbot234

TurboQuant helps with KV quantization, which is not very relevant to local LLMs, since context size matters most when you run inference with large batches. For small-scale inference, weights dominate. (Even if you stream weights from SSD, you'll want to cache a sizeable fraction to get workable throughput, and that dominates your memory usage.)
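A quick sketch of that point, with hypothetical model dimensions (a 30B dense model, 48 layers, 8 KV heads under GQA, head_dim 128, fp16 throughout; none of these numbers come from the article): at batch size 1 the weights dwarf the KV cache, while at serving-scale batch sizes the cache dominates, which is where quantizing it pays off.

```python
# Memory for weights vs. the KV cache in a transformer.
# KV cache = 2 (K and V) * layers * kv_heads * head_dim * seq_len * batch * bytes.
def weights_gb(params_billions, bytes_per_param=2.0):  # fp16 weights
    return params_billions * bytes_per_param

def kv_cache_gb(layers, kv_heads, head_dim, seq_len, batch, bytes_per_elem=2.0):
    return 2 * layers * kv_heads * head_dim * seq_len * batch * bytes_per_elem / 1e9

w = weights_gb(30)                                  # 60 GB of fp16 weights
local = kv_cache_gb(48, 8, 128, 32_768, batch=1)    # ~6.4 GB: weights dominate
server = kv_cache_gb(48, 8, 128, 32_768, batch=64)  # ~412 GB: cache dominates
```

Halving the bytes per cache element barely moves the batch-1 total but cuts the batch-64 footprint by hundreds of GB, which is why cache-quantization tricks matter to datacenters more than to home rigs.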

a day ago

netdevphoenix

Local LLMs don't sound profitable at all for those building them. If you really wanted a SOTA model, you would be paying eye-watering amounts to own it, unless you got an open-sourced one.

a day ago

jqpabc123

> unless you got an open sourced one.

Ding, ding, ding --- we have a winner.

https://techstartups.com/2026/03/26/nvidia-backed-ai-startup...

https://tiiny.ai/

a day ago

HardCodedBias

In general:

Cynicism makes you sound smart. Optimism makes you successful.

The cynicism around this technology is everywhere, even though it clearly has real power to solve problems. It is a technology that enables so many use cases that were impossible before; that makes it very highly hyped. And that is causing an immune (over)reaction from natural skeptics, which is an error.

People need to take a measured, reality based, view of how the technology is being used today, the adoption curve, and the increase in capabilities over time.

It's clearly being used strongly, and may even be revolutionary.

Bubbles burst when there's no 'there' there. AI has an undeniable 'there'—the only question is the timing of the ROI.

a day ago

martinvol

Bubbles are created by people investing more than is reasonable in something, independent of the actual value it will generate for society.

a day ago

richard___

Complete bs.

a day ago

martinvol

great feedback

a day ago

dist-epoch

excellent comment

a day ago

general_reveal

HN is no longer a reliable place for the truth. Quite frankly, unless you are utterly self-educated, you are terribly vulnerable to this place.

At this rate, I’d almost prefer to talk on a private mailing list with vetted resumes.

a day ago

rvz

> HN is no longer a reliable place for the truth.

"No longer?" It never was.

Especially with AI boosters being allowed to degrade the comments section, shill their paid blogs, and violate the HN guidelines.

a day ago

myspy

Why?

a day ago

general_reveal

You have to be uneducated to even read an "AI is a bubble" article. Anyone working on this stuff knows how much more compute we need.

a day ago

dgb23

Aren't you conflating the technical side of it with the economic one?

A bubble doesn't necessarily mean that the underlying tech/innovation isn't useful. It's a financial and economic phenomenon that is pretty well understood and researched:

- During the hype cycle, investors tend to overestimate the short to mid term effects and underestimate the long term effects.

- It's near impossible to pick the winners in advance, and research has shown that investors underestimate how many losers there will be.

- The financial system/market works very well when there are localized issues with debt. Those get almost automatically detected and repaired. But broad increases in credit, not so much. Those spread into the whole system in non-obvious and complex ways and destabilize it, which can lead to very large corrections.

etc.

a day ago

myspy

Thanks for clearing this up, as I don't work in that area.

Personally, I'd say it's a problem that prices of consumer goods go up this far to satisfy this part of the market. We could use a more sensible way to advance the technology.

a day ago

user34283

That problem seems to mostly impact teenage gamers who need more than 16 GB of memory and can't afford the extra $300.

In my opinion this is incomparable to what we are seeing with agentic AI, which is rapidly replacing hand-written code.

I figure chances are AI is not going to stop here.

a day ago

A_D_E_P_T

Two things can be true at the same time:

- AI is a genuinely transformative technology on par with the internet and on track to probably surpass the smartphone

- The inflated valuations, the circular flows of money (or "money"), and the financial cup-shell game mean that the players of the game are all a few bad weeks away from catastrophe. This is, of course, nothing new for SV -- but the scale this time is new. Some believe it will soon collapse -- "bubble," thus.

a day ago

rvz

Yes. Both things can be true at the same time.

The question is when the frontier AI companies will turn a profit on said transformative technology, since other than for NVIDIA and big tech it is losing them tens of billions, and who will survive a crash when it comes.

You know you are in a bubble when people with a clear financial incentive go on newsletters and podcasts, posting extremely outlandish predictions to sell the public on something.

The amount of engineers becoming snake-oil salesmen, and vibe-coders becoming overnight cybersecurity experts selling AI courses, is a good indicator which I am watching.

a day ago

d2ssa

Agreed. People like Huang are doing too much weird stuff now; it's obvious.

The game is up - huge returns are not coming, but some areas will benefit from LLMs, like software engineering. Continued competition and reinvestment among the players will yield good outcomes imo.

Investors who didn't sell their stock in time are going to be p1ssed though. Wonder how the fallout will be managed.

a day ago

lstodd

> The amount of engineers becoming ..

This is good. It's how you know they lacked the intellectual rigour required to be engineers in the first place and thus never were.

a day ago

d2ssa

This may sound like a bizarre comment, but one could argue the collusion re: wages back in the day was a good thing - it kept the people who weren't all that passionate about it out.

a day ago

dude250711

> Anyone working this stuff...

Things look different from within a bubble, you need an outside perspective.

a day ago

Der_Einzige

Correct. This place is a cesspool.

a day ago