Wasting their precious limited time on this planet for performative hand wringing.
AI is only going to get better and better. Eventually writing software by hand with programming languages will be thought of as the punch-card phase of software development.
Do these people think we'll be writing software in 200 years' time? That anybody will be maintaining rsync, let alone this "moral human hands only" version of it?
The anti-AI lot are trying to make all AI content wear a scarlet letter. I wish they would wear one themselves so that we could filter them from our timeline.
It is not "performative hand wringing" to observe that a tool sucks and to reject its use. You cannot, at present, write quality software with AI tools. At best you get something you could've made yourself, slower than you could've made it yourself. Only a fool insists on using a tool when it has been proven to not work.
I want this technology! You don't speak for all of us.
I'm sick of techno-Luddism. I'm sick of complaints about water use in a world that has avocados, beef, and fabric dyes. I'm sick of complaints about power use when you have your air conditioners, winter heat, air travel, and gaming PCs.
I'm sick of artists saying AI image and video sucks. I'm sick of pretend artists, armchair warriors, obsessed fans, and pickmes toeing the same line.
I'm sick of engineers saying these models aren't a huge performance gain.
You haters and skeptics out there can keep doing you, but I'm going to keep using the technology. We'll see where the chips fall.
I was raised on optimism and dreams of the future. I want that. I don't want to die with the same incrementalism we've always had. I want orders of magnitude more.
This is our one moment of awareness in a cosmically infinite void. I want spectacular. I'm tired of the chicken attitudes when people should aspire to be eagles.
What we have is so boring. There's so much more if we reach for it. Holodecks, models that cure every molecular cause of cancer, doubled and tripled health spans, instant ability to understand every language, fast and cheap autonomous p2p travel, everyone on earth lifted out of poverty, a Michelin chef in your kitchen, ...
There’s a difference between being a hater and acknowledging the reality of the technology and those building it. I want all of those things too. However, I do not understand why LLMs will get them for us. Instead I see a few really powerful people looking to get more powerful. I see a powerful tool being presented as god in a box. I see the most resources ever spent on a singular thing being spent in a way that’ll _best case_ be mostly obsolete in a few years.
I want real AI. I want cures for cancer. I too want to live in a post-scarcity world. We had most of the technologies to do that before this. However, the companies and investors involved in the AI build-out chose to sit on massive reserves instead of trying to directly solve those problems. There exist proposals which solve hunger, the energy transition, etc., and together they wouldn’t amount to even half of what’s been spent.
That tells me those involved want nothing other than money and power.
That’s a take. Would be cool if you could be happy with what is, rather than what you wish could be, while still being optimistic about new tech. But even if you can’t, would be nice if you could have your optimism without letting other people’s differing opinions get under your skin.
I’d also suggest that, if you care about things like curing “every molecular cause of cancer”, you spend some time and energy working in that field to understand the real problems there and work towards real solutions (with models or whatever floats your boat), rather than hoping that some poorly defined techno-optimism hand-waving will just happen to result in the best of all possible worlds, with no downsides or alternative outcomes.
Also, crazy to say that the miracle of existence is boring because we don’t have the tech you imagine!! If that’s your take now, no new technology is going to fix it. You’ll just still be bored with the holodeck and your one precious life to live.
Last year I used a bunch of models to try to generate Rust code. They all sucked.
This February I tried again and used Claude to generate Rust code. I have never been more stunned in my life. It's just as good as I am, and 30x faster. No fluff, the code is verbatim just as I would have written.
I then tried other models. Total disappointment.
I've continued to repeat this experiment. Opus is the only model that can write Rust reasonably.
Codex produces junk to this day. It passes variables that aren't needed, it abuses pointers, it creates overly verbose monstrosities...
I don't want any single company to win. I want OpenAI to be competitive. I want open source models to win. But right now, Claude Code and Opus are it.
> This February I tried again and used Claude to generate Rust code. I have never been more stunned in my life. It's just as good as I am, and 30x faster. No fluff, the code is verbatim just as I would have written.
Having looked at a bunch of known or suspected (based on the intent of the code and/or what I know about the developer(s)) LLM-generated Rust, there are only a few explanations here:
1. You're way better at prompting than (virtually) anyone else.
2. You're vastly overestimating how good the Rust code it produced is.
3. You hand-held the model throughout and made lots of edits.
4. Your handwritten Rust code is very bad.
Because from every example I've seen, these models write horrible Rust. Sure, it may technically pass all the tests, but it's horribly pessimized, badly organized, and doesn't even attempt to use the type system. If there aren't bugs now, there will be the second it tries to refactor or add a new feature, etc. etc.
(I also strongly suspect that the same would be true for other languages, but I can detect it in Rust more easily because it's my main language)
I recently tried with C# code and Avalonia on Linux. Total disaster. Could only get things to run after 10 attempts or so, and was only trying a very basic example. For some of the experiments I actually gave up.
Gruber's usually too much of a walking Apple ad for my taste, but I love this.
We need to define the things we hate. Give them words. Use the words as weapons.
I've been thinking about this a lot recently with "watermarks" of the statistical and non-visible kind used to track image creators. (Google embedding "this image is AI but also here's the user ID".)
I've been thinking that practice needs a new word too. It's not watermarking, it's signals-math based tracking, so maybe sigtracked.
I find the characterisation of his Apple praise fascinating. It's really not that zealous, unless you hate Apple (which is fine). I think this image of him speaks more of the prominence of the Apple superfan image in popular culture than the actual reality of his position.
It isn't anymore, but if you go back a decade or two, it really was that zealous. He really did use to blindly defend Apple (e.g. things like this: https://daringfireball.net/2006/09/open_challenge), but I think he's grown more skeptical of Apple lately.
I don't want to split hairs over what counts as overzealous, but I will say that Apple ~20 years ago earned more praise than Apple does today. This is probably reflected in the writing.
Nobody trying to compete with Google, OpenAI, and Anthropic should be playing the small models / local models game.
Foundation model labs should be building very large reasoning models, then leaving it to the community to distill them down.
You can't scale a small model up, but you can scale a large model down.
I'm convinced the only way we'll have a seat at the table in the future and avoid total runaway takeoff is if there are very large models within 80% of the capabilities of the frontier models. Tiny RTX models do diddly squat to remain competitive.
Build open weights models for running on H200s. I'll spin them up on RunPod or Lambda.
I do think there's a chance open weight models have a bit of a moment with the costs of frontier models growing on business balance sheets. It's unfortunate from my "privacy loving" PoV that it's mostly Chinese models filling the gap (the top models on OpenRouter, for instance).
I have used Mistral models out of pure ideology for web agents and the like which aren't doing a lot of heavy lifting.
Antirez’s Deepseek 4 Flash implementation that can run on MacBooks was also a revelation. It runs decently on an M5 Max with 128GB, and it’s pointing out other bottlenecks, like prefill speed, which will improve.
I thought distillation meant small models don't have to compete with the big models and can always eventually achieve close parity, but it's just a matter of time to do the distillation? (i.e. how much lag do you want to live with) Am I oversimplifying?
There is likely a theoretical limit to how much intelligence you can pack into a model of a given size (especially when stretching that over a large input context size).
Our evals are pretty complex so we only recently started testing ~30B class models, which are now becoming quite smart (on par with the frontier from 1 year ago). Mistral is far behind, but I'm rooting for them.
I think the cost pressures just make most AI generated stuff slop. It's not that AI can't make good stuff; it's that the slop-to-good ratio is hundreds of times worse with AI published music than with human stuff, simply because AI generation cost is essentially zero.
That's purely an economic argument, but I am also still looking for the rare good music from AI. What I find is, generally speaking, not that cohesive and is unremarkable. A lot of human work is like that too, but discovering good music from people feels much less daunting.
That's not to say you can't make effortful novel content using AI, but this is just lazy hollow stimulation. Like all the laziest of AMVs, nothing to say outside of "isn't this cool?".
We want to see the person underneath and what ideas they explore through the medium - AI is just a fancy new tool of the times.
Who cares what brush or canvas Vincent used to make Starry Night? Without his name on it, it's just another oil painting.
People on social media follow creators who make novel content irrespective of how they made it.
AI generated content will always be lower quality because the entry level is literacy, hence YouTube's intervention with their "it's probably not worth your time" label.
Runtime evaluated feature flags can always be used for control plane levers and emergency handbrakes.
You just have to label them as such and prevent other teams from fiddling with them.
This is not an antipattern, it's just semantic hand-wringing.
My team managed critical systems in the online flow of billions of dollars of daily payment volume. We also wrote the feature flag system that the rest of the company used. Not only were we completely fine with feature flags as long-lived control plane levers, we heavily used the system that way ourselves.
You just have to clearly distinguish between ephemeral rollout flags (and clean them up or expire them) and the permanent control plane levers.
It's the exact same functionality for both sets of tools. Just different practices around the two usages.
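A minimal sketch of that distinction, assuming a simple in-memory store: both kinds of flag share one runtime lookup, and only the practices around them differ (rollout flags carry an expiry, control plane levers carry an owner). All names here (`FlagKind`, `FlagStore`, the day-number expiry, the team strings) are hypothetical and not taken from any real feature flag system.

```rust
use std::collections::HashMap;

/// How a flag is meant to be used. Both kinds are evaluated the same way at runtime.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
enum FlagKind {
    /// Ephemeral rollout flag that should be cleaned up by `expires_on_day`
    /// (a day counter, a hypothetical unit chosen for this sketch).
    Rollout { expires_on_day: u32 },
    /// Permanent control plane lever that only its owning team may flip.
    ControlPlane { owner: &'static str },
}

#[derive(Debug, Clone)]
struct Flag {
    kind: FlagKind,
    enabled: bool,
}

#[derive(Default)]
struct FlagStore {
    flags: HashMap<String, Flag>,
}

impl FlagStore {
    fn register(&mut self, name: &str, kind: FlagKind, enabled: bool) {
        self.flags.insert(name.to_string(), Flag { kind, enabled });
    }

    /// Identical lookup for both kinds; unknown flags evaluate to off.
    fn is_enabled(&self, name: &str) -> bool {
        self.flags.get(name).map_or(false, |f| f.enabled)
    }

    /// Anyone may flip a rollout flag; a control plane lever rejects other teams.
    fn set(&mut self, name: &str, team: &str, enabled: bool) -> Result<(), String> {
        let flag = self
            .flags
            .get_mut(name)
            .ok_or_else(|| format!("unknown flag: {}", name))?;
        if let FlagKind::ControlPlane { owner } = flag.kind {
            if owner != team {
                return Err(format!("{} is a control plane lever owned by {}", name, owner));
            }
        }
        flag.enabled = enabled;
        Ok(())
    }

    /// Rollout flags past their expiry, sorted by name, as candidates for cleanup.
    fn stale_rollout_flags(&self, today: u32) -> Vec<String> {
        let mut stale: Vec<String> = self
            .flags
            .iter()
            .filter(|(_, f)| {
                matches!(f.kind, FlagKind::Rollout { expires_on_day } if expires_on_day < today)
            })
            .map(|(name, _)| name.clone())
            .collect();
        stale.sort();
        stale
    }
}
```

Callers use `is_enabled` without caring which kind a flag is. The kind only matters to `set` (who may touch the lever) and to `stale_rollout_flags` (what should be expired or cleaned up).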
I completely agree with your distinction and that is exactly what they mandated :)
I don't think that is what most people colloquially mean by "feature flags" though. Even most teams in Shopify abused "ephemeral" flags for long periods of time.
When they rolled out the mandate it was very annoying for my team because we had a lot of operational flags like you're describing that we needed to get exemptions for.
You mean the incident where his copilot intentionally locked him out of the cockpit and crashed the plane into a mountain? Hardly seems like an indictment of locked doors to me.
There was also Helios 522, where one of the cabin attendants only managed to enter minutes before the engines flamed out. There is a strong argument that if the door hadn't been locked he could've entered earlier.
And my understanding is that the current theory for MH370 is that the pilot locked out the copilot and then depressurised the cabin.
There are non-fatal cases like Ethiopian Airlines Flight 702 where the copilot locked the captain out when he was in the restroom (though loss of life was still a possibility, one of the engines had flamed out and was on emergency fuel).
As with all incidents, there are many factors that lead to them, but in these cases the presence of locked and reinforced cockpit doors contributed to the incident (in malicious cases the fact the door was impenetrable was clearly part of the decision-making, and in accident cases it was obviously an impediment to any positive outcome once the incident occurred).
> But a lot of the people who end up causing destruction do so because there's some problem affecting them that's not being dealt with.
I think solving the socioeconomic, geopolitical, and religious tensions that lead to plane hijackings is much harder than simply putting doors on cockpits and forcing people to do body scans.
But in the long run we should maybe still attempt to solve it, before there are mandatory body scans everywhere and cars only start if you pass a mental examination first?
Exactly. That's treating a symptom, which creates more and more extreme symptoms. After a while though it's far more costly and complex to keep treating the wide variety of symptoms than it ever would've been to treat the cause, but because so much infrastructure has been built around treating those symptoms it's too difficult to dedicate resources to treating the cause.
> Languages with a single way to do things benefit the most: Rust
I posit that Rust is the optimal language to emit from LLMs unless you have to target web, a specific platform, or a legacy project:
- The required error handling for Option<T>, Result<T,E>, and required destructuring of sum types naturally reduces errors by an order of magnitude
- If it compiles, chances are higher the code is correct. Especially if you're using strong typing.
- The training data for Rust is likely of a higher quality than, say, JavaScript
- The resulting code is fast and portable
- You get really nice threading and async, and you don't have to think about the silly "color problem" because the LLM handles it for you.
- Using an LLM takes away any trouble you'd have with the borrow checker or refactoring, or otherwise working in a slightly more difficult language.
- Applications are single binary executables.
Since LLMs let you generate and manipulate Rust code as fast as you would Python, why not just emit Rust instead? It's the least brittle language, and it's incredibly performant.
Even for the web, Rust is a great language out of LLMs. It was quite surprising given the early performance of Python that Rust does so well. It really speaks to the high dimensional generality of transformer translation models.
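To make the first bullet concrete, here is a small sketch of what "required" handling of `Option<T>`, `Result<T, E>`, and sum types looks like in practice. The `ConfigError` type, the `port=` line format, and both function names are made up for illustration. The point is only that the compiler rejects a `match` that leaves out a variant, so the missing and malformed cases cannot be silently skipped.

```rust
use std::num::ParseIntError;

/// Hypothetical error type for this sketch: a sum type with one variant per failure.
#[derive(Debug, PartialEq)]
enum ConfigError {
    MissingPort,
    InvalidPort(ParseIntError),
}

/// Looks for a `port=<n>` line in a made-up config format and parses the value.
fn parse_port(config: &str) -> Result<u16, ConfigError> {
    // `find_map` yields an Option: there may be no `port=` line at all.
    let raw: Option<&str> = config
        .lines()
        .find_map(|line| line.trim().strip_prefix("port="));

    // Both Option variants have to be handled before a value can be used.
    match raw {
        None => Err(ConfigError::MissingPort),
        Some(text) => text
            .trim()
            .parse::<u16>()
            .map_err(ConfigError::InvalidPort),
    }
}

/// Exhaustive destructuring: removing any arm below is a compile error.
fn describe(result: &Result<u16, ConfigError>) -> String {
    match result {
        Ok(port) => format!("listening on {}", port),
        Err(ConfigError::MissingPort) => "no port configured".to_string(),
        Err(ConfigError::InvalidPort(err)) => format!("bad port: {}", err),
    }
}
```

For example, `parse_port("host=example\nport=8080")` returns `Ok(8080)`, while a config with no `port=` line returns `Err(ConfigError::MissingPort)`. This shows the mechanism only; it does not by itself support the "order of magnitude" figure claimed in the bullet.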
This "effort" is entirely wasted.