
This is a very cool demo - if you dig deeper there’s a clip of them having a “blind” AI talk to another AI with live camera input to ask it to explain what it’s seeing. Then they, together, sing a song about what they’re looking at, alternating each line, and rhyming with one another. Given all of the isolated capabilities of AI, this isn’t particularly surprising, but seeing it all work together in real time is pretty incredible.

But it’s not scary. It’s… marvelous, cringey, uncomfortable, awe-inspiring. What’s scary is not what AI can currently do, but what we expect from it. Can it do math yet? Can it play chess? Can it write entire apps from scratch? Can it just do my entire job for me?

We’re moving toward a world where every job will be modeled, and you’ll either be an AI owner, a model architect, an agent/hardware engineer, a technician, or just... training data.




> We’re moving toward a world where every job will be modeled

After an OpenAI launch, I think it's important to take one's feelings about the future impact of the technology with a HUGE grain of salt. OpenAI are masters of hype. They have been generating hype for years now, yet the real-world impacts remain modest so far.

Do you remember when they teased GPT-2 as "too dangerous" for public access? I do. Yet we now have Llama 3 in the wild, which even at the smaller 8B size is about as powerful as the [edit: 6/13/23] GPT-4 release.

As someone pointed out elsewhere in the comments, a logistic curve looks exponential in the beginning, before it approaches saturation. Yet, logistic curves are more common, especially in ML. I think it's interesting that GPT-4o doesn't show much of an improvement in "reasoning" strength.


A Google search for practically any long-tail keywords will reveal that LLMs have already had a very significant impact. DuckDuckGo has suffered even more. Social media is absolutely lousy with AI-powered fraud of varying degrees of sophistication.

It's glib to dismiss safety concerns because we haven't all turned into paperclips yet. LLMs and image gen models are having real effects now.

We're already at a point where AI can generate text and images that will fool a lot of people a lot of the time. For every college-educated young person smugly pointing out that they aren't fooled by an image with six-fingered hands, there are far more people who had marginal media literacy to begin with and are now almost defenceless against a tidal wave of hyper-scalable deception.

We're already at a point where we're counselling elders to ignore late-night messages from people claiming to be a relative in need of an urgent wire transfer. What defences do we have when an LLM will be able to have a completely fluent, natural-sounding conversation in someone else's voice? I'm not confident that I'd be able to distinguish GPT-4o from a human speaker in the best of circumstances and I'm almost certain that I could be fooled if I'm hurried, distracted, sleep deprived or otherwise impaired.

Regardless of any future impacts on the labour market or any hypothesised X-risks, I think we should be very worried about the immediate risks to trust and social cohesion. An awful lot of people are turning into paranoid weirdos at the moment and I don't particularly blame them, but I can see things getting seriously ugly if we can't abate that trend.


> I'm not confident that I'd be able to distinguish GPT-4o from a human speaker in the best of circumstances and I'm almost certain that I could be fooled if I'm hurried, distracted, sleep deprived or otherwise impaired.

Set a memorable verification phrase with your friends and loved ones. That way, if you call them out of the blue or from some strange number (and they actually pick up for some reason) and tell them you need $300 to get you out of trouble, they can ask you to say the phrase and will know it's you if you respond appropriately.

I've already done that and I'm far less worried about AI fooling me or my family in a scam than I am about corporations and governments using it without caring about the impact of the inevitable mistakes and hallucinations. AI is already being used by judges to decide how long people should go to jail. Parole boards are using it to decide who to keep locked up. Governments are using it to decide which people/buildings to bomb. Insurance companies are using it to deny critical health coverage to people. Police are using it to decide who to target and even to write their reports for them.

More and more people are going to get badly screwed over, lose their freedom, or lose their lives because of AI. It'll save time/money for people with more money and power than you or I will ever have though, so there's no fighting it.


The way to get around your side-channel verification phrase is by introducing an element of stress and urgency: "omg, help, I'm being robbed and they need $300 immediately or they'll hurt me, no time for a passphrase!" They can additionally feign memory loss.

Alternatively, while it may be difficult to trick you directly, phishing the passphrase from a more naive loved one or bored coworker and then parroting it back to you is also a possibility. Etc.

Phone scams are no joke and this is getting past the point where regular people can be expected to easily filter them out.


Or just ask them to tell you something only you both know (a story from childhood, etc). Reminds me of a book where this sort of thing was common (don't remember the title):

1. something you have

2. something you know

3. something you are

These three things are required for any authz.


For many people it would be better to choose specific personal secrets due to the amount of info online. I'm not a very active social media user, and what little I post tends not to be about me, but from reading 15-year-old Facebook posts made by friends of mine you could definitely find at least one example in each of those categories. Hell, I think probably even from old work-related LinkedIn posts.


We had a “long lost aunt” come out of nowhere that got my phone number from a relative who got my number from another relative.

At that point, how can you validate it, as there’s no shared secret? The only thing we had was validating childhood stories. After a preponderance of them, we accepted she was real (she refused to talk on the phone — apparently her voice was damaged).

We eventually met her in real life.

The point is, you can always use these three principles: asking relatives to validate the phone number — something you have — and then the stories — something shared — and finally meeting in real life — something you are.


Oh, you remember those little games that your mom played on Facebook/TikTok that asked her "her favorite" things? Sorry, she already trained the AI on who she was.

I only say this sort of jokingly. Three out of four of my parents/in-laws are questionably literate on the internet. It wouldn't take much of a "me bot" for them to start telling it the stories of our childhood and then that information is out there.


"Hey Janelle, what's wrong with Wolfie?"


Your foster parents are dead


Another amazing demo of an AI talking to another AI over a phone line.


People are and have always been screwed over by modestly equipped humans.


Lincoln already made that observation in the 1850s, "You can fool some of the people all of the time, and all of the people some of the time"

As technology advances those proportions will be boosted. Seems inevitable.


Not sure how much of that has to do with technology or simply a widening gap in people's education that we seem to have been seeing for a while now.


"Hey mom and dad, we need a memorable phrase so AI bots can't call us and pretend to be each other."


I think humankind has managed massive shifts in what and who you could trust several times before.

We went from living in villages where everyone knew each other to living in big cities where almost everyone is a stranger.

We went from photos being relatively reliable evidence to digital photography where anyone can fake almost anything and even the line between faking and improving is blurred.

We went from mass distribution of media being a massive capital expenditure that only big publishers could afford to something that is free and anonymous for everyone.

We went from a tiny number of people in close proximity being able to initiate a conversation with us to being reachable for everyone who could dial a phone number or send an email message.

Each of these transitions caused big problems. None of these problems have ever been completely solved. But each time we found mitigations that limit the impact of any misuse.

I see the current AI wave as yet another step away from trusting superficial appearances to a world that requires more formal authentication protocols.

Passports were introduced long ago but never properly transitioned into the digital world. Using some unsigned PDF allegedly representing a utility bill as proof of address seems questionable as well. And the way in which social security numbers are used for authentication in the US is nothing short of bizarre.

So I think there are some very low hanging fruits in terms of authentication and digital signatures. We have all the tools to deal with the trust issues caused by generative AI. We just have to use them.


During these transitions people can die. Consider the advent of yellow journalism and the connection with the Spanish-American War of 1898: https://en.m.wikipedia.org/wiki/American_propaganda_of_the_S...


No doubt, people die of absolutely everything ever invented and also of not having invented some things.

The best we can ever hope to do is find mitigations as and when problems arise.


Which is why we started saying "whoa, slow down" when it came to some particular artifacts, such as nuclear weapons, so as to avoid the 'worse than we can imagine' scenario.

Of course this is much more difficult when it comes to software, and very few serious people think the idea of an ever-present government monitoring your software would be a better option than reckless AI development.


Outside of the transition to a large city, virtually everything you've mentioned happened in the last half century. Even the phone was expensive, and not widely in use until less than 100 years ago.

That's massive fast change, and we haven't culturally caught up to any of it yet.


Here's another one: We went from in-person story telling to wide distribution of printed materials, sometimes by pseudonymous authors.

This happened from the 15th century onward. By the 19th century more than half the UK population could read and write.


Just because we haven't yet destroyed the human race through the use of nuclear weapons doesn't mean that it can't or won't happen now that we have the capacity to do so. And I would add that we developed that capacity within less than 50 years of creating the first atomic bomb. We're now living on a knife's edge and at the mercy of safeguards which we don't give much thought to on a daily basis because we hope that they won't fail.

That's how I look at where we're going with AI. Plunge along into the new arms race first and build the capacity, then later figure out the treaties and safeguards which we hope will keep our society safe (and by that I don't mean a Skynet-like AI-powered destruction, but the upheaval of our society potentially as impactful as the industrial revolution.)

Humanity will get through it, I'm sure. But I'm not confident it will be without a lot of pain and suffering for a large percentage of people. We also managed to survive 2 world wars in the last century--but it cost the lives of 100 million people.


I tend to think the answer is to go back to villages, albeit digital ones. Authentication only enforces that an account is accessed by the correct "user", but particularly in social media many users are bad actors of various stripes. The strongest account authentication in the world doesn't help with that.

So the question, I think, is how do we reclaim trust in a world where every kind of content can be convincingly faked? And I think the answer is by rebuilding trust between users such that we actually have reason to simply trust the users we're interacting with aren't lying to us (and that also goes for building trust in the platforms we use). In my mind, that means a shift to small federated and P2P communication since both of these enable both the users and the operators to build the network around existing real-world relationships. A federation network can still grow large, but it can do so through those relationships rather than giving institutional bad actors as easy of an entrance as anyone else.


But this causes other problems such as the emergence of insular cultural or social cliques imposing implicit preconditions for participation.

Isn't it rather brilliant that you can just ask questions of competent people in some subreddit without first becoming part of that particular social circle?

It could also reintroduce geographical exclusion based on the rather arbitrary birth lottery.


More tech won’t solve it. Areas, either physical or logical, with no or low tech might help.


> Each of these transitions caused big problems. None of these problems have ever been completely solved. But each time we found mitigations that limit the impact of any misuse.

This is a problem with all technology. The mitigations are like technical debt but with a difference. You can fix technical debt. Short of societal collapse, mitigations persist, the impacts ratchet upward and disproportionately affect people at the margin.

There's an old (not quite joke) that if civilization fell, a large percentage of the population would die of the effects of tooth decay.


Sure, all tech has 'real' effects. It's kinda the definition of tech. But all of these concerns more or less fall into the category of "add it to the list of things you have to watch out for living in the 21st century" - to me, this is nothing crazy (yet)

The nature of this tech itself is probably what is getting most people - it looks, sounds and feels _human_ - it's very relatable and easy for a non-tech person to understand it and thus get creeped out. I'd argue there are _far_ more dangerous technologies out there, but no one notices and / or cares because they don't understand the tech in the first place!


>to me, this is nothing crazy (yet)

The "yet" is carrying a lot of weight in that statement. It is now five years since the launch of GPT-2, three years since the launch of GPT-3 and less than 18 months since the launch of ChatGPT. I cannot think of any technology that has improved so much in such a short space of time.

We might hit an inflection point and see that rate of improvement stall, but we might not; we're not really sure where that point might lie, because there's likely to still be a reasonable amount of low-hanging fruit regarding algorithmic and hardware efficiency. If OpenAI and their peers can maintain a reasonable rate of improvement for just a few more years, then we're looking at a truly transformational technology, something like the internet that will have vast repercussions that we can't begin to predict.

The whole LLM thing might be a nothingburger, but how much are we willing to gamble on that outcome?


If we decide not to gamble on that outcome, what would you do differently than what is being done now? The EU already approved the AI act, so legislation-wise we're already facing the problem.


The EU AI act - like all laws - only matters to those who are required to follow it.


Yes, but it's really hard to see a technical solution to this problem, short of having locked down hardware that only runs signed government-approved models and giving unlocked hardware only to research centers. Which is a solution that I don't like.


If you get off the internet you'd not even realise these tools exist though. And for the statement that all jobs will be modelled to be true, it'd have to be impacting the real world.


Is it even possible to "get off the internet" without also leaving civilisation in general at this point?

> it'd have to be impacting the real world

By writing business plans? Getting lawyers punished because they didn't realise that "passes bar exam" isn't the same as "can be relied on for citations"? By defrauding people with synthesised conversations using stolen voices? By automating and personalising propaganda?

Or does it only count when it's guiding a robot that's not merely a tech demo?


I’ll be worried about jobs being removed entirely by LLMs when I see something outside of the tech bubble genuinely having been removed by one - has there been any real cases of this? It seems like hyperbole. Most people in the world don’t even know this exists. Comparing it to the internet is insane, based off of its status as a highly advanced auto complete.


800 million dollar studio expansion halted - https://www.theguardian.com/technology/2024/feb/23/tyler-per...


Thank god! Enough Medea, already! I chalk this up as a win for humanity.


Sure, but think about all of the jobs that won't exist because this studio isn't being expanded, well beyond just whatever shows stop being produced. Construction, manufacturing, etc.

Edit: Also this doesn't mean less Medea, just fewer actual humans getting paid to make Medea or work adjacent jobs


Not like there's nothing else to construct.

Maybe it's time to construct some (high[er] density) housing where people want to live? No? Okay, then maybe next decade ... but then let's construct transport for them so they can get to work, how about some new subway lines? Ah, okay, not that either.

Then I guess the only thing remains to construct is all the factories that will be built as companies decouple from China.


> has there been any real cases of this?

Apparently so: https://www.businessinsider.com/jobs-lost-in-may-because-of-...

Note that this article is about a year old now.

> Comparing it to the internet is insane, based off of its status as a highly advanced auto complete.

(1) I was quoting you.

(2) Don't you get some cognitive dissonance dismissing it in those terms, at this point?

"Fancy auto complete" was valid for half the models before InstructGPT, as that's all the early models were even trying to be… but now? The phrase doesn't fit so well when it's multimodal and can describe what it's seeing or hearing and create new images and respond with speech, all as a single unified model, any more than dismissing a bee brain as "just chemistry" or a human as "just an animal".


"By 2005 or so, it will become clear that the Internet’s impact on the economy has been no greater than the fax machine’s."

~ Paul Krugman, winner of the 2008 Nobel Memorial Prize in Economic Sciences


If you get away from roads you wouldn't realize engines exist. Also, the internet is (part of) the real world.


Sure and there’s endless AI generated blog spam from “journalists” saying LLMs are amazing and they’re going to replace our jobs etc… but get away from the tech bubble and you’ll see we’re so far away from that. Full self driving when? Autonomous house keepers when? Even self checkout still has to have human help most of the time and didn’t reduce jobs much. Call me a skeptic but HN is way too optimistic about this stuff.

Replacing all jobs except LLM developers? I’ll tell my hairdresser


If we could predict "when", that would make the investment decisions much easier.

But we do have a huge number of examples of jobs disappearing thanks to machines — even the term "computer" used to refer to a job.

More recently and specifically to LLMs, such losses were already being reported around this time last year: https://www.businessinsider.com/jobs-lost-in-may-because-of-...


In a world where OpenAI exists, it really does require an almost breathtaking lack of imagination to be a skeptic.


Or you’ve been around the block long enough to recognize hype and know when your imagination may be wrong. Imagination isn’t infallible.


Right, that entire internet thing was complete hype, didn't go anywhere. BTW, can you fax me the menu for today?

And that motorized auto transport, it never went anywhere, it required roads. I mean, who would ever think we'd cover a huge portion of our land in these straight lines. Now, don't mind me, I'm going to go saddle up the horse and hope I don't catch dysentery on the way into town.


I don't think anybody's denying that revolutions happen. It's just that the number of technologies that actually turned out to be revolutionary are dwarfed by the number of things that looked revolutionary and then weren't. Remember when every television was definitely going to be using glasses-free 3D? People have actually built flying cars and robot butlers, yet the Jetsons is still largely wishful thinking. The Kinect actually shipped, yet today we play games mostly with handheld controllers. AI probably has at least some substance, but there's a non-zero amount of hype too. I don't think either extreme of outcome is a foregone conclusion.


Capabilities aren't the problem, cultural adoption is. Just yesterday I talked to someone who still googles solutions to their Excel table woes. Didn't they know of Copilot?

Maybe they didn't know, maybe none of their colleagues used it, their company didn't pay for it, or maybe all they need is an Excel update.

But I am confident that using Copilot would be faster than clicking through the sludge that are Microsoft Office help pages (third party or not.)

So I think it is correct to fear capabilities, even if the real world impact is still missing. When you invent an airplane, there won't be an airstrip to land on yet. Is it useless, won't it change anything?


I don't see how "failing to use the best tools available" is a relevant problem for this topic, even though it is indeed a problem in other regards.


Copilot in excel is really awful.


HN comments, too. Long, grammatically perfect comments that sound hollow and a bit lengthy are everywhere now.

It's still early, and I don't see much in corporate communications, for instance, but it will be quite the change.


>Long, grammatically perfect comments that sound hollow and a bit lengthy

It's worse than I thought. They've already managed to mimic the median HN user perfectly!


No problem, I'm here to keep the language sophistication level low.


I take care of the etiquette and try to do my best to keep it low.

We need one who's doing the dirty work of not discussing.


I tried to make ChatGPT generate a counterpoint for that but it turns out you're right.


Yes. The old heuristics of if something is generated by grammar and sentence structure don't work as well anymore. The thing that fucks me up the most about it is that I now constantly have to be uncertain about whether something is human or not. Of course, you've always had to be careful about misinformation on the internet, but this raises the scalability of false, hollow, and harmful output to new levels. Especially if it's a topic I'm trying to learn about by reading random articles (or comments), there isn't much of a frame of reference to what's good info and what's hallucinated garbage.

I fear that at some point the anonymity that made the internet great in the first place will be destroyed by this.


To be fair, that was already the case for me before AI, right around the time that companies, individuals and governments found out that they could write covert ads in the form of comments and posts that look 'organic', and they started to flood Reddit, Discord, etc.

The dead internet theory started to look more real with time, AI spam is just scaling it up.


I have a strange problem: I have always written in a manner that now leads people to think I am an LLM.

It has been so bad, I even considered injecting misspelling and incorrect grammar and bad punctuation into my prose to prove my words are mine.


I feel you, people who liked the word "delve" will have to stop using it.


I'm a non-native English speaker. Edge's new feature of automatically improving my text is a godsend. Unfortunately it is blocked at work.


Many businesses don't want to send their data to a third party such as OpenAI, so this won't change until locally run LLMs become widely available in businesses.


as the meme goes "always has been"

i remember seeing the change when GPT-2 was announced


We’ve reached a stage where it would be advisable not to release recent photos of yourself, nor any video with sound clips, to the public, unless you want an AI fake instaperson of yourself starting to reach out to members of your externally visible social network, asking for money, emergency help, etc.

I guess we need to have an AI secretary to take all phone calls from now on (the spam folder will become a lot more interesting with celebrity phone calls, your dead relative phoning you, etc.)


Hopefully, we will soon enter the stage where nobody believes anything they see anymore. Then, you no longer have to be afraid of being misinterpreted, because nobody is listening anymore anyway. Great time to be alive!


Luckily there’s a “solution” to that: Just don’t use the internet for dialogue anymore.

As someone that grew up with late-90’s internet culture and has seen all the pros and cons and changes over the decades, I find myself using the internet less and less for dialogue with people. And I’m spending more time in nature and saying hi to strangers in reality.

I’m still worried about the impact this will have on a lot of people’s ability to reason, however. “Just” TikTok and apps like it have already had devastating results on certain demographics.


Statistically, more and more people spend time online and on their phones. I'm not sure we've reached the peak in terms of internet usage yet.


That bit "... there's a "solution"" - does it keep working in societies where there are mega corps pushing billions into developing engaging, compelling and interesting AI companions?


That's why I put it in quotation marks because it is a solution that will remain available, simply because the planet is really big and there'll always be places on the fringes. But it doesn't really solve the problem for society at large, it only solves it for an individual. But sometimes individuals showing other ways of living helps the rest of society see that there's choices where they previously thought there were none.


I don't know why anyone thinks this will happen. You can obviously write anything you want (we have an entire realm of works in this area that everyone knows about, fiction) and yet huge amounts of people believe passed around stories either from bad or faked media sources or entirely unsourced.


I'm not saying either you or the parent commenter is right or wrong, but fiction in books and movies is clearly fiction and we consume it as such. You are right that some people have been making up fake stories and others (the more naive) have been quick to believe in those false stories. The difference now is that it's not just text invented and written by a human, which takes time and dedication. Now it's done in a second. On top of that it's easy to enhance the text with realistic photos, audio and video. It becomes much more convincing. And this material is created in a few seconds or minutes.

It's hard to know what to believe if you get a phone call with the voice of your child or colleague, and your "child"/"colleague" replies within milliseconds in a convincing way.


I agree it's fundamentally different in application, which I think will have a large impact (just like targeted advertising with optimisation vs billboards), but my point is that given people know you can just write anything and yet misinformation abounds - I don't see how knowing that you can fake any picture or video or sound would lead to a situation where everyone just stops believing them.

I think unfortunately it will massively lower the trust of actual real videos and images, because someone can dismiss them with little thought.


Be glib, but that is one way for society to bring privacy back, and with it shared respect. I think of it as the “oh everyone has an anus” moment. We all know everyone has one and it doesn’t need to be dragged out in polite company.


I'm not sure if people work like that — many of us have, as far as I can tell for millennia and despite sometimes quite severe punishments for doing so, been massive gossips.


What you see will be custom tailored to what you believe, and your loyalty will be won. Do what the AI says and your life will be better. It already knows you better than you know yourself. Maybe you're one of those holdouts who put off a smartphone until life became untenable without it. Life will be even more untenable without your AI personal assistant/friend/broker/coach/therapist/teacher/girlfriend to navigate your life for you.


We're doing FSB's work for them. Or PLA Unit 61398's (or their comrades). Or Bureau 121's.

Brave New World indeed.


I think for most people it's far too late, as there exists at least something on the internet and that something is sufficient - photos can be aged virtually and a single photo is enough, voice doesn't change much and you need only a tiny sample, etc.

And that's the case even if you've never ever posted anything on your social media - it could be family&friends, or employer, or if you're ever been in a public-facing job position that has ever done any community outreach, or ever done a public performance with your music or another hobby, or if you've ever walked past a news crew asking questions to bystanders of some event, or if you've ever participated in some contests or competitions or sports leagues, etc, all of that is generally findable in various archives.


> photos can be aged virtually and a single photo is enough

I'm sure AI-based ageing can do a good enough job to convince many people that a fake image of someone they haven't seen for years is an older version of the person they remember; but how often would it succeed in ageing an old photo in such a way that it looks like a person I have seen recently and therefore have knowledge rather than guesses about exactly what the years have changed about them?

(Not a rhetorical question to disagree with you, I genuinely have no idea if ageing is predictable enough for a high % result or if it would only fool people with poor visual memory and/or who haven't seen the person in over a decade.)

I feel like even ignoring the big unknowns (at what age, if any, will a person start going bald, or choose to grow a beard or to dye their hair, or get a scar on their face, etc.) there must be a lot of more subtle but still important aspects from skin tone to makeup style to hair to...

I've looked up photos of some school classmates that I haven't seen since we were teens (a couple of decades ago), and while nearly all of them I think "ah yes I can still recognise them", I don't feel I would have accurately guessed how they would look now from my memories of how they used to look. Even looking at old photos of family members I see regularly still to this day, even for example comparing old photos of me and old photos of my siblings, it's surprising how hard it would be for a human to predict the exact course of ageing - and my instinct is that this is more down to randomness that can't be predicted than down to precise logic that an AI could learn to predict rather than guess at. But I could be wrong.


Maybe it's Europeans posting this kind of stuff where they have much stronger privacy laws, but if you're in the US this is all wishful thinking.

Do you shop in large corporate stores and use credit cards? Do you go out in public in transportation registered to you?

If yes, then images and habits of yours are being stored in databases and sold to data brokers.

And you're not even including every single one of your family members that use internet connected devices/apps that are sucking up all the data they can.


I was just asking about the ability of photo aging software, not commenting about privacy at all. Though yes, I am thankfully in Europe (but there are recent photos of me online).

But don't disagree with you - in a different comment that was about privacy, I (despite living under GDPR) suggested that for offline verification with known people it's better to choose secrets that definitely haven't been shared online/anywhere rather than just choosing random true facts and assuming they couldn't have been found out by hackers: https://news.ycombinator.com/item?id=40353820


> I guess we need to have an AI secretary to take in all phonecalls

Why not an AI assistant in the browser to fend all the adversarial manipulation and spam AIs on the web? Going online without your AI assistant would be like venturing without a mask during COVID

I foresee a cat-and-mouse game, AIs for manipulation vs AIs for protection one upping each other. It will be like immune system vs viruses.


I'm paranoid enough that I now modulate my voice and speak differently when answering an unknown phone call just in case they are recording and building a model to call back a loved one later. If they do get a call, they will be like, "why are you talking like that?"


But why not just make up a secret word to use with your loved ones in critical situations? In case of ..., one needs to know that secret. Otherwise, FAKE! Gotcha!


Secrets are either written down somewhere and end up on the Internet, or forgotten.


It doesn't have to be the unspeakable, but rather can be the name of the first pet or something others just can't guess on the first try.


The problem here is you're assuming your family members aren't idiots, this is your first mistake.

Chances are they've already shoved some app on their phone that's voice-to-texting everything they say and sending it off somewhere (well, lower chance if they have an iPhone).

Modern life is data/information security and humans are insanely bad at it.


By chance, they are noobs but not idiots, because they ask me about everything - they don't need Google, I know everything hahah

I don't think it's a problem to find a word or a sentence or a story - whatever - that's commonly used by everyone on a daily basis but in a different context. That's not a problem by itself :) try it

For the idiots, it is still possible to find a word. They may be idiots, but still, they work and live on their own. They're getting along in life. So, it's up to the smarter one to find a no-brainer solution.

I am confident that nothing and no one is stupid enough not to be able to adapt to something. Even if it's me who'll need to adapt to members with less brain.


Better yet, don’t answer unknown phone calls.


> unknown phone calls

This is my biggest gripe against the telecom industry. Calls pretending to be from someone else.

For every single call, someone somewhere must know at least the next link in the chain to connect a call. Keep following the chain until you find someone who, either through malice or by looking the other way, allows someone to spoof someone else's number, AND remove their ability to send messages to the current link in the chain (or to anyone). (Ideally also send them to prison if they are in the same country.) It shouldn't be that hard, right?


Companies have complex telecom setups but generally want to appear to the outside as one company number. Solution: the sender sends a packet with the number they should be perceived as. Everyone passes this on. Everyone "looks the other way" by design haha


So what, gate that feature behind a check that you can only set an outgoing caller ID belonging to a number range that you own.

The technology to build trustable caller ID has existed for a long time, the problem is no one wants to be the one forcing telcos all over the world to upgrade their sometimes many decades old systems.
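
The check itself is conceptually tiny; the hard part is making every carrier in the chain enforce it. A rough Python sketch (the customer names and number ranges are invented for illustration; a real carrier would check against its own numbering plan database):

    # Hypothetical number ranges assigned to a business customer by the carrier.
    OWNED_RANGES = {
        "acme-corp": [("+15551230000", "+15551239999")],
    }

    def caller_id_allowed(customer: str, requested_caller_id: str) -> bool:
        """Permit an outgoing caller ID only if it falls inside a range the customer owns."""
        for low, high in OWNED_RANGES.get(customer, []):
            if low <= requested_caller_id <= high:
                return True
        return False

    print(caller_id_allowed("acme-corp", "+15551234567"))  # True: inside their own range
    print(caller_id_allowed("acme-corp", "+15559876543"))  # False: someone else's number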


I can only imagine "Hello, this is Guy Incognito..."


> Social media is absolutely lousy with AI-powered fraud of varying degrees of sophistication.

has been for years mon ami. i remember when they started talking about GPT-2 here, and then seeing a sea-change in places like reddit and quora

quite visible on HN, esp. in certain threads like those involving brands that market heavily, or discussions of particular countries and politics.


People were already killing each other for thousands of years so introducing tanks was no big deal, I guess. To say nothing of nuclear weapons.


What does abating that trend look like? Most AI safety proposals I hear fall into the categories of a) we need to stop developing this technology, or b) we need laws that entrench the richest and most powerful organizations in the world as the sole proprietors of this technology. Neither of those actually sounds better than people being paranoid weirdos about trusting text/video/voice.

I think that's kinda where we need to be as a culture: these things are not trustworthy, they were only ever good as a rough heuristic, and now that ship has sailed. We have just finished a transition to treating the digital world as part of our "real" world, but it's time to step that back. Using the internet to interact with known trusted parties will still work fine, provided that some authentication can be shared out-of-band offline. Meeting people and discovering businesses and such? There will be more fakes and scams than real opportunities by orders of magnitude, and as technology progresses our filtering will only get worse.

We need to roll back to "don't trust anything online, don't share your identity or payment information online" outside of, as mentioned, out-of-band verified parties. You can still message your friends and family, do online banking and commerce, but you can't initiate a relationship with a person or business online without some kind of trusted recommendation.


>What does abating that trend look like?

I don't think anyone has a good answer to that question, which is the problem in a nutshell. Job one is to start investing seriously in finding possible answers.

>We need to roll back to "don't trust anything online, don't share your identity or payment information online"

That's easy to say, but it's a trillion-dollar decision. Alphabet and Meta are both worthless in that scenario, because ~all of their revenue comes from connecting unfamiliar sellers with buyers. Amazon is at existential risk. The collapse of Alibaba would have a devastating impact on Chinese exporters, with massive consequent geopolitical risks. Rolling back to the internet of old means rolling back on many years worth of productivity and GDP growth.


> because ~all of their revenue comes from connecting unfamiliar sellers with buyers

Well that's exactly the sort of service that will be extremely valuable in a post-trust internet. They can develop authentication solutions that cut down on fraud at the cost of anonymity.


“Extremely valuable” is another way of saying “extremely costly”.


Trust is more complex than we give it credit for.

Even when it comes to people like our parents, there are things we would trust them to do, and things that we would not trust them to do. But what happens when you have zero trusted elements in a category?

At the end of the day, the digital world is the real world, not some separate place 'outside the environment'. Trying to treat digital like it doesn't exist puts you in a dangerous place to be deceived. For example if you're looking for XYZ and you manage to leak this into the digital world, said digital world may manipulate your trusted friends via ads, articles, the social media posts they see on what they think about XYZ before you ask them.


Point a) is just point b) in disguise. You're just swapping companies for governments.

This tech is dangerous, and I'm currently of the opinion that its uses for malicious purposes are far better and more significant than LLMs replacing anyone's jobs. The bullshit asymmetry principle is incredibly significant for covert ops and asymmetric warfare, and generating convincing misinformation has become basically free overnight.


>Regardless of any future impacts on the labour market or any hypothesised X-risks

Discovering an asteroid full of gold, with as much gold as half the earth to put a modest number on it, would have a huge impact on the labour market. Mining jobs for anything conductive, like copper and silver, would all go away. Also housing would be obsolete, as we would all live in golden houses. A huge impact on the housing market, yet it doesn't seem such a bad thing to me.

>We're already at a point where we're counselling elders to ignore late-night messages from people claiming to be a relative in need of an urgent wire transfer.

Anyone can prove their identity, or identities, over the wire, wire-fully or wirelessly, anything you like. When I went to university, I was the only one attending the cryptography class; no one else showed up for a boring class like that. I wrote a story about the Electrona Corp in my blog.

What I've been saying to people for at least two years now is: "Remember when governments were not just some cryptographic algorithms?" Yeah, that's gonna change. Cryptography is here to stay, it is not as dead as people think, and it's gonna make a huge blast.
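
The "prove your identity over the wire" part is old, boring cryptography: challenge-response. A minimal Python sketch, assuming a secret exchanged in person beforehand (a public-key signature over the nonce works the same way without a shared secret):

    import hashlib
    import hmac
    import secrets

    # Assumed: exchanged face to face once, never sent over the wire again.
    SHARED_SECRET = b"correct horse battery staple"

    def make_challenge() -> bytes:
        # Verifier sends a fresh random nonce so old responses can't be replayed.
        return secrets.token_bytes(32)

    def respond(challenge: bytes) -> str:
        # Prover answers with an HMAC of the challenge under the shared secret.
        return hmac.new(SHARED_SECRET, challenge, hashlib.sha256).hexdigest()

    def verify(challenge: bytes, response: str) -> bool:
        expected = hmac.new(SHARED_SECRET, challenge, hashlib.sha256).hexdigest()
        return hmac.compare_digest(expected, response)

    nonce = make_challenge()      # "sign this for me"
    answer = respond(nonce)       # caller proves knowledge of the secret
    print(verify(nonce, answer))  # True; a cloned voice alone can't produce this

The math is the easy part; distributing keys to grandparents is the part nobody has solved.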


> Discovering an asteroid full of gold, with as much gold as half the earth to put a modest number, would have huge impact

All this would do is crash the gold price. Also note that all the gold at our disposal right now (worldwide) basically fits into a cube with 20m edges (it's not as much as you might think).

Gold is not suitable to replace steel as building material (because it has much lower strength and hardness), nor copper/aluminium as conductor (it's a worse conductor than copper and much worse in conductivity/weight than aluminium). The main technical application short term would be gold plated electrical contacts on every plug and little else...


Regarding gold, I like this infographic [1], but my favorite from this channel is wolf population by country. Point being that gold is shiny and beautiful, and it will be used even when it is not the appropriate solution to the problem, just because it is shiny.

I didn't know that copper is a better conductor than gold. Surprised by that.

[1] https://www.youtube.com/watch?v=E2Gd8CRG0cc


> The main technical application short term would be gold plated electrical contacts on every plug and little else...

.. And gold teeth and grillz.


> What i say to people for at least 2 years now, is that "Remember when governments were not just some cryptographic algorithms?" Yeah, that's gonna change. Cryptography is here to stay, it is not as dead as people think and it's gonna make a huge blast.

The thing about cryptography and government is that it's easy to imagine a great technology being adopted at the governmental level because of its greatness. But it is another thing to actually implement it. We live in a bubble where almost everyone knows about cryptographic hashes and RSA, but for most people that is not the case.

Another thing is that political actors tend to try to concentrate power in their own hands. No way will they delegate decision making to any form of algorithm, cryptographic or not.


As soon as mimicking voices, text messages and human faces becomes a serious problem, like this case in the UK [1], then citizens will demand a solution to that problem. I don't personally know how prevalent problems like that are as of today, but given the current trajectory of A.I. models, which become smaller, cheaper and better all the time, soon everyone on the planet will be able to mimic every voice, every face and every handwritten signature of anyone else.

As soon as this becomes a problem, then it might start bottom-up, citizens to government officials, rather than top to bottom, from president to government departments. Then governments will be forced to formalize identity solutions based on cryptography. See also this case in Germany [2].

One example like that is bankruptcy law in China. China didn't have any law regarding bankruptcy until 2007. For a communist country, or rather a not totally capitalist country like China, bankruptcy is not an important subject. When some people stop being profitable, they will keep working because they like to work and they contribute to the great nation of China. That doesn't make any sense of course, so their government was forced to implement some bankruptcy laws.

[1] https://www.wsj.com/articles/fraudsters-use-ai-to-mimic-ceos...

[2] https://news.ycombinator.com/item?id=39866056


A lot of these are non-AI problems. People trying to defraud the elderly need to be taken out back and shot, that’s not an AI issue.


Right, I'll just get right on a plane and travel to whereverthefuckville overseas and ask for permission to face blast the scammers. The same scammers that are donating a lot of money to their local (probably very poor) law enforcement to keep their criminal enterprise quiet. This will go well.


> I'm not confident that I'd be able to distinguish GPT 4o from a human speaker

Probably why it's not released yet. It's unsafe for phishing.


I think people are dismissive for a few reasons.

- It helps them sleep at night if their creation doesn't put millions of people out of work.

- Fear of regulation


> What defences do we have when an LLM will be able to have a completely fluent, natural-sounding conversation in someone else's voice?

The world learnt to deal with Nigerian Prince emails and nobody is falling for those anymore. Nothing was changed - no new laws or regulations were needed.

Phishing calls have been going on without an AI for decades.

You can be skeptical and call back. If you know your friends or family, you should always be able to find an alternative way to get in touch without too much effort in the modern connected world.

Just recently a gang in Spain was arrested for the "son in trouble" scam. No AI used. Most of the parents are not fooled by this.

https://www.bbc.com/news/world-europe-68931214

The AI might have some marginal impact, but it does not matter in the big picture of scams. While it is worrisome, it is not a true safety concern.


> yet the real-world impacts remain modest so far.

I second that. I remember when Google search first came out. Within a few days it completely changed my workflow, how I use the Internet, my reading habits. It easily 5-10x'd the value of the Internet for me over a couple of weeks.

LLMs are doing nothing of the sort for me.


Google was a step function, a complete leveling up in terms of usability of returned data.

ChatGPT does this again for me. I am routinely getting zero useful results on the first page or two of Google searches, but AI is answering or giving me guidance quickly.

Maybe this would not seem such an improvement if Google's results were like they were 10 years ago and not barely usable blogspam


> I am routinely getting zero useful results on the first page or two of Google searches, but AI is answering or giving me guidance quickly.

To me, this just sounds like Google Search has become shit, and since Google simply isn't going to give up the precious ad $$$ that the current format is generating, the next best thing is ChatGPT. But this is different from saying that ChatGPT is a similar step up like Search was.

For what it's worth, I agree with you that Google Search has become unusable. Google basically destroyed its best product (for users) by turning it into an ad-riddled shovelware cesspit.

That ChatGPT is merely about as good as Google Search used to be is a tragedy. Basically we had a conceptually simple product that functioned very well, and we are replacing it with a significantly more complex product.


What are you searching for? I see people complaining about this a lot but they never give examples. Google is chock full of spam, yes, but it still works for me.


Google’s results are themselves an AI product though. You’re just comparing different AIs.


OMG I remember trying Google when it was in beta, and HOLY CRAP what I had been using was like freakin night and day. AltaVista: remember that? That was the state of the art before that, and it did not compare. Night and day.


I remember Google being marginally better than Altavista but not much more.

The cool kids in those days used Metacrawler, which meta searched all the search engines.


Google was marginally better in popular searches and significantly better for tail searches. This is a big reason why it flourished with the technical and student crowd in earlier days because those exceedingly rare sub-sub-topics would get surfaced higher in the rankings. For the esoteric topics Yahoo didn't have it in catalog and Altavista maybe had it but it was on page 86. Even before spelling correction and dozens of other useful search features were added, it was tail search and finding what you were looking for sooner. Serving speed, too, but perhaps that was more subtle for some.

Metasearch only helps recall. It won't help precision, the metasearch still needs to rank the aggregate results.


I used Metacrawler, it was dog slow. The beauty of Google was it was super fast, and still returned results that were at least as good, and often better, than Metacrawler. After using Google 2-3 times I don’t think I ever used Metacrawler again.


You just gave me a weird flashback to 1997.

And hey maybe when combined with GPT-4o AskJeeves will finally work as intended.


With Altavista I had to go through 2 or 3 pages just to find the GNU website. I remember the Google beta as life-changing.


Remember dogpile? Great aggregator


Yes, Altavista was the major step over Yahoo! directory.


And I'm sure that it's doing that for some people, but... I think those are mostly in the industry. For most of the people outside the tech bubble, I think the most noticeable impact it has had on their lives so far is that they've seen it being talked about on the news, maybe tried ChatGPT once.

That's not to say it won't have more significant impact in the future; I wouldn't know. But so far, I've yet to see the hype get realised.


>LLMs is doing nothing of the sort for me.

Don't use it for things you're already an expert in, it can't compare to you yet.

Use it for learning new things, or for things you aren't very good at and don't want to bother with. For these it's incredible.


For me, LLMs mostly replaced search. I run local Ollama, and whenever I need help with coding/docs/examples, I just ask Mixtral 8x7B, and get an answer instantly, tailored to my needs.
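
For anyone curious, that workflow is a couple of lines with the ollama Python client (the model tag and prompt here are just examples; this assumes the package is installed and a local Ollama server is running):

    import ollama  # pip install ollama

    response = ollama.chat(
        model="mixtral:8x7b",  # whatever model you have pulled locally
        messages=[{"role": "user", "content": "Show a minimal example of Python's csv.DictWriter."}],
    )
    print(response["message"]["content"])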


> OpenAI are masters of hype. They have been generating hype for years now, yet the real-world impacts remain modest so far.

Perhaps.

> Do you remember when they teased GPT-2 as "too dangerous" for public access? I do. Yet we now have Llama 3 in the wild, which even at the smaller 8B size is about as powerful as the [edit: 6/13/23] GPT-4 release.

The statement was rather more prosaic and less surprising; are you sure it's OpenAI (rather than say all the AI fans and the press) who are hyping?

"""This decision, as well as our discussion of it, is an experiment: while we are not sure that it is the right decision today, we believe that the AI community will eventually need to tackle the issue of publication norms in a thoughtful way in certain research areas.

We are aware that some researchers have the technical capacity to reproduce and open source our results. We believe our release strategy limits the initial set of organizations who may choose to do this, and gives the AI community more time to have a discussion about the implications of such systems."""


That's fair: the statement isn't hyperbolic in its language. But remember that GPT-2 was barely coherent. In making this statement, I would argue that OpenAI was trying to impart a sense of awe and danger designed to attract the kind of attention that it did. I would argue that they have repeatedly invoked danger to impart a sense of momentousness to their products. (And to further what is now a pretty transparent effort to monopolize the tech through regulatory intervention.)


> (And to further what is now a pretty transparent effort to monopolize the tech through regulatory intervention.)

I disagree here also: the company has openly acknowledged that this is a risk to be avoided with regards to safety related legislation, what they've called for looks a lot more like "we don't want a prisoner's dilemma that drives everyone to go fast at the expense of safety" rather than "we're good everyone else is bad".


> yet the real-world impacts remain modest so far.

I spent part of yesterday evening sorting my freshly dried t-shirts into 4 distinct piles. I used OpenAI Vision (through BeMyEyes) from my phone. I got a clear description of each and every piece of clothing, including print, colours and brand. I am blind BTW. But I guess you are right, no impact at all.

> Yet we now have Llama 3 in the wild

Yes, great, THANKS Meta, now the Scammers have something to work with. That's a wonderful achievement which should be praised! </sarcasm>


> I got a clear description of each and every piece of clothing, including print, colours and brand. I am blind BTW.

That is a really great application of this tech. And definitely qualifies as real-world impact. Thanks for sharing that!


I can’t even get GPT-4 to reliably take a list of data and put it in a CSV. It has a problem every single time.

People read too many sci-fi books and then project their fantasies on to real-world technologies. This stuff is incredibly powerful and will have social effects, but it’s not going to replace every single job by next year.


GPT-4 is better at planning than at executing.

Have you tried asking it to generate a regex to transform your list into a CSV?
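
For a concrete sense of what that could look like, here's a hand-written sketch in Python (the input format and field names are invented for illustration; the idea is that the model proposes the pattern once and a dumb script applies it deterministically):

    import re

    lines = [
        "Name: Ada Lovelace; DOB: 1815-12-10",
        "Name: Alan Turing; DOB: 1912-06-23",
    ]

    # Capture the two fields and emit one CSV row per input line.
    pattern = re.compile(r"^Name:\s*(?P<name>[^;]+);\s*DOB:\s*(?P<dob>\S+)$")

    print("name,dob")
    for line in lines:
        m = pattern.match(line)
        if m:
            print(f"{m.group('name')},{m.group('dob')}")

Applying the pattern to a million rows is then deterministic, which sidesteps the per-row failure rate of having the model format the file itself.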


I remember when people used to argue about regex being bad or good, with a lot of low quality regex introducing bugs in codebases.

Now we have devs asking AI to generate regex formulas and pasting them into code without much concern for their validity.


Regexes are easy to test.

Bad developers do bad regexes, regardless of whether they used AI.


How do you test a regex to be 100% sure it's valid? I don't think it's possible.


If it's using classical regex, without backtracking or other extensions, a regular expression is isomorphic to a state machine. You can enumerate combinations doing something like this: https://stackoverflow.com/a/1248566

kids these days and their lack of exposure to finite automata
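
A bounded version of that idea fits in a few lines of Python: for a classical regex, brute-force every string over a small alphabet up to some length and compare against an independent definition of the language (full equivalence checking needs the automaton construction from the linked answer; this is just a sketch that catches most practical mistakes):

    import itertools
    import re

    # Classical regex, no backreferences or lookarounds: an "a", any mix of "b"/"c", then "d".
    pattern = re.compile(r"a[bc]*d")

    def oracle(s: str) -> bool:
        # Independent, hand-written definition of the intended language.
        return len(s) >= 2 and s[0] == "a" and s[-1] == "d" and all(c in "bc" for c in s[1:-1])

    alphabet = "abcd"
    max_len = 7  # ~21k strings, checked in well under a second

    for n in range(max_len + 1):
        for chars in itertools.product(alphabet, repeat=n):
            s = "".join(chars)
            assert bool(pattern.fullmatch(s)) == oracle(s), f"disagreement on {s!r}"

    print(f"regex agrees with the oracle on all strings up to length {max_len}")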


- Vehicle steering is easy to test.

- How so? I don't think it's possible to test for all cases...

- Well, it's easy, assuming a car on a non-branching track, moving with a constant speed and without any realistic external influences on it, you can simply calculate the distance traveled using the formula s = v·t. Ah, I wish I'd stop running into fools not knowing Newton's first law of motion...

- ??? Are you well?


I understand you want to refute/diminish the parent comment on finite automata, but I think you are providing a straw man argument. The parent comment does provide an interesting, factual statement. I don't believe finite state automata are at all close in complexity to real-world self-driving car systems (or even a portion thereof). Your closing statement is also dismissive and unconstructive.

I believe finite state modeling is used at NASA. A Google search brings up a few references (that I'm probably not qualified to speak to), and I also remember hearing/reading a lecture on how they use it to make completely verifiable programs, but I can't find the exact one at the moment.


I wasn't making a strawman, I was making a parody of his strawman. I thought it's obvious, since I was making an analogy, and it was an analogy to his argument.


I should have been more clear perhaps: many regexes are capable of being verified with 100% certainty: https://en.m.wikipedia.org/wiki/Thompson%27s_construction

But not all regexes (eg, those using PCRE extensions afaik) are amenable to such a treatment. Those you just tend to hope they work.


True for most things I think


Aren't you asking how to create software without bugs?


Well regex isn't Turing-complete, so it's not exactly an analysis of a program. You could reason about regex, about tokens, then describe them in a way that satisfies the specification, but theorizing like this is exactly opposite to "simple" - it would be so much harder than just learning regex. So stating that testing regex is simple is just bs. The author later confirms he is a bullshitter by his follow-up...


No, I’ll give that a shot. I have just been asking it to convert output into a CSV, which used to work somewhat well. It stumbles when there is more complexity though.


Humans stumble with that as well. Part of the problem is that CSV is not really that well defined, and it is not clear to people how quoting needs to be done. The training set might not contain enough complex examples (newlines in values?).
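That quoting point is easy to get wrong by hand; for what it's worth, Python's csv module handles commas, quotes, and embedded newlines automatically (a minimal sketch, with made-up example rows):

    import csv
    import sys

    rows = [
        ["name", "date of birth", "key ideas"],
        # one field containing a comma, quotes and a newline
        ["Ada Lovelace", "1815-12-10", 'notes on the "analytical engine",\nincluding looping'],
    ]

    writer = csv.writer(sys.stdout)  # quotes fields only when needed, doubles embedded quotes
    writer.writerows(rows)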


No, the data is very well defined. For example, “name, date of birth, key ideas,” etc.

The issue is with ChatGPT formatting a file.


Even if you get it to seemingly work 100% of the time, it will only be 99.something%. That's just not what it's for, I guess. I pushed a few million items through it for classification a while back, and the creative ways it found to sometimes screw up astounded me.


Yeah and that's why I'm skeptical of the idea that AI tools will just replace people, in toto. Someone has to ultimately be responsible for the data, and "the AI said it was true" isn't going to hold up as an excuse. They will minimize and replace certain types of work, though, like generic illustrations.


> "Someone has to ultimately be responsible for the data"

All you have to do is survive long enough as an unemployed criminal until the system gets round to exonerating you:

https://en.wikipedia.org/wiki/British_Post_Office_scandal

"The British Post Office scandal, also called the Horizon IT scandal, involved Post Office Limited pursuing thousands of innocent subpostmasters for shortfalls in their accounts, which had in fact been caused by faults in Horizon, accounting software developed and maintained by Fujitsu. Between 1999 and 2015, more than 900 subpostmasters were convicted of theft, fraud and false accounting based on faulty Horizon data, with about 700 of these prosecutions carried out by the Post Office. Other subpostmasters were prosecuted but not convicted, forced to cover Horizon shortfalls with their own money, or had their contracts terminated. The court cases, criminal convictions, imprisonments, loss of livelihoods and homes, debts and bankruptcies, took a heavy toll on the victims and their families, leading to stress, illness, family breakdown, and at least four suicides. In 2024, Prime Minister Rishi Sunak described the scandal as one of the greatest miscarriages of justice in British history.

Although many subpostmasters had reported problems with the new software, and Fujitsu was aware that Horizon contained software bugs as early as 1999, the Post Office insisted that Horizon was robust and failed to disclose knowledge of the faults in the system during criminal and civil cases.

[...]

challenge their convictions in the courts and, in 2020, led to the government establishing an independent inquiry into the scandal. This was upgraded into a statutory public inquiry the following year. As of May 2024, the public inquiry is ongoing and the Metropolitan Police are investigating executives from the Post Office and its software provider, Fujitsu.

Courts began to quash convictions from December 2020. By February 2024, 100 of the subpostmasters' convictions had been overturned. Those wrongfully convicted became eligible for compensation, as did more than 2,750 subpostmasters who had been affected by the scandal but had not been convicted."


Do you even work with humans now? I get "Computer says no" out of corporations all the time as it is; AI is just completing that loop.


> Do you remember when they teased GPT-2 as "too dangerous" for public access? I do.

I can't help but notice the huge amount of hindsight and bad faith that is demonstrated here. Yes, now we are aware that the internet did not drown in a flood of bullshit (well, not noticeably more) when GPT-2 was released.

But was it obvious? I certainly thought that there was a chance that the amount of blog spam that could be generated effortlessly might just make internet search unusable. You are declaring "hype", when you could also say "very uncertain and conscientious". Is this not something we want people in charge to be careful with?


I think the problem is, we did drown in a flood of bullshit, but we've just somehow missed it.

Even in this thread people talk about "Oh, I use ChatGPT rather than Google search because Google is just stuffed with shit". And on HN there are plenty of discussions about a huge portion of Reddit threads being regurgitated older comments.


GPT-4 already seems better at reasoning than most people. It just has an unusual training domain of Internet text.


I was going to say the same thing. For some real-world estimation tasks where I don't need 100% accuracy (example: analysing the working capital of a business based on its balance sheet, analysing some images and estimating inventory, etc.), the job done by GPT-4o is better than that of fresh MBA graduates from tier 2/tier 3 cities in my part of the world.

Job seekers currently in college have no idea what is about to hit them in 3-5 years.


I agree. The bias that many people in HN and the tech bubble are not noticing is that it's full of engineers measuring GPT-4 against software engineering tasks. In programming, the margin of error is incredibly slim, in the sense that a compiler either accepts entirely correct code (syntactically, of course) or rejects it. There is no in-between, and verifying software to be correct is hard.

In any other industry, where you just need an average margin of error close to a human's work and verification is much easier than generating possible outputs, the market will change drastically.


On the other hand, programming and software engineering data is almost certainly over-represented on the internet compared to information from most professional disciplines. It also seems to be getting dramatically more focus than other disciplines from model developers. For those models that disclose their training data, I've been seeing decent sized double-digit percentages of the training corpus being just code. Finally, tools like copilot seem ideally positioned to get real-world data about model performance.


I’d love to see this! Can you give us a couple of concrete examples of this that we can check?


Not really. Even a human bad at reasoning can take an hour to tinker around and figure things out. GPT-4 just does not have the deep planning/reasoning ability necessary for that.


Have you seen some people with technology? =)

They won't "take 1 hour of time", they try it once or twice and give up.


I think you might be falling for selection bias. I guess you are surrounding yourself with a lot of smart people. "Tinker around and figure things out" is definitely something certain humans (bad at reasoning) can't do. I already prefer the vision model when it comes to asking for a picture description (blind user) over many humans I personally know. The machine is usually more detailed, and takes the time to read the text, instead of trying to shortcut and decide for me what's important.

Besides, people from the English-speaking countries do not have to deal with foreign languages. Everyone else has to. "Aber das ist ja in englisch" ("But that's in English") is a common blocker for consuming information around here. I tell you, if we don't manage to ramp up education a few notches, we'll end up with an even higher stddev when it comes to practical intelligence. We already have perfectly normal-seeming humans absolutely unable to participate on the internet.


Reasoning and planning are different things. It's certainly getting quite good at deductive reasoning, especially when forced to check its own arguments for flaws every time it states something. (I had a several-hour chat with it yesterday, and I was very impressed by the progress.)

Planning is different in that it is an essential part of agency. That's what Q* is supposed to add. My guess is that planning is the next type of functionality to be added to GPT. I wouldn't be surprised if they already have a version internally with such functionality, but that they've decided to hold it back for now for reasons such as safety (some may care about the election this year) or simply that the inference costs are so huge they cannot possibly expose it publicly.


Does it need those things if it can just tap into artifacts generated by humans who did spend that hour?


The only reason I still have a job is that it can't (yet) take full advantage of artefacts generated by humans.

"Intern of all trades, senior of none", to modernise the cliché.


If everyone is average at reasoning then it must not be a very important trait or we’d all be at reasoning school getting better at it.

Really, philosophy seems to be one of the least important subjects right now. Hardly anyone learns about it in school.

If it were so important to success in the wild, then it would stand to reason that we'd all work hard at improving our reasoning skills, but very few do.


What schools teach is what the governments who set the curriculum like to think is important, which is why my English lessons had a whole section on the Shakespearean (400-year-old, English, Christian) take on the life and motivations of a Jewish merchant living in Venice, followed up with an 80-year-old (at the time) English poem on exactly how bad it is to watch your friends choke to death as their lungs melt from chlorine gas in the trenches of the First World War.

These did not provide useful life-lessons for me.

(The philosophy A-level I did voluntarily seemed to be 50% "can you find the flaws in this supposed proof of the existence of god?")


> These did not provide useful life-lessons for me.

Shakespeare is packed with insight.


None of the stuff we did at school showed any indication of insight into things of relevance to our world.

If I took out a loan on the value of goods being shipped to me, only for my ship to be lost at sea… it would be covered by insurance, and no bank would even consider acting like Shylock (nor have the motivation of being constantly tormented over religion) for such weird collateral, and the bank manager's daughters wouldn't get away with dressing up as lawyers (no chance their arguments would pass the sniff test today given the bar requirement) to argue against their dad… and they wouldn't need to because the collateral would be legally void anyway and rejected by any court.

The ships would also not then suddenly make a final act appearance to apologise for being late, to contradict the previous belief they were lost at sea, because we have radio now.

The closest to "relevant" that I would accept, is the extent to which some of the plots can be remade into e.g. The Lion King or Wyrd Sisters — but even then…

"Methinks, to employeth antiquated tongues doth render naught but confusion, tis less even than naughty, for such conceits doth veil true import with shadows."


They're masters of hype because their products speak for themselves (literally)


Yeah. OpenAI are certainly not masters of hype lol. They released their flagship product to basically no fanfare or advertisement. ChatGPT took off on word of mouth alone. They dropped GPT-4 without warning and waited months to ship its most exciting new feature (image input).

Even now, they're shipping text-image 4o but not the new voice, while leaving the old voice up and confusing/disappointing a whole lot of people. This is a pretty big marketing blunder.


> ChatGPT took off on Word of Mouth alone.

I remember for a good 2-3 months in 2023 ALL you could see on TikTok / YouTube Shorts was just garbage about 'how amazing' ChatGPT was. Like, video after video, and I was surprised by the repeat content being recommended to me... No doubt OpenAI (or something) was behind that huge marketing push.


Is it not possible that this would be explained by people simply being interested in the technology and the TikTok/YouTube algorithms noticing that, and that they would have placed you in the same bubble, which is probably an accurate assignment?

I doubt OpenAI spent even one cent marketing their system (e.g. as in paying other companies to push it).


Well if you were a typical highly engaged TikTok or YouTube user, you are probably 13-18 years old. The kind of cheating in school that ChatGPT enabled is revolutionary. That is going to go viral. It's not a marketing push. After years of essentially learning nothing during COVID lockdowns, can you understand how transformative that is? It's like 1,000x more exciting than pirating textbooks, stealing Mazdas, or whatever culturally self-destructive life hacks were being peddled by freakshow brocolliheads and Kim Kardashian-alikes on the platform.

It's ironic because the OpenAI creators really loved school and excelled academically. Nobody cares that ChatGPT destroyed advertising copywriting. But whatever little hope remained for the average high schooler post-lockdowns, it was destroyed by instant homework cheating via ChatGPT. So much for safety.


> No doubt

Who needs evidence when we have your lack of doubt hey?


I think you meant to say "All that "I" could see". There's a lot of bubbles in social media. Not everyone is part of your bubble.


No, it's just the masses sniffing out the new fascinating techbro thing to make content about.

In a way I'm sorry, that's what people do nowadays. I'd prefer it to be paid for, honestly.


"real-world impacts remain modest so far." Really? My Google usage has went down with 90% (it would just lead me to some really bad take from a journalist anyway, while ChatGPT can just hand me the latest research and knows my level of expertise). Sure it is not so helpful at work, but if OpenAI hasnt impacted the world I fail to see which company have in this decade.


“Replaced Google” is definitely an impact, but it’s nothing compared to the people that were claiming entire industries would be wiped out nearly overnight (programming, screenwriting, live support, etc).


Speak to some illustrators or voiceover artists - they're talking in very bleak terms about their future, because so many of them are literally being told by clients that their services are no longer required due to AI. A double-digit reduction in demand is manageable on aggregate, but it's devastating at the margin. White-collar workers having to drive Ubers or deliver packages because their jobs have been taken over by AI is no longer a hypothetical.


We had this in content writing and marketing last year. A lot of layoffs were going to happen anyway due to the end of ZIRP, AI came just at the right time, and so restructuring came bundled with "..and we are doing it with AI!".

It definitely took out a lot of jobs from the lowest rungs of the market, but on the more specialized / upper end of the ladder wages actually got higher, and a lot of companies got burned, and now they have to readjust. It's still rolling over slowly, as there are a lot of companies selling AI products and in turn new companies adopting those products. But it tells you a lot that

A) a company selling an AI assistant last year is now totally tied to automating busy work tasks around marketing and sales

B) AI writing companies are some of the busiest in employing human talent for... writing and editorial roles!

It's all very peculiar. I haven't seen anything like this in the past 15 years... maybe the financial crisis and big data was similar, but much much smaller at scale.


>It definitely took out a lot of jobs from the lowest rungs of the market, but on the more specialized / upper end of the ladder wages got actually higher

Effectively all mechanization, computerization, and I guess now AI-ization has done this. In the past you could have a rudimentary education and contribute to society. Then we started requiring more and more grade school, then higher education for years. Now we're talking about the student debt crisis!

At least if AI doesn't go ASI in the near term, the question is how we are going to train the next generation of workers to go from unskilled to more skilled and useful than the AI is. Companies aren't going to want to do this. Individuals are going to think it's risky to get an education that could be replaced by a software update. If this is left to spiral out of control, it is how a new generation of Luddites will end up burning data centers in protest that they are starving in the streets.


Colleges are seeing apprentice placements drop - why train an apprentice for two years when ChatGPT will do the work for them?


We should be thinking pretty hard right about now about why this kind of progress and the saving of these expenses is a BAD thing for humanity. The answer will touch deeply ingrained ideas about what and who should underpin and benefit from progress and value in society.


I think the claims so far have mostly been about multiplying the efforts of people.


If Google hadn't ruined Search to help Advertising perhaps it wouldn't have been such a stark comparison in information quality.


Search was always a byproduct of Advertising. Don’t blame Google for sticking to their business model.

We were naive to think we could have nice things for free.


When google search first appeared, it had nothing to do with advertising. In fact, the founders wrote a paper on why advertising would be bad.


Found the zoomer.


It will be interesting to see how they compare, years from now, when ChatGPT has been similarly compromised.


It might not happen in that way since there are alternatives available. Google had/has a monopoly on search.


For premium subscribers it'll be good, but they'll surely ruin the experience for the free tier, just like Spotify, because they just can't keep their business sustainable without showing VCs some profits.


There is little other way of making money from search.


I believe you, and I do turn to an LLM over Google for some queries where I'm not concerned about hallucination. (I use Llama 3 most of the time, because the privacy is absolute.)

But OpenAI is having a hard time retaining/increasing ChatGPT users. Also, Alphabet's stock is about as valuable as it's ever been. So I don't think we have evidence that this is really challenging Google's search dominance.


Google is an ad company. Ad prices are set at auction and most companies believe that they need ads. Fewer customers doesn't necessarily mean that earnings go down; when clicks go down, prices might go up (absent competitors for the ads). Ergo, they don't compete (yet, at least).

But ChatGPT has really hurt Google's brand image.


Ironically, I was like that for a while, but now I use regular Google search again quite a bit. A lot of the time, good old Stack Overflow is best.


The questions I ask ChatGPT have (almost) no monetary value for Google (programming, math, etc).

The questions I still ask Google have a lot of monetary value (restaurants, clothes, movies, etc.).


I use Google and it gives me AI answers.

But I agree, SO (Stack Overflow) often seems to help more than Google's AI answers.


It's well known that LLMs don't reason. That's not what they are for. It's a throwaway comment to say that a product can't do what it is explicitly unable to do. Reasoning will require different architectures. Even so, LLMs are incredibly useful.


ChatGPT 3.5 has been neutered, as it won't spit out anything that isn't overly politically correct. 4chan were hacking their way around it. Maybe that's why they decided it was "too dangerous".


" GPT-4o doesn't show much of an improvement in "reasoning" strength."

Maybe that is GPT-5.

And this release really is just incremental improvements in speed, and tying together a few different existing features.


> yet the real-world impacts remain modest so far

Go ask any teacher or graphic designer.


That's one of my biggest fears: teachers using AI-generated content without "checks" to raise / teach / test our children.


> Do you remember when they teased GPT-2 as "too dangerous" for public access? I do.

Maybe not GPT-2, but in general LLMs and other generative AI types aren't without their downsides.

From companies looking to downsize their staff to replace them with software, to the work of artists/writers being devalued somewhat, to even easier scams and something like the rise of AI girlfriends, which has also gotten some critique; some of those can probably be a net negative.

Even when it's not pearl clutching over the advancements in technology and the social changes that arise, I do wonder how much my own development work will be devalued due to the somewhat lowered entry barrier into the industry and people looking for quick cash, same as with boot camps leading to more saturation. Probably not my position individually (not exactly entry level), but the market as a whole.

It's kind of at the point where I use LLMs for dev work just so I don't fall behind, because the productivity gains for simple problems and boilerplate are hard to argue with.


> They have been generating hype for years now, yet the real-world impacts remain modest so far.

I feel like everyone who makes this claim doesn't actually have any data to back it up.


Like another comment mentioned, sigmoid curves [1] are ubiquitous with neural network systems. Neural network systems can be intoxicating because it's so "easy" (relatively speaking) to go from nothing to 80% in extremely short periods of time. And so it seems completely obvious that hitting 100% is imminent. Yet it turns out that each percent afterwards starts coming exponentially more slowly, and we tend to just bump into seemingly impassable asymptotes far from where we'd like to be.
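For concreteness, the logistic curve in question is sigma(x) = 1 / (1 + e^(-x)). For large negative x this is approximately e^x, which is why the early part of the curve looks exponential, while for large positive x it flattens out and saturates at 1; each additional fraction of the ceiling costs more and more input.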

~8 years ago, when self-driving technology was all the rage and every major company was getting on board with ever more impressive technological demos, it seemed entirely reasonable to expect that we'd all be in a world of complete self-driving imminently. I remember mocking somebody online around that time who was pursuing a class C/commercial trucking license. Yet now, nearly a decade later, there are more truckers than ever and the tech itself seems further away than ever before. And that's because most have now accepted that progress has basically stalled out, in spite of absolutely monumental efforts at moving forward.

So long as LLMs regularly hallucinate, they're not going to be useful for much other than tasks that can accept relatively high rates of failure. And many of those generally creative domains are the ones LLMs are paradoxically weakest in - like writing. Reading a book written by an LLM would be cruel and unusual punishment given the current state of the art. One domain I do see them completely taking over is search. They work excellently as natural language search engines, and "failure" in such a domain is very poorly defined.

[1] - https://en.wikipedia.org/wiki/Sigmoid_function


I'm not really sure your self-driving analogy is apt here. Waymo has cars on the road right now that are totally autonomous, and it just expanded its footprint. It has been longer and more difficult than we all thought, and those early tech demos were a glimmer of what was to come; then we had to grind to get there, with a lot of engineering.

I think what maybe seems not obvious amidst the hype is that there is a hell of a lot of engineering left to do. The fact that you can squash the weights of a neural net down to 3 bits per param and it still works -- is evidence that we have quite a way to go with maturing this technology. Multimodality, improvements to the UX of it, the human-computer interface part of it. Those are fundamental tech things, but they are foremost engineering problems. Getting latency down. Getting efficiency up. Designing the experience, then building it out.
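As a rough illustration of the low-bit-weights point, here is a naive symmetric quantization sketch (numpy only; the single per-tensor scale and 3-bit grid are deliberate simplifications of what real schemes with grouped scales and outlier handling do):

    import numpy as np

    def quantize(w, bits=3):
        """Naively quantize a weight tensor to a signed integer grid."""
        levels = 2 ** (bits - 1) - 1        # 3 bits -> integers in [-3, 3]
        scale = np.abs(w).max() / levels    # one scale for the whole tensor
        q = np.clip(np.round(w / scale), -levels, levels).astype(np.int8)
        return q, scale

    def dequantize(q, scale):
        return q.astype(np.float32) * scale

    w = np.random.randn(4, 4).astype(np.float32)
    q, scale = quantize(w)
    print(np.abs(w - dequantize(q, scale)).max())  # worst-case reconstruction error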

25 years ago, early tech demos on the internet were promising that everyone would do their shopping, entertainment, socializing, etc... online. Breathless hype. 5 years after that, the whole thing crashed, but it never went away. People just needed time to figure out how to use it and what it was useful for, and discover its limitations. 10 years after that, engineering efforts were systematized and applied against the difficult problems that still remained. And now: look at where we are. It just took time.


I don't think he's saying that AGI is impossible; almost no one (nowadays) would suggest that it's anything but an engineering challenge. The argument is simply one of scale, i.e. how long that engineering challenge will take to solve. Some people are suggesting it's on the order of years. I think they're suggesting it'll be closer to decades, if that.


AGI being "just an engineering challenge" implies that it is conceptually solved, and we need only figure out how to build it economically.

It most definitely is not.


Waymo cars are highly geofenced in areas with good weather and good quality roads. They only just (in January) gained the capability to drive on freeways.

Let me know when you can get a Waymo to drive you from New York to Montreal in winter.


> Waymo cars are highly geofenced in areas with good weather and good quality roads. They only just (in January) gained the capability to drive on freeways

They are an existence proof that the original claim that we seem further than ever before is just wrong.


There are 6 categories of self driving, starting at 0. The final level is the one we've obviously been aiming at, and most were expecting. It's fully automated self driving in all conditions and scenarios. Get in your car anywhere, and go to anywhere - with capability comparable to a human. Level 4, by contrast, is full self driving under certain circumstances and generally in geofenced areas - basically trolleys without rails. Get in a car, so long as conditions are favorable, and go to a limited set of premapped locations.

And level 4 is where Waymo is, and is staying. Their strategy is to use tiny geofenced areas with a massive amount of preprocessing, mapping out every single part of an area, not just in terms of roads but also every single meta indicator - signs, signals, crosswalks, lanes, and so on. And it creates a highly competent, but also highly rigid, system. If road conditions change in any meaningful way, the most likely outcome with this strategy is simply that the network gets turned off until the preprocessing can be redone and re-uploaded. That's completely viable in small geofenced areas, but doesn't generalize at all.

So the presence of Waymo doesn't say much of anything about the presence of level 5 autonomy. If anything it suggests Waymo believes that level 5 autonomy is simply out of reach, because the overwhelming majority of tech that they're researching and developing would have no role whatsoever in level 5 automation. Tesla is still pushing for L5 automation, but if they don't achieve this then they'll probably just end up getting left behind by companies that double down on L4. And this does indeed seem to be the most likely scenario for the foreseeable future.


This sounds suspiciously like that old chestnut, the god of the gaps. You're splitting finer and finer hairs to maintain your position that, "no, really, they're not really doing what I'm saying they can't do", all the while self-driving cars are spreading and becoming more capable every year.

I don't think we have nearly as much visibility on what Waymo seems to believe about this tech as you seem to imply, nor do I think that their beliefs are necessarily authoritative. You seem disheartened that we haven't been able to solve self-driving in a couple of decades, and I'm of the opinion that geez, we basically have self-driving now and we started trying only a couple of decades ago.

How long after the invention of the transistor did we get personal computers? Maybe you just have unrealistic expectations of technological progress.


Level 5 was the goal and the expectation that everybody was aiming for. Waymo's views are easy to interpret from logically considering their actions. Level 4, especially as they are doing it, is in no way whatsoever a stepping stone to level 5. Yet they're spending tremendous resources directed towards things that would have literally and absolutely no place in level 5 autonomy. It seems logically inescapable to assume that not only do they think they'll be unable to hit level 5 in the foreseeable future, but also that nobody else will be able to either. If you can offer an alternative explanation or argument, please share!

Another piece of evidence also comes from last year when Google scaled back Waymo with layoffs as well as "pausing" its efforts at developing self driving truck technology. [1] That technology would require something closer to L5 autonomy, because again - massive preprocessing is quite brittle and doesn't scale well at all. Other companies that were heavily investing in self-driving tech have done similarly. For instance Uber sold off its entire self-driving division in 2021. I'm certainly happy to hear any sort of counter-argument, but you need some logic instead of ironically being the one trying to mindread me or Waymo!

[1] - https://www.theverge.com/2023/7/26/23809237/waymo-via-autono...


Not necessarily. If self-driving cars "aren't ready" and then you redefine what ready is, you've absolutely got your thumb on the scale of measuring progress.


Other way around: Self driving cars "are ready" but then people in this thread seemed to redefine what ready means.


Why do some people gloat about moving goalposts around?

15 years ago self driving of any sort was pure fantasy, yet here we are.

They'll release a version that can drive in poor weather and you'll complain that it can't drive in a tornado.


> "15 years ago self driving of any sort was pure fantasy, yet here we are."

This was 38 years ago: https://www.youtube.com/watch?v=ntIczNQKfjQ - "NavLab 1 (1986) : Carnegie Mellon : Robotics Institute History of Self-Driving Cars; NavLab or Navigation Laboratory was the first self-driving car with people riding on board. It was very slow, but for 1986 computing power, it was revolutionary. NavLab continued to lay the groundwork for Carnegie Mellon University's expertise in the field of autonomous vehicles."

This was 30+ years ago: https://www.youtube.com/watch?v=_HbVWm7wdmE - "Short video about Ernst Dickmanns VaMoR and VaMP projects - fully autonomous vehicles, which travelled thousands of miles autonomously on public roads in 1980s."

This was 29 years ago: https://www.youtube.com/watch?v=PAMVogK2TTk - "A South Korean professor [... Han Min-hong's] vehicle drove itself 300km (186 miles) all the way from Seoul to the southern port of Busan in 1995."

This was 19 years ago: https://www.youtube.com/watch?v=7a6GrKqOxeU - "DARPA Grand Challenge - 2005 Driverless Car Competition"


Stretching the timeline to 30 years doesn't make the achievement any less impressive.


It's okay! We'll just hook up 4o to the Waymo and get quippy messages like those in 4o's demo videos: "Oh, there's a tornado in front of you! Wow! Isn't nature exciting? Haha!"

As long as the Waymo can be fed with the details, we'll be good. ;)

Joking aside, I think there are some cases where moving the goalposts is the right approach: once the previous goalposts are hit, we should be pushing towards the new goalposts. Goalposts as advancement, not derision.

I suppose the intent of a message matters, but as people complain about "well it only does X now, it can't do Y" - probably true, but hey, let's get it to Y, then Z, then... who knows what. Challenge accepted, as the worn-out saying goes.


It's been 8 years and I still don't have my autonomous car.

Meanwhile I've been using ChatGPT at work for _more than a year_ and it's been tremendously helpful to me.

This is not hype, this is not about how AI will change our lives in the future. It's there right here, right now.


Of course. It's quite a handy tool. I love using it for searching documentation for some function that I know the behavior of, but not the name. And similarly, people have been using auto-steer, auto-park, and all these other little 'self driving adjacent' features for years as well. Those are also extremely handy. But the question is, what comes next?

The person I originally responded to stated, "We’re moving toward a world where every job will be modeled, and you’ll either be an AI owner, a model architect, an agent/hardware engineer, a technician, or just.. training data." And that's far less likely than us achieving L5 self-driving (if only because driving is quite simple relative to many of the jobs he envisions AI taking over), yet L5 self-driving seems as distant as ever as well.


> So long as LLMs regularly hallucinate, they're not going to be useful for much other than tasks that can accept relatively high rates of failure.

Yep. So basically they're useful for a vast, immense range of tasks today.

Some things they're not suited for. For example, I've been working on a system to extract certain financial "facts" across SEC filings. ChatGPT has not been helpful at all either with designing or implementing (except to give some broad, obvious hints about things like regular expressions), nor would it be useful if it was used for the actual automation.

But for many, many other tasks -- like design, architecture, brainstorming, marketing, sales, summarisation, step by step thinking through all sorts of processes, it's extremely valuable today. My list of ChatGPT sessions is so long already and I can't imagine life without it now. Going back to Google and random Quora/StackOverflow answers laced with adtech everywhere...


> I've been working on a system to extract certain financial "facts" across SEC filings. ChatGPT has not been helpful at all

The other day, I saw a demo from a startup (don't remember their name) that uses generative AI to perform financial analysis. The demo showed their AI-powered app basically performing a Google search for some companies, loosely interpreting those Google Stock Market Widgets that are presented in such searches, and then fetching recent news and summarizing them with AI, trying to extract some macro trends.

People were all hyped up about it, saying it will replace financial analysts in no time. From my point of view, that demo is orders of magnitude below the capacity of a single intern who receives the same task.

In short, I have the same perception as you. People are throwing generative AI into everything they can with high expectations, without doing any kind of basic homework to understand its strengths and weaknesses.


> So long as LLMs regularly hallucinate, they're not going to be useful for much other than tasks that can accept relatively high rates of failure.

But is this not what humans do, universally? We are certainly good at hiding it – and we are all good at coping with it – but my general sense when interacting with society is that there is a large amount of nonsense generated by humans that our systems must and do already have enormous flexibility for.

My sense is that's not an aspect of LLMs we should have any trouble with incorporating smoothly, just by adhering to the safety nets that we built in response to our own deficiencies.


The sigmoid is true in humans too. You can get 80% of the way to being sort of good at a thing in a couple of weeks, but then you hit the plateau. In a lot of fields, confidently knowing and applying this has made people local jack-of-all-trades experts... the person that often knows how to solve the problem. But Jack is no longer needed so much. ChatJack's got your back. Better to be the person who knows one thing in excruciating detail and depth, and never ever let anyone watch you work or train on your output.


I think it's more like an exponential curve where it looks flat moments before it shoots up.

Mapping the genome was that way. On a 20-year schedule, barely any progress for 15, and then poof, done ahead of schedule.


> or just.. training data.

I have a much less "utopian" view of the future. I remember during the renaissance of neural networks (ca. 2010-15) it was said that "more data leads to better models", and that was at a time when researchers frowned upon the term Artificial Intelligence and would rather use Machine Learning. Fast forward a decade, and LLMs are very good synthetic data generators that try to mimic human-generated input; I can't help thinking that this was the sole initial intent of LLMs. And that's it for me. There's not much to hype and no intelligence at all.

What happens now is that human-generated input becomes more valuable, and every online platform (including minor ones) will now put some form of gatekeeping in place, sooner rather than later. Besides that, a lot of work still can't be done in front of a computer in isolation and probably never will be, and even if it could, automation is not an end in itself. We still don't know how to measure a lot of things, much less how to capture everything as data vectors.


The two AIs talking to each other was like listening to two commercials talking to each other. Like a call-center menu that you cannot skip. And they _kept repeating themselves_. Ugh. If this is the future, I’m going to hide in a cave.


My new PC arrives tomorrow. Once I source myself two RTX 3060s I'll be an AI owner, no longer dependent on cloud APIs.

Currently the bottleneck is Agents. If you want a large language model to actually do anything you need an Agent. Agents so far need a human in the loop to keep them sane. Until that problem is solved most human jobs are still safe.


GPT-4o incorporated multimodality directly into the neural network, while cutting inference costs in half.

I fully expect GPT-5 (or at the latest 6) to similarly have native agentic capabilities, either this year or next, assuming it doesn't already and is just being kept from the public.


Going to put the economy in a very, very weird situation if true.

Will be like, the end of millions of careers overnight.

It will probably strongly favour places like China and Russia though, where the economy is already strongly reliant on central control.


> It will probably strongly favour places like China and Russia though, where the economy is already strongly reliant on central control.

I think you may be literally right in the opposite sense to what I think you intended.

China (and maybe Russia) may be able to use central control to have an advantage when it comes to avoiding disasterous outcomes.

But when it comes to the rate of innovation, the US may have an advantage for the usual reasons. Less government intervention (due to lobbying), combined with several corporations actively competing with each other to be first/best, usually leads to faster innovation. However, the downside may be that it also introduces a lot more risk.


Agentic capability just means it outputs a function call, which it has had for a long time.
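In that weak sense, the "agent" is mostly ordinary glue code around structured model output. A toy sketch of the loop, where fake_model and the tool names are hypothetical placeholders rather than any particular vendor's API:

    import json

    # Hypothetical stand-in for a model that has been prompted to emit a tool call as JSON.
    def fake_model(prompt: str) -> str:
        return json.dumps({"tool": "get_weather", "arguments": {"city": "Paris"}})

    # The "tools" the agent is allowed to call; a real system would hit APIs, databases, etc.
    TOOLS = {
        "get_weather": lambda city: f"Sunny in {city}",
    }

    def run_one_step(prompt: str) -> str:
        call = json.loads(fake_model(prompt))   # parse the structured output
        return TOOLS[call["tool"]](**call["arguments"])  # dispatch to the named tool

    print(run_one_step("What's the weather in Paris?"))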


That's a very weak form. The way I use "agentic" is that it is trained to optimize the success of an agent, not just predict the next token.

The obvious way to do that is for it to plan a set of actions and evaluate each possible way to reach some goal (or avoid an anti-goal). Kind of like what AlphaZero does for games. Q* is rumored to be a generalization of this.


You are far better off investing in one or more 3090s and loading up on DDR RAM.


> Agents so far need a human in the loop to keep them sane.

not quite sure that sanity is a business requirement


Yes, but to use a car dealership example, you don't want your Agent to sell a car to someone for $1 https://hothardware.com/news/car-dealerships-chatgpt-goes-aw...


> We’re moving toward a world where every job will be modeled, and you’ll either be an AI owner, a model architect, an agent/hardware engineer, a technician, or just.. training data.

I understand that you might be afraid. I believe that a world where LLM companies alone rule everything is not practically achievable except in some dystopian universe. The likelihood of a world where the only jobs are model architects, engineers or technicians is very, very small.

Instead, let's consider the positive possibilities that LLMs can bring. They can lead to new and exciting opportunities across various fields. For instance, they can serve as a tool to inspire new ideas for writers, artists, and musicians.

I think we are going towards a more collaborative era where computers and humans interact much more. Everything will be a remix :)


> The likelihood of a world where the only jobs are model architects, engineers or technicians is very, very small.

Oh, especially since it will be a priority to automate their jobs, or somehow optimize them with an algorithm because that's a self-reinforcing improvement scheme that would give you a huge edge.


Every corporate workplace is already thinking: How can I surveil and record everything an employee does as training data for their replacement in 3 years time.


> Can it do math yet?

GPT-4? Not that well. AI? Definitely

https://deepmind.google/discover/blog/alphageometry-an-olymp...


Until the hallucination problem is solved, the output can't be trusted.

So outside of use cases where the user can quickly verify the result (like picking a decent generated image, etc.), I can't see it being used much.


Never heard of retrieval-augmented generation?


RAG? Sure. I even implemented systems using it, and enabling it, myself.

And guess what: RAG doesn't prevent hallucination. It can reduce it, and there are most certainly areas where it is incredibly useful (I should know, because that's what earns my paycheck), but it's useful despite hallucinations still being a thing, not because we solved that problem.
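For readers who haven't seen it spelled out: RAG just means retrieving relevant documents and pasting them into the prompt before asking the model. A toy sketch, where generate() and the word-overlap score() are hypothetical stand-ins for an LLM call and a real embedding-based retriever:

    def generate(prompt: str) -> str:
        """Hypothetical stand-in for an LLM call; a real system would send the prompt to a model."""
        return "[model answer grounded in the retrieved context]"

    def score(doc: str, query: str) -> int:
        """Toy relevance score: count of shared lowercase words. Real RAG uses embeddings and vector search."""
        return len(set(doc.lower().split()) & set(query.lower().split()))

    documents = [
        "Horizon was accounting software developed and maintained by Fujitsu.",
        "GPT-4o is cheaper to run than GPT-4.",
    ]

    def answer(question: str, k: int = 1) -> str:
        top = sorted(documents, key=lambda d: score(d, question), reverse=True)[:k]
        context = "\n".join(top)
        prompt = f"Answer using only this context:\n{context}\n\nQuestion: {question}"
        return generate(prompt)

    print(answer("Who maintained Horizon?"))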


Are you implying that you’re the same person I was commenting to or are you just throwing your opinion into the mix?

Regardless, we’ve seen accuracy of ~98% with simple context-based prompting across every category of generation task. Don’t take my word for it, a simple search would show the effectiveness of “n-shot” prompting. Framing it as “it _can_ reduce” hallucinations is disingenuous at best, there really is no debate about how well it works. We can disagree on whether 98% accuracy is a solution but again I’d assert that for >50% of all possible real world uses for an LLM 98% is acceptable and thus the problem can be colloquially referred to as solved.

If you’re placing the bar at 100% hallucination-free accuracy then I’ve got some bad news to tell you about the accuracy of the floating point operations we run the world on


> Can it just do my entire job for me?

All AIs up to now lack autonomy. So I'd say that until we crack this problem, AI is not going to be able to do your job. Autonomy depends on a kind of data that is iterative and multi-turn, learned from environments rather than from static datasets. We have the exact opposite: lots of non-iterative, off-policy (human-made, AI-consumed) text.


This is still GPT-4. I don’t expect much more from this version than what the previous version could do, in terms of reasoning abilities.

But everyone is expecting them to release GPT-5 later this year, and it is a bit scary to think what it will be able to do.


It's quite different from GPT-4 in two respects:

1) It's natively multi-modal in a way I don't think GPT-4 was.

2) It's at least twice as efficient in terms of compute. Maybe 3 times more efficient, considering the increase in performance.

Combined, those point towards some major breakthroughs having gone into the model. If the quality of the output hasn't gone up THAT much, it's probably because the technological innovations mostly were leveraged (for this version) to reduce costs rather than capabilities.

My guess is that we should expect them to leverage the 2x-3x boost in efficiency in a model that is at least as large as GPT-4 relatively soon, probably this year, unless OpenAI has safety concerns or something and keeps it internal-only.


Branding aside, this pretty much is GPT 5.

The evidence for that is the change in the tokenizer. The only way to implement that is to re-train the entire base model from scratch. This implies that GPT 4o is not a fine-tuning of GPT 4. It's a new model, with a new tokenizer, new input and output token types, etc...

They could have called it GPT-5 and everyone would have believed them.


I’ve used it for a couple of hours to help with coding and it feels very similar to GPT-4: it still makes erroneous and inconsistent suggestions. Not calling it 4.5 was the right call. It is much faster though.

The expectations for GPT-5 are sky-high. I think we will see a jump similar to 3.5 -> 4.


Pretty sure they said they would not release GPT-5 on Monday. So it's something else still. And I don't see any sort of jump big enough to label it as 5.

I assume GPT-5 has to be a heavier, more expensive and slower model initially.

GPT-4o is like an optimisation of GPT-4.


That doesn't imply that it's GPT-5. A GPT-4 training run probably doesn't take them that long now that they've acquired so many GPUs for training GPT-5.


I think 4o is actually noticeably smarter than 4, after having tried it a tiny bit on the playground.


There has been speculation that this is the same mystery model that was floating around on the LMSYS chatbot arena, and they claim a real, observable jump in Elo scores, but this remains to be seen; some people don't think it's even as capable as GPT-4 Turbo, so TBD.


It's a completely new model trained from scratch that they've decided to label that way as part of their marketing strategy.

