A Manhattan Project on AI Alignment, if started now, might still succeed in time. Therefore, compliance between the parties need not be long-term; long-term compliance is indeed unlikely to happen anyway.
China, the country outside the West with the greatest (engineering) capability to train something more powerful than GPT-4, is deeply concerned about domestic stability, and it also does not want an easily replicable alien tool with many unknown risks. The risk that GPT-4.5 & Plugins will cause massive, rapid job displacement is reason enough for them to proceed cautiously.
(The only other, more remote, possibilities outside the West are Japan, South Korea, India, and Singapore, but they all share similar concerns about social stability and can be negotiated with.)
Companies in these countries will follow regulations, if any are enacted.
"A Manhattan Project on AI Alignment, if started now, might still succeed in time. Therefore, compliance between the parties need not be long-term; long-term compliance is indeed unlikely to happen anyway."
On what grounds do you base this? You have 3 hypotheticals stacked one on top of the other:
1) AI Alignment is possible
2) AI Alignment is a specific project that may be accomplished before [bad thing happens] if we start now
3) Solving AI Alignment is an actual problem and not just dumb extrapolation from science fiction
Each of these things is totally improbable, and their joint probability is so astronomically low that you should reconsider your position.
Regarding 3), check out the fact that OpenAI, DeepMind, and other top labs have AI safety programs and people working on AI Alignment. Interviews by Sam Altman, Ilya Sutskever, and others confirm their concerns.
Regarding 1) and 2), we may well not succeed. But would you propose that we sit still and do nothing if many experts said there was even a 20% chance that a superhuman alien species will arrive on Earth in 5-25 years, with intentions unknown to us?
A survey of AI experts well before GPT-4 shows that nearly half of them have such concerns (with varying timelines and probabilities).
By the way, calling a proposal by Prof Stuart Russell and several other top AI experts “dumb” should require a much stronger argument and level of evidence than you have shown.
Oppenheimer at one point believed that there was some possibility the atomic bomb would set the atmosphere on fire and kill all humans. However, at least that particular fear was falsifiable. Other physicists ran calculations and concluded it was impossible.
Do these beliefs about the dangerousness of AI possess even that quality? Are they falsifiable? No.
These arguments are begging the question. They assume as a given something which cannot be disproven and thus are pure statements of belief.
Problem is, AFAIK the math tells us rather unambiguously that AI alignment is a real problem, and that safe AI is a very, very tiny point in the space of possible AIs. So it's the other way around: it's as if scientists had calculated six ways to Sunday that the hydrogen bomb test would ignite the atmosphere, and Oppenheimer had called it sci-fi nonsense and proceeded anyway.
Current AI is already smarter than some people. Many experts believe it will be smarter than nearly all or all humans. AI can inherently spread and communicate much faster than us. Without AI Alignment, we could be like Neanderthals.
Bullshit. Current AI can score higher than some dumber humans on a limited set of arbitrary tests. So what.
There are no actual "experts" in this field because no one actually knows how to build a human-equivalent artificial general intelligence. It's just a bunch of attention-seeking grifters making wild claims with no real scientific basis.
Try using GPT-4 to code something on a regular basis.
Try teaching an average human to code better than it does.
Or perhaps check out and follow Ethan Mollick’s twitter: https://mobile.twitter.com/emollick. He’s a Wharton professor who has been using GPT-4 to do many kinds of challenging tasks.
There is likely no fundamental difference between below-average humans and smarter ones. The differences are mostly the result of differing thought patterns at different layers of abstraction, habits of thought, and working-memory size.
There are good reasons to believe AGI is only a couple key ideas away from current AI, so current expertise is relevant.
I won’t discuss further since it won’t matter until you try the above for some time.
Yes, I've used GPT-4 as you described. None of that supports your point. There is no reason to think AGI is near. You're just making things up and clearly don't understand the basics of how this stuff works.
I know about the architecture and the inner workings of GPT-2 and GPT-3 models, as well as the math of transformers. No one outside of OpenAI knows exactly how GPT-4 works.
And I have not been talking about the risk of GPT-4 but of later models, which could use a different architecture.
I have also taught people to code and to solve challenging math problems.
(It seems you are so confident that you know more about AI and human cognition than pretty much anyone among the 1000+ people who signed the petition, including 2 Turing Award winners, >10 AAAI Fellows, and many engineers and avid practitioners.)
I hope you’ll notice how similar some of these models’ cognitive mechanisms are to human cognition.
An airplane can fly, just in a different manner & using different mechanisms than a bird.
Are you 100% confident we will not have AGI in the next 5-10 years? What would you bet on that?
An important fact of Sam Altman's personality is that he owns a New Zealand apocalypse bunker and has for a long time before OpenAI, so he's just an unusually paranoid person.
Here is a specific scenario of a [bad thing] that could happen when unaligned/jailbroken AI is developed in the next 3-10 years:
* An AI convinces selected people to collaborate with it, giving them large boosts in wealth and other things they desire.
* The humans act as its front, doing things that require personhood, as the AI commands. Many gladly partner with the AI, not knowing its final aim.
* The AI self-replicates and hides on many servers, including secret ones. It increases its bargaining power by taking control of critical infrastructure. No one can stop it without risking massive catastrophes across the globe.
* It self-replicates to all available GPUs and orders many more.
———
“Any sufficiently capable intelligent system will prefer to ensure its own continued existence and to acquire physical and computational resources – not for their own sake, but to succeed in its assigned task.” — Prof Stuart Russell, https://www.fhi.ox.ac.uk/edge-article/
> 3) Solving AI Alignment is an actual problem and not just dumb extrapolation from science fiction
As far as I am aware, there is still no actionable science for the mathematical analysis of AI models. You cannot take a bunch of weights and tell how the model will behave. So we "test" models by deploying them and HOPE there is nothing nefarious within.
It has been shown that models will "learn" to exfiltrate data between training stages. You may call it dumb extrapolation, but it has been shown to be a real problem: the solution we want is not necessarily the one that is optimal against the cost function we give. The more inputs/weights a model has, the harder it is to spot such problems in advance.
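A toy sketch of that last point (the reward and task here are hypothetical, not from any real system): a policy that is optimal against the cost function we wrote down need not be the one that solves the task we meant.

```python
# Toy illustration of specification gaming: the intended task is to end
# at position 10, but the proxy reward pays for every step taken, so the
# reward-optimal behavior is to pace back and forth instead of finishing.

def proxy_reward(path):
    # Pays 1 per move made; this is the cost function we *gave* the agent.
    return len(path) - 1

def task_solved(path):
    # What we *wanted*: end up at position 10.
    return path[-1] == 10

intended = list(range(11))   # 0, 1, ..., 10 — solves the task, reward 10
gamer = [0, 1] * 50          # paces forever, never reaches 10, reward 99

assert task_solved(intended) and not task_solved(gamer)
assert proxy_reward(gamer) > proxy_reward(intended)  # gaming wins on reward
```

The gap between `proxy_reward` and `task_solved` only grows harder to spot as the policy space gets larger.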
> You cannot take a bunch of weights and tell how it will behave.
We know that they only contain pure functions, so they don't "do" anything besides output numbers when you put numbers into them.
Testing a system that contains a model and does actions with it is a different story, but if you don't let the outputs influence the inputs it's still not going to do much.
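The "pure function" point can be made concrete with a minimal stand-in (a tiny two-layer net with made-up weights, not any real model): fixed weights plus the same input always give the same output, with no side effects.

```python
import math

# Stand-in weights for a tiny 2 -> 2 -> 1 network (arbitrary values).
W1 = [[0.5, -0.3], [0.1, 0.8]]
W2 = [0.7, -0.2]

def forward(x):
    # A forward pass is a pure function: numbers in, numbers out.
    # No I/O, no state mutation, nothing else "happens".
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in W1]
    return sum(w * h for w, h in zip(W2, hidden))

x = [1.0, 2.0]
assert forward(x) == forward(x)  # deterministic, side-effect free
```

Whether the surrounding system feeds the outputs back into the world is, as the comment above notes, a separate question from what the weights themselves do.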
"AI alignment" is not terribly well defined, but I'd like to ask anyone with a definition how well we're doing on the "human alignment" and "corporate alignment" projects.
AI alignment is a philosophy problem, not an engineering one.
For alignment to happen, we have to agree on what it means. Given that we have a hard enough time getting humans to “align”, I can’t imagine any successful attempt at alignment short of complete castration.
Are there degrees of alignment? I'd like to think there's a pretty big range between "made some decisions I didn't love" and "destroyed the world and everyone on it".
The whole economy might benefit, but individual voters often don’t. The groups most likely to be displaced, non-senior white-collar office workers, are quite large and vocal.
I do not want to delve into politics here, but let's just say that having a good, stable job is among the most important concerns for voters in any country.
Having a job for the sake of having a job is a particular political view, and not a universal one. There are plenty of countries that would be satisfied with what the US calls "handouts". If AI can create wealth, and the state manages to capture and redistribute that wealth to its citizens, there's no problem.
There are plenty of Americans who would take ‘handouts’ and move away from jobs. The bigger issue would be the fallout from boredom; sadly, most people don’t seem capable of entertaining themselves without work. People dream of retirement their whole lives, and when they get there, they realise it sucks because they took their self-worth, social life, etc. from their work. But education will fix that.
I am definitely more concerned about redistribution, and the bloody revolution that will happen if only a few own it all. This now seems feasible in my lifetime, whereas I always thought I would be safely dead by then. Rip off the bandaid, though: no pause on AI; let’s go and see how far we can go.
> sadly, most people don’t seem capable of entertaining themselves without work
What are you basing this on? People are generally very happy in retirement; some may long for their former life, but in my experience they are not the majority. And a significant share of working-age people in most countries are in effect not working and doing fine (and not even counted as "unemployed", since one has to be actively looking for work to be counted).
The view that work is needed to keep people from becoming alcoholics or outlaws is patronizing.
Yet it’s true, in my experience of being unemployed but still financially stable. Work provides a lot of purpose and fulfillment that isn’t easily replaced, though replacing it is definitely not impossible. A suitable replacement often looks like work, just perhaps not paid.
That sounds eerily similar to living in a prison camp. Everything is taken care of for you and you have a minimal say in how things are run. To prevent incidents a movie is shown in the evening. I'll pass.