seawasp: (Default)
[personal profile] seawasp


I often have discussions with people about the current "AI" technologies. Some are just about "these things threaten my livelihood", while others are more focused on "is this stuff actually intelligent?" and "are we (finally) headed for the Singularity (or, at least, *a* Singularity)?"

The idea of a Singularity -- a point at which technological change and capability reach an accelerating curve that eventually goes beyond the ability of humans even to comprehend, let alone keep up with -- was, I think, first articulated in that form by Vernor Vinge, who was both a computer scientist and an SF author of great note. The idea is that if you have a technology that can assist in improving and building upon itself, or upon other technologies that feed back into it, you get a positive feedback loop: computer science helps improve our understanding of chemistry, biology, and physics; better understanding of chemistry, physics, and biology leads to improvements in physical computers; these in turn improve our understanding of the other sciences and of programming; and so on.

It's not a ridiculous idea in general concept. One could in fact argue that we've already entered a subtler Singularity than the one Vinge imagined (in which computers or uploaded people become near-godlike and pass beyond vanilla human understanding), since there is no human being on Earth capable of understanding in detail everything that large computers do.

It certainly SEEMS that current "AI" is on a ramp-up curve that, if it doesn't follow the typical "S" curve of most technologies, will go to a singularity. I suspect it will eventually level off... or crash. Because an awful lot of people think these things are becoming "smarter" when what they really are is *bigger* -- not actually smart at all.

The basic mechanism of all of these big-name AIs is the same: they are trained on massive amounts of data via what amounts to an absolutely titanic number of adjustments of the form "this example should give this output; if you aren't getting that output, change the weights on your many artificial neurons." The complexity comes from how many inputs and output definitions are involved, and from the various ways the outputs can be biased or adjusted.
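A toy sketch of that adjust-the-weights loop, in Python. This is a single artificial neuron learning by gradient descent on invented data; real systems do the same kind of nudging across billions of weights, but the mechanism is the same in kind.

```python
# One artificial neuron, trained by "if the output is wrong, nudge the
# weights" -- the loop described above, at the smallest possible scale.

def train(examples, steps=2000, lr=0.1):
    """examples: list of (inputs, target) pairs; returns learned weights."""
    n = len(examples[0][0])
    w = [0.0] * n          # weights start at zero
    b = 0.0                # bias term
    for _ in range(steps):
        for x, target in examples:
            out = sum(wi * xi for wi, xi in zip(w, x)) + b
            err = out - target          # how far off was this output?
            for i in range(n):          # nudge each weight against the error
                w[i] -= lr * err * x[i]
            b -= lr * err
    return w, b

# Invented data following the rule: output = 2*x0 + 1*x1
data = [([1, 0], 2), ([0, 1], 1), ([1, 1], 3), ([2, 1], 5)]
w, b = train(data)
```

After enough passes the weights settle near [2, 1] -- the neuron has "learned" the rule without ever being told it, which is all the training process ever does.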

In some areas this work has of course synergized. If you can train a neural net to reliably recognize specific features in images, for instance, you suddenly can have those features labeled as keywords for those images to be used in training or testing at a VASTLY greater rate than having human beings go through your stacks of ten million pictures looking for the ones that have, say, corgis in them. 
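The labeling pipeline that paragraph describes can be sketched like this. Everything here is a hypothetical stand-in: the detectors would really be trained neural nets, and the "images" are just dicts, but the shape of the automation is visible.

```python
# Sketch of the auto-labeling step: a recognizer replaces humans scanning
# millions of pictures. The detectors below are invented stand-ins for
# trained nets, and the "images" are toy dicts.

def auto_label(images, classifiers):
    """Tag each image with every keyword whose detector fires on it."""
    labeled = []
    for img in images:
        keywords = [kw for kw, detect in classifiers.items() if detect(img)]
        labeled.append((img["name"], keywords))
    return labeled

# Hypothetical stand-in detectors (a real system would run trained nets here)
classifiers = {
    "corgi": lambda img: "corgi" in img["contents"],
    "beach": lambda img: "sand" in img["contents"],
}

photos = [
    {"name": "img001", "contents": ["corgi", "sand"]},
    {"name": "img002", "contents": ["cat"]},
]
tags = auto_label(photos, classifiers)
```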

And as these systems have become larger and more complex, they have become much, much better at SOUNDING intelligent, and at producing less alien-looking images.

But the ultimate mechanism behind their operation is "predict what an appropriate response to this input would be".  
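That predict-the-next-thing mechanism, reduced to its crudest possible form: a bigram model that emits whatever continuation its tiny training corpus makes most likely. Real LLMs use neural networks over long token contexts, but the operation is the same in kind.

```python
# Predict-the-next-word, at toy scale: count which word follows which in
# the training text, then always emit the most likely continuation.

from collections import Counter, defaultdict

def build_model(corpus):
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(model, word):
    if word not in model:
        return None
    return model[word].most_common(1)[0][0]   # the likeliest continuation

corpus = ["the cat sat on the mat", "the cat ate the fish"]
model = build_model(corpus)
```

Ask it what follows "the" and it says "cat" -- not because it knows anything about cats, but because that is the statistically dominant pattern in what it was fed.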

This is not, in general, what humans consider intelligent behavior -- or, rather, it's only one relatively small piece of what we would call intelligent behavior. If you're an important person that I want to impress, I may well be trying to predict what response you would want to any question. But in general we aren't doing that; we are attempting to use words or images to express OUR thoughts -- to get across our understanding of the world and our intentions about it. 

The fundamental nature of these things hasn't really changed. They've gotten *monstrously* larger and similarly monstrously faster, since both computing and storage capacity have grown since the 1990s by roughly a factor of 10^10 -- ten billion times.

And that's where the problem comes in: they really are very, very good predicting machines now, and as their training gets refined, they're much better at customizing those predictions to individual queries, and -- if they're allowed feedback directly from their human querents as to how well they're doing -- even to individual people's preferences and expectations. 

And humans are predisposed to treat anything that communicates coherently as being "like us". We even generalize this to "anything that acts alive is like us" -- talking to our cars, investing a personality into a stuffed animal, etc. With modern AIs, this tendency is EASILY exploited, both deliberately for commercial gain and completely unintentionally.

ELIZA fooled a considerable number of people back in the day into thinking they were talking to another person, and all it had was a short, compact set of pattern-matching rules with a very small number of possible responses.
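Something in the spirit of ELIZA's rules, to show just how little machinery was behind the illusion. The patterns below are invented for illustration, not Weizenbaum's originals.

```python
# An ELIZA-flavored responder: a handful of pattern -> template rules
# and a fallback. Nothing is understood; text is matched and echoed.

import re

RULES = [
    (r"\bi am (.*)", "How long have you been {0}?"),
    (r"\bi feel (.*)", "Why do you feel {0}?"),
    (r"\bmy (\w+)", "Tell me more about your {0}."),
]
DEFAULT = "Please go on."

def respond(text):
    text = text.lower()
    for pattern, template in RULES:
        m = re.search(pattern, text)
        if m:
            return template.format(*m.groups())
    return DEFAULT
```

Type "I am sad" and it asks "How long have you been sad?" -- a reflection of your own words, which is exactly why it felt like listening.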

Current AIs are many billions, perhaps even a trillion, times more complex than ELIZA was ever dreamed of being. And they are trained on IMMENSE numbers of textual inputs. Such machines are, by their nature, marvelous at finding and copying or extending patterns. And there are, naturally, patterns of behavior in speech and writing. This makes their outputs ever-more believable and convincing as conscious responses -- even though there (likely) isn't any consciousness behind them. 

I say "likely" because there IS a school of thought holding that what we call intelligence and self-awareness is an emergent property of any computational system that reaches some level of complexity, and if one adheres to that belief, it is certainly possible that some of the modern AIs long since passed that threshold (with "long since" maybe meaning a year or two, since we ARE on the exponential part of this technology's curve).

However, so far there isn't any evidence of direct conscious thought or understanding, only of very good pattern matching for responses. 

"But what about those AIs that have indicated they will lie and even maybe kill to stay turned on? Even when explicitly instructed otherwise?"

Ah. That's where the vast amount of training comes back to bite researchers on the ass. There's only so much weight you can put on a single instruction (such as "do not present solutions that kill people"), and it simply CANNOT outcompete the weighting from uncounted millions of presentations of a pattern. And if you go through literature, history, and similar texts -- all of which have been fed to these machines uncounted numbers of times -- there is an exceedingly strong trend: anything presented with a threat to its existence acts to protect itself. It will lie or fight or even kill, in most cases, to survive. So the "likely response" to "something is going to shut me off" is, inarguably, "take action -- any action -- to prevent that," even when they're instructed not to. Literature, history, and people's everyday conversations show clearly that even powerful rules (the Commandment of Thou Shalt Not Kill, the law of the land, ethical arguments) are generally set aside when it comes to personal survival (or the survival of some other individual or group that is considered primary).
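The weighting argument above, as back-of-envelope arithmetic. Every number here is invented for illustration: the point is only that millions of tiny nudges from one recurring pattern can swamp one heavily weighted instruction.

```python
# Invented numbers: each training occurrence of a pattern contributes a
# tiny nudge toward reproducing it; an explicit instruction contributes
# one (even very strong) counter-nudge.

occurrences = 5_000_000      # "threatened agent acts to survive" in training text
nudge = 0.001                # weight contribution per occurrence
instruction_weight = 1000.0  # one heavily weighted "do not do this" rule

survival_pull = occurrences * nudge      # cumulative pattern weight
rule_pull = instruction_weight           # the single instruction

chooses_survival = survival_pull > rule_pull
```

With these (made-up) figures the pattern wins by a factor of five -- and making the instruction stronger only moves the threshold, it doesn't remove the competition.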

We've trained the machines on our language, but also on all the patterns IN the language. The machines do not UNDERSTAND them, but they are equipped to detect them, recognize their interactions, and apply them in ways that fit their inputs. 

Unfortunately, that's something like a quintuple-edged sword. Humans are increasingly reacting to these AIs as if they were people in one sense or another. Humans ALSO tend to accept computer answers as more likely to be true (there are lots of studies on this), and because of the complexity of the world, humans do not, and CANNOT, fact-check everything they see, so they default to trusting their chosen sources.

But if that source is DESIGNED TO MEET EXPECTATIONS, it is, inherently, designed to LIE whenever the input's expectations present a high enough priority to override any direct safety instructions. 

So I don't think we're headed for a USEFUL Singularity, but we may be headed for a FALSE Singularity -- one in which the Emperor of AI generates such convincing descriptions of his new clothes that no one ever bothers to look up and check. And once that happens, the world could end up in a catastrophic collapse as the machines keep telling us what our language says we want to hear, even as everything goes straight to hell. 

Page generated Apr. 25th, 2026 06:24 am
Powered by Dreamwidth Studios