The computer that stunned humanity by beating the best mortal players at a strategy board game requiring "intuition" has become even smarter, its makers said Wednesday.
Even more startling, the updated version of AlphaGo is entirely self-taught -- a major step towards the rise of machines that achieve superhuman abilities "with no human input", they reported in the science journal Nature.
Not constrained by humans
Unlike its predecessors, which trained on data from thousands of human games before practising by playing against themselves, AlphaGo Zero did not learn from humans, or by playing against them, according to researchers at DeepMind, the British artificial intelligence company developing the system.
"All previous versions of AlphaGo... were told: 'Well, in this position the human expert played this particular move, and in this other position the human expert played here'," lead researcher David Silver said in a video explaining the advance.
AlphaGo Zero skipped this step.
Instead, it was programmed to respond to reward -- a positive point for a win versus a negative point for a loss.
Given just the rules of Go and no instructions, the system learnt the game, devised strategy and improved as it competed against itself -- starting with "completely random play" to figure out how the reward is earned.
This is a trial-and-error process known as "reinforcement learning".
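The idea can be illustrated on a far smaller game than Go. The sketch below is not DeepMind's method, which pairs a deep neural network with tree search; it is a minimal tabular example written for this article, assuming a toy stone-taking game in which players alternately remove one to three stones and whoever takes the last stone wins. All names and parameters (`train`, `best_move`, `epsilon`, `alpha`) are illustrative assumptions. The only signal the program receives is +1 for a win and -1 for a loss, and it begins with random play against itself.

```python
import random
from collections import defaultdict

MAX_TAKE = 3  # toy rule (assumption): remove 1, 2 or 3 stones per turn


def legal_moves(pile):
    """Moves allowed by the rules for the current pile size."""
    return list(range(1, min(MAX_TAKE, pile) + 1))


def choose_move(q, pile, epsilon, rng):
    """Mostly pick the best-known move, sometimes explore at random."""
    moves = legal_moves(pile)
    if rng.random() < epsilon:
        return rng.choice(moves)
    best = max(q[(pile, m)] for m in moves)
    # Ties are broken randomly, so untrained play is completely random.
    return rng.choice([m for m in moves if q[(pile, m)] == best])


def train(episodes=20000, max_pile=10, epsilon=0.2, alpha=0.1, seed=0):
    """Learn move values purely from self-play wins and losses."""
    rng = random.Random(seed)
    q = defaultdict(float)  # value estimate for each (pile, move) pair
    for _ in range(episodes):
        pile = rng.randint(1, max_pile)
        history = []  # moves made by the two sides, alternating
        while pile > 0:
            move = choose_move(q, pile, epsilon, rng)
            history.append((pile, move))
            pile -= move
        # Whoever made the final move took the last stone and won.
        reward = 1.0
        for state_action in reversed(history):
            q[state_action] += alpha * (reward - q[state_action])
            reward = -reward  # the previous move belonged to the other side
    return q


def best_move(q, pile):
    """Greedy move according to the learned values."""
    return max(legal_moves(pile), key=lambda m: q[(pile, m)])


if __name__ == "__main__":
    learned = train()
    for pile in (5, 6, 7):
        print(pile, "stones -> take", best_move(learned, pile))
```

No strategy is ever supplied, yet in this toy setting the trained program tends to settle on leaving its opponent a multiple of four stones, which is the known winning approach for this game.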
Unlike its predecessors, AlphaGo Zero "is no longer constrained by the limits of human knowledge," Silver and DeepMind CEO Demis Hassabis wrote in a blog post.
Amazingly, AlphaGo Zero used a single machine -- a human brain-mimicking "neural network" -- compared to the multiple-machine "brain" that beat South Korean Go champion Lee Se-Dol in 2016.
It had four data processing units compared to AlphaGo's 48, and played 4.9 million training games over three days compared to 30 million over several months.
Beginning of the end?
"People tend to assume that machine learning is all about big data and massive amounts of computation but actually what we saw with AlphaGo Zero is that algorithms matter much more," said Silver.
The findings suggested that AI systems based on reinforcement learning performed better than those relying on human expertise, Satinder Singh of the University of Michigan wrote in a commentary also carried by Nature.
"However, this is not the beginning of any end because AlphaGo Zero, like all other successful AI so far, is extremely limited in what it knows and in what it can do compared with humans and even other animals," he said.
AlphaGo Zero's ability to learn on its own "might appear creepily autonomous", added Anders Sandberg of the Future of Humanity Institute at Oxford University.
But there was an important difference, he told AFP, "between the general-purpose smarts humans have and the specialised smarts" of computer software.
"What DeepMind has demonstrated over the past years is that one can make software that can be turned into experts in different domains... but it does not become generally intelligent."
It was also worth noting that AlphaGo was not programming itself, said Sandberg.
"The clever insights making Zero better was due to humans, not any piece of software suggesting that this approach would be good. I would start to get worried when that happens." (AFP)