Robots will soon learn from each other, Oxford researchers say

AI could enable drones to find and rescue people in remote areas with limited network connectivity

Richard Humphreys


Robots could soon learn from each other and work together to complete tasks such as rescuing people from hostile environments, researchers at Oxford University have said.

Humans currently program bots to act in a certain way and perform specific jobs, such as building cars in a factory, but the artificial intelligence (AI) experts from Oxford suggest that a group of bots can be taught to assess their surroundings and work out the best way to act on their own.

The research, which is partly supported by the Microsoft Research PhD Scholarship Program, was showcased on Microsoft’s stand at The AI Summit in London recently. The event, the largest conference and exhibition looking at the practical implications of AI for companies, featured a speech by Chris Bishop, laboratory director at Microsoft’s research lab in Cambridge.

Microsoft’s AI keeps learning as it interacts with more people, resulting in quicker and more accurate results in search engines. Bishop spoke about the importance of helping AI become smarter so it can assist people, while making sure humans are comfortable with that. “Enhancements to Bing, Cortana and HoloLens are all underpinned by AI,” he said, adding that “trust is at the heart of this”, when it comes to customer data. “You stay in control of your data.”

The idea of trusting AI so it can help humans was central to the research from the Oxford University team.

The group, from the university’s Whiteson Research Lab, in collaboration with PhD students from the engineering department, hopes its work on AI could make it easier for drones to find and rescue people in remote areas where network connectivity is limited. The technology could also be used in driverless cars, as the Oxford experts believe it would be safer and easier for the vehicles to “talk” to each other about the roads they are on and the traffic they encounter, rather than being controlled from a central traffic office.

Jakob Foerster was one of six researchers who wrote a paper on the findings, entitled Stabilising Experience Replay for Deep Multi-Agent Reinforcement Learning. He used Azure, Microsoft’s cloud platform, and the popular military strategy game StarCraft to show how AI bots could learn from their environment and work together to defeat a computer-controlled enemy.

“The bots have no knowledge at the beginning. They don’t know the difference between walking and attacking, and they have to learn what’s an enemy; it’s all based on experience,” Foerster said. “The bots have to learn and continuously adapt to each other’s strategies.”
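
The paper’s title points to a practical obstacle in this setup: agents typically learn from a replay buffer of past experience, but in a multi-agent game that experience goes stale as teammates change their strategies. Below is a minimal, illustrative replay buffer in Python; the class, field names and the five-agent setup are assumptions for illustration, not the authors’ code.

```python
# Illustrative sketch only: a per-agent experience replay buffer of the kind
# used in deep multi-agent reinforcement learning. All names and the
# five-agent setup are assumptions for illustration, not the authors' code.
import random
from collections import deque


class ReplayBuffer:
    """Stores past transitions so an agent can learn from random samples of
    its experience, rather than only from the most recent step."""

    def __init__(self, capacity=10_000):
        self.buffer = deque(maxlen=capacity)

    def add(self, obs, action, reward, next_obs, done, fingerprint):
        # The fingerprint (e.g. training iteration) marks how old the
        # experience is, because teammates' strategies keep changing.
        self.buffer.append((obs, action, reward, next_obs, done, fingerprint))

    def sample(self, batch_size):
        return random.sample(list(self.buffer), min(batch_size, len(self.buffer)))


# Each of the five agents keeps its own buffer and learns independently.
buffers = {agent_id: ReplayBuffer() for agent_id in range(5)}
```

The fingerprint field loosely reflects the paper’s idea of tagging stored experience with information about when it was collected, so an agent can tell how outdated an old battle is; the rest is generic reinforcement-learning plumbing.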

In a battle of 10 bots (five versus five), the AI ‘agents’, as they are known, lose badly at first. However, they slowly learn to work together and, by the time they have completed around 100 battles, they have a win rate of 90%.

“What was really interesting was we found that the team got more points when a human player was introduced, rather than just having a group of bots,” Foerster added. This was because the human players were learning from the bots, too.

 

“We saw that the human instinct was to attack the big enemies, but the bots were attacking the smaller ones that were causing more damage. The humans then realised what they were doing and started to help the bots. Those were the teams that got the most points.”

Drones have previously worked together as a team to complete tasks such as putting out fires or building bridges, but they have either required regular contact with air traffic control or sensors to monitor the machines’ position and correct flight patterns. The Oxford students’ research showed drones could collaborate without those levels of supervision.

Foerster also dismissed the idea that drones could teach each other destructive and rebellious behaviour, saying that the AI would be programmed with the desire to help humans.

“We train these AI agents to maximise rewards. It is our task as researchers to define reward functions that are well aligned with our human and societal objectives,” he said. “For example, an AI system learning to drive a car in a simulator should be rewarded for safe driving, both regarding the integrity of the car itself and other participants in traffic.”
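
As a concrete illustration of what Foerster means by a reward function, the short Python sketch below scores a single step of simulated driving: it rewards progress and penalises unsafe events. The signal names and weights are purely hypothetical assumptions, not taken from any real system.

```python
# Illustrative sketch only: a hypothetical reward function for an agent
# learning to drive in a simulator. The signals and weights are assumptions,
# not taken from any real system.
def driving_reward(progress_m, near_miss, off_road, collided):
    reward = 0.01 * progress_m   # small positive reward for making progress
    if near_miss:
        reward -= 1.0            # penalise getting too close to other traffic
    if off_road:
        reward -= 5.0            # penalise leaving the carriageway
    if collided:
        reward -= 100.0          # large penalty for any collision
    return reward
```

Balancing terms like these is exactly the alignment task Foerster describes: the weights encode which outcomes the designers consider acceptable.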

The research was seen by thousands of delegates at The AI Summit earlier this month. Jon Andrews, executive head of technology and investment at accounting firm PwC, opened the event by pointing out that, despite evidence that AI can help businesses, three-quarters of chief executives have no plans to introduce it, and 28% of those have looked at the technology and decided against it. The most popular reason for this lack of action was a poor understanding of AI.

More than 100 speakers attended the two-day event, including staff from the Met Office (which had worked with Microsoft to develop AI bots that can answer questions about the weather), Tesco, Accenture, Nvidia, Carphone Warehouse and the London Stock Exchange.

Antonio Criminisi, principal researcher at Microsoft’s Cambridge lab, gave a talk on InnerEye, a research project that uses state-of-the-art AI to build image analysis tools that enable doctors to treat cancer in a more targeted and effective way.

Microsoft has stepped up its AI efforts in recent months. The company formed the Microsoft AI and Research Group in September last year, bringing together its world-class research organisation with more than 5,000 computer scientists and engineers. The group is led by Harry Shum, a 20-year Microsoft veteran whose career has spanned leadership roles across Microsoft Research and Bing engineering.
