Jack Parker-Holder, DeepMind: On open-endedness, evolving agents and environments, online adaptation, and offline learning

December 06, 2022


Jack Parker-Holder recently joined DeepMind after his Ph.D. with Stephen Roberts at Oxford. Jack is interested in using reinforcement learning to train generally capable agents, especially via an open-ended learning process where environments can adapt to constantly challenge the agent's capabilities. Before doing his Ph.D., Jack worked for 7 years in finance at JP Morgan. In this episode, we chat about open-endedness, evolving agents and environments, online adaptation, offline learning with world models, and much more.

Below are some highlights from our conversation as well as links to the papers, people, and groups referenced in the episode.

Some highlights from our conversation

"I think every time we've given machine learning models more of a chance to learn things for themselves, they've seemed to have done better."

"If you start off with a simple problem and then you can expand the distribution over time to all possible problems, then there are literally definitions of AGI that say it's about the agent's abilities to achieve goals in a wide range of environments, so if you start off as simple, I think some open-ended process could eventually get something that resembles at least an increased form of generality in our artificial intelligence. And some people might call it AGI, but I think that's kind of a binary label, whereas I see it much more as a continuous thing."

"Diversity: that kind of relates already to finance because portfolio theory is all about diversity. […] We can't expect to make money in every situation because that's a little bit unrealistic. What we could do is make sure that if there's any conceivable scenario that we could sample from a generative world model, we don't completely blow up in that situation. It's similar to the curriculum-based adversarial kind of stuff."

"So firstly evolution, okay. Everyone says it, but it has been shown to work in a larger scale setting than any of our other methods. But secondly, it really is a completely different way of optimizing agents and discovering new things. And so I just don't think that we should ignore something that's quite different to what we're doing. […] We know that gradient descent works really well with our current neural networks. So of course, if we try and evolve them, they may not be as good. But there could be other networks that we couldn't learn with gradient descent, but we can evolve. And then secondly, there could be other compute paradigms where evolution works really well."

Referenced in this podcast

Thanks to Tessa Hall for editing the podcast.