Summary of Fall 2020 GRASP SFI: Oleh Rybkin - October 21st

00:00:00 - 01:00:00

In this Fall 2020 GRASP SFI seminar, Oleh Rybkin presents research that uses a combination of supervised and unsupervised methods to improve robotic manipulation. Rybkin goes into detail about a semi-supervised model that learns from both action-labeled robot data and action-free human video, using the robot data to ground actions for the human video. The results are demonstrated on real-world data, where the robot successfully performs tasks that are otherwise difficult for robots to achieve.

  • 00:00:00 Oleh Rybkin is a PhD student working with Professor Kostas Daniilidis on computer vision and reinforcement learning. He will be discussing his recent work on scalable visual model-based reinforcement learning. Rybkin is excited about building intelligent machines that can act in the real world, similar to the intelligent robots of science-fiction movies. However, current artificial intelligence methods are not good enough and are limited to specific situations. He thinks data-driven techniques like deep learning can help solve the problem.
  • 00:05:00 In this video, Oleh Rybkin discusses how model-based reinforcement learning can be used to generalize knowledge learned in one task to new tasks. He presents a project that uses self-supervised data collection to allow agents to learn generalizable skills without needing to be retrained from scratch for every task. Finally, he discusses how this project could be used to solve a new task, such as walking or running, without requiring any prior training.
  • 00:10:00 The video discusses the architecture of a predictive model used to plan exploratory actions: the model uses a probabilistic latent state model to generate imagined rollouts of future latent states, and learns self-supervised exploration behaviors for covering an environment effectively. It also discusses how intrinsic rewards motivate the agent to plan exploratory actions, and how the experience gathered this way can later be used to solve new tasks (see the disagreement-reward sketch after this list).
  • 00:15:00 In this talk, Oleh Rybkin discusses using the trained model to plan for a different reward function without any additional interaction with the world. This allows tasks to be solved in a zero-shot manner, without practice or prior task knowledge. Rybkin also discusses what kind of data collection is needed to make this possible.
  • 00:20:00 Oleh Rybkin discusses a proposal to leverage large datasets of online video in order to train action-conditioned models for planning with robotic agents. The model learns reasonable action representations even without action labels, and improves upon prior work in the low-data regime.
  • 00:25:00 Rybkin returns to the semi-supervised model that combines action-labeled robot data with action-free human video, describing how the grounding of actions is learned. The results are demonstrated on real-world data, where the robot performs manipulation tasks that would be difficult to learn from robot data alone.
  • 00:30:00 In this talk, Oleh Rybkin discusses variational inference, a method for learning a generative model of data by optimizing a loss with two terms: a reconstruction penalty and a KL "complexity" penalty. He shows how to balance these two losses and how β-VAEs reweight the complexity term. Though the β weight is often difficult to tune, learning the balance instead has advantages, such as being insensitive to the scale of the data and preserving a valid evidence lower bound.
  • 00:35:00 In this video, Oleh Rybkin discusses the evidence lower bound (ELBO), the objective used to train such models. The ELBO is derived from the log-likelihood of the data and balances the model's two loss terms. By learning the variance of the decoder, the model can calibrate its confidence to the data, which helps improve the quality and generalizability of the model (see the σ-VAE sketch after this list).
  • 00:40:00 Rybkin discusses some issues with current probabilistic predictive modeling techniques, including their difficulty scaling to long-horizon planning tasks. He then presents a new, scalable probabilistic modeling technique, the "optimal σ-VAE".
  • 00:45:00 In this video, Oleh Rybkin demonstrates the cross-entropy method, a sampling-based optimizer that draws random action sequences and refits its sampling distribution to the successful ones (see the CEM sketch after this list). He shows a sketch of a task and the trajectories that are sampled on the way to successful ones. He then shows how latent collocation (LatCo), a trajectory optimization method that works in the model's latent space, can improve on such shooting approaches (see the collocation sketch after this list). Finally, he compares shooting and latent collocation on a simple task and demonstrates how LatCo helps with longer-horizon planning in robotic settings.
  • 00:50:00 The speaker discusses how to do model-based planning in simulated environments, with a focus on overcoming exploration challenges and optimizing for reward. They compare a variety of methods, including a trained oracle agent and collocation-based planners.
  • 00:55:00 The video discusses ongoing work toward a learning algorithm that can generalize to new domains. The main problem is that the domains are not aligned, meaning that actions learned in one domain do not reach the goals in another. The direction is still worth exploring, however, because it could lead to a more generalizable algorithm.
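
The exploration idea from the 00:10:00 segment can be made concrete with a small sketch. Below is a minimal illustration of a disagreement-based intrinsic reward in the spirit of Plan2Explore-style self-supervised exploration; the network sizes, the intrinsic_reward and imagined_return helpers, and the use of a single ensemble member for rollouts are illustrative assumptions, not the talk's actual architecture.

```python
import torch
import torch.nn as nn

LATENT_DIM, ACTION_DIM, ENSEMBLE_SIZE = 32, 4, 5

# Ensemble of one-step latent dynamics models: (z_t, a_t) -> z_{t+1}.
ensemble = nn.ModuleList([
    nn.Sequential(
        nn.Linear(LATENT_DIM + ACTION_DIM, 128),
        nn.ELU(),
        nn.Linear(128, LATENT_DIM),
    )
    for _ in range(ENSEMBLE_SIZE)
])

def intrinsic_reward(z, a):
    """Reward exploration where the ensemble members disagree: high variance
    across predicted next latents marks parts of the environment the models
    have not learned yet."""
    za = torch.cat([z, a], dim=-1)
    preds = torch.stack([m(za) for m in ensemble])  # (E, B, LATENT_DIM)
    return preds.var(dim=0).mean(dim=-1)            # (B,)

def imagined_return(z0, actions):
    """Score a candidate action sequence entirely in latent space
    (using ensemble member 0 as the rollout dynamics for simplicity)."""
    z, total = z0, 0.0
    for a in actions:
        total = total + intrinsic_reward(z, a)
        z = ensemble[0](torch.cat([z, a], dim=-1))
    return total
```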
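
For the VAE discussion (00:30:00-00:40:00), here is a minimal sketch of the calibrated-decoder idea: learning, or analytically setting, the decoder variance balances the reconstruction and KL terms automatically, playing the role of β in a β-VAE. Setting σ² to the reconstruction MSE is the closed-form "optimal σ" choice; the function names and shapes are assumptions for illustration.

```python
import math
import torch

def gaussian_nll(x, x_hat, log_sigma):
    # Per-element negative log-likelihood of x under N(x_hat, sigma^2).
    return (0.5 * ((x - x_hat) ** 2) / torch.exp(2 * log_sigma)
            + log_sigma + 0.5 * math.log(2 * math.pi))

def optimal_sigma_elbo(x, x_hat, mu, logvar):
    """Negative ELBO with the analytically optimal decoder variance.

    Choosing sigma^2 = MSE maximizes the Gaussian likelihood in closed
    form, so the reconstruction/KL balance is tuned automatically instead
    of via a hand-picked beta weight."""
    mse = ((x - x_hat) ** 2).mean()
    log_sigma = 0.5 * torch.log(mse)
    rec = gaussian_nll(x, x_hat, log_sigma).sum()
    # KL(q(z|x) || N(0, I)) for a diagonal Gaussian encoder.
    kl = -0.5 * torch.sum(1 + logvar - mu ** 2 - logvar.exp())
    return rec + kl
```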
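
The shooting-based planning mentioned at 00:45:00 can be sketched as a minimal cross-entropy method (CEM): repeatedly sample action sequences, score them (e.g. by imagined reward under a learned model), and refit a Gaussian to the elites. The rollout_return scorer and the toy goal-reaching example below are assumptions for illustration.

```python
import numpy as np

def cem_plan(rollout_return, horizon, action_dim,
             n_samples=100, n_elites=10, n_iters=5):
    """Cross-entropy method over action sequences of shape (horizon, action_dim)."""
    mean = np.zeros((horizon, action_dim))
    std = np.ones((horizon, action_dim))
    for _ in range(n_iters):
        # Sample candidate action sequences around the current distribution.
        samples = mean + std * np.random.randn(n_samples, horizon, action_dim)
        returns = np.array([rollout_return(seq) for seq in samples])
        # Refit the sampling distribution to the best ("elite") sequences.
        elites = samples[np.argsort(returns)[-n_elites:]]
        mean, std = elites.mean(axis=0), elites.std(axis=0) + 1e-6
    return mean  # the planned action sequence

# Toy usage: reach the point (1, 1) by summing 2-D velocity actions.
goal = np.array([1.0, 1.0])
plan = cem_plan(lambda seq: -np.linalg.norm(seq.sum(axis=0) - goal),
                horizon=10, action_dim=2)
```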
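
By contrast, a collocation-style planner optimizes the latent states and actions jointly rather than rolling actions forward, which is the rough idea behind latent collocation (LatCo). The sketch below enforces the learned dynamics only as a soft penalty with a fixed weight; the actual method is more sophisticated (e.g. it adapts the penalty weight during optimization), and the dynamics and cost stand-ins here are assumptions.

```python
import torch

T, Z_DIM, A_DIM = 10, 8, 2
dynamics = torch.nn.Linear(Z_DIM + A_DIM, Z_DIM)   # stand-in for a learned latent model
goal = torch.randn(Z_DIM)                          # stand-in goal latent

z = torch.zeros(T, Z_DIM, requires_grad=True)      # decision variables: latent states...
a = torch.zeros(T - 1, A_DIM, requires_grad=True)  # ...and actions
opt = torch.optim.Adam([z, a], lr=0.01)

for step in range(200):
    opt.zero_grad()
    # How far each optimized state is from what the dynamics actually predict.
    pred_next = dynamics(torch.cat([z[:-1], a], dim=-1))
    dyn_violation = ((z[1:] - pred_next) ** 2).sum()
    # Task cost (distance of the final latent to the goal) plus a soft
    # dynamics penalty; a full version would also pin z[0] to the current
    # state and anneal the penalty weight upward.
    loss = ((z[-1] - goal) ** 2).sum() + 10.0 * dyn_violation
    loss.backward()
    opt.step()
```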

01:00:00 - 01:00:00

In this talk, Oleh Rybkin discusses how to provide a task to an intelligent robot via a known reward function. He argues that if tasks cannot be communicated in a form the robot understands, then learning is not the real bottleneck: figuring out what the robot is supposed to do becomes the harder problem.

  • 01:00:00 In this talk, Oleh Rybkin discusses how to provide a task to an intelligent robot via a known reward function. He argues that if tasks cannot be communicated in a form the robot understands, then learning is not the real bottleneck, because figuring out what the robot is supposed to do will always be harder. He also says that, ideally, tasks would be communicated in natural language, but that this remains difficult. Finally, he mentions that, for some applications, it is possible to specify a goal image for the robot to reach (see the sketch below).
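
Where a reward function is hard to write down, the goal-image idea mentioned above can be sketched as a reward measuring distance to the goal in a learned embedding space. The encoder stand-in and image shapes below are assumptions for illustration.

```python
import torch

# Stand-in for a learned image encoder (a real one would be convolutional).
encoder = torch.nn.Sequential(
    torch.nn.Flatten(),
    torch.nn.Linear(64 * 64 * 3, 32),
)

def goal_image_reward(obs, goal_img):
    """Reward is higher the closer the observation is to the goal image,
    measured in the encoder's embedding space. obs and goal_img are
    batched image tensors of shape (B, 3, 64, 64)."""
    with torch.no_grad():
        z, z_goal = encoder(obs), encoder(goal_img)
    return -torch.norm(z - z_goal, dim=-1)
```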
