During my first 2.5 years at OpenAI, I worked on the Robotics team on a moonshot idea: we wanted to teach a single, human-like robot hand to solve a Rubik’s cube. It was a tremendously exciting, challenging, and emotional experience. We solved the challenge with deep reinforcement learning (RL), crazy amounts of domain randomization, and no real-world training data. More importantly, we conquered it as a team.
From simulation and RL training to vision perception and hardware firmware, we collaborated so closely and cohesively. It was an amazing experiment, and during that time I often thought of Steve Jobs’ reality distortion field: when you believe in something so strongly and keep pushing it so persistently, somehow you can make the impossible possible.
At the beginning of 2021, I started leading the Applied AI Research team. Managing a team presents a different set of challenges and requires changes in my working style. I’m most proud of several projects related to language model safety within Applied AI:
- We designed and constructed a set of evaluation data and tasks to assess the tendency of pre-trained language models to generate hateful, sexual, or violent content.
- We created a detailed taxonomy and built a strong classifier to detect unwanted content and identify why the content is inappropriate.
- We are working on various techniques to make the model less likely to generate unsafe outputs.
As the Applied AI team puts cutting-edge AI techniques, such as large pre-trained language models, into practice, we see how powerful and useful they are for real-world tasks. We are also keenly aware of the importance of deploying these techniques safely, as emphasized in our Charter.