Research Scientist Intern, Vision-Language And Embodied AI (PhD)
Summary
Redmond, United States
Internship
2+ years
About this Job
Reality Labs Research is seeking a Research Scientist Intern to help develop the next generation of assistance systems that guide users in contextual and adaptive environments. We welcome candidates with expertise in embodied AI, reinforcement learning, planning, multimodal learning, vision-language models, LLM interpretability, world model learning, and pose estimation (including hand and object pose). Our internships are twelve (12) to twenty-four (24) weeks long, with various start dates throughout the year.
Responsibilities
Plan and execute cutting-edge research on embodied AI algorithms, assistance policies, vision-language models, and world model learning for complex, real-world interaction tasks.
Develop, implement, and evaluate methods for improving the performance and interpretability of VLMs and related AI/ML models.
Leverage state-of-the-art simulators, RL/DRL, neuro-symbolic, AI planning, robotics, stochastic programming, and multimodal learning methods.
Write modular, reusable research code and utilize Meta’s large infrastructure to scale experimentation.
Collaborate cross-functionally with researchers and engineers to prototype and test models at scale.
Deliver clear, compelling, and creative solutions to challenging problems.
Work should result in publishable research in top-tier journals or conferences (e.g., NeurIPS, ICLR, CVPR, ECCV, ICML, ICCV, AAAI, IJCAI, ICRA, IEEE T-PAMI, IJCV, IEEE RA-L etc.).
Qualifications
Currently has, or is in the process of obtaining, a PhD in Machine Learning, Artificial Intelligence, Computer Vision, Robotics, Speech Processing, Applied Statistics, Computational Neuroscience, Algorithms, Computational Mathematics, or a related field
Proven research skills: problem definition, solution exploration, analysis, and presentation of results
2+ years of experience in Python and machine learning libraries (NumPy, scikit-learn, SciPy, pandas, Matplotlib, TensorFlow, PyTorch)
Understanding of at least one of the following: embodied AI, reinforcement learning, planning, transfer/few-shot/zero-shot/continual/online learning, self-supervised learning, multi/cross-modal learning, vision-language models, LLM interpretability, world model learning, hand pose estimation, or object pose estimation
Must obtain work authorization in country of employment at the time of hire, and maintain ongoing work authorization during employment
Proven track record of significant results: grants, fellowships, patents, and first-authored publications at leading workshops or conferences (e.g., NeurIPS, ICLR, CVPR, ECCV, ICML, ICCV, AAAI, IJCAI, ICRA, IEEE T-PAMI, IJCV, IEEE RA-L etc.)
Experience with VLM/LLM training/fine-tuning and solving traditional CV problems (e.g., hand/body pose estimation, object pose estimation, image classification/segmentation, image/video understanding, 3D scene reconstruction)
Experience working and communicating cross-functionally in a team environment
Intent to return to the degree program after the completion of the internship/co-op
Availability for an internship of at least 16 consecutive weeks
About the Company
