Master the computer vision skills behind advances in robotics and automation. Write programs to analyze images, implement feature extraction, and recognize objects using deep learning models.

Estimated time

3 Months

At 10-15 hrs/week

In collaboration with


Nvidia Deep Learning Institute

What You’ll Learn in the Become a Computer Vision Expert Nanodegree

Foundations of Computer Vision

Learn cutting-edge computer vision and deep learning techniques, from basic image processing to building and customizing convolutional neural networks. Apply these concepts to vision tasks such as automatic image captioning and object tracking, and build a robust portfolio of computer vision projects.

Work on a variety of computer vision and deep learning applications from basic image processing to automatic image captioning.

Prerequisite Knowledge

This program requires experience with Python, statistics, machine learning, and deep learning.

  • Intermediate to advanced Python experience. You are familiar with object-oriented programming. You can write nested for loops and can read and understand code written by others.
  • Intermediate statistics background. You are familiar with probability.
  • Intermediate knowledge of machine learning techniques. You can describe backpropagation, and have seen a few examples of neural network architecture (like a CNN for image classification).
  • You have seen or worked with a deep learning framework like TensorFlow, Keras, or PyTorch before.

Introduction to Computer Vision

Master computer vision and image processing essentials. Learn to extract important features from image data, and apply deep learning techniques to classification tasks.
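As an illustration of what extracting features from image data can look like, here is a minimal sketch of Sobel edge detection, one common hand-crafted feature extractor. It is not taken from the course materials. The function names (filter3x3, edge_magnitude) and the toy image are assumptions made for this example, and it uses plain Python lists rather than an image library.

```python
import math

# Sobel kernels approximate the horizontal (x) and vertical (y) intensity gradients.
SOBEL_X = [[-1, 0, 1],
           [-2, 0, 2],
           [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1],
           [0, 0, 0],
           [1, 2, 1]]


def filter3x3(image, kernel):
    """Slide a 3x3 kernel over a 2D list of pixel values.

    Only positions where the kernel fits entirely inside the image are
    computed, so the output is two rows and two columns smaller than the input.
    The kernel is applied without flipping (cross-correlation).
    """
    rows, cols = len(image), len(image[0])
    output = []
    for r in range(1, rows - 1):
        out_row = []
        for c in range(1, cols - 1):
            total = 0
            for dr in range(3):
                for dc in range(3):
                    total += kernel[dr][dc] * image[r + dr - 1][c + dc - 1]
            out_row.append(total)
        output.append(out_row)
    return output


def edge_magnitude(image):
    """Combine both gradient directions into a single edge-strength map."""
    gx = filter3x3(image, SOBEL_X)
    gy = filter3x3(image, SOBEL_Y)
    return [[math.sqrt(x * x + y * y) for x, y in zip(row_x, row_y)]
            for row_x, row_y in zip(gx, gy)]


# Hypothetical 4x5 grayscale image: dark on the left, bright on the right.
toy_image = [[0, 0, 0, 255, 255] for _ in range(4)]
edges = edge_magnitude(toy_image)
```

On this toy image the response is zero in the flat dark region and large where the window straddles the dark-to-bright boundary, which is the kind of signal an edge feature is meant to capture.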

Project – Facial Keypoint Detection

Use image processing techniques and deep learning to recognize faces and facial keypoints, such as the location of the eyes and mouth on a face.

Advanced Computer Vision and Deep Learning

Learn to apply deep learning architectures to computer vision tasks. Discover how to combine CNNs and RNNs to build an automatic image captioning application.

Project – Automatic Image Captioning

Combine CNN and RNN knowledge to build a network that automatically produces captions, given an input image.

Object Tracking and Localization

Learn how to locate an object and track it over time. These techniques are used in a variety of moving systems, such as self-driving car navigation and drone flight.
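To make the idea of tracking an object over time more concrete, here is a minimal sketch of a one-dimensional Kalman filter, a widely used tracking technique. It is offered as an illustration, not as the method the program teaches. The function names, the noise variances, and the sample measurements are all assumptions chosen for the example.

```python
def kalman_update(mean, var, measurement, meas_var):
    """Fuse the current position belief with a noisy measurement.

    Both are treated as 1D Gaussians; the result is their normalized product.
    """
    new_mean = (meas_var * mean + var * measurement) / (var + meas_var)
    new_var = 1.0 / (1.0 / var + 1.0 / meas_var)
    return new_mean, new_var


def kalman_predict(mean, var, motion, motion_var):
    """Shift the belief by the expected motion; uncertainty grows."""
    return mean + motion, var + motion_var


def track(measurements, motions, mean=0.0, var=1000.0,
          meas_var=4.0, motion_var=2.0):
    """Alternate measurement updates and motion predictions.

    The defaults are illustrative assumptions: a very uncertain initial
    belief (var=1000) and hand-picked sensor and motion noise.
    """
    for z, u in zip(measurements, motions):
        mean, var = kalman_update(mean, var, z, meas_var)
        mean, var = kalman_predict(mean, var, u, motion_var)
    return mean, var


# Hypothetical data: observed positions and the motion commanded after each one.
estimate, uncertainty = track(
    measurements=[5.0, 6.0, 7.0, 9.0, 10.0],
    motions=[1.0, 1.0, 2.0, 1.0, 1.0],
)
```

With these sample values the estimate settles near 11, the position expected after the final motion, and the variance drops from 1000 to roughly 4, showing how repeated measurements shrink the uncertainty that each motion step adds.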

Project – Landmark Detection & Tracking

Use sensor data to localize a robot and build a map of its environment with simultaneous localization and mapping (SLAM).

