PhD Course - 3D Computer Vision

Submitted by admin on Tue, 03/07/2023 - 15:59

Extracting accurate measurements from a three-dimensional scene is a fundamental task in many applications, ranging from cultural heritage and industry to smart-city digital twins and autonomous vehicles. For this reason, Geometrical Computer Vision is an active research field: on the one hand, the widely used classical model-based algorithms offer accurate and controllable solutions; on the other hand, more recent deep-learning-based approaches offer alternative ways to tackle 3D problems.

In this course, Computer Vision techniques that operate in three-dimensional environments will be presented and discussed. First, the most relevant concepts of projective geometry, camera models, and multi-view computer vision will be introduced, since they are essential for understanding the 3D applications discussed in the course. Then, 3D reconstruction algorithms will be presented, followed by an overview of visual odometry and of simultaneous localization and mapping (SLAM), covering both model-based and learning-based approaches. Throughout the course, the theoretical presentations will be accompanied by examples of ready-to-use practical solutions.

This course is part of the PhD Program in Information Engineering of the University of Florence (https://informationengineering.dinfo.unifi.it/).

Total time: 12 hours

Lecturer: Dr. Marco Fanfani, PhD – marco.fanfani@unifi.it

Course slides: https://www.disit.org/sites/default/files/2023-03/3DCV_SLIDE_V1.0_2023…

Outline of the course

  • Introduction to projective geometry:
    • Camera projection
    • Homogeneous coordinates
    • Camera matrix
    • Epipolar geometry (Essential matrix, Fundamental matrix)
    • Planar homographies
    • Absolute conic, image of the absolute conic (IAC), circular points
  • Camera calibration
    • Calibration from planar patterns
    • Calibration from 3D patterns
    • Calibration from vanishing points
    • Calibration from metadata
    • Self-calibration
  • 3D reconstruction
    • Dense stereo
    • Structure from Motion
    • Multi-view stereo
    • Structured light reconstruction
    • DEM (DTM + DSM) modeling
    • Shape from shading
    • Photometric stereo
  • Visual Odometry, SLAM, and Localization
    • Bayesian SLAM
    • Indirect and direct visual odometry
    • Loop closure
    • Feature-based localization
    • Map-based localization