
Camera and stereo system parameters

Camera modeling, or simply the system parameters, can be divided into intrinsic parameters, which describe the internal characteristics of the device, and extrinsic parameters, which describe the camera's position and orientation relative to elements external to it.


As illustrated in Figure 1, it is the extrinsic and intrinsic parameters that allow conversion between the different coordinate systems involved: (1) world coordinates, (2) camera coordinates, and (3) image coordinates. Geometric transformations in computer vision are usually expressed in homogeneous coordinates, briefly summarized as follows.

Figure 1: Illustration of the world, camera, and image coordinate systems and the corresponding set of parameters responsible for coordinate transformation between these systems. Source: author [1].


Homogeneous Coordinate System

Given a point A(𝑥, 𝑦) in Cartesian space, its corresponding representation in homogeneous coordinates is extended by one dimension: Ã(𝑥, 𝑦, 𝑤), where 𝑤 is called the weight and the tilde (~) indicates that the point is represented in homogeneous coordinates. The conversion of a point Ã(𝑥, 𝑦, 𝑤) from the homogeneous coordinate system back to Cartesian coordinates is done by dividing the other elements by the weight 𝑤 and then dropping one dimension. In this case, it becomes A(𝑥/𝑤, 𝑦/𝑤).


It is important to note that infinitely many homogeneous representations correspond to a single point in Cartesian space. This can be observed through the point Ã(𝑘𝑥, 𝑘𝑦, 𝑘𝑤), which represents the same Cartesian point A(𝑥/𝑤, 𝑦/𝑤) for any non-zero real value of 𝑘.


The advantages of using this system are: (1) coordinate transformations can be computed with matrix operations and linear equations, and (2) it allows the representation of points at infinity, i.e., when 𝑤 = 0.
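To make the conversion concrete, below is a minimal NumPy sketch of the round trip between Cartesian and homogeneous coordinates; the function names are illustrative, not from any particular library.

```python
import numpy as np

def to_homogeneous(point):
    # Extend a Cartesian point by one dimension, appending the weight w = 1.
    return np.append(np.asarray(point, dtype=float), 1.0)

def to_cartesian(point_h):
    # Divide the other elements by the weight w and drop one dimension.
    w = point_h[-1]
    if w == 0:
        raise ValueError("w = 0 represents a point at infinity")
    return point_h[:-1] / w

A = (3.0, 4.0)
A_h = to_homogeneous(A)                  # array([3., 4., 1.])
# Any non-zero multiple k * A_h represents the same Cartesian point:
assert np.allclose(to_cartesian(5.0 * A_h), A)
```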


Intrinsic Parameters

The intrinsic parameters allow converting from the coordinate system (or frame) of the camera (3D) to the coordinate system of the image plane (2D) and vice versa (Figure 1).


These parameters are illustrated in Figure 2 using a primitive model of a photographic camera - the pinhole camera. For convenience, this model is usually represented with the image plane between the camera's center of projection 𝑂𝑐 (where the light rays converge) and the scene (illustrated by P). Another convenience is placing the coordinate center of the image plane, 𝑜, at the orthogonal projection of 𝑂𝑐 onto the plane.


Figure 2: Illustration of the pinhole camera model, described with the image plane positioned between the projection center and the scene, preserving the orientation of the environment. Adapted: [2].


Thus, in homogeneous coordinates, a point 𝑃(𝑥, 𝑦, 𝑧, 1) in the camera frame can be related to 𝑝(𝑢, 𝑣, 1) in the image frame by Equation 1, where 𝑓 represents the focal length of the camera and 𝜆 the scale factor that allows the point 𝑝 to slide along the entire line 𝑂𝑐𝑃.

Equation 1:

$$\lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$


However, the complete representation of the intrinsic parameters must consider additional factors, illustrated in Figure 3: the physical dimensions of the pixels, indicated by ℎ𝑥 and ℎ𝑦 and usually expressed in micrometers (𝜇𝑚); the origin of the image coordinate system, located in the upper left corner; the principal point 𝑜(𝑥𝑜, 𝑦𝑜), formed by the orthogonal projection of the camera origin 𝑂𝑐 onto Π, the 𝑥𝑦 plane of the two-dimensional image system; and the focal length 𝑓, that is, the distance between 𝑂𝑐 and the principal point 𝑜.


In Cartesian coordinates, the equations below give the transformation from the camera frame to a point 𝑝(𝑢, 𝑣) of the image frame in the Π plane:

$$u = \frac{f}{h_x} \frac{x}{z} + x_o, \qquad v = \frac{f}{h_y} \frac{y}{z} + y_o$$

Then, it is possible to rewrite Equation 1 with the complete set of intrinsic parameters, grouped into the matrix 𝐾:

$$\lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \underbrace{\begin{bmatrix} f/h_x & 0 & x_o & 0 \\ 0 & f/h_y & y_o & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}}_{K} \begin{bmatrix} x \\ y \\ z \\ 1 \end{bmatrix}$$
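As a sanity check of this projection, the sketch below applies the matrix 𝐾 to a point expressed in the camera frame; the focal length, pixel sizes, and principal point are illustrative placeholder values, not taken from the text.

```python
import numpy as np

# Illustrative intrinsic values: f in mm, pixel sizes in mm/pixel,
# principal point in pixels (all placeholders).
f, hx, hy = 4.0, 0.002, 0.002
xo, yo = 320.0, 240.0

# 3x4 intrinsic matrix K as in the rewritten Equation 1.
K = np.array([[f / hx, 0.0,    xo,  0.0],
              [0.0,    f / hy, yo,  0.0],
              [0.0,    0.0,    1.0, 0.0]])

P_cam = np.array([0.1, -0.05, 2.0, 1.0])  # homogeneous point in the camera frame
p = K @ P_cam                             # lambda * (u, v, 1)
u, v = p[:2] / p[2]                       # divide by the scale factor lambda
```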

Extrinsic Parameters

Given that the extrinsic parameters are related to elements external to the camera, their interpretation depends on the camera system being used. In essence, they encompass all the geometric parameters that enable the transformation between the world coordinate system and the camera coordinate system, and vice versa (Figure 1).


Figure 3 shows the world frame 𝑂𝑊 and the camera frame 𝑂𝐶. The vector 𝑇 describes the translation between the coordinate centers of the 𝑂𝑊 and 𝑂𝐶 systems, and the matrix 𝑅 represents the rotation between the corresponding axes of the two systems. Together, 𝑇 and 𝑅 constitute the extrinsic parameters, represented by the matrix 𝑀 in Equation 2.

Figure 3: Representation of the intrinsic and extrinsic parameters of a camera. Adapted: [3].


Equation 2:

$$M = \begin{bmatrix} R & T \\ \mathbf{0}^\top & 1 \end{bmatrix}$$


The relationship that ensures the transformation from world coordinates to image coordinates depends on both the intrinsic and extrinsic parameter matrices, as shown in the following equation:

$$\lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = \begin{bmatrix} f/h_x & 0 & x_o & 0 \\ 0 & f/h_y & y_o & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} R & T \\ \mathbf{0}^\top & 1 \end{bmatrix} \tilde{P}_W$$

In short, the relationship between a point 𝑝̃ in image coordinates and its respective world coordinate 𝑃𝑊 can also be written compactly as the equation below, which combines the intrinsic parameter matrix 𝐾 and the extrinsic matrix 𝑀, where 𝜆 is the scale factor:

$$\lambda \, \tilde{p} = K \, M \, \tilde{P}_W$$
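The sketch below composes the two matrices exactly as in this equation; the rotation (30° about the 𝑧-axis), translation, and intrinsic values are illustrative placeholders.

```python
import numpy as np

# Intrinsic matrix K with the same illustrative values as the previous sketch.
f, hx, hy, xo, yo = 4.0, 0.002, 0.002, 320.0, 240.0
K = np.array([[f / hx, 0.0,    xo,  0.0],
              [0.0,    f / hy, yo,  0.0],
              [0.0,    0.0,    1.0, 0.0]])

# Illustrative extrinsics: rotation of 30 degrees about the z-axis plus a translation.
theta = np.deg2rad(30.0)
R = np.array([[np.cos(theta), -np.sin(theta), 0.0],
              [np.sin(theta),  np.cos(theta), 0.0],
              [0.0,            0.0,           1.0]])
T = np.array([0.5, -0.2, 1.0])

# 4x4 extrinsic matrix M as in Equation 2.
M = np.eye(4)
M[:3, :3] = R
M[:3, 3] = T

# Project a world point: lambda * p = K @ M @ P_W.
P_w = np.array([1.0, 2.0, 5.0, 1.0])  # homogeneous world coordinates
p = K @ M @ P_w
u, v = p[:2] / p[2]                   # divide by the scale factor lambda
```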

Distortion Parameters

In an ideal scenario, straight lines in the world are represented as straight lines in the image. However, due to the curvature of the lens, the resulting images suffer some form of distortion. There are two main types of lens distortion: radial and tangential. The first is caused by the shape of the lens itself, as illustrated in Figure 4, while the second results from the lens not being parallel to the image plane, as shown in Figure 5. Complete camera modeling must therefore also estimate these distortion values, so that distorted image coordinates can be transformed into ideal coordinates, removing the distortion.

Figure 4: Types of radial distortion: (a) negative and (b) positive; (c) shows an ideal lens without distortion. Source: [4].


Figure 5: Tangential distortion, characterized by non-parallelism between the image plane and the converging lens. Source: author.


Determining the distortion parameters allows us to quantify how much a specific image point has been affected by distortion, enabling the reverse mapping that recovers the undistorted (desired) image. Equations 3 to 6 illustrate a possible relationship between the desired image coordinates (𝑢, 𝑣) and the distorted image coordinates (𝑢𝑑, 𝑣𝑑) for radial and tangential distortion. The elements 𝑟𝑛 (𝑛 = 1, 2, 3) denote the radial distortion coefficients, and 𝑡𝑛 (𝑛 = 1, 2) the tangential distortion coefficients. The variable 𝑑 is the radial distance in pixels between the point (𝑢, 𝑣) and the principal point 𝑜(𝑥𝑜, 𝑦𝑜), i.e., 𝑑² = (𝑢 − 𝑥𝑜)² + (𝑣 − 𝑦𝑜)². Writing 𝑢̄ = 𝑢 − 𝑥𝑜 and 𝑣̄ = 𝑣 − 𝑦𝑜, the radial terms are

$$u_d = u + \bar{u}\,(r_1 d^2 + r_2 d^4 + r_3 d^6) \qquad \text{(Equation 3)}$$
$$v_d = v + \bar{v}\,(r_1 d^2 + r_2 d^4 + r_3 d^6) \qquad \text{(Equation 4)}$$

and the tangential terms are

$$u_d = u + 2 t_1 \bar{u}\bar{v} + t_2\,(d^2 + 2\bar{u}^2) \qquad \text{(Equation 5)}$$
$$v_d = v + t_1\,(d^2 + 2\bar{v}^2) + 2 t_2 \bar{u}\bar{v} \qquad \text{(Equation 6)}$$

It can be observed from Equations 3 and 4 that as 𝑑 increases, the distortion grows nonlinearly.
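A direct transcription of these terms into code is sketched below, combining the radial and tangential contributions; the coefficient values are placeholders (in pixel units they are very small), and note that libraries such as OpenCV apply an equivalent model in normalized rather than pixel coordinates.

```python
import numpy as np

def distort(u, v, xo, yo, r, t):
    # Map ideal pixel coordinates (u, v) to distorted coordinates (ud, vd)
    # using the radial (r1..r3) and tangential (t1, t2) terms of Equations 3-6.
    ub, vb = u - xo, v - yo            # coordinates centered on the principal point
    d2 = ub * ub + vb * vb             # squared radial distance d^2
    radial = r[0] * d2 + r[1] * d2**2 + r[2] * d2**3
    ud = u + ub * radial + 2 * t[0] * ub * vb + t[1] * (d2 + 2 * ub * ub)
    vd = v + vb * radial + t[0] * (d2 + 2 * vb * vb) + 2 * t[1] * ub * vb
    return ud, vd

# Placeholder coefficients for illustration only.
ud, vd = distort(420.0, 215.0, 320.0, 240.0,
                 r=(1e-8, 0.0, 0.0), t=(0.0, 0.0))
```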

The distortion elements are part of the set of intrinsic parameters of the camera. Their values, along with the extrinsic parameters, are calculated during the process known as camera calibration.


Camera Calibration

The camera calibration process is performed to calculate the intrinsic and extrinsic parameters of the camera. First, images of a known pattern are obtained. An image processing algorithm can easily recognize this pattern, and by establishing correspondences between the 3D points in the scene and their 2D projections on the image plane, the parameters can be estimated.


One widely used method is calibration with a chessboard pattern [5]. The method detects the corners in multiple views and estimates the parameters that minimize the error between the world coordinates and their respective projections onto the image plane. The error is minimized with the Levenberg-Marquardt algorithm [6], which relies on an initial parameter estimate. Figure 6 illustrates three image pairs of the chessboard seen from different perspectives.
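A condensed sketch of this procedure using OpenCV's calibration API is shown below; the board size and image path are placeholders, and real code would also refine the detected corners (e.g., with cv2.cornerSubPix) before calibrating.

```python
import glob
import cv2
import numpy as np

board = (9, 6)  # inner corners per row and column (placeholder)

# World coordinates of the corners, with z = 0 on the board plane
# and the square size as the unit of length.
objp = np.zeros((board[0] * board[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:board[0], 0:board[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("calib/*.png"):  # placeholder path
    gray = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    found, corners = cv2.findChessboardCorners(gray, board)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# Returns the RMS reprojection error, the intrinsic matrix K, the
# distortion coefficients, and per-view extrinsics (rotation, translation).
rms, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```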

Figure 6: Camera calibration process from a planar pattern, in this case the chessboard. Source: author.


Calibration using a planar (2D) pattern is widely employed, often with a chessboard. However, there are also methods utilizing 1D patterns [7] or even dimensionless patterns, such as those using infrared sensors [8]. In cases with limited control over the imaging setup (e.g., with a single image of the scene), calibration can be performed using deep learning-based methods [9].


Numerous techniques have been proposed and are available in open-source libraries like OpenCV; a collection of references with further details can be accessed in [10]. A more in-depth treatment of this topic, including the techniques and the mathematics involved, can be found in [2].


References

[1] MELO, Mirella Santos Pessoa de. Navigable region mapping from a SLAM system and image segmentation. Master's Dissertation. Universidade Federal de Pernambuco, 2021.

[2] HARTLEY, R.; ZISSERMAN, A. Multiple View Geometry in Computer Vision. 2nd ed. Cambridge University Press, 2004.

[3] CYGANEK, B.; SIEBERT, J. P. An Introduction to 3D Computer Vision Techniques and Algorithms. John Wiley & Sons, 2011.

[4] MATHWORKS. What Is Camera Calibration? 2016. <https://www.mathworks.com/help/vision/ug/camera-calibration.html>. [Online; accessed 20-September-2021].

[5] ZHANG, Z. A flexible new technique for camera calibration. IEEE Transactions on Pattern Analysis and Machine Intelligence, v. 22, no. 11, p. 1330–1334, 2000.

[6] MORÉ, J. J. The Levenberg-Marquardt algorithm: implementation and theory. In: Numerical Analysis. Springer, 1978. p. 105–116.

[7] BORGHESE, N. A.; CERVERI, P. Calibrating a video camera pair with a rigid bar. Pattern Recognition, v. 33, no. 1, p. 81–95, 2000.

[8] SVOBODA, T.; MARTINEC, D.; PAJDLA, T. A convenient multicamera self-calibration for virtual environments. Presence: Teleoperators & Virtual Environments, v. 14, no. 4, p. 407–422, 2005.

[9] LOPEZ, M.; MARI, R.; GARGALLO, P.; KUANG, Y.; GONZALEZ-JIMENEZ, J.; HARO, G. Deep single image camera calibration with radial distortion. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019. p. 11817–11825.

[10] BOUGUET, J.-Y. A few links related to camera calibration. 2015. <http://www.vision.caltech.edu/bouguetj/calib_doc/htmls/links.html>. [Online; accessed 20-September-2021].
