dbcesar/solvePnP_comments.txt

## solvePnP_comments.txt
solvePnP returns the pose of the object in the camera frame: rvec and tvec describe the transformation that takes points from object coordinates to camera coordinates.

To get the pose of the camera in the object frame, you have to invert the transformation returned by solvePnP.

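There is a trick for inverting rigid transformation matrices that avoids a general matrix inversion,
which is comparatively expensive. Given a transformation [R|t], where R is a rotation matrix,
we have that inv([R|t]) = [R'|-R'*t], where R' is the transpose of R.
This holds because R is orthonormal, so inv(R) = R': from x_cam = R*x_obj + t it follows that
x_obj = R'*x_cam - R'*t.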

So the C++ code would look like this:

#########################################
#########################################
cv::Mat rvec, tvec;
solvePnP(..., rvec, tvec, ...);
// rvec is 3x1, tvec is 3x1

cv::Mat R;
cv::Rodrigues(rvec, R); // R is 3x3

R = R.t();  // rotation of inverse
tvec = -R * tvec; // translation of inverse

// T is 4x4; starting from the identity means the last row is already [0 0 0 1],
// whatever the element type (float or double)
cv::Mat T = cv::Mat::eye(4, 4, R.type());
R.copyTo(T(cv::Range(0,3), cv::Range(0,3))); // copies R into T
tvec.copyTo(T(cv::Range(0,3), cv::Range(3,4))); // copies tvec into T

// T is a 4x4 matrix with the pose of the camera in the object frame
#########################################
#########################################

Update: Later, to use T with OpenGL you have to keep in mind that the axes of the camera frame differ between OpenCV and OpenGL.

OpenCV uses the convention common in computer vision: X points to the right, Y down, Z forward. The camera frame in OpenGL is: X points to the right, Y up, Z backward. So you need to apply a 180-degree rotation around the X axis. In homogeneous form this rotation is the diagonal matrix diag(1, -1, -1, 1).

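#########################################
#########################################
// T is your 4x4 matrix in the OpenCV frame
cv::Mat RotX = ...; // 4x4 matrix with a 180 deg rotation around X
cv::Mat Tgl = T * RotX; // OpenGL camera in the object frame
#########################################
#########################################

These transformations are always confusing and I may be wrong at some step, so take this with a grain of salt.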
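
To make the RotX step concrete, here is a small library-free sketch of the 180-degree rotation around X and of the product T * RotX. The `Mat4` type and the functions `makeRotX180` and `multiply` are hypothetical names used only for this illustration; with OpenCV you would build the same matrix as a 4x4 cv::Mat of the same type as T.

```cpp
#include <array>

// Hypothetical 4x4 type for this sketch, indexed as m[row][col] like cv::Mat.
using Mat4 = std::array<std::array<double, 4>, 4>;

// 180-degree rotation around X in homogeneous form.
// cos(180 deg) = -1 and sin(180 deg) = 0, so the matrix is diag(1, -1, -1, 1):
// X is unchanged, Y and Z are flipped.
Mat4 makeRotX180() {
    Mat4 m{};  // zero-initialised
    m[0][0] = 1.0;
    m[1][1] = -1.0;
    m[2][2] = -1.0;
    m[3][3] = 1.0;
    return m;
}

// Plain 4x4 matrix product a * b.
Mat4 multiply(const Mat4& a, const Mat4& b) {
    Mat4 c{};
    for (int i = 0; i < 4; ++i) {
        for (int j = 0; j < 4; ++j) {
            for (int k = 0; k < 4; ++k) {
                c[i][j] += a[i][k] * b[k][j];
            }
        }
    }
    return c;
}
```

Because RotX is diagonal, `multiply(T, makeRotX180())` simply negates the second and third columns of T (the camera's Y and Z axes expressed in the object frame) and leaves the first column and the translation column untouched. That makes the result easy to check by eye.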

Finally, keep in mind that OpenCV stores matrices in row-major order in memory, whereas OpenGL expects them in column-major order.
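
The layout difference amounts to a transpose when flattening the matrix. The following is a minimal sketch, assuming the pose is held in a hypothetical row-major `Mat4` type (standing in for a 4x4 cv::Mat of doubles) and that the OpenGL side wants 16 floats, as legacy calls such as glLoadMatrixf do. `toColumnMajor` is an illustrative name, not an OpenCV or OpenGL function.

```cpp
#include <array>

// Hypothetical 4x4 type for this sketch, indexed as m[row][col] (row-major, like cv::Mat).
using Mat4 = std::array<std::array<double, 4>, 4>;

// Flattens a row-major 4x4 matrix into a 16-float array in column-major order:
// element (row, col) is written to index col * 4 + row.
std::array<float, 16> toColumnMajor(const Mat4& m) {
    std::array<float, 16> out{};
    for (int row = 0; row < 4; ++row) {
        for (int col = 0; col < 4; ++col) {
            out[col * 4 + row] = static_cast<float>(m[row][col]);
        }
    }
    return out;
}
```

After the conversion, the translation (the fourth column of the matrix) sits in elements 12, 13 and 14 of the flat array, which is a quick way to confirm that the layout is the one OpenGL expects.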