Cardboard VR Projects for Android
Jonathan Linowes, Matt Schoen
3D camera, perspective, and head rotation
As awesome as this is (ha ha), our app is kind of boring and not very Cardboard-like. Specifically, it's stereoscopic (dual views) and has lens distortion, but it's not yet a 3D perspective view and it doesn't move with your head. We're going to fix this now.
Welcome to the matrix
We can't talk about developing for virtual reality without talking about matrix mathematics for 3D computer graphics.
What is a matrix? The answer is out there, Neo, and it's looking for you, and it will find you if you want it to. That's right, it's time to learn about the matrix. Everything will be different now. Your perspective is about to change.
We're building a three-dimensional scene. Each location in space is described by the X, Y, and Z coordinates. Objects in the scene may be constructed from X, Y, and Z vertices. An object can be transformed by moving, scaling, and/or rotating its vertices. This transformation can be represented mathematically with a matrix of 16 floating point values (four rows of four floats each). How it works mathematically is cool, but we won't get into it here.
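To make this concrete, here is what such a matrix actually looks like in memory (a sketch, not from the project code). Android's Matrix class stores each 4 x 4 matrix as a flat array of 16 floats in OpenGL's column-major order, so a matrix that moves an object by (tx, ty, tz) would be:

float tx = 0.0f, ty = 0.0f, tz = 5.0f;
float[] translation = {
    1, 0, 0, 0,    // column 0
    0, 1, 0, 0,    // column 1
    0, 0, 1, 0,    // column 2
    tx, ty, tz, 1  // column 3 holds the translation
};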
Matrices can be combined by multiplying them together. For example, if you have a matrix that represents how much to resize an object (scale) and another matrix to reposition it (translate), then you can make a third matrix, representing both the resizing and the repositioning, by multiplying the two together. You can't just use the primitive * operator for this, though. Also, note that unlike simple scalar multiplication, matrix multiplication is not commutative. In other words, while for scalars we know that a * b = b * a, for matrices A and B, in general, AB ≠ BA! The Matrix class in the Android library provides functions for doing matrix math. Here's an example:
// allocate the matrix arrays
float scale[] = new float[16];
float translate[] = new float[16];
float scaleAndTranslate[] = new float[16];

// initialize to identity
Matrix.setIdentityM(scale, 0);
Matrix.setIdentityM(translate, 0);

// scale by 2, move by 5 in Z
Matrix.scaleM(scale, 0, 2.0f, 2.0f, 2.0f);
Matrix.translateM(translate, 0, 0.0f, 0.0f, 5.0f);

// combine them with a matrix multiply
Matrix.multiplyMM(scaleAndTranslate, 0, translate, 0, scale, 0);
Note that due to the way in which matrix multiplication works, multiplying a vector by the result matrix has the same effect as first multiplying it by the scale matrix (the right-hand operand) and then multiplying that result by the translate matrix (the left-hand operand). This is the opposite of the order you might expect.
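To see the ordering concretely, here's a small sketch (not part of the project code) that pushes a point through the scaleAndTranslate matrix from the previous example using the Matrix class's matrix-times-vector function:

// transform the point (1, 1, 1) by the combined matrix
float[] point = {1.0f, 1.0f, 1.0f, 1.0f};   // w = 1 for a position
float[] result = new float[4];
Matrix.multiplyMV(result, 0, scaleAndTranslate, 0, point, 0);
// result is (2, 2, 7, 1): the point was scaled by 2 first,
// and then moved by 5 in Z -- scale applied before translate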
Note
The documentation of the Matrix API can be found at http://developer.android.com/reference/android/opengl/Matrix.html.
This matrix stuff will be used a lot. Something that is worth mentioning here is precision loss. You might get a "drift" from the actual values if you repeatedly scale and translate that combined matrix because floating point calculations lose information due to rounding. It's not just a problem for computer graphics but also for banks and Bitcoin mining! (Remember the movie Office Space?)
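Here's a small sketch (not from the book's project) that makes the drift visible: rotate a matrix by one degree 360 times, and it won't land back exactly on the identity, because each multiply rounds a little:

float[] m = new float[16];
Matrix.setIdentityM(m, 0);
for (int i = 0; i < 360; i++) {
    Matrix.rotateM(m, 0, 1.0f, 0.0f, 1.0f, 0.0f);  // 1 degree about Y
}
// m[0] should be exactly 1.0f after a full turn,
// but will come out as something like 0.99999...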
One fundamental use of this matrix math, which we need immediately, is to transform a scene into a screen image (projection) as viewed from the user's perspective.
In a Cardboard VR app, to render the scene from a particular viewpoint, we think of a camera that is looking in a specific direction. The camera has X, Y, and Z positions like any other object and is rotated to its view direction. In VR, when you turn your head, the Cardboard SDK reads the motion sensors in your phone, determines the current head pose (the view direction and angles), and gives your app the corresponding transformation matrix.
In fact, in VR for each frame, we render two slightly different perspective views: one for each eye, offset by the actual distance between one's eyes (the interpupillary distance).
Also, in VR, we want to render the scene using a perspective projection (as opposed to an orthographic one) so that objects closer to you appear larger than the ones further away. This can be represented with a 4 x 4 matrix as well.
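In our app, the Cardboard SDK will build this projection matrix for us (via eye.getPerspective(), which we'll call shortly). Just as an illustration, here's a sketch of how you might construct one by hand with the Matrix class; the width and height variables are hypothetical viewport dimensions:

float[] perspective = new float[16];
Matrix.perspectiveM(perspective, 0,
        90.0f,                   // vertical field of view, in degrees
        (float) width / height,  // aspect ratio (hypothetical values)
        0.1f,                    // near depth plane
        100.0f);                 // far depth plane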
We can combine each of these transformations by multiplying them together to get a modelViewProjection matrix:

modelViewProjection = modelTransform X camera X eyeView X perspectiveProjection

A complete modelViewProjection (MVP) transformation matrix is a combination of any model transforms (for example, scaling or positioning the model in the scene) with the camera eye view and the perspective projection.
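Given the earlier note about multiplication order, in code the chain runs from right to left: the model transform is applied first and the projection last. Here's a sketch of the full chain; the modelTransform matrix is hypothetical (our actual code, shown later, leaves it out), as are the camera, eyeView, and perspective matrices it assumes are already filled in:

float[] view = new float[16];
float[] modelView = new float[16];
float[] modelViewProjection = new float[16];
// view = eyeView * camera
Matrix.multiplyMM(view, 0, eyeView, 0, camera, 0);
// modelView = view * modelTransform
Matrix.multiplyMM(modelView, 0, view, 0, modelTransform, 0);
// modelViewProjection = perspectiveProjection * modelView
Matrix.multiplyMM(modelViewProjection, 0, perspective, 0, modelView, 0);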
When OpenGL goes to draw an object, the vertex shader can use this modelViewProjection matrix to render the geometry. The whole scene gets drawn from the user's viewpoint, in the direction their head is pointing, with a perspective projection for each eye, to appear stereoscopically through your Cardboard viewer. VR MVP FTW!
The MVP vertex shader
The super simple vertex shader that we wrote earlier doesn't transform each vertex; it just passes it through to the next step in the pipeline. Now, we want it to be 3D-aware and use our modelViewProjection (MVP) transformation matrix. Let's create a new shader to handle it.
In the hierarchy view, right-click on the app/res/raw folder, go to New | File, enter the name mvp_vertex.shader, and click on OK. Write the following code:
uniform mat4 u_MVP;
attribute vec4 a_Position;

void main() {
    gl_Position = u_MVP * a_Position;
}
This shader is almost the same as simple_vertex, but it transforms each vertex by the u_MVP matrix. (Note that while multiplying matrices and vectors with * does not work in Java, it does work in the shader code!)
Replace the shader resource in the compileShaders function to use R.raw.mvp_vertex instead:
simpleVertexShader = loadShader(GLES20.GL_VERTEX_SHADER, R.raw.mvp_vertex);
Setting up the perspective viewing matrices
To add the camera and view to our scene, we define a few variables. In the MainActivity.java file, add the following code to the beginning of the MainActivity class:
// Viewing variables
private static final float Z_NEAR = 0.1f;
private static final float Z_FAR = 100.0f;
private static final float CAMERA_Z = 0.01f;

private float[] camera;
private float[] view;
private float[] modelViewProjection;

// Rendering variables
private int triMVPMatrixParam;
The Z_NEAR and Z_FAR constants define the depth planes used later to calculate the perspective projection for the camera eye. CAMERA_Z will be the position of the camera (for example, at X=0.0, Y=0.0, and Z=0.01).

The triMVPMatrixParam variable will be used to set the model transformation matrix in our improved shader.

The camera, view, and modelViewProjection matrices will be 4 x 4 matrices (arrays of 16 floats each) used for perspective calculations.
In onCreate, we initialize the camera, view, and modelViewProjection matrices:
protected void onCreate(Bundle savedInstanceState) {
    //...
    camera = new float[16];
    view = new float[16];
    modelViewProjection = new float[16];
}
In prepareRenderingTriangle, we initialize the triMVPMatrixParam variable:
// get handle to shape's transformation matrix
triMVPMatrixParam = GLES20.glGetUniformLocation(triProgram, "u_MVP");
There is a longstanding (and pointless) debate in the 3D graphics world about which axis is up. We can somehow all agree that the X axis goes left and right, but does the Y axis go up and down, or is it Z? Plenty of software picks Z as the up-and-down direction, and defines Y as pointing in and out of the screen. On the other hand, the Cardboard SDK, Unity, Maya, and many others choose the reverse. If you think of the coordinate plane as drawn on graph paper, it all depends on where you put the paper. If you think of the graph as you look down from above, or draw it on a whiteboard, then Y is the vertical axis. If the graph is sitting on the table in front of you, then the missing Z axis is vertical, pointing up and down. In any case, the Cardboard SDK, and therefore the projects in this book, treat Z as the forward and backward axis.
Render in perspective
With things set up, we can now handle redrawing the screen for each frame.
First, set the camera position. It could be defined once, such as in onCreate. But often in a VR application, the camera position in the scene can change, so we'll reset it for each frame.
The first thing to do is reset the camera matrix at the start of a new frame to a generic front-facing direction. Define the onNewFrame method, as follows:
@Override
public void onNewFrame(HeadTransform headTransform) {
    // Build the camera matrix and apply it to the ModelView.
    Matrix.setLookAtM(camera, 0, 0.0f, 0.0f, CAMERA_Z,
            0.0f, 0.0f, 0.0f, 0.0f, 1.0f, 0.0f);
}
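Incidentally, this is where you'd move the camera if you wanted it to travel through the scene. As a sketch (not part of this project), you could replace the CAMERA_Z constant with a hypothetical cameraZ field and drift the camera backward each frame:

// hypothetical: a float field on the class, initialized to CAMERA_Z
cameraZ += 0.01f;
Matrix.setLookAtM(camera, 0,
        0.0f, 0.0f, cameraZ,  // eye position
        0.0f, 0.0f, 0.0f,     // point being looked at
        0.0f, 1.0f, 0.0f);    // up vector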
Now, when it's time to draw the scene from the viewpoint of each eye, we calculate the perspective view matrix. Modify onDrawEye as follows:
public void onDrawEye(Eye eye) {
    GLES20.glEnable(GLES20.GL_DEPTH_TEST);
    GLES20.glClear(GLES20.GL_COLOR_BUFFER_BIT | GLES20.GL_DEPTH_BUFFER_BIT);

    // Apply the eye transformation to the camera
    Matrix.multiplyMM(view, 0, eye.getEyeView(), 0, camera, 0);

    // Get the perspective transformation
    float[] perspective = eye.getPerspective(Z_NEAR, Z_FAR);

    // Apply perspective transformation to the view, and draw
    Matrix.multiplyMM(modelViewProjection, 0, perspective, 0, view, 0);
    drawTriangle();
}
The first two lines that we added reset the OpenGL depth buffer. When 3D scenes are rendered, in addition to the color of each pixel, OpenGL keeps track of the distance the object occupying that pixel is from the eye. If the same pixel is rendered for another object, the depth buffer will know whether it should be visible (closer) or ignored (further away). (Or, perhaps the colors get combined in some way, for example, transparency). We clear the buffer before rendering any geometry for each eye. The color buffer, which is the one you actually see on screen, is also cleared. Otherwise, in this case, you would end up filling the entire screen with a solid color.
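If you did want that kind of color blending, you would also need to enable it in OpenGL. A sketch, not needed for our opaque triangle:

GLES20.glEnable(GLES20.GL_BLEND);
GLES20.glBlendFunc(GLES20.GL_SRC_ALPHA, GLES20.GL_ONE_MINUS_SRC_ALPHA);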
Now, let's move on to the viewing transformations. onDrawEye receives the current Eye object, which describes the stereoscopic rendering details of the eye. In particular, the eye.getEyeView() method returns a transformation matrix that includes head tracking rotation, position shift, and interpupillary distance shift. In other words, it tells us where the eye is located in the scene and what direction it's looking in. Though Cardboard does not offer positional tracking, the positions of the eyes do change in order to simulate a virtual head. Your eyes don't rotate on a central axis; rather, your head pivots around your neck, which is a certain distance from the eyes. As a result, when the Cardboard SDK detects a change in orientation, the two virtual cameras move around the scene as though they were actual eyes in an actual head.
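If you ever want the raw head pose yourself, the HeadTransform object passed to onNewFrame can provide it. A sketch, assuming the SDK's getHeadView method (we don't need this for our triangle):

float[] headView = new float[16];
headTransform.getHeadView(headView, 0);  // the head's current view matrix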
We need a transformation that represents the perspective view of the camera at this eye's position. As mentioned earlier, this is calculated as follows:
modelViewProjection = modelTransform X camera X eyeView X perspectiveProjection
We multiply the camera by the eye view transform (getEyeView), and then multiply the result by the perspective projection transform (getPerspective). Presently, we do not transform the triangle model itself, so we leave the modelTransform matrix out.

The result (modelViewProjection) is passed to OpenGL to be used by the shaders in the rendering pipeline (via glUniformMatrix4fv). Then, we draw our stuff (via glDrawArrays, as written earlier).
Now, we need to pass the modelViewProjection matrix to the shader program. In the drawTriangle method, add it as follows:
private void drawTriangle() {
    // Add program to OpenGL ES environment
    GLES20.glUseProgram(triProgram);

    // Pass the MVP transformation to the shader
    GLES20.glUniformMatrix4fv(triMVPMatrixParam, 1, false, modelViewProjection, 0);
    // . . .
Building and running
Let's build and run it. Go to Run | Run 'app', or simply use the green triangle Run icon on the toolbar. Now, moving the phone will change the display synchronized with your view direction. Insert the phone in a Google Cardboard viewer and it's like VR (kinda sorta).
Note that if your phone is lying flat on the table when the app starts, the camera in our scene will be facing straight down rather than forward at our triangle. What's worse, when you pick up the phone, the neutral direction may not be facing straight in front of you. So, each time you run apps in this book, pick up the phone first, so you look forward in VR, or keep the phone propped up in position (personally, I use a Gekkopod, which is available at http://gekkopod.com/).
Also, in general, make sure that your phone is not set to Lock Portrait in the Settings dialog box.