The MediaCodec class first became available in Android 4.1 (API 16). In Android 4.3 (API 18), MediaCodec was expanded to include a way to provide input through a Surface (via the createInputSurface method). This allows input to come from camera preview or OpenGL ES rendering. This release also introduced MediaMuxer, which allows the output of the codec (a raw H.264 elementary stream) to be converted to a .MP4 file.
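The Surface input path is set up by asking the encoder for its input Surface between configure() and start(). A minimal sketch of the API 18 flow (the bit rate, frame rate, and I-frame interval values here are illustrative, not required):

```java
// Configure an AVC encoder and obtain a Surface to feed it (API 18+).
// Bit rate, frame rate, and I-frame interval values are illustrative.
MediaFormat format = MediaFormat.createVideoFormat("video/avc", 1280, 720);
format.setInteger(MediaFormat.KEY_COLOR_FORMAT,
        MediaCodecInfo.CodecCapabilities.COLOR_FormatSurface);
format.setInteger(MediaFormat.KEY_BIT_RATE, 6000000);
format.setInteger(MediaFormat.KEY_FRAME_RATE, 30);
format.setInteger(MediaFormat.KEY_I_FRAME_INTERVAL, 1);

// createEncoderByType declares IOException on newer SDKs.
MediaCodec encoder = MediaCodec.createEncoderByType("video/avc");
encoder.configure(format, null, null, MediaCodec.CONFIGURE_FLAG_ENCODE);
// Must be called after configure() and before start().
Surface inputSurface = encoder.createInputSurface();
encoder.start();
// Camera preview or EGL rendering can now target inputSurface.
```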
Samples

EncodeAndMuxTest (requires 4.3, API 18)

Generates a movie using OpenGL ES. Uses MediaCodec to encode the movie in an H.264 elementary stream, and MediaMuxer to convert the stream to a .MP4 file.
This was written as if it were a CTS test, but is not part of CTS. It should be straightforward to adapt the code to other environments.

CameraToMpegTest (requires 4.3, API 18)

Records video from the camera preview and encodes it as an MP4 file. Uses MediaCodec to encode the movie in an H.264 elementary stream, and MediaMuxer to convert the stream to a .MP4 file. As an added bonus, demonstrates the use of a GLES fragment shader to modify the video as it's being recorded.
This was written as if it were a CTS test, but is not part of CTS. It should be straightforward to adapt the code to other environments.
Android Breakout game recorder patch (requires 4.3, API 18)
This is a patch for Android Breakout v1.0.2 that adds game recording. While the game is playing at 60fps at full screen resolution, a 30fps 720p recording is made with the AVC codec. The file is saved in the app-private data area, e.g. /data/data/com.faddensoft.breakout/files/video.mp4.
This is essentially the same as EncodeAndMuxTest, but it's part of a full app rather than an isolated CTS test. One key difference is in the EGL setup, which is done in a way that allows textures to be shared between the display and video contexts.
Another approach would be to render each game frame to a texture, and then render a full-screen quad with that texture twice (once for the display, once for the video). This could be faster for games that are expensive to render.

EncodeDecodeTest (requires 4.3, API 18)

CTS test. There are three tests that do essentially the same thing, but in different ways. Each test will:
Generate video frames
Encode frames with AVC codec
Decode generated stream
Test decoded frames to see if they match the original
The generation, encoding, decoding, and checking are near-simultaneous: frames are generated and fed to the encoder, and data from the encoder is fed to the decoder as soon as it becomes available.
The three tests are:
Buffer-to-buffer. Buffers are software-generated YUV frames in ByteBuffer objects, and decoded to the same. This is the slowest (and least portable) approach, but it allows the application to examine and modify the YUV data.
Buffer-to-surface. Encoding is again done from software-generated YUV data in ByteBuffers, but this time decoding is done to a Surface. Output is checked with OpenGL ES, using glReadPixels().
Surface-to-surface. Frames are generated with OpenGL ES onto an input Surface, and decoded onto a Surface. This is the fastest approach, but may involve conversions between YUV and RGB.
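The near-simultaneous hand-off described above comes from draining the encoder's output while input is still being fed. A highly simplified shape of that loop, assuming the API 16 buffer arrays (generateYuvFrame, computePresentationTimeUs, feedDecoder, NUM_FRAMES, and TIMEOUT_USEC are hypothetical names; EOS and INFO_* handling omitted):

```java
// Feed generated frames to the encoder while passing its encoded output to
// the decoder as soon as data appears. Simplified sketch; no EOS handling.
MediaCodec.BufferInfo info = new MediaCodec.BufferInfo();
for (int frame = 0; frame < NUM_FRAMES; frame++) {
    int inIndex = encoder.dequeueInputBuffer(TIMEOUT_USEC);
    if (inIndex >= 0) {
        ByteBuffer buf = encoder.getInputBuffers()[inIndex];
        buf.clear();
        buf.put(generateYuvFrame(frame));             // hypothetical frame generator
        encoder.queueInputBuffer(inIndex, 0, buf.position(),
                computePresentationTimeUs(frame), 0); // hypothetical timestamp helper
    }
    int outIndex = encoder.dequeueOutputBuffer(info, TIMEOUT_USEC);
    if (outIndex >= 0) {
        ByteBuffer encoded = encoder.getOutputBuffers()[outIndex];
        feedDecoder(encoded, info);                   // pass through, preserving flags
        encoder.releaseOutputBuffer(outIndex, false);
    }
}
```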
Each test is run at three different resolutions: 720p (1280x720), QCIF (176x144), and QVGA (320x240).
The buffer-to-buffer and buffer-to-surface tests can be built with Android 4.1 (API 16). However, because the CTS tests did not exist until Android 4.3, a number of devices shipped with broken implementations.

DecodeEditEncodeTest (requires 4.3, API 18)

CTS test. The test does the following:
Generate a series of video frames, and encode them with AVC. The encoded data stream is held in memory.
Decode the generated stream with MediaCodec, using an output Surface.
Edit the frame (swap green/blue color channels) with an OpenGL ES fragment shader.
Encode the frame with MediaCodec, using an input Surface.
Decode the edited video stream, verifying the output.
The middle decode-edit-encode pass performs decoding and encoding near-simultaneously, streaming frames directly from the decoder to the encoder. The initial generation and final verification are done as separate passes on video data held in memory.
Each test is run at three different resolutions: 720p (1280x720), QCIF (176x144), and QVGA (320x240).
No software-interpreted YUV buffers are used. Everything goes through Surface. There will be conversions between RGB and YUV at certain points; how many and where they happen depends on how the drivers are implemented.

Frame extraction sample (requires 4.1, API 16)

Extracts the first 10 frames of video from a .mp4 file. Uses MediaExtractor to extract the CSD data and feed individual access units into a MediaCodec decoder. The frames are decoded to a Surface created from SurfaceTexture, rendered (off-screen) into a pbuffer, extracted with glReadPixels(), and saved to a PNG file with Bitmap#compress().
The cost of extracting a frame breaks down roughly like this (obtained by modifying the test to extract full-size frames from 720p video on a Nexus 5, observing the total time required to save 10 frames, and doing successive runs with later stages removed):
2% hardware decode
23% glReadPixels() (which must do a YUV --> RGB conversion as it copies the pixel data)
14% rearranging the byte order in Java
61% PNG compression
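The "rearranging the byte order" step covers repacking the RGBA byte quads that glReadPixels() produces into the ARGB int values that Bitmap.createBitmap(int[], ...) expects. Roughly like this (a sketch of the idea, not the test's exact code):

```java
// Repack RGBA byte quads (as written by glReadPixels) into ARGB ints
// suitable for Bitmap.createBitmap(int[], ...).
static int[] rgbaToArgb(byte[] rgba) {
    int[] argb = new int[rgba.length / 4];
    for (int i = 0; i < argb.length; i++) {
        int r = rgba[4 * i]     & 0xff;
        int g = rgba[4 * i + 1] & 0xff;
        int b = rgba[4 * i + 2] & 0xff;
        int a = rgba[4 * i + 3] & 0xff;
        argb[i] = (a << 24) | (r << 16) | (g << 8) | b;
    }
    return argb;
}
```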
In theory, a Surface from the API 19 ImageReader class could be passed to the MediaCodec decoder, allowing direct access to the YUV data without the painful glReadPixels() step. As of Android 4.4, the MediaCodec decoder formats are not supported by ImageReader.
This was written as if it were a CTS test, but is not part of CTS. It should be straightforward to adapt the code to other environments.

Audio decode test (requires 4.1, API 16)

CTS test. Tests decoding of pre-recorded audio streams.

Audio encode test (requires 4.1, API 16)

CTS test. Tests encoding of audio streams.
Frequently Asked Questions

Q1. How do I play the video streams created by MediaCodec with the "video/avc" codec?
A1. The stream created is a raw H.264 elementary stream. The Totem Movie Player for Linux may work, but many other players won't touch it. You need to use the MediaMuxer class to create an MP4 file instead. See the EncodeAndMuxTest sample.
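The muxing step looks roughly like this (a simplified sketch; the output path is illustrative, and the track format must be the one the encoder reports after INFO_OUTPUT_FORMAT_CHANGED):

```java
// Wrap the encoder's H.264 elementary stream in an MP4 container (API 18+).
// "/sdcard/output.mp4" is an illustrative path.
MediaMuxer muxer = new MediaMuxer("/sdcard/output.mp4",
        MediaMuxer.OutputFormat.MUXER_OUTPUT_MPEG_4);
// Add the track only after the encoder has reported its output format.
int track = muxer.addTrack(encoder.getOutputFormat());
muxer.start();
// For each encoded buffer obtained from dequeueOutputBuffer():
muxer.writeSampleData(track, encodedBuffer, bufferInfo);
// When all samples have been written:
muxer.stop();
muxer.release();
```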
Q2. Why does my call to MediaCodec.configure() fail with an IllegalStateException when I try to create an encoder?
A2. This is usually because you haven't specified all of the mandatory keys required by the encoder. See this stackoverflow item for an example.
Q3. My video decoder is configured but won't accept data. What's wrong?
A3. A common mistake is neglecting to set the Codec-Specific Data, mentioned briefly in the documentation through the keys "csd-0" and "csd-1". This is a bunch of raw data with things like Sequence Parameter Set and Picture Parameter Set; all you really need to know is that the MediaCodec encoder generates them and the MediaCodec decoder wants them.
If you are feeding the output of the encoder to the decoder, you will note that the first packet you get from the encoder has the BUFFER_FLAG_CODEC_CONFIG flag set. You need to make sure you propagate this flag to the decoder, so that the first buffer the decoder receives does the setup. Alternatively, you can set the CSD data in the MediaFormat, and pass this into the decoder via configure(). You can see examples of both approaches in the EncodeDecodeTest sample.
If you're not sure how to set this up, you should probably be using MediaExtractor, which will handle it all for you.
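Setting the CSD explicitly on the decoder's format might look like this (sps and pps are whatever byte arrays your encoder produced; the variable names here are illustrative):

```java
// Hand the decoder its codec-specific data up front instead of relying on a
// BUFFER_FLAG_CODEC_CONFIG buffer. sps/pps are byte arrays from the encoder.
MediaFormat format = MediaFormat.createVideoFormat("video/avc", width, height);
format.setByteBuffer("csd-0", ByteBuffer.wrap(sps)); // Sequence Parameter Set
format.setByteBuffer("csd-1", ByteBuffer.wrap(pps)); // Picture Parameter Set
decoder.configure(format, outputSurface, null, 0);
decoder.start();
```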
Q4. Can I stream data into the decoder?
A4. Yes and no. The decoder takes a stream of "access units", which may not be a stream of bytes. For the video decoder, this means you need to preserve the "packet boundaries" established by the encoder. For example, see how the VideoChunks class in the DecodeEditEncodeTest sample operates. You can't just read arbitrary chunks of the file and pass them in.
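To make "packet boundaries" concrete: in an Annex-B H.264 stream, each NAL unit is preceded by a start code (0x00 0x00 0x00 0x01), so a byte stream cannot be chopped at arbitrary offsets. A toy splitter for illustration only (it ignores three-byte start codes; real code should rely on MediaExtractor or the encoder's BufferInfo boundaries):

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Split an Annex-B byte stream on four-byte start codes (0x00000001).
// Toy illustration only; does not handle three-byte start codes.
static List<byte[]> splitNalUnits(byte[] stream) {
    List<Integer> starts = new ArrayList<>();
    for (int i = 0; i + 3 < stream.length; i++) {
        if (stream[i] == 0 && stream[i + 1] == 0
                && stream[i + 2] == 0 && stream[i + 3] == 1) {
            starts.add(i);
        }
    }
    List<byte[]> units = new ArrayList<>();
    for (int n = 0; n < starts.size(); n++) {
        int from = starts.get(n);
        int to = (n + 1 < starts.size()) ? starts.get(n + 1) : stream.length;
        units.add(Arrays.copyOfRange(stream, from, to));
    }
    return units;
}
```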
Q5. I'm encoding the output of the camera through a YUV preview buffer. Why do the colors look wrong?
A5. The color formats for the camera output and the MediaCodec encoder input are different. Camera supports YV12 (planar YUV 4:2:0) and NV21 (semi-planar YUV 4:2:0). The MediaCodec encoders support one or more of:
#19 COLOR_FormatYUV420Planar (I420)
#20 COLOR_FormatYUV420PackedPlanar (also I420)
#21 COLOR_FormatYUV420SemiPlanar (NV12)
#39 COLOR_FormatYUV420PackedSemiPlanar (also NV12)
#xx COLOR_TI_FormatYUV420PackedSemiPlanar (also also NV12)
I420 has the same general data layout as YV12, but the Cr and Cb planes are reversed. Same with NV12 vs. NV21. So if you try to hand YV12 buffers from the camera to an encoder expecting something else, you'll see some odd color effects. There is no common format.
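Because YV12 and I420 differ only in the order of the two chroma planes, that particular mismatch can be fixed with a plane swap. A sketch for a planar 4:2:0 buffer (an illustrative helper, not part of any sample):

```java
// Convert YV12 (Y plane, then Cr, then Cb) to I420 (Y, then Cb, then Cr)
// by swapping the two quarter-size chroma planes. Width/height must be even.
static byte[] yv12ToI420(byte[] yv12, int width, int height) {
    int ySize = width * height;
    int cSize = ySize / 4;
    byte[] i420 = new byte[yv12.length];
    System.arraycopy(yv12, 0, i420, 0, ySize);                 // Y plane unchanged
    System.arraycopy(yv12, ySize, i420, ySize + cSize, cSize); // Cr moves last
    System.arraycopy(yv12, ySize + cSize, i420, ySize, cSize); // Cb moves first
    return i420;
}
```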
A more portable, and more efficient, approach is to use the API 18 Surface input API, demonstrated in the CameraToMpegTest sample. The downside is that you have to operate in RGB rather than YUV, which is a problem for image processing software. If you can implement the image manipulation in a fragment shader, perhaps by converting between RGB and YUV before and after your computations, you can take advantage of code execution on the GPU.
Q6. What's this EGL_RECORDABLE_ANDROID flag?
A6. That tells EGL that the surface it creates must be compatible with the video codecs. Without this flag, EGL might use a buffer format that MediaCodec can't understand.
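In practice the flag is just an extra attribute in the EGLConfig selection list, something like this (a sketch using the EGL14/EGLExt APIs; eglDisplay is assumed to be an initialized EGLDisplay):

```java
// Request an EGLConfig whose surfaces the video encoder can consume.
int[] attribList = {
        EGL14.EGL_RED_SIZE, 8,
        EGL14.EGL_GREEN_SIZE, 8,
        EGL14.EGL_BLUE_SIZE, 8,
        EGL14.EGL_RENDERABLE_TYPE, EGL14.EGL_OPENGL_ES2_BIT,
        EGLExt.EGL_RECORDABLE_ANDROID, 1,   // the important part
        EGL14.EGL_NONE
};
EGLConfig[] configs = new EGLConfig[1];
int[] numConfigs = new int[1];
EGL14.eglChooseConfig(eglDisplay, attribList, 0, configs, 0,
        configs.length, numConfigs, 0);
```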
Q7. Can I use the ImageReader class with MediaCodec?
A7. No. The ImageReader class, added in Android 4.4 (API 19), provides a handy way to access data in a YUV surface. Unfortunately, as of API 19 it only works with buffers from Camera. Also, there is no corresponding ImageWriter class for creating content.
Further Assistance
Please post all questions to stackoverflow with the android tag (and, for MediaCodec issues, the mediacodec tag as well). Comments or feature requests for the framework or CTS tests should be made on the AOSP bug tracker.