Since adopting ExoPlayer in Collect, our users started to face crashes at different points in the app. Analysing the crashes, we initially and wrongly concluded that the problem was with Samsung phones; however, further investigation showed that the problem was with videos recorded by the Samsung Gear 360 camera.
To understand what happened, we first need to understand a little about MP4 files. MPEG-4 Part 14, aka MP4, is a multimedia container format that allows the storage of video, audio, subtitles and also images. The different media are stored in different tracks, and those tracks are divided into small samples. The samples are organized so that corresponding pieces of different tracks are placed near each other. E.g.:
------------------------------------------------------------------------------------------
| Video Time=0s Duration=1s | Video Time=1s Duration=1s | Audio Time=0s Duration=2s |....
------------------------------------------------------------------------------------------
The process of organizing samples described above is called interleaving. A well-interleaved video facilitates playback by allowing more sequential file access.
The issue with the Samsung Gear 360 videos was that they were badly interleaved, e.g.:
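As a rough illustration of interleaving, the sketch below merges two per-track sample lists into a single file order by start time. The Sample type, the interleave function and the sample values are hypothetical and only serve the example; real muxers usually group samples into chunks and apply their own ordering rules.

```python
import heapq
from typing import NamedTuple


class Sample(NamedTuple):
    track: str     # "video" or "audio"
    start: int     # presentation time in seconds
    duration: int  # length in seconds


def interleave(*tracks):
    """Merge per-track sample lists (each sorted by start) into one file order.

    heapq.merge keeps ties in the order the tracks were passed in.
    """
    return list(heapq.merge(*tracks, key=lambda s: s.start))


video = [Sample("video", t, 1) for t in range(4)]
audio = [Sample("audio", t, 2) for t in range(0, 4, 2)]

layout = interleave(video, audio)
for s in layout:
    print(f"| {s.track} Time={s.start}s Duration={s.duration}s ", end="")
print("|")
```

With this ordering, a reader that goes through the file sequentially never has to hold more than a couple of seconds of one track while waiting for the other.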
------------------------------------------------------------------------------------------
| Video Time=0s Duration=1s |...| Video Time=300s Duration=1s | Audio Time=0s Duration=2s |....
------------------------------------------------------------------------------------------
To handle that, ExoPlayer was storing all the video samples in memory until it found the audio sample, and only then starting playback. Memory consumption easily reached 500 MB, which was enough to cause out-of-memory (OOM) exceptions in other threads of the app.
To fix the problem we need to go deeper. First we need to identify the problematic videos, and then find a way to fix them.
MP4 implements the ISO base media file format defined by MPEG-4 Part 12, which defines a general way to store multimedia. All the data is stored in boxes (or atoms) of different types for different purposes. Here is an example of a typical MP4:
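To get a feel for why the layout matters so much, here is a deliberately simplified model of a player that reads the file sequentially and can only release samples once every track has been read up to the same point in time. It is not ExoPlayer's actual buffering logic. The sample sizes (1 MB per second of video, 32 KB per two seconds of audio) and the names Sample and peak_buffered_bytes are assumptions made for the example.

```python
from typing import NamedTuple


class Sample(NamedTuple):
    track: str
    start: int     # seconds
    duration: int  # seconds
    size: int      # bytes (illustrative value, not measured)


def peak_buffered_bytes(layout):
    """Peak memory held by a sequential reader that plays all tracks in sync."""
    read_until = {s.track: 0 for s in layout}  # how far each track has been read
    buffered = []                              # samples read but not yet playable
    peak = 0
    for s in layout:
        buffered.append(s)
        read_until[s.track] = max(read_until[s.track], s.start + s.duration)
        peak = max(peak, sum(b.size for b in buffered))
        # Playback can only advance as far as the slowest track allows.
        playable_until = min(read_until.values())
        buffered = [b for b in buffered if b.start + b.duration > playable_until]
    return peak


VIDEO_BYTES, AUDIO_BYTES = 1_000_000, 32_000
video = [Sample("video", t, 1, VIDEO_BYTES) for t in range(300)]
audio = [Sample("audio", t, 2, AUDIO_BYTES) for t in range(0, 300, 2)]

# Well interleaved: two video samples followed by the matching audio sample.
good = []
for a in audio:
    good += [video[a.start], video[a.start + 1], a]

# Badly interleaved: all the video first, then all the audio.
bad = video + audio

print(peak_buffered_bytes(good))  # 2032000
print(peak_buffered_bytes(bad))   # 300032000
```

Under these assumptions the well-interleaved layout peaks at about 2 MB, while the badly interleaved one has to hold the whole video track before the first audio sample arrives.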
Atom ftyp @ 0 of size: 32, ends @ 32
Atom free @ 32 of size: 8, ends @ 40
Atom mdat @ 40 of size: 107669954, ends @ 107669994
Atom moov @ 107669994 of size: 20506, ends @ 107690500
    Atom mvhd @ 107670002 of size: 108, ends @ 107670110
    Atom trak @ 107670110 of size: 14146, ends @ 107684256
        Atom tkhd @ 107670118 of size: 92, ends @ 107670210
        Atom edts @ 107670210 of size: 36, ends @ 107670246
            Atom elst @ 107670218 of size: 28, ends @ 107670246
        Atom mdia @ 107670246 of size: 13563, ends @ 107683809
            Atom mdhd @ 107670254 of size: 32, ends @ 107670286
            Atom hdlr @ 107670286 of size: 45, ends @ 107670331
            Atom minf @ 107670331 of size: 13478, ends @ 107683809
                Atom smhd @ 107670339 of size: 16, ends @ 107670355
                Atom dinf @ 107670355 of size: 36, ends @ 107670391
                    Atom dref @ 107670363 of size: 28, ends @ 107670391
                Atom stbl @ 107670391 of size: 13418, ends @ 107683809
                    Atom stsd @ 107670399 of size: 106, ends @ 107670505
                        Atom mp4a @ 107670415 of size: 90, ends @ 107670505
                            Atom esds @ 107670451 of size: 54, ends @ 107670505
                    Atom stts @ 107670505 of size: 40, ends @ 107670545
                    Atom stsc @ 107670545 of size: 6688, ends @ 107677233
                    Atom stsz @ 107677233 of size: 4008, ends @ 107681241
                    Atom stco @ 107681241 of size: 2568, ends @ 107683809
        Atom uuid=ffcc8263-f855-4a93-8814-587a02521fdd @ 107683809 of size: 447, ends @ 107684256
    Atom trak @ 107684256 of size: 6146, ends @ 107690402
        Atom tkhd @ 107684264 of size: 92, ends @ 107684356
        Atom edts @ 107684356 of size: 36, ends @ 107684392
            Atom elst @ 107684364 of size: 28, ends @ 107684392
        Atom mdia @ 107684392 of size: 5563, ends @ 107689955
            Atom mdhd @ 107684400 of size: 32, ends @ 107684432
            Atom hdlr @ 107684432 of size: 45, ends @ 107684477
            Atom minf @ 107684477 of size: 5478, ends @ 107689955
                Atom vmhd @ 107684485 of size: 20, ends @ 107684505
                Atom dinf @ 107684505 of size: 36, ends @ 107684541
                    Atom dref @ 107684513 of size: 28, ends @ 107684541
                Atom stbl @ 107684541 of size: 5414, ends @ 107689955
                    Atom stsd @ 107684549 of size: 134, ends @ 107684683
                        Atom avc1 @ 107684565 of size: 118, ends @ 107684683
                            Atom avcC @ 107684651 of size: 32, ends @ 107684683
                    Atom stts @ 107684683 of size: 64, ends @ 107684747
                    Atom stss @ 107684747 of size: 40, ends @ 107684787
                    Atom stsc @ 107684787 of size: 28, ends @ 107684815
                    Atom stsz @ 107684815 of size: 2572, ends @ 107687387
                    Atom stco @ 107687387 of size: 2568, ends @ 107689955
        Atom uuid=ffcc8263-f855-4a93-8814-587a02521fdd @ 107689955 of size: 447, ends @ 107690402
    Atom udta @ 107690402 of size: 98, ends @ 107690500
        Atom meta @ 107690410 of size: 90, ends @ 107690500
            Atom hdlr @ 107690422 of size: 33, ends @ 107690455
            Atom ilst @ 107690455 of size: 45, ends @ 107690500
                Atom ©too @ 107690463 of size: 37, ends @ 107690500
                    Atom data @ 107690471 of size: 29, ends @ 107690500
The most important boxes are the mdat box, which stores the multimedia data, and the moov box, which stores the index for that data. Inside the moov box we can find trak boxes, which hold the information about the different tracks inside the MP4. In this case we have two tracks, one for video and another for audio.
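The box structure is simple enough to walk by hand: each box starts with a 4-byte big-endian size followed by a 4-byte type, and container boxes such as moov and trak simply hold more boxes. The sketch below is a minimal walker under a few assumptions: the names walk_boxes, make_box and CONTAINERS are hypothetical, only a handful of plain container types are descended into (meta and stsd carry extra header fields and are skipped), and the input is a tiny synthetic file built in memory rather than a real video.

```python
import io
import struct

# Plain container boxes whose payload is just a sequence of child boxes.
CONTAINERS = {b"moov", b"trak", b"mdia", b"minf", b"stbl", b"edts", b"dinf", b"udta"}


def walk_boxes(f, start, end, depth=0):
    """Yield (depth, type, offset, size) for every box found in f[start:end]."""
    pos = start
    while pos + 8 <= end:
        f.seek(pos)
        size, box_type = struct.unpack(">I4s", f.read(8))
        header = 8
        if size == 1:    # 64-bit size stored right after the type
            (size,) = struct.unpack(">Q", f.read(8))
            header = 16
        elif size == 0:  # box extends to the end of the enclosing range
            size = end - pos
        if size < header or pos + size > end:
            break        # malformed box, stop rather than guess
        yield depth, box_type, pos, size
        if box_type in CONTAINERS:
            yield from walk_boxes(f, pos + header, pos + size, depth + 1)
        pos += size


def make_box(box_type, payload=b""):
    """Build a synthetic box; the payload content is not meaningful."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


trak = make_box(b"trak", make_box(b"tkhd", bytes(84)))
data = (make_box(b"ftyp", b"isom" + bytes(4))
        + make_box(b"mdat", bytes(16))
        + make_box(b"moov", make_box(b"mvhd", bytes(100)) + trak + trak))

boxes = list(walk_boxes(io.BytesIO(data), 0, len(data)))
for depth, box_type, offset, size in boxes:
    name = box_type.decode("latin-1")
    print("    " * depth + f"Atom {name} @ {offset} of size: {size}, ends @ {offset + size}")

track_count = sum(1 for _, box_type, _, _ in boxes if box_type == b"trak")
print(track_count)  # 2
```

The same function can be pointed at a real file opened in binary mode, using the file size as the end offset. Telling the video track from the audio track would additionally require reading the handler type stored in each track's hdlr box.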
TO BE CONTINUED...