I recently spent some time refactoring how Collect interacts with the Android MediaPlayer (PR here). This meant touching a fair amount of the stack involved in form entry, so I thought it would be good to write up the pain I experienced (within the code), along with some suggestions for improvements we could make.
Disclaimer: all the "Next steps" are just my own first opinions, so they are very open to discussion.
When Collect renders a Form it uses an ODKView to represent each question (FormEntryPrompt) or "group" of questions. This view is hosted inside a FormEntryActivity. When the user swipes from left to right the ODKView is removed from the view hierarchy and a new one is made for the new question.
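To make that lifecycle concrete, here is a minimal plain-Java sketch of the "one view per question, replaced on navigation" pattern. It is not Collect's actual code: `QuestionViewSketch` and `FormScreenSketch` are hypothetical stand-ins for ODKView and FormEntryActivity, there are no Android classes involved, and `moveTo` is just an assumed name for whatever the swipe handler ends up calling.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for ODKView: one instance per question (or group). */
class QuestionViewSketch {
    final String promptText;

    QuestionViewSketch(String promptText) {
        this.promptText = promptText;
    }
}

/** Hypothetical stand-in for FormEntryActivity: hosts exactly one view at a time. */
class FormScreenSketch {
    private final List<String> prompts;
    private QuestionViewSketch currentView;
    private int viewsCreated = 0;

    FormScreenSketch(List<String> prompts) {
        if (prompts.isEmpty()) {
            throw new IllegalArgumentException("a form needs at least one prompt");
        }
        this.prompts = new ArrayList<>(prompts);
        moveTo(0);
    }

    /** Called on a swipe: the old view is dropped and a fresh one is built. */
    void moveTo(int index) {
        // The previous view is not reused; assigning a new instance discards it,
        // along with any state it was holding.
        currentView = new QuestionViewSketch(prompts.get(index));
        viewsCreated++;
    }

    QuestionViewSketch currentView() {
        return currentView;
    }

    int viewsCreated() {
        return viewsCreated;
    }
}
```

The point of the sketch is that the view's lifetime is tied to a single question: anything a view owns goes away as soon as the user navigates, while the hosting screen lives on across questions.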