We could take a simple pipeline model like the ARM Cortex-M3's, i.e. a 3-stage model: Fetch, Decode and Execute.
Fetch only reads 4 bytes from RAM to get the instruction, and increases the PC.
Decode decodes an instruction and:
- Fetches the next 4 bytes without increasing the PC.
- If the instruction is an IF.cc prefix, checks the condition and increases the PC by 4.
- If the condition is true, calls the Decode stage again, using the 4 bytes fetched during Decode as the instruction to be processed.
- If the condition is false, does not call the Execute stage.
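
The steps above can be sketched as a small simulator. This is only a rough illustration, not a real design: the 32-bit encoding, the opcode values, the ADDI and HALT instructions, the single zero flag, and the rule that reading past the end of RAM yields HALT are all assumptions made up for the example. It also assumes Fetch advances the PC by 4.

```python
import struct

# Hypothetical 32-bit encoding, invented for this sketch:
#   bits 31..24 = opcode, bits 23..16 = register or condition, bits 15..0 = immediate
OP_ADDI, OP_IFCC, OP_HALT = 0x01, 0xF0, 0xFF
CC_EQ, CC_NE = 0, 1  # conditions tested against the zero flag


def encode(op, a=0, imm=0):
    return (op << 24) | (a << 16) | (imm & 0xFFFF)


class Cpu:
    def __init__(self, program):
        # RAM holds the program as little-endian 32-bit words.
        self.ram = b"".join(struct.pack("<I", w) for w in program)
        self.pc = 0
        self.regs = [0] * 4
        self.zero = False
        self.halted = False

    def read_word(self, addr):
        # Assumption: reading past the end of RAM yields HALT.
        if addr + 4 > len(self.ram):
            return encode(OP_HALT)
        return struct.unpack_from("<I", self.ram, addr)[0]

    def fetch(self):
        # Fetch: read 4 bytes at PC, then increase PC.
        word = self.read_word(self.pc)
        self.pc += 4
        return word

    def decode(self, word):
        """Return the instruction to execute, or None if it is skipped."""
        op, a = word >> 24, (word >> 16) & 0xFF
        lookahead = self.read_word(self.pc)  # next 4 bytes, PC unchanged
        if op != OP_IFCC:
            return word
        self.pc += 4  # the prefix consumes the looked-ahead word
        taken = self.zero if a == CC_EQ else not self.zero
        if taken:
            # Condition true: decode the looked-ahead word as the instruction.
            return self.decode(lookahead)
        return None  # condition false: Execute is not called

    def execute(self, word):
        op, a, imm = word >> 24, (word >> 16) & 0xFF, word & 0xFFFF
        if op == OP_ADDI:
            self.regs[a] = (self.regs[a] + imm) & 0xFFFFFFFF
            self.zero = self.regs[a] == 0
        elif op == OP_HALT:
            self.halted = True

    def run(self):
        while not self.halted:
            instr = self.decode(self.fetch())
            if instr is not None:
                self.execute(instr)


program = [
    encode(OP_ADDI, 0, 1),   # r0 += 1, zero flag cleared
    encode(OP_IFCC, CC_EQ),  # condition false: next instruction skipped
    encode(OP_ADDI, 1, 5),   # skipped
    encode(OP_IFCC, CC_NE),  # condition true: next instruction executed
    encode(OP_ADDI, 2, 7),   # r2 += 7
    encode(OP_HALT),
]
cpu = Cpu(program)
cpu.run()
print(cpu.regs)
```

In this run r1 stays 0 because its ADDI sits behind an IF.cc prefix whose condition is false, while r2 is updated because its prefix condition holds. Since Decode calls itself on the looked-ahead word, a prefix followed by another prefix is handled the same way.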