@SheldonWangRJT
Last active June 17, 2022 03:22

iOS CPU and GPU basics

#iOSBySheldon #CPU #GPU #60FPS #CADisplayLink

Apple always encourages us, as iOS developers, to keep our apps running at 60 frames per second (60 FPS) to ensure the best user experience. Most of the time this comes for free, because the hardware keeps getting better and better.

But in some cases, when your app has significant frame drops, you may need to dig deeper to fix them for your users, which leads to my post today: the basics of the CPU and GPU in iOS. Whenever I encounter this kind of question, I always want to find the answer directly in Apple’s official documentation. I was able to find this article here. It is written mainly for #Metal and iOS game developers, but I found all the info that we need. As always, I will try to summarize the best parts of the article.

To kick it off, here are a few iOS basics that are worth recalling:

  • iOS targets 60 FPS on most devices (as of now at least; ProMotion displays can refresh at up to 120 Hz)
  • CPU and GPU can work separately and asynchronously
  • We can leverage the CPU and GPU easily using Metal, but we can also do it in regular apps with the help of the RunLoop
  • A RunLoop is basically a while loop that runs over and over to handle events for a thread in iOS (the main thread automatically gets a RunLoop instance created for us)
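
To make the 60 FPS target concrete, here is a minimal sketch of the arithmetic behind it. The function name frameBudgetMilliseconds is my own, made up for illustration; it just converts a frame rate into the time the CPU and GPU have to finish one frame.

```swift
import Foundation

/// Time available to produce one frame, in milliseconds, at a given refresh rate.
/// (Hypothetical helper for illustration; not part of any Apple API.)
func frameBudgetMilliseconds(framesPerSecond: Double) -> Double {
    return 1000.0 / framesPerSecond
}

let budgetAt60 = frameBudgetMilliseconds(framesPerSecond: 60)
print(String(format: "%.2f ms per frame at 60 FPS", budgetAt60)) // 16.67 ms per frame at 60 FPS
```

If the work for a frame takes longer than this budget, the frame is late and the user sees a dropped frame.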

The basic flow for rendering each frame in iOS is:

  • The render loop starts a new frame.
  • The CPU writes new vertex data into the vertex buffer.
  • The CPU encodes render commands and commits a command buffer.
  • The GPU begins executing the command buffer.
  • The GPU reads vertex data from the vertex buffer.
  • The GPU renders pixels to a drawable.
  • The render loop completes the frame.

The flow is pretty straightforward, but there is a thing called a “vertex buffer” that you may not have heard of. The vertex buffer is just a buffer that both the CPU and the GPU can access, so they can hand data to each other through it. The document also mentions “fragment textures”, which play the same role as the vertex buffer. The most intuitive way to run the flow is with a single vertex buffer, as in the following image, which is NOT recommended: [Image: Single Buffer Flow]

The reason this is not recommended is apparent: time is wasted, because the GPU sits idle while the CPU writes to the buffer, and the CPU sits idle while the GPU reads from it. We need to find a way to share the vertex buffer. The following image shows another NOT recommended flow: [Image: Single Buffer Flow with Shared Access]

It is also very apparent why this is not recommended: there is a race condition between the CPU and the GPU, so there will be access conflicts. So 1 buffer is definitely not enough, and guess what, the solution is easy: you are right, create a few more buffers to use. How many buffers do we need? The answer is THREE (3). With multiple buffers, the flow looks like this: [Image: Multi Buffers Flow]

Why 3 buffers?

According to the article, 3 is the most efficient and effective number. Technically we only need 2 buffers. Say for Frame 1 we use Buffer 1; when the CPU needs to calculate Frame 2, we use Buffer 2. When Frame 3 is needed we could use Buffer 1 again, but Apple suggests using Buffer 3 and only then switching back to Buffer 1.
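
The rotation described above can be sketched in a few lines of plain Swift. This is only an illustration of the index arithmetic, not the code from Apple's sample; maxBuffersInFlight and nextBufferIndex are names I made up.

```swift
/// Number of buffers kept in rotation; 3 follows the article's recommendation.
let maxBuffersInFlight = 3

/// Returns the index of the buffer to use after `current`, wrapping back to 0.
/// (Hypothetical helper for illustration.)
func nextBufferIndex(after current: Int, bufferCount: Int = maxBuffersInFlight) -> Int {
    return (current + 1) % bufferCount
}

var index = 0
var used: [Int] = []
for _ in 1...6 {
    used.append(index + 1) // 1-based, to match "Buffer 1, 2, 3" in the text
    index = nextBufferIndex(after: index)
}
print(used) // [1, 2, 3, 1, 2, 3]
```

Each frame simply takes the next buffer in the ring, so the CPU can write Frame N+1 and N+2 while the GPU is still reading Frame N.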

Of course, when both the CPU and the GPU access the vertex buffers, we need a locking mechanism to make sure writing has finished before reading starts, and vice versa. The sample code uses a dispatch semaphore for this: dispatch_semaphore_wait to lock the access, paired with dispatch_semaphore_signal to unlock it.
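
Here is a rough sketch of that semaphore idea using DispatchSemaphore, the Swift counterpart of dispatch_semaphore_wait and dispatch_semaphore_signal. To keep it runnable without Metal, the "GPU" is just a serial queue that sleeps for a moment, so this is a simulation under my own assumptions and not the sample project's code. simulateFrames and all the names inside it are made up for illustration.

```swift
import Foundation

/// Simulates the CPU preparing `frameCount` frames while a fake "GPU" queue consumes them.
/// Returns the highest number of frames that were in flight at the same time.
/// (Hypothetical simulation; a real app would signal from a Metal completion handler.)
func simulateFrames(frameCount: Int, bufferCount: Int) -> Int {
    // One permit per buffer: the CPU can get at most `bufferCount` frames ahead.
    let frameBoundarySemaphore = DispatchSemaphore(value: bufferCount)
    // Stand-in for the GPU: a serial queue that "renders" frames in order.
    let simulatedGPUQueue = DispatchQueue(label: "simulated.gpu")
    let allFramesDone = DispatchGroup()
    let lock = NSLock() // protects the two counters below
    var inFlight = 0
    var maxInFlight = 0

    for _ in 0..<frameCount {
        // CPU side: block here until a buffer is free to write into.
        frameBoundarySemaphore.wait()
        lock.lock()
        inFlight += 1
        maxInFlight = max(maxInFlight, inFlight)
        lock.unlock()

        allFramesDone.enter()
        simulatedGPUQueue.async {
            // GPU side: pretend to read the buffer and render the frame.
            Thread.sleep(forTimeInterval: 0.002)
            lock.lock()
            inFlight -= 1
            lock.unlock()
            // Hand the buffer back so the CPU may reuse it.
            frameBoundarySemaphore.signal()
            allFramesDone.leave()
        }
    }
    allFramesDone.wait()
    return maxInFlight
}

let peak = simulateFrames(frameCount: 12, bufferCount: 3)
print("Peak frames in flight:", peak)
```

The peak can never go above the number of buffers, because the CPU has to wait for a permit before it touches the next buffer.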

All the code is available here in this sample project. Don't forget to read the article again here.

Also, a quick note: it is pretty easy to check the current frame rate by using the CADisplayLink class. To create one, simply:

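// Both methods live in an NSObject subclass such as a UIViewController (requires `import UIKit`).
func createDisplayLink() {
    let displaylink = CADisplayLink(target: self,
                                    selector: #selector(step))
    // Schedule on the current run loop; `.default` replaced `.defaultRunLoopMode` in Swift 4.2.
    displaylink.add(to: .current,
                    forMode: .default)
}
// Must be marked @objc so that #selector can reference it; called once per screen refresh.
@objc func step(displaylink: CADisplayLink) {
    print(displaylink.timestamp)
}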

Inside step, the current frame rate can be calculated as:

let actualFramesPerSecond = 1 / (displaylink.targetTimestamp - displaylink.timestamp)
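
The same formula can be tried without a device by pulling it into a small pure function. This is just a sketch: framesPerSecond is a name I made up, and the timestamps below are hypothetical values one 60 Hz frame apart rather than real CADisplayLink readings.

```swift
import Foundation

/// Frame rate implied by a display link's current and target timestamps (both in seconds).
/// Returns nil when the interval is not positive, to avoid dividing by zero.
/// (Hypothetical helper for illustration.)
func framesPerSecond(timestamp: Double, targetTimestamp: Double) -> Double? {
    let frameDuration = targetTimestamp - timestamp
    guard frameDuration > 0 else { return nil }
    return 1 / frameDuration
}

// Hypothetical timestamps, one 60 Hz frame apart.
if let fps = framesPerSecond(timestamp: 10.0, targetTimestamp: 10.0 + 1.0 / 60.0) {
    print("Roughly \(Int(fps.rounded())) FPS")
}
```

In a real app you would pass displaylink.timestamp and displaylink.targetTimestamp from inside step.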

For more about CADisplayLink, check here
