title
How to develop a custom camera with AVFoundation using Swift 4 for iOS 11: how to retrieve the frames of a camera preview and access the camera stream to perform operations on each frame
tags
AVFoundation, Swift, Swift4, iOS 11, video preview, picture, frame
links
https://developer.apple.com/documentation/avfoundation
https://www.appcoda.com/avfoundation-swift-guide/
What is AVFoundation and when to use it
Understanding AVFoundation
AVFoundation is a framework developed by Apple and available on iOS (2.2+). This framework is used to create, edit and play back media content. It gives developers direct access to the camera and the microphone.
In this tutorial we will see how to access the camera stream to perform operations on each frame.
Note: AVFoundation is a highly customizable framework. If your application only requires taking a picture, for example, you should use UIImagePickerController instead.
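For comparison, a minimal picture-taking flow with UIImagePickerController (a sketch using the Swift 4 / iOS 11 era APIs, with a hypothetical SimplePictureViewController as the presenting controller) could look like this:

import UIKit

// Minimal sketch: taking a single picture with UIImagePickerController instead of AVFoundation.
class SimplePictureViewController: UIViewController, UIImagePickerControllerDelegate, UINavigationControllerDelegate {

    func takePicture() {
        // make sure a camera is available (it is not on the simulator)
        guard UIImagePickerController.isSourceTypeAvailable(.camera) else { return }
        let picker = UIImagePickerController()
        picker.sourceType = .camera
        picker.delegate = self
        present(picker, animated: true)
    }

    func imagePickerController(_ picker: UIImagePickerController, didFinishPickingMediaWithInfo info: [String: Any]) {
        // the captured picture is available here
        if let image = info[UIImagePickerControllerOriginalImage] as? UIImage {
            print("captured a picture of size \(image.size)")
        }
        picker.dismiss(animated: true)
    }
}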
Building a simple implementation of AVFoundation for live capture
To configure AVFoundation in an application, it is necessary to use an AVCaptureSession. This class owns the configuration needed to access the camera and use the input properly. It links the input configuration (camera, microphone), the output configuration (photo, video stream, audio stream) and the live preview.
The architecture of each class will look like the following:
AVCaptureDeviceInput ----feeds----> AVCaptureSession ----provides----> AVCaptureVideoDataOutput / AVCapturePhotoOutput / ...
                                                     ----provides----> AVCaptureVideoPreviewLayer
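Before going through each step, here is a sketch of the overall wiring order. The configureInput, configureOutput and configurePreview calls are hypothetical placeholders for the steps detailed in the next sections; wrapping them between beginConfiguration() and commitConfiguration() lets the session apply all changes at once:

self.captureSession = AVCaptureSession()

self.captureSession?.beginConfiguration()   // group all configuration changes
self.configureInput()                       // camera device -> AVCaptureDeviceInput
self.configureOutput()                      // AVCaptureVideoDataOutput + its delegate
self.captureSession?.commitConfiguration()  // apply everything at once

self.configurePreview()                     // AVCaptureVideoPreviewLayer on the view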
The whole example is based on the following class:
import AVFoundation
import UIKit

class AVFoundationImplementation: NSObject {

    var captureSession: AVCaptureSession?

    var rearCamera: AVCaptureDevice?
    var rearCameraInput: AVCaptureDeviceInput?

    var videoOutput: AVCaptureVideoDataOutput?
    var videoPreviewLayer: AVCaptureVideoPreviewLayer?

    var didOutputNewImage: ((UIImage) -> Void)?
}
Create the AVCaptureSession
An AVCaptureSession variable can be created with the default initializer.
self.captureSession = AVCaptureSession()
Configuring the Input
In our case, we will use the rear camera as input.
First, we need to get access to the rear camera device. Available devices can be listed with the DiscoverySession class of AVCaptureDevice.
let session = AVCaptureDevice.DiscoverySession(deviceTypes: [.builtInWideAngleCamera], mediaType: AVMediaType.video, position: .back)
self.rearCamera = session.devices.first
Then we can configure the camera as we want. Always lock the camera before updating its configuration; otherwise your application will not be considered the "owner" of the camera and an exception will be raised.
if let rearCamera = self.rearCamera {
    try? rearCamera.lockForConfiguration()
    rearCamera.focusMode = .autoFocus
    rearCamera.unlockForConfiguration()
}
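A slightly more defensive variant (still a sketch, reusing the rearCamera property from the class above) checks that the focus mode is actually supported and surfaces the locking error instead of ignoring it:

if let rearCamera = self.rearCamera {
    do {
        try rearCamera.lockForConfiguration()
        // only change the focus mode if the device supports it
        if rearCamera.isFocusModeSupported(.autoFocus) {
            rearCamera.focusMode = .autoFocus
        }
        rearCamera.unlockForConfiguration()
    } catch {
        print("Could not lock the rear camera for configuration: \(error)")
    }
}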
Finally, the camera is ready to be added to the AVCaptureSession created previously.
if let rearCamera = self.rearCamera {
    // we try to create the input from the found camera
    self.rearCameraInput = try? AVCaptureDeviceInput(device: rearCamera)

    if let rearCameraInput = self.rearCameraInput {
        // always make sure the AVCaptureSession can accept the selected input
        if captureSession?.canAddInput(rearCameraInput) == true {
            // add the input to the current session
            captureSession?.addInput(rearCameraInput)
        }
    }
}
Configuring the preview
Once you are done configuring the input, it is recommended to give the user visual feedback. The easiest way is to add a view displaying the current input. The only object you have to create is an AVCaptureVideoPreviewLayer that depends on the session you created.
if let captureSession = captureSession {
    // create the preview layer with the configuration you want
    self.videoPreviewLayer = AVCaptureVideoPreviewLayer(session: captureSession)
    self.videoPreviewLayer?.videoGravity = AVLayerVideoGravity.resizeAspectFill
    self.videoPreviewLayer?.connection?.videoOrientation = .portrait

    // then add the layer to your current view
    view.layer.insertSublayer(self.videoPreviewLayer!, at: 0)
    self.videoPreviewLayer?.frame = view.frame
}
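Note that the preview layer does not resize itself when the view changes size. A small sketch of a possible fix, assuming the layer is owned by a view controller, is to update its frame whenever the layout changes:

// keep the preview layer sized to its view when the layout changes
override func viewDidLayoutSubviews() {
    super.viewDidLayoutSubviews()
    self.videoPreviewLayer?.frame = view.bounds
}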
Configuring the output
Depending on what type of information you are interested in, you will have to create a different output object. In this example, we want to get each available frame from the camera. The AVCaptureVideoDataOutputSampleBufferDelegate protocol makes this possible through its captureOutput(_:didOutput:from:) method.
Be careful about the resource consumption of your application: this method will be called for every available frame.
extension AVFoundationImplementation: AVCaptureVideoDataOutputSampleBufferDelegate {

    func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) {
        // get the pixel buffer of the current frame and convert it to a UIImage
        guard let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) else { return }
        let ciImage = CIImage(cvPixelBuffer: imageBuffer)
        let context = CIContext()
        guard let cgImage = context.createCGImage(ciImage, from: ciImage.extent) else { return }
        let image = UIImage(cgImage: cgImage)

        // the final picture is here, we call the completion block
        self.didOutputNewImage?(image)
    }
}
Once the delegate is implemented, it is possible to link the corresponding output to the session.
self.videoOutput = AVCaptureVideoDataOutput()
self.videoOutput!.setSampleBufferDelegate(self, queue: DispatchQueue(label: "sample buffer"))

// always make sure the AVCaptureSession can accept the selected output
if captureSession?.canAddOutput(self.videoOutput!) == true {
    // add the output to the current session
    captureSession?.addOutput(self.videoOutput!)
}
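As mentioned above, the delegate is called for every available frame. If your processing cannot keep up, a simple mitigation (a sketch using the alwaysDiscardsLateVideoFrames property of AVCaptureVideoDataOutput) is to let the output drop the frames that arrive while the delegate is still busy:

// drop frames that arrive while the delegate is still processing the previous one
self.videoOutput?.alwaysDiscardsLateVideoFrames = true

Another option is to keep a frame counter in the delegate and only process every Nth buffer.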
Starting the session
Once you have configured your input and output, you can start the session. Since startRunning() is a blocking call that can take some time, you should always start the session on a background thread.
DispatchQueue.global(qos: .userInitiated).async {
    self.captureSession?.startRunning()
}
You can check the status of the session at any time with the isRunning property of captureSession. It is also recommended to stop the session when it is no longer needed.
// check the session status at any time
if self.captureSession?.isRunning == true {
    // if necessary, stop the session like this
    self.captureSession?.stopRunning()
}
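To put everything together, here is a hedged sketch of a possible caller. The configure(on:) and start() methods are hypothetical wrappers around the configuration and startRunning steps described above; only didOutputNewImage comes from the class as declared earlier:

import UIKit

class CameraViewController: UIViewController {

    let camera = AVFoundationImplementation()

    override func viewDidLoad() {
        super.viewDidLoad()

        // receive every processed frame as a UIImage
        camera.didOutputNewImage = { image in
            // run your own processing on each frame here
            print("received a frame of size \(image.size)")
        }

        camera.configure(on: self.view) // hypothetical: runs the input/preview/output steps
        camera.start()                  // hypothetical: startRunning() on a background queue
    }
}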
Conclusion
This sample code is a very simple solution to get all the available frames of a camera stream. We use this type of implementation in our iOS application Facelytics, don't hesitate to try it!