Last active September 11, 2020 09:09
OpenPose Caffe Model Convert to CoreML Model

Base reference: melito

Start coremltools

Set up coremltools beforehand, then activate its conda environment:

$ export PATH="$HOME/miniconda2/bin:$PATH"
$ source activate coreml

Edit pose_deploy_linevec.prototxt

Edit the last two input_dim values (height and width) in pose_deploy_linevec.prototxt.
Use a multiple of 16; 320 is used here.

input: "image"
input_dim: 1
input_dim: 3
input_dim: 320 # This value will be defined at runtime
input_dim: 320 # This value will be defined at runtime
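
The size constraint can be checked before editing the prototxt. The sketch below uses two hypothetical helpers (they are not part of coremltools or OpenPose): one rounds a requested size down to a multiple of 16, the other derives the heatmap side length. The stride of 8 is an assumption inferred from the 320 input and the 40 x 40 output used later in this memo.

```python
def snap_input_size(requested, multiple=16):
    """Round `requested` down to the nearest positive multiple of `multiple`."""
    if requested < multiple:
        raise ValueError("input size must be at least %d" % multiple)
    return (requested // multiple) * multiple


def heatmap_size(input_size, stride=8):
    """Heatmap side length, assuming the network downsamples by `stride`."""
    return input_size // stride


# Example: a requested size of 330 is snapped to 320, giving 40 x 40 heatmaps.
size = snap_input_size(330)
print(size, heatmap_size(size))
```

If a different input size is chosen, the hard-coded 320 and 40 in the Swift sample would need to change to match.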


$ python

================= Starting Conversion from Caffe to CoreML ======================
Layer 0: Type: 'Input', Name: 'input'. Output(s): 'image'.
Ignoring batch size and retaining only the trailing 3 dimensions for conversion. 
Layer 1: Type: 'Convolution', Name: 'conv1_1'. Input(s): 'image'. Output(s): 'conv1_1'.
Layer 2: Type: 'ReLU', Name: 'relu1_1'. Input(s): 'conv1_1'. Output(s): 'conv1_1'.
Layer 3: Type: 'Convolution', Name: 'conv1_2'. Input(s): 'conv1_1'. Output(s): 'conv1_2'.
Layer 4: Type: 'ReLU', Name: 'relu1_2'. Input(s): 'conv1_2'. Output(s): 'conv1_2'.


Layer 181: Type: 'Concat', Name: 'concat_stage7'. Input(s): 'Mconv7_stage6_L2', 'Mconv7_stage6_L1'. Output(s): 'net_output'.

================= Summary of the conversion: ===================================
Detected input(s) and shape(s) (ignoring batch size):
'image' : 3, 320, 320

Network Input name(s): 'image'.
Network Output name(s): 'net_output'.

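Conversion script

Run this at the python prompt started above; it produces the conversion log shown earlier.

import coremltools

proto_file = 'pose_deploy_linevec.prototxt'
caffe_model = 'pose_iter_440000.caffemodel'

# Convert the Caffe weights and network definition. 'image' is treated as an
# image input and its pixel values are scaled from 0..255 down to 0..1.
coreml_model = coremltools.converters.caffe.convert(
    (caffe_model, proto_file),
    image_input_names='image',
    image_scale=1/255.
)

# File name assumed from the pose_coco class used in the Swift sample.
coreml_model.save('pose_coco.mlmodel')

Swift Sample Memo

import UIKit
import CoreML

class ViewController: UIViewController {

    // Class generated by Xcode from the converted model
    let model = pose_coco()

    @IBOutlet weak var imageView1: UIImageView!
    @IBOutlet weak var imageView2: UIImageView!

    override func viewDidLoad() {
        super.viewDidLoad()
        // Do any additional setup after loading the view, typically from a nib.
        coremlTest()
    }

    public func coremlTest() {
        let image = UIImage(named: "hoge.jpg")!
        // pixelBuffer(width:height:), UIImage(pixelBuffer:) and image(offset:scale:)
        // come from helper extensions (for example MLMultiArray+Image.swift),
        // not from UIKit or CoreML.
        if let pixelBuffer = image.pixelBuffer(width: 320, height: 320) {
            imageView2.image = UIImage(pixelBuffer: pixelBuffer)
            if let prediction = try? model.prediction(image: pixelBuffer) {
                let p = prediction.net_output
                // Copy one 40 x 40 channel (index 18) out of the flat output
                let m = try! MLMultiArray(shape: [40, 40], dataType: .double)
                let n = 18 * m.count
                for i in 0..<m.count {
                    m[i] = p[i + n]
                }
                imageView1.image = m.image(offset: 0, scale: 255)
            }
        }
    }

    override func didReceiveMemoryWarning() {
        super.didReceiveMemoryWarning()
        // Dispose of any resources that can be recreated.
    }
}
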
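
The copy loop in coremlTest treats net_output as one flat buffer and uses n as the starting offset of a single channel. The Python sketch below shows that indexing on a tiny stand-in array. Both functions are hypothetical helpers, and the (value + offset) * scale mapping is only an assumption about what the image(offset:scale:) extension does.

```python
def extract_channel(flat, channel, height=40, width=40):
    """Return one channel of a flat C x H x W buffer as a list of rows."""
    plane = height * width
    start = channel * plane  # plays the same role as `n` in the Swift loop
    if start + plane > len(flat):
        raise IndexError("channel %d is out of range" % channel)
    return [flat[start + r * width: start + (r + 1) * width]
            for r in range(height)]


def to_gray(rows, offset=0.0, scale=255.0):
    """Map values to 0..255 integers via (value + offset) * scale, clamped."""
    return [[max(0, min(255, int((v + offset) * scale))) for v in row]
            for row in rows]


# Stand-in for net_output: 3 channels of 2 x 2 values each.
flat = [0.0, 0.1, 0.2, 0.3,
        0.5, 0.5, 0.5, 0.5,
        0.0, 1.0, 2.0, -1.0]
print(to_gray(extract_channel(flat, 2, height=2, width=2)))
```

With the real output, height and width would be 40 and the channel index 18, as in the Swift sample.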
MacKaSL commented Feb 23, 2018

How can I create my own .caffemodel and .prototxt?

I am getting the output as an MLMultiArray of shape 1 x 1 x 22 x 40 x 40. How can I convert it to a UIImage or cv::Mat?

Use of unresolved identifier 'MLMultiArray'

Value of type 'MLMultiArray' has no member 'image'

Now I am facing a crash at line 154 of MLMultiArray+Image.swift:
"let width = self.shape[widthAxis].intValue"

@vishaalkolhe90 did you succeed in converting the 1 x 1 x 22 x 40 x 40 MLMultiArray to a UIImage?
If yes, can you please share how you did it? I am also stuck on this.

akhzarna commented Apr 9, 2019

I am also getting the output as an MLMultiArray of shape 1 x 1 x 22 x 40 x 40. How can I convert it to a UIImage or cv::Mat?

AndrzejKRK commented Sep 11, 2020

Maybe this will be useful to somebody:

            let scale = 8 // in my case
            let heatmaps = prediction.net_output
            // net_output has shape 1 x 1 x channels x rows x columns
            let total_keypoint_number = heatmaps.shape[2].intValue
            let heatmap_h = heatmaps.shape[3].intValue
            let heatmap_w = heatmaps.shape[4].intValue
            for k in 0..<total_keypoint_number-1 {
                var max_confidence = 0.0
                var tmp_x = 0
                var tmp_y = 0
                for i in 0..<heatmap_h {
                    for j in 0..<heatmap_w {
                        let index = k*(heatmap_h*heatmap_w) + i*(heatmap_w) + j
                        let confidence = heatmaps[index].doubleValue
                        guard confidence > 0 else { continue }
                        if confidence > max_confidence {
                            max_confidence = confidence
                            // i is the row (y), j is the column (x)
                            tmp_y = i * scale
                            tmp_x = j * scale
                        }
                    }
                }
                // (tmp_x, tmp_y) is now the strongest location for keypoint k
            }
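
The same per-channel peak search can be tried outside Xcode. This is a minimal Python sketch, assuming the output has been flattened in channels x rows x columns order. The function name and its parameters are made up for illustration, and the last channel is skipped only to mirror the total_keypoint_number-1 bound in the Swift loop.

```python
def find_keypoints(flat, channels, height, width, scale=8, skip_last=True):
    """Return one (x, y, confidence) tuple per channel, or None if no value > 0."""
    plane = height * width
    count = channels - 1 if skip_last else channels
    keypoints = []
    for k in range(count):
        best, best_xy = 0.0, None
        for row in range(height):
            for col in range(width):
                confidence = flat[k * plane + row * width + col]
                if confidence > best:
                    # Scale heatmap coordinates back to input-image pixels
                    best, best_xy = confidence, (col * scale, row * scale)
        if best_xy is None:
            keypoints.append(None)
        else:
            keypoints.append((best_xy[0], best_xy[1], best))
    return keypoints


# Two 2 x 3 channels; the second one is skipped as the last channel.
flat = [0.0, 0.2, 0.0,
        0.0, 0.0, 0.9,
        0.5, 0.5, 0.5,
        0.5, 0.5, 0.5]
print(find_keypoints(flat, channels=2, height=2, width=3))
```

Returning None for channels with no positive value makes a missing keypoint distinguishable from a peak at (0, 0).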

