Repository: Ma-Dan/YOLOv3-CoreML Branch: master Commit: 6076a11bcaa1 Files: 21 Total size: 236.6 MB Directory structure: gitextract_w4gyb_qn/ ├── .gitignore ├── Convert/ │ └── coreml.py ├── LICENSE.txt ├── README.markdown ├── YOLOv3 CoreML model/ │ ├── Yolov3.zip.001 │ ├── Yolov3.zip.002 │ └── Yolov3.zip.003 └── YOLOv3-CoreML/ ├── YOLOv3-CoreML/ │ ├── AppDelegate.swift │ ├── Assets.xcassets/ │ │ └── AppIcon.appiconset/ │ │ └── Contents.json │ ├── Base.lproj/ │ │ └── Main.storyboard │ ├── BoundingBox.swift │ ├── CVPixelBuffer+Helpers.swift │ ├── Helpers.swift │ ├── Info.plist │ ├── UIImage+CVPixelBuffer.swift │ ├── VideoCapture.swift │ ├── ViewController.swift │ └── YOLO.swift └── YOLOv3-CoreML.xcodeproj/ ├── project.pbxproj └── project.xcworkspace/ ├── contents.xcworkspacedata └── xcshareddata/ └── IDEWorkspaceChecks.plist ================================================ FILE CONTENTS ================================================ ================================================ FILE: .gitignore ================================================ # Xcode build/ DerivedData/ *.pbxuser !default.pbxuser *.mode1v3 !default.mode1v3 *.mode2v3 !default.mode2v3 *.perspectivev3 !default.perspectivev3 *.xcuserstate xcuserdata/ ## Other *.moved-aside *.xccheckout *.xcscmblueprint profile *.hmap *.ipa # CocoaPods Pods/ !Podfile.lock # Temporary files .DS_Store .Trashes .Spotlight-V100 *.swp *.lock # Python __pycache__/ *.py[cod] *$py.class # Jupyter Notebook .ipynb_checkpoints ================================================ FILE: Convert/coreml.py ================================================ import coremltools coreml_model = coremltools.converters.keras.convert('./model_data/yolo.h5', input_names='input1', image_input_names='input1', output_names=['output1', 'output2', 'output3'], image_scale=1/255.) 
coreml_model.input_description['input1'] = 'Input image' coreml_model.output_description['output1'] = 'The 13x13 grid (Scale1)' coreml_model.output_description['output2'] = 'The 26x26 grid (Scale2)' coreml_model.output_description['output3'] = 'The 52x52 grid (Scale3)' coreml_model.author = 'Original paper: Joseph Redmon, Ali Farhadi' coreml_model.license = 'Public Domain' coreml_model.short_description = "The YOLOv3 network from the paper 'YOLOv3: An Incremental Improvement'" coreml_model.save('Yolov3.mlmodel') ================================================ FILE: LICENSE.txt ================================================ Copyright (c) 2017 M.I. Hollemans Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated documentation files (the "Software"), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. ================================================ FILE: README.markdown ================================================ # YOLOv3 with Core ML This repo was forked and modified from [hollance/YOLO-CoreML-MPSNNGraph](https://github.com/hollance/YOLO-CoreML-MPSNNGraph). Some changes I made: 1. Add YOLOv3 model. 
2. Keep only the Keras converter.

## About YOLO object detection

YOLO is an object detection network. It can detect multiple objects in an image and puts bounding boxes around them. [Read hollance's blog post about YOLO](http://machinethink.net/blog/object-detection-with-yolo/) to learn more about how it works.

![YOLO in action](YOLO.jpg)

In this repo you'll find:

- **YOLOv3-CoreML:** A demo app that runs the YOLOv3 neural network on Core ML.
- **Convert:** The script needed to convert the original Keras YOLOv3 model to Core ML.

To run the app:

1. Extract the YOLOv3 Core ML model from the archive in the `YOLOv3 CoreML model` folder and copy it to the `YOLOv3-CoreML/YOLOv3-CoreML` folder.
2. Open the **xcodeproj** file in Xcode 9 and run it on a device with iOS 11 or later installed.

The reported "elapsed" time is how long it takes the YOLO neural net to process a single image. The FPS is the actual throughput achieved by the app.

> **NOTE:** Running these kinds of neural networks eats up a lot of battery power. The app can put a limit on the number of times per second it runs the neural net. You can change this in `setUpCamera()` by changing the line `videoCapture.fps = 50` to a smaller number.

## Converting the models

> **NOTE:** You don't need to convert the model yourself. A converted model is already included in this repo; see the steps above for extracting it.

The model is converted from a Keras h5 model. Follow the Quick Start guide of [keras-yolo3](https://github.com/qqwweee/keras-yolo3) to get the YOLOv3 Keras h5 model, then use `Convert/coreml.py` to convert it to a Core ML model. The script reads the h5 file from `./model_data/yolo.h5` and writes `Yolov3.mlmodel`.
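
The converter names three outputs (`output1`, `output2`, `output3`) and describes them as the 13x13, 26x26 and 52x52 grids. As a rough sanity check of what the app expects from them, the sketch below recomputes the output sizes. It assumes a 416x416 input, 80 classes and 3 boxes per grid cell, as used in the demo's Swift code; the strides 16 and 8 are derived from the grid sizes rather than stated in the repo, and the function name `output_shapes` is made up for this example.

```python
# Assumed values, taken from the demo app's Swift code and the converter's
# output descriptions. They are not read from the model itself.
INPUT_SIZE = 416        # the network input is a 416x416 image
NUM_CLASSES = 80        # number of labels in Helpers.swift
BOXES_PER_CELL = 3      # boxes predicted per grid cell at each scale
STRIDES = [32, 16, 8]   # downsampling factors for output1, output2, output3


def output_shapes(input_size=INPUT_SIZE, num_classes=NUM_CLASSES,
                  boxes_per_cell=BOXES_PER_CELL, strides=STRIDES):
    """Return (channels, grid_height, grid_width) for each output scale."""
    # Each box carries x, y, width, height, a confidence score,
    # and one score per class.
    channels = (num_classes + 5) * boxes_per_cell
    return [(channels, input_size // s, input_size // s) for s in strides]


if __name__ == "__main__":
    names = ["output1", "output2", "output3"]
    for name, (c, h, w) in zip(names, output_shapes()):
        print(f"{name}: {c} x {h} x {w} = {c * h * w} values")
```

If the element counts of a freshly converted model differ from these, the assertions at the top of `computeBoundingBoxes` in `YOLO.swift` would fail, so this is a cheap check to run before copying a new model into the app.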
================================================ FILE: YOLOv3 CoreML model/Yolov3.zip.001 ================================================ [File too large to display: 80.0 MB] ================================================ FILE: YOLOv3 CoreML model/Yolov3.zip.002 ================================================ [File too large to display: 80.0 MB] ================================================ FILE: YOLOv3 CoreML model/Yolov3.zip.003 ================================================ [File too large to display: 76.5 MB] ================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML/AppDelegate.swift ================================================ import UIKit @UIApplicationMain class AppDelegate: UIResponder, UIApplicationDelegate { var window: UIWindow? func application(_ application: UIApplication, didFinishLaunchingWithOptions launchOptions: [UIApplicationLaunchOptionsKey: Any]?) -> Bool { // Override point for customization after application launch. return true } } ================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML/Assets.xcassets/AppIcon.appiconset/Contents.json ================================================ { "images" : [ { "idiom" : "iphone", "size" : "20x20", "scale" : "2x" }, { "idiom" : "iphone", "size" : "20x20", "scale" : "3x" }, { "idiom" : "iphone", "size" : "29x29", "scale" : "2x" }, { "idiom" : "iphone", "size" : "29x29", "scale" : "3x" }, { "idiom" : "iphone", "size" : "40x40", "scale" : "2x" }, { "idiom" : "iphone", "size" : "40x40", "scale" : "3x" }, { "idiom" : "iphone", "size" : "60x60", "scale" : "2x" }, { "idiom" : "iphone", "size" : "60x60", "scale" : "3x" }, { "idiom" : "ipad", "size" : "20x20", "scale" : "1x" }, { "idiom" : "ipad", "size" : "20x20", "scale" : "2x" }, { "idiom" : "ipad", "size" : "29x29", "scale" : "1x" }, { "idiom" : "ipad", "size" : "29x29", "scale" : "2x" }, { "idiom" : "ipad", "size" : "40x40", "scale" : "1x" }, { "idiom" : "ipad", "size" : "40x40", "scale" : "2x" 
}, { "idiom" : "ipad", "size" : "76x76", "scale" : "1x" }, { "idiom" : "ipad", "size" : "76x76", "scale" : "2x" }, { "idiom" : "ipad", "size" : "83.5x83.5", "scale" : "2x" }, { "idiom" : "ios-marketing", "size" : "1024x1024", "scale" : "1x" } ], "info" : { "version" : 1, "author" : "xcode" } } ================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML/Base.lproj/Main.storyboard ================================================ Menlo-Regular ================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML/BoundingBox.swift ================================================ import Foundation import UIKit class BoundingBox { let shapeLayer: CAShapeLayer let textLayer: CATextLayer init() { shapeLayer = CAShapeLayer() shapeLayer.fillColor = UIColor.clear.cgColor shapeLayer.lineWidth = 4 shapeLayer.isHidden = true textLayer = CATextLayer() textLayer.foregroundColor = UIColor.black.cgColor textLayer.isHidden = true textLayer.contentsScale = UIScreen.main.scale textLayer.fontSize = 14 textLayer.font = UIFont(name: "Avenir", size: textLayer.fontSize) textLayer.alignmentMode = kCAAlignmentCenter } func addToLayer(_ parent: CALayer) { parent.addSublayer(shapeLayer) parent.addSublayer(textLayer) } func show(frame: CGRect, label: String, color: UIColor) { CATransaction.setDisableActions(true) let path = UIBezierPath(rect: frame) shapeLayer.path = path.cgPath shapeLayer.strokeColor = color.cgColor shapeLayer.isHidden = false textLayer.string = label textLayer.backgroundColor = color.cgColor textLayer.isHidden = false let attributes = [ NSAttributedStringKey.font: textLayer.font as Any ] let textRect = label.boundingRect(with: CGSize(width: 400, height: 100), options: .truncatesLastVisibleLine, attributes: attributes, context: nil) let textSize = CGSize(width: textRect.width + 12, height: textRect.height) let textOrigin = CGPoint(x: frame.origin.x - 2, y: frame.origin.y - textSize.height) textLayer.frame = CGRect(origin: textOrigin, size: 
textSize) } func hide() { shapeLayer.isHidden = true textLayer.isHidden = true } } ================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML/CVPixelBuffer+Helpers.swift ================================================ import Foundation import Accelerate func resizePixelBuffer(_ srcPixelBuffer: CVPixelBuffer, cropX: Int, cropY: Int, cropWidth: Int, cropHeight: Int, scaleWidth: Int, scaleHeight: Int) -> CVPixelBuffer? { CVPixelBufferLockBaseAddress(srcPixelBuffer, CVPixelBufferLockFlags(rawValue: 0)) guard let srcData = CVPixelBufferGetBaseAddress(srcPixelBuffer) else { print("Error: could not get pixel buffer base address") return nil } let srcBytesPerRow = CVPixelBufferGetBytesPerRow(srcPixelBuffer) let offset = cropY*srcBytesPerRow + cropX*4 var srcBuffer = vImage_Buffer(data: srcData.advanced(by: offset), height: vImagePixelCount(cropHeight), width: vImagePixelCount(cropWidth), rowBytes: srcBytesPerRow) let destBytesPerRow = scaleWidth*4 guard let destData = malloc(scaleHeight*destBytesPerRow) else { print("Error: out of memory") return nil } var destBuffer = vImage_Buffer(data: destData, height: vImagePixelCount(scaleHeight), width: vImagePixelCount(scaleWidth), rowBytes: destBytesPerRow) let error = vImageScale_ARGB8888(&srcBuffer, &destBuffer, nil, vImage_Flags(0)) CVPixelBufferUnlockBaseAddress(srcPixelBuffer, CVPixelBufferLockFlags(rawValue: 0)) if error != kvImageNoError { print("Error:", error) free(destData) return nil } let releaseCallback: CVPixelBufferReleaseBytesCallback = { _, ptr in if let ptr = ptr { free(UnsafeMutableRawPointer(mutating: ptr)) } } let pixelFormat = CVPixelBufferGetPixelFormatType(srcPixelBuffer) var dstPixelBuffer: CVPixelBuffer? 
let status = CVPixelBufferCreateWithBytes(nil, scaleWidth, scaleHeight, pixelFormat, destData, destBytesPerRow, releaseCallback, nil, nil, &dstPixelBuffer) if status != kCVReturnSuccess { print("Error: could not create new pixel buffer") free(destData) return nil } return dstPixelBuffer } func resizePixelBuffer(_ pixelBuffer: CVPixelBuffer, width: Int, height: Int) -> CVPixelBuffer? { return resizePixelBuffer(pixelBuffer, cropX: 0, cropY: 0, cropWidth: CVPixelBufferGetWidth(pixelBuffer), cropHeight: CVPixelBufferGetHeight(pixelBuffer), scaleWidth: width, scaleHeight: height) } ================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML/Helpers.swift ================================================ import Foundation import UIKit import CoreML import Accelerate // The labels for the 80 classes. let labels = [ "person", "bicycle", "car", "motorbike", "aeroplane", "bus", "train", "truck", "boat", "traffic light", "fire hydrant", "stop sign", "parking meter", "bench", "bird", "cat", "dog", "horse", "sheep", "cow", "elephant", "bear", "zebra", "giraffe", "backpack", "umbrella", "handbag", "tie", "suitcase", "frisbee", "skis", "snowboard", "sports ball", "kite", "baseball bat", "baseball glove", "skateboard", "surfboard", "tennis racket", "bottle", "wine glass", "cup", "fork", "knife", "spoon", "bowl", "banana", "apple", "sandwich", "orange", "broccoli", "carrot", "hot dog", "pizza", "donut", "cake", "chair", "sofa", "pottedplant", "bed", "diningtable", "toilet", "tvmonitor", "laptop", "mouse", "remote", "keyboard", "cell phone", "microwave", "oven", "toaster", "sink", "refrigerator", "book", "clock", "vase", "scissors", "teddy bear", "hair drier", "toothbrush" ] let anchors: [[Float]] = [[116,90, 156,198, 373,326], [30,61, 62,45, 59,119], [10,13, 16,30, 33,23]] /** Removes bounding boxes that overlap too much with other boxes that have a higher score. 

  Based on code from https://github.com/tensorflow/tensorflow/blob/master/tensorflow/core/kernels/non_max_suppression_op.cc

  - Parameters:
    - boxes: an array of bounding boxes and their scores
    - limit: the maximum number of boxes that will be selected
    - threshold: used to decide whether boxes overlap too much
*/
func nonMaxSuppression(boxes: [YOLO.Prediction], limit: Int, threshold: Float) -> [YOLO.Prediction] {

  // Do an argsort on the confidence scores, from high to low.
  let sortedIndices = boxes.indices.sorted { boxes[$0].score > boxes[$1].score }

  var selected: [YOLO.Prediction] = []
  var active = [Bool](repeating: true, count: boxes.count)
  var numActive = active.count

  // The algorithm is simple: Start with the box that has the highest score.
  // Remove any remaining boxes that overlap it more than the given threshold
  // amount. If there are any boxes left (i.e. these did not overlap with any
  // previous boxes), then repeat this procedure, until no more boxes remain
  // or the limit has been reached.
  outer: for i in 0..<boxes.count {
    if active[i] {
      let boxA = boxes[sortedIndices[i]]
      selected.append(boxA)
      if selected.count >= limit { break }

      for j in i+1..<boxes.count {
        if active[j] {
          let boxB = boxes[sortedIndices[j]]
          if IOU(a: boxA.rect, b: boxB.rect) > threshold {
            active[j] = false
            numActive -= 1
            if numActive <= 0 { break outer }
          }
        }
      }
    }
  }
  return selected
}

/**
  Computes intersection-over-union overlap between two bounding boxes.
*/
public func IOU(a: CGRect, b: CGRect) -> Float {
  let areaA = a.width * a.height
  if areaA <= 0 { return 0 }

  let areaB = b.width * b.height
  if areaB <= 0 { return 0 }

  let intersectionMinX = max(a.minX, b.minX)
  let intersectionMinY = max(a.minY, b.minY)
  let intersectionMaxX = min(a.maxX, b.maxX)
  let intersectionMaxY = min(a.maxY, b.maxY)
  let intersectionArea = max(intersectionMaxY - intersectionMinY, 0) *
                         max(intersectionMaxX - intersectionMinX, 0)
  return Float(intersectionArea / (areaA + areaB - intersectionArea))
}

extension Array where Element: Comparable {
  /**
    Returns the index and value of the largest element in the array.
  */
  public func argmax() -> (Int, Element) {
    precondition(self.count > 0)
    var maxIndex = 0
    var maxValue = self[0]
    for i in 1..<self.count {
      if self[i] > maxValue {
        maxValue = self[i]
        maxIndex = i
      }
    }
    return (maxIndex, maxValue)
  }
}

/**
  Logistic sigmoid.
*/
public func sigmoid(_ x: Float) -> Float {
  return 1 / (1 + exp(-x))
}

/**
  Computes the "softmax" function over an array.

  Based on code from https://github.com/nikolaypavlov/MLPNeuralNet/

  This is what softmax looks like in "pseudocode" (actually using Python
  and numpy):

      x -= np.max(x)
      exp_scores = np.exp(x)
      softmax = exp_scores / np.sum(exp_scores)

  First we shift the values of x so that the highest value in the array is 0.
  This ensures numerical stability with the exponents, so they don't blow up.
*/
public func softmax(_ x: [Float]) -> [Float] {
  var x = x
  let len = vDSP_Length(x.count)

  // Find the maximum value in the input array.
  var max: Float = 0
  vDSP_maxv(x, 1, &max, len)

  // Subtract the maximum from all the elements in the array.
  // Now the highest value in the array is 0.
  max = -max
  vDSP_vsadd(x, 1, &max, &x, 1, len)

  // Exponentiate all the elements in the array.
  var count = Int32(x.count)
  vvexpf(&x, x, &count)

  // Compute the sum of all exponentiated values.
  var sum: Float = 0
  vDSP_sve(x, 1, &sum, len)

  // Divide each element by the sum. This normalizes the array contents
  // so that they all add up to 1.
  vDSP_vsdiv(x, 1, &sum, &x, 1, len)

  return x
}

================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML/Info.plist ================================================
CFBundleDevelopmentRegion $(DEVELOPMENT_LANGUAGE) CFBundleExecutable $(EXECUTABLE_NAME) CFBundleIdentifier $(PRODUCT_BUNDLE_IDENTIFIER) CFBundleInfoDictionaryVersion 6.0 CFBundleName $(PRODUCT_NAME) CFBundlePackageType APPL CFBundleShortVersionString 1.0 CFBundleVersion 1 LSRequiresIPhoneOS NSCameraUsageDescription Let's do some deep learning!
UILaunchStoryboardName Main UIMainStoryboardFile Main UIRequiredDeviceCapabilities armv7 UIRequiresFullScreen UIStatusBarStyle UIStatusBarStyleLightContent UISupportedInterfaceOrientations UIInterfaceOrientationPortrait UISupportedInterfaceOrientations~ipad UIInterfaceOrientationPortrait ================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML/UIImage+CVPixelBuffer.swift ================================================ import UIKit extension UIImage { public func pixelBuffer(width: Int, height: Int) -> CVPixelBuffer? { var maybePixelBuffer: CVPixelBuffer? let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue, kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue] let status = CVPixelBufferCreate(kCFAllocatorDefault, Int(width), Int(height), kCVPixelFormatType_32ARGB, attrs as CFDictionary, &maybePixelBuffer) guard status == kCVReturnSuccess, let pixelBuffer = maybePixelBuffer else { return nil } CVPixelBufferLockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0)) let pixelData = CVPixelBufferGetBaseAddress(pixelBuffer) guard let context = CGContext(data: pixelData, width: Int(width), height: Int(height), bitsPerComponent: 8, bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer), space: CGColorSpaceCreateDeviceRGB(), bitmapInfo: CGImageAlphaInfo.noneSkipFirst.rawValue) else { return nil } context.translateBy(x: 0, y: CGFloat(height)) context.scaleBy(x: 1, y: -1) UIGraphicsPushContext(context) self.draw(in: CGRect(x: 0, y: 0, width: width, height: height)) UIGraphicsPopContext() CVPixelBufferUnlockBaseAddress(pixelBuffer, CVPixelBufferLockFlags(rawValue: 0)) return pixelBuffer } } ================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML/VideoCapture.swift ================================================ import UIKit import AVFoundation import CoreVideo public protocol VideoCaptureDelegate: class { func videoCapture(_ capture: VideoCapture, didCaptureVideoFrame: CVPixelBuffer?, 
timestamp: CMTime) } public class VideoCapture: NSObject { public var previewLayer: AVCaptureVideoPreviewLayer? public weak var delegate: VideoCaptureDelegate? public var fps = 15 let captureSession = AVCaptureSession() let videoOutput = AVCaptureVideoDataOutput() let queue = DispatchQueue(label: "net.machinethink.camera-queue") var lastTimestamp = CMTime() public func setUp(sessionPreset: AVCaptureSession.Preset = .medium, completion: @escaping (Bool) -> Void) { queue.async { let success = self.setUpCamera(sessionPreset: sessionPreset) DispatchQueue.main.async { completion(success) } } } func setUpCamera(sessionPreset: AVCaptureSession.Preset) -> Bool { captureSession.beginConfiguration() captureSession.sessionPreset = sessionPreset guard let captureDevice = AVCaptureDevice.default(for: AVMediaType.video) else { print("Error: no video devices available") return false } guard let videoInput = try? AVCaptureDeviceInput(device: captureDevice) else { print("Error: could not create AVCaptureDeviceInput") return false } if captureSession.canAddInput(videoInput) { captureSession.addInput(videoInput) } let previewLayer = AVCaptureVideoPreviewLayer(session: captureSession) previewLayer.videoGravity = AVLayerVideoGravity.resizeAspect previewLayer.connection?.videoOrientation = .portrait self.previewLayer = previewLayer let settings: [String : Any] = [ kCVPixelBufferPixelFormatTypeKey as String: NSNumber(value: kCVPixelFormatType_32BGRA), ] videoOutput.videoSettings = settings videoOutput.alwaysDiscardsLateVideoFrames = true videoOutput.setSampleBufferDelegate(self, queue: queue) if captureSession.canAddOutput(videoOutput) { captureSession.addOutput(videoOutput) } // We want the buffers to be in portrait orientation otherwise they are // rotated by 90 degrees. Need to set this _after_ addOutput()! 
videoOutput.connection(with: AVMediaType.video)?.videoOrientation = .portrait captureSession.commitConfiguration() return true } public func start() { if !captureSession.isRunning { captureSession.startRunning() } } public func stop() { if captureSession.isRunning { captureSession.stopRunning() } } } extension VideoCapture: AVCaptureVideoDataOutputSampleBufferDelegate { public func captureOutput(_ output: AVCaptureOutput, didOutput sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) { // Because lowering the capture device's FPS looks ugly in the preview, // we capture at full speed but only call the delegate at its desired // framerate. let timestamp = CMSampleBufferGetPresentationTimeStamp(sampleBuffer) let deltaTime = timestamp - lastTimestamp if deltaTime >= CMTimeMake(1, Int32(fps)) { lastTimestamp = timestamp let imageBuffer = CMSampleBufferGetImageBuffer(sampleBuffer) delegate?.videoCapture(self, didCaptureVideoFrame: imageBuffer, timestamp: timestamp) } } public func captureOutput(_ output: AVCaptureOutput, didDrop sampleBuffer: CMSampleBuffer, from connection: AVCaptureConnection) { //print("dropped frame") } } ================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML/ViewController.swift ================================================ import UIKit import Vision import AVFoundation import CoreMedia import VideoToolbox class ViewController: UIViewController { @IBOutlet weak var videoPreview: UIView! @IBOutlet weak var timeLabel: UILabel! @IBOutlet weak var debugImageView: UIImageView! let yolo = YOLO() var videoCapture: VideoCapture! var request: VNCoreMLRequest! var startTimes: [CFTimeInterval] = [] var boundingBoxes = [BoundingBox]() var colors: [UIColor] = [] let ciContext = CIContext() var resizedPixelBuffer: CVPixelBuffer? 
var framesDone = 0 var frameCapturingStartTime = CACurrentMediaTime() let semaphore = DispatchSemaphore(value: 2) override func viewDidLoad() { super.viewDidLoad() timeLabel.text = "" setUpBoundingBoxes() setUpCoreImage() setUpVision() setUpCamera() frameCapturingStartTime = CACurrentMediaTime() } override func didReceiveMemoryWarning() { super.didReceiveMemoryWarning() print(#function) } // MARK: - Initialization func setUpBoundingBoxes() { for _ in 0.. Double { // Measure how many frames were actually delivered per second. framesDone += 1 let frameCapturingElapsed = CACurrentMediaTime() - frameCapturingStartTime let currentFPSDelivered = Double(framesDone) / frameCapturingElapsed if frameCapturingElapsed > 1 { framesDone = 0 frameCapturingStartTime = CACurrentMediaTime() } return currentFPSDelivered } func show(predictions: [YOLO.Prediction]) { for i in 0.. [Prediction] { if let output = try? model.prediction(input1: image) { return computeBoundingBoxes(features: [output.output1, output.output2, output.output3]) } else { return [] } } public func computeBoundingBoxes(features: [MLMultiArray]) -> [Prediction] { assert(features[0].count == 255*13*13) assert(features[1].count == 255*26*26) assert(features[2].count == 255*52*52) var predictions = [Prediction]() let blockSize: Float = 32 let boxesPerCell = 3 let numClasses = 80 // The 416x416 image is divided into three grids: 13x13, 26x26 and 52x52. // Each grid cell predicts 3 bounding boxes (boxesPerCell). A bounding box // consists of five data items: x, y, width, height, and a confidence score. // Each grid cell also predicts which class each bounding box belongs to. // // Each "features" array therefore contains (numClasses + 5)*boxesPerCell // values for each grid cell, i.e. 255 channels. The three features arrays // contain 255x13x13, 255x26x26 and 255x52x52 elements. // NOTE: It turns out that accessing the elements in the multi-array as // `features[[channel, cy, cx] as [NSNumber]].floatValue` is kinda slow. 
// It's much faster to use direct memory access to the features. var gridHeight = [13, 26, 52] var gridWidth = [13, 26, 52] var featurePointer = UnsafeMutablePointer(OpaquePointer(features[0].dataPointer)) var channelStride = features[0].strides[0].intValue var yStride = features[0].strides[1].intValue var xStride = features[0].strides[2].intValue func offset(_ channel: Int, _ x: Int, _ y: Int) -> Int { return channel*channelStride + y*yStride + x*xStride } for i in 0..<3 { featurePointer = UnsafeMutablePointer(OpaquePointer(features[i].dataPointer)) channelStride = features[i].strides[0].intValue yStride = features[i].strides[1].intValue xStride = features[i].strides[2].intValue for cy in 0.. confidenceThreshold { let rect = CGRect(x: CGFloat(x - w/2), y: CGFloat(y - h/2), width: CGFloat(w), height: CGFloat(h)) let prediction = Prediction(classIndex: detectedClass, score: confidenceInClass, rect: rect) predictions.append(prediction) } } } } } // We already filtered out any bounding boxes that have very low scores, // but there still may be boxes that overlap too much with others. We'll // use "non-maximum suppression" to prune those duplicate bounding boxes. return nonMaxSuppression(boxes: predictions, limit: YOLO.maxBoundingBoxes, threshold: iouThreshold) } } ================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML.xcodeproj/project.pbxproj ================================================ // !$*UTF8*$! 
{ archiveVersion = 1; classes = { }; objectVersion = 48; objects = { /* Begin PBXBuildFile section */ 372CB262209C9B0F00F501E3 /* AppDelegate.swift in Sources */ = {isa = PBXBuildFile; fileRef = 372CB261209C9B0F00F501E3 /* AppDelegate.swift */; }; 372CB265209C9B2600F501E3 /* BoundingBox.swift in Sources */ = {isa = PBXBuildFile; fileRef = 372CB263209C9B2600F501E3 /* BoundingBox.swift */; }; 372CB266209C9B2600F501E3 /* Helpers.swift in Sources */ = {isa = PBXBuildFile; fileRef = 372CB264209C9B2600F501E3 /* Helpers.swift */; }; 7BA1C6D01EF27DA000BB25EF /* VideoCapture.swift in Sources */ = {isa = PBXBuildFile; fileRef = 7BC25FB51EF27C0D002ECBBA /* VideoCapture.swift */; }; 7BA1C6D21EF2827800BB25EF /* UIImage+CVPixelBuffer.swift in Sources */ = {isa = PBXBuildFile; fileRef = 7BA1C6D11EF2827500BB25EF /* UIImage+CVPixelBuffer.swift */; }; 7BA1C6D81EF2871600BB25EF /* YOLO.swift in Sources */ = {isa = PBXBuildFile; fileRef = 7BA1C6D71EF2871600BB25EF /* YOLO.swift */; }; 7BA1C6DA1EF2B30200BB25EF /* CVPixelBuffer+Helpers.swift in Sources */ = {isa = PBXBuildFile; fileRef = 7BA1C6D91EF2B30200BB25EF /* CVPixelBuffer+Helpers.swift */; }; 7BC25FA41EF1B7D1002ECBBA /* ViewController.swift in Sources */ = {isa = PBXBuildFile; fileRef = 7BC25FA31EF1B7D1002ECBBA /* ViewController.swift */; }; 7BC25FA71EF1B7D1002ECBBA /* Main.storyboard in Resources */ = {isa = PBXBuildFile; fileRef = 7BC25FA51EF1B7D1002ECBBA /* Main.storyboard */; }; 7BC25FA91EF1B7D1002ECBBA /* Assets.xcassets in Resources */ = {isa = PBXBuildFile; fileRef = 7BC25FA81EF1B7D1002ECBBA /* Assets.xcassets */; }; 7BC25FB41EF1CEB0002ECBBA /* YOLOv3.mlmodel in Sources */ = {isa = PBXBuildFile; fileRef = 7BC25FB31EF1CEAE002ECBBA /* YOLOv3.mlmodel */; }; /* End PBXBuildFile section */ /* Begin PBXFileReference section */ 372CB261209C9B0F00F501E3 /* AppDelegate.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = AppDelegate.swift; sourceTree = ""; }; 372CB263209C9B2600F501E3 /* 
BoundingBox.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = BoundingBox.swift; sourceTree = ""; }; 372CB264209C9B2600F501E3 /* Helpers.swift */ = {isa = PBXFileReference; fileEncoding = 4; lastKnownFileType = sourcecode.swift; path = Helpers.swift; sourceTree = ""; }; 7BA1C6D11EF2827500BB25EF /* UIImage+CVPixelBuffer.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = "UIImage+CVPixelBuffer.swift"; sourceTree = ""; }; 7BA1C6D71EF2871600BB25EF /* YOLO.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = YOLO.swift; sourceTree = ""; }; 7BA1C6D91EF2B30200BB25EF /* CVPixelBuffer+Helpers.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = "CVPixelBuffer+Helpers.swift"; sourceTree = ""; }; 7BC25F9E1EF1B7D1002ECBBA /* YOLOv3-CoreML.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = "YOLOv3-CoreML.app"; sourceTree = BUILT_PRODUCTS_DIR; }; 7BC25FA31EF1B7D1002ECBBA /* ViewController.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = ViewController.swift; sourceTree = ""; }; 7BC25FA61EF1B7D1002ECBBA /* Base */ = {isa = PBXFileReference; lastKnownFileType = file.storyboard; name = Base; path = Base.lproj/Main.storyboard; sourceTree = ""; }; 7BC25FA81EF1B7D1002ECBBA /* Assets.xcassets */ = {isa = PBXFileReference; lastKnownFileType = folder.assetcatalog; path = Assets.xcassets; sourceTree = ""; }; 7BC25FAD1EF1B7D1002ECBBA /* Info.plist */ = {isa = PBXFileReference; lastKnownFileType = text.plist.xml; path = Info.plist; sourceTree = ""; }; 7BC25FB31EF1CEAE002ECBBA /* YOLOv3.mlmodel */ = {isa = PBXFileReference; lastKnownFileType = file.mlmodel; path = YOLOv3.mlmodel; sourceTree = ""; }; 7BC25FB51EF27C0D002ECBBA /* VideoCapture.swift */ = {isa = PBXFileReference; lastKnownFileType = sourcecode.swift; path = VideoCapture.swift; sourceTree = ""; }; /* End 
PBXFileReference section */ /* Begin PBXFrameworksBuildPhase section */ 7BC25F9B1EF1B7D1002ECBBA /* Frameworks */ = { isa = PBXFrameworksBuildPhase; buildActionMask = 2147483647; files = ( ); runOnlyForDeploymentPostprocessing = 0; }; /* End PBXFrameworksBuildPhase section */ /* Begin PBXGroup section */ 7BC25F951EF1B7D1002ECBBA = { isa = PBXGroup; children = ( 7BC25FA01EF1B7D1002ECBBA /* YOLOv3-CoreML */, 7BC25F9F1EF1B7D1002ECBBA /* Products */, ); sourceTree = ""; }; 7BC25F9F1EF1B7D1002ECBBA /* Products */ = { isa = PBXGroup; children = ( 7BC25F9E1EF1B7D1002ECBBA /* YOLOv3-CoreML.app */, ); name = Products; sourceTree = ""; }; 7BC25FA01EF1B7D1002ECBBA /* YOLOv3-CoreML */ = { isa = PBXGroup; children = ( 372CB263209C9B2600F501E3 /* BoundingBox.swift */, 372CB264209C9B2600F501E3 /* Helpers.swift */, 372CB261209C9B0F00F501E3 /* AppDelegate.swift */, 7BC25FA81EF1B7D1002ECBBA /* Assets.xcassets */, 7BA1C6D91EF2B30200BB25EF /* CVPixelBuffer+Helpers.swift */, 7BC25FAD1EF1B7D1002ECBBA /* Info.plist */, 7BC25FA51EF1B7D1002ECBBA /* Main.storyboard */, 7BC25FB31EF1CEAE002ECBBA /* YOLOv3.mlmodel */, 7BA1C6D11EF2827500BB25EF /* UIImage+CVPixelBuffer.swift */, 7BC25FB51EF27C0D002ECBBA /* VideoCapture.swift */, 7BC25FA31EF1B7D1002ECBBA /* ViewController.swift */, 7BA1C6D71EF2871600BB25EF /* YOLO.swift */, ); path = "YOLOv3-CoreML"; sourceTree = ""; }; /* End PBXGroup section */ /* Begin PBXNativeTarget section */ 7BC25F9D1EF1B7D1002ECBBA /* YOLOv3-CoreML */ = { isa = PBXNativeTarget; buildConfigurationList = 7BC25FB01EF1B7D1002ECBBA /* Build configuration list for PBXNativeTarget "YOLOv3-CoreML" */; buildPhases = ( 7BC25F9A1EF1B7D1002ECBBA /* Sources */, 7BC25F9B1EF1B7D1002ECBBA /* Frameworks */, 7BC25F9C1EF1B7D1002ECBBA /* Resources */, ); buildRules = ( ); dependencies = ( ); name = "YOLOv3-CoreML"; productName = "YOLOv3-CoreML"; productReference = 7BC25F9E1EF1B7D1002ECBBA /* YOLOv3-CoreML.app */; productType = "com.apple.product-type.application"; }; /* End PBXNativeTarget 
section */ /* Begin PBXProject section */ 7BC25F961EF1B7D1002ECBBA /* Project object */ = { isa = PBXProject; attributes = { LastSwiftUpdateCheck = 0900; LastUpgradeCheck = 0900; ORGANIZATIONNAME = MachineThink; TargetAttributes = { 7BC25F9D1EF1B7D1002ECBBA = { CreatedOnToolsVersion = 9.0; ProvisioningStyle = Automatic; }; }; }; buildConfigurationList = 7BC25F991EF1B7D1002ECBBA /* Build configuration list for PBXProject "YOLOv3-CoreML" */; compatibilityVersion = "Xcode 8.0"; developmentRegion = en; hasScannedForEncodings = 0; knownRegions = ( en, Base, ); mainGroup = 7BC25F951EF1B7D1002ECBBA; productRefGroup = 7BC25F9F1EF1B7D1002ECBBA /* Products */; projectDirPath = ""; projectRoot = ""; targets = ( 7BC25F9D1EF1B7D1002ECBBA /* YOLOv3-CoreML */, ); }; /* End PBXProject section */ /* Begin PBXResourcesBuildPhase section */ 7BC25F9C1EF1B7D1002ECBBA /* Resources */ = { isa = PBXResourcesBuildPhase; buildActionMask = 2147483647; files = ( 7BC25FA91EF1B7D1002ECBBA /* Assets.xcassets in Resources */, 7BC25FA71EF1B7D1002ECBBA /* Main.storyboard in Resources */, ); runOnlyForDeploymentPostprocessing = 0; }; /* End PBXResourcesBuildPhase section */ /* Begin PBXSourcesBuildPhase section */ 7BC25F9A1EF1B7D1002ECBBA /* Sources */ = { isa = PBXSourcesBuildPhase; buildActionMask = 2147483647; files = ( 7BA1C6D01EF27DA000BB25EF /* VideoCapture.swift in Sources */, 7BA1C6DA1EF2B30200BB25EF /* CVPixelBuffer+Helpers.swift in Sources */, 7BC25FB41EF1CEB0002ECBBA /* YOLOv3.mlmodel in Sources */, 372CB265209C9B2600F501E3 /* BoundingBox.swift in Sources */, 372CB266209C9B2600F501E3 /* Helpers.swift in Sources */, 372CB262209C9B0F00F501E3 /* AppDelegate.swift in Sources */, 7BA1C6D21EF2827800BB25EF /* UIImage+CVPixelBuffer.swift in Sources */, 7BC25FA41EF1B7D1002ECBBA /* ViewController.swift in Sources */, 7BA1C6D81EF2871600BB25EF /* YOLO.swift in Sources */, ); runOnlyForDeploymentPostprocessing = 0; }; /* End PBXSourcesBuildPhase section */ /* Begin PBXVariantGroup section */ 
7BC25FA51EF1B7D1002ECBBA /* Main.storyboard */ = { isa = PBXVariantGroup; children = ( 7BC25FA61EF1B7D1002ECBBA /* Base */, ); name = Main.storyboard; sourceTree = "<group>"; }; /* End PBXVariantGroup section */ /* Begin XCBuildConfiguration section */ 7BC25FAE1EF1B7D1002ECBBA /* Debug */ = { isa = XCBuildConfiguration; buildSettings = { ALWAYS_SEARCH_USER_PATHS = NO; CLANG_ANALYZER_NONNULL = YES; CLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE; CLANG_CXX_LANGUAGE_STANDARD = "gnu++14"; CLANG_CXX_LIBRARY = "libc++"; CLANG_ENABLE_MODULES = YES; CLANG_ENABLE_OBJC_ARC = YES; CLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES; CLANG_WARN_BOOL_CONVERSION = YES; CLANG_WARN_COMMA = YES; CLANG_WARN_CONSTANT_CONVERSION = YES; CLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR; CLANG_WARN_DOCUMENTATION_COMMENTS = YES; CLANG_WARN_EMPTY_BODY = YES; CLANG_WARN_ENUM_CONVERSION = YES; CLANG_WARN_INFINITE_RECURSION = YES; CLANG_WARN_INT_CONVERSION = YES; CLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR; CLANG_WARN_RANGE_LOOP_ANALYSIS = YES; CLANG_WARN_STRICT_PROTOTYPES = YES; CLANG_WARN_SUSPICIOUS_MOVE = YES; CLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE; CLANG_WARN_UNREACHABLE_CODE = YES; CLANG_WARN__DUPLICATE_METHOD_MATCH = YES; CODE_SIGN_IDENTITY = "iPhone Developer"; COPY_PHASE_STRIP = NO; DEBUG_INFORMATION_FORMAT = dwarf; ENABLE_STRICT_OBJC_MSGSEND = YES; ENABLE_TESTABILITY = YES; GCC_C_LANGUAGE_STANDARD = gnu11; GCC_DYNAMIC_NO_PIC = NO; GCC_NO_COMMON_BLOCKS = YES; GCC_OPTIMIZATION_LEVEL = 0; GCC_PREPROCESSOR_DEFINITIONS = ( "DEBUG=1", "$(inherited)", ); GCC_WARN_64_TO_32_BIT_CONVERSION = YES; GCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR; GCC_WARN_UNDECLARED_SELECTOR = YES; GCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE; GCC_WARN_UNUSED_FUNCTION = YES; GCC_WARN_UNUSED_VARIABLE = YES; IPHONEOS_DEPLOYMENT_TARGET = 11.0; MTL_ENABLE_DEBUG_INFO = YES; ONLY_ACTIVE_ARCH = YES; SDKROOT = iphoneos; SWIFT_ACTIVE_COMPILATION_CONDITIONS = DEBUG; SWIFT_OPTIMIZATION_LEVEL = "-Onone"; }; name = Debug; }; 
7BC25FAF1EF1B7D1002ECBBA /* Release */ = { isa = XCBuildConfiguration; buildSettings = { ALWAYS_SEARCH_USER_PATHS = NO; CLANG_ANALYZER_NONNULL = YES; CLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE; CLANG_CXX_LANGUAGE_STANDARD = "gnu++14"; CLANG_CXX_LIBRARY = "libc++"; CLANG_ENABLE_MODULES = YES; CLANG_ENABLE_OBJC_ARC = YES; CLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES; CLANG_WARN_BOOL_CONVERSION = YES; CLANG_WARN_COMMA = YES; CLANG_WARN_CONSTANT_CONVERSION = YES; CLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR; CLANG_WARN_DOCUMENTATION_COMMENTS = YES; CLANG_WARN_EMPTY_BODY = YES; CLANG_WARN_ENUM_CONVERSION = YES; CLANG_WARN_INFINITE_RECURSION = YES; CLANG_WARN_INT_CONVERSION = YES; CLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR; CLANG_WARN_RANGE_LOOP_ANALYSIS = YES; CLANG_WARN_STRICT_PROTOTYPES = YES; CLANG_WARN_SUSPICIOUS_MOVE = YES; CLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE; CLANG_WARN_UNREACHABLE_CODE = YES; CLANG_WARN__DUPLICATE_METHOD_MATCH = YES; CODE_SIGN_IDENTITY = "iPhone Developer"; COPY_PHASE_STRIP = NO; DEBUG_INFORMATION_FORMAT = "dwarf-with-dsym"; ENABLE_NS_ASSERTIONS = NO; ENABLE_STRICT_OBJC_MSGSEND = YES; GCC_C_LANGUAGE_STANDARD = gnu11; GCC_NO_COMMON_BLOCKS = YES; GCC_WARN_64_TO_32_BIT_CONVERSION = YES; GCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR; GCC_WARN_UNDECLARED_SELECTOR = YES; GCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE; GCC_WARN_UNUSED_FUNCTION = YES; GCC_WARN_UNUSED_VARIABLE = YES; IPHONEOS_DEPLOYMENT_TARGET = 11.0; MTL_ENABLE_DEBUG_INFO = NO; SDKROOT = iphoneos; SWIFT_OPTIMIZATION_LEVEL = "-Owholemodule"; VALIDATE_PRODUCT = YES; }; name = Release; }; 7BC25FB11EF1B7D1002ECBBA /* Debug */ = { isa = XCBuildConfiguration; buildSettings = { ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon; CODE_SIGN_IDENTITY = "iPhone Developer: 1067837450@qq.com (EA34W5QFTF)"; "CODE_SIGN_IDENTITY[sdk=iphoneos*]" = "iPhone Developer"; CODE_SIGN_STYLE = Automatic; DEVELOPMENT_TEAM = LXGRKDMEQU; GCC_OPTIMIZATION_LEVEL = s; INFOPLIST_FILE = 
"YOLOv3-CoreML/Info.plist"; LD_RUNPATH_SEARCH_PATHS = "$(inherited) @executable_path/Frameworks"; PRODUCT_BUNDLE_IDENTIFIER = "net.machinethink.YOLOv3-CoreML"; PRODUCT_NAME = "$(TARGET_NAME)"; PROVISIONING_PROFILE_SPECIFIER = ""; SWIFT_OPTIMIZATION_LEVEL = "-Owholemodule"; SWIFT_VERSION = 4.0; TARGETED_DEVICE_FAMILY = "1,2"; }; name = Debug; }; 7BC25FB21EF1B7D1002ECBBA /* Release */ = { isa = XCBuildConfiguration; buildSettings = { ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon; CODE_SIGN_IDENTITY = "iPhone Developer: 1067837450@qq.com (EA34W5QFTF)"; "CODE_SIGN_IDENTITY[sdk=iphoneos*]" = "iPhone Developer"; CODE_SIGN_STYLE = Automatic; DEVELOPMENT_TEAM = LXGRKDMEQU; INFOPLIST_FILE = "YOLOv3-CoreML/Info.plist"; LD_RUNPATH_SEARCH_PATHS = "$(inherited) @executable_path/Frameworks"; PRODUCT_BUNDLE_IDENTIFIER = "net.machinethink.YOLOv3-CoreML"; PRODUCT_NAME = "$(TARGET_NAME)"; PROVISIONING_PROFILE_SPECIFIER = ""; SWIFT_VERSION = 4.0; TARGETED_DEVICE_FAMILY = "1,2"; }; name = Release; }; /* End XCBuildConfiguration section */ /* Begin XCConfigurationList section */ 7BC25F991EF1B7D1002ECBBA /* Build configuration list for PBXProject "YOLOv3-CoreML" */ = { isa = XCConfigurationList; buildConfigurations = ( 7BC25FAE1EF1B7D1002ECBBA /* Debug */, 7BC25FAF1EF1B7D1002ECBBA /* Release */, ); defaultConfigurationIsVisible = 0; defaultConfigurationName = Release; }; 7BC25FB01EF1B7D1002ECBBA /* Build configuration list for PBXNativeTarget "YOLOv3-CoreML" */ = { isa = XCConfigurationList; buildConfigurations = ( 7BC25FB11EF1B7D1002ECBBA /* Debug */, 7BC25FB21EF1B7D1002ECBBA /* Release */, ); defaultConfigurationIsVisible = 0; defaultConfigurationName = Release; }; /* End XCConfigurationList section */ }; rootObject = 7BC25F961EF1B7D1002ECBBA /* Project object */; } ================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML.xcodeproj/project.xcworkspace/contents.xcworkspacedata ================================================ 
================================================ FILE: YOLOv3-CoreML/YOLOv3-CoreML.xcodeproj/project.xcworkspace/xcshareddata/IDEWorkspaceChecks.plist ================================================ IDEDidComputeMac32BitWarning