Repository: huggingface/sam2-studio
Branch: main
Commit: 0dd708d0162e
Files: 38
Total size: 107.3 KB
Directory structure:
sam2-studio/
├── .gitignore
├── LICENSE
├── README.md
├── SAM2-Demo/
│   ├── Assets.xcassets/
│   │   ├── AccentColor.colorset/
│   │   │   └── Contents.json
│   │   ├── AppIcon.appiconset/
│   │   │   └── Contents.json
│   │   └── Contents.json
│   ├── Common/
│   │   ├── CGImage+Extension.swift
│   │   ├── CGImage+RawBytes.swift
│   │   ├── Color+Extension.swift
│   │   ├── CoreImageExtensions.swift
│   │   ├── DirectoryDocument.swift
│   │   ├── MLMultiArray+Image.swift
│   │   ├── Models.swift
│   │   ├── NSImage+Extension.swift
│   │   └── SAM2.swift
│   ├── ContentView.swift
│   ├── Preview Content/
│   │   └── Preview Assets.xcassets/
│   │       └── Contents.json
│   ├── Ripple/
│   │   ├── Ripple.metal
│   │   ├── RippleModifier.swift
│   │   └── RippleViewModifier.swift
│   ├── SAM2_1SmallImageEncoderFLOAT16.mlpackage/
│   │   ├── Data/
│   │   │   └── com.apple.CoreML/
│   │   │       └── model.mlmodel
│   │   └── Manifest.json
│   ├── SAM2_1SmallMaskDecoderFLOAT16.mlpackage/
│   │   ├── Data/
│   │   │   └── com.apple.CoreML/
│   │   │       └── model.mlmodel
│   │   └── Manifest.json
│   ├── SAM2_1SmallPromptEncoderFLOAT16.mlpackage/
│   │   ├── Data/
│   │   │   └── com.apple.CoreML/
│   │   │       └── model.mlmodel
│   │   └── Manifest.json
│   ├── SAM2_Demo.entitlements
│   ├── SAM2_DemoApp.swift
│   └── Views/
│       ├── AnnotationListView.swift
│       ├── ImageView.swift
│       ├── LayerListView.swift
│       ├── MaskEditor.swift
│       ├── SubtoolbarView.swift
│       └── ZoomableScrollView.swift
├── SAM2-Demo.xcodeproj/
│   ├── project.pbxproj
│   └── project.xcworkspace/
│       ├── contents.xcworkspacedata
│       └── xcshareddata/
│           └── swiftpm/
│               └── Package.resolved
└── sam2-cli/
    └── MainCommand.swift
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
.DS_Store
xcuserdata/
================================================
FILE: LICENSE
================================================
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/
TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
1. Definitions.
"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.
"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.
"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.
"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.
"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.
"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).
"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.
"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."
"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.
2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.
3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.
4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:
(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and
(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and
(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and
(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.
You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.
5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.
6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.
7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.
8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.
9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.
END OF TERMS AND CONDITIONS
APPENDIX: How to apply the Apache License to your work.
To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.
Copyright 2022 Hugging Face SAS.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
================================================
FILE: README.md
================================================
# SAM2 Studio
This is a Swift demo app for SAM 2 Core ML models.

SAM 2 (Segment Anything in Images and Videos) is a collection of foundation models from FAIR that aims to solve promptable visual segmentation in images and videos. See the [SAM 2 paper](https://arxiv.org/abs/2408.00714) for more information.
## Quick Start ⚡️
Download the compiled version [here!](https://huggingface.co/coreml-projects/sam-2-studio).
## How to Use
If you prefer to compile it yourself, or want to use a larger model, clone the repo, build it with Xcode, and run.
The app ships with the Small version of the model, but you can replace it with any of the supported models:
- [SAM 2.1 Tiny](https://huggingface.co/apple/coreml-sam2.1-tiny)
- [SAM 2.1 Small](https://huggingface.co/apple/coreml-sam2.1-small)
- [SAM 2.1 Base](https://huggingface.co/apple/coreml-sam2.1-baseplus)
- [SAM 2.1 Large](https://huggingface.co/apple/coreml-sam2.1-large)
For the older models, please check out the [Apple](https://huggingface.co/apple) organization on Hugging Face.
This demo currently supports images; video support is coming later.
### Selecting Objects
- You can select one or more _foreground_ points to choose objects in the image. Each additional point is interpreted as a _refinement_ of the previous mask.
- Use a _background_ point to indicate an area to be removed from the current mask.
- You can use a _box_ to select an approximate area that contains the object you're interested in.
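
Conceptually, each click or box becomes a prompt for the mask decoder. The sketch below is illustrative only (`PromptPoint` and `PromptBox` are hypothetical names, not the app's actual API, which lives in `Models.swift` and `SAM2.swift`):

```swift
import CoreGraphics

// Hypothetical sketch: how UI gestures become SAM 2 prompts.
struct PromptPoint {
    let location: CGPoint   // normalized [0, 1] image coordinates
    let isForeground: Bool  // false = background point, removes area from the mask
}

// A box prompt is just two corners in the same normalized space.
struct PromptBox {
    let topLeft: CGPoint
    let bottomRight: CGPoint
}

// All accumulated points are sent together on each refinement click, so a new
// point refines the previous mask rather than starting a fresh selection.
var prompts: [PromptPoint] = []
prompts.append(PromptPoint(location: CGPoint(x: 0.5, y: 0.4), isForeground: true))
prompts.append(PromptPoint(location: CGPoint(x: 0.7, y: 0.8), isForeground: false))
```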
## Converting Models
If you want to use a fine-tuned model, you can convert it using [this fork of the SAM 2 repo](https://github.com/huggingface/segment-anything-2/tree/coreml-conversion). Please let us know what you use it for!
## Feedback and Contributions
Feedback, issues, and PRs are welcome! Please feel free to [get in touch](https://github.com/huggingface/sam2-swiftui/issues/new).
## Citation
To cite the SAM 2 paper, model, or software, please use the BibTeX entry below:
```
@article{ravi2024sam2,
  title={SAM 2: Segment Anything in Images and Videos},
  author={Ravi, Nikhila and Gabeur, Valentin and Hu, Yuan-Ting and Hu, Ronghang and Ryali, Chaitanya and Ma, Tengyu and Khedr, Haitham and R{\"a}dle, Roman and Rolland, Chloe and Gustafson, Laura and Mintun, Eric and Pan, Junting and Alwala, Kalyan Vasudev and Carion, Nicolas and Wu, Chao-Yuan and Girshick, Ross and Doll{\'a}r, Piotr and Feichtenhofer, Christoph},
  journal={arXiv preprint arXiv:2408.00714},
  url={https://arxiv.org/abs/2408.00714},
  year={2024}
}
```
================================================
FILE: SAM2-Demo/Assets.xcassets/AccentColor.colorset/Contents.json
================================================
{
"colors" : [
{
"idiom" : "universal"
}
],
"info" : {
"author" : "xcode",
"version" : 1
}
}
================================================
FILE: SAM2-Demo/Assets.xcassets/AppIcon.appiconset/Contents.json
================================================
{
"images" : [
{
"idiom" : "mac",
"scale" : "1x",
"size" : "16x16"
},
{
"idiom" : "mac",
"scale" : "2x",
"size" : "16x16"
},
{
"idiom" : "mac",
"scale" : "1x",
"size" : "32x32"
},
{
"idiom" : "mac",
"scale" : "2x",
"size" : "32x32"
},
{
"idiom" : "mac",
"scale" : "1x",
"size" : "128x128"
},
{
"idiom" : "mac",
"scale" : "2x",
"size" : "128x128"
},
{
"idiom" : "mac",
"scale" : "1x",
"size" : "256x256"
},
{
"idiom" : "mac",
"scale" : "2x",
"size" : "256x256"
},
{
"idiom" : "mac",
"scale" : "1x",
"size" : "512x512"
},
{
"idiom" : "mac",
"scale" : "2x",
"size" : "512x512"
}
],
"info" : {
"author" : "xcode",
"version" : 1
}
}
================================================
FILE: SAM2-Demo/Assets.xcassets/Contents.json
================================================
{
"info" : {
"author" : "xcode",
"version" : 1
}
}
================================================
FILE: SAM2-Demo/Common/CGImage+Extension.swift
================================================
//
// CGImage+Extension.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 8/20/24.
//
import ImageIO
extension CGImage {
func resized(to size: CGSize) -> CGImage? {
let width: Int = Int(size.width)
let height: Int = Int(size.height)
let bytesPerPixel = self.bitsPerPixel / 8
let destBytesPerRow = width * bytesPerPixel
guard let colorSpace = self.colorSpace else { return nil }
guard let context = CGContext(data: nil,
width: width,
height: height,
bitsPerComponent: self.bitsPerComponent,
bytesPerRow: destBytesPerRow,
space: colorSpace,
bitmapInfo: self.bitmapInfo.rawValue)
else { return nil }
context.interpolationQuality = .high
context.draw(self, in: CGRect(x: 0, y: 0, width: width, height: height))
return context.makeImage()
}
}
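// Illustrative addition (not part of the original file): an aspect-preserving
// wrapper over resized(to:). resized(to:) stretches to the exact target size,
// which is what square model inputs need; this variant fits within a bound
// instead, for thumbnails and previews.
extension CGImage {
    func resizedToFit(maxDimension: CGFloat) -> CGImage? {
        let scale = maxDimension / CGFloat(max(width, height))
        guard scale < 1 else { return self }  // never upscale
        return resized(to: CGSize(width: CGFloat(width) * scale,
                                  height: CGFloat(height) * scale))
    }
}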
================================================
FILE: SAM2-Demo/Common/CGImage+RawBytes.swift
================================================
/*
Copyright (c) 2017-2019 M.I. Hollemans
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to
deal in the Software without restriction, including without limitation the
rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
sell copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
IN THE SOFTWARE.
*/
import CoreGraphics
extension CGImage {
/**
Converts the image into an array of RGBA bytes.
*/
@nonobjc public func toByteArrayRGBA() -> [UInt8] {
    var bytes = [UInt8](repeating: 0, count: width * height * 4)
    bytes.withUnsafeMutableBytes { ptr in
        // The destination buffer is tightly packed 8-bit RGBA, so the context
        // must use width * 4 bytes per row and an RGB color space; reusing the
        // source image's bytesPerRow/bitsPerComponent here could overrun the buffer.
        if let context = CGContext(
            data: ptr.baseAddress,
            width: width,
            height: height,
            bitsPerComponent: 8,
            bytesPerRow: width * 4,
            space: CGColorSpaceCreateDeviceRGB(),
            bitmapInfo: CGImageAlphaInfo.premultipliedLast.rawValue) {
            let rect = CGRect(x: 0, y: 0, width: width, height: height)
            context.draw(self, in: rect)
        }
    }
    return bytes
}
/**
Creates a new CGImage from an array of RGBA bytes.
*/
@nonobjc public class func fromByteArrayRGBA(_ bytes: [UInt8],
width: Int,
height: Int) -> CGImage? {
return fromByteArray(bytes, width: width, height: height,
bytesPerRow: width * 4,
colorSpace: CGColorSpaceCreateDeviceRGB(),
alphaInfo: .premultipliedLast)
}
/**
Creates a new CGImage from an array of grayscale bytes.
*/
@nonobjc public class func fromByteArrayGray(_ bytes: [UInt8],
width: Int,
height: Int) -> CGImage? {
return fromByteArray(bytes, width: width, height: height,
bytesPerRow: width,
colorSpace: CGColorSpaceCreateDeviceGray(),
alphaInfo: .none)
}
@nonobjc class func fromByteArray(_ bytes: [UInt8],
width: Int,
height: Int,
bytesPerRow: Int,
colorSpace: CGColorSpace,
alphaInfo: CGImageAlphaInfo) -> CGImage? {
return bytes.withUnsafeBytes { ptr in
let context = CGContext(data: UnsafeMutableRawPointer(mutating: ptr.baseAddress!),
width: width,
height: height,
bitsPerComponent: 8,
bytesPerRow: bytesPerRow,
space: colorSpace,
bitmapInfo: alphaInfo.rawValue)
return context?.makeImage()
}
}
}
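// Illustrative usage (not part of the original file): round-trip an image
// through raw RGBA bytes, flipping the alpha channel to invert mask coverage.
// Note: with premultiplied alpha, inverting only the alpha byte is a rough
// approximation, which is acceptable for binary (0/255) masks.
extension CGImage {
    func withInvertedAlpha() -> CGImage? {
        var bytes = toByteArrayRGBA()
        // Alpha is the fourth byte of each RGBA pixel.
        for i in stride(from: 3, to: bytes.count, by: 4) {
            bytes[i] = 255 &- bytes[i]
        }
        return CGImage.fromByteArrayRGBA(bytes, width: width, height: height)
    }
}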
================================================
FILE: SAM2-Demo/Common/Color+Extension.swift
================================================
//
// Color+Extension.swift
// SAM2-Demo
//
// Created by Fleetwood on 01/10/2024.
//
import SwiftUI
#if canImport(UIKit)
import UIKit
#elseif canImport(AppKit)
import AppKit
#endif
extension Color {
#if canImport(UIKit)
var asNative: UIColor { UIColor(self) }
#elseif canImport(AppKit)
var asNative: NSColor { NSColor(self) }
#endif
var rgba: (red: CGFloat, green: CGFloat, blue: CGFloat, alpha: CGFloat) {
    var t = (CGFloat(), CGFloat(), CGFloat(), CGFloat())
    #if canImport(UIKit)
    // UIColor reports RGBA components directly.
    asNative.getRed(&t.0, green: &t.1, blue: &t.2, alpha: &t.3)
    #elseif canImport(AppKit)
    // NSColor must be converted to an RGB color space before reading components.
    let color = asNative.usingColorSpace(.deviceRGB)!
    color.getRed(&t.0, green: &t.1, blue: &t.2, alpha: &t.3)
    #endif
    return t
}
}
func colorDistance(_ color1: Color, _ color2: Color) -> Double {
    let rgb1 = color1.rgba
    let rgb2 = color2.rgba
    let rDiff = rgb1.red - rgb2.red
    let gDiff = rgb1.green - rgb2.green
    let bDiff = rgb1.blue - rgb2.blue
    return sqrt(rDiff * rDiff + gDiff * gDiff + bDiff * bDiff)
}
// Determine the Euclidean distance of all candidates from current set of colors.
// Find the **maximum min-distance** from all current colors.
func furthestColor(from existingColors: [Color], among candidateColors: [Color]) -> Color {
var maxMinDistance: Double = 0
var furthestColor: Color = SAMSegmentation.randomCandidateColor() ?? SAMSegmentation.defaultColor
for candidate in candidateColors {
let minDistance = existingColors.map { colorDistance(candidate, $0) }.min() ?? 0
if minDistance > maxMinDistance {
maxMinDistance = minDistance
furthestColor = candidate
}
}
return furthestColor
}
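// Illustrative usage (not part of the original file): pick a display color for
// a new segmentation layer that stands out from the colors already in use.
func nextLayerColor(existing: [Color]) -> Color {
    let candidates: [Color] = [.red, .green, .blue, .orange, .purple, .pink, .teal, .yellow]
    guard !existing.isEmpty else { return .red }
    // furthestColor maximizes the minimum RGB distance to the existing set.
    return furthestColor(from: existing, among: candidates)
}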
================================================
FILE: SAM2-Demo/Common/CoreImageExtensions.swift
================================================
import CoreImage
import CoreImage.CIFilterBuiltins
import ImageIO
import UniformTypeIdentifiers
extension CIImage {
/// Returns a resized image.
func resized(to size: CGSize) -> CIImage {
let outputScaleX = size.width / extent.width
let outputScaleY = size.height / extent.height
var outputImage = self.transformed(by: CGAffineTransform(scaleX: outputScaleX, y: outputScaleY))
outputImage = outputImage.transformed(
by: CGAffineTransform(translationX: -outputImage.extent.origin.x, y: -outputImage.extent.origin.y)
)
return outputImage
}
public func withAlpha<T: BinaryFloatingPoint>(_ alpha: T) -> CIImage? {
guard alpha != 1 else { return self }
let filter = CIFilter.colorMatrix()
filter.inputImage = self
filter.aVector = CIVector(x: 0, y: 0, z: 0, w: CGFloat(alpha))
return filter.outputImage
}
public func applyingThreshold(_ threshold: Float) -> CIImage? {
let filter = CIFilter.colorThreshold()
filter.inputImage = self
filter.threshold = threshold
return filter.outputImage
}
}
extension CIContext {
/// Renders an image to a new pixel buffer.
func render(_ image: CIImage, pixelFormat: OSType) -> CVPixelBuffer? {
var output: CVPixelBuffer!
let status = CVPixelBufferCreate(
kCFAllocatorDefault,
Int(image.extent.width),
Int(image.extent.height),
pixelFormat,
nil,
&output
)
guard status == kCVReturnSuccess else {
return nil
}
render(image, to: output, bounds: image.extent, colorSpace: nil)
return output
}
/// Writes the image as a PNG.
func writePNG(_ image: CIImage, to url: URL) {
let outputCGImage = createCGImage(image, from: image.extent, format: .BGRA8, colorSpace: nil)!
guard let destination = CGImageDestinationCreateWithURL(url as CFURL, UTType.png.identifier as CFString, 1, nil) else {
fatalError("Failed to create an image destination.")
}
CGImageDestinationAddImage(destination, outputCGImage, nil)
CGImageDestinationFinalize(destination)
}
}
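// Illustrative pipeline (not part of the original file): binarize a soft mask,
// fade it for overlay display, and write the result to disk as a PNG.
func exportMask(_ mask: CIImage, to url: URL, using context: CIContext = CIContext()) {
    guard let binary = mask.applyingThreshold(0.5),
          let overlay = binary.withAlpha(0.6) else { return }
    context.writePNG(overlay, to: url)
}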
================================================
FILE: SAM2-Demo/Common/DirectoryDocument.swift
================================================
//
// DirectoryDocument.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 9/10/24.
//
import SwiftUI
import UniformTypeIdentifiers
struct DirectoryDocument: FileDocument {
static var readableContentTypes: [UTType] { [.folder] }
init(initialContentType: UTType = .folder) {
// Initialize if needed
}
init(configuration: ReadConfiguration) throws {
// Initialize if needed
}
func fileWrapper(configuration: WriteConfiguration) throws -> FileWrapper {
return FileWrapper(directoryWithFileWrappers: [:])
}
}
================================================
FILE: SAM2-Demo/Common/MLMultiArray+Image.swift
================================================
/*
Copyright (c) 2017-2020 M.I. Hollemans
Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to
deal in the Software without restriction, including without limitation the
rights to use, copy, modify, merge, publish, distribute, sublicense, and/or
sell copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:
The above copyright notice and this permission notice shall be included in
all copies or substantial portions of the Software.
THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING
FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS
IN THE SOFTWARE.
*/
import Accelerate
import CoreML
public func clamp<T: Comparable>(_ x: T, min: T, max: T) -> T {
if x < min { return min }
if x > max { return max }
return x
}
public protocol MultiArrayType: Comparable {
static var multiArrayDataType: MLMultiArrayDataType { get }
static func +(lhs: Self, rhs: Self) -> Self
static func -(lhs: Self, rhs: Self) -> Self
static func *(lhs: Self, rhs: Self) -> Self
static func /(lhs: Self, rhs: Self) -> Self
init(_: Int)
var toUInt8: UInt8 { get }
}
extension Double: MultiArrayType {
public static var multiArrayDataType: MLMultiArrayDataType { return .double }
public var toUInt8: UInt8 { return UInt8(self) }
}
extension Float: MultiArrayType {
public static var multiArrayDataType: MLMultiArrayDataType { return .float32 }
public var toUInt8: UInt8 { return UInt8(self) }
}
extension Int32: MultiArrayType {
public static var multiArrayDataType: MLMultiArrayDataType { return .int32 }
public var toUInt8: UInt8 { return UInt8(self) }
}
extension Float16: MultiArrayType {
public static var multiArrayDataType: MLMultiArrayDataType { return .float16 }
public var toUInt8: UInt8 { return UInt8(self) }
}
extension MLMultiArray {
/**
Converts the multi-array to a CGImage.
The multi-array must have at least 2 dimensions for a grayscale image, or
at least 3 dimensions for a color image.
The default expected shape is (height, width) or (channels, height, width).
However, you can change this using the `axes` parameter. For example, if
the array shape is (1, height, width, channels), use `axes: (3, 1, 2)`.
If `channel` is not nil, only converts that channel to a grayscale image.
This lets you visualize individual channels from a multi-array with more
than 4 channels.
Otherwise, converts all channels. In this case, the number of channels in
the multi-array must be 1 for grayscale, 3 for RGB, or 4 for RGBA.
Use the `min` and `max` parameters to put the values from the array into
the range [0, 255], if not already:
- `min`: should be the smallest value in the data; this will be mapped to 0.
- `max`: should be the largest value in the data; will be mapped to 255.
For example, if the range of the data in the multi-array is [-1, 1], use
`min: -1, max: 1`. If the range is already [0, 255], then use the defaults.
*/
public func cgImage(min: Double = 0,
max: Double = 255,
channel: Int? = nil,
axes: (Int, Int, Int)? = nil) -> CGImage? {
switch self.dataType {
case .double:
return _image(min: min, max: max, channel: channel, axes: axes)
case .float32:
return _image(min: Float(min), max: Float(max), channel: channel, axes: axes)
case .int32:
return _image(min: Int32(min), max: Int32(max), channel: channel, axes: axes)
case .float16:
return _image(min: Float16(min), max: Float16(max), channel: channel, axes: axes)
@unknown default:
fatalError("Unsupported data type \(dataType.rawValue)")
}
}
/**
Helper function that allows us to use generics. The type of `min` and `max`
is also the dataType of the MLMultiArray.
*/
private func _image<T: MultiArrayType>(min: T,
max: T,
channel: Int?,
axes: (Int, Int, Int)?) -> CGImage? {
if let (b, w, h, c) = toRawBytes(min: min, max: max, channel: channel, axes: axes) {
if c == 1 {
return CGImage.fromByteArrayGray(b, width: w, height: h)
} else {
return CGImage.fromByteArrayRGBA(b, width: w, height: h)
}
}
return nil
}
/**
Converts the multi-array into an array of RGBA or grayscale pixels.
- Note: This is not particularly fast, but it is flexible. You can change
the loops to convert the multi-array whichever way you please.
- Note: The type of `min` and `max` must match the dataType of the
MLMultiArray object.
- Returns: tuple containing the RGBA bytes, the dimensions of the image,
and the number of channels in the image (1, 3, or 4).
*/
public func toRawBytes<T: MultiArrayType>(min: T,
max: T,
channel: Int? = nil,
axes: (Int, Int, Int)? = nil)
-> (bytes: [UInt8], width: Int, height: Int, channels: Int)? {
// MLMultiArray with unsupported shape?
if shape.count < 2 {
print("Cannot convert MLMultiArray of shape \(shape) to image")
return nil
}
// Figure out which dimensions to use for the channels, height, and width.
let channelAxis: Int
let heightAxis: Int
let widthAxis: Int
if let axes = axes {
channelAxis = axes.0
heightAxis = axes.1
widthAxis = axes.2
guard channelAxis >= 0 && channelAxis < shape.count &&
heightAxis >= 0 && heightAxis < shape.count &&
widthAxis >= 0 && widthAxis < shape.count else {
print("Invalid axes \(axes) for shape \(shape)")
return nil
}
} else if shape.count == 2 {
// Expected shape for grayscale is (height, width)
heightAxis = 0
widthAxis = 1
channelAxis = -1 // Never used
} else {
// Expected shape for color is (channels, height, width)
channelAxis = 0
heightAxis = 1
widthAxis = 2
}
let height = self.shape[heightAxis].intValue
let width = self.shape[widthAxis].intValue
let yStride = self.strides[heightAxis].intValue
let xStride = self.strides[widthAxis].intValue
let channels: Int
let cStride: Int
let bytesPerPixel: Int
let channelOffset: Int
// MLMultiArray with just two dimensions is always grayscale. (We ignore
// the value of channelAxis here.)
if shape.count == 2 {
channels = 1
cStride = 0
bytesPerPixel = 1
channelOffset = 0
// MLMultiArray with more than two dimensions can be color or grayscale.
} else {
let channelDim = self.shape[channelAxis].intValue
if let channel = channel {
if channel < 0 || channel >= channelDim {
print("Channel must be nil, or between 0 and \(channelDim - 1)")
return nil
}
channels = 1
bytesPerPixel = 1
channelOffset = channel
} else if channelDim == 1 {
channels = 1
bytesPerPixel = 1
channelOffset = 0
} else {
if channelDim != 3 && channelDim != 4 {
print("Expected channel dimension to have 1, 3, or 4 channels, got \(channelDim)")
return nil
}
channels = channelDim
bytesPerPixel = 4
channelOffset = 0
}
cStride = self.strides[channelAxis].intValue
}
// Allocate storage for the RGBA or grayscale pixels. Set everything to
// 255 so that alpha channel is filled in if only 3 channels.
let count = height * width * bytesPerPixel
var pixels = [UInt8](repeating: 255, count: count)
// Grab the pointer to MLMultiArray's memory.
var ptr = UnsafeMutablePointer<T>(OpaquePointer(self.dataPointer))
ptr = ptr.advanced(by: channelOffset * cStride)
// Loop through all the pixels and all the channels and copy them over.
for c in 0..<channels {
for y in 0..<height {
for x in 0..<width {
let value = ptr[c*cStride + y*yStride + x*xStride]
let scaled = (value - min) * T(255) / (max - min)
let pixel = clamp(scaled, min: T(0), max: T(255)).toUInt8
pixels[(y*width + x)*bytesPerPixel + c] = pixel
}
}
}
return (pixels, width, height, channels)
}
}
/**
Fast conversion from MLMultiArray to CGImage using the vImage framework.
- Parameters:
- features: A multi-array with data type FLOAT32 and three dimensions
(3, height, width).
- min: The smallest value in the multi-array. This value, as well as any
smaller values, will be mapped to 0 in the output image.
- max: The largest value in the multi-array. This and any larger values
will be mapped to 255 in the output image.
- Returns: a new CGImage or nil if the conversion fails
*/
public func createCGImage(fromFloatArray features: MLMultiArray,
min: Float = 0,
max: Float = 255) -> CGImage? {
assert(features.dataType == .float32)
assert(features.shape.count == 3)
let ptr = UnsafeMutablePointer<Float>(OpaquePointer(features.dataPointer))
let height = features.shape[1].intValue
let width = features.shape[2].intValue
let channelStride = features.strides[0].intValue
let rowStride = features.strides[1].intValue
let srcRowBytes = rowStride * MemoryLayout<Float>.stride
var blueBuffer = vImage_Buffer(data: ptr,
height: vImagePixelCount(height),
width: vImagePixelCount(width),
rowBytes: srcRowBytes)
var greenBuffer = vImage_Buffer(data: ptr.advanced(by: channelStride),
height: vImagePixelCount(height),
width: vImagePixelCount(width),
rowBytes: srcRowBytes)
var redBuffer = vImage_Buffer(data: ptr.advanced(by: channelStride * 2),
height: vImagePixelCount(height),
width: vImagePixelCount(width),
rowBytes: srcRowBytes)
let destRowBytes = width * 4
var error: vImage_Error = 0
var pixels = [UInt8](repeating: 0, count: height * destRowBytes)
pixels.withUnsafeMutableBufferPointer { ptr in
var destBuffer = vImage_Buffer(data: ptr.baseAddress!,
height: vImagePixelCount(height),
width: vImagePixelCount(width),
rowBytes: destRowBytes)
error = vImageConvert_PlanarFToBGRX8888(&blueBuffer,
&greenBuffer,
&redBuffer,
Pixel_8(255),
&destBuffer,
[max, max, max],
[min, min, min],
vImage_Flags(0))
}
if error == kvImageNoError {
return CGImage.fromByteArrayRGBA(pixels, width: width, height: height)
} else {
return nil
}
}
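A minimal usage sketch for `createCGImage(fromFloatArray:min:max:)` under its documented assumptions (float32 data, shape (3, height, width)); the array contents here are illustrative, not from the repo:

```swift
import CoreML

// Hypothetical example: a tiny (3, 2, 2) float32 array converted to a 2x2 image.
// Values are chosen to fall inside the default 0...255 mapping.
if let array = try? MLMultiArray(shape: [3, 2, 2], dataType: .float32) {
    for i in 0..<array.count {
        array[i] = NSNumber(value: Float(i) * 20)
    }
    if let image = createCGImage(fromFloatArray: array, min: 0, max: 255) {
        print("Converted to CGImage: \(image.width)x\(image.height)")
    }
}
```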
================================================
FILE: SAM2-Demo/Common/Models.swift
================================================
//
// Models.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 8/19/24.
//
import Foundation
import SwiftUI
enum SAMCategoryType: Int {
case background = 0
case foreground = 1
case boxOrigin = 2
case boxEnd = 3
var description: String {
switch self {
case .foreground:
return "Foreground"
case .background:
return "Background"
case .boxOrigin:
return "Box Origin"
case .boxEnd:
return "Box End"
}
}
}
struct SAMCategory: Hashable {
let id: UUID = UUID()
let type: SAMCategoryType
let name: String
let iconName: String
let color: Color
var typeDescription: String {
type.description
}
static let foreground = SAMCategory(
type: .foreground,
name: "Foreground",
iconName: "square.on.square.dashed",
color: .pink
)
static let background = SAMCategory(
type: .background,
name: "Background",
iconName: "square.on.square.intersection.dashed",
color: .purple
)
static let boxOrigin = SAMCategory(
type: .boxOrigin,
name: "Box Origin",
iconName: "",
color: .white
)
static let boxEnd = SAMCategory(
type: .boxEnd,
name: "Box End",
iconName: "",
color: .white
)
}
struct SAMPoint: Hashable {
let id = UUID()
let coordinates: CGPoint
let category: SAMCategory
let dateAdded = Date()
}
struct SAMBox: Hashable, Identifiable {
let id = UUID()
var startPoint: CGPoint
var endPoint: CGPoint
let category: SAMCategory
let dateAdded = Date()
var midpoint: CGPoint {
return CGPoint(
x: (startPoint.x + endPoint.x) / 2,
y: (startPoint.y + endPoint.y) / 2
)
}
}
extension SAMBox {
var points: [SAMPoint] {
[SAMPoint(coordinates: startPoint, category: .boxOrigin), SAMPoint(coordinates: endPoint, category: .boxEnd)]
}
}
struct SAMSegmentation: Hashable, Identifiable {
let id = UUID()
var image: CIImage
var tintColor: Color {
didSet {
updateTintedImage()
}
}
var title: String = ""
var firstAppearance: Int?
var isHidden: Bool = false
private var tintedImage: CIImage?
static let defaultColor: Color = Color(.sRGB, red: 30/255, green: 144/255, blue: 1)
static let candidateColors: [Color] = [
defaultColor,
Color.red,
Color.green,
Color.brown,
Color.indigo,
Color.cyan,
Color.yellow,
Color.purple,
Color.orange,
Color.teal,
Color.mint,
Color.pink,
]
init(image: CIImage, tintColor: Color = SAMSegmentation.defaultColor, title: String = "", firstAppearance: Int? = nil, isHidden: Bool = false) {
self.image = image
self.tintColor = tintColor
self.title = title
self.firstAppearance = firstAppearance
self.isHidden = isHidden
updateTintedImage()
}
private mutating func updateTintedImage() {
// CIColor(color:) is failable; avoid force-unwrapping the conversion.
guard let ciColor = CIColor(color: NSColor(tintColor)) else {
tintedImage = nil
return
}
let monochromeFilter = CIFilter.colorMonochrome()
monochromeFilter.inputImage = image
monochromeFilter.color = ciColor
monochromeFilter.intensity = 1.0
tintedImage = monochromeFilter.outputImage
}
static func randomCandidateColor() -> Color? {
Self.candidateColors.randomElement()
}
var cgImage: CGImage {
let context = CIContext()
return context.createCGImage(tintedImage ?? image, from: (tintedImage ?? image).extent)!
}
}
struct SAMTool: Hashable {
let id: UUID = UUID()
let name: String
let iconName: String
}
// Tools
let pointTool: SAMTool = SAMTool(name: "Point", iconName: "hand.point.up.left")
let boundingBoxTool: SAMTool = SAMTool(name: "Bounding Box", iconName: "rectangle.dashed")
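The `points` extension above is how a box prompt becomes the origin/end point pair the prompt encoder expects. A quick sketch with made-up normalized coordinates:

```swift
let box = SAMBox(startPoint: CGPoint(x: 0.2, y: 0.3),
                 endPoint: CGPoint(x: 0.8, y: 0.9),
                 category: .boxOrigin)
// The derived prompt points carry the boxOrigin/boxEnd labels (raw values 2 and 3).
for point in box.points {
    print(point.category.typeDescription, point.coordinates)
}
print(box.midpoint) // midpoint of the two corners: (0.5, 0.6)
```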
================================================
FILE: SAM2-Demo/Common/NSImage+Extension.swift
================================================
//
// NSImage+Extension.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 8/20/24.
//
import AppKit
import VideoToolbox
extension NSImage {
/**
Converts the image to an ARGB `CVPixelBuffer`.
*/
public func pixelBuffer() -> CVPixelBuffer? {
return pixelBuffer(width: Int(size.width), height: Int(size.height))
}
/**
Resizes the image to `width` x `height` and converts it to an ARGB
`CVPixelBuffer`.
*/
public func pixelBuffer(width: Int, height: Int) -> CVPixelBuffer? {
return pixelBuffer(width: width, height: height,
pixelFormatType: kCVPixelFormatType_32ARGB,
colorSpace: CGColorSpaceCreateDeviceRGB(),
alphaInfo: .noneSkipFirst)
}
/**
Resizes the image to `width` x `height` and converts it to a `CVPixelBuffer`
with the specified pixel format, color space, and alpha channel.
*/
public func pixelBuffer(width: Int, height: Int,
pixelFormatType: OSType,
colorSpace: CGColorSpace,
alphaInfo: CGImageAlphaInfo) -> CVPixelBuffer? {
var maybePixelBuffer: CVPixelBuffer?
let attrs = [kCVPixelBufferCGImageCompatibilityKey: kCFBooleanTrue,
kCVPixelBufferCGBitmapContextCompatibilityKey: kCFBooleanTrue]
let status = CVPixelBufferCreate(kCFAllocatorDefault,
width,
height,
pixelFormatType,
attrs as CFDictionary,
&maybePixelBuffer)
guard status == kCVReturnSuccess, let pixelBuffer = maybePixelBuffer else {
return nil
}
let flags = CVPixelBufferLockFlags(rawValue: 0)
guard kCVReturnSuccess == CVPixelBufferLockBaseAddress(pixelBuffer, flags) else {
return nil
}
defer { CVPixelBufferUnlockBaseAddress(pixelBuffer, flags) }
guard let context = CGContext(data: CVPixelBufferGetBaseAddress(pixelBuffer),
width: width,
height: height,
bitsPerComponent: 8,
bytesPerRow: CVPixelBufferGetBytesPerRow(pixelBuffer),
space: colorSpace,
bitmapInfo: alphaInfo.rawValue)
else {
return nil
}
NSGraphicsContext.saveGraphicsState()
let nscg = NSGraphicsContext(cgContext: context, flipped: true)
NSGraphicsContext.current = nscg
context.translateBy(x: 0, y: CGFloat(height))
context.scaleBy(x: 1, y: -1)
self.draw(in: CGRect(x: 0, y: 0, width: width, height: height))
NSGraphicsContext.restoreGraphicsState()
return pixelBuffer
}
}
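The extension above is what `ContentView` uses to prepare the encoder's 1024×1024 input. A hedged sketch (the file path is hypothetical):

```swift
import AppKit
import CoreVideo

// Hypothetical path; the image is resized to the encoder's expected 1024x1024.
if let image = NSImage(contentsOfFile: "/path/to/image.png"),
   let pixelBuffer = image.pixelBuffer(width: 1024, height: 1024) {
    print("ARGB buffer: \(CVPixelBufferGetWidth(pixelBuffer))x\(CVPixelBufferGetHeight(pixelBuffer))")
}
```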
================================================
FILE: SAM2-Demo/Common/SAM2.swift
================================================
//
// SAM2.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 8/20/24.
//
import SwiftUI
import CoreML
import CoreImage
import CoreImage.CIFilterBuiltins
import Combine
import UniformTypeIdentifiers
@MainActor
class SAM2: ObservableObject {
@Published var imageEncodings: SAM2_1SmallImageEncoderFLOAT16Output?
@Published var promptEncodings: SAM2_1SmallPromptEncoderFLOAT16Output?
@Published private(set) var initializationTime: TimeInterval?
@Published private(set) var initialized: Bool?
private var imageEncoderModel: SAM2_1SmallImageEncoderFLOAT16?
private var promptEncoderModel: SAM2_1SmallPromptEncoderFLOAT16?
private var maskDecoderModel: SAM2_1SmallMaskDecoderFLOAT16?
// TODO: examine model inputs instead
var inputSize: CGSize { CGSize(width: 1024, height: 1024) }
var width: CGFloat { inputSize.width }
var height: CGFloat { inputSize.height }
init() {
Task {
await loadModels()
}
}
private func loadModels() async {
let startTime = CFAbsoluteTimeGetCurrent()
do {
let configuration = MLModelConfiguration()
configuration.computeUnits = .cpuAndGPU
let (imageEncoder, promptEncoder, maskDecoder) = try await Task.detached(priority: .userInitiated) {
let imageEncoder = try SAM2_1SmallImageEncoderFLOAT16(configuration: configuration)
let promptEncoder = try SAM2_1SmallPromptEncoderFLOAT16(configuration: configuration)
let maskDecoder = try SAM2_1SmallMaskDecoderFLOAT16(configuration: configuration)
return (imageEncoder, promptEncoder, maskDecoder)
}.value
let endTime = CFAbsoluteTimeGetCurrent()
self.initializationTime = endTime - startTime
self.initialized = true
self.imageEncoderModel = imageEncoder
self.promptEncoderModel = promptEncoder
self.maskDecoderModel = maskDecoder
print("Initialized models in \(String(format: "%.4f", self.initializationTime!)) seconds")
} catch {
print("Failed to initialize models: \(error)")
self.initializationTime = nil
self.initialized = false
}
}
// Convenience for use in the CLI
private var modelLoading: AnyCancellable?
func ensureModelsAreLoaded() async throws -> SAM2 {
let _ = try await withCheckedThrowingContinuation { continuation in
// Take only the first non-nil value so the continuation resumes exactly once.
modelLoading = self.$initialized
.compactMap { $0 }
.first()
.sink { initialized in
if initialized {
continuation.resume(returning: self)
} else {
continuation.resume(throwing: SAM2Error.modelNotLoaded)
}
}
}
return self
}
static func load() async throws -> SAM2 {
try await SAM2().ensureModelsAreLoaded()
}
func getImageEncoding(from pixelBuffer: CVPixelBuffer) async throws {
guard let model = imageEncoderModel else {
throw SAM2Error.modelNotLoaded
}
let encoding = try model.prediction(image: pixelBuffer)
self.imageEncodings = encoding
}
func getImageEncoding(from url: URL) async throws {
guard let model = imageEncoderModel else {
throw SAM2Error.modelNotLoaded
}
let inputs = try SAM2_1SmallImageEncoderFLOAT16Input(imageAt: url)
let encoding = try await model.prediction(input: inputs)
self.imageEncodings = encoding
}
func getPromptEncoding(from allPoints: [SAMPoint], with size: CGSize) async throws {
guard let model = promptEncoderModel else {
throw SAM2Error.modelNotLoaded
}
let transformedCoords = try transformCoords(allPoints.map { $0.coordinates }, normalize: false, origHW: size)
// Create MLFeatureProvider with the required input format
let pointsMultiArray = try MLMultiArray(shape: [1, NSNumber(value: allPoints.count), 2], dataType: .float32)
let labelsMultiArray = try MLMultiArray(shape: [1, NSNumber(value: allPoints.count)], dataType: .int32)
for (index, point) in transformedCoords.enumerated() {
pointsMultiArray[[0, index, 0] as [NSNumber]] = NSNumber(value: Float(point.x))
pointsMultiArray[[0, index, 1] as [NSNumber]] = NSNumber(value: Float(point.y))
labelsMultiArray[[0, index] as [NSNumber]] = NSNumber(value: allPoints[index].category.type.rawValue)
}
let encoding = try model.prediction(points: pointsMultiArray, labels: labelsMultiArray)
self.promptEncodings = encoding
}
func bestMask(for output: SAM2_1SmallMaskDecoderFLOAT16Output) -> MLMultiArray {
if #available(macOS 15.0, *) {
let scores = output.scoresShapedArray.scalars
let argmax = scores.firstIndex(of: scores.max() ?? 0) ?? 0
return MLMultiArray(output.low_res_masksShapedArray[0, argmax])
} else {
// Convert scores to float32 for compatibility with macOS < 15,
// plus ugly loop copy (could do some memcpys)
let scores = output.scores
let floatScores = (0..<scores.count).map { scores[$0].floatValue }
let argmax = floatScores.firstIndex(of: floatScores.max() ?? 0) ?? 0
let allMasks = output.low_res_masks
let (h, w) = (allMasks.shape[2], allMasks.shape[3])
let slice = try! MLMultiArray(shape: [h, w], dataType: allMasks.dataType)
for i in 0..<h.intValue {
for j in 0..<w.intValue {
let position = [0, argmax, i, j] as [NSNumber]
slice[[i as NSNumber, j as NSNumber]] = allMasks[position]
}
}
return slice
}
}
func getMask(for original_size: CGSize) async throws -> CIImage? {
guard let model = maskDecoderModel else {
throw SAM2Error.modelNotLoaded
}
if let image_embedding = self.imageEncodings?.image_embedding,
let feats0 = self.imageEncodings?.feats_s0,
let feats1 = self.imageEncodings?.feats_s1,
let sparse_embedding = self.promptEncodings?.sparse_embeddings,
let dense_embedding = self.promptEncodings?.dense_embeddings {
let output = try model.prediction(image_embedding: image_embedding, sparse_embedding: sparse_embedding, dense_embedding: dense_embedding, feats_s0: feats0, feats_s1: feats1)
// Extract best mask and ignore the others
let lowFeatureMask = bestMask(for: output)
// TODO: optimization
// Preserve range for upsampling
var minValue: Double = .infinity
var maxValue: Double = -.infinity
for i in 0..<lowFeatureMask.count {
let v = lowFeatureMask[i].doubleValue
if v > maxValue { maxValue = v }
if v < minValue { minValue = v }
}
let threshold = -minValue / (maxValue - minValue)
// Resize first, then threshold
if let maskcgImage = lowFeatureMask.cgImage(min: minValue, max: maxValue) {
let ciImage = CIImage(cgImage: maskcgImage, options: [.colorSpace: NSNull()])
let resizedImage = try resizeImage(ciImage, to: original_size, applyingThreshold: Float(threshold))
return resizedImage?.maskedToAlpha()?.samTinted()
}
}
return nil
}
private func transformCoords(_ coords: [CGPoint], normalize: Bool = false, origHW: CGSize) throws -> [CGPoint] {
guard normalize else {
return coords.map { CGPoint(x: $0.x * width, y: $0.y * height) }
}
let w = origHW.width
let h = origHW.height
return coords.map { coord in
let normalizedX = coord.x / w
let normalizedY = coord.y / h
return CGPoint(x: normalizedX * width, y: normalizedY * height)
}
}
private func resizeImage(_ image: CIImage, to size: CGSize, applyingThreshold threshold: Float = 1) throws -> CIImage? {
let scale = CGAffineTransform(scaleX: size.width / image.extent.width,
y: size.height / image.extent.height)
return image.transformed(by: scale).applyingThreshold(threshold)
}
}
extension CIImage {
/// This is only appropriate for grayscale mask images (our case). CIColorMatrix can be used more generally.
func maskedToAlpha() -> CIImage? {
let filter = CIFilter.maskToAlpha()
filter.inputImage = self
return filter.outputImage
}
func samTinted() -> CIImage? {
// Tints the mask with the SAM highlight color (30, 144, 255). The `w: 1`
// terms add alpha to each channel and the -1 bias subtracts it back out,
// so fully transparent pixels stay at zero.
let filter = CIFilter.colorMatrix()
filter.rVector = CIVector(x: 30/255, y: 0, z: 0, w: 1)
filter.gVector = CIVector(x: 0, y: 144/255, z: 0, w: 1)
filter.bVector = CIVector(x: 0, y: 0, z: 1, w: 1)
filter.biasVector = CIVector(x: -1, y: -1, z: -1, w: 0)
filter.inputImage = self
return filter.outputImage?.cropped(to: self.extent)
}
}
enum SAM2Error: Error {
case modelNotLoaded
case pixelBufferCreationFailed
case imageResizingFailed
}
@discardableResult func writeCGImage(_ image: CGImage, to destinationURL: URL) -> Bool {
guard let destination = CGImageDestinationCreateWithURL(destinationURL as CFURL, UTType.png.identifier as CFString, 1, nil) else { return false }
CGImageDestinationAddImage(destination, image, nil)
return CGImageDestinationFinalize(destination)
}
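Putting the API above together end to end (the paths and click location are hypothetical; point coordinates are normalized to 0...1 before `getPromptEncoding` scales them to the 1024×1024 model space):

```swift
Task {
    do {
        let sam = try await SAM2.load()
        // Hypothetical input image and a single foreground click at the center.
        try await sam.getImageEncoding(from: URL(fileURLWithPath: "/path/to/image.png"))
        let click = SAMPoint(coordinates: CGPoint(x: 0.5, y: 0.5), category: .foreground)
        let originalSize = CGSize(width: 1920, height: 1080)
        try await sam.getPromptEncoding(from: [click], with: originalSize)
        if let mask = try await sam.getMask(for: originalSize),
           let cgMask = CIContext().createCGImage(mask, from: mask.extent) {
            writeCGImage(cgMask, to: URL(fileURLWithPath: "/tmp/mask.png"))
        }
    } catch {
        print("Segmentation failed: \(error)")
    }
}
```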
================================================
FILE: SAM2-Demo/ContentView.swift
================================================
import SwiftUI
import PhotosUI
import UniformTypeIdentifiers
import CoreML
import os
// TODO: Add reset, bounding box, and eraser
let logger = Logger(
subsystem:
"com.cyrilzakka.SAM2-Demo.ContentView",
category: "ContentView")
struct PointsOverlay: View {
@Binding var selectedPoints: [SAMPoint]
@Binding var selectedTool: SAMTool?
let imageSize: CGSize
var body: some View {
ForEach(selectedPoints, id: \.self) { point in
Circle()
.frame(width: 10, height: 10)
.foregroundStyle(point.category.color)
.position(point.coordinates.toSize(imageSize))
}
}
}
struct BoundingBoxesOverlay: View {
let boundingBoxes: [SAMBox]
let currentBox: SAMBox?
let imageSize: CGSize
var body: some View {
ForEach(boundingBoxes) { box in
BoundingBoxPath(box: box, imageSize: imageSize)
}
if let currentBox = currentBox {
BoundingBoxPath(box: currentBox, imageSize: imageSize)
}
}
}
struct BoundingBoxPath: View {
let box: SAMBox
let imageSize: CGSize
var body: some View {
Path { path in
path.move(to: box.startPoint.toSize(imageSize))
path.addLine(to: CGPoint(x: box.endPoint.x, y: box.startPoint.y).toSize(imageSize))
path.addLine(to: box.endPoint.toSize(imageSize))
path.addLine(to: CGPoint(x: box.startPoint.x, y: box.endPoint.y).toSize(imageSize))
path.closeSubpath()
}
.stroke(
box.category.color,
style: StrokeStyle(lineWidth: 2, dash: [5, 5])
)
}
}
struct SegmentationOverlay: View {
@Binding var segmentationImage: SAMSegmentation
let imageSize: CGSize
@State var counter: Int = 0
var origin: CGPoint = .zero
var shouldAnimate: Bool = false
var body: some View {
let nsImage = NSImage(cgImage: segmentationImage.cgImage, size: imageSize)
Image(nsImage: nsImage)
.resizable()
.scaledToFit()
.allowsHitTesting(false)
.frame(width: imageSize.width, height: imageSize.height)
.opacity(segmentationImage.isHidden ? 0:0.6)
.modifier(RippleEffect(at: CGPoint(x: segmentationImage.cgImage.width/2, y: segmentationImage.cgImage.height/2), trigger: counter))
.onAppear {
if shouldAnimate {
counter += 1
}
}
}
}
struct ContentView: View {
// ML Models
@StateObject private var sam2 = SAM2()
@State private var currentSegmentation: SAMSegmentation?
@State private var segmentationImages: [SAMSegmentation] = []
@State private var imageSize: CGSize = .zero
// File importer
@State private var imageURL: URL?
@State private var isImportingFromFiles: Bool = false
@State private var displayImage: NSImage?
// Mask exporter
@State private var exportURL: URL?
@State private var exportMaskToPNG: Bool = false
@State private var showInspector: Bool = true
@State private var selectedSegmentations = Set<SAMSegmentation.ID>()
// Photos Picker
@State private var isImportingFromPhotos: Bool = false
@State private var selectedItem: PhotosPickerItem?
@State private var error: Error?
// ML Model Properties
var tools: [SAMTool] = [pointTool, boundingBoxTool]
var categories: [SAMCategory] = [.foreground, .background]
@State private var selectedTool: SAMTool?
@State private var selectedCategory: SAMCategory?
@State private var selectedPoints: [SAMPoint] = []
@State private var boundingBoxes: [SAMBox] = []
@State private var currentBox: SAMBox?
@State private var originalSize: NSSize?
@State private var currentScale: CGFloat = 1.0
@State private var visibleRect: CGRect = .zero
var body: some View {
NavigationSplitView(sidebar: {
VStack {
LayerListView(segmentationImages: $segmentationImages, selectedSegmentations: $selectedSegmentations, currentSegmentation: $currentSegmentation)
Spacer()
Button(action: {
if let currentSegmentation = self.currentSegmentation {
self.segmentationImages.append(currentSegmentation)
self.reset()
}
}, label: {
Text("New Mask")
}).padding()
}
}, detail: {
ZStack {
ZoomableScrollView(visibleRect: $visibleRect) {
if let image = displayImage {
ImageView(image: image, currentScale: $currentScale, selectedTool: $selectedTool, selectedCategory: $selectedCategory, selectedPoints: $selectedPoints, boundingBoxes: $boundingBoxes, currentBox: $currentBox, segmentationImages: $segmentationImages, currentSegmentation: $currentSegmentation, imageSize: $imageSize, originalSize: $originalSize, sam2: sam2)
} else {
ContentUnavailableView("No Image Loaded", systemImage: "photo.fill.on.rectangle.fill", description: Text("Please import a photo to get started."))
}
}
VStack(spacing: 0) {
SubToolbar(selectedPoints: $selectedPoints, boundingBoxes: $boundingBoxes, segmentationImages: $segmentationImages, currentSegmentation: $currentSegmentation)
Spacer()
}
}
})
.inspector(isPresented: $showInspector, content: {
if selectedSegmentations.isEmpty {
ContentUnavailableView(label: {
Label(title: {
Text("No Mask Selected")
.font(.subheadline)
}, icon: {})
})
.inspectorColumnWidth(min: 200, ideal: 200, max: 200)
} else {
MaskEditor(exportMaskToPNG: $exportMaskToPNG, segmentationImages: $segmentationImages, selectedSegmentations: $selectedSegmentations, currentSegmentation: $currentSegmentation)
.inspectorColumnWidth(min: 200, ideal: 200, max: 200)
.toolbar {
Spacer()
Button {
showInspector.toggle()
} label: {
Label("Toggle Inspector", systemImage: "sidebar.trailing")
}
}
}
})
.toolbar {
// Tools
ToolbarItemGroup(placement: .principal) {
Picker(selection: $selectedTool, content: {
ForEach(tools, id: \.self) { tool in
Label(tool.name, systemImage: tool.iconName)
.tag(tool)
.labelStyle(.titleAndIcon)
}
}, label: {
Label("Tools", systemImage: "pencil.and.ruler")
})
.pickerStyle(.menu)
Picker(selection: $selectedCategory, content: {
ForEach(categories, id: \.self) { cat in
Label(cat.name, systemImage: cat.iconName)
.tag(cat)
.labelStyle(.titleAndIcon)
}
}, label: {
Label("Categories", systemImage: "pencil.and.ruler")
})
.pickerStyle(.menu)
}
// Import
ToolbarItemGroup {
Menu {
Button(action: {
isImportingFromPhotos = true
}, label: {
Label("From Photos", systemImage: "photo.on.rectangle.angled.fill")
})
Button(action: {
isImportingFromFiles = true
}, label: {
Label("From Files", systemImage: "folder.fill")
})
} label: {
Label("Import", systemImage: "photo.badge.plus")
}
}
}
.onAppear {
if selectedTool == nil {
selectedTool = tools[0]
}
if selectedCategory == nil {
selectedCategory = categories.first
}
}
// MARK: - Image encoding
.onChange(of: displayImage) {
segmentationImages = []
self.reset()
Task {
if let displayImage, let pixelBuffer = displayImage.pixelBuffer(width: 1024, height: 1024) {
originalSize = displayImage.size
do {
try await sam2.getImageEncoding(from: pixelBuffer)
} catch {
self.error = error
}
}
}
}
// MARK: - Photos Importer
.photosPicker(isPresented: $isImportingFromPhotos, selection: $selectedItem, matching: .any(of: [.images, .screenshots, .livePhotos]))
.onChange(of: selectedItem) {
Task {
if let loadedData = try? await selectedItem?.loadTransferable(type: Data.self) {
// Already on the main actor in a SwiftUI view; no DispatchQueue hop needed.
selectedPoints.removeAll()
displayImage = NSImage(data: loadedData)
} else {
logger.error("Error loading image from Photos.")
}
}
}
// MARK: - File Importer
.fileImporter(isPresented: $isImportingFromFiles,
allowedContentTypes: [.image]) { result in
switch result {
case .success(let file):
self.selectedItem = nil
self.selectedPoints.removeAll()
self.imageURL = file
loadImage(from: file)
case .failure(let error):
logger.error("File import error: \(error.localizedDescription)")
self.error = error
}
}
// MARK: - File exporter
.fileExporter(
isPresented: $exportMaskToPNG,
document: DirectoryDocument(initialContentType: .folder),
contentType: .folder,
defaultFilename: "Segmentations"
) { result in
if case .success(let url) = result {
exportURL = url
var selectedToExport = segmentationImages.filter { segmentation in
selectedSegmentations.contains(segmentation.id)
}
if let currentSegmentation {
selectedToExport.append(currentSegmentation)
}
exportSegmentations(selectedToExport, to: url)
}
}
}
// MARK: - Private Methods
private func loadImage(from url: URL) {
guard url.startAccessingSecurityScopedResource() else {
logger.error("Failed to access the file. Security-scoped resource access denied.")
return
}
defer { url.stopAccessingSecurityScopedResource() }
do {
let imageData = try Data(contentsOf: url)
if let image = NSImage(data: imageData) {
DispatchQueue.main.async {
self.displayImage = image
}
} else {
logger.error("Failed to create NSImage from file data")
}
} catch {
logger.error("Error loading image data: \(error.localizedDescription)")
self.error = error
}
}
func exportSegmentations(_ segmentations: [SAMSegmentation], to directory: URL) {
let fileManager = FileManager.default
do {
try fileManager.createDirectory(at: directory, withIntermediateDirectories: true, attributes: nil)
for (index, segmentation) in segmentations.enumerated() {
let filename = "segmentation_\(index + 1).png"
let fileURL = directory.appendingPathComponent(filename)
// Reuse the writeCGImage helper from SAM2.swift instead of duplicating
// the CGImageDestination boilerplate.
if writeCGImage(segmentation.cgImage, to: fileURL) {
print("Saved segmentation \(index + 1) to \(fileURL.path)")
} else {
print("Failed to save segmentation \(index + 1)")
}
}
} catch {
print("Error creating directory: \(error.localizedDescription)")
}
}
private func reset() {
selectedPoints = []
boundingBoxes = []
currentBox = nil
currentSegmentation = nil
}
}
struct SizePreferenceKey: PreferenceKey {
static var defaultValue: CGSize = .zero
static func reduce(value: inout CGSize, nextValue: () -> CGSize) {
value = nextValue()
}
}
#Preview {
ContentView()
}
================================================
FILE: SAM2-Demo/Preview Content/Preview Assets.xcassets/Contents.json
================================================
{
"info" : {
"author" : "xcode",
"version" : 1
}
}
================================================
FILE: SAM2-Demo/Ripple/Ripple.metal
================================================
// Ripple.metal
/*
See the LICENSE at the end of the article for this sample’s licensing information.
Abstract:
A shader that applies a ripple effect to a view when using it as a SwiftUI layer
effect.
*/
#include <metal_stdlib>
#include <SwiftUI/SwiftUI.h>
using namespace metal;
[[ stitchable ]]
half4 Ripple(
float2 position,
SwiftUI::Layer layer,
float2 origin,
float time,
float amplitude,
float frequency,
float decay,
float speed
) {
// The distance of the current pixel position from `origin`.
float distance = length(position - origin);
// The amount of time it takes for the ripple to arrive at the current pixel position.
float delay = distance / speed;
// Adjust for delay, clamp to 0.
time -= delay;
time = max(0.0, time);
// The ripple is a sine wave that Metal scales by an exponential decay
// function.
float rippleAmount = amplitude * sin(frequency * time) * exp(-decay * time);
// A vector of length `amplitude` that points away from position.
float2 n = normalize(position - origin);
// Scale `n` by the ripple amount at the current pixel position and add it
// to the current pixel position.
//
// This new position moves toward or away from `origin` based on the
// sign and magnitude of `rippleAmount`.
float2 newPosition = position + rippleAmount * n;
// Sample the layer at the new position.
half4 color = layer.sample(newPosition);
// Lighten or darken the color based on the ripple amount and its alpha
// component.
color.rgb += 0.3 * (rippleAmount / amplitude) * color.a;
return color;
}
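The displacement the shader computes is a decaying sine wave whose arrival is delayed by distance from the origin; the same curve can be sketched on the CPU (default parameter values mirror `RippleEffect` in RippleViewModifier.swift):

```swift
import Foundation

// Mirrors the Metal shader above: the wave reaches a pixel after
// distance/speed seconds, then follows sin(frequency*t) scaled by
// an exponential decay envelope.
func rippleAmount(distance: Double, t: Double,
                  amplitude: Double = 12, frequency: Double = 15,
                  decay: Double = 8, speed: Double = 1200) -> Double {
    let time = max(0.0, t - distance / speed)
    return amplitude * sin(frequency * time) * exp(-decay * time)
}
```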
================================================
FILE: SAM2-Demo/Ripple/RippleModifier.swift
================================================
//
// RippleModifier.swift
// HuggingChat-Mac
//
// Created by Cyril Zakka on 8/28/24.
//
import SwiftUI
/* See the LICENSE at the end of the article for this sample's licensing information. */
/// A modifier that applies a ripple effect to its content.
struct RippleModifier: ViewModifier {
var origin: CGPoint
var elapsedTime: TimeInterval
var duration: TimeInterval
var amplitude: Double
var frequency: Double
var decay: Double
var speed: Double
func body(content: Content) -> some View {
let shader = ShaderLibrary.Ripple(
.float2(origin),
.float(elapsedTime),
// Parameters
.float(amplitude),
.float(frequency),
.float(decay),
.float(speed)
)
let maxSampleOffset = maxSampleOffset
let elapsedTime = elapsedTime
let duration = duration
content.visualEffect { view, _ in
view.layerEffect(
shader,
maxSampleOffset: maxSampleOffset,
isEnabled: 0 < elapsedTime && elapsedTime < duration
)
}
}
var maxSampleOffset: CGSize {
CGSize(width: amplitude, height: amplitude)
}
}
================================================
FILE: SAM2-Demo/Ripple/RippleViewModifier.swift
================================================
//
// RippleViewModifier.swift
// HuggingChat-Mac
//
// Created by Cyril Zakka on 8/28/24.
//
import SwiftUI
struct RippleEffect<T: Equatable>: ViewModifier {
var origin: CGPoint
var trigger: T
var amplitude: Double
var frequency: Double
var decay: Double
var speed: Double
init(at origin: CGPoint, trigger: T, amplitude: Double = 12, frequency: Double = 15, decay: Double = 8, speed: Double = 1200) {
self.origin = origin
self.trigger = trigger
self.amplitude = amplitude
self.frequency = frequency
self.decay = decay
self.speed = speed
}
func body(content: Content) -> some View {
let origin = origin
let duration = duration
let amplitude = amplitude
let frequency = frequency
let decay = decay
let speed = speed
content.keyframeAnimator(
initialValue: 0,
trigger: trigger
) { view, elapsedTime in
view.modifier(RippleModifier(
origin: origin,
elapsedTime: elapsedTime,
duration: duration,
amplitude: amplitude,
frequency: frequency,
decay: decay,
speed: speed
))
} keyframes: { _ in
MoveKeyframe(0)
LinearKeyframe(duration, duration: duration)
}
}
var duration: TimeInterval { 3 }
}
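`RippleEffect` is attached in `SegmentationOverlay` (ContentView.swift); in isolation it reads like any trigger-driven modifier. A minimal sketch with an illustrative view:

```swift
struct RipplePreview: View {
    @State private var rippleTrigger = 0
    var body: some View {
        Image(systemName: "photo")
            .font(.system(size: 120))
            // Re-runs the keyframe animation whenever the counter changes.
            .modifier(RippleEffect(at: CGPoint(x: 60, y: 60), trigger: rippleTrigger))
            .onTapGesture { rippleTrigger += 1 }
    }
}
```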
================================================
FILE: SAM2-Demo/SAM2_1SmallImageEncoderFLOAT16.mlpackage/Manifest.json
================================================
{
"fileFormatVersion": "1.0.0",
"itemInfoEntries": {
"4C20C7AA-F42B-4CCD-84C3-73C031A91D48": {
"author": "com.apple.CoreML",
"description": "CoreML Model Weights",
"name": "weights",
"path": "com.apple.CoreML/weights"
},
"DDCB1D63-C7BD-4A13-8EB5-D7151371105B": {
"author": "com.apple.CoreML",
"description": "CoreML Model Specification",
"name": "model.mlmodel",
"path": "com.apple.CoreML/model.mlmodel"
}
},
"rootModelIdentifier": "DDCB1D63-C7BD-4A13-8EB5-D7151371105B"
}
================================================
FILE: SAM2-Demo/SAM2_1SmallMaskDecoderFLOAT16.mlpackage/Manifest.json
================================================
{
"fileFormatVersion": "1.0.0",
"itemInfoEntries": {
"6FA6762D-69A1-4A0B-AB0D-512638FD7ECF": {
"author": "com.apple.CoreML",
"description": "CoreML Model Specification",
"name": "model.mlmodel",
"path": "com.apple.CoreML/model.mlmodel"
},
"DB82D069-C4C9-41FB-A178-262063485D28": {
"author": "com.apple.CoreML",
"description": "CoreML Model Weights",
"name": "weights",
"path": "com.apple.CoreML/weights"
}
},
"rootModelIdentifier": "6FA6762D-69A1-4A0B-AB0D-512638FD7ECF"
}
================================================
FILE: SAM2-Demo/SAM2_1SmallPromptEncoderFLOAT16.mlpackage/Manifest.json
================================================
{
"fileFormatVersion": "1.0.0",
"itemInfoEntries": {
"BE0329D0-1E5D-4FF9-8ECE-350FC8DE699D": {
"author": "com.apple.CoreML",
"description": "CoreML Model Weights",
"name": "weights",
"path": "com.apple.CoreML/weights"
},
"C1F60EF7-4F31-4243-8BE5-C107CB23EADF": {
"author": "com.apple.CoreML",
"description": "CoreML Model Specification",
"name": "model.mlmodel",
"path": "com.apple.CoreML/model.mlmodel"
}
},
"rootModelIdentifier": "C1F60EF7-4F31-4243-8BE5-C107CB23EADF"
}
================================================
FILE: SAM2-Demo/SAM2_Demo.entitlements
================================================
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE plist PUBLIC "-//Apple//DTD PLIST 1.0//EN" "http://www.apple.com/DTDs/PropertyList-1.0.dtd">
<plist version="1.0">
<dict>
<key>com.apple.security.app-sandbox</key>
<true/>
<key>com.apple.security.files.user-selected.read-write</key>
<true/>
</dict>
</plist>
================================================
FILE: SAM2-Demo/SAM2_DemoApp.swift
================================================
//
// SAM2_DemoApp.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 8/19/24.
//
import SwiftUI
@main
struct SAM2_DemoApp: App {
var body: some Scene {
WindowGroup {
ContentView()
}
.windowToolbarStyle(UnifiedWindowToolbarStyle(showsTitle: false))
}
}
================================================
FILE: SAM2-Demo/Views/AnnotationListView.swift
================================================
//
// AnnotationListView.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 9/8/24.
//
import SwiftUI
struct AnnotationListView: View {
@Binding var segmentation: SAMSegmentation
@State var showHideIcon: Bool = false
var body: some View {
HStack {
Image(nsImage: NSImage(cgImage: segmentation.cgImage, size: NSSize(width: 25, height: 25)))
.background(.quinary)
.mask(RoundedRectangle(cornerRadius: 5))
VStack(alignment: .leading) {
Text(segmentation.title)
.font(.headline)
.foregroundStyle(segmentation.isHidden ? .tertiary : .primary)
// Text(segmentation.firstAppearance)
// .font(.subheadline)
// .foregroundStyle(.secondary)
}
Spacer()
Button("", systemImage: segmentation.isHidden ? "eye.slash.fill" : "eye.fill", action: {
segmentation.isHidden.toggle()
})
.opacity(segmentation.isHidden ? 1 : (showHideIcon ? 1 : 0))
.buttonStyle(.borderless)
.foregroundStyle(.secondary)
}
.onHover { state in
showHideIcon = state
}
}
}
================================================
FILE: SAM2-Demo/Views/ImageView.swift
================================================
//
// ImageView.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 9/8/24.
//
import SwiftUI
struct ImageView: View {
let image: NSImage
@Binding var currentScale: CGFloat
@Binding var selectedTool: SAMTool?
@Binding var selectedCategory: SAMCategory?
@Binding var selectedPoints: [SAMPoint]
@Binding var boundingBoxes: [SAMBox]
@Binding var currentBox: SAMBox?
@Binding var segmentationImages: [SAMSegmentation]
@Binding var currentSegmentation: SAMSegmentation?
@Binding var imageSize: CGSize
@Binding var originalSize: NSSize?
@State var animationPoint: CGPoint = .zero
@ObservedObject var sam2: SAM2
@State private var error: Error?
var pointSequence: [SAMPoint] {
boundingBoxes.flatMap { $0.points } + selectedPoints
}
var body: some View {
Image(nsImage: image)
.resizable()
.aspectRatio(contentMode: .fit)
.scaleEffect(currentScale)
.onTapGesture(coordinateSpace: .local) { handleTap(at: $0) }
.gesture(boundingBoxGesture)
.onHover { changeCursorAppearance(is: $0) }
.background(GeometryReader { geometry in
Color.clear.preference(key: SizePreferenceKey.self, value: geometry.size)
})
.onPreferenceChange(SizePreferenceKey.self) { imageSize = $0 }
.onChange(of: selectedPoints.count, {
if !selectedPoints.isEmpty {
performForwardPass()
}
})
.onChange(of: boundingBoxes.count, {
if !boundingBoxes.isEmpty {
performForwardPass()
}
})
.overlay {
PointsOverlay(selectedPoints: $selectedPoints, selectedTool: $selectedTool, imageSize: imageSize)
BoundingBoxesOverlay(boundingBoxes: boundingBoxes, currentBox: currentBox, imageSize: imageSize)
if !segmentationImages.isEmpty {
ForEach(Array(segmentationImages.enumerated()), id: \.element.id) { index, segmentation in
SegmentationOverlay(segmentationImage: $segmentationImages[index], imageSize: imageSize, shouldAnimate: false)
.zIndex(Double(segmentationImages.count - index))
}
}
if let currentSegmentation = currentSegmentation {
SegmentationOverlay(segmentationImage: .constant(currentSegmentation), imageSize: imageSize, origin: animationPoint, shouldAnimate: true)
.zIndex(Double(segmentationImages.count + 1))
}
}
}
private func changeCursorAppearance(is inside: Bool) {
if inside {
if selectedTool == pointTool {
NSCursor.pointingHand.push()
} else if selectedTool == boundingBoxTool {
NSCursor.crosshair.push()
}
} else {
NSCursor.pop()
}
}
private var boundingBoxGesture: some Gesture {
DragGesture(minimumDistance: 0)
.onChanged { value in
guard selectedTool == boundingBoxTool else { return }
if currentBox == nil {
currentBox = SAMBox(startPoint: value.startLocation.fromSize(imageSize), endPoint: value.location.fromSize(imageSize), category: selectedCategory!)
} else {
currentBox?.endPoint = value.location.fromSize(imageSize)
}
}
.onEnded { value in
guard selectedTool == boundingBoxTool else { return }
if let box = currentBox {
boundingBoxes.append(box)
animationPoint = box.midpoint.toSize(imageSize)
currentBox = nil
}
}
}
private func handleTap(at location: CGPoint) {
if selectedTool == pointTool {
placePoint(at: location)
animationPoint = location
}
}
private func placePoint(at coordinates: CGPoint) {
let samPoint = SAMPoint(coordinates: coordinates.fromSize(imageSize), category: selectedCategory!)
self.selectedPoints.append(samPoint)
}
private func performForwardPass() {
Task {
do {
try await sam2.getPromptEncoding(from: pointSequence, with: imageSize)
if let mask = try await sam2.getMask(for: originalSize ?? .zero) {
DispatchQueue.main.async {
let colorSet = self.segmentationImages.map { $0.tintColor }
let furthestColor = furthestColor(from: colorSet, among: SAMSegmentation.candidateColors)
let segmentationNumber = segmentationImages.count
let segmentationOverlay = SAMSegmentation(image: mask, tintColor: furthestColor, title: "Untitled \(segmentationNumber + 1)")
self.currentSegmentation = segmentationOverlay
}
}
} catch {
self.error = error
}
}
}
}
#Preview {
ContentView()
}
extension CGPoint {
func fromSize(_ size: CGSize) -> CGPoint {
CGPoint(x: x / size.width, y: y / size.height)
}
func toSize(_ size: CGSize) -> CGPoint {
CGPoint(x: x * size.width, y: y * size.height)
}
}
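The `fromSize`/`toSize` helpers above keep prompt coordinates resolution-independent: `fromSize` maps a point in view space into the normalized `[0, 1]` range, and `toSize` maps it back. A standalone sketch of the round trip (the `viewSize` and `tap` values are illustrative):

```swift
import CoreGraphics

extension CGPoint {
    // Normalize a view-space point into [0, 1] relative coordinates.
    func fromSize(_ size: CGSize) -> CGPoint {
        CGPoint(x: x / size.width, y: y / size.height)
    }
    // Project a normalized point back into view-space coordinates.
    func toSize(_ size: CGSize) -> CGPoint {
        CGPoint(x: x * size.width, y: y * size.height)
    }
}

let viewSize = CGSize(width: 800, height: 600)
let tap = CGPoint(x: 400, y: 150)
let normalized = tap.fromSize(viewSize)   // (0.5, 0.25)
let restored = normalized.toSize(viewSize) // (400.0, 150.0)
```

Because the model receives normalized coordinates, the same prompt stays valid when the view is resized or the mask is mapped back onto the original image resolution.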
================================================
FILE: SAM2-Demo/Views/LayerListView.swift
================================================
//
// LayerListView.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 9/8/24.
//
import SwiftUI
struct LayerListView: View {
@Binding var segmentationImages: [SAMSegmentation]
@Binding var selectedSegmentations: Set<SAMSegmentation.ID>
@Binding var currentSegmentation: SAMSegmentation?
var body: some View {
List(selection: $selectedSegmentations) {
Section("Annotations List") {
ForEach(Array(segmentationImages.enumerated()), id: \.element.id) { index, segmentation in
AnnotationListView(segmentation: $segmentationImages[index])
.padding(.horizontal, 5)
.contextMenu {
Button(role: .destructive) {
if let index = segmentationImages.firstIndex(where: { $0.id == segmentation.id }) {
segmentationImages.remove(at: index)
}
} label: {
Label("Delete", systemImage: "trash.fill")
}
}
}
.onDelete(perform: delete)
.onMove(perform: move)
if let currentSegmentation = currentSegmentation {
AnnotationListView(segmentation: .constant(currentSegmentation))
.tag(currentSegmentation.id)
}
}
}
.listStyle(.sidebar)
}
func delete(at offsets: IndexSet) {
segmentationImages.remove(atOffsets: offsets)
}
func move(from source: IndexSet, to destination: Int) {
segmentationImages.move(fromOffsets: source, toOffset: destination)
}
}
#Preview {
ContentView()
}
================================================
FILE: SAM2-Demo/Views/MaskEditor.swift
================================================
//
// MaskEditor.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 9/10/24.
//
import SwiftUI
struct MaskEditor: View {
@Binding var exportMaskToPNG: Bool
@Binding var segmentationImages: [SAMSegmentation]
@Binding var selectedSegmentations: Set<SAMSegmentation.ID>
@Binding var currentSegmentation: SAMSegmentation?
@State private var bgColor = Color(.sRGB, red: 30/255, green: 144/255, blue: 255/255)
var body: some View {
Form {
Section {
ColorPicker("Color", selection: $bgColor)
.onChange(of: bgColor) { oldColor, newColor in
updateSelectedSegmentationsColor(newColor)
}
Button("Export Selected...", action: {
exportMaskToPNG = true
})
.frame(maxWidth: .infinity, maxHeight: .infinity)
}
}
.frame(maxWidth: .infinity, maxHeight: .infinity)
.onChange(of: selectedSegmentations) { oldValue, newValue in
bgColor = getColorOfFirstSelectedSegmentation()
}
.onAppear {
bgColor = getColorOfFirstSelectedSegmentation()
}
}
private func updateSelectedSegmentationsColor(_ newColor: Color) {
for id in selectedSegmentations {
for index in segmentationImages.indices where segmentationImages[index].id == id {
segmentationImages[index].tintColor = newColor
}
if currentSegmentation?.id == id {
currentSegmentation?.tintColor = newColor
}
}
}
private func getColorOfFirstSelectedSegmentation() -> Color {
if let firstSelectedId = selectedSegmentations.first {
if let firstSelectedSegmentation = segmentationImages.first(where: { $0.id == firstSelectedId }) {
return firstSelectedSegmentation.tintColor
} else {
if let currentSegmentation {
return currentSegmentation.tintColor
}
}
}
return bgColor // Return default color if no segmentation is selected
}
}
#Preview {
ContentView()
}
================================================
FILE: SAM2-Demo/Views/SubtoolbarView.swift
================================================
//
// SubtoolbarView.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 9/8/24.
//
import SwiftUI
struct SubToolbar: View {
@Binding var selectedPoints: [SAMPoint]
@Binding var boundingBoxes: [SAMBox]
@Binding var segmentationImages: [SAMSegmentation]
@Binding var currentSegmentation: SAMSegmentation?
var body: some View {
if selectedPoints.count > 0 || boundingBoxes.count > 0 {
ZStack {
Rectangle()
.fill(.regularMaterial)
.frame(height: 30)
HStack {
Spacer()
Button("Undo", action: undo)
.padding(.trailing, 5)
.disabled(selectedPoints.isEmpty && boundingBoxes.isEmpty)
Button("Reset", action: resetAll)
.padding(.trailing, 5)
.disabled(selectedPoints.isEmpty && boundingBoxes.isEmpty)
}
}
.transition(.move(edge: .top))
}
}
private func newMask() {
}
private func resetAll() {
selectedPoints.removeAll()
boundingBoxes.removeAll()
segmentationImages = []
currentSegmentation = nil
}
private func undo() {
if let lastPoint = selectedPoints.last, let lastBox = boundingBoxes.last {
if lastPoint.dateAdded > lastBox.dateAdded {
selectedPoints.removeLast()
} else {
boundingBoxes.removeLast()
}
} else if !selectedPoints.isEmpty {
selectedPoints.removeLast()
} else if !boundingBoxes.isEmpty {
boundingBoxes.removeLast()
}
if selectedPoints.isEmpty && boundingBoxes.isEmpty {
currentSegmentation = nil
}
}
}
#Preview {
ContentView()
}
================================================
FILE: SAM2-Demo/Views/ZoomableScrollView.swift
================================================
//
// ZoomableScrollView.swift
// SAM2-Demo
//
// Created by Cyril Zakka on 9/12/24.
//
import AppKit
import SwiftUI
struct ZoomableScrollView<Content: View>: NSViewRepresentable {
@Binding var visibleRect: CGRect
private var content: Content
init(visibleRect: Binding<CGRect>, @ViewBuilder content: () -> Content) {
self._visibleRect = visibleRect
self.content = content()
}
func makeNSView(context: Context) -> NSScrollView {
let scrollView = NSScrollView()
scrollView.hasVerticalScroller = true
scrollView.hasHorizontalScroller = true
scrollView.autohidesScrollers = true
scrollView.allowsMagnification = false
scrollView.maxMagnification = 20
scrollView.minMagnification = 1
let hostedView = context.coordinator.hostingView
hostedView.translatesAutoresizingMaskIntoConstraints = true
hostedView.autoresizingMask = [.width, .height]
hostedView.frame = scrollView.bounds
scrollView.documentView = hostedView
return scrollView
}
func makeCoordinator() -> Coordinator {
let coordinator = Coordinator(hostingView: NSHostingView(rootView: self.content), parent: self)
coordinator.listen()
return coordinator
}
func updateNSView(_ nsView: NSScrollView, context: Context) {
context.coordinator.hostingView.rootView = self.content
}
// MARK: - Coordinator
class Coordinator: NSObject {
var hostingView: NSHostingView<Content>
var parent: ZoomableScrollView<Content>
init(hostingView: NSHostingView<Content>, parent: ZoomableScrollView<Content>) {
self.hostingView = hostingView
self.parent = parent
}
func listen() {
NotificationCenter.default.addObserver(forName: NSScrollView.didEndLiveMagnifyNotification, object: nil, queue: nil) { [weak self] notification in
guard let scrollView = notification.object as? NSScrollView else { return }
self?.parent.visibleRect = scrollView.documentVisibleRect
}
NotificationCenter.default.addObserver(forName: NSScrollView.didEndLiveScrollNotification, object: nil, queue: nil) { [weak self] notification in
guard let scrollView = notification.object as? NSScrollView else { return }
self?.parent.visibleRect = scrollView.documentVisibleRect
}
}
}
}
================================================
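`ZoomableScrollView` wraps an `NSScrollView` whose document view hosts arbitrary SwiftUI content, reporting the visible rect back through a binding after each live scroll or magnify gesture. A minimal call-site sketch (the `ExampleContainer` view and the `"sample"` image name are hypothetical, not part of the app):

```swift
import AppKit
import SwiftUI

// Hypothetical container demonstrating how ZoomableScrollView is driven:
// the binding receives the document's visible rect after scroll/zoom ends.
struct ExampleContainer: View {
    @State private var visibleRect: CGRect = .zero

    var body: some View {
        ZoomableScrollView(visibleRect: $visibleRect) {
            Image(nsImage: NSImage(named: "sample") ?? NSImage())
                .resizable()
                .aspectRatio(contentMode: .fit)
        }
    }
}
```

The parent can observe `visibleRect` to, for example, restrict expensive redraws to the portion of the image currently on screen.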
FILE: SAM2-Demo.xcodeproj/project.pbxproj
================================================
// !$*UTF8*$!
{
archiveVersion = 1;
classes = {
};
objectVersion = 70;
objects = {
/* Begin PBXBuildFile section */
EBAB91282C88A05500F57B83 /* ArgumentParser in Frameworks */ = {isa = PBXBuildFile; productRef = EBAB91272C88A05500F57B83 /* ArgumentParser */; };
/* End PBXBuildFile section */
/* Begin PBXCopyFilesBuildPhase section */
EBAB911D2C889D5200F57B83 /* CopyFiles */ = {
isa = PBXCopyFilesBuildPhase;
buildActionMask = 2147483647;
dstPath = /usr/share/man/man1/;
dstSubfolderSpec = 0;
files = (
);
runOnlyForDeploymentPostprocessing = 1;
};
/* End PBXCopyFilesBuildPhase section */
/* Begin PBXFileReference section */
EBAB911F2C889D5200F57B83 /* sam2-cli */ = {isa = PBXFileReference; explicitFileType = "compiled.mach-o.executable"; includeInIndex = 0; path = "sam2-cli"; sourceTree = BUILT_PRODUCTS_DIR; };
F136320F2C73AE78009DEF15 /* SAM 2 Studio.app */ = {isa = PBXFileReference; explicitFileType = wrapper.application; includeInIndex = 0; path = "SAM 2 Studio.app"; sourceTree = BUILT_PRODUCTS_DIR; };
/* End PBXFileReference section */
/* Begin PBXFileSystemSynchronizedBuildFileExceptionSet section */
EBAB912A2C88A21E00F57B83 /* Exceptions for "SAM2-Demo" folder in "sam2-cli" target */ = {
isa = PBXFileSystemSynchronizedBuildFileExceptionSet;
membershipExceptions = (
"Common/CGImage+Extension.swift",
"Common/CGImage+RawBytes.swift",
Common/CoreImageExtensions.swift,
Common/DirectoryDocument.swift,
"Common/MLMultiArray+Image.swift",
Common/Models.swift,
Common/SAM2.swift,
SAM2_1SmallImageEncoderFLOAT16.mlpackage,
SAM2_1SmallMaskDecoderFLOAT16.mlpackage,
SAM2_1SmallPromptEncoderFLOAT16.mlpackage,
);
target = EBAB911E2C889D5200F57B83 /* sam2-cli */;
};
/* End PBXFileSystemSynchronizedBuildFileExceptionSet section */
/* Begin PBXFileSystemSynchronizedRootGroup section */
EBAB91202C889D5200F57B83 /* sam2-cli */ = {
isa = PBXFileSystemSynchronizedRootGroup;
path = "sam2-cli";
sourceTree = "<group>";
};
F13632112C73AE78009DEF15 /* SAM2-Demo */ = {
isa = PBXFileSystemSynchronizedRootGroup;
exceptions = (
EBAB912A2C88A21E00F57B83 /* Exceptions for "SAM2-Demo" folder in "sam2-cli" target */,
);
path = "SAM2-Demo";
sourceTree = "<group>";
};
/* End PBXFileSystemSynchronizedRootGroup section */
/* Begin PBXFrameworksBuildPhase section */
EBAB911C2C889D5200F57B83 /* Frameworks */ = {
isa = PBXFrameworksBuildPhase;
buildActionMask = 2147483647;
files = (
EBAB91282C88A05500F57B83 /* ArgumentParser in Frameworks */,
);
runOnlyForDeploymentPostprocessing = 0;
};
F136320C2C73AE78009DEF15 /* Frameworks */ = {
isa = PBXFrameworksBuildPhase;
buildActionMask = 2147483647;
files = (
);
runOnlyForDeploymentPostprocessing = 0;
};
/* End PBXFrameworksBuildPhase section */
/* Begin PBXGroup section */
F13632062C73AE77009DEF15 = {
isa = PBXGroup;
children = (
F13632112C73AE78009DEF15 /* SAM2-Demo */,
EBAB91202C889D5200F57B83 /* sam2-cli */,
F13632102C73AE78009DEF15 /* Products */,
);
sourceTree = "<group>";
};
F13632102C73AE78009DEF15 /* Products */ = {
isa = PBXGroup;
children = (
F136320F2C73AE78009DEF15 /* SAM 2 Studio.app */,
EBAB911F2C889D5200F57B83 /* sam2-cli */,
);
name = Products;
sourceTree = "<group>";
};
/* End PBXGroup section */
/* Begin PBXNativeTarget section */
EBAB911E2C889D5200F57B83 /* sam2-cli */ = {
isa = PBXNativeTarget;
buildConfigurationList = EBAB91252C889D5200F57B83 /* Build configuration list for PBXNativeTarget "sam2-cli" */;
buildPhases = (
EBAB911B2C889D5200F57B83 /* Sources */,
EBAB911C2C889D5200F57B83 /* Frameworks */,
EBAB911D2C889D5200F57B83 /* CopyFiles */,
);
buildRules = (
);
dependencies = (
);
fileSystemSynchronizedGroups = (
EBAB91202C889D5200F57B83 /* sam2-cli */,
);
name = "sam2-cli";
packageProductDependencies = (
EBAB91272C88A05500F57B83 /* ArgumentParser */,
);
productName = "sam2-cli";
productReference = EBAB911F2C889D5200F57B83 /* sam2-cli */;
productType = "com.apple.product-type.tool";
};
F136320E2C73AE78009DEF15 /* SAM2-Demo */ = {
isa = PBXNativeTarget;
buildConfigurationList = F136321E2C73AE79009DEF15 /* Build configuration list for PBXNativeTarget "SAM2-Demo" */;
buildPhases = (
F136320B2C73AE78009DEF15 /* Sources */,
F136320C2C73AE78009DEF15 /* Frameworks */,
F136320D2C73AE78009DEF15 /* Resources */,
);
buildRules = (
);
dependencies = (
);
fileSystemSynchronizedGroups = (
F13632112C73AE78009DEF15 /* SAM2-Demo */,
);
name = "SAM2-Demo";
packageProductDependencies = (
);
productName = "SAM2-Demo";
productReference = F136320F2C73AE78009DEF15 /* SAM 2 Studio.app */;
productType = "com.apple.product-type.application";
};
/* End PBXNativeTarget section */
/* Begin PBXProject section */
F13632072C73AE77009DEF15 /* Project object */ = {
isa = PBXProject;
attributes = {
BuildIndependentTargetsInParallel = 1;
LastSwiftUpdateCheck = 1600;
LastUpgradeCheck = 1600;
TargetAttributes = {
EBAB911E2C889D5200F57B83 = {
CreatedOnToolsVersion = 16.0;
};
F136320E2C73AE78009DEF15 = {
CreatedOnToolsVersion = 16.0;
};
};
};
buildConfigurationList = F136320A2C73AE77009DEF15 /* Build configuration list for PBXProject "SAM2-Demo" */;
developmentRegion = en;
hasScannedForEncodings = 0;
knownRegions = (
en,
Base,
);
mainGroup = F13632062C73AE77009DEF15;
minimizedProjectReferenceProxies = 1;
packageReferences = (
EBAB91262C88A05500F57B83 /* XCRemoteSwiftPackageReference "swift-argument-parser" */,
);
preferredProjectObjectVersion = 77;
productRefGroup = F13632102C73AE78009DEF15 /* Products */;
projectDirPath = "";
projectRoot = "";
targets = (
F136320E2C73AE78009DEF15 /* SAM2-Demo */,
EBAB911E2C889D5200F57B83 /* sam2-cli */,
);
};
/* End PBXProject section */
/* Begin PBXResourcesBuildPhase section */
F136320D2C73AE78009DEF15 /* Resources */ = {
isa = PBXResourcesBuildPhase;
buildActionMask = 2147483647;
files = (
);
runOnlyForDeploymentPostprocessing = 0;
};
/* End PBXResourcesBuildPhase section */
/* Begin PBXSourcesBuildPhase section */
EBAB911B2C889D5200F57B83 /* Sources */ = {
isa = PBXSourcesBuildPhase;
buildActionMask = 2147483647;
files = (
);
runOnlyForDeploymentPostprocessing = 0;
};
F136320B2C73AE78009DEF15 /* Sources */ = {
isa = PBXSourcesBuildPhase;
buildActionMask = 2147483647;
files = (
);
runOnlyForDeploymentPostprocessing = 0;
};
/* End PBXSourcesBuildPhase section */
/* Begin XCBuildConfiguration section */
EBAB91232C889D5200F57B83 /* Debug */ = {
isa = XCBuildConfiguration;
buildSettings = {
CODE_SIGN_STYLE = Automatic;
DEVELOPMENT_TEAM = "";
ENABLE_HARDENED_RUNTIME = YES;
PRODUCT_NAME = "$(TARGET_NAME)";
SWIFT_VERSION = 5.0;
};
name = Debug;
};
EBAB91242C889D5200F57B83 /* Release */ = {
isa = XCBuildConfiguration;
buildSettings = {
CODE_SIGN_STYLE = Automatic;
DEVELOPMENT_TEAM = "";
ENABLE_HARDENED_RUNTIME = YES;
PRODUCT_NAME = "$(TARGET_NAME)";
SWIFT_VERSION = 5.0;
};
name = Release;
};
F136321C2C73AE79009DEF15 /* Debug */ = {
isa = XCBuildConfiguration;
buildSettings = {
ALWAYS_SEARCH_USER_PATHS = NO;
ARCHS = arm64;
ASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;
CLANG_ANALYZER_NONNULL = YES;
CLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;
CLANG_CXX_LANGUAGE_STANDARD = "gnu++20";
CLANG_ENABLE_MODULES = YES;
CLANG_ENABLE_OBJC_ARC = YES;
CLANG_ENABLE_OBJC_WEAK = YES;
CLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;
CLANG_WARN_BOOL_CONVERSION = YES;
CLANG_WARN_COMMA = YES;
CLANG_WARN_CONSTANT_CONVERSION = YES;
CLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;
CLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;
CLANG_WARN_DOCUMENTATION_COMMENTS = YES;
CLANG_WARN_EMPTY_BODY = YES;
CLANG_WARN_ENUM_CONVERSION = YES;
CLANG_WARN_INFINITE_RECURSION = YES;
CLANG_WARN_INT_CONVERSION = YES;
CLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;
CLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;
CLANG_WARN_OBJC_LITERAL_CONVERSION = YES;
CLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;
CLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;
CLANG_WARN_RANGE_LOOP_ANALYSIS = YES;
CLANG_WARN_STRICT_PROTOTYPES = YES;
CLANG_WARN_SUSPICIOUS_MOVE = YES;
CLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;
CLANG_WARN_UNREACHABLE_CODE = YES;
CLANG_WARN__DUPLICATE_METHOD_MATCH = YES;
COPY_PHASE_STRIP = NO;
DEBUG_INFORMATION_FORMAT = dwarf;
DEVELOPMENT_TEAM = 2EADP68M95;
ENABLE_STRICT_OBJC_MSGSEND = YES;
ENABLE_TESTABILITY = YES;
ENABLE_USER_SCRIPT_SANDBOXING = YES;
GCC_C_LANGUAGE_STANDARD = gnu17;
GCC_DYNAMIC_NO_PIC = NO;
GCC_NO_COMMON_BLOCKS = YES;
GCC_OPTIMIZATION_LEVEL = 0;
GCC_PREPROCESSOR_DEFINITIONS = (
"DEBUG=1",
"$(inherited)",
);
GCC_WARN_64_TO_32_BIT_CONVERSION = YES;
GCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;
GCC_WARN_UNDECLARED_SELECTOR = YES;
GCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;
GCC_WARN_UNUSED_FUNCTION = YES;
GCC_WARN_UNUSED_VARIABLE = YES;
LOCALIZATION_PREFERS_STRING_CATALOGS = YES;
MACOSX_DEPLOYMENT_TARGET = 14.3;
MTL_ENABLE_DEBUG_INFO = INCLUDE_SOURCE;
MTL_FAST_MATH = YES;
ONLY_ACTIVE_ARCH = YES;
SDKROOT = macosx;
SWIFT_ACTIVE_COMPILATION_CONDITIONS = "DEBUG $(inherited)";
SWIFT_OPTIMIZATION_LEVEL = "-Onone";
};
name = Debug;
};
F136321D2C73AE79009DEF15 /* Release */ = {
isa = XCBuildConfiguration;
buildSettings = {
ALWAYS_SEARCH_USER_PATHS = NO;
ARCHS = arm64;
ASSETCATALOG_COMPILER_GENERATE_SWIFT_ASSET_SYMBOL_EXTENSIONS = YES;
CLANG_ANALYZER_NONNULL = YES;
CLANG_ANALYZER_NUMBER_OBJECT_CONVERSION = YES_AGGRESSIVE;
CLANG_CXX_LANGUAGE_STANDARD = "gnu++20";
CLANG_ENABLE_MODULES = YES;
CLANG_ENABLE_OBJC_ARC = YES;
CLANG_ENABLE_OBJC_WEAK = YES;
CLANG_WARN_BLOCK_CAPTURE_AUTORELEASING = YES;
CLANG_WARN_BOOL_CONVERSION = YES;
CLANG_WARN_COMMA = YES;
CLANG_WARN_CONSTANT_CONVERSION = YES;
CLANG_WARN_DEPRECATED_OBJC_IMPLEMENTATIONS = YES;
CLANG_WARN_DIRECT_OBJC_ISA_USAGE = YES_ERROR;
CLANG_WARN_DOCUMENTATION_COMMENTS = YES;
CLANG_WARN_EMPTY_BODY = YES;
CLANG_WARN_ENUM_CONVERSION = YES;
CLANG_WARN_INFINITE_RECURSION = YES;
CLANG_WARN_INT_CONVERSION = YES;
CLANG_WARN_NON_LITERAL_NULL_CONVERSION = YES;
CLANG_WARN_OBJC_IMPLICIT_RETAIN_SELF = YES;
CLANG_WARN_OBJC_LITERAL_CONVERSION = YES;
CLANG_WARN_OBJC_ROOT_CLASS = YES_ERROR;
CLANG_WARN_QUOTED_INCLUDE_IN_FRAMEWORK_HEADER = YES;
CLANG_WARN_RANGE_LOOP_ANALYSIS = YES;
CLANG_WARN_STRICT_PROTOTYPES = YES;
CLANG_WARN_SUSPICIOUS_MOVE = YES;
CLANG_WARN_UNGUARDED_AVAILABILITY = YES_AGGRESSIVE;
CLANG_WARN_UNREACHABLE_CODE = YES;
CLANG_WARN__DUPLICATE_METHOD_MATCH = YES;
COPY_PHASE_STRIP = NO;
DEBUG_INFORMATION_FORMAT = "dwarf-with-dsym";
DEVELOPMENT_TEAM = 2EADP68M95;
ENABLE_NS_ASSERTIONS = NO;
ENABLE_STRICT_OBJC_MSGSEND = YES;
ENABLE_USER_SCRIPT_SANDBOXING = YES;
GCC_C_LANGUAGE_STANDARD = gnu17;
GCC_NO_COMMON_BLOCKS = YES;
GCC_WARN_64_TO_32_BIT_CONVERSION = YES;
GCC_WARN_ABOUT_RETURN_TYPE = YES_ERROR;
GCC_WARN_UNDECLARED_SELECTOR = YES;
GCC_WARN_UNINITIALIZED_AUTOS = YES_AGGRESSIVE;
GCC_WARN_UNUSED_FUNCTION = YES;
GCC_WARN_UNUSED_VARIABLE = YES;
LOCALIZATION_PREFERS_STRING_CATALOGS = YES;
MACOSX_DEPLOYMENT_TARGET = 14.3;
MTL_ENABLE_DEBUG_INFO = NO;
MTL_FAST_MATH = YES;
SDKROOT = macosx;
SWIFT_COMPILATION_MODE = wholemodule;
};
name = Release;
};
F136321F2C73AE79009DEF15 /* Debug */ = {
isa = XCBuildConfiguration;
buildSettings = {
ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;
ASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;
CODE_SIGN_ENTITLEMENTS = "SAM2-Demo/SAM2_Demo.entitlements";
CODE_SIGN_STYLE = Automatic;
COMBINE_HIDPI_IMAGES = YES;
CURRENT_PROJECT_VERSION = 1;
DEVELOPMENT_ASSET_PATHS = "\"SAM2-Demo/Preview Content\"";
DEVELOPMENT_TEAM = "";
ENABLE_HARDENED_RUNTIME = YES;
ENABLE_PREVIEWS = YES;
GENERATE_INFOPLIST_FILE = YES;
INFOPLIST_KEY_NSHumanReadableCopyright = "";
LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)",
"@executable_path/../Frameworks",
);
MARKETING_VERSION = 1.0;
PRODUCT_BUNDLE_IDENTIFIER = "co.huggingface.sam-2-studio";
PRODUCT_NAME = "SAM 2 Studio";
SWIFT_EMIT_LOC_STRINGS = YES;
SWIFT_VERSION = 5.0;
};
name = Debug;
};
F13632202C73AE79009DEF15 /* Release */ = {
isa = XCBuildConfiguration;
buildSettings = {
ASSETCATALOG_COMPILER_APPICON_NAME = AppIcon;
ASSETCATALOG_COMPILER_GLOBAL_ACCENT_COLOR_NAME = AccentColor;
CODE_SIGN_ENTITLEMENTS = "SAM2-Demo/SAM2_Demo.entitlements";
CODE_SIGN_STYLE = Automatic;
COMBINE_HIDPI_IMAGES = YES;
CURRENT_PROJECT_VERSION = 1;
DEVELOPMENT_ASSET_PATHS = "\"SAM2-Demo/Preview Content\"";
DEVELOPMENT_TEAM = "";
ENABLE_HARDENED_RUNTIME = YES;
ENABLE_PREVIEWS = YES;
GENERATE_INFOPLIST_FILE = YES;
INFOPLIST_KEY_NSHumanReadableCopyright = "";
LD_RUNPATH_SEARCH_PATHS = (
"$(inherited)",
"@executable_path/../Frameworks",
);
MARKETING_VERSION = 1.0;
PRODUCT_BUNDLE_IDENTIFIER = "co.huggingface.sam-2-studio";
PRODUCT_NAME = "SAM 2 Studio";
SWIFT_EMIT_LOC_STRINGS = YES;
SWIFT_VERSION = 5.0;
};
name = Release;
};
/* End XCBuildConfiguration section */
/* Begin XCConfigurationList section */
EBAB91252C889D5200F57B83 /* Build configuration list for PBXNativeTarget "sam2-cli" */ = {
isa = XCConfigurationList;
buildConfigurations = (
EBAB91232C889D5200F57B83 /* Debug */,
EBAB91242C889D5200F57B83 /* Release */,
);
defaultConfigurationIsVisible = 0;
defaultConfigurationName = Release;
};
F136320A2C73AE77009DEF15 /* Build configuration list for PBXProject "SAM2-Demo" */ = {
isa = XCConfigurationList;
buildConfigurations = (
F136321C2C73AE79009DEF15 /* Debug */,
F136321D2C73AE79009DEF15 /* Release */,
);
defaultConfigurationIsVisible = 0;
defaultConfigurationName = Release;
};
F136321E2C73AE79009DEF15 /* Build configuration list for PBXNativeTarget "SAM2-Demo" */ = {
isa = XCConfigurationList;
buildConfigurations = (
F136321F2C73AE79009DEF15 /* Debug */,
F13632202C73AE79009DEF15 /* Release */,
);
defaultConfigurationIsVisible = 0;
defaultConfigurationName = Release;
};
/* End XCConfigurationList section */
/* Begin XCRemoteSwiftPackageReference section */
EBAB91262C88A05500F57B83 /* XCRemoteSwiftPackageReference "swift-argument-parser" */ = {
isa = XCRemoteSwiftPackageReference;
repositoryURL = "https://github.com/apple/swift-argument-parser.git";
requirement = {
kind = upToNextMajorVersion;
minimumVersion = 1.5.0;
};
};
/* End XCRemoteSwiftPackageReference section */
/* Begin XCSwiftPackageProductDependency section */
EBAB91272C88A05500F57B83 /* ArgumentParser */ = {
isa = XCSwiftPackageProductDependency;
package = EBAB91262C88A05500F57B83 /* XCRemoteSwiftPackageReference "swift-argument-parser" */;
productName = ArgumentParser;
};
/* End XCSwiftPackageProductDependency section */
};
rootObject = F13632072C73AE77009DEF15 /* Project object */;
}
================================================
FILE: SAM2-Demo.xcodeproj/project.xcworkspace/contents.xcworkspacedata
================================================
<?xml version="1.0" encoding="UTF-8"?>
<Workspace
version = "1.0">
<FileRef
location = "self:">
</FileRef>
</Workspace>
================================================
FILE: SAM2-Demo.xcodeproj/project.xcworkspace/xcshareddata/swiftpm/Package.resolved
================================================
{
"originHash" : "59ba1edda695b389d6c9ac1809891cd779e4024f505b0ce1a9d5202b6762e38a",
"pins" : [
{
"identity" : "swift-argument-parser",
"kind" : "remoteSourceControl",
"location" : "https://github.com/apple/swift-argument-parser.git",
"state" : {
"revision" : "41982a3656a71c768319979febd796c6fd111d5c",
"version" : "1.5.0"
}
}
],
"version" : 3
}
================================================
FILE: sam2-cli/MainCommand.swift
================================================
import ArgumentParser
import CoreImage
import CoreML
import ImageIO
import UniformTypeIdentifiers
import Combine
let context = CIContext(options: [.outputColorSpace: NSNull()])
enum PointType: Int, ExpressibleByArgument {
case background = 0
case foreground = 1
var asCategory: SAMCategory {
switch self {
case .background:
return SAMCategory.background
case .foreground:
return SAMCategory.foreground
}
}
}
@main
struct MainCommand: AsyncParsableCommand {
static let configuration = CommandConfiguration(
commandName: "sam2-cli",
abstract: "Perform segmentation using the SAM v2 model."
)
@Option(name: .shortAndLong, help: "The input image file.")
var input: String
// TODO: multiple points
@Option(name: .shortAndLong, parsing: .upToNextOption, help: "List of input coordinates in format 'x,y'. Coordinates are relative to the input image size. Separate multiple entries with spaces, but don't use spaces between the coordinates.")
var points: [CGPoint]
@Option(name: .shortAndLong, parsing: .upToNextOption, help: "Point types that correspond to the input points. Use as many as points, 0 for background and 1 for foreground.")
var types: [PointType]
@Option(name: .shortAndLong, help: "The output PNG image file, showing the segmentation map overlaid on top of the original image.")
var output: String
@Option(name: [.long, .customShort("k")], help: "The output file name for the segmentation mask.")
var mask: String? = nil
@MainActor
mutating func run() async throws {
// TODO: specify directory with loadable .mlpackages instead
let sam = try await SAM2.load()
print("Models loaded in: \(String(describing: sam.initializationTime))")
let targetSize = sam.inputSize
// Load the input image
guard let inputImage = CIImage(contentsOf: URL(filePath: input), options: [.colorSpace: NSNull()]) else {
print("Failed to load image.")
throw ExitCode(EXIT_FAILURE)
}
print("Original image size \(inputImage.extent)")
// Resize the image to match the model's expected input
let resizedImage = inputImage.resized(to: targetSize)
// Convert to a pixel buffer
guard let pixelBuffer = context.render(resizedImage, pixelFormat: kCVPixelFormatType_32ARGB) else {
print("Failed to create pixel buffer for input image.")
throw ExitCode(EXIT_FAILURE)
}
// Execute the model
let clock = ContinuousClock()
let start = clock.now
try await sam.getImageEncoding(from: pixelBuffer)
let duration = clock.now - start
print("Image encoding took \(duration.formatted(.units(allowed: [.seconds, .milliseconds])))")
let startMask = clock.now
let pointSequence = zip(points, types).map { point, type in
SAMPoint(coordinates: point, category: type.asCategory)
}
try await sam.getPromptEncoding(from: pointSequence, with: inputImage.extent.size)
guard let maskImage = try await sam.getMask(for: inputImage.extent.size) else {
throw ExitCode(EXIT_FAILURE)
}
let maskDuration = clock.now - startMask
print("Prompt encoding and mask generation took \(maskDuration.formatted(.units(allowed: [.seconds, .milliseconds])))")
// Write masks
if let mask = mask {
context.writePNG(maskImage, to: URL(filePath: mask))
}
// Overlay over original and save
guard let outputImage = maskImage.withAlpha(0.6)?.composited(over: inputImage) else {
print("Failed to blend mask.")
throw ExitCode(EXIT_FAILURE)
}
context.writePNG(outputImage, to: URL(filePath: output))
}
}
extension CGPoint: ExpressibleByArgument {
public init?(argument: String) {
let components = argument.split(separator: ",").map(String.init)
guard components.count == 2,
let x = Double(components[0]),
let y = Double(components[1]) else {
return nil
}
self.init(x: x, y: y)
}
}
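The `ExpressibleByArgument` conformance above lets swift-argument-parser accept points as strict `x,y` strings. A self-contained sketch of the same parsing logic (the `parsePoint` helper is illustrative, not part of the CLI):

```swift
import CoreGraphics

// Mirrors the CGPoint(argument:) initializer: splits on "," and requires
// both halves to parse as Double with no surrounding whitespace.
func parsePoint(_ argument: String) -> CGPoint? {
    let components = argument.split(separator: ",").map(String.init)
    guard components.count == 2,
          let x = Double(components[0]),
          let y = Double(components[1]) else {
        return nil
    }
    return CGPoint(x: x, y: y)
}

parsePoint("120,45")   // CGPoint(x: 120, y: 45)
parsePoint("120, 45")  // nil — Double(" 45") rejects the leading space,
                       // which is why the --points help text warns against
                       // spaces between coordinates
```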