@nicklockwood
Last active May 8, 2021 10:54
extension Data {
    init?(hexString: String) {
        let count = hexString.count / 2
        var data = Data(capacity: count)
        var i = hexString.startIndex
        for _ in 0 ..< count {
            let j = hexString.index(after: i)
            if var byte = UInt8(hexString[i ... j], radix: 16) {
                data.append(&byte, count: 1)
            } else {
                return nil
            }
            i = hexString.index(after: j)
        }
        self = data
    }
}
@arthurpalves

For anyone checking this out, it's interesting to see why Nick came up with it in the first place:
See Twitter thread

@josephlord

Do you need indexes into the string at all? Can you just create an iterator and call next() twice?

I haven't performance tested this, and I don't like the new string creation in the UInt8 construction, but maybe there is a better way to do that. It is nice to get rid of the indexing, though.

extension Data {
    init?(hexString: String) {
        let count = hexString.count / 2
        var data = Data(capacity: count)
        var itr = hexString.makeIterator()
        for _ in 0 ..< count {
            if var byte = UInt8("\(itr.next()!)\(itr.next()!)", radix: 16) {
                data.append(&byte, count: 1)
            } else {
                return nil
            }
        }
        self = data
    }
}

@fallback

@josephlord
I like your idea of using an iterator; it looks more... erm, Swift-ish.
IMHO, the iterator adds another layer of complexity under the hood and presumably takes more memory than storing a single index (startIndex) in a local var.
Another concern is that you're force-unwrapping the result of each .next() call, so there might be a crash.
Of course, these are only my concerns. I'm not trying to judge, just interested in a promising conversation here.

@josephlord

@fallback I don't think the iterator makes a copy. It is really very similar under the hood, with the iterator just managing the index.

Regarding the force unwraps, I think they preserve the original behaviour (indexing also crashes when out of bounds), but in both cases the count being hexString.count / 2 means that it is safe.

An even Swiftier approach would probably map the string into pairs of letters (itself a function taking an iterator), then zip that with the data indices and forEach on it. That does involve multiple passes, though, which I was avoiding (although it can probably be done lazily to get back to the same fundamental operations).
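
As a rough illustration of the lazy "pairs of letters" idea, here is a minimal sketch. It is one possible reading of the suggestion rather than the commenter's code: the helper `characterPairs(of:)` and the `pairedHexString` label are hypothetical names, and it skips the zip step by appending to the Data directly. It uses the standard library's `sequence(state:next:)` and `Character.hexDigitValue`.

```swift
import Foundation

/// Hypothetical helper: lazily yields consecutive character pairs in a single pass.
/// A trailing unpaired character is dropped, like the `count / 2` loop above.
func characterPairs(of string: String) -> UnfoldSequence<(Character, Character), String.Iterator> {
    sequence(state: string.makeIterator()) { (itr: inout String.Iterator) -> (Character, Character)? in
        guard let hi = itr.next(), let lo = itr.next() else { return nil }
        return (hi, lo)
    }
}

extension Data {
    /// Hypothetical label, chosen so it does not clash with `init?(hexString:)`.
    init?(pairedHexString hexString: String) {
        var data = Data(capacity: hexString.count / 2)
        for (hi, lo) in characterPairs(of: hexString) {
            // Fail as soon as either character is not a hex digit.
            guard let h = hi.hexDigitValue, let l = lo.hexDigitValue else { return nil }
            data.append(UInt8(h << 4 | l))
        }
        self = data
    }
}
```

Because the pairs are produced on demand, the string is still walked only once; the `hexString.count` call used for the capacity hint is the one extra traversal.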

@mayoff

mayoff commented Apr 14, 2021

embrace your inner Substring

import Foundation

extension Data {
    init?<Hex: StringProtocol>(hexString: Hex) {
        var hex = hexString[...]

        self.init(capacity: hex.count / 2)
        while !hex.isEmpty {
            guard
                let hi = hex.popFirst()?.hexDigitValue,
                let lo = hex.popFirst()?.hexDigitValue
            else { return nil }
            append(UInt8(hi << 4 | lo))
        }
    }
}

@josephlord

That works and shows me hexDigitValue which I should have used in the iterator approach. Is there any advantage to the Substring over the iterator? I imagine it just has an additional end index though I haven’t checked.

Also, if we use an iterator, it can be made generic over collections of Characters, so it should work on Strings, Substrings and arrays of characters. It could work on Sequences if we didn’t need the count upfront.

extension Data {
    init?<S>(hexString: S) where S : Collection, S.Element == Character {
        let count = hexString.count / 2
        var data = Data(capacity: count)
        var itr = hexString.makeIterator()
        for _ in 0 ..< count {
            guard let hi = itr.next()?.hexDigitValue,
                  let lo = itr.next()?.hexDigitValue else { return nil }
            data.append(UInt8(hi << 4 | lo))
        }
        self = data
    }
}
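
For the Sequence case mentioned above, a minimal sketch might look like the following. The `hexSequence` label is a hypothetical name, and the behaviour is an assumption that differs from the Collection version: without a count to bound the loop, an odd number of characters makes the initializer return nil instead of ignoring the trailing character.

```swift
import Foundation

extension Data {
    /// Hypothetical label, chosen so it does not clash with `init?(hexString:)`.
    init?<S: Sequence>(hexSequence: S) where S.Element == Character {
        var data = Data()
        // No exact count is available for a Sequence, so this is only a capacity hint.
        data.reserveCapacity(hexSequence.underestimatedCount / 2)
        var itr = hexSequence.makeIterator()
        while let hi = itr.next() {
            // A missing or invalid low digit fails the whole conversion.
            guard let h = hi.hexDigitValue,
                  let l = itr.next()?.hexDigitValue else { return nil }
            data.append(UInt8(h << 4 | l))
        }
        self = data
    }
}
```

Since String, Substring and [Character] are all Sequences of Character, the same initializer covers them as well as single-pass sequences.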

@nicklockwood
Author

nicklockwood commented Apr 14, 2021

From my own timings, popping a substring is slower than using index(after:). Not sure how it compares to the iterator: https://twitter.com/nicklockwood/status/1382142248247308292

@mayoff

mayoff commented Apr 14, 2021

Try using hexString.utf8.withContiguousStorageIfAvailable. You'll have to write your own hexDigitValue though.
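
A minimal sketch of that suggestion, untimed and under a few assumptions: `hexNibble(_:)` stands in for the hand-written `hexDigitValue`, the `utf8HexString` label is a hypothetical name, and `makeContiguousUTF8()` is called on a copy so that contiguous storage should be available even for bridged strings. It works on UTF-8 bytes rather than Characters, so only ASCII hex digits are accepted, and a trailing unpaired byte is ignored as in the original gist.

```swift
import Foundation

/// Hypothetical stand-in for `hexDigitValue`, operating on a single ASCII byte.
func hexNibble(_ byte: UInt8) -> UInt8? {
    switch byte {
    case UInt8(ascii: "0")...UInt8(ascii: "9"): return byte - UInt8(ascii: "0")
    case UInt8(ascii: "a")...UInt8(ascii: "f"): return byte - UInt8(ascii: "a") + 10
    case UInt8(ascii: "A")...UInt8(ascii: "F"): return byte - UInt8(ascii: "A") + 10
    default: return nil
    }
}

extension Data {
    /// Hypothetical label, chosen so it does not clash with `init?(hexString:)`.
    init?(utf8HexString hexString: String) {
        // Decodes byte pairs from a contiguous UTF-8 buffer.
        func decode(_ bytes: UnsafeBufferPointer<UInt8>) -> Data? {
            var data = Data(capacity: bytes.count / 2)
            var i = 0
            while i + 1 < bytes.count {
                guard let hi = hexNibble(bytes[i]),
                      let lo = hexNibble(bytes[i + 1]) else { return nil }
                data.append(hi << 4 | lo)
                i += 2
            }
            return data
        }

        var contiguous = hexString
        contiguous.makeContiguousUTF8()
        // The outer optional is nil if no contiguous storage exists;
        // the inner optional is nil if a byte is not a hex digit.
        guard let result = contiguous.utf8.withContiguousStorageIfAvailable(decode),
              let data = result else { return nil }
        self = data
    }
}
```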
