我们需要检查一个数组,集合, 字符串或者是其他的集合类型是空,这在日常开发中是常见的操作之一。之前在oc中会这么判断,我们先排除字符串为nil 或者 null的情况:
- (void)test6{
NSString *text = @"hello world";
NSLog(@"%ld",[text length]);
NSArray *array = @[@"a",@"b",@"c"];
NSLog(@"%ld",[array count]);
}
2020-06-20 16:32:49.657371+0800 objcTest[99024:8064113] 11
2020-06-20 16:32:49.657554+0800 objcTest[99024:8064113] 3
Swift中的运用
按照以往的习惯我们可能会这么使用
func test(){
let text = "hello world"
print("\(text.count)")
let array = ["a","b","c"]
print("\(array.count)")
}
输出:11
输出:3
这么写也是没有一点毛病,但是我们会发现在String系列 以及Array系列都会有个isEmpty属性
在String.Swift中有isEmpty的描述
/// To check whether a string is empty, use its `isEmpty` property instead of
/// comparing the length of one of the views to `0`. Unlike with `isEmpty`,
/// calculating a view's `count` property requires iterating through the
/// elements of the string.
大意就是如果要判断string是否是空,请用isEmpty属性,而不要用count属性,因为这会迭代整个string的元素。
我们知道在Swift中面向协议的思想贯穿始终,我们在Collection协议中找到isEmpty的定义
public protocol Collection: Sequence {
/// A Boolean value indicating whether the collection is empty.
///
/// When you need to check whether your collection is empty, use the
/// `isEmpty` property instead of checking that the `count` property is
/// equal to zero. For collections that don't conform to
/// `RandomAccessCollection`, accessing the `count` property iterates
/// through the elements of the collection.
/// - Complexity: O(1)
var isEmpty: Bool { get }
}
同样是判断空建议用isEmpty ,这里还补充了一下 对于不符合 RandomAccessCollection 协议的类型,count 属性会迭代遍历集合里面的全部元素。
public protocol RandomAccessCollection: BidirectionalCollection
where SubSequence: RandomAccessCollection, Indices: RandomAccessCollection
{
...
override var startIndex: Index { get }
override var endIndex: Index { get }
...
}
我们现在只讨论两个典型的类型String和 Array,它们都符合Collection协议。
String的isEmpty、count 比较
先上代码
func test(){
let text = "hello world,coding is happy,so we will going all along"
let time1 = CFAbsoluteTimeGetCurrent()
for i in 0..<500000 {
print("\(text.count == 0) index:\(i)")
}
let time2 = CFAbsoluteTimeGetCurrent()
print("time:\(time2 - time1)")
}
time:4.98923397064209
func test6(){
let text = "hello world,coding is happy,so we will going all along"
let time1 = CFAbsoluteTimeGetCurrent()
for i in 0..<500000 {
print("\(text.isEmpty) index:\(i)")
}
let time2 = CFAbsoluteTimeGetCurrent()
print("time:\(time2 - time1)")
}
time:5.166522026062012
这不对啊,这和Swift源码代码注释说的不大一样啊,两个几乎差不多,甚至isEmpty 比count更加耗时,另外Check if a Swift Array Is Empty: Best Practice Revisited 说的,也是String 建议用isEmpty 来判断是否为空,不然会迭代整个集合的元素来计算。见鬼了,还是去翻翻源码把。
public struct String {
public // @SPI(Foundation)
var _guts: _StringGuts
@inlinable @inline(__always)
internal init(_ _guts: _StringGuts) {
self._guts = _guts
_invariantCheck()
}
...
}
在StringLegacy.swift 中
extension String {
...
/// A Boolean value indicating whether a string has no characters.
@inlinable
public var isEmpty: Bool {
@inline(__always) get { return _guts.isEmpty }
}
}
从这些能知道String的本质是_StringGuts ,String的方法基本都是基于_StringGuts构建的,在_StringGuts中
xtension _StringGuts {
// The number of code units
@inlinable @inline(__always)
internal var count: Int { return _object.count }
@inlinable @inline(__always)
internal var isEmpty: Bool { return count == 0 }
...
}
能看出来其实String 的isEmpty 和 count 其实还是一样的,isEmpty的具体实现还是拿count == 0来判断的
暂时结论:在String中,isEmpty 和 count 其实是相同的性能,都是通过迭代集合内元素来计算得出
为啥是暂时结论呢,因为我也不明白为啥文档说String的isEmpty 是O(1)操作 现在看来是O(n)操作。
Array的isEmpty、count 比较(也包含ContiguousArray)
在Collection协议中
public var isEmpty: Bool {
return startIndex == endIndex
}
Array.swift中
@inlinable
public var startIndex: Int {
return 0
}
...
/// If the array is empty, `endIndex` is equal to `startIndex`.
@inlinable
public var endIndex: Int {
@inlinable
get {
return _getCount()
}
}
...
@_semantics("array.get_count")
internal func _getCount() -> Int {
return _buffer.immutableCount
}
immutableCount 是从_ArrayBody里面的count的值取出来的,既然Array 是符合 RandomAccessCollection协议的,count 和 isEmpty 的操作性能几乎一样都是O(1)操作,从哪里得出来的呢,得需要从array的扩容来研究,先从append方法来看
@inlinable
@_semantics("array.append_element")
public mutating func append(_ newElement: __owned Element) {
// Separating uniqueness check and capacity check allows hoisting the
// uniqueness check out of a loop.
_makeUniqueAndReserveCapacityIfNotUnique()
let oldCount = _buffer.mutableCount
_reserveCapacityAssumingUniqueBuffer(oldCount: oldCount)
_appendElementAssumeUniqueAndCapacity(oldCount, newElement: newElement)
_endMutation()
}
#源码分析
·第一行的函数,看名字表示,如果这个数组不是惟一的,就把它变成惟一的,而且保留数组容量。在Xcode里面搜索一下这个函数:
@inlinable
@_semantics("array.make_mutable")
internal mutating func _makeUniqueAndReserveCapacityIfNotUnique() {
if _slowPath(!_buffer.beginCOWMutation()) {
_createNewBuffer(bufferIsUnique: false,
minimumCapacity: count + 1,
growForAppend: true)
}
}
/// Returns `true` and puts the buffer in a mutable state iff the buffer's
/// storage is uniquely-referenced.
///
/// - Precondition: The buffer must be immutable.
///
/// - Warning: It's a requirement to call `beginCOWMutation` before the buffer
/// is mutated.
@_alwaysEmitIntoClient
internal mutating func beginCOWMutation() -> Bool {
let isUnique: Bool
if !_isClassOrObjCExistential(Element.self) {
isUnique = _storage.beginCOWMutationUnflaggedNative()
} else if !_storage.beginCOWMutationNative() {
return false
} else {
isUnique = _isNative
}
#if INTERNAL_CHECKS_ENABLED
if isUnique {
_native.isImmutable = false
}
#endif
return isUnique
}
如果数组不是唯一引用的,就会调用_createNewBuffer创建新的数组buffer,假设我们的数组是唯一引用的,这个方法就不会执行 ,咱们继续走
·第二行用一个变量oldCount保存数组当前长度。
·第三行表示:保存数组容量,假如是唯一引用的情况下。之所以做出这样的假设,是因为此前已经调用过_makeUniqueAndReserveCapacityIfNotUnique方法,即使这个数组不是唯一引用,也被复制了一份新的。我们来看看_reserveCapacityAssumingUniqueBuffer方法的实现:
@inlinable
@_semantics("array.mutate_unknown")
internal mutating func _reserveCapacityAssumingUniqueBuffer(oldCount: Int) {
// Due to make_mutable hoisting the situation can arise where we hoist
// _makeMutableAndUnique out of loop and use it to replace
// _makeUniqueAndReserveCapacityIfNotUnique that preceeds this call. If the
// array was empty _makeMutableAndUnique does not replace the empty array
// buffer by a unique buffer (it just replaces it by the empty array
// singleton).
// This specific case is okay because we will make the buffer unique in this
// function because we request a capacity > 0 and therefore _copyToNewBuffer
// will be called creating a new buffer.
let capacity = _buffer.mutableCapacity
_internalInvariant(capacity == 0 || _buffer.isMutableAndUniquelyReferenced())
if _slowPath(oldCount + 1 > capacity) {
_createNewBuffer(bufferIsUnique: capacity > 0,
minimumCapacity: oldCount + 1,
growForAppend: true)
}
}
检查如果oldCount + 1 > capacity 那就说明容量不够了,就需要扩容了,这时候调用_createNewBuffer 方法
@_alwaysEmitIntoClient
internal mutating func _createNewBuffer(
bufferIsUnique: Bool, minimumCapacity: Int, growForAppend: Bool
) {
_internalInvariant(!bufferIsUnique || _buffer.isUniquelyReferenced())
_buffer = _buffer._consumeAndCreateNew(bufferIsUnique: bufferIsUnique,
minimumCapacity: minimumCapacity,
growForAppend: growForAppend)
}
internal __consuming func _consumeAndCreateNew(
bufferIsUnique: Bool, minimumCapacity: Int, growForAppend: Bool
) -> _ArrayBuffer {
let newCapacity = _growArrayCapacity(oldCapacity: capacity,
minimumCapacity: minimumCapacity,
growForAppend: growForAppend)
let c = count
_internalInvariant(newCapacity >= c)
let newBuffer = _ContiguousArrayBuffer<Element>(
_uninitializedCount: c, minimumCapacity: newCapacity)
if bufferIsUnique {
// As an optimization, if the original buffer is unique, we can just move
// the elements instead of copying.
let dest = newBuffer.firstElementAddress
dest.moveInitialize(from: mutableFirstElementAddress,
count: c)
_native.mutableCount = 0
} else {
_copyContents(
subRange: 0..<c,
initializing: newBuffer.mutableFirstElementAddress)
}
return _ArrayBuffer(_buffer: newBuffer, shiftedToStartIndex: 0)
}
另外提下:在_consumeAndCreateNew方法中, _growArrayCapacity 方法里面做了对容量的处理
internal func _growArrayCapacity(_ capacity: Int) -> Int {
return capacity * 2
}
@_alwaysEmitIntoClient
internal func _growArrayCapacity(
oldCapacity: Int, minimumCapacity: Int, growForAppend: Bool
) -> Int {
if growForAppend {
if oldCapacity < minimumCapacity {
// When appending to an array, grow exponentially.
return Swift.max(minimumCapacity, _growArrayCapacity(oldCapacity))
}
return oldCapacity
}
// If not for append, just use the specified capacity, ignoring oldCapacity.
// This means that we "shrink" the buffer in case minimumCapacity is less
// than oldCapacity.
return minimumCapacity
}
当旧的数组容量小于minimumCapacity的时候,minimumCapacity就是oldCount + 1,这时候容量的扩容其实是呈指数增长,每次扩容,旧的容量就翻两倍。
同时也解释了append方法不是线程安全的,如果在某一个线程中插入新元素,导致了数组扩容,那么Swift会创建一个新的数组(意味着地址完全不同)。然后ARC会为我们自动释放旧的数组,但这时候可能另一个线程还在访问旧的数组对象。
初始化新的newBuffer 用_ContiguousArrayBuffer构造器 同时对数组storageHeader进行初始化
internal func _initStorageHeader(count: Int, capacity: Int) {
#if _runtime(_ObjC)
let verbatim = _isBridgedVerbatimToObjectiveC(Element.self)
#else
let verbatim = false
#endif
// We can initialize by assignment because _ArrayBody is a trivial type,
// i.e. contains no references.
_storage.countAndCapacity = _ArrayBody(
count: count,
capacity: capacity,
elementTypeIsBridgedVerbatim: verbatim)
}
数组storage起始于_ArrayBody,结束于连续的数组元素 , _ArrayBody主要是描述当前array 的信息同理ContiguousArray也是,此时数组count ,capacity等都会到_ArrayBody 对象里;
取count最终我们会追踪到
internal var count: Int {
get {
return _storage.countAndCapacity.count
}
nonmutating set {
_internalInvariant(newValue >= 0)
_internalInvariant(
newValue <= mutableCapacity,
"Can't grow an array buffer past its capacity")
mutableStorage.countAndCapacity.count = newValue
}
}
我们会直接从countAndCapacity(_ArrayBody类型)里面取出count,操作为O(1),另外array的isEmpty是根据startIndex endIndex是否相等来判断的,endIndex 取值就是取count,所以对于array 这些服从RandomAccessCollection协议的类型来说,isEmpty 和 count == 0 判断是一模一样的,都是O(1)操作。
总结:
1.String 的isEmpty 和 count==0判断效率是一样的,并不像想象中 count==0 会迭代集合里面的全部元素 O(n)。
2.Array这些符合RandomAccessCollection协议的类型同样,isEmpty 和 count==0判断效率是一样的 O(1)。
3.另外呢对于数据量大的数组,注意数据过大导致指数级增长的扩容带来的性能消耗。
4.数组的变更注意如果不是唯一引用的情况,要注意线程安全处理,因为会摧毁旧的数组,创建扩容两倍长度于旧数组的新数组,把旧数组的元素拷贝过来,再把元素插入到新数组中,会形成新的数组地址。
参考资料:
官方文档
Check if a Swift Array Is Empty: Best Practice Revisited
Swift数组扩容原理