Go 接口转换原理

通过前面提到的 iface 的源码可以看到，实际上它包含接口的类型 *interface type 和实体类型的类型 *_type，这两者都是 iface 的字段 *itab 的成员。也就是说生成一个 itab 同时需要接口的类型和实体的类型。

1	<interface 类型，实体类型> ->itable

当判定一种类型是否满足某个接口时，Go 使用类型的方法集和接口所需要的方法集进行匹配，如果类型的方法集完全包含接口的方法集，则可认为该类型实现了该接口。

例如某类型有 m 个方法，某接口有 n 个方法，则很容易知道这种判定的时间复杂度为 O(mn)，Go 会对方法集的函数按照函数名的字典序进行排序，所以实际的时间复杂度为 O(m+n)。

探索将一个接口转换给另外一个接口背后的原理，当然，能转换的原因必然是类型兼容。

直接来看一个例子：

package main

import "fmt"

type coder interface {
	code()
	run()
}

type runner interface {
	run()
}

type Gopher struct {
	language string
}

func (g Gopher) code() {
	return
}

func (g Gopher) run() {
	return
}

func main() {
	var c coder = Gopher{}

	var r runner
	r = c
	fmt.Println(c, r)
}

简单解释上述代码：定义了两个 interface: coder 和 runner。定义了一个实体类型 Gopher，类型 Gopher 实现了两个方法，分别是 run() 和 code()。main 函数里定义了一个接口变量 c，绑定了一个 Gopher 对象，之后将 c 赋值给另外一个接口变量 r 。赋值成功的原因是 c 中包含 run() 方法。这样，两个接口变量完成了转换。

执行命令：

1	go tool compile -S ./src/main.go

得到 main 函数的汇编命令，可以看到： r = c 这一行语句实际上是调用了 runtime.convI2I(SB)，也就是 convI2I 函数，从函数名来看，就是将一个 interface 转换成另外一个 interface，看下它的源代码：

func convI2I(inter *interfacetype, i iface) (r iface) {
	tab := i.tab
	if tab == nil {
		return
	}
	if tab.inter == inter {
		r.tab = tab
		r.data = i.data
		return
	}
	r.tab = getitab(inter, tab._type, false)
	r.data = i.data
	return
}

代码比较简单，函数参数 inter 表示接口类型，i 表示绑定了实体类型的接口，r 则表示接口转换了之后的新的 iface。通过前面的分析，我们又知道， iface 是由 tab 和 data 两个字段组成。所以，实际上 convI2I 函数真正要做的事，找到新 interface 的 tab 和 data，就大功告成了。

convI2I函数查找tab和data

tab 包括接口类型 interfacetype 和实体类型 _type。所以最关键的语句是 r.tab = getitab(inter, tab._type, false)。

因此，重点来看下 getitab 函数的源码，只看关键的地方：

func getitab(inter *interfacetype, typ *_type, canfail bool) *itab {
	// ……

    // 根据 inter, typ 计算出 hash 值
	h := itabhash(inter, typ)

	// look twice - once without lock, once with.
	// common case will be no lock contention.
	var m *itab
	var locked int
	for locked = 0; locked < 2; locked++ {
		if locked != 0 {
			lock(&ifaceLock)
        }
        
        // 遍历哈希表的一个 slot
		for m = (*itab)(atomic.Loadp(unsafe.Pointer(&hash[h]))); m != nil; m = m.link {

            // 如果在 hash 表中已经找到了 itab（inter 和 typ 指针都相同）
			if m.inter == inter && m._type == typ {
                // ……
                
				if locked != 0 {
					unlock(&ifaceLock)
				}
				return m
			}
		}
	}

    // 在 hash 表中没有找到 itab，那么新生成一个 itab
	m = (*itab)(persistentalloc(unsafe.Sizeof(itab{})+uintptr(len(inter.mhdr)-1)*sys.PtrSize, 0, &memstats.other_sys))
	m.inter = inter
    m._type = typ
    
    // 添加到全局的 hash 表中
	additab(m, true, canfail)
	unlock(&ifaceLock)
	if m.bad {
		return nil
	}
	return m
}

简单总结：getitab 函数会根据 interfacetype 和 _type 去全局的 itab 哈希表中查找，如果能找到，则直接返回；否则，会根据给定的 interfacetype 和 _type 新生成一个 itab，并插入到 itab 哈希表，这样下一次就可以直接拿到 itab。

写itab

这里查找了两次，并且第二次上锁了，这是因为如果第一次没找到，在第二次仍然没有找到相应的 itab 的情况下，需要新生成一个，并且写入哈希表，因此需要加锁。这样，其他协程在查找相同的 itab 并且也没有找到时，第二次查找时，会被挂住，之后，就会查到第一个协程写入哈希表的 itab。

再来看一下 additab 函数的代码：

// 检查 _type 是否符合 interface_type 并且创建对应的 itab 结构体 将其放到 hash 表中
func additab(m *itab, locked, canfail bool) {
	inter := m.inter
	typ := m._type
	x := typ.uncommon()

	// both inter and typ have method sorted by name,
	// and interface names are unique,
	// so can iterate over both in lock step;
    // the loop is O(ni+nt) not O(ni*nt).
    // 
    // inter 和 typ 的方法都按方法名称进行了排序
    // 并且方法名都是唯一的。所以循环的次数是固定的
    // 只用循环 O(ni+nt)，而非 O(ni*nt)
	ni := len(inter.mhdr)
	nt := int(x.mcount)
	xmhdr := (*[1 << 16]method)(add(unsafe.Pointer(x), uintptr(x.moff)))[:nt:nt]
	j := 0
	for k := 0; k < ni; k++ {
		i := &inter.mhdr[k]
		itype := inter.typ.typeOff(i.ityp)
		name := inter.typ.nameOff(i.name)
		iname := name.name()
		ipkg := name.pkgPath()
		if ipkg == "" {
			ipkg = inter.pkgpath.name()
		}
		for ; j < nt; j++ {
			t := &xmhdr[j]
            tname := typ.nameOff(t.name)
            // 检查方法名字是否一致
			if typ.typeOff(t.mtyp) == itype && tname.name() == iname {
				pkgPath := tname.pkgPath()
				if pkgPath == "" {
					pkgPath = typ.nameOff(x.pkgpath).name()
				}
				if tname.isExported() || pkgPath == ipkg {
					if m != nil {
                        // 获取函数地址，并加入到itab.fun数组中
						ifn := typ.textOff(t.ifn)
						*(*unsafe.Pointer)(add(unsafe.Pointer(&m.fun[0]), uintptr(k)*sys.PtrSize)) = ifn
					}
					goto nextimethod
				}
			}
		}
        // ……
        
		m.bad = true
		break
	nextimethod:
	}
	if !locked {
		throw("invalid itab locking")
    }

    // 计算 hash 值
    h := itabhash(inter, typ)
    // 加到Hash Slot链表中
	m.link = hash[h]
	m.inhash = true
	atomicstorep(unsafe.Pointer(&hash[h]), unsafe.Pointer(m))
}

additab 会检查 itab 持有的 interfacetype 和 _type 是否符合，就是看 _type 是否完全实现了 interfacetype 的方法，也就是看两者的方法列表重叠的部分就是 interfacetype 所持有的方法列表。注意到其中有一个双层循环，乍一看，循环次数是 ni * nt，但由于两者的函数列表都按照函数名称进行了排序，因此最终只执行了 ni + nt 次，代码里通过一个小技巧来实现：第二层循环并没有从 0 开始计数，而是从上一次遍历到的位置开始。

求 hash 值的函数比较简单：

func itabhash(inter *interfacetype, typ *_type) uint32 {
	h := inter.typ.hash
	h += 17 * typ.hash
	return h % hashSize
}

hashSize 的值是 1009。

更一般的，当把实体类型赋值给接口的时候，会调用 conv 系列函数，例如空接口调用 convT2E 系列、非空接口调用 convT2I 系列。这些函数比较相似：

具体类型转空接口时，_type 字段直接复制源类型的 _type；调用 mallocgc 获得一块新内存，把值复制进去，data 再指向这块新内存。

具体类型转非空接口时，入参 tab 是编译器在编译阶段预先生成好的，新接口 tab 字段直接指向入参 tab 指向的 itab；调用 mallocgc 获得一块新内存，把值复制进去，data 再指向这块新内存。

而对于接口转接口，itab 调用 getitab 函数获取。只用生成一次，之后直接从 hash 表中获取。

参考资料

【接口赋值、反射】http://wudaijun.com/2018/01/go-interface-implement/

【itab】http://legendtkl.com/2017/07/01/golang-interface-implement/

【和 C++ 的对比】https://www.jianshu.com/p/b38b1719636e

【getitab源码说明】https://www.twblogs.net/a/5c245d59bd9eee16b3db561d