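Preface
For a while now I have been developing a public blockchain project, mainly on top of the bitcoin-core source code, and learning the principles and basic concepts of blockchains along the way. I am also interested in Ethereum, so I plan to study go-ethereum over the next few months and record my notes here. I am still a beginner with Ethereum myself, so this article is aimed at go-ethereum newcomers; experts may want to skip it.
Let's get started
Transactions, blocks, and the blockchain are the core concepts of any blockchain, so today's analysis starts with them. (I can't resist a digression: in any scientific field the basic concepts really matter. Many problems I run into at work come from not understanding a basic concept well enough, and solving them means going back to relearn it. I still remember a lecture years ago by Professor Xu Jiafu of Nanjing University. A student asked how to learn computer science well. Professor Xu said nothing, picked up the chalk, and with a trembling hand wrote on the blackboard: "Basic concepts, basic concepts, basic concepts". That shows how much the fundamentals matter. If you don't know Professor Xu, look him up on Baidu: he was one of the founders of computer science in the People's Republic of China, and academia has a saying, "Yang Fuqing (Peking University) in the north, Xu Jiafu (Nanjing University) in the south".)
Transactions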
core/types/transaction.go
type Transaction struct {
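data txdata // the transaction content, stored in a txdata value
// caches: kept in memory only; EncodeRLP serialises tx.data alone
hash atomic.Value // hash of the transaction
size atomic.Value // size of the encoded transaction
from atomic.Value // sender of the transaction
}
type txdata struct {
AccountNonce uint64 // number of transactions the sender has already sent; orders the sender's transactions and prevents replay
Price *big.Int // gas price
GasLimit uint64 // gas limit: caps the gas the transaction may consume, so a faulty contract cannot burn unlimited gas
Recipient *common.Address // receiver of the transaction; nil means contract creation
Amount *big.Int // amount of ether to transfer, in wei
Payload []byte // input data: the init code for a contract creation, or the call data for a contract call
// Signature values; not explored further here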
V *big.Int
R *big.Int
S *big.Int
// This is only used when marshaling to JSON.
Hash *common.Hash
}
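To make the relation between GasLimit and Price concrete, here is a minimal sketch of the upper bound on the gas fee a transaction can be charged: gasLimit multiplied by gasPrice, in wei. The helpers maxFee and gwei are names I made up for illustration and are not part of go-ethereum; the sample values are 21000 gas (the cost of a plain ether transfer) and an assumed gas price of 20 gwei.

```go
package main

import (
	"fmt"
	"math/big"
)

// gwei converts a gwei amount to wei (1 gwei = 10^9 wei).
// Hypothetical helper, not part of go-ethereum.
func gwei(n int64) *big.Int {
	return new(big.Int).Mul(big.NewInt(n), big.NewInt(1_000_000_000))
}

// maxFee returns gasLimit * gasPrice, the most wei the sender can be
// charged for gas. Hypothetical helper, not part of go-ethereum.
func maxFee(gasLimit uint64, gasPrice *big.Int) *big.Int {
	limit := new(big.Int).SetUint64(gasLimit)
	return limit.Mul(limit, gasPrice) // the result is stored in limit; gasPrice is untouched
}

func main() {
	fmt.Println(maxFee(21000, gwei(20))) // 420000000000000 wei = 0.00042 ether
}
```

The arithmetic uses big.Int throughout because amounts in wei easily exceed what fits comfortably in fixed-width integers once ether values are involved.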
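Those are the struct definitions for a transaction. Next, let's look at the functions and methods related to transactions (Go distinguishes functions from methods).
func NewTransaction(nonce uint64, to common.Address, amount *big.Int, gasLimit uint64, gasPrice *big.Int, data []byte) *Transaction {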
return newTransaction(nonce, &to, amount, gasLimit, gasPrice, data)
}
func NewContractCreation(nonce uint64, amount *big.Int, gasLimit uint64, gasPrice *big.Int, data []byte) *Transaction {
return newTransaction(nonce, nil, amount, gasLimit, gasPrice, data)
}
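Of the two functions above, NewTransaction creates an ordinary transaction and NewContractCreation creates a contract-creation transaction. Both call the unexported helper newTransaction; the only difference is that NewContractCreation passes nil as the recipient address.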
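The nil-recipient convention can be sketched with a stripped-down type. Everything below (address, tx, newTransfer, newCreation, isCreation) is a hypothetical stand-in that I wrote for illustration; only the idea of "nil recipient means contract creation" comes from the go-ethereum code above.

```go
package main

import "fmt"

// address stands in for common.Address (20 bytes).
type address [20]byte

// tx is a simplified, hypothetical stand-in for types.Transaction.
type tx struct {
	nonce     uint64
	recipient *address // nil means contract creation
	payload   []byte
}

func newTx(nonce uint64, to *address, data []byte) *tx {
	return &tx{nonce: nonce, recipient: to, payload: data}
}

// newTransfer mirrors NewTransaction: the recipient is always non-nil,
// even when it is the all-zero address.
func newTransfer(nonce uint64, to address, data []byte) *tx {
	return newTx(nonce, &to, data)
}

// newCreation mirrors NewContractCreation: the recipient is nil.
func newCreation(nonce uint64, code []byte) *tx {
	return newTx(nonce, nil, code)
}

// isCreation reports whether the transaction deploys a contract.
func (t *tx) isCreation() bool { return t.recipient == nil }

func main() {
	fmt.Println(newTransfer(0, address{}, nil).isCreation())     // false
	fmt.Println(newCreation(1, []byte{0x60, 0x00}).isCreation()) // true
}
```

Using a pointer is what lets the zero address stay a valid recipient: a transfer to the all-zero address and a contract creation remain distinguishable.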
func newTransaction(nonce uint64, to *common.Address, amount *big.Int, gasLimit uint64, gasPrice *big.Int, data []byte) *Transaction {
if len(data) > 0 {
data = common.CopyBytes(data)
}
d := txdata{
AccountNonce: nonce,
Recipient: to,
Payload: data,
Amount: new(big.Int),
GasLimit: gasLimit,
Price: new(big.Int),
V: new(big.Int),
R: new(big.Int),
S: new(big.Int),
}
if amount != nil {
d.Amount.Set(amount)
}
if gasPrice != nil {
d.Price.Set(gasPrice)
}
return &Transaction{data: d}
}
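The function above is straightforward: it copies the payload, builds a txdata value, and copies amount and gasPrice into freshly allocated big.Int fields, so later changes to the caller's arguments do not affect the transaction.
The EncodeRLP and DecodeRLP methods below handle RLP encoding and decoding of a transaction.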
// EncodeRLP implements rlp.Encoder
func (tx *Transaction) EncodeRLP(w io.Writer) error {
return rlp.Encode(w, &tx.data)
}
// DecodeRLP implements rlp.Decoder
func (tx *Transaction) DecodeRLP(s *rlp.Stream) error {
_, size, _ := s.Kind()
err := s.Decode(&tx.data)
if err == nil {
tx.size.Store(common.StorageSize(rlp.ListSize(size)))
}
return err
}
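Since RLP shows up everywhere from here on, a small sketch of its encoding rules may help. This is my own minimal encoder for byte strings and lists, not the go-ethereum rlp package (which also handles structs, integers, and streaming); the function names are made up for illustration.

```go
package main

import "fmt"

// encodeLength builds the RLP prefix for a payload of the given size.
// offset is 0x80 for byte strings and 0xc0 for lists.
func encodeLength(size int, offset byte) []byte {
	if size <= 55 {
		return []byte{offset + byte(size)}
	}
	// Longer payloads: the prefix stores how many bytes the length takes,
	// followed by the length itself in big-endian with no leading zeros.
	var lenBytes []byte
	for n := size; n > 0; n >>= 8 {
		lenBytes = append([]byte{byte(n)}, lenBytes...)
	}
	return append([]byte{offset + 55 + byte(len(lenBytes))}, lenBytes...)
}

// encodeString RLP-encodes a byte string.
func encodeString(b []byte) []byte {
	if len(b) == 1 && b[0] < 0x80 {
		return []byte{b[0]} // a single byte below 0x80 is its own encoding
	}
	return append(encodeLength(len(b), 0x80), b...)
}

// encodeList wraps already-encoded items in a list header.
func encodeList(items ...[]byte) []byte {
	var payload []byte
	for _, item := range items {
		payload = append(payload, item...)
	}
	return append(encodeLength(len(payload), 0xc0), payload...)
}

func main() {
	dog := encodeString([]byte("dog"))
	cat := encodeString([]byte("cat"))
	fmt.Printf("%x\n", dog)                  // 83646f67
	fmt.Printf("%x\n", encodeList(cat, dog)) // c88363617483646f67
}
```

A transaction is encoded as a list of its txdata fields in this scheme, which is why DecodeRLP above can read the list size from the stream header before decoding the fields.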
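These are only a few of the transaction-related functions and methods; see the source code for the rest. I am not going deeper here because I want an overall understanding first and the implementation details second.
Blocks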
core/types/block.go
First, the data structure of the block header:
// Header represents a block header in the Ethereum blockchain.
type Header struct {
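ParentHash common.Hash // hash of the parent block; links blocks into a chain
UncleHash common.Hash // hash of the block's list of uncle headers
Coinbase common.Address // address that receives the mining reward for this block
Root common.Hash // root of the state trie
TxHash common.Hash // root of the transactions trie
ReceiptHash common.Hash // root of the receipts trie
Bloom Bloom // bloom filter over the logs in the block's receipts
Difficulty *big.Int // proof-of-work difficulty, adjusted from block to block
Number *big.Int // block height
GasLimit uint64 // upper bound on the total gas of all transactions in the block
GasUsed uint64 // total gas used by the transactions in the block
Time *big.Int // block timestamp
Extra []byte
MixDigest common.Hash // mix hash produced by the Ethash proof of work; verified together with Nonce
Nonce BlockNonce // proof-of-work nonce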
}
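How ParentHash turns individual blocks into a chain can be shown with a two-field header. This is only a sketch under loud assumptions: the header type is hypothetical, and SHA-256 over a fixed byte layout is swapped in for the Keccak-256 hash of the RLP encoding that go-ethereum actually uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// header is a hypothetical two-field stand-in for types.Header.
type header struct {
	ParentHash [32]byte
	Number     uint64
}

// hash uses SHA-256 over a fixed layout. go-ethereum instead hashes the
// RLP encoding of the full header with Keccak-256.
func (h header) hash() [32]byte {
	var buf [40]byte
	copy(buf[:32], h.ParentHash[:])
	binary.BigEndian.PutUint64(buf[32:], h.Number)
	return sha256.Sum256(buf[:])
}

// extend appends a child of the current tip.
func extend(chain []header) []header {
	tip := chain[len(chain)-1]
	return append(chain, header{ParentHash: tip.hash(), Number: tip.Number + 1})
}

// validChain reports whether every header points at the hash of its predecessor.
func validChain(chain []header) bool {
	for i := 1; i < len(chain); i++ {
		if chain[i].ParentHash != chain[i-1].hash() {
			return false
		}
	}
	return true
}

func main() {
	chain := extend(extend([]header{{Number: 0}})) // genesis plus two blocks
	fmt.Println(validChain(chain))                 // true

	chain[1].Number = 7            // tamper with a middle block
	fmt.Println(validChain(chain)) // false: block 2 no longer matches block 1's hash
}
```

Changing any field of a block changes its hash, so every later block would have to be rebuilt as well; that is the property the ParentHash field provides.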
The following methods compute hashes of a block header:
// Hash returns the block hash of the header, which is simply the keccak256 hash of its
// RLP encoding.
func (h *Header) Hash() common.Hash {
return rlpHash(h)
}
// HashNoNonce returns the hash which is used as input for the proof-of-work search.
func (h *Header) HashNoNonce() common.Hash {
return rlpHash([]interface{}{
h.ParentHash,
h.UncleHash,
h.Coinbase,
h.Root,
h.TxHash,
h.ReceiptHash,
h.Bloom,
h.Difficulty,
h.Number,
h.GasLimit,
h.GasUsed,
h.Time,
h.Extra,
})
}
func rlpHash(x interface{}) (h common.Hash) {
hw := sha3.NewKeccak256()
rlp.Encode(hw, x)
hw.Sum(h[:0])
return h
}
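The line hw.Sum(h[:0]) in rlpHash looks odd at first, so here is a small sketch of the idiom. Note the assumptions: SHA-256 from the standard library replaces Keccak-256, raw bytes replace the RLP encoding, and hash32 and digest are names I invented.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// hash32 stands in for common.Hash.
type hash32 [32]byte

// digest has the same shape as rlpHash, with SHA-256 swapped in for
// Keccak-256 and raw bytes in place of an RLP encoding.
func digest(data []byte) (h hash32) {
	hw := sha256.New()
	hw.Write(data)
	// Sum appends the digest to its argument. h[:0] is an empty slice that
	// shares h's backing array (capacity 32), so the 32 digest bytes are
	// written straight into h without allocating a new slice.
	hw.Sum(h[:0])
	return h
}

func main() {
	h := digest([]byte("header bytes"))
	fmt.Println(h == hash32(sha256.Sum256([]byte("header bytes")))) // true
}
```

The named result h is what makes this work: it is an addressable array, so it can be sliced and filled in place before being returned by value.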
The data structure of the block body:
// Body is a simple (mutable, non-safe) data container for storing and moving
// a block's data contents (transactions and uncles) together.
type Body struct {
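Transactions []*Transaction // all transactions in the block body
Uncles []*Header // headers of uncle blocks: valid blocks from short-lived forks that did not make it onto the canonical chain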
}
The data structure of a block:
// Block represents an entire block in the Ethereum blockchain.
type Block struct {
header *Header // pointer to the block header
uncles []*Header
transactions Transactions // transactions in the block body
// caches
hash atomic.Value
size atomic.Value
// Td is used by package core to store the total difficulty
// of the chain up to and including the block.
td *big.Int // total difficulty of the chain up to and including this block
// These fields are used by package eth to track
// inter-peer block relay.
ReceivedAt time.Time // time the block was received
ReceivedFrom interface{} // the peer the block was received from
}
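Both Transaction and Block keep their hash and size in atomic.Value fields marked as caches. The pattern behind that is compute once, then reuse. Below is a minimal sketch of it; the block type, the computed counter, and the use of SHA-256 are my own assumptions for the demo.

```go
package main

import (
	"crypto/sha256"
	"fmt"
	"sync/atomic"
)

// block is a hypothetical stand-in for types.Block.
type block struct {
	payload  []byte
	hash     atomic.Value // cached [32]byte, empty until first use
	computed int32        // counts real hash computations, for the demo only
}

// Hash computes the digest on first use and serves the cached copy afterwards.
// SHA-256 stands in for Keccak-256 here.
func (b *block) Hash() [32]byte {
	if v := b.hash.Load(); v != nil {
		return v.([32]byte)
	}
	atomic.AddInt32(&b.computed, 1)
	h := sha256.Sum256(b.payload)
	b.hash.Store(h)
	return h
}

func main() {
	b := &block{payload: []byte("block body")}
	first, second := b.Hash(), b.Hash()
	fmt.Println(first == second, atomic.LoadInt32(&b.computed)) // true 1
}
```

Two goroutines that call Hash at the same moment may both compute the digest, but they store the same value, so the cache stays correct without a mutex.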
// StorageBlock defines the RLP encoding of a Block stored in the
// state database. The StorageBlock encoding contains fields that
// would otherwise need to be recomputed.
type StorageBlock Block
// "external" block encoding. used for eth protocol, etc.
type extblock struct {
Header *Header
Txs []*Transaction
Uncles []*Header
}
// The Block struct above is the in-memory representation; storageblock, defined below, is the encoding used when a block is written to the database:
// "storage" block encoding. used for database.
type storageblock struct {
Header *Header
Txs []*Transaction
Uncles []*Header
TD *big.Int
}
Next, a brief look at NewBlock, the function that creates a block:
// NewBlock creates a new block. The input data is copied,
// changes to header and to the field values will not affect the block.
// The values of TxHash, UncleHash, ReceiptHash and Bloom in header
// are ignored and set to values derived from the given txs, uncles and receipts.
func NewBlock(header *Header, txs []*Transaction, uncles []*Header, receipts []*Receipt) *Block {
b := &Block{header: CopyHeader(header), td: new(big.Int)} // create the block from a copy of the header
// TODO: panic if len(txs) != len(receipts)
if len(txs) == 0 {
b.header.TxHash = EmptyRootHash // no transactions: TxHash is set to EmptyRootHash
} else {
b.header.TxHash = DeriveSha(Transactions(txs))
b.transactions = make(Transactions, len(txs))
copy(b.transactions, txs) // copy the transaction pointers into the new slice
}
if len(receipts) == 0 {
b.header.ReceiptHash = EmptyRootHash
} else {
b.header.ReceiptHash = DeriveSha(Receipts(receipts))
b.header.Bloom = CreateBloom(receipts)
}
if len(uncles) == 0 {
b.header.UncleHash = EmptyUncleHash
} else {
b.header.UncleHash = CalcUncleHash(uncles)
b.uncles = make([]*Header, len(uncles))
for i := range uncles {
b.uncles[i] = CopyHeader(uncles[i])
}
}
return b
}
// NewBlockWithHeader creates a block with the given header data. The
// header data is copied, changes to header and to the field values
// will not affect the block.
func NewBlockWithHeader(header *Header) *Block {
return &Block{header: CopyHeader(header)}
}
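Both constructors rely on CopyHeader so that later changes to the caller's header cannot leak into the block. A plain struct assignment would not be enough, because pointer and slice fields would still share storage. Here is a sketch of the idea with a hypothetical two-field header; the real CopyHeader handles more fields, and newHeader is a helper I added only for the demo.

```go
package main

import (
	"fmt"
	"math/big"
)

// header is a hypothetical stand-in for types.Header with two reference fields.
type header struct {
	Number *big.Int
	Extra  []byte
}

// newHeader is a small constructor used only by this example.
func newHeader(number int64, extra ...byte) *header {
	return &header{Number: big.NewInt(number), Extra: extra}
}

// copyHeader returns a deep copy: the struct is copied by value, then each
// pointer or slice field gets its own backing storage.
func copyHeader(h *header) *header {
	cpy := *h
	cpy.Number = new(big.Int)
	if h.Number != nil {
		cpy.Number.Set(h.Number)
	}
	if len(h.Extra) > 0 {
		cpy.Extra = make([]byte, len(h.Extra))
		copy(cpy.Extra, h.Extra)
	}
	return &cpy
}

func main() {
	orig := newHeader(100, 1, 2)
	cpy := copyHeader(orig)

	// Mutate the original after copying.
	orig.Number.SetInt64(999)
	orig.Extra[0] = 9

	fmt.Println(cpy.Number, cpy.Extra) // 100 [1 2]
}
```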
// The DecodeRLP and EncodeRLP methods implement RLP decoding and encoding for a block:
// DecodeRLP decodes the Ethereum
func (b *Block) DecodeRLP(s *rlp.Stream) error {
var eb extblock
_, size, _ := s.Kind()
if err := s.Decode(&eb); err != nil {
return err
}
b.header, b.uncles, b.transactions = eb.Header, eb.Uncles, eb.Txs
b.size.Store(common.StorageSize(rlp.ListSize(size)))
return nil
}
// EncodeRLP serializes b into the Ethereum RLP block format.
func (b *Block) EncodeRLP(w io.Writer) error {
return rlp.Encode(w, extblock{
Header: b.header,
Txs: b.transactions,
Uncles: b.uncles,
})
}
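This part briefly covered the data structures and methods of the block header, block body, block, and stored block; see the source code for details.
Blockchain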
core/blockchain.go
With transactions, block headers, and blocks covered, let's look at the data structure of the blockchain:
// BlockChain represents the canonical chain given a database with a genesis
// block. The Blockchain manages chain imports, reverts, chain reorganisations.
// Importing blocks in to the block chain happens according to the set of rules
// defined by the two stage Validator. Processing of blocks is done using the
// Processor which processes the included transaction. The validation of the state
// is done in the second part of the Validator. Failing results in aborting of the import.
//
// The BlockChain also helps in returning blocks from **any** chain included
// in the database as well as blocks that represents the canonical chain. It's
// important to note that GetBlock can return any block and does not need to be
// included in the canonical one where as GetBlockByNumber always represents the
// canonical chain.
type BlockChain struct {
chainConfig *params.ChainConfig // Chain & network configuration; not explored further here
cacheConfig *CacheConfig // Cache configuration for pruning
db ethdb.Database // Low level persistent database to store final content in (backed by LevelDB)
triegc *prque.Prque // Priority queue mapping block numbers to tries to gc
gcproc time.Duration // Accumulates canonical block processing for trie dumping
hc *HeaderChain
rmLogsFeed event.Feed // an event.Feed holds the subscribers; the blockchain notifies them when an event occurs
chainFeed event.Feed
chainSideFeed event.Feed
chainHeadFeed event.Feed
logsFeed event.Feed
scope event.SubscriptionScope
genesisBlock *types.Block // pointer to the genesis block
mu sync.RWMutex // global mutex for locking chain operations
chainmu sync.RWMutex // blockchain insertion lock
procmu sync.RWMutex // block processor lock
checkpoint int // checkpoint counts towards the new checkpoint
currentBlock atomic.Value // current head block of the chain
currentFastBlock atomic.Value // Current head of the fast-sync chain (may be above the block chain!)
stateCache state.Database // State database to reuse between imports (contains state cache)
bodyCache *lru.Cache // Cache for the most recent block bodies
bodyRLPCache *lru.Cache // Cache for the most recent block bodies in RLP encoded format
blockCache *lru.Cache // Cache for the most recent entire blocks
futureBlocks *lru.Cache // future blocks are blocks added for later processing
quit chan struct{} // blockchain quit channel
running int32 // running must be called atomically
// procInterrupt must be atomically called
procInterrupt int32 // interrupt signaler for block processing
wg sync.WaitGroup // chain processing wait group for shutting down
engine consensus.Engine // consensus engine; each consensus algorithm implements the Engine interface
processor Processor // block processor interface; a key field
validator Validator // block and state validator interface; a key field
vmConfig vm.Config // virtual machine configuration
badBlocks *lru.Cache // Bad block cache
}
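That is the blockchain data structure. The purpose of many fields is not yet clear, but it is fine to skip them for now; they will make sense as the analysis goes deeper.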
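One group of fields that is easy to understand right away is the caches (bodyCache, blockCache, futureBlocks, badBlocks): they are bounded least-recently-used caches. go-ethereum takes them from an external lru package; the sketch below is my own minimal, non-thread-safe version built on container/list, with made-up names, just to show the eviction rule.

```go
package main

import (
	"container/list"
	"fmt"
)

// entry is what each list element stores.
type entry struct {
	key   string
	value string
}

// lruCache is a minimal, non-thread-safe LRU cache (hypothetical type).
type lruCache struct {
	limit int
	order *list.List               // front = most recently used
	items map[string]*list.Element // key -> element in order
}

func newLRU(limit int) *lruCache {
	return &lruCache{limit: limit, order: list.New(), items: make(map[string]*list.Element)}
}

// Add inserts or updates a key and evicts the least recently used entry
// once the cache grows past its limit.
func (c *lruCache) Add(key, value string) {
	if el, ok := c.items[key]; ok {
		el.Value.(*entry).value = value
		c.order.MoveToFront(el)
		return
	}
	c.items[key] = c.order.PushFront(&entry{key: key, value: value})
	if c.order.Len() > c.limit {
		oldest := c.order.Back()
		c.order.Remove(oldest)
		delete(c.items, oldest.Value.(*entry).key)
	}
}

// Get returns the value and marks the key as most recently used.
func (c *lruCache) Get(key string) (string, bool) {
	el, ok := c.items[key]
	if !ok {
		return "", false
	}
	c.order.MoveToFront(el)
	return el.Value.(*entry).value, true
}

func main() {
	cache := newLRU(2)
	cache.Add("hash1", "block1")
	cache.Add("hash2", "block2")
	cache.Get("hash1")           // hash1 is now the most recently used
	cache.Add("hash3", "block3") // evicts hash2
	_, ok := cache.Get("hash2")
	fmt.Println(ok) // false
}
```

For a chain this fits well: recently imported blocks are the ones most likely to be requested again, and the fixed limit keeps memory use bounded.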
// NewBlockChain returns a fully initialised block chain using information available in the database. It initialises the default Ethereum Validator and Processor.
func NewBlockChain(db ethdb.Database, cacheConfig *CacheConfig, chainConfig *params.ChainConfig, engine consensus.Engine, vmConfig vm.Config) (*BlockChain, error) {
// if no cache config is given, use default values
if cacheConfig == nil {
cacheConfig = &CacheConfig{
TrieNodeLimit: 256 * 1024 * 1024,
TrieTimeLimit: 5 * time.Minute,
}
}
// create the LRU caches
bodyCache, _ := lru.New(bodyCacheLimit)
bodyRLPCache, _ := lru.New(bodyCacheLimit)
blockCache, _ := lru.New(blockCacheLimit)
futureBlocks, _ := lru.New(maxFutureBlocks)
badBlocks, _ := lru.New(badBlockLimit)
// create the BlockChain object
bc := &BlockChain{
chainConfig: chainConfig,
cacheConfig: cacheConfig,
db: db,
triegc: prque.New(),
stateCache: state.NewDatabase(db),
quit: make(chan struct{}),
bodyCache: bodyCache,
bodyRLPCache: bodyRLPCache,
blockCache: blockCache,
futureBlocks: futureBlocks,
engine: engine,
vmConfig: vmConfig,
badBlocks: badBlocks,
}
// create the validator and processor; both turn out to be very important later
bc.SetValidator(NewBlockValidator(chainConfig, bc, engine))
bc.SetProcessor(NewStateProcessor(chainConfig, bc, engine))
var err error
bc.hc, err = NewHeaderChain(db, chainConfig, engine, bc.getProcInterrupt)
if err != nil {
return nil, err
}
// load the genesis block
bc.genesisBlock = bc.GetBlockByNumber(0)
if bc.genesisBlock == nil {
return nil, ErrNoGenesis
}
if err := bc.loadLastState(); err != nil {
return nil, err
}
// Check the current state of the block hashes and make sure that we do not have any of the bad blocks in our chain
for hash := range BadHashes {
if header := bc.GetHeaderByHash(hash); header != nil {
// get the canonical block corresponding to the offending header's number
headerByNumber := bc.GetHeaderByNumber(header.Number.Uint64())
// make sure the headerByNumber (if present) is in our current canonical chain
if headerByNumber != nil && headerByNumber.Hash() == header.Hash() {
log.Error("Found bad hash, rewinding chain", "number", header.Number, "hash", header.ParentHash)
bc.SetHead(header.Number.Uint64() - 1)
log.Error("Chain rewind was successful, resuming normal operation")
}
}
}
// Take ownership of this particular state
go bc.update()
return bc, nil
}
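The BadHashes loop near the end of NewBlockChain is worth a closer look: a known-bad block only matters if it sits on the canonical chain, and in that case the head is moved back to its parent. Below is a toy model of that logic. The chain type, its fields, and the string hashes are all hypothetical, and the guard against rewinding past the genesis block is my own addition.

```go
package main

import "fmt"

// chain is a hypothetical stand-in for BlockChain. Hashes are plain strings.
type chain struct {
	canonical []string          // canonical[i] is the hash of block number i
	known     map[string]uint64 // every stored block on any fork: hash -> number
}

// head returns the number of the current canonical head block.
func (c *chain) head() uint64 { return uint64(len(c.canonical) - 1) }

// setHead drops every canonical block above the given number.
func (c *chain) setHead(number uint64) {
	c.canonical = c.canonical[:number+1]
}

// rewindBadBlocks follows the shape of the BadHashes loop: look the hash up
// among all stored blocks, check that it is the canonical block at its
// height, and if so rewind to its parent.
func (c *chain) rewindBadBlocks(bad map[string]bool) {
	for hash := range bad {
		number, ok := c.known[hash]
		if !ok || number == 0 || number > c.head() {
			continue // not stored, genesis, or already above the current head
		}
		if c.canonical[number] == hash {
			c.setHead(number - 1)
		}
	}
}

func main() {
	c := &chain{
		canonical: []string{"g", "a", "b", "c"},
		known:     map[string]uint64{"g": 0, "a": 1, "b": 2, "c": 3, "b-fork": 2},
	}
	c.rewindBadBlocks(map[string]bool{"b-fork": true})
	fmt.Println(c.head()) // 3: the bad block is on a side chain, nothing happens

	c.rewindBadBlocks(map[string]bool{"b": true})
	fmt.Println(c.head()) // 1: the head is rewound to the parent of "b"
}
```

Looking up by hash and then comparing against the block at the same height is the same two-step check as GetHeaderByHash followed by GetHeaderByNumber in the code above.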
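That is how a blockchain is initialised. BlockChain has many other important methods; for reasons of space I won't go through them one by one here and will come back to them when later analysis needs them. A few important methods are listed below; see the code for details.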
// loadLastState loads the last known chain state from the database. This method
// assumes that the chain manager mutex is held.
func (bc *BlockChain) loadLastState() error {
......
}
// SetHead rewinds the local chain to a new head. In the case of headers, everything above the new head will be deleted and the new one set. In the case of blocks though, the head may be further rewound if block bodies are missing (non-archive nodes after a fast sync).
func (bc *BlockChain) SetHead(head uint64) error {
......
}
// CurrentBlock retrieves the current head block of the canonical chain. The block is retrieved from the blockchain's internal cache.
func (bc *BlockChain) CurrentBlock() *types.Block {
......
}
// SetProcessor sets the processor required for making state modifications.
func (bc *BlockChain) SetProcessor(processor Processor) {
......
}
// SetValidator sets the validator which is used to validate incoming blocks.
func (bc *BlockChain) SetValidator(validator Validator) {
......
}
// Validator returns the current validator.
func (bc *BlockChain) Validator() Validator {
......
}
// Processor returns the current processor.
func (bc *BlockChain) Processor() Processor {
......
}
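Summary:
This article briefly introduced the data structures and initialisation functions for transactions, blocks, and the blockchain in the go-ethereum source code; I hope it gives you a basic picture. The next article will cover the transaction pool and look at how transactions are packed into blocks and how blocks are broadcast to the network. I am still a beginner myself and do not yet understand many concepts in depth; anything unclear can be set aside for now and will become clear as the analysis goes deeper.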