Preface
This article takes a look at the coordinator in Storm Trident.
Example
Code example
@Test
public void testDebugTopologyBuild(){
FixedBatchSpout spout = new FixedBatchSpout(new Fields("user", "score"), 3,
new Values("nickt1", 4),
new Values("nickt2", 7),
new Values("nickt3", 8),
new Values("nickt4", 9),
new Values("nickt5", 7),
new Values("nickt6", 11),
new Values("nickt7", 5)
);
spout.setCycle(false);
TridentTopology topology = new TridentTopology();
Stream stream1 = topology.newStream("spout1",spout)
.each(new Fields("user", "score"), new BaseFunction() {
@Override
public void execute(TridentTuple tuple, TridentCollector collector) {
System.out.println("tuple:"+tuple);
}
},new Fields());
topology.build();
}
- The spout used here is FixedBatchSpout, which is an IBatchSpout
Topology diagram
MasterBatchCoordinator
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/MasterBatchCoordinator.java
public class MasterBatchCoordinator extends BaseRichSpout {
public static final Logger LOG = LoggerFactory.getLogger(MasterBatchCoordinator.class);
public static final long INIT_TXID = 1L;
public static final String BATCH_STREAM_ID = "$batch";
public static final String COMMIT_STREAM_ID = "$commit";
public static final String SUCCESS_STREAM_ID = "$success";
private static final String CURRENT_TX = "currtx";
private static final String CURRENT_ATTEMPTS = "currattempts";
private List<TransactionalState> _states = new ArrayList();
TreeMap<Long, TransactionStatus> _activeTx = new TreeMap<Long, TransactionStatus>();
TreeMap<Long, Integer> _attemptIds;
private SpoutOutputCollector _collector;
Long _currTransaction;
int _maxTransactionActive;
List<ITridentSpout.BatchCoordinator> _coordinators = new ArrayList();
List<String> _managedSpoutIds;
List<ITridentSpout> _spouts;
WindowedTimeThrottler _throttler;
boolean _active = true;
public MasterBatchCoordinator(List<String> spoutIds, List<ITridentSpout> spouts) {
if(spoutIds.isEmpty()) {
throw new IllegalArgumentException("Must manage at least one spout");
}
_managedSpoutIds = spoutIds;
_spouts = spouts;
LOG.debug("Created {}", this);
}
public List<String> getManagedSpoutIds(){
return _managedSpoutIds;
}
@Override
public void activate() {
_active = true;
}
@Override
public void deactivate() {
_active = false;
}
@Override
public void open(Map conf, TopologyContext context, SpoutOutputCollector collector) {
_throttler = new WindowedTimeThrottler((Number)conf.get(Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS), 1);
for(String spoutId: _managedSpoutIds) {
_states.add(TransactionalState.newCoordinatorState(conf, spoutId));
}
_currTransaction = getStoredCurrTransaction();
_collector = collector;
Number active = (Number) conf.get(Config.TOPOLOGY_MAX_SPOUT_PENDING);
if(active==null) {
_maxTransactionActive = 1;
} else {
_maxTransactionActive = active.intValue();
}
_attemptIds = getStoredCurrAttempts(_currTransaction, _maxTransactionActive);
for(int i=0; i<_spouts.size(); i++) {
String txId = _managedSpoutIds.get(i);
_coordinators.add(_spouts.get(i).getCoordinator(txId, conf, context));
}
LOG.debug("Opened {}", this);
}
@Override
public void close() {
for(TransactionalState state: _states) {
state.close();
}
LOG.debug("Closed {}", this);
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
// in partitioned example, in case an emitter task receives a later transaction than it's emitted so far,
// when it sees the earlier txid it should know to emit nothing
declarer.declareStream(BATCH_STREAM_ID, new Fields("tx"));
declarer.declareStream(COMMIT_STREAM_ID, new Fields("tx"));
declarer.declareStream(SUCCESS_STREAM_ID, new Fields("tx"));
}
@Override
public Map<String, Object> getComponentConfiguration() {
Config ret = new Config();
ret.setMaxTaskParallelism(1);
ret.registerSerialization(TransactionAttempt.class);
return ret;
}
//......
}
- The open method first reads the batch emit interval from Config.TOPOLOGY_TRIDENT_BATCH_EMIT_INTERVAL_MILLIS (topology.trident.batch.emit.interval.millis, 500 by default in defaults.yaml) and creates a WindowedTimeThrottler whose maxAmt is 1
- TransactionalState is used here to maintain the transactional state in ZooKeeper
- It then reads Config.TOPOLOGY_MAX_SPOUT_PENDING (topology.max.spout.pending, null by default in defaults.yaml) to set _maxTransactionActive, falling back to 1 when the value is null
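To make the throttling concrete, here is a minimal sketch of a windowed throttler with the shape described above: a window length plus a maximum number of events per window. It is not the Storm class. The name WindowedThrottlerSketch and the explicit `now` arguments are assumptions, added so the logic can be exercised without waiting for real time to pass.

```java
/** Hypothetical stand-in for a windowed throttler; not Storm's WindowedTimeThrottler. */
class WindowedThrottlerSketch {
    private final long windowMillis;
    private final int maxAmt;
    private long windowStart;
    private int windowEvents = 0;

    WindowedThrottlerSketch(long windowMillis, int maxAmt, long now) {
        this.windowMillis = windowMillis;
        this.maxAmt = maxAmt;
        this.windowStart = now;
    }

    /** True once the current window already holds maxAmt events. */
    boolean isThrottled(long now) {
        resetIfNecessary(now);
        return windowEvents >= maxAmt;
    }

    /** Records one event (for example, one started batch) in the current window. */
    void markEvent(long now) {
        resetIfNecessary(now);
        windowEvents++;
    }

    /** Starts a fresh, empty window once the old one has fully elapsed. */
    private void resetIfNecessary(long now) {
        if (now - windowStart > windowMillis) {
            windowStart = now;
            windowEvents = 0;
        }
    }
}
```

With a 500 ms window and maxAmt of 1, as in the configuration above, at most one new batch can be started per window.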
MasterBatchCoordinator.nextTuple
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/MasterBatchCoordinator.java
@Override
public void nextTuple() {
sync();
}
private void sync() {
// note that sometimes the tuples active may be less than max_spout_pending, e.g.
// max_spout_pending = 3
// tx 1, 2, 3 active, tx 2 is acked. there won't be a commit for tx 2 (because tx 1 isn't committed yet),
// and there won't be a batch for tx 4 because there's max_spout_pending tx active
TransactionStatus maybeCommit = _activeTx.get(_currTransaction);
if(maybeCommit!=null && maybeCommit.status == AttemptStatus.PROCESSED) {
maybeCommit.status = AttemptStatus.COMMITTING;
_collector.emit(COMMIT_STREAM_ID, new Values(maybeCommit.attempt), maybeCommit.attempt);
LOG.debug("Emitted on [stream = {}], [tx_status = {}], [{}]", COMMIT_STREAM_ID, maybeCommit, this);
}
if(_active) {
if(_activeTx.size() < _maxTransactionActive) {
Long curr = _currTransaction;
for(int i=0; i<_maxTransactionActive; i++) {
if(!_activeTx.containsKey(curr) && isReady(curr)) {
// by using a monotonically increasing attempt id, downstream tasks
// can be memory efficient by clearing out state for old attempts
// as soon as they see a higher attempt id for a transaction
Integer attemptId = _attemptIds.get(curr);
if(attemptId==null) {
attemptId = 0;
} else {
attemptId++;
}
_attemptIds.put(curr, attemptId);
for(TransactionalState state: _states) {
state.setData(CURRENT_ATTEMPTS, _attemptIds);
}
TransactionAttempt attempt = new TransactionAttempt(curr, attemptId);
final TransactionStatus newTransactionStatus = new TransactionStatus(attempt);
_activeTx.put(curr, newTransactionStatus);
_collector.emit(BATCH_STREAM_ID, new Values(attempt), attempt);
LOG.debug("Emitted on [stream = {}], [tx_attempt = {}], [tx_status = {}], [{}]", BATCH_STREAM_ID, attempt, newTransactionStatus, this);
_throttler.markEvent();
}
curr = nextTransactionId(curr);
}
}
}
}
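- nextTuple simply calls sync, which is also called from both ack and fail. sync first checks the status of the current transaction and, if it is ready to be committed, emits a tuple on MasterBatchCoordinator.COMMIT_STREAM_ID ($commit). It then starts a new TransactionAttempt only when both the _maxTransactionActive limit and the WindowedTimeThrottler allow it, emitting a tuple on MasterBatchCoordinator.BATCH_STREAM_ID ($batch) and calling markEvent on the WindowedTimeThrottler to count the event in the current window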
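The batch-start half of sync can be sketched as a pure function. This is a simplified model, not the Storm code: BatchStartSketch, txidsToStart and the eventsLeftInWindow parameter (standing in for the throttler) are assumptions, and the per-spout isReady check is left out.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;

/** Simplified model of the loop in sync() that decides which txids to start. */
class BatchStartSketch {
    static List<Long> txidsToStart(long currTx, Set<Long> active,
                                   int maxActive, int eventsLeftInWindow) {
        List<Long> started = new ArrayList<>();
        if (active.size() >= maxActive) {
            return started; // already at the pending limit, start nothing
        }
        long curr = currTx;
        for (int i = 0; i < maxActive; i++) {
            // start only txids that are not active yet, and only while the throttle allows it
            if (!active.contains(curr) && eventsLeftInWindow > 0) {
                started.add(curr);
                eventsLeftInWindow--;
            }
            curr++; // assumes the next transaction id is simply curr + 1
        }
        return started;
    }
}
```

For example, with currTx = 1, active txids {1, 3}, a limit of 3 and one event left in the window, only txid 2 is started.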
MasterBatchCoordinator.ack
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/MasterBatchCoordinator.java
@Override
public void ack(Object msgId) {
TransactionAttempt tx = (TransactionAttempt) msgId;
TransactionStatus status = _activeTx.get(tx.getTransactionId());
LOG.debug("Ack. [tx_attempt = {}], [tx_status = {}], [{}]", tx, status, this);
if(status!=null && tx.equals(status.attempt)) {
if(status.status==AttemptStatus.PROCESSING) {
status.status = AttemptStatus.PROCESSED;
LOG.debug("Changed status. [tx_attempt = {}] [tx_status = {}]", tx, status);
} else if(status.status==AttemptStatus.COMMITTING) {
_activeTx.remove(tx.getTransactionId());
_attemptIds.remove(tx.getTransactionId());
_collector.emit(SUCCESS_STREAM_ID, new Values(tx));
_currTransaction = nextTransactionId(tx.getTransactionId());
for(TransactionalState state: _states) {
state.setData(CURRENT_TX, _currTransaction);
}
LOG.debug("Emitted on [stream = {}], [tx_attempt = {}], [tx_status = {}], [{}]", SUCCESS_STREAM_ID, tx, status, this);
}
sync();
}
}
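- ack acts on the current status of the transaction. If the status is AttemptStatus.PROCESSING, it becomes AttemptStatus.PROCESSED. If the status is AttemptStatus.COMMITTING, the transaction is removed, a tuple is emitted on MasterBatchCoordinator.SUCCESS_STREAM_ID ($success), and _currTransaction advances to nextTransactionId. Finally sync is called, which may start a new TransactionAttempt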
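The status transitions described above can be modeled in a few lines. This is a sketch under simplifying assumptions, not the Storm implementation: TxLifecycleSketch is a hypothetical class, attempts and emitted tuples are omitted, and the next transaction id is assumed to be txid + 1.

```java
import java.util.TreeMap;

/** Simplified model of the coordinator's per-transaction status tracking. */
class TxLifecycleSketch {
    enum Status { PROCESSING, PROCESSED, COMMITTING }

    final TreeMap<Long, Status> activeTx = new TreeMap<>();
    long currTransaction = 1L;

    /** Models the emit on $batch: a new attempt starts in PROCESSING. */
    void startBatch(long txid) {
        activeTx.put(txid, Status.PROCESSING);
    }

    /** Models the commit check in sync(): only the current transaction may start committing. */
    boolean maybeCommit() {
        if (activeTx.get(currTransaction) == Status.PROCESSED) {
            activeTx.put(currTransaction, Status.COMMITTING);
            return true; // the real spout would emit on $commit here
        }
        return false;
    }

    /** Models ack(): the first ack marks PROCESSED, the ack after the commit finishes the txid. */
    void ack(long txid) {
        Status s = activeTx.get(txid);
        if (s == Status.PROCESSING) {
            activeTx.put(txid, Status.PROCESSED);
        } else if (s == Status.COMMITTING) {
            activeTx.remove(txid);
            currTransaction = txid + 1; // the real spout would emit on $success here
        }
        maybeCommit();
    }
}
```

In this model, acking txid 2 before txid 1 leaves txid 2 in PROCESSED: it only moves to COMMITTING after txid 1 has been committed and _currTransaction has advanced, which matches the comment at the top of sync.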
MasterBatchCoordinator.fail
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/MasterBatchCoordinator.java
@Override
public void fail(Object msgId) {
TransactionAttempt tx = (TransactionAttempt) msgId;
TransactionStatus stored = _activeTx.remove(tx.getTransactionId());
LOG.debug("Fail. [tx_attempt = {}], [tx_status = {}], [{}]", tx, stored, this);
if(stored!=null && tx.equals(stored.attempt)) {
_activeTx.tailMap(tx.getTransactionId()).clear();
sync();
}
}
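- fail removes the failed transaction from _activeTx, then clears every entry in _activeTx whose txId is greater than the failed txId, and finally calls sync to decide whether a new TransactionAttempt should be started (note that _currTransaction is not changed here, so the TransactionAttempt that sync starts next still uses the unchanged _currTransaction as its txid)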
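The two map operations in fail can be tried in isolation with a plain TreeMap. FailSketch is a hypothetical helper and the String values are placeholders for TransactionStatus; the attempt comparison done by the real method is omitted.

```java
import java.util.TreeMap;

/** Mirrors the map handling in fail(): drop the failed txid, then everything after it. */
class FailSketch {
    static void fail(TreeMap<Long, String> activeTx, long failedTxid) {
        String stored = activeTx.remove(failedTxid);
        if (stored != null) {
            // tailMap is a live view, so clearing it removes the later txids from activeTx
            activeTx.tailMap(failedTxid).clear();
        }
    }
}
```

Starting from active txids 1 to 4, failing txid 2 leaves only txid 1 in the map, so txids 2, 3 and 4 all have to be started again.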
TridentSpoutCoordinator
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/spout/TridentSpoutCoordinator.java
public class TridentSpoutCoordinator implements IBasicBolt {
public static final Logger LOG = LoggerFactory.getLogger(TridentSpoutCoordinator.class);
private static final String META_DIR = "meta";
ITridentSpout<Object> _spout;
ITridentSpout.BatchCoordinator<Object> _coord;
RotatingTransactionalState _state;
TransactionalState _underlyingState;
String _id;
public TridentSpoutCoordinator(String id, ITridentSpout<Object> spout) {
_spout = spout;
_id = id;
}
@Override
public void prepare(Map conf, TopologyContext context) {
_coord = _spout.getCoordinator(_id, conf, context);
_underlyingState = TransactionalState.newCoordinatorState(conf, _id);
_state = new RotatingTransactionalState(_underlyingState, META_DIR);
}
@Override
public void execute(Tuple tuple, BasicOutputCollector collector) {
TransactionAttempt attempt = (TransactionAttempt) tuple.getValue(0);
if(tuple.getSourceStreamId().equals(MasterBatchCoordinator.SUCCESS_STREAM_ID)) {
_state.cleanupBefore(attempt.getTransactionId());
_coord.success(attempt.getTransactionId());
} else {
long txid = attempt.getTransactionId();
Object prevMeta = _state.getPreviousState(txid);
Object meta = _coord.initializeTransaction(txid, prevMeta, _state.getState(txid));
_state.overrideState(txid, meta);
collector.emit(MasterBatchCoordinator.BATCH_STREAM_ID, new Values(attempt, meta));
}
}
@Override
public void cleanup() {
_coord.close();
_underlyingState.close();
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
declarer.declareStream(MasterBatchCoordinator.BATCH_STREAM_ID, new Fields("tx", "metadata"));
}
@Override
public Map<String, Object> getComponentConfiguration() {
Config ret = new Config();
ret.setMaxTaskParallelism(1);
return ret;
}
}
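- The execute method of TridentSpoutCoordinator (it is a bolt, so it has no nextTuple) handles a tuple differently depending on its streamId
- If the tuple arrives on MasterBatchCoordinator.SUCCESS_STREAM_ID ($success), the master has received the ack and the transaction has succeeded. The coordinator clears the state stored before that txId and then calls back the success method of ITridentSpout.BatchCoordinator
- If the tuple arrives on MasterBatchCoordinator.BATCH_STREAM_ID ($batch), a new TransactionAttempt is being started, so the coordinator emits a tuple on MasterBatchCoordinator.BATCH_STREAM_ID ($batch). That tuple is received by the downstream bolt (in this example, the TridentBoltExecutor that wraps the user spout through TridentSpoutExecutor)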
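The two branches can be illustrated with a small in-memory model of the per-txid metadata. This is only a sketch: CoordinatorMetaSketch is hypothetical, a TreeMap stands in for the ZooKeeper-backed rotating state, and the running-offset metadata is an invented example of what a coordinator might compute.

```java
import java.util.Map;
import java.util.TreeMap;

/** Simplified per-txid metadata store; a stand-in for the rotating state, not Storm code. */
class CoordinatorMetaSketch {
    final TreeMap<Long, Object> metaByTxid = new TreeMap<>();

    /** $batch path: derive this txid's metadata from the previous txid's and store it. */
    Object onBatch(long txid) {
        Map.Entry<Long, Object> prev = metaByTxid.lowerEntry(txid);
        Object prevMeta = (prev == null) ? null : prev.getValue();
        Object meta = initializeTransaction(txid, prevMeta);
        metaByTxid.put(txid, meta);
        return meta; // the real bolt would emit (attempt, meta) on $batch
    }

    /** $success path: metadata for txids before the succeeded one is no longer needed. */
    void onSuccess(long txid) {
        metaByTxid.headMap(txid).clear();
    }

    /** Hypothetical coordinator logic: the metadata is just a running offset. */
    Object initializeTransaction(long txid, Object prevMeta) {
        return (prevMeta == null) ? Long.valueOf(0L) : Long.valueOf(((Long) prevMeta) + 10L);
    }
}
```

Because the metadata for a txid is derived from the previous txid's metadata, a retried attempt of the same txid gets the same metadata in this model.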
TridentBoltExecutor
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/TridentBoltExecutor.java
public class TridentBoltExecutor implements IRichBolt {
public static final String COORD_STREAM_PREFIX = "$coord-";
public static String COORD_STREAM(String batch) {
return COORD_STREAM_PREFIX + batch;
}
RotatingMap<Object, TrackedBatch> _batches;
@Override
public void prepare(Map conf, TopologyContext context, OutputCollector collector) {
_messageTimeoutMs = context.maxTopologyMessageTimeout() * 1000L;
_lastRotate = System.currentTimeMillis();
_batches = new RotatingMap<>(2);
_context = context;
_collector = collector;
_coordCollector = new CoordinatedOutputCollector(collector);
_coordOutputCollector = new BatchOutputCollectorImpl(new OutputCollector(_coordCollector));
_coordConditions = (Map) context.getExecutorData("__coordConditions");
if(_coordConditions==null) {
_coordConditions = new HashMap<>();
for(String batchGroup: _coordSpecs.keySet()) {
CoordSpec spec = _coordSpecs.get(batchGroup);
CoordCondition cond = new CoordCondition();
cond.commitStream = spec.commitStream;
cond.expectedTaskReports = 0;
for(String comp: spec.coords.keySet()) {
CoordType ct = spec.coords.get(comp);
if(ct.equals(CoordType.single())) {
cond.expectedTaskReports+=1;
} else {
cond.expectedTaskReports+=context.getComponentTasks(comp).size();
}
}
cond.targetTasks = new HashSet<>();
for(String component: Utils.get(context.getThisTargets(),
COORD_STREAM(batchGroup),
new HashMap<String, Grouping>()).keySet()) {
cond.targetTasks.addAll(context.getComponentTasks(component));
}
_coordConditions.put(batchGroup, cond);
}
context.setExecutorData("_coordConditions", _coordConditions);
}
_bolt.prepare(conf, context, _coordOutputCollector);
}
//......
@Override
public void cleanup() {
_bolt.cleanup();
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
_bolt.declareOutputFields(declarer);
for(String batchGroup: _coordSpecs.keySet()) {
declarer.declareStream(COORD_STREAM(batchGroup), true, new Fields("id", "count"));
}
}
@Override
public Map<String, Object> getComponentConfiguration() {
Map<String, Object> ret = _bolt.getComponentConfiguration();
if(ret==null) ret = new HashMap<>();
ret.put(Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS, 5);
// TODO: Need to be able to set the tick tuple time to the message timeout, ideally without parameterization
return ret;
}
}
- In prepare, a CoordinatedOutputCollector is created first, wrapped in an OutputCollector, and finally wrapped in a BatchOutputCollectorImpl, which is passed to ITridentBatchBolt.prepare. The ITridentBatchBolt implementation used here is TridentSpoutExecutor
- prepare initializes RotatingMap<Object, TrackedBatch> _batches = new RotatingMap<>(2);
- The main job of prepare is to build the CoordCondition, which mostly means computing expectedTaskReports and targetTasks
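The expectedTaskReports calculation can be written as a standalone function. This is a sketch of the counting rule only: CoordConditionSketch, its CoordType enum and the taskCounts map (standing in for context.getComponentTasks) are assumptions.

```java
import java.util.Map;

/** Sketch of the counting rule: a "single" coordinator reports once, others report once per task. */
class CoordConditionSketch {
    enum CoordType { SINGLE, ALL }

    static int expectedTaskReports(Map<String, CoordType> coords,
                                   Map<String, Integer> taskCounts) {
        int expected = 0;
        for (Map.Entry<String, CoordType> e : coords.entrySet()) {
            if (e.getValue() == CoordType.SINGLE) {
                expected += 1;
            } else {
                // one report from every task of the upstream component
                expected += taskCounts.getOrDefault(e.getKey(), 0);
            }
        }
        return expected;
    }
}
```

For example, one SINGLE upstream component plus one ALL upstream component with 3 tasks gives 4 expected reports, regardless of how many tasks the SINGLE component has.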
TridentBoltExecutor.execute
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/TridentBoltExecutor.java
@Override
public void execute(Tuple tuple) {
if(TupleUtils.isTick(tuple)) {
long now = System.currentTimeMillis();
if(now - _lastRotate > _messageTimeoutMs) {
_batches.rotate();
_lastRotate = now;
}
return;
}
String batchGroup = _batchGroupIds.get(tuple.getSourceGlobalStreamId());
if(batchGroup==null) {
// this is so we can do things like have simple DRPC that doesn't need to use batch processing
_coordCollector.setCurrBatch(null);
_bolt.execute(null, tuple);
_collector.ack(tuple);
return;
}
IBatchID id = (IBatchID) tuple.getValue(0);
//get transaction id
//if it already exists and attempt id is greater than the attempt there
TrackedBatch tracked = (TrackedBatch) _batches.get(id.getId());
// if(_batches.size() > 10 && _context.getThisTaskIndex() == 0) {
// System.out.println("Received in " + _context.getThisComponentId() + " " + _context.getThisTaskIndex()
// + " (" + _batches.size() + ")" +
// "\ntuple: " + tuple +
// "\nwith tracked " + tracked +
// "\nwith id " + id +
// "\nwith group " + batchGroup
// + "\n");
//
// }
//System.out.println("Num tracked: " + _batches.size() + " " + _context.getThisComponentId() + " " + _context.getThisTaskIndex());
// this code here ensures that only one attempt is ever tracked for a batch, so when
// failures happen you don't get an explosion in memory usage in the tasks
if(tracked!=null) {
if(id.getAttemptId() > tracked.attemptId) {
_batches.remove(id.getId());
tracked = null;
} else if(id.getAttemptId() < tracked.attemptId) {
// no reason to try to execute a previous attempt than we've already seen
return;
}
}
if(tracked==null) {
tracked = new TrackedBatch(new BatchInfo(batchGroup, id, _bolt.initBatchState(batchGroup, id)), _coordConditions.get(batchGroup), id.getAttemptId());
_batches.put(id.getId(), tracked);
}
_coordCollector.setCurrBatch(tracked);
//System.out.println("TRACKED: " + tracked + " " + tuple);
TupleType t = getTupleType(tuple, tracked);
if(t==TupleType.COMMIT) {
tracked.receivedCommit = true;
checkFinish(tracked, tuple, t);
} else if(t==TupleType.COORD) {
int count = tuple.getInteger(1);
tracked.reportedTasks++;
tracked.expectedTupleCount+=count;
checkFinish(tracked, tuple, t);
} else {
tracked.receivedTuples++;
boolean success = true;
try {
_bolt.execute(tracked.info, tuple);
if(tracked.condition.expectedTaskReports==0) {
success = finishBatch(tracked, tuple);
}
} catch(FailedException e) {
failBatch(tracked, e);
}
if(success) {
_collector.ack(tuple);
} else {
_collector.fail(tuple);
}
}
_coordCollector.setCurrBatch(null);
}
private TupleType getTupleType(Tuple tuple, TrackedBatch batch) {
CoordCondition cond = batch.condition;
if(cond.commitStream!=null
&& tuple.getSourceGlobalStreamId().equals(cond.commitStream)) {
return TupleType.COMMIT;
} else if(cond.expectedTaskReports > 0
&& tuple.getSourceStreamId().startsWith(COORD_STREAM_PREFIX)) {
return TupleType.COORD;
} else {
return TupleType.REGULAR;
}
}
private void failBatch(TrackedBatch tracked, FailedException e) {
if(e!=null && e instanceof ReportedFailedException) {
_collector.reportError(e);
}
tracked.failed = true;
if(tracked.delayedAck!=null) {
_collector.fail(tracked.delayedAck);
tracked.delayedAck = null;
}
}
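- The execute method of TridentBoltExecutor first checks whether the tuple is a tick tuple. If it is, the method checks whether the time since _lastRotate (initialized to the current time in prepare) exceeds _messageTimeoutMs and, if so, calls _batches.rotate(). Tick tuples are emitted at the frequency given by Config.TOPOLOGY_TICK_TUPLE_FREQ_SECS (topology.tick.tuple.freq.secs), which TridentBoltExecutor sets to 5 seconds. _messageTimeoutMs is context.maxTopologyMessageTimeout() * 1000L, that is, the largest Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS (topology.message.timeout.secs, 30 by default in defaults.yaml) across the components of the topology, multiplied by 1000
- _batches stores TrackedBatch entries keyed by the txId of the TransactionAttempt. If no entry exists, a new TrackedBatch is created, and creating it calls back _bolt.initBatchState
- Next the tuple type is determined: TupleType.COMMIT, TupleType.COORD or TupleType.REGULAR. For TupleType.COMMIT, tracked.receivedCommit is set to true and checkFinish is called. For TupleType.COORD, the reportedTasks and expectedTupleCount counters are updated and checkFinish is called. For TupleType.REGULAR (the batch information sent by the coordinator), receivedTuples is incremented and _bolt.execute is called (the _bolt here is TridentSpoutExecutor); when tracked.condition.expectedTaskReports==0, finishBatch is called immediately, which removes the batch from _batches. If a FailedException is thrown, failBatch is called, which marks the batch as failed and reports the error when it is a ReportedFailedException, and the tuple is then acked or failed. If the downstream operation is an each and only some tuples of a batch throw FailedException, the failure reaches the master only after every tuple of the batch has been executed and the TupleType.COORD tuple triggers checkFinish, so there is some delay. For example, in a batch of 3 tuples where the second tuple throws FailedException, the third tuple is still executed, and only after all tuples of the batch have been processed does the TupleType.COORD tuple arrive and trigger checkFinish.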
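The rule that only one attempt is tracked per batch can be sketched on its own. AttemptTrackingSketch is a hypothetical class that keeps just the attempt id per txid; the real executor keeps a full TrackedBatch and uses a RotatingMap rather than a HashMap.

```java
import java.util.HashMap;
import java.util.Map;

/** Sketch of the attempt filter: newer attempts replace tracked state, older ones are dropped. */
class AttemptTrackingSketch {
    /** txid -> attempt id currently tracked for that txid. */
    final Map<Long, Integer> tracked = new HashMap<>();

    /** Returns true if a tuple for (txid, attemptId) should be processed. */
    boolean accept(long txid, int attemptId) {
        Integer current = tracked.get(txid);
        if (current != null && attemptId < current) {
            return false; // stale attempt: nothing to do
        }
        if (current == null || attemptId > current) {
            tracked.put(txid, attemptId); // a newer attempt discards the old tracked state
        }
        return true;
    }
}
```

After attempt 1 of a txid has been seen, a late tuple from attempt 0 is ignored, which keeps memory bounded to one tracked attempt per txid when retries happen.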
TridentBoltExecutor.checkFinish
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/topology/TridentBoltExecutor.java
private void checkFinish(TrackedBatch tracked, Tuple tuple, TupleType type) {
if(tracked.failed) {
failBatch(tracked);
_collector.fail(tuple);
return;
}
CoordCondition cond = tracked.condition;
boolean delayed = tracked.delayedAck==null &&
(cond.commitStream!=null && type==TupleType.COMMIT
|| cond.commitStream==null);
if(delayed) {
tracked.delayedAck = tuple;
}
boolean failed = false;
if(tracked.receivedCommit && tracked.reportedTasks == cond.expectedTaskReports) {
if(tracked.receivedTuples == tracked.expectedTupleCount) {
finishBatch(tracked, tuple);
} else {
//TODO: add logging that not all tuples were received
failBatch(tracked);
_collector.fail(tuple);
failed = true;
}
}
if(!delayed && !failed) {
_collector.ack(tuple);
}
}
private void failBatch(TrackedBatch tracked) {
failBatch(tracked, null);
}
private void failBatch(TrackedBatch tracked, FailedException e) {
if(e!=null && e instanceof ReportedFailedException) {
_collector.reportError(e);
}
tracked.failed = true;
if(tracked.delayedAck!=null) {
_collector.fail(tracked.delayedAck);
tracked.delayedAck = null;
}
}
- In execute, TridentBoltExecutor calls checkFinish whenever the tuple is of type TupleType.COMMIT or TupleType.COORD
- As soon as _bolt.execute(tracked.info, tuple) throws a FailedException, failBatch is called, which sets tracked.failed to true
- When checkFinish finds tracked.failed set to true, it calls _collector.fail(tuple), which in turn calls back the fail method of MasterBatchCoordinator
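The decision made by checkFinish can be condensed into a pure function. This is a sketch: FinishCheckSketch and its Outcome enum are hypothetical, and the delayed-ack bookkeeping of the real method is left out.

```java
/** Sketch of the checkFinish decision, without the delayed-ack handling. */
class FinishCheckSketch {
    enum Outcome { FINISH, FAIL, WAIT }

    static Outcome check(boolean failed, boolean receivedCommit,
                         int reportedTasks, int expectedTaskReports,
                         int receivedTuples, int expectedTupleCount) {
        if (failed) {
            return Outcome.FAIL; // an earlier tuple of this batch already failed
        }
        if (receivedCommit && reportedTasks == expectedTaskReports) {
            // every upstream task has reported, so the tuple counts must match exactly
            return receivedTuples == expectedTupleCount ? Outcome.FINISH : Outcome.FAIL;
        }
        return Outcome.WAIT; // still waiting for a commit or for more task reports
    }
}
```

A batch therefore finishes only when all upstream tasks have reported and the number of tuples received equals the number they claim to have sent.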
TridentSpoutExecutor
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/spout/TridentSpoutExecutor.java
public class TridentSpoutExecutor implements ITridentBatchBolt {
public static final String ID_FIELD = "$tx";
public static final Logger LOG = LoggerFactory.getLogger(TridentSpoutExecutor.class);
AddIdCollector _collector;
ITridentSpout<Object> _spout;
ITridentSpout.Emitter<Object> _emitter;
String _streamName;
String _txStateId;
TreeMap<Long, TransactionAttempt> _activeBatches = new TreeMap<>();
public TridentSpoutExecutor(String txStateId, String streamName, ITridentSpout<Object> spout) {
_txStateId = txStateId;
_spout = spout;
_streamName = streamName;
}
@Override
public void prepare(Map conf, TopologyContext context, BatchOutputCollector collector) {
_emitter = _spout.getEmitter(_txStateId, conf, context);
_collector = new AddIdCollector(_streamName, collector);
}
@Override
public void execute(BatchInfo info, Tuple input) {
// there won't be a BatchInfo for the success stream
TransactionAttempt attempt = (TransactionAttempt) input.getValue(0);
if(input.getSourceStreamId().equals(MasterBatchCoordinator.COMMIT_STREAM_ID)) {
if(attempt.equals(_activeBatches.get(attempt.getTransactionId()))) {
((ICommitterTridentSpout.Emitter) _emitter).commit(attempt);
_activeBatches.remove(attempt.getTransactionId());
} else {
throw new FailedException("Received commit for different transaction attempt");
}
} else if(input.getSourceStreamId().equals(MasterBatchCoordinator.SUCCESS_STREAM_ID)) {
// valid to delete before what's been committed since
// those batches will never be accessed again
_activeBatches.headMap(attempt.getTransactionId()).clear();
_emitter.success(attempt);
} else {
_collector.setBatch(info.batchId);
_emitter.emitBatch(attempt, input.getValue(1), _collector);
_activeBatches.put(attempt.getTransactionId(), attempt);
}
}
@Override
public void cleanup() {
_emitter.close();
}
@Override
public void declareOutputFields(OutputFieldsDeclarer declarer) {
List<String> fields = new ArrayList<>(_spout.getOutputFields().toList());
fields.add(0, ID_FIELD);
declarer.declareStream(_streamName, new Fields(fields));
}
@Override
public Map<String, Object> getComponentConfiguration() {
return _spout.getComponentConfiguration();
}
@Override
public void finishBatch(BatchInfo batchInfo) {
}
@Override
public Object initBatchState(String batchGroup, Object batchId) {
return null;
}
}
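- The BatchOutputCollector used by TridentSpoutExecutor is built in the prepare method of TridentBoltExecutor through several layers of wrapping: first CoordinatedOutputCollector, then OutputCollector, and finally BatchOutputCollectorImpl. The most important layer is CoordinatedOutputCollector, which tracks the number of tuples emitted per taskId. In this executor's own prepare method the collector is wrapped once more in an AddIdCollector, which mainly adds the batchId (that is, the TransactionAttempt)
- The ITridentSpout held by TridentSpoutExecutor is the BatchSpoutExecutor that wraps the user's original spout (the original spout is assumed to be an IBatchSpout, so BatchSpoutExecutor adapts it to ITridentSpout). The execute method of TridentSpoutExecutor handles a tuple according to its stream. If the tuple comes from the master on MasterBatchCoordinator.COMMIT_STREAM_ID ($commit), it calls the emitter's commit method to commit the current TransactionAttempt (the example in this article produces no commit tuples) and then removes that tx from _activeBatches. If the tuple comes from the master on MasterBatchCoordinator.SUCCESS_STREAM_ID ($success), it first removes from _activeBatches every TransactionAttempt whose txId is smaller than this txId and then calls the emitter's success method to mark the TransactionAttempt as successful, which calls back the ack method of the original spout (an IBatchSpout)
- A tuple on neither MasterBatchCoordinator.COMMIT_STREAM_ID ($commit) nor MasterBatchCoordinator.SUCCESS_STREAM_ID ($success) is the message that starts a batch. The executor sets the batchId and calls the emitter's emitBatch to send the data (the batchId passed on is the txId of the TransactionAttempt), and puts the TransactionAttempt into _activeBatches (a batch here corresponds to a TransactionAttempt)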
FixedBatchSpout
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/testing/FixedBatchSpout.java
public class FixedBatchSpout implements IBatchSpout {
Fields fields;
List<Object>[] outputs;
int maxBatchSize;
HashMap<Long, List<List<Object>>> batches = new HashMap<Long, List<List<Object>>>();
public FixedBatchSpout(Fields fields, int maxBatchSize, List<Object>... outputs) {
this.fields = fields;
this.outputs = outputs;
this.maxBatchSize = maxBatchSize;
}
int index = 0;
boolean cycle = false;
public void setCycle(boolean cycle) {
this.cycle = cycle;
}
@Override
public void open(Map conf, TopologyContext context) {
index = 0;
}
@Override
public void emitBatch(long batchId, TridentCollector collector) {
List<List<Object>> batch = this.batches.get(batchId);
if(batch == null){
batch = new ArrayList<List<Object>>();
if(index>=outputs.length && cycle) {
index = 0;
}
for(int i=0; index < outputs.length && i < maxBatchSize; index++, i++) {
batch.add(outputs[index]);
}
this.batches.put(batchId, batch);
}
for(List<Object> list : batch){
collector.emit(list);
}
}
@Override
public void ack(long batchId) {
this.batches.remove(batchId);
}
@Override
public void close() {
}
@Override
public Map<String, Object> getComponentConfiguration() {
Config conf = new Config();
conf.setMaxTaskParallelism(1);
return conf;
}
@Override
public Fields getOutputFields() {
return fields;
}
}
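- The spout supplied by the user is an IBatchSpout. It caches the tuples of each batchId, so emitting the same batchId again replays the same tuples, which gives it transactional spout semantics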
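The caching behaviour can be reproduced without Storm. The sketch below is a hypothetical plain-Java model (ReplayableBatchSketch, with String tuples and no cycling), not the FixedBatchSpout class itself.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Plain-Java model of a batch source that replays the same tuples for a repeated batchId. */
class ReplayableBatchSketch {
    private final List<String> outputs;
    private final int maxBatchSize;
    private final Map<Long, List<String>> batches = new HashMap<>();
    private int index = 0;

    ReplayableBatchSketch(int maxBatchSize, String... outputs) {
        this.maxBatchSize = maxBatchSize;
        this.outputs = Arrays.asList(outputs);
    }

    /** Returns the tuples for batchId, building the batch only the first time it is requested. */
    List<String> emitBatch(long batchId) {
        return batches.computeIfAbsent(batchId, id -> {
            List<String> batch = new ArrayList<>();
            for (int i = 0; index < outputs.size() && i < maxBatchSize; index++, i++) {
                batch.add(outputs.get(index));
            }
            return batch;
        });
    }

    /** Once a batch is acked its cached tuples are no longer needed. */
    void ack(long batchId) {
        batches.remove(batchId);
    }
}
```

With seven inputs and a batch size of 3, batchId 1 yields the first three tuples both on the first call and on a retry, while batchId 2 and batchId 3 yield the next three and the last one.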
TridentTopology.newStream
storm-1.2.2/storm-core/src/jvm/org/apache/storm/trident/TridentTopology.java
public Stream newStream(String txId, IRichSpout spout) {
return newStream(txId, new RichSpoutBatchExecutor(spout));
}
public Stream newStream(String txId, IBatchSpout spout) {
Node n = new SpoutNode(getUniqueStreamId(), spout.getOutputFields(), txId, spout, SpoutNode.SpoutType.BATCH);
return addNode(n);
}
public Stream newStream(String txId, ITridentSpout spout) {
Node n = new SpoutNode(getUniqueStreamId(), spout.getOutputFields(), txId, spout, SpoutNode.SpoutType.BATCH);
return addNode(n);
}
public Stream newStream(String txId, IPartitionedTridentSpout spout) {
return newStream(txId, new PartitionedTridentSpoutExecutor(spout));
}
public Stream newStream(String txId, IOpaquePartitionedTridentSpout spout) {
return newStream(txId, new OpaquePartitionedTridentSpoutExecutor(spout));
}
public Stream newStream(String txId, ITridentDataSource dataSource) {
if (dataSource instanceof IBatchSpout) {
return newStream(txId, (IBatchSpout) dataSource);
} else if (dataSource instanceof ITridentSpout) {
return newStream(txId, (ITridentSpout) dataSource);
} else if (dataSource instanceof IPartitionedTridentSpout) {
return newStream(txId, (IPartitionedTridentSpout) dataSource);
} else if (dataSource instanceof IOpaquePartitionedTridentSpout) {
return newStream(txId, (IOpaquePartitionedTridentSpout) dataSource);
} else {
throw new UnsupportedOperationException("Unsupported stream");
}
}
- A user can pass an IBatchSpout directly to TridentTopology.newStream. The benefit is that TridentTopology wraps it in a BatchSpoutExecutor during build to turn it into an ITridentSpout (so the user does not have to implement the ITridentSpout interfaces; the Trident spout logic is hidden, which lets users who have only worked with plain topologies get started with Trident topologies quickly)
- BatchSpoutExecutor implements ITridentSpout and adapts an IBatchSpout to it. Its coordinator is EmptyCoordinator and its emitter is BatchSpoutEmitter
- If the spout passed to TridentTopology.newStream is an IPartitionedTridentSpout, newStream wraps it in a PartitionedTridentSpoutExecutor to turn it into an ITridentSpout. An IOpaquePartitionedTridentSpout is wrapped in an OpaquePartitionedTridentSpoutExecutor in the same way
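Summary
- In newStream or build, TridentTopology adapts every ITridentDataSource that is not already an ITridentSpout: IBatchSpout (in build), IPartitionedTridentSpout (in newStream) and IOpaquePartitionedTridentSpout (in newStream) are adapted by BatchSpoutExecutor, PartitionedTridentSpoutExecutor and OpaquePartitionedTridentSpoutExecutor respectively (when TridentTopologyBuilder runs buildTopology, each ITridentSpout is wrapped first in TridentSpoutExecutor and then in TridentBoltExecutor, which turns it into a bolt, while the only real spout of the whole TridentTopology is MasterBatchCoordinator. An IBatchSpout is therefore first wrapped by BatchSpoutExecutor into an ITridentSpout and then wrapped by TridentSpoutExecutor and TridentBoltExecutor into a bolt)
- The ack of IBatchSpout works at batch granularity, that is, per TransactionAttempt, and there is no fail method. If emitBatch throws a FailedException, TridentBoltExecutor calls failBatch (checkFinish for a batch is triggered only after all of its tuples have been executed), which reports the error and sets failed to true on the TrackedBatch. Later, when checkFinish finds tracked.failed set to true, it calls _collector.fail(tuple), which calls back the fail method of MasterBatchCoordinator
- The fail method of MasterBatchCoordinator removes the failed TransactionAttempt from _activeTx, also removes every entry whose txId is greater than the failed txId, and then calls sync to continue with new TransactionAttempts (note that _currTransaction is not changed here, so the retry starts again from the failed txId; only ack advances _currTransaction to nextTransactionId)
- The execute method of TridentBoltExecutor uses tick tuples to check whether more than _messageTimeoutMs has passed since the last rotate (_messageTimeoutMs is the largest Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS among the components, multiplied by 1000 to convert seconds to milliseconds). If so, it rotates _batches, which drops the last bucket. Tick tuples arrive every 5 seconds, so with Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS at 30 seconds, _messageTimeoutMs is 30*1000: every 5 seconds the executor checks whether more than 30 seconds have passed since the last rotate and, if they have, rotates and discards the data (TrackedBatch entries) in the last bucket. In effect this resets the TrackedBatch information of batches that have timed out
- MasterBatchCoordinator.fail is triggered in several situations. One is a downstream component deliberately throwing FailedException, which triggers the master's fail and a retry of the TransactionAttempt. Another is a downstream component taking longer than Config.TOPOLOGY_MESSAGE_TIMEOUT_SECS (topology.message.timeout.secs, 30 by default in defaults.yaml) to process a tuple, in which case the timeout triggers the master's fail and the TransactionAttempt fails and is retried. There is currently no limit on the number of attempts, which needs attention in production: if a single tuple of a batchId fails, all tuples of that batchId are re-emitted. If the downstream does not handle this properly, the earlier tuples of a batchId may succeed while the later ones fail, so the successful tuples are processed again and again (avoiding a failed batch whose tuples are partly processed and partly failed requires using Trident State).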
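The effect of the two-bucket rotation on tracked batches can be seen in a small model. This is a sketch of the idea only: RotatingMapSketch is a hypothetical class written for this example and makes no claim to match Storm's RotatingMap line by line.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.HashMap;
import java.util.Map;

/** Sketch of a bucketed map whose oldest bucket is dropped on every rotation. */
class RotatingMapSketch<K, V> {
    private final Deque<Map<K, V>> buckets = new ArrayDeque<>();

    RotatingMapSketch(int numBuckets) {
        for (int i = 0; i < numBuckets; i++) {
            buckets.addFirst(new HashMap<>());
        }
    }

    /** Writes always go to the newest bucket; any older copy of the key is removed. */
    void put(K key, V value) {
        for (Map<K, V> bucket : buckets) {
            bucket.remove(key);
        }
        buckets.peekFirst().put(key, value);
    }

    /** Looks the key up in every bucket, newest first. */
    V get(K key) {
        for (Map<K, V> bucket : buckets) {
            if (bucket.containsKey(key)) {
                return bucket.get(key);
            }
        }
        return null;
    }

    /** Drops the oldest bucket, opens a fresh newest one, and returns what expired. */
    Map<K, V> rotate() {
        Map<K, V> dead = buckets.removeLast();
        buckets.addFirst(new HashMap<>());
        return dead;
    }
}
```

With two buckets, an entry survives the first rotation after it was written and is dropped by the second. Assuming rotations roughly 30 seconds apart as computed above, an untouched TrackedBatch would be discarded somewhere between about 30 and 60 seconds after it was last stored.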