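In the current training data, some samples consistently show a very high loss, so those samples need more training. Because the samples have already been grouped into different buckets by length, this means sampling more often from particular buckets.
The following code is an example: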
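import random

def rand_pick(seq, probabilities):
    """Return one item from seq, chosen according to probabilities."""
    x = random.uniform(0, 1)
    cumprob = 0.0
    item = None  # returned unchanged only if seq is empty
    for item, item_pro in zip(seq, probabilities):
        cumprob += item_pro
        # Stop at the first item whose cumulative probability exceeds x.
        if x < cumprob:
            break
    return item

value_list = [0, 1, 2]  # candidate values, e.g. bucket indices
probabilities = [0.4, 0.3, 0.3]  # sampling probability of each value
for i in range(10):
    print(rand_pick(value_list, probabilities))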
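
The example above uses fixed probabilities. As a rough sketch of how they might be tied to the loss, the snippet below turns a per-bucket average loss into sampling probabilities and then draws bucket indices with the standard library's `random.choices`. The helper `bucket_probabilities`, the `avg_losses` values, and the loss-proportional weighting are all assumptions made for illustration; none of them come from the original note.

```python
import random
from collections import Counter

def bucket_probabilities(avg_losses):
    """Normalize per-bucket average losses into sampling probabilities (hypothetical helper)."""
    total = sum(avg_losses)
    if total <= 0:
        # No usable loss signal: fall back to uniform sampling.
        return [1.0 / len(avg_losses)] * len(avg_losses)
    return [loss / total for loss in avg_losses]

# Made-up average loss for each of three buckets (illustrative values only).
avg_losses = [0.5, 2.0, 1.5]
probs = bucket_probabilities(avg_losses)

# Draw 1000 bucket indices; buckets with a higher loss are drawn more often.
rng = random.Random(0)  # fixed seed so the run is repeatable
bucket_ids = list(range(len(avg_losses)))
draws = rng.choices(bucket_ids, weights=probs, k=1000)

print(probs)
print(Counter(draws))
```

Proportional weighting is only one possible choice. A temperature or a minimum probability per bucket could be added so that low-loss buckets are still visited.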