Make Your Own Neural Network

Introduction

  • I will have failed if I haven’t shown you how school level mathematics and simple computer recipes can be incredibly powerful - by making our own artificial intelligence mimicking the learning ability of human brains.

Part 1 - How They Work

  • A human may find it hard to do large sums very quickly but the process of doing it doesn’t require much intelligence at all.
    We can process the quite large amount of information that an image contains, and very successfully recognise what’s in it. This kind of task isn’t easy for computers - in fact it’s incredibly difficult.
  • When we don’t know exactly how something works we can try to estimate it with a model which includes parameters which we can adjust. If we didn’t know how to convert kilometres to miles, we might use a linear function as a model, with an adjustable gradient.
    A good way of refining these models is to adjust the parameters based on how wrong the model is compared to known true examples.
    Build a model with adjustable parameters → guess initial parameter values → correct the parameters based on the error against known true examples (the bigger the error, the bigger the correction) → repeat until the error is small enough (see the sketch below).
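    A minimal sketch of this loop in Python (not from the book), using the kilometres-to-miles model; the training pair, initial guess and learning rate are illustrative assumptions:
# refine the adjustable gradient of the model: miles = gradient * kilometres
kilometres = 100.0
true_miles = 62.14            # a known true example, for illustration
gradient = 0.5                # initial guess at the parameter
learning_rate = 0.1           # moderates each correction (see below)
for i in range(10):
    error = true_miles - gradient * kilometres
    # the bigger the error, the bigger the correction
    gradient += learning_rate * error / kilometres
    print(i, round(gradient, 4), round(error, 4))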

  • The way a neural network learns says something about perfectionism and the fear of making mistakes: nothing in the world is perfect, and errors tell us how far we are from being right.

  • Visualising data is often very helpful for getting a better understanding of training data, a feel for it, which isn’t easy to get just by looking at a list or table of numbers.

  • We want to use the error to inform the required change in a parameter.

  • We moderate the updates.
    This way we move in the direction that the training example suggests, but do so slightly cautiously, keeping some of the previous value which was arrived at through potentially many previous training iterations.
    The moderation can dampen the impact of those errors or noise.
    The moderating factor is often called a learning rate.
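    A minimal sketch of a moderated update (illustrative names, not from the book):
# keep most of the old value, moving only a fraction of the way
# towards the value this training example suggests
def moderated_update(old_value, suggested_value, learning_rate=0.3):
    return old_value + learning_rate * (suggested_value - old_value)

print(moderated_update(0.5, 0.7))   # 0.56: a cautious step towards 0.7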

  • The learning rate used in neural networks is a reminder that pushing too hard when learning wipes out what was learned before, and that a mistake shouldn’t be overcorrected.

  • Traditional computers processed data very much sequentially, and in pretty exact concrete terms. There is no fuzziness or ambiguity about their cold hard calculations. Animal brains, on the other hand, although apparently running at much slower rhythms, seemed to process signals in parallel, and fuzziness was a feature of their computation.

  • Observations suggest that neurons don’t react readily, but instead suppress the input until it has grown so large that it triggers an output. You can think of this as a threshold that must be reached before any output is produced.

  • The sigmoid function is much easier to do calculations with than other S-shaped functions.
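    A minimal sketch of the sigmoid, y = 1 / (1 + e^{-x}):
import numpy

def sigmoid(x):
    # S-shaped squashing function mapping any input into (0, 1)
    return 1.0 / (1.0 + numpy.exp(-x))

print(sigmoid(0.0))    # 0.5, the centre of the S-shape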

  • Interestingly, if only one of the several inputs is large and the rest small, this may be enough to fire the neuron. What’s more, the neuron can fire if some of the inputs are individually almost, but not quite, large enough because when combined the signal is large enough to overcome the threshold. In an intuitive way, this gives you a sense of the more sophisticated, and in a sense fuzzy, calculations that such neurons can do.

  • It is the weights that do the learning in a neural network, as they are iteratively refined to give better and better results.

  • The many calculations needed to feed a signal forward through a neural network can be expressed as matrix multiplication.
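    A minimal sketch of one feed-forward step as a matrix multiplication; the 3x3 weight matrix and input vector are illustrative:
import numpy

# W is the (hidden x input) weight matrix
W = numpy.array([[0.9, 0.3, 0.4],
                 [0.2, 0.8, 0.2],
                 [0.1, 0.5, 0.6]])
inputs = numpy.array([0.9, 0.1, 0.8])

# X = W . I gives the combined signal into each hidden node;
# the sigmoid activation then squashes it into the node's output
hidden_inputs = numpy.dot(W, inputs)
hidden_outputs = 1.0 / (1.0 + numpy.exp(-hidden_inputs))
print(hidden_outputs)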

  • We’re using the weights in two ways. Firstly, we use the weights to propagate signals forward from the input to the output layers of a neural network. Secondly, we use the weights to propagate the error backwards from the output into the network. This second use is called backpropagation.

  • Trying to vectorise the process: Being able to express a lot of calculations in matrix form makes it more concise to write down, and also allows computers to do all that work much more efficiently.

  • A matrix approach to propagating the errors back:
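    A minimal sketch, assuming the simplified form in which the normalising fractions are dropped and the transposed weight matrix splits the output errors back across the hidden nodes; the numbers are illustrative:
import numpy

# weights between the hidden and output layers (output x hidden)
W_ho = numpy.array([[2.0, 3.0],
                    [1.0, 4.0]])
errors_output = numpy.array([0.8, 0.5])

# errors_hidden = W_ho^T . errors_output
errors_hidden = numpy.dot(W_ho.T, errors_output)
print(errors_hidden)    # [2.1  4.4]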

  • Gradient descent is a really good way of working out the minimum of a function.
    To avoid ending up in the wrong valley, or function minimum, we train neural networks several times starting from different starting link weights.
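    A minimal sketch of gradient descent on a simple illustrative function, y = (x - 3)^2, restarted from several random starting points to stand in for different starting link weights:
import random

def slope(x):
    # gradient of the illustrative error function y = (x - 3)**2
    return 2.0 * (x - 3.0)

for _ in range(3):
    x = random.uniform(-10.0, 10.0)   # a fresh random starting point
    for _ in range(100):
        x -= 0.1 * slope(x)           # step downhill, moderated by 0.1
    print(round(x, 4))                # each run ends near the minimum at 3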

  • The final answer describing the slope of the error function, E = (target - actual)^2, with respect to a weight w_{jk}, so we can adjust that weight:
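    Reconstructed from the part-by-part description in the next bullet:

    \frac{\partial E}{\partial w_{jk}} = -(t_k - o_k) \cdot \text{sigmoid}\Big(\sum_j w_{jk} \cdot o_j\Big) \cdot \Big(1 - \text{sigmoid}\Big(\sum_j w_{jk} \cdot o_j\Big)\Big) \cdot o_j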

  • This is the key to training neural networks.
    It’s worth a second look at each part. The first part is simply the (target - actual) error. The sum expression inside the sigmoids is the signal into the final layer node, before the activation squashing function is applied. The last part is the output from the previous hidden layer node j.

  • The slope of the error function for the weights between the input and hidden layers:
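    Following the same pattern, with the backpropagated error e_j taking the place of the output error, and o_i the output from input node i (reconstructed to mirror the formula above):

    \frac{\partial E}{\partial w_{ij}} = -e_j \cdot \text{sigmoid}\Big(\sum_i w_{ij} \cdot o_i\Big) \cdot \Big(1 - \text{sigmoid}\Big(\sum_i w_{ij} \cdot o_i\Big)\Big) \cdot o_i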

  • The updated weight w_{jk} is the old weight adjusted by the negative of the error slope, moderated by a learning rate \alpha:
    In other words: adjust each weight according to the gradient of the squared error with respect to that weight.
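    Written out:

    w_{jk}^{new} = w_{jk}^{old} - \alpha \cdot \frac{\partial E}{\partial w_{jk}}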
  • A very flat activation function is problematic because we use the gradient to learn new weights.
    To avoid saturating a neural network, we should try to keep the inputs small.
    We shouldn’t make the inputs too small either, because the gradient also depends on the incoming signal (o_j).
    A good recommendation is to rescale inputs into the range 0.0 to 1.0. Some will add a small offset to the inputs, like 0.01.
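    A minimal sketch of this rescaling, assuming raw values in the range 0 to 255 (like greyscale pixels), which is an illustrative choice:
import numpy

raw = numpy.array([0.0, 128.0, 255.0])    # illustrative raw values

# squeeze into 0.01 .. 1.00: scale down to 0.99, then add the small
# 0.01 offset so no input is exactly zero (a zero o_j kills the update)
scaled = (raw / 255.0 * 0.99) + 0.01
print(scaled)    # [0.01  0.5069...  1.0]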

  • The weights are initialised by randomly sampling from a range that is roughly the inverse of the square root of the number of links into a node. So if each node has 3 incoming links, the initial weights should be in the range ±1/√3 ≈ ±0.577. If each node has 100 incoming links, the weights should be in the range ±1/√100 = ±0.1.
    This is sampling from a normal distribution with mean zero and a standard deviation which is the inverse of the square root of the number of links into a node.
    This assumes quite a few things which may not be true, such as a particular activation function (the rule of thumb was derived for alternatives like tanh()) and a specific distribution of the input signals.
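    A minimal sketch of this initialisation with numpy; the layer sizes are illustrative, and the (hidden x input) shape matches the feed-forward sketch earlier:
import numpy

input_nodes, hidden_nodes = 100, 30

# normal distribution: mean 0.0, standard deviation 1/sqrt(incoming links)
wih = numpy.random.normal(0.0, pow(input_nodes, -0.5),
                          (hidden_nodes, input_nodes))
print(wih.shape, round(float(wih.std()), 3))   # std should be near 0.1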

Part 2 - DIY with Python

  • Let’s sketch out what a neural network class should look like. We know it should have at least three functions:
    initialisation - to set the number of input, hidden and output nodes
    train - refine the weights after being given a training set example to learn from
    query - give an answer from the output nodes after being given an input
# neural network class definition
class neuralNetwork:

    # initialise the neural network
    def __init__(self):
        pass

    # train the neural network
    def train(self):
        pass

    # query the neural network
    def query(self):
        pass
  • Good programmers, computer scientists and mathematicians try to create general code rather than specific code whenever they can.

  • A good technique is to start small and grow the code, finding and fixing problems along the way:

    # initialise the neural network
    def __init__(self, inputnodes, hiddennodes, outputnodes, learningrate):
        # set number of nodes in each input, hidden, output layer
        self.inodes = inputnodes
        self.hnodes = hiddennodes
        self.onodes = outputnodes
        # learning rate
        self.lr = learningrate

# number of input, hidden and output nodes
input_nodes = 3
hidden_nodes = 3
output_nodes = 3
# learning rate is 0.3
learning_rate = 0.3
# create instance of neural network
n = neuralNetwork(input_nodes, hidden_nodes, output_nodes, learning_rate)
  • Neural networks should find features or patterns in the input which can be expressed in a shorter form than the input itself. So by choosing a value smaller than the number of inputs, we force the network to try to summarise the key features. However if we choose too few hidden layer nodes, then we restrict the ability of the network to find sufficient features or patterns. We’d be taking away its ability to express its own understanding of the training data.

  • There isn’t a perfect method for choosing how many hidden nodes there should be for a problem. Indeed there isn’t a perfect method for choosing the number of hidden layers either. The best approaches, for now, are to experiment until you find a good configuration for the problem you’re trying to solve.

  • Overfitting is something to beware of across many different kinds of machine learning, not just neural networks.
    A neural network is only one kind of machine learning.
    Over-learning reduces openness to new things and makes you rigid.

  • Neural network learning is a random process at heart; it can sometimes not work so well, and sometimes work really badly.

  • Do the testing experiment many times for each combination of learning rate and epochs to minimise the effect of the randomness that is inherent in gradient descent.

  • The hidden layers are where the learning happens. Actually, it’s the link weights before and after the hidden nodes that do the learning.
    You can’t learn more than the network’s learning capacity, but you can change the network’s shape to increase that capacity.

  • Question: how would the code need to change to allow setting the number of hidden layers and the number of nodes in each hidden layer? A possible direction is sketched below.
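    One possible direction, a minimal sketch only (not from the book; the function name and sizes are illustrative): keep a list of hidden layer sizes and build one weight matrix per pair of adjacent layers.
import numpy

def make_weight_matrices(input_nodes, hidden_layers, output_nodes):
    # hidden_layers is a list such as [100, 50]: one entry per hidden layer
    sizes = [input_nodes] + hidden_layers + [output_nodes]
    # one weight matrix between each pair of adjacent layers,
    # initialised as before: normal(0, 1/sqrt(incoming links))
    return [numpy.random.normal(0.0, pow(sizes[i], -0.5),
                                (sizes[i + 1], sizes[i]))
            for i in range(len(sizes) - 1)]

weights = make_weight_matrices(784, [100, 50], 10)
print([w.shape for w in weights])   # [(100, 784), (50, 100), (10, 50)]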
