Network Data
Zachary's karate club data set includes 78 connections between 34 members:
Using the spring_layout algorithm to visualize it, we get
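For readers who want to reproduce the data and the layout, here is a minimal sketch. It assumes the networkx library, which ships this data set as `karate_club_graph` and provides a `spring_layout` function; the `seed` value is an arbitrary choice made only for reproducibility.

```python
import networkx as nx

# Zachary's karate club is bundled with networkx as an example graph.
G = nx.karate_club_graph()
print(G.number_of_nodes(), G.number_of_edges())  # 34 78

# spring_layout returns a dict mapping each node to a 2D position.
# The seed is an assumption; any fixed value gives a repeatable layout.
pos = nx.spring_layout(G, seed=0)

# Drawing needs matplotlib, so it is left commented out here:
# import matplotlib.pyplot as plt
# nx.draw(G, pos, with_labels=True)
# plt.show()
```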
Task:
For every member w_t, predict the surrounding members in a team of size 2c+1, so that we obtain the most probable list of team members:
Model
Objective function: Maximize the log probability of any "social context" member given the current center (target) member:
![Eq. 2](http://latex.codecogs.com/svg.latex?J(\theta)=\frac{1}{T}\sum_{t=1}^T\sum_{-c\leq j\leq c} \log p(w_{t+j}|w_t),)
in which T is the total number of target members and c is the half-width of the time window. In other words, the inner sum runs over all members in a time window and the outer sum runs over all time windows.
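The double sum in Eq. 2 can be pictured as a loop over (target, context) pairs. The sketch below is illustrative only: `context_pairs` and the example sequence `walk` are hypothetical names, and it assumes that the offset j = 0 (the target itself) and offsets that fall outside the sequence are skipped.

```python
def context_pairs(sequence, c):
    """Yield (target, context) pairs for every position t and offset j with 0 < |j| <= c."""
    for t, target in enumerate(sequence):
        for j in range(-c, c + 1):
            # Assumption: skip the target itself and positions outside the sequence.
            if j == 0 or not 0 <= t + j < len(sequence):
                continue
            yield target, sequence[t + j]

walk = [0, 1, 2, 3]  # hypothetical sequence of member ids
pairs = list(context_pairs(walk, c=1))
print(pairs)  # [(0, 1), (1, 0), (1, 2), (2, 1), (2, 3), (3, 2)]
```

Each pair contributes one term log p(w_{t+j}|w_t) to the objective.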
Now let's define the probability function as
or the simplified version,
![Eq. 4](http://latex.codecogs.com/svg.latex?p(O|I)=\frac{\exp(v_o^T v_i)}{\sum_{w=1}^W\exp(v_w^T v_i)},)
in which I and O correspond to the target member and the context member, respectively, and w ranges over all W possible context members in the training data.
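As a rough illustration, Eq. 4 is a softmax over dot products. The sketch below assumes the context vectors are stored in a separate list from the target vector; the function name `prob` and the example vectors are hypothetical.

```python
import math

def prob(o, v_i, context):
    """p(O|I): softmax of the dot products between each context vector and v_i."""
    scores = [sum(a * b for a, b in zip(v_w, v_i)) for v_w in context]
    # Subtracting the largest score avoids overflow and leaves the ratio unchanged.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    return exps[o] / sum(exps)

# Hypothetical 2D vectors: three context members and one target.
context = [[0.1, 0.2], [0.4, -0.3], [-0.5, 0.6]]
v_i = [0.3, 0.4]

probs = [prob(o, v_i, context) for o in range(len(context))]
total = sum(probs)  # the probabilities over all context members sum to 1
```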
Now we can differentiate the log probability with respect to v_i to obtain the gradient:
![Eq. 5](http://latex.codecogs.com/svg.latex? \frac{\partial \log p(O|I)}{\partial v_i}=\frac{\partial }{\partial v_i}[v_o^T v_i-\log \sum_{w=1}^W\exp(v_w^T v_i) ],)
By applying the chain rule twice, we obtain
![Eq. 6](http://latex.codecogs.com/svg.latex? \frac{\partial \log p(O|I)}{\partial v_i}=v_o-\sum_{x=1}^W\frac{\exp(v_x^T v_i)}{\sum_{w=1}^W\exp(v_w^T v_i)}v_x = v_o-\sum_{x=1}^Wp(x|I)v_x,)
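One way to sanity-check the gradient is to compare it against finite differences. The sketch below does this under the assumption that the context vectors are held fixed and stored separately from v_i; all names and numbers are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def log_prob(o, v_i, context):
    """log p(O|I) = v_o^T v_i - log sum_w exp(v_w^T v_i)."""
    scores = [dot(v_w, v_i) for v_w in context]
    m = max(scores)  # shift for numerical stability
    return scores[o] - m - math.log(sum(math.exp(s - m) for s in scores))

def grad_log_prob(o, v_i, context):
    """Analytic gradient: v_o minus the probability-weighted mean of the context vectors."""
    probs = [math.exp(log_prob(x, v_i, context)) for x in range(len(context))]
    expected = [sum(p * v_x[k] for p, v_x in zip(probs, context))
                for k in range(len(v_i))]
    return [context[o][k] - expected[k] for k in range(len(v_i))]

# Hypothetical 2D vectors: three context members and one target.
context = [[0.5, -0.2], [0.1, 0.8], [-0.7, 0.3]]
v_i = [0.3, 0.4]
o = 1

analytic = grad_log_prob(o, v_i, context)

# Central finite differences as an independent check.
h = 1e-6
numeric = []
for k in range(len(v_i)):
    plus = list(v_i)
    plus[k] += h
    minus = list(v_i)
    minus[k] -= h
    numeric.append((log_prob(o, plus, context) - log_prob(o, minus, context)) / (2 * h))

max_error = max(abs(a - n) for a, n in zip(analytic, numeric))
```

If the two gradients agree to within a small tolerance, the closed form is very likely implemented correctly.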
In order to maximize Eq. 2, we update the vector v_i by gradient ascent:
![Eq. 7](http://latex.codecogs.com/svg.latex? v_i^{new} = v_i^{old}+\epsilon \frac{1}{T}\sum_{t=1}^T\sum_{-c\leq o\leq c} [v_o-\sum_{x=1}^Wp(x|I)v_x],)
in which p(x|I) is given by Eq. 4 with x in place of O, and ε is the step size (learning rate).
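A single update step might look like the sketch below. It is a simplification rather than a full training loop: it updates one target vector, averages the gradient over that target's observed context members instead of over all T targets, and keeps the context vectors fixed. The names `update`, `observed`, and `eps` are hypothetical.

```python
import math

def softmax_probs(v_i, context):
    scores = [sum(a * b for a, b in zip(v_w, v_i)) for v_w in context]
    m = max(scores)  # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

def mean_log_prob(v_i, context, observed):
    """Average log p(O|I) over the observed context members."""
    p = softmax_probs(v_i, context)
    return sum(math.log(p[o]) for o in observed) / len(observed)

def update(v_i, context, observed, eps):
    """One gradient-ascent step on v_i."""
    p = softmax_probs(v_i, context)
    dim = len(v_i)
    # Probability-weighted mean of the context vectors: sum_x p(x|I) v_x.
    expected = [sum(p[x] * context[x][k] for x in range(len(context)))
                for k in range(dim)]
    # Average of (v_o - expected) over the observed context members.
    grad = [sum(context[o][k] - expected[k] for o in observed) / len(observed)
            for k in range(dim)]
    return [v_i[k] + eps * grad[k] for k in range(dim)]

# Hypothetical example: three context vectors, target starts at the origin.
context = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
observed = [0, 0, 1]  # indices of context members seen around the target
v_old = [0.0, 0.0]

v_new = update(v_old, context, observed, eps=0.1)
before = mean_log_prob(v_old, context, observed)
after = mean_log_prob(v_new, context, observed)
```

With a small enough step size, `after` should be larger than `before`, since the step moves v_i toward the context members that were actually observed.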
Updating vectors
Now let's assume all these members are embedded in a 2D space, i.e., each v_i has length 2,