
Using MATLAB to determine network centrality: Eigenvectors and Eigenvalues of Graphs

The principal eigenvector of a graph is often used to measure the centrality of its vertices, which is a measure of prominence or importance in the network.

An eigenvalue of a graph is defined as an eigenvalue of the graph's adjacency matrix A, or of the graph's Laplacian matrix L = D - A, where D is a diagonal matrix with D(v,v) equal to the degree of vertex v. The principal eigenvector of a graph is defined as the eigenvector corresponding to the largest eigenvalue of the adjacency matrix.
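
For example (a small illustration only; the graph analyzed below is different), for the path graph 1-2-3 we can build the Laplacian directly from these definitions:

>> Ap = [0 1 0; 1 0 1; 0 1 0];   % adjacency matrix of the path 1-2-3
>> Dp = diag(sum(Ap));           % diagonal degree matrix: degrees 1, 2, 1
>> Lp = Dp - Ap                  % Laplacian matrix L = D - A

Lp =

     1    -1     0
    -1     2    -1
     0    -1     1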

Consider a graph with n nodes with adjacency matrix A, an nxn 0/1 matrix. The real number d is called an eigenvalue of A if there exists a nonzero vector x such that Ax = dx. The vector x is called an eigenvector for A. The equation Ax = dx is equivalent to (A-dI)x = 0, so all of the following are equivalent:

1. d is an eigenvalue of A.
2. (A-dI)x = 0 has a nontrivial solution, i.e., a solution with x nonzero.
3. A-dI is a singular matrix.
4. det(A-dI) = 0.

Given an eigenvalue d, the eigenvectors for A are the nonzero solutions x to (A-dI)x = 0. The expression det(A-dI) is a polynomial of degree n in the variable d, called the characteristic polynomial of A. By property 4, the eigenvalues are the roots of the characteristic polynomial of A.
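
As a quick sanity check of this equivalence (a toy example, separate from the graph studied below), take the single-edge graph on two vertices: its characteristic polynomial is d^2 - 1, its eigenvalues are 1 and -1, and det(A - dI) indeed vanishes there.

>> A2 = [0 1; 1 0];        % adjacency matrix of the single-edge graph
>> poly(A2)                % characteristic polynomial coefficients 1  0  -1, i.e. d^2 - 1
>> det(A2 - 1*eye(2))      % d = 1 is a root of the characteristic polynomial, so this determinant is 0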

Let us determine eigenvalues and eigenvectors for a given graph with MATLAB:

Description of the MATLAB eig() function

d = eig(A) returns a column vector of A's eigenvalues. A must be a square matrix.

[V,D] = eig(A) returns a diagonal matrix D of A's eigenvalues and a full matrix V whose columns are the corresponding eigenvectors, so that A*V = V*D.

(The related function eigs, which iteratively computes only a subset of the eigenvalues, can additionally return a convergence flag: [V,D,flag] = eigs(A). If flag is 0, all the requested eigenvalues converged; otherwise not all of them did.)

Let G be the following graph:

[Figure: the graph G on 8 vertices; its adjacency matrix A is given below.]

We compute the eigenvalues and eigenvectors of the adjacency matrix A as follows:


>> A= [
0 1 1 1 0 0 0 0;
1 0 0 1 1 0 0 0;
1 0 0 0 0 1 1 1;
1 1 0 0 1 0 0 0;
0 1 0 1 0 0 0 0;
0 0 1 0 0 0 1 0;
0 0 1 0 0 1 0 1;
0 0 1 0 0 0 1 0];

>> eig(A)

ans =

-2.0000
-1.4458
-1.0000
-1.0000
-0.0000
0.2032
2.3703
2.8723

>> [V D] = eig(A);

>> V(:,8)    % eigenvector for the largest eigenvalue, 2.8723, the last one listed above

ans =

-0.4063
-0.3455
-0.4760
-0.3455
-0.2406
-0.2949
-0.3711
-0.2949

The components happen to come out all negative, but this is only a cosmetic annoyance: an eigenvector is determined only up to a nonzero scalar multiple, so MATLAB is free to return either v or -v. We can simply flip (or ignore) the sign when using the eigenvector as a centrality measure.
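
If the sign bothers you, one simple fix (purely cosmetic) is to flip the vector; an eigenvector scaled by -1 is still an eigenvector:

>> v = -V(:,8);            % or: v = V(:,8)*sign(sum(V(:,8))), which flips only when the entries sum to a negative value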

An alternative way to find the eigenvalues and eigenvectors is as follows. In MATLAB we can find the characteristic polynomial of a matrix A by entering poly(A). If A is an nxn matrix, poly(A) is a row vector with n+1 elements, namely the coefficients of the characteristic polynomial. The command roots(C) computes the roots of the polynomial whose coefficients are the elements of the vector C. Thus, roots(poly(A)) returns the eigenvalues of A in a column vector. Corresponding to each eigenvalue d found this way, we need to find the nonzero solutions x to (A-dI)x = 0. That is, we need to solve a homogeneous system, with the added constraint that the solution is nonzero; in fact we will look for a unit vector, of norm 1.

The numerically robust way to solve such a system subject to the norm constraint is to perform a singular value decomposition of the matrix B = A-dI. The singular value decomposition (SVD) factors the matrix into a diagonal matrix S and two orthogonal matrices U and V such that B = USV^t; the diagonal entries of S (the singular values) are the square roots of the eigenvalues of B^tB. If the number of linearly independent equations in B is one less than the number of unknowns, i.e., B has rank n-1, then S has exactly one diagonal entry equal to 0. The factors are usually ordered so that the diagonal entries of S are in descending order ("ordered SVD"), in which case the zero entry is the last (rightmost) one. A solution of the homogeneous system is then given by the column of V corresponding to the zero diagonal entry of S (in an ordered SVD, the last column of V), and since the columns of V are orthonormal it already has norm 1. One way of doing this in MATLAB is the following.

>> roots(poly(A))
ans =
2.8723
2.3703
-2.0000
-1.4458
-1.0000 + 0.0000i
-1.0000 - 0.0000i
0.2032
-0.0000

>> [U,S,V] = svd(A - 2.8723*eye(8), 'econ');   % SVD of B = A - dI for the largest eigenvalue d

>> b = V(:,size(A,2))      % last column of V, corresponding to the zero singular value
b =
0.4063
0.3455
0.4760
0.3455
0.2406
0.2949
0.3711
0.2949

Note that the values are essentially the same as in the previous approach, up to the sign flip and a small amount of roundoff error. So b is the approximate, normalized principal eigenvector associated with the largest eigenvalue of A (d = 2.8723), and node 3 has the largest component in the principal eigenvector. So we may say that node 3 is the most central node in the graph with respect to this measure.
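
Two quick consistency checks one can run on this result (outputs omitted here):

>> S(end,end)              % smallest singular value of A - 2.8723*eye(8); should be numerically close to zero
>> norm(A*b - 2.8723*b)    % residual of the eigenvector equation Ab = db; small, though not at machine
                           % precision, since the eigenvalue was entered to only four decimal places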

So the principal eigenvector gives us a meaningful way to solve for importance values of the nodes in a graph, where our understanding is that the importance of a node is proportional to the sum of the importance values of its neighbors. Now suppose that instead of summing the importance values we averaged them. This should certainly be meaningful, since the importance of a person can be thought of as the average importance of the friends they keep. So we are motivated to modify A so that averages are taken, as follows:

>> M = diag(1./sum(A)) * A     % divide row v of A by the degree of node v, so each row sums to 1

M =

         0    0.3333    0.3333    0.3333         0         0         0         0
    0.3333         0         0    0.3333    0.3333         0         0         0
    0.2500         0         0         0         0    0.2500    0.2500    0.2500
    0.3333    0.3333         0         0    0.3333         0         0         0
         0    0.5000         0    0.5000         0         0         0         0
         0         0    0.5000         0         0         0    0.5000         0
         0         0    0.3333         0         0    0.3333         0    0.3333
         0         0    0.5000         0         0         0    0.5000         0
>> [V D] = eig(M)

V =

0.3536    0.1865   -0.6558   -0.1826    0.4141   -0.2912    0.0914   -0.0000
0.3536    0.3788    0.0477   -0.1826   -0.2474    0.2751    0.7035    0.0000
0.3536   -0.2669   -0.2468    0.5477   -0.3754   -0.0373   -0.2742   -0.0000
0.3536    0.3788    0.0477   -0.1826   -0.2474    0.2751   -0.5208   -0.0000
0.3536    0.4318    0.6192    0.5477    0.3532   -0.4685   -0.2742    0.0000
0.3536   -0.3702    0.1546   -0.0000    0.4380    0.4135    0.0000    0.7071
0.3536   -0.3827    0.2706   -0.5477   -0.2382   -0.4484    0.2742   -0.0000
0.3536   -0.3702    0.1546   -0.0000    0.4380    0.4135    0.0000   -0.7071

D =

    1.0000         0         0         0         0         0         0         0
         0    0.8773         0         0         0         0         0         0
         0         0    0.0770         0         0         0         0         0
         0         0         0   -0.3333         0         0         0         0
         0         0         0         0   -0.7005         0         0         0
         0         0         0         0         0   -0.5872         0         0
         0         0         0         0         0         0   -0.3333         0
         0         0         0         0         0         0         0    0.0000
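
Note that the first column of V is constant (each entry is 0.3536, which is 1/sqrt(8)) and it corresponds to the eigenvalue 1 in the first diagonal entry of D. A quick check confirms that a constant vector is fixed by the averaging matrix M; the result is the all-ones vector (up to rounding):

>> M*ones(8,1)             % each entry is the average of ones over that node's neighbors, i.e. 1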


This exposes the flaw in the model: the only realistic solution comes from the first eigenvector (with associated eigenvalue 1), and it ranks all the nodes exactly the same. If everyone is ranked as the average of their neighbors, then everyone ends up with the same score. Hence we need a method to break this symmetry. One simple idea is to identify nodes in the network that are "juiced up" from the outside. This is analogous to introducing an electric current or voltage.

Electrical Networks and the Laplacian Matrix

The model in which a node's importance is the average of the importance of its neighbors places the problem close to the heart of electrical network theory and the Laplacian.

We can regard the graph G as an electrical network in which every edge is a unit resistor and a battery is attached with source s and sink t (we can do this for any pair, or collection, of nodes), generating a current that flows from s to t. We can then solve for an electrical potential at each node of the graph satisfying the usual Kirchhoff/Ohm current laws, which say that the current flowing into a node (given by differences in potentials) equals the current flowing out. In this way, for each vertex v different from s and t, the weight or potential of that vertex is the average of the weights of the vertices adjacent to it.

Hence we will solve the system given by the Laplacian matrix, extended by the nonzero vector of injected currents. When the graph is connected there will be one free variable, since the rank of L is n-1. We can set the last potential to zero, then strike out the last column and solve for the remaining potentials.

Let us use the graph from above and attach a battery between nodes 2 and 6 that drives a current of 100 units into node 2 and out of node 6. We will assume, without loss of generality, that the last node, node 8, has potential 0; this allows us to strike out the last column, and the system becomes uniquely constrained. Here is the solution:


>> L = diag(sum(A)) - A        % Laplacian matrix: degree matrix minus adjacency matrix

L =

     3    -1    -1    -1     0     0     0     0
    -1     3     0    -1    -1     0     0     0
    -1     0     4     0     0    -1    -1    -1
    -1    -1     0     3    -1     0     0     0
     0    -1     0    -1     2     0     0     0
     0     0    -1     0     0     2    -1     0
     0     0    -1     0     0    -1     3    -1
     0     0    -1     0     0     0    -1     2

>> Ls = L(1:8,1:7);            % strike out the last column, since node 8's potential is fixed at 0

>> linsolve(Ls, [0 100 0 0 0 -100 0 0]')   % right-hand side: current of 100 into node 2 and out of node 6

ans =

112.50
175.00
12.50
150.00
162.50
-50.00
-12.50
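
As a check (output omitted), we can append the assumed zero potential of node 8 and verify that these potentials reproduce the injected currents; note also that the resulting potential difference between nodes 2 and 6 is 175 - (-50) = 225.

>> p = [ans; 0];           % full potential vector; node 8 was fixed at potential 0
>> L*p                     % recovers the current vector [0 100 0 0 0 -100 0 0]', up to roundoff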
