Social
 	
  Network	
  Analysis	
  
                    	
  
Lecture	
  1:	
  Networks,	
  Random	
  
     Graphs	
  and	
  Metrics	
  
                   	
  
                   	
  
                              About	
  Me	
  
•  Reader	
  in	
  Mobile	
  Systems	
  	
  
    –  NetOS	
  Research	
  Group	
  
•  Research	
  on	
  Mobile,	
  Social	
  and	
  Sensor	
  Systems	
  
•  More	
  specifically,	
  
    –  Human	
  Mobility	
  and	
  Social	
  Network	
  modelling	
  
    –  OpportunisGc	
  Mobile	
  Networks	
  
    –  Mobile	
  Sensor	
  Systems	
  and	
  Networks	
  
Networks	
  are	
  Everywhere	
  
Facebook	
  Friendship	
  Network	
  
          Terrorist	
  Network	
  
  Six degrees of
 Mohammed Atta
Uncloaking Terrorist
Networks, by Valdis
      Krebs
The	
  Internet	
  
   Source: Bill Cheswick http://www.cheswick.com/ches/map/gallery/index.html
Airline	
  Network	
  
  Source: Northwest Airlines WorldTraveler Magazine
Railway	
  Network	
  
    Source: TRTA, March 2003 - Tokyo rail map
        What	
  Kind	
  of	
  Networks? 	
  	
  
•    Who	
  talks	
  to	
  whom?	
  
•    Who	
  is	
  friend	
  with	
  whom?	
  
•    What	
  leads	
  to	
  what?	
  
•    Who	
  is	
  a	
  relaGve	
  of	
  whom?	
  
•    Who	
  eats	
  whom?	
  	
  
•    Who	
  sends	
  messages	
  to	
  whom?	
  
                  In	
  This	
  Course	
  
•  We	
  will	
  study	
  the	
  models	
  and	
  metrics	
  which	
  
   allow	
  us	
  to	
  understand	
  these	
  phenomena.	
  
•  We	
  will	
  show	
  analysis	
  over	
  large	
  datasets	
  of	
  
   real	
  social	
  and	
  technological	
  networks.	
  
                      List	
  of	
  Lectures	
  
•    Lecture	
  1:	
  Networks	
  and	
  Random	
  Graphs	
  
•    Lecture	
  2:	
  Small	
  World	
  and	
  Weak	
  Ties	
  	
  
•    Lecture	
  3:	
  Centrality	
  and	
  community	
  detecGon	
  
•    Lecture	
  4:	
  Modularity,	
  overlapping	
  communiGes	
  	
  
•    Lecture	
  5:	
  Structure	
  of	
  the	
  Web	
  and	
  Power	
  laws	
  
•    Lecture	
  6:	
  The	
  Internet	
  and	
  Robustness	
  	
  
•    Lecture	
  7:	
  InformaGon	
  Cascades	
  on	
  Networks	
  
•    Lecture	
  8:	
  Epidemic	
  DisseminaGon	
  on	
  Networks	
  
•    Lecture	
  9:	
  Cascades	
  and	
  Epidemics	
  ApplicaGons	
  
•    Lecture	
  10:	
  Time	
  Varying	
  Social	
  Networks	
  
•    Lecture	
  11-‐12:	
  PracGcal	
  Tutorial	
  
•    Lecture	
  13:	
  Geo-‐Social	
  Networks	
  
•    Lecture	
  14-‐15:	
  Student	
  PresentaGons	
  
•    Lecture	
  16:	
  Industrial	
  PresentaGon	
  
                 In	
  This	
  Lecture	
  
•  We	
  will	
  introduce:	
  	
  
    –  Networks/graphs	
  	
  
    –  Basic	
  network	
  measures	
  
    –  Random	
  Graphs	
  
    –  Examples	
  
             A	
  Network	
  is	
  a	
  Graph	
  
A	
  graph	
  G	
  is	
  a	
  tuple	
  (V,E)	
  of	
  a	
  set	
  of	
  verGces	
  V	
  
     and	
  edges	
  E.	
  An	
  edge	
  in	
  E	
  connects	
  two	
  
     verGces	
  in	
  V.	
  	
  
A	
  neighbour	
  set	
  N(v)	
  is	
  the	
  set	
  of	
  verGces	
  
     adjacent	
  to	
  v:	
  
             N(v) = {u " V | u # v,(v,u) " E}
                           Node	
  Degree	
  
•  The	
  node	
  degree	
  is	
  the	
  number	
  of	
  neighbours	
  
   of	
  a	
  node	
  
•  E.g.,	
  Degree	
  of	
  A	
  is	
  2	
                  A	
  
                                                E	
              B	
  
  The	
  study	
  of	
  the	
  degree	
  
  distribuGon	
  of	
  networks	
  allows	
  
  the	
  classificaGon	
  of	
  networks	
                D	
             c	
  
  in	
  different	
  categories	
  
Directed	
  &	
  Undirected	
  Graphs	
  
                   A	
  
                                                                                        A	
  
   E	
                         B	
  
                                                                        E	
                          B	
  
                  D	
                  c	
  
                                                                                       D	
                   C	
  
     Undirected	
  Graph	
                                                      Directed	
  Graph	
  
                                               Example	
  of	
  Undirected	
  Graphs:	
  Facebook,	
  Co-‐presence	
  	
  
                                               	
                                                 Examples	
  of	
  Directed:	
  Twijer,	
  Email,	
  Phone	
  Calls	
  
                  Paths	
  and	
  Cycles	
  
•  A	
  path	
  is	
  a	
  sequence	
  of	
  nodes	
  in	
  which	
  each	
  
   pair	
  of	
  consecuGve	
  nodes	
  is	
  connected	
  by	
  an	
  
   edge.	
  
    –  If	
  graph	
  is	
  directed	
  the	
  edge	
  needs	
  to	
  be	
  in	
  the	
  
       right	
  direcGon.	
  
    –  E.g.	
  A-‐E-‐D	
  is	
  a	
  path	
  in	
  both	
  previous	
  graphs	
  
•  A	
  cycle	
  is	
  a	
  path	
  where	
  the	
  start	
  node	
  is	
  also	
  
   the	
  end	
  node	
  
    –  E.g.	
  E-‐A-‐B-‐C	
  is	
  a	
  cycle	
  in	
  the	
  undirected	
  graph	
  
                      ConnecGvity	
  
•  A	
  graph	
  is	
  connected	
  if	
  there	
  is	
  a	
  path	
  
   between	
  each	
  pair	
  of	
  nodes.	
  
•  Example	
  of	
  disconnected	
  graph:	
  
                                  A	
  
                      E	
                 B	
  
                                 D	
              c	
  
                      Components	
  
•  A	
  connected	
  component	
  of	
  a	
  graph	
  is	
  the	
  
   subset	
  of	
  nodes	
  for	
  which	
  each	
  of	
  them	
  has	
  a	
  
   path	
  to	
  all	
  others	
  (and	
  the	
  subset	
  is	
  not	
  part	
  of	
  
   a	
  larger	
  subset	
  with	
  this	
  property).	
  
    –  Connected	
  components:	
  A-‐B-‐C	
  and	
  E-‐D	
  
•  A	
  giant	
  component	
  is	
  a	
  connected	
  component	
  
   containing	
  a	
  significant	
  fracGon	
  of	
  nodes	
  in	
  
   the	
  network. 	
  	
  
    –  Real	
  networks	
  olen	
  have	
  one	
  unique	
  giant	
  
       component.	
  
          Path	
  Length/Distance	
  
•  The	
  distance	
  (d)	
  between	
  two	
  nodes	
  in	
  a	
  
   graph	
  is	
  the	
  length	
  of	
  the	
  shortest	
  path	
  linking	
  
   the	
  two	
  graphs.	
  
•  The	
  diameter	
  of	
  the	
  graph	
  is	
  the	
  maximum	
  
   distance	
  between	
  any	
  pair	
  of	
  its	
  nodes.	
  
                                 A	
  
                      E	
                B	
  
                                                         d(A,C)=2	
  
                                D	
              C	
  
               Small-‐world	
  Phenomenon	
  
                Milgram’s	
  Experiment	
  
•  Two	
  random	
  people	
  are	
  
   connected	
  through	
  only	
  a	
  
   few	
  (6)	
  intermediate	
  
   acquaintances.	
  
•  Milgram’s	
  experiment	
  
   (1967)	
  shows	
  the	
  known	
  “six	
  
   degrees	
  of	
  separaGon”:	
  
    –  Choose	
  300	
  people	
  at	
  random	
  
    –  Ask	
  them	
  to	
  send	
  a	
  lejer	
  
       through	
  friends	
  to	
  a	
  
       stockbroker	
  near	
  Boston.	
  
    –  64	
  successful	
  chains.	
  
                    Erdos	
  Number	
  
	
  
•  Erdos	
  Number:	
  
     distance	
  from	
  
     the	
  
     mathemaGcian	
  
     (most	
  people	
  
     are	
  4-‐5	
  hops	
  
     away)	
  based	
  on	
  
     collaboraGon.	
  
                Bacon	
  Number	
  
•  A	
  network	
  of	
  actors	
  who	
  costarred	
  
   in	
  a	
  movie.	
  
•  Most	
  actors	
  are	
  no	
  more	
  than	
  ~3	
  
   hops	
  from	
  Kevin	
  Bacon.	
  
•  One	
  very	
  obscure	
  movie	
  was	
  at	
  
   distance	
  8.	
  	
  
               Random	
  Graphs	
  
•  First	
  way	
  to	
  model	
  these	
  networks:	
  
•  Erdos-‐Renyi	
  Random	
  Graph	
  [Erdos-‐Renyi	
  ‘59]:	
  
	
  G(n,p):	
  graph	
  with	
  n	
  verGces	
  where	
  an	
  edge	
  
     exists	
  with	
  independent	
  random	
  probability	
  
     0<p<1	
  for	
  each	
  edge.	
  
          Random	
  Graph	
  Model	
  
•  For	
  each	
  node	
  n1,	
  an	
  edge	
  to	
  node	
  n2	
  exists	
  
   with	
  probability	
  p.	
  
•  Degree	
  distribuGon	
  is	
  binomial.	
  
•  The	
  probability	
  of	
  a	
  node	
  to	
  have	
  degree	
  k:	
  
                                 k        k              N "1"k
     P(ki = k) = C               N "1   p (1" p)
•  Where	
       C   k
                     N "1   =(     N "1
                                   k      )
•  Expected	
  Degree	
  of	
  a	
  node:	
  	
  (N "1)p # Np	
  
       Random	
  Graphs	
  ProperGes	
  
•  For	
  large	
  N	
  this	
  is	
  approximated	
  by	
  the	
  
   Poisson	
  distribuGon	
  with	
  	
  
                                      k                  k
                        " Np   (Np)    "k k
         P(k) ! e                   =e
                                 k!       k!	
  
Degree	
  DistribuGon	
  	
  
of	
  Random	
  Graphs	
  
                k	
  
       Random	
  Graph	
  Diameter	
  
•  The	
  diameter	
  of	
  random	
  graph	
  and	
  the	
  
   average	
  path	
  length	
  of	
  the	
  graph	
  have	
  been	
  
   demonstrated	
  to	
  be:	
  
                ln(N )   ln(N )
            d=         =        ! lrand
               ln( pN ) ln(
                          ! k )
	
                    The	
  average	
  distance	
  between	
  two	
  nodes	
  
                      is	
  quite	
  small	
  wrt	
  to	
  the	
  size	
  of	
  the	
  graph.	
  
	
  
RelaGonship	
  of	
  <k>	
  and	
  connecGvity	
  
 •    <k>	
  =	
  average	
  degree	
  (np)	
  
 •    If	
  <k>	
  <	
  1	
  disconnected	
  network	
  
 •    If	
  <k>	
  >	
  1	
  a	
  giant	
  component	
  appears	
  
 •    If	
  <k>	
  >=	
  ln(N)	
  graph	
  is	
  totally	
  connected	
  
          Random	
  Graph	
  Diameter:	
  
               An	
  IntuiGon	
  
                                    A
                                                                         <k>	
  new	
  connecGons	
  
                                            …	
  
                                                                                   <k>	
  new	
  connecGons	
  per	
  
                                                                                   each	
  new	
  node	
  
                                                    …	
  
• 	
  The	
  nodes	
  at	
  distance	
  l	
  from	
  A	
  will	
  be	
  ~	
  <k>l	
  
                                                    N=k^l	
  	
  log	
  N=	
  l	
  log	
  k	
  
                                                    	
  l	
  =	
  log	
  N/log	
  k	
                                                      	
  
          Clustering	
  Coefficient	
  
•  The	
  clustering	
  coefficient	
  defines	
  the	
  
   proporGon	
  of	
  A’s	
  neighbours	
  (N(A))	
  which	
  are	
  
   connected	
  by	
  an	
  edge	
  (are	
  friends).	
  
•  The	
  number	
  of	
  triangles	
  in	
  which	
  A	
  is	
  involved	
  
   wrt	
  to	
  the	
  ones	
  it	
  could	
  be	
  involved	
  in.	
  
                                A	
       B	
  
                      E	
       C	
       D	
       F	
  
Formally:	
  Clustering	
  Coefficient	
  
                                               2 | {e jk } |
 Local	
  Clustering	
  Coefficient	
       Ci =               : v j, vk " N i , e j,k " E
                                               ki (ki !1)
                                                   1
 Network	
  Clustering	
  Coefficient	
          CG = ! Ci
                                                   N i
Clustering	
  Coefficient:	
  Example	
  
•    Ki=3	
  
•    Nominator=	
  2*2=4	
  
•    Denominator=6	
  
•    Ci=4/6=2/3	
  
                           A	
     B	
  
                   E	
     C	
     D	
     F	
  
            Clustering	
  Coefficient	
  	
  
             of	
  	
  a	
  Random	
  Graph	
  
•  The	
  clustering	
  coefficient	
  of	
  a	
  random	
  graph	
  is	
  
                                                          	
  
                                 k                        	
        !n $ !n $
                                                                 p *# v & / # v & = p
                    Crand   = p=                          	
        "2% "2%
                                 N                        	
  
•  The	
  probability	
  that	
  2	
  neighbours	
  of	
  a	
  node	
  
     are	
  connected	
  is	
  equal	
  to	
  the	
  probability	
  that	
  
     2	
  
      !random	
  nodes	
  are	
  connected	
  
•  Is	
  this	
  mirroring	
  the	
  clustering	
  coefficient	
  of	
  
     real	
  networks?	
  	
  
                    QuesGon 	
  	
  
•  Are	
  Random	
  Graphs	
  representaGves	
  of	
  Real	
  
   Networks?	
  
                     Summary	
  	
  
•  We	
  have	
  introduced	
  graphs	
  definiGons	
  and	
  
   measures.	
  
•  Random	
  graphs	
  are	
  a	
  first	
  examples	
  of	
  models	
  
   for	
  networks.	
  
                        References 	
  	
  
•  Material	
  from	
  Chapter	
  1,	
  2	
  of	
  	
  
    –  D.	
  Easley,	
  J.	
  Kleinberg.	
  Networks,	
  Crowds,	
  and	
  
       Markets:	
  Reasoning	
  About	
  a	
  Highly	
  Connected	
  
       World.	
  Cambridge	
  University	
  Press,	
  2010.	
   	
  	
  
•  R.	
  Albert,	
  A.	
  Barabasi.	
  StaGsGcal	
  Mechanics	
  of	
  
   Complex	
  Networks.	
  Reviews	
  of	
  Modern	
  Physics	
  
   (74).	
  Jan.	
  2002.	
  
•  S.	
  Boccale{,	
  V.	
  Latora,	
  Y.	
  Moreno,	
  M.	
  Chavez,	
  D.-‐
   U.	
  Hwang,	
  Complex	
  Networks:	
  Structure	
  and	
  
   Dynamics	
  Physics	
  Reports	
  424	
  (2006)	
  175	
  .