PageRank

Updated: 2024-10-13 14:29:09

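Contents

Routes database

Content

Dataset source

Code

1) Import packages

2) Load the data

3) Explore the data

4) Extract source and destination airports

5) Build the directed graph

6) Output airport rankings, sorted by PR value in descending order

7) Define the graph-drawing function

8) Plot the airport link graph for a PR threshold of 0.003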


Routes database

As of January 2012, the OpenFlights/Airline Route Mapper Route Database contains 59036 routes between 3209 airports on 531 airlines spanning the globe.

Content

The data is ISO 8859-1 (Latin-1) encoded.

Each entry contains the following information:

  • Airline: 2-letter (IATA) or 3-letter (ICAO) code of the airline.
  • Airline ID: Unique OpenFlights identifier for airline (see Airline).
  • Source airport: 3-letter (IATA) or 4-letter (ICAO) code of the source airport.
  • Source airport ID: Unique OpenFlights identifier for source airport (see Airport).
  • Destination airport: 3-letter (IATA) or 4-letter (ICAO) code of the destination airport.
  • Destination airport ID: Unique OpenFlights identifier for destination airport (see Airport).
  • Codeshare: "Y" if this flight is a codeshare (that is, not operated by Airline, but another carrier), empty otherwise.
  • Stops: Number of stops on this flight ("0" for direct).
  • Equipment: 3-letter codes for plane type(s) generally used on this flight, separated by spaces.

The special value \N is used for "NULL" to indicate that no value is available.

Notes:

  • Routes are directional: if an airline operates services from A to B and from B to A, both A-B and B-A are listed separately.
  • Routes where one carrier operates both its own and codeshare flights are listed only once.
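
As a minimal sketch of the conventions above, the snippet below parses two made-up rows in the nine-field layout, maps the special value \N to None, and keeps each direction as its own (source, destination) pair. The sample rows, the field names, and the helper name parse_routes are illustrative assumptions, not part of the dataset.

```python
import csv
import io

# Field names follow the order listed above; the two rows are invented for illustration.
FIELDS = ["airline", "airline_id", "source", "source_id",
          "dest", "dest_id", "codeshare", "stops", "equipment"]

SAMPLE = (
    "AA,24,JFK,3797,LHR,507,,0,777\n"
    "AA,\\N,LHR,507,JFK,3797,Y,0,777 763\n"
)

def parse_routes(text):
    """Parse route rows, turning the special value \\N into None."""
    rows = []
    for record in csv.reader(io.StringIO(text)):
        row = dict(zip(FIELDS, record))
        rows.append({k: (None if v == r"\N" else v) for k, v in row.items()})
    return rows

routes = parse_routes(SAMPLE)
# Routes are directional, so A-B and B-A stay separate edges.
edges = [(r["source"], r["dest"]) for r in routes]
print(edges)
print(routes[1]["airline_id"])
```

An empty codeshare field stays an empty string, while \N becomes None, so "not a codeshare" and "value unavailable" remain distinguishable.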

Dataset source

Flight Route Database | Kaggle

Code

1) Import packages

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
import networkx as nx

2) Load the data

data=pd.read_csv("d:/datasets/Flight Route Database.csv")

3) Explore the data

data.head()
data.info()

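4) Extract source and destination airports

weight_ = 2  # constant weight assigned to every edge
# Column names are kept exactly as written (leading space, "apirport" spelling),
# presumably matching the CSV header; check data.columns if a KeyError occurs
edges = [(i, j, weight_) for i, j in data[[" source airport", " destination apirport"]].values]
# i is the source airport, j is the destination airport, weight_ is the edge weight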

5) Build the directed graph

G = nx.DiGraph()  # create an empty directed graph
G.add_weighted_edges_from(edges)  # add each edge i -> j together with its weight
pagerank = nx.pagerank(G, alpha=0.85)  # compute PR values with damping factor 0.85
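
To make the alpha=0.85 parameter concrete, here is a minimal pure-Python sketch of the power iteration that PageRank is based on. It is an illustration under simplifying assumptions (unweighted edges, rank of dangling nodes spread uniformly, a fixed tolerance), not the networkx implementation. The function name simple_pagerank and the toy three-airport graph are hypothetical.

```python
def simple_pagerank(edges, alpha=0.85, tol=1e-10, max_iter=200):
    """Power-iteration PageRank on a list of (source, destination) pairs."""
    nodes = sorted({node for edge in edges for node in edge})
    out_links = {node: set() for node in nodes}
    for src, dst in edges:
        out_links[src].add(dst)

    n = len(nodes)
    rank = {node: 1.0 / n for node in nodes}  # start from a uniform distribution
    for _ in range(max_iter):
        # Rank held by nodes without outgoing links is shared by all nodes.
        dangling = sum(rank[node] for node in nodes if not out_links[node])
        new_rank = {node: (1.0 - alpha) / n + alpha * dangling / n for node in nodes}
        # Each node passes alpha * its rank, split evenly, to the nodes it links to.
        for src in nodes:
            for dst in out_links[src]:
                new_rank[dst] += alpha * rank[src] / len(out_links[src])
        converged = sum(abs(new_rank[x] - rank[x]) for x in nodes) < tol
        rank = new_rank
        if converged:
            break
    return rank

# Hypothetical toy network: A and B link to each other, C only links to A.
toy_edges = [("A", "B"), ("B", "A"), ("C", "A")]
toy_rank = simple_pagerank(toy_edges)
print(sorted(toy_rank, key=toy_rank.get, reverse=True))
```

In the toy graph, C has no incoming routes, so it keeps only the baseline share (1 - alpha) / n, while A collects rank from both B and C. Because every edge in the article's graph carries the same weight, an unweighted sketch like this one follows the same idea as the weighted call above.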

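6) Output airport rankings, sorted by PR value in descending order

# pagerank is a dict mapping each airport code to its PR value
sorted(pagerank.items(),key=lambda x:x[1],reverse=True)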
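
Because the full sorted list covers every airport in the graph, it can help to print only the top few. A small sketch, using a made-up pagerank_demo dictionary in place of the real result (the airport codes and scores are invented):

```python
import heapq

# Hypothetical PR values standing in for the real pagerank dict.
pagerank_demo = {"AAA": 0.012, "BBB": 0.030, "CCC": 0.007, "DDD": 0.021}

# nlargest avoids sorting the whole dict when only the top entries are needed.
top3 = heapq.nlargest(3, pagerank_demo.items(), key=lambda item: item[1])
for airport, score in top3:
    print(f"{airport}\t{score:.4f}")
```

With the real data, pagerank_demo would be replaced by pagerank and 3 by however many airports are of interest.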

7) Define the graph-drawing function

# Draw the network graph
def show_graph(graph, layout='spring_layout'):
    # Spring layout is the default and gives a roughly radial picture;
    # circular_layout places all nodes on a circle
    if layout == 'circular_layout':
        positions = nx.circular_layout(graph)
    else:
        positions = nx.spring_layout(graph)
    # Node size scales with the PageRank value; PR values are tiny, so multiply by 200000
    nodesize = [x['pagerank']*200000 for v, x in graph.nodes(data=True)]
    # Edge weights (collected here, but not passed to the drawing calls below)
    edgesize = [e[2]['weight'] for e in graph.edges(data=True)]
    # Draw the nodes
    nx.draw(graph, positions, node_size=nodesize, alpha=0.4)
    # Draw the edges
    nx.draw_networkx_edges(graph, positions, alpha=0.2)
    # Draw the node labels
    nx.draw_networkx_labels(graph, positions, font_size=10)

8) Plot the airport link graph for a PR threshold of 0.003

nx.set_node_attributes(G, name = 'pagerank', values=pagerank)
nx.set_edge_attributes(G, name = 'weight', values=2)
pagerank_threshold = 0.003
small_graph = G.copy()
# Remove nodes whose PR value is below pagerank_threshold
for n, p_rank in G.nodes(data=True):
    if p_rank['pagerank'] < pagerank_threshold:
        small_graph.remove_node(n)
# Draw the graph with circular_layout so the remaining nodes form a circle
show_graph(small_graph, 'circular_layout')
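
Removing a node also removes every route that starts or ends at it, so the pruned graph only keeps links between high-ranked airports. The following pure-Python sketch shows that effect on plain edge lists. The helper name prune_by_rank and the demo values are hypothetical, and it stands in for the node removal above rather than reproducing networkx behaviour in full.

```python
def prune_by_rank(edges, rank, threshold):
    """Keep nodes with rank >= threshold and the edges running between them."""
    keep = {node for node, score in rank.items() if score >= threshold}
    kept_edges = [(src, dst) for src, dst in edges if src in keep and dst in keep]
    return keep, kept_edges

# Invented scores: only A and B reach the 0.003 threshold.
rank_demo = {"A": 0.004, "B": 0.0035, "C": 0.001}
edges_demo = [("A", "B"), ("B", "A"), ("C", "A"), ("A", "C")]

kept_nodes, kept_edges = prune_by_rank(edges_demo, rank_demo, threshold=0.003)
print(sorted(kept_nodes))
print(kept_edges)
```

Both routes touching C disappear along with C itself, which is why the thresholded plot shows far fewer edges than the full graph.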

Published: 2024-03-10 17:05:59
Article link: https://www.elefans.com/category/jswz/34/1728540.html