Overview: scrape road congestion indexes from the Baidu intelligent-transportation city congestion index platform (http://jiaotong.baidu.com/top/) and store them in a MySQL database.
Baidu - China City Congestion Index Platform
The Baidu China City Congestion Index Platform is a road-congestion publishing service under Baidu Maps (http://jiaotong.baidu.com/top/), refreshed every 5 minutes. A few useful URLs first:
- https://jiaotong.baidu.com/trafficindex/city/list
  Returns all cities as JSON, including each city's citycode.
- http://jiaotong.baidu.com/top/report/?citycode=194
  Change the citycode value to open the corresponding city's detail page.
- https://jiaotong.baidu.com/trafficindex/city/roadrank?cityCode=194&roadtype=0
  Returns the ten most congested roads shown on the detail page: all roads (roadtype=0), highways/expressways (roadtype=1), or ordinary roads (roadtype=11). Change the roadtype parameter in the link accordingly.
- https://jiaotong.baidu.com/trafficindex/city/roadcurve?cityCode=194&id=厦门大桥-1
  Change cityCode and id to query one road's congestion index at 5-minute intervals over a day. Road ids must be found by trial and error: the format is usually "road name-number", e.g. '杏林大桥-1', '海沧大桥-5', '同安大桥-8', '集源路-3'; the trailing number follows no pattern and has to be tested by hand.
- https://jiaotong.baidu.com/trafficindex/city/curve?cityCode=194&type=minute
  Returns the city-wide congestion index and speed at 5-minute intervals over a day.
- https://jiaotong.baidu.com/trafficindex/city/curve?cityCode=194&type=day
  Returns the city-wide congestion index for each day of the past week.
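The roadcurve endpoint returns JSON whose data.curve list holds one sample per 5-minute interval. A minimal parsing sketch, using a hand-written sample payload in the shape the scrapers below rely on (the field names are taken from the code; the real response may carry extra fields, and the numbers here are made up):

```python
import json

# sample payload in the assumed shape of the roadcurve response
sample = json.dumps({
    "status": 0,
    "data": {
        "curve": [
            {"roadsegid": "厦门大桥-1", "speed": 45.2,
             "congestIndex": 1.35, "datatime": "08:00"},
            {"roadsegid": "厦门大桥-1", "speed": 38.7,
             "congestIndex": 1.62, "datatime": "08:05"},
        ]
    },
})

points = json.loads(sample)["data"]["curve"]
latest = points[-1]  # the most recent 5-minute sample
print(latest["congestIndex"], latest["speed"])  # 1.62 38.7
```

For live data, `requests.get(url).json()` on the roadcurve URL would replace the sample string.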
Scraping one day of congestion index and speed for specific roads
Before running, update the database connection details in the code and manually create each table listed in tablename, with columns (name, speed, congestIndex, datatime, crawlertime) of types (text, float, float, text, datetime).
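Rather than creating each table by hand, the CREATE TABLE statements can be generated from the tablename list. A sketch, assuming the column types stated above (crawlertime as DATETIME); the helper name is my own:

```python
def create_table_sql(table):
    # column names and types match the schema described above
    return (
        "CREATE TABLE IF NOT EXISTS " + table + " ("
        "name TEXT, speed FLOAT, congestIndex FLOAT, "
        "datatime TEXT, crawlertime DATETIME)"
    )

ddl = create_table_sql("bridge_xiamen")
print(ddl)
# each generated statement could then be run once via cursor.execute(ddl)
```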
```python
import re
import requests
import pymysql
import time
import json

def bridge():
    road_list = ['厦门大桥-1','集美大桥-1','杏林大桥-1','海沧大桥-5','新阳大桥-1','同安大桥-8','翔安隧道-1','银江路辅路-2','嘉庚路-1','石鼓路-1','盛光路-1','浔江路-1','集源路-3','鳌园路-6','海堤路-2','滨水路-1','集美大道-3','同集南路-3','乐海路-1','岑西路-1']
    result = []
    for road in road_list:
        sUrl = 'https://jiaotong.baidu.com/trafficindex/city/roadcurve?cityCode=194&id=' + road
        res = requests.get(url=sUrl).content.decode("utf-8")
        total_json = json.loads(res)
        result.append(total_json.get('data').get('curve'))
    return result

def write_sql(result):
    client = pymysql.connect(host="119.23.111.11", user="root",
                             password="password", database="traffic")
    col = client.cursor()
    tablename = ['bridge_xiamen','bridge_jimei','bridge_xinglin','bridge_haicang','bridge_xinyang','bridge_tongan','tunnel_xiangan','road_yinjiangfulu','road_jiageng','road_shigu','road_shengguang','road_xinjiang','road_jiyuan','road_aoyuan','road_haidi','road_bingshui','road_jimeidadao','road_tongjinan','road_lehai','road_cenxi']
    for i in range(len(tablename)):
        for j in range(len(result[i])):
            t0 = tablename[i]
            # strip the trailing "-<number>" and stray ASCII punctuation from the road name
            t1 = re.sub(r"[A-Za-z0-9!%\[\],。\-]", "", result[i][j].get('roadsegid'))
            t2 = result[i][j].get('speed')
            t3 = result[i][j].get('congestIndex')
            t4 = result[i][j].get('datatime')
            t5 = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())  # crawl timestamp
            print(t0, t1, t2, t3, t4, t5)
            # the table name cannot be bound as a query parameter; the values are
            # bound by the driver instead of string formatting to avoid SQL injection
            col.execute("insert into " + t0 +
                        " (name,speed,congestIndex,datatime,crawlertime)"
                        " values(%s,%s,%s,%s,%s)", (t1, t2, t3, t4, t5))
    client.commit()
    client.close()

result = bridge()
write_sql(result)
```
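The re.sub call above strips the trailing "-<number>" (plus any ASCII letters and listed punctuation) from roadsegid, leaving only the Chinese road name. For example:

```python
import re

# same character class as in write_sql above
pattern = r"[A-Za-z0-9!%\[\],。\-]"
print(re.sub(pattern, "", "厦门大桥-1"))  # 厦门大桥
print(re.sub(pattern, "", "集源路-3"))    # 集源路
```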
Scraping the most recent congestion index and speed for specific roads
As before, update the database connection details in the code and manually create each table listed in tablename, with columns (name, speed, congestIndex, datatime, crawlertime) of types (text, float, float, text, datetime).
```python
import re
import requests
import pymysql
import time
import json
import threading

def bridge():
    road_list = ['厦门大桥-1','集美大桥-1','杏林大桥-1','海沧大桥-5','新阳大桥-1','同安大桥-8','翔安隧道-1','银江路辅路-2','嘉庚路-1','石鼓路-1','盛光路-1','浔江路-1','集源路-3','鳌园路-6','海堤路-2','滨水路-1','集美大道-3','同集南路-3','乐海路-1','岑西路-1']
    result = []
    for road in road_list:
        sUrl = 'https://jiaotong.baidu.com/trafficindex/city/roadcurve?cityCode=194&id=' + road
        res = requests.get(url=sUrl).content.decode("utf-8")
        total_json = json.loads(res)
        result.append(total_json.get('data').get('curve'))
    return result

def write_sql(result):
    client = pymysql.connect(host="119.23.111.11", user="root",
                             password="password", database="traffic")
    col = client.cursor()
    tablename = ['bridge_xiamen','bridge_jimei','bridge_xinglin','bridge_haicang','bridge_xinyang','bridge_tongan','tunnel_xiangan','road_yinjiangfulu','road_jiageng','road_shigu','road_shengguang','road_xinjiang','road_jiyuan','road_aoyuan','road_haidi','road_bingshui','road_jimeidadao','road_tongjinan','road_lehai','road_cenxi']
    for i in range(len(tablename)):
        t0 = tablename[i]
        # [-1] keeps only the latest 5-minute sample for each road
        t1 = re.sub(r"[A-Za-z0-9!%\[\],。\-]", "", result[i][-1].get('roadsegid'))
        t2 = result[i][-1].get('speed')
        t3 = result[i][-1].get('congestIndex')
        t4 = result[i][-1].get('datatime')
        t5 = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime())  # crawl timestamp
        # print(t0, t1, t2, t3, t4, t5)
        col.execute("insert into " + t0 +
                    " (name,speed,congestIndex,datatime,crawlertime)"
                    " values(%s,%s,%s,%s,%s)", (t1, t2, t3, t4, t5))
    client.commit()
    client.close()

def pydata():
    result = bridge()
    write_sql(result)
    # Timer fires only once, so re-arm it to run again in 5 minutes
    timer = threading.Timer(300, pydata)
    timer.start()

if __name__ == "__main__":
    # first crawl starts after 300 s; call pydata() directly for an immediate first run
    timer = threading.Timer(300, pydata)
    timer.start()
```
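Because threading.Timer fires exactly once, pydata must re-arm a new timer at the end of every run. The same self-rescheduling pattern, shown with a short interval and a run cap so it terminates (the helper name, interval, and cap are illustrative; the scraper uses 300 seconds and no cap):

```python
import threading

def repeat(interval, func, max_runs):
    """Call func every `interval` seconds, max_runs times; returns an Event set when done."""
    done = threading.Event()
    def tick(runs):
        func()
        if runs + 1 >= max_runs:
            done.set()          # stop re-arming after the cap
        else:
            threading.Timer(interval, tick, args=(runs + 1,)).start()
    threading.Timer(interval, tick, args=(0,)).start()
    return done

calls = []
done = repeat(0.01, lambda: calls.append(1), 3)
done.wait(5)
print(len(calls))  # 3
```

In the scraper, func would be the bridge/write_sql pair and interval 300, matching the platform's 5-minute refresh.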
Copyright notice: this is an original article; copyright belongs to Helo.
Link: https://www.ishelo.com/archives/224/
For commercial reuse, contact the author for authorization; for non-commercial reuse, please credit the source.
4 comments
I'm scraping this too. Specific road segments are fine, but is there any way to get data for every road segment in the whole city?
Try the Gaode (AMap) or Baidu Maps APIs for traffic-situation data.
In Beijing, 5-minute data is sometimes not very useful.
It's mainly long-term collection, to prepare for later analysis.