Scientia Geographica Sinica ›› 2017, Vol. 37 ›› Issue (9): 1310-1317. doi: 10.13249/j.cnki.sgs.2017.09.003

Special topic: Geographic Big Data


Spatial Association Analysis for Urban Service Based on Big Data

Weihua Liao1, Xin Nie2

  1. College of Mathematics and Information Science, Guangxi University, Nanning 530004, Guangxi, China
  2. School of Public Administration, Guangxi University, Nanning 530004, Guangxi, China
  • Received: 2016-11-14 Revised: 2017-03-04 Online: 2017-11-20 Published: 2017-11-20
  • About the author:
    Weihua Liao (1975-), male, born in Leiyang, Hunan; associate professor with a master's degree. His main research interests are GIS, economic geography, and urban computing. E-mail: gisliaowh@163.com
  • Supported by:
    National Natural Science Foundation of China (71363005) and National Social Science Foundation of China (13CGL109).

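Abstract (Chinese version):

The development of information technology and e-commerce platforms has produced many kinds of big data. In urban service industries, merchants register their information, including coordinates, on e-commerce platforms, and this forms a spatial big data source for the service sector. This paper first establishes a spatial association rule data model with a limited distance threshold and introduces the methods and steps by which the model generates frequent itemsets and association rules. Python is then used to crawl merchant information from the Nanning site of Nuomi, and the Apriori algorithm is used to derive spatial association rules and spatial frequent itemsets of the service industry at 6 distance thresholds from 10 to 1 000 m.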
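A minimal sketch of how a distance threshold can turn point data into transactions for association rule mining. This is one plausible reading of the model, not the authors' implementation (the paper builds its transactions from a distance table in SQL Server). The store list, the category names, the planar coordinates in metres, and the function name `spatial_transactions` are all assumptions made for illustration.

```python
from math import hypot

# Hypothetical stores: (category, x, y), with x and y in metres on a projected plane.
stores = [
    ("hotel", 0.0, 0.0),
    ("budget hotel", 6.0, 0.0),
    ("snacks", 30.0, 0.0),
    ("bakery", 400.0, 0.0),
]


def spatial_transactions(points, threshold_m):
    """Build one transaction per store: the set of categories of all
    stores lying within threshold_m of it (the store itself included)."""
    transactions = []
    for _, x0, y0 in points:
        items = {
            cat
            for cat, x, y in points
            if hypot(x - x0, y - y0) <= threshold_m
        }
        transactions.append(frozenset(items))
    return transactions


# A larger threshold pulls more neighbouring categories into each transaction.
tx_10 = spatial_transactions(stores, 10)
tx_50 = spatial_transactions(stores, 50)
```

With these made-up points, the hotel's transaction holds two categories at 10 m and three at 50 m, which is why the mined rules can differ from one distance threshold to the next. The pairwise loop is quadratic in the number of stores; a spatial index would likely be needed for a full city data set.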

Keywords: big data, association rules, Apriori algorithm, service industry, Nanning City

Abstract:

With the development of information technology, big data has become a research focus across all sectors, and demand for it in urban planning and management is growing. Acquiring and processing big data is a key technology in smart city construction. This article covers four major aspects: 1) a distance table linked to a table of urban service physical stores is used to establish a spatial association rule model based on the concept of spatially neighbouring points and the attributes of spatial point entities, and the method and procedure for generating spatial frequent itemsets and spatial association rules in this model are introduced; 2) because association rule mining requires a transaction database, the “FOR XML PATH” technique in SQL Server is used to build a spatial transaction database; 3) Python with sqlite3, lxml and BeautifulSoup is used to crawl the online data of businesses in Nanning that have registered their public information on “Baidu Nuomi” (https://nn.nuomi.com/); 4) the Apriori algorithm is applied to the crawled data to mine spatial frequent itemsets and spatial association rules in the urban service industry at 6 distance thresholds between 10 and 1 000 m. In the case study, the six most frequently registered business categories on “Baidu Nuomi” are snacks and fast food, beauty, hotels, bakeries, sweets and drinks, and budget hotels. The spatial association rule {budget hotels, hotels} has high confidence and high lift at the 10 m and 50 m distance thresholds, making it a strong spatial association rule. This indicates that the Nanning hotel industry has a compact layout, with hotels of all kinds clustered together. The spatial association rule {sweets and drinks, snacks and fast food} is a strong spatial association rule at the 50 m, 500 m and 1 000 m distance thresholds.
Snacks and fast food appears very frequently, especially as the consequent of rules with high support. At the different distance thresholds, snacks and fast food restaurants, as a mass-consumption service, are distributed around businesses of various other industries. Because the lift of these rules is about 1, the snacks and fast food industry appears to be essentially independent of other industries. This study is an attempt to use the ubiquitous web data around us to analyze city management. With this method, researchers can obtain a steady flow of big data and so better carry out real-time studies of urban big data.
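The support, confidence and lift measures that the abstract relies on can be illustrated with a short sketch. The four transactions below are invented for the example and are not the paper's data; the function name `rule_metrics` is likewise hypothetical. The definitions are the standard ones: support is the share of transactions containing both sides of the rule, confidence is that count divided by the count containing the antecedent, and lift is confidence divided by the share containing the consequent.

```python
def rule_metrics(transactions, antecedent, consequent):
    """Return (support, confidence, lift) for the rule antecedent -> consequent."""
    antecedent, consequent = frozenset(antecedent), frozenset(consequent)
    n = len(transactions)
    n_a = sum(antecedent <= t for t in transactions)   # contain the antecedent
    n_c = sum(consequent <= t for t in transactions)   # contain the consequent
    n_ac = sum((antecedent | consequent) <= t for t in transactions)  # contain both
    support = n_ac / n
    confidence = n_ac / n_a if n_a else 0.0
    lift = confidence / (n_c / n) if n_c else 0.0
    return support, confidence, lift


# Hypothetical spatial transactions (sets of nearby service categories).
transactions = [
    frozenset({"budget hotel", "hotel"}),
    frozenset({"budget hotel", "hotel", "snacks"}),
    frozenset({"bakery", "snacks"}),
    frozenset({"bakery"}),
]

hotel_rule = rule_metrics(transactions, {"budget hotel"}, {"hotel"})
snack_rule = rule_metrics(transactions, {"bakery"}, {"snacks"})
print(hotel_rule)  # (0.5, 1.0, 2.0)
print(snack_rule)  # (0.25, 0.5, 1.0)
```

In this toy data the hotel rule has a lift of 2, meaning the two categories co-occur twice as often as independence would predict, while the snacks rule has a lift of exactly 1, meaning that a bakery nearby says nothing about whether snacks are nearby. That is the sense in which a lift of about 1 is read as the absence of association.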

Key words: big data, association rules, Apriori algorithm, service industry, Nanning City

CLC Number:

  • F290