A software architecture for Twitter collection, search and geolocation services

Authors:

Highlights:

Abstract

The rapid growth of social networks and their combination with mobile devices make rigorous analysis of the output of such systems of paramount importance for intelligence gathering and decision making. Since the introduction of Twitter in 2006, tweeting has emerged as an efficient open social networking medium that has attracted interest from research, commercial and military communities. This paper investigates the current software architecture of the Twitter system and puts forward a new architecture dedicated to semantic and spatial analysis of Twitter data. Specifically, the Twitter Streaming API was used as the basis for tweet collection, with the collected data stored in a MySQL-like database, while the Lucene system, together with the WordNet lexical database linked to advanced natural language processing, and the PostGIS platform were used to perform semantic and spatial analysis of the collected data. A functional diversity approach was implemented to enforce fault tolerance in the data collection component, and its performance was evaluated through comparison with alternative approaches. The proposal enables the discovery of spatial patterns within geo-located Twitter data and can provide the user or operator with useful, previously unforeseen elements.

Keywords: Data mining, Tweet, Social network, Software architecture, Semantic analysis

Review history: Received 8 December 2011, Revised 15 June 2012, Accepted 23 July 2012, Available online 7 August 2012.

Official paper URL: https://doi.org/10.1016/j.knosys.2012.07.017