Data analytics on graphs

Publisher

University of Peloponnese (Πανεπιστήμιο Πελοποννήσου)

Abstract

Brands (stakeholders) increasingly need to connect with their customers effectively and efficiently in both the spatial and the social domain in order to grow their revenues. In the spatial domain, brands can use a variety of location-based services (e.g., Google Maps, Uber, Foursquare), whereas in the social domain there are several social networks (e.g., Facebook, Instagram, VK) in which brands can maintain their own pages for advertising. In this thesis, we study fundamental spatial and social data operations on graphs that can contribute significantly to successfully connecting brands and customers.
In the spatial domain, our contribution is SRX, a spatial RDF system that extends the popular RDF-3X store with spatial RDF data operations; RDF-3X itself does not support spatial RDF data. In particular, SRX supports three types of spatial queries: range selections (e.g., find entities within a given polygon), distance joins (e.g., find pairs of entities whose locations are close to each other), and k-nearest-neighbor queries (e.g., find the three entities closest to a given location). SRX also supports spatial updates (deletions, insertions, and modifications of spatial RDF triples). SRX owes its good performance to a grid scheme that approximates the geometries of spatial entities inside their integer IDs. We extensively evaluate the performance of SRX for both queries and updates by comparing it with RDF-3X, Virtuoso, GraphDB, and Strabon on the LGD and YAGO datasets. Our results show that SRX outperforms the other systems for queries and updates, while incurring only a small overhead over RDF-3X for updates.
In the social domain, we study three novel content-aware recommendation problems related to the Influence Maximization (IM) problem. IM seeks the k users who can maximize the influence of a given post in a social network. The first problem we study, Content-Aware Influence Maximization (CAIM), is the inverse variant of IM: the post is not given, and CAIM seeks the k features that should form its content so as to make it popular in a social network. The diffusion of the post starts from a given set of initial adopters (the subscribers of the brand's social network page). We prove that CAIM does not have influence guarantees, and we therefore deploy heuristic methods to solve it. Our experimental results on Gnutella and VK datasets show that our advanced heuristic algorithm achieves higher influence than simple heuristics and is also much faster than a conventional greedy approach.
The second problem we study is an adaptive (online) version of CAIM, Adaptive Content-Aware Influence Maximization (ACAIM), which aims to maximize the cumulative influence achieved in a social network over a number of rounds. In each round, the content of a post (comprising k features) is chosen, and the influence feedback from posts in previous rounds informs the content decisions for posts in subsequent rounds. To solve ACAIM, we integrate Online Learning to Rank (OLR) techniques into our machine learning IM framework. To that end, we deploy a propagation model, a simulator that runs the model to generate realistic feedback, and three ACAIM learners. Our thorough experimental study on various VK datasets for several brands shows that ACAIM is solvable in large social networks.
Finally, the third problem we study concerns how brands can maximize their subscriptions (rather than influence, as in CAIM and ACAIM) in social networks. Specifically, we propose a content recommendation policy for brands, Gaining Subscribers by Messaging (GSM). The goal of the GSM problem is to maximize the cumulative subscription gain in a social network over a series of rounds. In each round, GSM recommends to brands what content (consisting of k features) to publish on their social network pages and which m users to notify of that content. We develop three GSM solvers and, through a rich experimental evaluation on different VK datasets, establish the importance and practical value of GSM.
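
As an illustration only, the three spatial query types named in the abstract (range selection, distance join, k nearest neighbors) can be sketched in a few lines of Python. This is a brute-force sketch under stated assumptions, not SRX's method: it does not use RDF, SPARQL, or the grid scheme, it replaces the polygon with an axis-aligned rectangle, and the entity names, coordinates, and function names are all hypothetical.

```python
from math import hypot

# Hypothetical entities (not from the thesis): name -> (x, y) point location.
entities = {
    "cafe": (1.0, 1.0),
    "museum": (2.0, 3.0),
    "station": (5.0, 5.0),
    "park": (1.5, 1.2),
}


def dist(a, b):
    """Euclidean distance between two (x, y) points."""
    return hypot(a[0] - b[0], a[1] - b[1])


def range_selection(ents, xmin, ymin, xmax, ymax):
    """Names of entities inside a rectangle (a simplification of a polygon)."""
    return sorted(
        name
        for name, (x, y) in ents.items()
        if xmin <= x <= xmax and ymin <= y <= ymax
    )


def distance_join(ents, eps):
    """All unordered pairs of entities whose locations are within eps."""
    names = sorted(ents)
    return [
        (a, b)
        for i, a in enumerate(names)
        for b in names[i + 1:]
        if dist(ents[a], ents[b]) <= eps
    ]


def k_nearest(ents, query, k):
    """The k entity names closest to the query point, nearest first."""
    return sorted(ents, key=lambda name: dist(ents[name], query))[:k]


in_window = range_selection(entities, 0.0, 0.0, 2.0, 2.0)
close_pairs = distance_join(entities, 0.6)
nearest_three = k_nearest(entities, (0.0, 0.0), 3)
```

Each function scans every entity (or every pair), which is only practical for tiny inputs; a system such as the one described would rely on indexing instead of exhaustive scans.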

Description

Doctoral dissertation (Δ.Δ.) 24
