Data analytics on graphs
Publisher
University of Peloponnese (Πανεπιστήμιο Πελοποννήσου)
Abstract
Nowadays, there is an increasing need for brands (stakeholders) to connect effectively
and efficiently with their customers in both the spatial and social domains so as to
grow their revenues. In the spatial domain, a variety of location-based services
(e.g., Google Maps, Uber, Foursquare) are available to brands, whereas in the social
domain, there are several social networks (e.g., Facebook, Instagram, VK) in which
brands can maintain their own social network pages for advertising. In this thesis, we
analyze and study fundamental spatial and social data operations on graphs that can
significantly contribute to a successful connection between brands and customers.
In the spatial domain, our contribution is the development of a spatial RDF system
named SRX that extends the popular RDF-3X store to provide spatial RDF data
operations. RDF-3X itself does not support spatial RDF data. In particular, SRX
supports three types of spatial queries: range selections (e.g., find entities within a
given polygon), distance joins (e.g., find pairs of entities whose locations are close
to each other), and k nearest neighbors (e.g., find the three entities closest to a
given location). Further, SRX supports spatial updates (i.e., deletions, insertions,
and modifications of spatial RDF triples). SRX owes its good performance to a grid
scheme that approximates the geometries of the spatial entities inside their integer
IDs. We extensively evaluate the performance of SRX for both queries and updates
by comparing it with the systems RDF-3X, Virtuoso, GraphDB, and Strabon on the
LGD and YAGO datasets. Our results show that SRX outperforms the other systems for
queries and updates, while it incurs only a small overhead over RDF-3X for updates.
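The idea of approximating a geometry inside an integer ID can be sketched as follows.
This is only an illustration under stated assumptions, not the encoding SRX actually
uses: the abstract does not give the bit layout, so the Z-order interleaving, the
constants GRID_BITS and LOCAL_BITS, and all function names below are hypothetical.
The sketch shows how a range selection could discard entities by inspecting their IDs
alone, before any exact geometry is fetched.

```python
GRID_BITS = 8    # assumed resolution: 2^8 x 2^8 grid cells
LOCAL_BITS = 16  # assumed low bits that distinguish entities within one cell


def cell_of(x, y, bounds):
    """Map a point to its grid cell; bounds = (min_x, min_y, max_x, max_y)."""
    min_x, min_y, max_x, max_y = bounds
    n = 1 << GRID_BITS
    cx = int((x - min_x) / (max_x - min_x) * n)
    cy = int((y - min_y) / (max_y - min_y) * n)
    # Clamp so that points on or beyond the border still get a valid cell.
    return max(0, min(n - 1, cx)), max(0, min(n - 1, cy))


def interleave(cx, cy):
    """Z-order code: alternate the bits of the two cell coordinates."""
    z = 0
    for i in range(GRID_BITS):
        z |= ((cx >> i) & 1) << (2 * i)
        z |= ((cy >> i) & 1) << (2 * i + 1)
    return z


def deinterleave(z):
    """Inverse of interleave: recover the cell coordinates from a Z-order code."""
    cx = cy = 0
    for i in range(GRID_BITS):
        cx |= ((z >> (2 * i)) & 1) << i
        cy |= ((z >> (2 * i + 1)) & 1) << i
    return cx, cy


def encode_id(x, y, local, bounds):
    """Build an entity ID whose high bits hold the cell of the entity's location."""
    return (interleave(*cell_of(x, y, bounds)) << LOCAL_BITS) | local


def cell_from_id(entity_id):
    """Read the approximate location (grid cell) back from the ID alone."""
    return deinterleave(entity_id >> LOCAL_BITS)


def may_fall_in_range(entity_id, query, bounds):
    """Filter step: True if the entity's cell overlaps the query rectangle.

    query = (x1, y1, x2, y2). A True result still needs an exact geometry check;
    a False result lets the entity be discarded immediately.
    """
    qx1, qy1, qx2, qy2 = query
    lo = cell_of(qx1, qy1, bounds)
    hi = cell_of(qx2, qy2, bounds)
    cx, cy = cell_from_id(entity_id)
    return lo[0] <= cx <= hi[0] and lo[1] <= cy <= hi[1]
```

In such a design the filter costs only a few bit operations per ID, which is one
plausible reason a grid-in-ID scheme would add little overhead to the underlying store.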
In the social domain, we contribute by studying three novel content-aware recommendation
problems related to the Influence Maximization (IM) problem. IM seeks
the k users who can maximize the influence of a given post in a social network.
The first problem we study, named Content-Aware Influence Maximization (CAIM),
is the inverse variant of IM and seeks the k features that can form the content of
a post that is not given in advance, so as to make it popular in a social network. The
diffusion of the post starts from a given set of initial adopters (subscribers of the
brand's social network page). We prove that CAIM does not have influence guarantees,
and we therefore employ heuristic methods to solve it. Our experimental results on
Gnutella and VK datasets show that our advanced heuristic algorithm achieves higher
influence than simple heuristics and is also much faster than a conventional greedy
approach.
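To make the CAIM setting concrete, the following sketch shows a conventional greedy
baseline of the kind mentioned above: it repeatedly adds the feature with the largest
estimated influence, using Monte Carlo simulation. Everything model-specific here is
an assumption for illustration, since the abstract does not describe the propagation
model: the independent-cascade diffusion, the per-feature edge probabilities, and all
function names are hypothetical. As stated above, CAIM has no influence guarantees, so
this greedy selection is a heuristic, and its repeated simulations illustrate why such
a baseline is slow.

```python
import random


def activation_prob(edge_feature_probs, features):
    """Assumed model: an edge fires unless every chosen feature fails on it."""
    miss = 1.0
    for f in features:
        miss *= 1.0 - edge_feature_probs.get(f, 0.0)
    return 1.0 - miss


def simulate_spread(graph, adopters, features, rng):
    """One independent-cascade run; returns the number of activated users.

    graph maps user -> {neighbor: {feature: probability}}.
    """
    active = set(adopters)
    frontier = list(adopters)
    while frontier:
        next_frontier = []
        for u in frontier:
            for v, probs in graph.get(u, {}).items():
                if v not in active and rng.random() < activation_prob(probs, features):
                    active.add(v)
                    next_frontier.append(v)
        frontier = next_frontier
    return len(active)


def estimate_influence(graph, adopters, features, runs=200, seed=0):
    """Average spread over several simulated cascades."""
    rng = random.Random(seed)
    total = sum(simulate_spread(graph, adopters, features, rng) for _ in range(runs))
    return total / runs


def greedy_content(graph, adopters, candidate_features, k):
    """Pick k features one at a time, each maximizing the estimated influence."""
    chosen = []
    for _ in range(k):
        remaining = [f for f in candidate_features if f not in chosen]
        if not remaining:
            break
        best = max(
            remaining,
            key=lambda f: estimate_influence(graph, adopters, chosen + [f]),
        )
        chosen.append(best)
    return chosen
```

Each greedy step runs a full batch of simulations per remaining feature, so the cost
grows with both the number of candidate features and the size of the network.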
The second problem we study is an adaptive (online) version of CAIM, named
Adaptive Content-Aware Influence Maximization (ACAIM), and aims to maximize
the cumulative influence achieved in a social network over a number of rounds. In
each round, the content of a post is sought (comprising k features) and the influence
feedback of posts in the previous rounds is utilized for the content decision of posts
in the next rounds. To solve ACAIM, we integrate Online Learning to Rank (OLR)
techniques into our machine learning IM framework. To achieve that, we deploy a
propagation model, a simulator that runs the model to generate realistic feedback,
and three ACAIM learners. Our thorough experimental study on various VK datasets
for several brands shows that ACAIM is solvable in large social networks.
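The round-based loop described above (choose content, observe influence feedback,
update, repeat) can be sketched as follows. This is not one of the three ACAIM
learners of the thesis and it does not implement an OLR technique: a plain
epsilon-greedy bandit is swapped in purely to show the structure of the loop. The
class name, the feedback callback standing in for the simulator, and the crude rule
that credits the whole observed influence to every chosen feature are all assumptions.

```python
import random


class EpsilonGreedyContentLearner:
    """Keeps a running mean of observed influence per feature (illustrative only)."""

    def __init__(self, features, k, epsilon=0.1, seed=0):
        self.features = list(features)
        self.k = k
        self.epsilon = epsilon
        self.rng = random.Random(seed)
        self.total = {f: 0.0 for f in self.features}
        self.count = {f: 0 for f in self.features}

    def _mean(self, feature):
        n = self.count[feature]
        return self.total[feature] / n if n else 0.0

    def choose(self):
        """Explore with probability epsilon, otherwise take the k best features so far."""
        if self.rng.random() < self.epsilon:
            return self.rng.sample(self.features, self.k)
        return sorted(self.features, key=self._mean, reverse=True)[: self.k]

    def update(self, chosen, influence):
        """Crude credit assignment: every chosen feature gets the full influence."""
        for f in chosen:
            self.total[f] += influence
            self.count[f] += 1


def run_rounds(learner, feedback, rounds):
    """Play the given number of rounds and return the cumulative influence.

    feedback is a hypothetical stand-in for the simulator: it takes the chosen
    features and returns the influence observed for that post.
    """
    cumulative = 0.0
    for _ in range(rounds):
        content = learner.choose()
        influence = feedback(content)
        learner.update(content, influence)
        cumulative += influence
    return cumulative
```

The objective being accumulated in run_rounds corresponds to the cumulative influence
over rounds that ACAIM aims to maximize; a ranking-based learner would replace the
choose and update methods while the outer loop stays the same.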
Finally, the third problem we study concerns how brands can maximize their
subscriptions (rather than influence, as in CAIM and ACAIM) in social networks.
Specifically, we propose a content recommendation policy to brands for Gaining
Subscribers by Messaging (GSM). The goal of the GSM problem is to maximize
the cumulative subscription gain in a social network over a series of rounds. In each
round, GSM recommends to brands what content (consisting of k features) to publish
on their social network pages and which m users to notify of that content. We develop
three GSM solvers, and by conducting a rich experimental evaluation on different
VK datasets, we demonstrate the importance and practical value of GSM.
Description
Doctoral dissertation (Δ.Δ.) 24
Creative Commons license
Except where otherwise noted, this item's license is described as Attribution-NonCommercial-NoDerivs 3.0 Greece

