Quantifying Active and Passive De-anonymization in Social Networks

Internet users reasonably expect their online identities and web browsing activities to remain private. Unfortunately, this is far from the case in practice; in reality, users are constantly tracked on the internet. As web tracking technologies become more sophisticated and pervasive, there is an urgent need to understand and quantify web users’ privacy risks. In this project we address this problem by considering online fingerprinting attacks as well as offline social network de-anonymization attacks (graph matching attacks).

Online and Offline Offline Fingerprinting Attacks

In this project, we introduce a new mathematical formulation for the problem of de-anonymizing social network users by actively querying their membership in social network groups (Online Fingerprinting) as well as de-anonymizing pairs of correlated databases (Offline Fingerprinting). In online fingerprinting, the attacker has access to a noisy observation of the group membership of each user in the social network. When an unidentified victim visits a malicious website, the attacker uses browser history sniffing to make queries regarding the victim’s social media activity. Particularly, it can make polar queries regarding the victim’s group memberships and the victim’s identity. The attacker receives noisy responses to her queries. The goal is to de-anonymize the victim with the minimum number of queries. In offline fingerprinting, the attacker has access to a pair of correlated databases. It is assumed that one of the databases is labeled (de-anonymized), whereas the labels for the other database entries are removed (anonymized). The objective is to recover the labels in the second database by leveraging the correlation among the database entries. Starting with a rigorous mathematical model for the offline and online fingerprinting problems, upper bounds on the attacker’s expected query cost are derived, and new attack algorithms are proposed which achieve this bound. These algorithms vary in computational cost, probability of success and reliability.

Featured Group Publications

F. Shirani Chaharsooghi, S. Garg, E. Erkip, ‘An information theoretic framework for active de-anonymization in social networks based on group memberships’, 55th Annual Allerton Conference on Communication, Control, and Computing, pp. 470-477. IEEE, 2017.
F. Shirani Chaharsooghi, S. Garg, E. Erkip, ‘Optimal active social network de-anonymization using information thresholds’, IEEE International Symposium on Information Theory (ISIT), pp. 1445-1449. IEEE 2018.
F. Shirani, S. Garg, E. Erkip, “A Concentration of Measure Approach to Database De-anonymization,” to appear in IEEE International Symposium on Information Theory (ISIT), July 2019.

Offline Social Network De-anonymization Attacks

In this project, we investigate the matchability of pairs of correlated graphs and propose graph matching algorithms based on the concept of typicality from information theory. Graph matching algorithms can be used to exploit the publicly available information over several social networks to match their corresponding graphs and reveal private information. Given a pair of stochastically correlated graphs, the objective is to deduce the one-to-one correspondence between their vertices which is consistent with the underlying statistical correlation. Previous approaches to the problem often leverage the graph structure for reliable matching. In contrast, the approach proposed in this work considers the adjacency matrices of the two graphs and uses concentration of measure theorems to determine whether the two graphs can be matched reliably. We apply the framework both to seeded and seedless matching scenarios under various random graph models. Our main result shows that reliable matching is contingent upon the value of the mutual information between the variables corresponding to the edge probabilities of the two graphs being greater than a critical value.

Featured Group Publications

F. Shirani Chaharsooghi, S. Garg, E. Erkip, ‘Typicality matching for pairs of correlated graphs’, IEEE International Symposium on Information Theory (ISIT), pp. 221-225. IEEE, 2018.
F. Shirani Chaharsooghi, S. Garg, E. Erkip, ‘Seeded graph matching: efficient algorithms and theoretical guarantees’, 51st Asilomar Conference on Signals, Systems, and Computers, pp. 253-257. IEEE, 2017.