Seminar Privacy und Technischer Datenschutz

  • Type: seminar
  • Chair: KASTEL – Institut für Informationssicherheit und Verlässlichkeit (KASTEL Strufe), KIT-Fakultät für Informatik
  • Semester: Summer 2024
  • Lecturers: Prof. Dr. Thorsten Strufe, Patricia Guerra Balboa, Saskia Bayreuther
  • SWS: 2
  • Lv-No.: 2400087
Topics

The seminar covers current topics from the research area of technical data protection.

These include, for example:

  • anonymous communication
  • network security
  • anonymous online services
  • anonymous digital payment systems
  • evaluation of the anonymity of online services
  • anonymized publication of data (differential privacy, k-anonymity)
  • transparency/awareness enhancing systems
  • media literacy support
Language of lecture: English

Organisation

Preliminary schedule:

  • April 16, 2024, 10:00–10:45 – Introduction (organization & topics) – Room 252 (50.34)
  • April 23, 2024, 10:00–11:00 – Kickoff: Reading, Writing, Presenting; topic preferences due – Room 252 / Room 148 (50.34)
  • April 24, 2024 – Topic assignment
  • June 30, 2024 – Paper submission deadline
  • July 7, 2024 – Reviews due
  • July 14, 2024 – Revision deadline
  • ~July 23, 2024 – Presentations – TBD


Topics

This is a preliminary list of available topics:

#1 Privacy Protections for Mixed Reality

Supervisor: Simon Hanisch

Virtual Reality (VR) and Augmented Reality (AR) continue to evolve and become more integrated into our daily lives. The devices required to realize VR/AR connect the virtual world with the physical world by continuously monitoring their users and their surroundings. This continuous monitoring raises new privacy issues, since the people it captures are recorded in unprecedented quality and quantity. The goal of this seminar topic is to explore what privacy technologies exist for users of VR/AR and how they can be categorized and compared.

#2 A survey of biometric recognition for smart cities (unavailable)

Supervisor: Julian Todt

Smart cities capture individuals with a range of sensors, including video, depth, and thermal cameras, LiDAR, radar, and Wi-Fi. It has been shown that these recordings can be used to infer personal attributes and the identity of the captured individuals. The goal of this seminar topic is to survey the current state of the art of these recognition methods and to analyze and compare them. Finally, the surveyed methods should be compared across sensors, discussing how differences in recognition performance trace back to differences between the sensors.

References:

  • Shen, Chuanfu, Chao Fan, Wei Wu, Rui Wang, George Q. Huang, and Shiqi Yu. “LidarGait: Benchmarking 3D Gait Recognition With Point Clouds,” 1054–63, 2023.
  • Zou, Shinan, Chao Fan, Jianbo Xiong, Chuanfu Shen, Shiqi Yu, and Jin Tang. “Cross-Covariate Gait Recognition: A Benchmark.” arXiv, March 4, 2024.
  • Pegoraro, Jacopo, and Michele Rossi. “Real-Time People Tracking and Identification From Sparse Mm-Wave Radar Point-Clouds.” IEEE Access 9 (2021): 78504–20.

#3 Network slicing: Isolation of network devices in software-defined networks

Supervisor: Fritz Windisch

Network slicing describes the practice of dividing a network into multiple logical networks that are fully isolated from each other. These logical networks can then guarantee certain properties (e.g., quality-of-service guarantees such as latency, bandwidth, and jitter). Recently, network slicing has gained additional attention because it is used in 5G and subsequent standards. By providing the aforementioned isolation and guarantees, network slicing plays an important role in securing modern software-defined networks (SDN).

The goal of the work is to give an overview of network slicing and existing approaches (single-domain and multi-domain), alongside their security claims and limitations. As there are many projects on network slicing, covering the most recent developments is sufficient.

#4 Choosing things privately with Differential Privacy

Supervisor: Patricia Guerra Balboa

Many statistical analyses involve the selection of an element from a set of possibilities. For example, to predict a traffic jam, we need to select, from the set of all possible roads, the one where drivers are most likely to get stuck. At the same time, this selection process involves users' private information: to find the most congested road, we may need access to individuals' current locations.

Differential Privacy (DP) has become the formal, de facto mathematical standard for privacy-preserving data analysis in such situations. Hence, the literature offers various algorithms that allow us to “choose things privately”. Now the question arises: which one do we want to choose? The goal of this seminar topic is to understand the state of the art of DP selection mechanisms and to compare them in terms of privacy and utility, building a systematization that tells us which mechanism is worth using under which circumstances.
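
To make the selection problem concrete, here is a minimal Python sketch of the exponential mechanism (described in the Desfontaines post and in Dwork & Roth below); road names and congestion counts are invented:

    import math
    import random

    def exponential_mechanism(candidates, utility, epsilon, sensitivity):
        # Sample a candidate with probability proportional to
        # exp(epsilon * utility / (2 * sensitivity)).
        weights = [math.exp(epsilon * utility[c] / (2 * sensitivity))
                   for c in candidates]
        return random.choices(candidates, weights=weights, k=1)[0]

    # Hypothetical congestion counts per road; sensitivity is 1 because one
    # person changes at most one count by at most 1.
    congestion = {"A5": 120, "B10": 95, "L560": 40}
    road = exponential_mechanism(list(congestion), congestion,
                                 epsilon=1.0, sensitivity=1)
    print(road)  # the most congested road is most likely, but not certain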

References:

  • Desfontaines, D. (2023, October). Choosing things privately with the exponential mechanism. https://desfontain.es/privacy/choosing-things-privately.html. (Ted is writing things (personal blog))
  • Dwork, C., Roth, A., et al. (2014). The algorithmic foundations of differential privacy. Foundations and Trends® in Theoretical Computer Science, 9 (3–4), 211–407.
  • McKenna, R., & Sheldon, D. R. (2020). Permute-and-flip: A new mechanism for differentially private selection. Advances in Neural Information Processing Systems, 33, 193–203.

#5 Tor beyond TCP

Supervisor: Daniel Schadt

Tor is the most widely used tool for anonymous internet browsing. However, the current version of Tor uses TCP extensively: Relays use TCP to talk to each other, and Tor's circuits are best suited to route TCP connections. This brings practical challenges, such as head-of-line blocking or inefficiencies in UDP-based protocols.
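
To illustrate the head-of-line problem, a toy Python sketch (invented segment data, not Tor code): two logical streams share one ordered, TCP-like channel, so a single lost segment stalls both:

    # Two logical streams ("A" and "B") multiplexed over one ordered,
    # TCP-like channel. Segment 2 (stream B's first segment) is lost, so
    # delivery stalls for BOTH streams until it is retransmitted, even
    # though later segments arrived intact: head-of-line blocking.
    arrived = {1: "A1", 3: "B2", 4: "A2"}  # invented segments; seq 2 lost
    delivered, next_seq = [], 1
    while next_seq in arrived:
        delivered.append(arrived[next_seq])
        next_seq += 1
    print(delivered)  # ['A1'] -- "B2" and "A2" wait behind missing seq 2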

In this seminar topic, you will give an overview of the use of TCP in Tor and of the alternatives that have been proposed. You will then evaluate those proposals in terms of efficiency gains, impact on anonymity, and other sensible criteria.

#6 Deniability in multi-party computation

Supervisor: Saskia Bayreuther

In multi-party computation (MPC), mutually distrusting parties wish to jointly compute a function over private inputs. MPC protocols typically guarantee properties such as correctness (the result the parties receive is the correct output of the function) and input privacy (no information about the inputs is revealed by the protocol transcript and the output). Deniability is the additional property that an adversary with access to some information, such as protocol transcripts, cannot tell whether a given party participated in the protocol.

Deniability requires some form of anonymization. This seminar paper takes a look at deniability in MPC, starting with [1]. The student reviews papers about deniability in MPC and compares approaches to achieving this property.
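
To make correctness and input privacy concrete, here is a toy additive secret-sharing sketch (semi-honest, no networking, invented values; not a protocol from the literature). Note that even here the transcript shows who participated, which is exactly the gap deniability addresses:

    import secrets

    P = 2**61 - 1          # modulus for the shares
    inputs = [42, 17, 99]  # invented private inputs of three parties

    def share(x, n=3):
        # Split x into n random shares that sum to x mod P; any n-1 shares
        # look uniformly random, hiding the input.
        parts = [secrets.randbelow(P) for _ in range(n - 1)]
        parts.append((x - sum(parts)) % P)
        return parts

    # Party i sends its j-th share to party j; each party sums what it holds.
    all_shares = [share(x) for x in inputs]
    partial_sums = [sum(column) % P for column in zip(*all_shares)]

    # Correctness: combining the partial sums yields the true total,
    # while no single party's input is revealed.
    assert sum(partial_sums) % P == sum(inputs) % P
    print(sum(partial_sums) % P)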

References:

  1. https://eprint.iacr.org/2014/753

#7 Oblivious Message Retrieval

Supervisor: Christoph Coijanovic

Oblivious Message Retrieval (OMR) and Oblivious Message Detection (OMD) are new paradigms for anonymous communication that have received much interest since their first introduction in 2021 [1-7]. OMD assumes a setting where an untrusted server stores a database of messages. Clients want the server to detect and tell them which of these messages are for them, without the server learning this information. OMR goes a step further and allows clients to retrieve “their” messages without compromising privacy.
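
As a toy illustration of the functionality only (not of any actual OMD/OMR construction), the strawman below lets the server match recipient tags in the clear; OMD/OMR must provide exactly this service while hiding the tags and the detection result from the server. All names are invented:

    # Strawman message detection WITHOUT privacy: the server matches
    # recipient tags in the clear, so it learns exactly who receives what.
    # OMD/OMR must return the same answers while hiding them from the server.
    from dataclasses import dataclass

    @dataclass
    class Message:
        tag: bytes      # in OMD/OMR this would be an encrypted/keyed clue
        payload: bytes

    def server_detect(board, recipient_tag):
        # The server learns recipient_tag and the matching indices -- the
        # exact leakage that OMD is designed to eliminate.
        return [i for i, m in enumerate(board) if m.tag == recipient_tag]

    board = [Message(b"alice", b"msg1"), Message(b"bob", b"msg2"),
             Message(b"alice", b"msg3")]
    print(server_detect(board, b"alice"))  # [0, 2]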

Your assignment for this topic is to review existing approaches to OMD and OMR and answer the following questions:

  • What constructions exist for OMD/OMR, how do they work, and how do they differ from each other?
  • What are the advantages and disadvantages (see [7]!) of OMD/OMR compared to other techniques for anonymous communication like Private Information Retrieval (PIR) [8]?

References:

  1. Beck, Gabrielle et al. “Fuzzy Message Detection.” Proceedings of the 2021 ACM SIGSAC CCS (2021).
  2. Liu, Zeyu and Eran Tromer. “Oblivious Message Retrieval.” Annual International Cryptology Conference (2022).
  3. Liu, Zeyu et al. “Group Oblivious Message Retrieval.” IACR Cryptol. ePrint Arch. (2023).
  4. Jakkamsetti, Sashidhar et al. “Scalable Private Signaling.” IACR Cryptol. ePrint Arch. (2023).
  5. Liu, Zeyu et al. “PerfOMR: Oblivious Message Retrieval with Reduced Communication and Computation.” IACR Cryptol. ePrint Arch. (2024).
  6. Wang, Zhiwei et al. “Online/Offline and History Indexing Identity-Based Fuzzy Message Detection.” IEEE TIFS (2023).
  7. Seres, István András et al. “The Effect of False Positives: Why Fuzzy Message Detection Leads to Fuzzy Privacy Guarantees?” ArXiv (2021).
  8. Davidson, Alex et al. “FrodoPIR: Simple, Scalable, Single-Server Private Information Retrieval.” IACR Cryptol. ePrint Arch. (2023).

#8 Vulnerabilities on 5G network Layer 2

Supervisor: Kamyar Abedi

While 5G marks a substantial improvement in security compared to previous standards, recent studies have uncovered numerous vulnerabilities across the physical and network layers of the 5G stack. However, our understanding of security and privacy at the second layer remains limited. In this seminar topic, we aim to thoroughly explore and analyze various vulnerabilities targeting layer 2 of the 5G stack.

References:

  • D. Rupprecht, K. Kohls, T. Holz and C. Pöpper, “Breaking LTE on Layer Two,” 2019 IEEE Symposium on Security and Privacy (SP), San Francisco, CA, USA, 2019, pp. 1121-1136
  • Kohls, K., Rupprecht, D., Holz, T. and Pöpper, C., 2019, May. Lost traffic encryption: fingerprinting LTE/4G traffic on layer two. In Proceedings of the 12th Conference on Security and Privacy in Wireless and Mobile Networks (pp. 249-260)

#9 Attacks on Biometric Authentication Systems

Supervisor: Matin Fallahi

This seminar topic aims to explore the weaknesses found in biometric authentication systems. We will look into various studies to learn about different attacks that can undermine the security and reliability of these systems. Our main goal is to examine biometric authentication closely to find weak spots that attackers could exploit. By studying various academic papers, we will understand the risks these systems face and gain better insight into the security issues of biometric authentication. We aim to address an important question: what kinds of attacks can be mounted against biometric systems, and how can we make these systems safer?

References:

  • Roberts C. Biometric attack vectors and defences. Computers & Security. 2007 Feb 1;26(1):14-25.
  • Galbally J, McCool C, Fierrez J, Marcel S, Ortega-Garcia J. On the vulnerability of face verification systems to hill-climbing attacks. Pattern Recognition. 2010 Mar 1;43(3):1027-38.
  • Wang X, Yan Z, Zhang R, Zhang P. Attacks and defenses in user authentication systems: A survey. Journal of Network and Computer Applications. 2021 Aug 15;188:103080.

#10 Choices of Epsilon in DP

Supervisor: Àlex Miranda-Pascual

Differential privacy (DP) is one of the most widely used and well-known privacy metrics in the literature. In the classical DP formulation, the privacy level is quantified by the parameter epsilon, where lower epsilons correspond to higher privacy. Although epsilon can theoretically be any non-negative value, large values are often considered to provide no privacy. For example, the best-known study on the choice of epsilon [1] suggests choosing epsilon values smaller than 2. However, we find numerous mechanisms that achieve DP only for epsilon values around 10 or even 50. Therefore, we ask ourselves: Are these mechanisms essentially inapplicable, since they do not achieve good epsilon values, or is it still useful to consider such epsilon values?
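
For intuition about these magnitudes, the following small Python sketch prints the worst-case odds shift e^epsilon that DP permits, and what that means for randomized response (toy values):

    import math

    # DP bounds the ratio of output probabilities on neighboring databases
    # by e^epsilon. A quick look at the magnitudes involved:
    for eps in [0.1, 1, 2, 10, 50]:
        print(f"epsilon = {eps:>4}: odds can shift by up to "
              f"e^eps = {math.exp(eps):.3g}")

    # For randomized response (answer truthfully with probability p,
    # otherwise flip), eps = ln(p / (1 - p)). Already at eps = 10 the
    # truthful probability is ~0.99995, i.e. essentially no randomization.
    for eps in [1, 2, 10]:
        p = math.exp(eps) / (1 + math.exp(eps))
        print(f"epsilon = {eps}: truthful answer probability = {p:.6f}")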

In this seminar topic, your main task is to review the literature for studies on the choice of specific epsilon parameters, understand their discussion, and describe the potential effect of the choice of epsilon. In addition, using the surveyed studies, you are tasked with looking at some mechanisms that achieve DP for larger epsilon values to see whether their choice of epsilon is reasonable.

References:

  1. Lee, J., & Clifton, C. (2011). How much is enough? Choosing ε for differential privacy. In X. Lai, J. Zhou, & H. Li (Eds.), Information Security (pp. 325–340). Springer. https://doi.org/10.1007/978-3-642-24861-0_22
  2. Hsu, J., Gaboardi, M., Haeberlen, A., Khanna, S., Narayan, A., Pierce, B. C., & Roth, A. (2014). Differential privacy: An economic method for choosing epsilon. 2014 IEEE 27th Computer Security Foundations Symposium, 398–410. https://doi.org/10.1109/CSF.2014.35

#11 Conceptualizing Model and Hyperparameter Comprehension for PPML

Supervisor: Felix Morsbach

Hyperparameter optimization and model selection are key steps during the machine learning development process. Unfortunately, these steps are prohibitively expensive in terms of computing resources. This is especially severe for privacy-preserving machine learning, as model training is significantly more expensive.

Recent research demonstrates that practitioners' model and hyperparameter comprehension is crucial during hyperparameter optimization. To enhance the efficiency of hyperparameter optimization and model selection, this expertise of machine learning experts can be leveraged. However, doing so requires that the often tacit and intractable knowledge of hyperparameters and models be explicated in a machine-readable format. To enable this, we first have to understand what machine learning experts mean when they refer to these concepts and how such knowledge could be explicated. To support less expensive development of machine learning models by leveraging practitioners' knowledge, this work is envisioned to conceptualize practitioners' model and hyperparameter comprehension by reviewing recent literature.

#12 Privacy-preserving machine learning frameworks

Supervisor: Felix Morsbach

To address the privacy and confidentiality threat associated with machine learning on personal or sensitive data, multiple controls have been proposed (e.g., differentially private machine learning and federated machine learning). Not only have these controls been proposed in scientific literature, many of them are also available as – more or less – off-the-shelf products. For example, there are multiple deep learning libraries for differentially private machine learning, such as Opacus from Meta or Tensorflow Privacy from Google.

However, the volume of available privacy-preserving machine learning libraries is vast, and new scientific results are published regularly, continuously advancing and altering the state of the art. For practitioners it is often unclear which libraries are suitable for which use cases, how to compare them, or even what their protection goals are. Thus, the goal of this seminar topic is a) to develop a suitable set of categories for comparing privacy-preserving machine learning libraries and b) to create an overview of currently available libraries along these axes.
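
To give a feel for what such libraries offer, here is a minimal sketch following Opacus's make_private API (Opacus 1.x); the model, data, and hyperparameter values are placeholders, not recommendations:

    # Minimal DP-SGD training sketch with Opacus (PyTorch).
    import torch
    from torch import nn
    from torch.utils.data import DataLoader, TensorDataset
    from opacus import PrivacyEngine

    model = nn.Sequential(nn.Linear(20, 16), nn.ReLU(), nn.Linear(16, 2))
    optimizer = torch.optim.SGD(model.parameters(), lr=0.1)
    data = TensorDataset(torch.randn(512, 20), torch.randint(0, 2, (512,)))
    loader = DataLoader(data, batch_size=64)

    engine = PrivacyEngine()
    model, optimizer, loader = engine.make_private(
        module=model, optimizer=optimizer, data_loader=loader,
        noise_multiplier=1.1,   # noise added to clipped per-sample gradients
        max_grad_norm=1.0,      # per-sample gradient clipping bound
    )

    criterion = nn.CrossEntropyLoss()
    for x, y in loader:          # one epoch of DP-SGD
        optimizer.zero_grad()
        criterion(model(x), y).backward()
        optimizer.step()

    print("epsilon spent:", engine.get_epsilon(delta=1e-5))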

#13 Out of the Lab: Privacy Threats of Machine Learning

Supervisor: Felix Morsbach

Machine learning models have been demonstrated to be vulnerable to inference attacks (e.g. membership inference). Such attacks can infer information about the data used for training a given machine learning model. This can violate the confidentiality of the training data and, in scenarios in which the training data comprises personal information, might even cause a privacy (and/or data protection) violation.

This threat is often used as an argument in discussions on the privacy implications of machine learning. However, whether the information gained from such inference attacks actually poses a privacy risk is debatable. For example, the demonstrated attacks are usually conducted in sterile lab environments, and the information gained from such an attack is probabilistic.

Determining the privacy risk due to the vulnerability of machine learning models to inference attacks is difficult. Thus, as one part of this puzzle, the goal of this project is to investigate possible privacy harms of inference attacks on machine learning models.

This encompasses the following parts: For a use case of your choice, a) describe how machine learning models could be deployed in this use case, b) describe how an inference attack in this scenario could be conducted, c) characterize the information gained from a successful inference attack, and d) discuss how this information could lead to a privacy harm.
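
To make parts b) and c) concrete, here is a toy loss-threshold membership inference sketch (in the spirit of Yeom et al.'s loss-based attack, not the shadow-model attack of Shokri et al. cited below); data, model, and the attacker's calibration set are synthetic placeholders:

    # Toy loss-threshold membership inference: members tend to have lower
    # loss under an overfit model than non-members, so the inferred
    # membership signal is real but probabilistic.
    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(400, 10))
    # Noisy labels make the model memorize its training set (overfit).
    y = (X[:, 0] + rng.normal(scale=2.0, size=400) > 0).astype(int)
    X_in, y_in, X_out, y_out = X[:200], y[:200], X[200:], y[200:]

    model = RandomForestClassifier(n_estimators=50, random_state=0)
    model.fit(X_in, y_in)

    def loss(model, X, y):
        # Per-example cross-entropy under the trained model.
        p = model.predict_proba(X)[np.arange(len(y)), y]
        return -np.log(np.clip(p, 1e-12, None))

    # Idealized attacker: calibrates a threshold on known non-members,
    # then guesses "member" whenever the loss falls below it.
    threshold = np.median(loss(model, X_out, y_out))
    acc = 0.5 * ((loss(model, X_in, y_in) < threshold).mean()
                 + (loss(model, X_out, y_out) >= threshold).mean())
    print(f"attack accuracy: {acc:.2f} (0.5 = random guessing)")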

References:

  • Shokri, R., M. Stronati, C. Song, and V. Shmatikov. “Membership Inference Attacks Against Machine Learning Models.” In 2017 IEEE Symposium on Security and Privacy (SP), 3–18, 2017. https://doi.org/10.1109/SP.2017.41
  • Liu, Bo, Ming Ding, Sina Shaham, Wenny Rahayu, Farhad Farokhi, and Zihuai Lin. “When Machine Learning Meets Privacy: A Survey and Outlook.” ACM Computing Surveys 54, no. 2 (March 5, 2021): 31:1-31:36. https://doi.org/10.1145/3436755
  • Citron, Danielle Keats, and Daniel J. Solove. “Privacy harms.” BUL Rev. 102 (2022): 793.

#14 Private Set Intersection

Supervisor: Shima Hassanpour

Private Set Intersection (PSI) allows two parties to compute the intersection of their private sets without revealing anything beyond the intersection itself; it is one of the most important privacy-preserving primitives. The aim of this seminar topic is to survey existing quantum approaches that have been proposed to solve the PSI problem.
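
For orientation, here is a classical toy baseline, naive-hashing PSI, with invented values; it realizes the functionality but is insecure for low-entropy elements (hashes can be brute-forced), which is one motivation for stronger protocols, including the quantum ones surveyed here:

    # Toy "naive hashing" PSI: each party reveals only hashes of its
    # elements. Illustrates the PSI functionality, but is NOT secure for
    # low-entropy sets -- hence the need for real PSI protocols.
    import hashlib

    def h(x):
        return hashlib.sha256(x.encode()).digest()

    alice = {"alice@example.com", "bob@example.com", "carol@example.com"}
    bob = {"bob@example.com", "dave@example.com"}

    alice_hashes = {h(x): x for x in alice}
    bob_hashes = {h(x) for x in bob}

    intersection = {alice_hashes[d] for d in alice_hashes.keys() & bob_hashes}
    print(intersection)  # {'bob@example.com'}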

References:

  • Amoretti, Michele. “Private Set Intersection with Delegated Blind Quantum Computing.” In 2021 IEEE Global Communications Conference (GLOBECOM), pp. 1-6. IEEE, 2021.
  • Shi, Run-Hua, and Yi-Fei Li. “Quantum private set intersection cardinality protocol with application to privacy-preserving condition query.” IEEE Transactions on Circuits and Systems I: Regular Papers 69, no. 6 (2022): 2399-2411.

#15 Quantum Private Query

Supervisor: Shima Hassanpour

  • Private Information Retrieval (PIR) and its stronger variant, Symmetric PIR (SPIR), which additionally hides the rest of the database from the client
  • The quantum scheme realizing SPIR is known as a Quantum Private Query (QPQ)

The goal of this seminar topic is to review the different types of QPQ and how they work.
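
Since the bullet points are terse, a classical reference point may help fix ideas: the classic two-server XOR-based PIR (after Chor et al.), sketched below with an invented database:

    # Toy two-server XOR-based PIR: the client asks each (non-colluding)
    # server for the XOR of a random-looking subset of database bits;
    # neither server alone learns which index is wanted.
    import secrets

    db = [0, 1, 1, 0, 1, 0, 1, 1]  # invented 1-bit records
    i = 5                          # index the client privately wants

    # Client: a uniformly random subset S for server 1, and S toggled at
    # index i (symmetric difference) for server 2.
    S = {j for j in range(len(db)) if secrets.randbits(1)}
    S2 = S ^ {i}

    answer1 = 0
    for j in S:
        answer1 ^= db[j]
    answer2 = 0
    for j in S2:
        answer2 ^= db[j]

    # The two XORs differ exactly in db[i], so the client recovers it.
    print(answer1 ^ answer2 == db[i])  # True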