Privacy and Technical Data Protection

  • Type: Seminar
  • Chair: KIT-Fakultät für Informatik - KASTEL - Institut für Informationssicherheit und Verlässlichkeit - KASTEL Strufe
  • Semester: Winter 2021/2022
  • Lecturers: Prof. Dr. Thorsten Strufe
    Christiane Kuhn
    Sven Maier
  • SWS: 2
  • Course no.: 2400118
Content

The seminar covers current topics from the research field of technical privacy protection.

These include, for example:

- Anonymous communication
- Network security
- Anonymized online services
- Anonymous digital payment systems
- Evaluation of the anonymity of online services
- Anonymized data publication (differential privacy, k-anonymity)
- Transparency-/awareness-enhancing systems
- Support for media literacy

Lecture language: German/English

Contents

Important Dates

October 21, 2 PM, 50.34 Room 252 - Organization intro, topics presentation
October 28, 2 PM, 50.34 Room 252 - Kickoff: reading, writing, presenting
October 29 - Topic preferences due
November 2 - Topic assignment
January 24 - Paper submission deadline
January 31 - Reviews due
February 6 - Revision deadline
~February 15 - Presentations

Topics

1) Recognizing persons by their smell
Supervisor: Simon Hanisch

From the use of dogs as trackers, it is well known that humans can be identified, and even traced in crowds, by their smell. What is easy for dogs is still a hard problem for digital smell detectors. These detectors use multiple channels to identify the combination of gases in the air. Besides using this information to identify individuals, it might also be possible to infer additional attributes, such as whether a person smokes or, more advanced, what kind of diet the person eats. The goal of this seminar work is to survey the current state of the art in using digital smell detectors to recognize humans.

[1] Li, S., 2014. Recent developments in human odor detection technologies. Journal of Forensic Science & Criminology, 1(1), pp.1-12.
[2] Inbavalli, P. and Nandhini, G., 2014. Body odor as a biometric authentication. International Journal of Computer Science and Information Technologies, 5(5), pp.6270-6274.
[3] Yang, B. and Lee, W., 2018, November. Human body odor based authentication using machine learning. In 2018 IEEE Symposium Series on Computational Intelligence (SSCI) (pp. 1707-1714). IEEE.

2) Fine-Grained Cryptography
Supervisor: Sven Maier (sven.maier2∂kit.edu)

The goal of cryptography is to construct schemes which provide an asymmetry between the run-time of an honest party executing the protocol and the run-time of an adversary trying to break the security. Traditionally, these asymmetries are quite strong -- the honest parties should only have to run polynomial-time algorithms, whereas the adversaries have to solve a problem which is conjectured to be unsolvable in polynomial time. These problems are based on assumptions which are widely believed to be true but have not been proven to this day. In particular, they would turn out to be false if P = NP.

An alternative approach was first formalized by Degwekar et al. [1] in 2016, although it was implicitly used already by Merkle [2] in 1978. Instead of providing protection against adversaries with arbitrary polynomial run-time, Fine-Grained Cryptography provides concrete (albeit polynomial) bounds on the time it takes an adversary to break a scheme. For example, in Merkle's puzzles [2] it takes the honest parties O(n) steps to exchange a key, while an adversary trying to extract the key in time o(n^2) will provably fail. The task of this topic is to motivate and introduce Fine-Grained Cryptography and to present the Fine-Grained Public-Key Encryption scheme from LaVigne et al. [3].
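
To make the O(n)-versus-o(n^2) gap concrete, here is a minimal, purely illustrative Python sketch of Merkle's puzzles; the toy XOR-with-SHA-256 "encryption", the parameters, and all function names are our own simplification, not the construction from [2]:

# Toy sketch of Merkle's puzzles (illustrative only; not secure).
import os, random, hashlib

def weak_encrypt(puzzle_key: int, payload: bytes) -> bytes:
    # "Encryption" by XOR with a stream derived from a small puzzle key.
    stream = hashlib.sha256(puzzle_key.to_bytes(4, "big")).digest()
    return bytes(p ^ s for p, s in zip(payload, stream))

weak_decrypt = weak_encrypt  # XOR stream is its own inverse

def make_puzzles(n: int):
    """Alice: create n puzzles, each hiding (puzzle_id, session_key)."""
    table, puzzles = {}, []
    for puzzle_id in range(n):
        session_key = os.urandom(8)
        table[puzzle_id] = session_key
        payload = b"MERKLE" + puzzle_id.to_bytes(4, "big") + session_key
        puzzle_key = random.randrange(n)   # brute-forceable in O(n) tries
        puzzles.append(weak_encrypt(puzzle_key, payload))
    return table, puzzles

def solve_one(puzzles):
    """Bob: pick one puzzle at random and brute-force its small key."""
    ct = random.choice(puzzles)
    for guess in range(len(puzzles)):
        pt = weak_decrypt(guess, ct)
        if pt.startswith(b"MERKLE"):
            return int.from_bytes(pt[6:10], "big"), pt[10:18]
    raise RuntimeError("no key found")

n = 1024
table, puzzles = make_puzzles(n)          # Alice: O(n) work
puzzle_id, bob_key = solve_one(puzzles)   # Bob:   O(n) work
assert table[puzzle_id] == bob_key        # Bob sends puzzle_id in the clear

Bob announces only the puzzle_id, so an eavesdropper who sees all n puzzles must brute-force roughly n/2 of them, at O(n) cost each, before finding the one containing that identifier -- quadratic work against the honest parties' linear work.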

[1] Akshay Degwekar, Vinod Vaikuntanathan, Prashant Nalini Vasudevan: Fine-Grained Cryptography. CRYPTO (3) 2016: 533-562
[2] Ralph C. Merkle: Secure Communications Over Insecure Channels. Commun. ACM 21(4): 294-299 (1978)
[3] Rio LaVigne, Andrea Lincoln, Virginia Vassilevska Williams: Public-Key Cryptography in the Fine-Grained Setting. CRYPTO (3) 2019: 605-635

3) How to Whistleblow
Supervisor: Christoph Coijanovic

Whistleblowing (i.e., the act of exposing wrongdoing within some organization to the public) often comes with great personal risk.
While some countries guarantee legal protection to whistleblowers, they face persecution or even worse in others.
Thus, it is crucial to provide anonymous means of submitting compromising information.
In recent years, many news organizations have adopted SecureDrop [1] and GlobaLeaks [2] to receive information.
Academic research has also produced multiple purpose-built protocols [3-5], which might provide even stronger privacy protection.
The goal of this work is to survey existing approaches for whistleblowing and compare them based on provided functionality, usability, and privacy protection.

[1] Di Salvo, P. (2021). Securing Whistleblowing in the Digital Age: SecureDrop and the Changing Journalistic Practices for Source Protection. Digital Journalism, 9, 443 - 460.
[2] https://www.globaleaks.org/
[3] Habbabeh, A., Asprion, P., & Schneider, B. (2020). Mitigating the Risks of Whistleblowing - an Approach Using Distributed System Technologies. PoEM Workshops.
[4] Eskandarian, S., Corrigan-Gibbs, H., Zaharia, M., & Boneh, D. (2021). Express: Lowering the Cost of Metadata-hiding Communication with Cryptographic Privacy. USENIX Security Symposium.
[5] Newman, Z., Servan-Schreiber, S., & Devadas, S. (2021). Spectrum: High-Bandwidth Anonymous Broadcast with Malicious Security. IACR Cryptol. ePrint Arch., 2021, 325.

4) Enabling Security and Privacy with Technology during Political Events
Supervisor: Christiane Kuhn

Privacy-enhancing technologies are being built with the goal of protecting activists, protesters, and minorities. Currently, these groups already make use of technologies and strategies to protect themselves, as well as to react to active attacks. Your task in this seminar is to shed light on the technologies employed during political events and the hurdles these groups face while trying to preserve their security and privacy.

[1] Daffalla, Alaa, et al. "Defensive Technology Use by Political Activists During the Sudanese Revolution." 2021 IEEE Symposium on Security and Privacy. IEEE, 2021.
[2] Albrecht, Martin R., et al. "Collective Information Security in Large-Scale Urban Protests: the Case of Hong Kong." arXiv preprint arXiv:2105.14869 (2021).
[3] Simko, Lucy, et al. "Computer security and privacy for refugees in the United States." 2018 IEEE Symposium on Security and Privacy. IEEE, 2018.

5) How to get privacy guarantees in machine learning
Supervisor: Patricia Guerra-Balboa

The protection of private information is vital in research, business, and government. However, there is a tension between privacy and utility that remains an open problem today. In the search for a privacy guarantee that does not completely spoil our results, synthetic data has become very popular, promising to preserve the statistical properties of the raw data while providing protection against privacy attacks. Our goal is to understand recent research in this field, the weaknesses of synthetic data against different attacks, and finally to explore new methods of protecting machine learning mechanisms and synthetic data by combining different privacy techniques.

Topics to work with:
- State of the art in privacy: DP-synthetic data
- Problems with the privacy guarantees of synthetic data
- Recent ideas on combining methods to achieve privacy guarantees for machine learning mechanisms

[1] Ganev, G., Oprisanu, B., & De Cristofaro, E. (2021). Robin Hood and Matthew Effects--Differential Privacy Has Disparate Impact on Synthetic Data. arXiv preprint arXiv:2109.11429.
[2] Stadler, T., Oprisanu, B., & Troncoso, C. (2020). Synthetic Data–Anonymization Groundhog Day. arXiv preprint arXiv:2011.07018.
[3] Boedihardjo, M., Strohmer, T., & Vershynin, R. (2021). Covariance's Loss is Privacy's Gain: Computationally Efficient, Private and Accurate Synthetic Data. arXiv preprint arXiv:2107.05824.

6) The real meaning of Differential Privacy
Supervisor: Patricia Guerra-Balboa

Recently, differential privacy (DP) has achieved a good trade-off between data utility and privacy guarantees by publishing noisy outputs. Nonetheless, DP still carries a risk of privacy leakage when handling correlated data directly. Current schemes attempt to extend DP to publish correlated data, but face the challenge of either violating DP or providing low data utility. Another problem with these privacy techniques is the lack of consensus on choosing an appropriate ε parameter. With Google studies reporting ε values greater than 50, it is worth thinking about the role of ε in the definition of DP.
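
For reference, the standard definition of ε-differential privacy that the parameter comes from (as discussed, e.g., in [2]): a randomized mechanism \mathcal{M} is ε-differentially private if, for all neighboring datasets D, D' differing in one individual's record and all sets S of outputs,

\[
  \Pr[\mathcal{M}(D) \in S] \;\le\; e^{\varepsilon} \cdot \Pr[\mathcal{M}(D') \in S].
\]

Because the guarantee is the multiplicative factor e^{\varepsilon}, an ε of 50 only bounds the ratio of output probabilities between neighboring datasets by roughly e^{50} ≈ 5 · 10^{21}, which illustrates why the choice of ε deserves scrutiny.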

Topics to work with:
- What the concept of differential privacy actually means
- How to interpret the ε parameter
- Information gaps and problems of differential privacy

[1] Wang, H., Xu, Z., Jia, S., Xia, Y., & Zhang, X. (2021). Why current differential privacy schemes are inapplicable for correlated data publishing?. World Wide Web, 24, 1-23.
[2] Lee, J., & Clifton, C. (2011, October). How much is enough? choosing ε for differential privacy. In International Conference on Information Security (pp. 325-340). Springer, Berlin, Heidelberg.

7) Privacy issues of authentication based on brain waves
Supervisor: Matin Fallahi

With advancing technology, brain-wave devices have become available at reasonable prices and quality [1]. Researchers are therefore on the way to bringing them into daily life in different areas, for example brain-computer interfaces (BCI), authentication, identification, and health applications. Along with the rapid growth of brain-wave applications, privacy concerns emerge. Some studies have shown that subjects' individual characteristics can be inferred from recorded electroencephalography (EEG) signals [2,3]. Your task is to survey the current state of the art on privacy issues of EEG data, such as inference of age, gender, stress level, etc.

[1] Arias-Cabarcos P, Habrich T, Becker K, Becker C, Strufe T. Inexpensive Brainwave Authentication: New Techniques and Insights on User Acceptance. In 30th USENIX Security Symposium (USENIX Security 21), 2021.
[2] Bosl WJ, Tager-Flusberg H, Nelson CA. EEG analytics for early detection of autism spectrum disorder: a data-driven approach. Scientific reports. 2018 May 1;8(1):1-20.
[3] Kaushik P, Gupta A, Roy PP, Dogra DP. EEG-based age and gender prediction using deep BLSTM-LSTM network model. IEEE Sensors Journal. 2018 Dec 7;19(7):2634-41.

8) Differential Privacy in the Shuffle Model
Supervisor: Kilian Becher (kilian.becher∂sap.com)

In differential privacy, privacy guarantees rely on the introduction of uncertainty into data sets. This uncertainty is achieved by adding random noise to individuals’ inputs, intermediate results, or outputs. In the curator model, a trusted central analyzer with access to the individuals’ data points performs analyses of that data while introducing randomness. In contrast, in the local model, the individual data contributors locally add noise to their data points before providing the noisy data to an untrusted analyzer. Both models come with individual tradeoffs between utility and privacy.
Similar to the local model, data contributors in the shuffle model locally randomize their inputs. However, before handing their noisy data over to the central analyzer, they have their data randomly permuted by a trusted shuffler. This ensures anonymity and yields the same level of privacy while requiring less noise. The goal of this work is to survey the shuffle model, give an overview of approaches to shuffling, and investigate its tradeoff between utility and privacy.
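
A purely illustrative Python sketch of the pipeline (not any particular scheme from the references): each contributor applies classic binary randomized response locally, a trusted shuffler permutes the reports, and the untrusted analyzer debiases the shuffled reports to estimate the population mean:

# Toy sketch: local randomized response + shuffler + untrusted analyzer.
import math, random

def randomized_response(bit: int, epsilon: float) -> int:
    """Report the true bit with probability e^eps / (e^eps + 1), else flip it."""
    p_truth = math.exp(epsilon) / (math.exp(epsilon) + 1)
    return bit if random.random() < p_truth else 1 - bit

def shuffle_reports(reports):
    """Trusted shuffler: breaks the link between contributors and reports."""
    reports = list(reports)
    random.shuffle(reports)
    return reports

def estimate_mean(reports, epsilon: float) -> float:
    """Analyzer: unbiased estimate of the true mean from the noisy reports."""
    p = math.exp(epsilon) / (math.exp(epsilon) + 1)
    noisy_mean = sum(reports) / len(reports)
    return (noisy_mean - (1 - p)) / (2 * p - 1)

true_bits = [1 if random.random() < 0.3 else 0 for _ in range(100_000)]
epsilon = 1.0
reports = shuffle_reports(randomized_response(b, epsilon) for b in true_bits)
print(estimate_mean(reports, epsilon))   # close to the true mean 0.3

The shuffler never changes a report, it only hides who sent it; amplification-by-shuffling results such as [3] show that this already yields a much stronger central guarantee than the local ε each contributor uses above.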

[1] Dwork, Cynthia, Frank McSherry, Kobbi Nissim, and Adam Smith. "Calibrating noise to sensitivity in private data analysis." In Theory of cryptography conference, pp. 265-284. Springer, Berlin, Heidelberg, 2006.
[2] Bittau, Andrea, Úlfar Erlingsson, Petros Maniatis, Ilya Mironov, Ananth Raghunathan, David Lie, Mitch Rudominer, Ushasree Kode, Julien Tinnes, and Bernhard Seefeld. "Prochlo: Strong privacy for analytics in the crowd." In Proceedings of the 26th Symposium on Operating Systems Principles, pp. 441-459. 2017.
[3] Balle, Borja, James Bell, Adria Gascón, and Kobbi Nissim. "The privacy blanket of the shuffle model." In Annual International Cryptology Conference, pp. 638-667. Springer, Cham, 2019.
[4] Balcer, Victor, and Albert Cheu. "Separating local & shuffled differential privacy via histograms." arXiv preprint arXiv:1911.06879 (2019).
[5] Cheu, Albert. "Differential privacy in the shuffle model: A survey of separations." arXiv preprint arXiv:2107.11839 (2021).

9) Local Differential Privacy and the Effect of the Input Distribution
Supervisor: Kilian Becher (kilian.becher∂sap.com)

In differential privacy, privacy guarantees rely on the introduction of uncertainty into data sets. This uncertainty is achieved by adding random noise to individuals’ inputs, intermediate results, or outputs. In the curator model, a trusted central analyzer with access to the individuals’ data points performs analyses of that data while introducing randomness. In contrast, in the local model, the individual data contributors locally add noise to their data points before providing the noisy data to an untrusted analyzer. Both models come with individual tradeoffs between utility and privacy.
The data points provided by data contributors can follow different distributions, such as uniform, Gaussian, or power-law distribution. The goal of this topic is to investigate how different input-data distributions might affect the privacy guarantees and the accuracy of analysis outputs.
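
One possible experimental starting point (a purely illustrative sketch, not taken from the references): apply the same local Laplace mechanism, with identical ε and clipping range, to inputs drawn from a uniform, a Gaussian, and a power-law distribution, and compare the resulting errors:

# Illustrative sketch: local Laplace mechanism on differently distributed inputs.
import numpy as np

rng = np.random.default_rng(0)

def local_laplace_mean(x, epsilon, lo, hi):
    """Each contributor clips to [lo, hi] and adds Laplace noise locally;
    the analyzer averages the noisy reports."""
    x = np.clip(x, lo, hi)
    scale = (hi - lo) / epsilon          # per-report sensitivity / epsilon
    noisy = x + rng.laplace(0.0, scale, size=x.shape)
    return noisy.mean()

n, epsilon, lo, hi = 10_000, 1.0, 0.0, 10.0
datasets = {
    "uniform":   rng.uniform(lo, hi, n),
    "gaussian":  rng.normal(5.0, 1.0, n),
    "power-law": 1.0 + rng.pareto(2.0, n),   # heavy tail, mostly small values
}
for name, x in datasets.items():
    true_mean = np.clip(x, lo, hi).mean()
    est = local_laplace_mean(x, epsilon, lo, hi)
    print(f"{name:10s} true(clipped)={true_mean:6.3f}  ldp-estimate={est:6.3f}")

With a fixed clipping range the local noise scale is identical in all three cases, so differences in relative error and in clipping bias come from how the input distribution sits inside that range, which is exactly the effect this topic asks you to investigate.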

[1] Evfimievski, Alexandre, Johannes Gehrke, and Ramakrishnan Srikant. "Limiting privacy breaches in privacy preserving data mining." In Proceedings of the twenty-second ACM SIGMOD-SIGACT-SIGART symposium on Principles of database systems, pp. 211-222. 2003.
[2] Dwork, Cynthia, Frank McSherry, Kobbi Nissim, and Adam Smith. "Calibrating noise to sensitivity in private data analysis." In Theory of cryptography conference, pp. 265-284. Springer, Berlin, Heidelberg, 2006.
[3] Kasiviswanathan, Shiva Prasad, Homin K. Lee, Kobbi Nissim, Sofya Raskhodnikova, and Adam Smith. "What can we learn privately?." SIAM Journal on Computing 40, no. 3 (2011): 793-826.
[4] Snoke, Joshua, and Aleksandra Slavković. "pMSE mechanism: differentially private synthetic data with maximal distributional similarity." In International Conference on Privacy in Statistical Databases, pp. 138-159. Springer, Cham, 2018.
[5] Yang, Mengmeng, Lingjuan Lyu, Jun Zhao, Tianqing Zhu, and Kwok-Yan Lam. "Local differential privacy and its applications: A comprehensive survey." arXiv preprint arXiv:2008.03686 (2020).
[6] Kifer, Daniel, and Ashwin Machanavajjhala. "Pufferfish: A framework for mathematical privacy definitions." ACM Transactions on Database Systems (TODS) 39, no. 1 (2014): 1-36.