This document discusses privacy-preserving techniques for information sharing and their applications. It introduces privacy-enhancing technologies that allow parties to share just the minimum required information while protecting privacy. Secure computation and private set intersection protocols are described that enable privacy-respecting data processing and information sharing when parties have limited trust. The document outlines several private set intersection techniques and their applications to privacy-preserving genomic testing and collaborative machine learning.
K anonymity for crowdsourcing database
In crowdsourcing database, human operators are embedded into the database engine and collaborate with other conventional database operators to process the queries. Each human operator publishes small HITs (Human Intelligent Task) to the crowdsourcing platform, which consists of a set of database records and corresponding questions for human workers.
Clustering algorithms have become a popular tool in computer security to analyze the behavior of malware variants, identify novel malware families, and generate signatures for antivirus systems.
However, the suitability of clustering algorithms for security-sensitive settings has been recently questioned by showing that they can be significantly compromised if an attacker can exercise some control over the input data.
In this paper, we revisit this problem by focusing on behavioral malware clustering approaches, and investigate whether and to what extent an attacker may be able to subvert these approaches through a careful injection of samples with poisoning behavior.
To this end, we present a case study on Malheur, an open-source tool for behavioral malware clustering. Our experiments not only demonstrate that this tool is vulnerable to poisoning attacks, but also that it can be significantly compromised even if the attacker can only inject a very small percentage of attacks into the input data. As a remedy, we discuss possible countermeasures and highlight the need for more secure clustering algorithms.
K anonymity for crowdsourcing database
In crowdsourcing database, human operators are embedded into the database engine and collaborate with other conventional database operators to process the queries. Each human operator publishes small HITs (Human Intelligent Task) to the crowdsourcing platform, which consists of a set of database records and corresponding questions for human workers.
Clustering algorithms have become a popular tool in computer security to analyze the behavior of malware variants, identify novel malware families, and generate signatures for antivirus systems.
However, the suitability of clustering algorithms for security-sensitive settings has been recently questioned by showing that they can be significantly compromised if an attacker can exercise some control over the input data.
In this paper, we revisit this problem by focusing on behavioral malware clustering approaches, and investigate whether and to what extent an attacker may be able to subvert these approaches through a careful injection of samples with poisoning behavior.
To this end, we present a case study on Malheur, an open-source tool for behavioral malware clustering. Our experiments not only demonstrate that this tool is vulnerable to poisoning attacks, but also that it can be significantly compromised even if the attacker can only inject a very small percentage of attacks into the input data. As a remedy, we discuss possible countermeasures and highlight the need for more secure clustering algorithms.
Umang Mehta - Emotion AI Developer Day 2016Affectiva
Please tweet to us using: @Affectiva and #EmoDev16
Key Websites:
Affectiva: http://affectiva.com
Developer Portal: http://developer.affectiva.com
Affectiva Demo: http://go.affectiva.com/affectiva-demo
Emotion AI Developer Day brought together the largest remote conference of emotion recognition developers in the world, including Affectiva staff, affective computing thought leaders and companies offering complementary technologies. Emotion AI Developer Day provided opportunities for attendees to learn, as we as to help shape the future of Affectiva.
Find us on:
Facebook: https://www.facebook.com/Affectiva/
Twitter: https://twitter.com/Affectiva
LinkedIn: https://www.linkedin.com/company/affectiva_2
Teardowns are powerful tools to gain invaluable insight on how things are designed and built. Critical lessons can be learnt from this process to improve our jobs as designers and manufacturers of electronic products. The Apple Watch went from novelty to market leader in just a few weeks. Added to its impressive commercial success, this device is an incredible work of electronic packaging design. In this presentation we will show the details of the Apple Watch teardown, showing its many layers and complexities. We augment the teardown with x-ray images of several parts of the Apple Watch to gain extra insights on how this tightly packaged device is assembled. We will also present other similar teardowns, including cell phones and wearables. This presentation has images graciously provided by ifixit.com. X-Ray images taken with TruView X-Ray Inspection systems.
DTS-i Engine - DTS-i stands for “Digital Twin Spark Ignition”, a 'Bajaj Patented Technology'. DIGITAL TWIN SPARK ignition engine has two Spark plugs located at opposite ends of the combustion chamber and hence fast and efficient combustion is obtained.
Drilling Operations of Ground Water Wells AAPG-TUSC Tanta Univ 05-09-2016Mohamed _el_shora
Introduction about Hydrology.
Planning any project by scientific method.
Understand how Ground Water Wells Drilled.
Types of Pumping Test.
How you can Write the daily Report.
Detecting Hacks: Anomaly Detection on Networking DataJames Sirota
See https://medium.com/@jamessirota for a series of blog entries that goes with this deck...
Defense in Depth for Big Data
Network Anomaly Detection Overview
Volume Anomaly Detection
Feature Anomaly Detection
Model Architecture
Deployment on OpenSOC Platform
Questions
Battista Biggio @ MCS 2015, June 29 - July 1, Guenzburg, Germany: "1.5-class ...Pluribus One
Pattern classifiers have been widely used in adversarial settings like spam and malware detection, although they have not been originally designed to cope with intelligent attackers that manipulate data at test time to evade detection.
While a number of adversary-aware learning algorithms have been proposed, they are computationally demanding and aim to counter specific kinds of adversarial data manipulation.
In this work, we overcome these limitations by proposing a multiple classifier system capable of improving security against evasion attacks at test time by learning a decision function that more tightly encloses the legitimate samples in feature space, without significantly compromising accuracy in the absence of attack. Since we combine a set of one-class and two-class classifiers to this end, we name our approach one-and-a-half-class (1.5C) classification. Our proposal is general and it can be used to improve the security of any classifier against evasion attacks at test time, as shown by the reported experiments on spam and malware detection.
Security is a major concern in computer networking which faces increasing threats as the commercial
Internet and related economies continue to grow. Virtualization technologies enabling
scalable Cloud services pose further challenges to the security of computer infrastructures,
demanding novel mechanisms combining the best-of-breed to counter certain types of attacks
. Our work aims to explore advances in Cyber Threat Intelligence (CTI) in the context of
Software Defined Networking (SDN) architectures. While CTI represents a recent approach
to combat threats based on reliable sources, by sharing information and knowledge about
computer criminal activities, SDN is a recent trend in architecting computer networks based
on modularization and programmability principles. In this dissertation, we propose IntelFlow,
an intelligent detection system for SDN that follows a proactive approach using OpenFlow
to deploy countermeasures to the threats learned through a distributed intelligent plane. We
show through a proof of concept implementation that the proposed system is capable of delivering
a number of benefits in terms of effectiveness, altogether contributing to the security
of modern computer network designs.
Umang Mehta - Emotion AI Developer Day 2016Affectiva
Please tweet to us using: @Affectiva and #EmoDev16
Key Websites:
Affectiva: http://affectiva.com
Developer Portal: http://developer.affectiva.com
Affectiva Demo: http://go.affectiva.com/affectiva-demo
Emotion AI Developer Day brought together the largest remote conference of emotion recognition developers in the world, including Affectiva staff, affective computing thought leaders and companies offering complementary technologies. Emotion AI Developer Day provided opportunities for attendees to learn, as we as to help shape the future of Affectiva.
Find us on:
Facebook: https://www.facebook.com/Affectiva/
Twitter: https://twitter.com/Affectiva
LinkedIn: https://www.linkedin.com/company/affectiva_2
Teardowns are powerful tools to gain invaluable insight on how things are designed and built. Critical lessons can be learnt from this process to improve our jobs as designers and manufacturers of electronic products. The Apple Watch went from novelty to market leader in just a few weeks. Added to its impressive commercial success, this device is an incredible work of electronic packaging design. In this presentation we will show the details of the Apple Watch teardown, showing its many layers and complexities. We augment the teardown with x-ray images of several parts of the Apple Watch to gain extra insights on how this tightly packaged device is assembled. We will also present other similar teardowns, including cell phones and wearables. This presentation has images graciously provided by ifixit.com. X-Ray images taken with TruView X-Ray Inspection systems.
DTS-i Engine - DTS-i stands for “Digital Twin Spark Ignition”, a 'Bajaj Patented Technology'. DIGITAL TWIN SPARK ignition engine has two Spark plugs located at opposite ends of the combustion chamber and hence fast and efficient combustion is obtained.
Drilling Operations of Ground Water Wells AAPG-TUSC Tanta Univ 05-09-2016Mohamed _el_shora
Introduction about Hydrology.
Planning any project by scientific method.
Understand how Ground Water Wells Drilled.
Types of Pumping Test.
How you can Write the daily Report.
Detecting Hacks: Anomaly Detection on Networking DataJames Sirota
See https://medium.com/@jamessirota for a series of blog entries that goes with this deck...
Defense in Depth for Big Data
Network Anomaly Detection Overview
Volume Anomaly Detection
Feature Anomaly Detection
Model Architecture
Deployment on OpenSOC Platform
Questions
Battista Biggio @ MCS 2015, June 29 - July 1, Guenzburg, Germany: "1.5-class ...Pluribus One
Pattern classifiers have been widely used in adversarial settings like spam and malware detection, although they have not been originally designed to cope with intelligent attackers that manipulate data at test time to evade detection.
While a number of adversary-aware learning algorithms have been proposed, they are computationally demanding and aim to counter specific kinds of adversarial data manipulation.
In this work, we overcome these limitations by proposing a multiple classifier system capable of improving security against evasion attacks at test time by learning a decision function that more tightly encloses the legitimate samples in feature space, without significantly compromising accuracy in the absence of attack. Since we combine a set of one-class and two-class classifiers to this end, we name our approach one-and-a-half-class (1.5C) classification. Our proposal is general and it can be used to improve the security of any classifier against evasion attacks at test time, as shown by the reported experiments on spam and malware detection.
Security is a major concern in computer networking which faces increasing threats as the commercial
Internet and related economies continue to grow. Virtualization technologies enabling
scalable Cloud services pose further challenges to the security of computer infrastructures,
demanding novel mechanisms combining the best-of-breed to counter certain types of attacks
. Our work aims to explore advances in Cyber Threat Intelligence (CTI) in the context of
Software Defined Networking (SDN) architectures. While CTI represents a recent approach
to combat threats based on reliable sources, by sharing information and knowledge about
computer criminal activities, SDN is a recent trend in architecting computer networks based
on modularization and programmability principles. In this dissertation, we propose IntelFlow,
an intelligent detection system for SDN that follows a proactive approach using OpenFlow
to deploy countermeasures to the threats learned through a distributed intelligent plane. We
show through a proof of concept implementation that the proposed system is capable of delivering
a number of benefits in terms of effectiveness, altogether contributing to the security
of modern computer network designs.
Approximating Attack Surfaces with Stack Traces [ICSE 15]Chris Theisen
Security testing and reviewing efforts are a necessity for software projects, but are time-consuming and expensive to apply. Identifying vulnerable code supports decision-making during all phases of software development. An approach for identifying vulnerable code is to identify its attack surface, the sum of all paths for untrusted data into and out of a system. Identifying the code that lies on the attack surface requires expertise and significant manual effort. This paper proposes an automated technique to empirically approximate attack surfaces through the analysis of stack traces. We hypothesize that stack traces from user-initiated crashes have several desirable attributes for measuring attack surfaces. The goal of this research is to aid software engineers in prioritizing security efforts by approximating the attack surface of a system via stack trace analysis. In a trial on Windows 8, the attack surface approximation selected 48.4% of the binaries and contained 94.6% of known vulnerabilities. Compared with vulnerability prediction models (VPMs) run on the entire codebase, VPMs run on the attack surface approximation improved recall from .07 to .1 for binaries and from .02 to .05 for source files. Precision remained at .5 for binaries, while improving from .5 to .69 for source files.
Often information is spread among
several data sources, such as hospital databases, lab databases,
spreadsheets, etc. Moreover, the complexity of each of these data sources
might make it difficult for end-users to access them, and even
more, to query all of them at the same time.
A new solution that has been proposed to this problem is
ontology-based data access (OBDA).
OBDA is a popular paradigm, developed since the mid 2000s, to query
various types of data sources
using a common vocabulary familiar to the end-users. In a nutshell
OBDA separates the user
from the data sources (relational databases, CVS files, etc.) by means
of an ontology, which is a common terminology that provides the user with a
convenient query vocabulary, hides the structure of the data sources,
and can enrich incomplete data with background knowledge. About a
dozen OBDA systems have been implemented in both academia and
industry.
In this tutorial we will give an overview of OBDA, and our system -ontop-
which is currently being used in the context of the European project
Optique. We will discuss how to use -ontop- for data integration,
in particular concentrating on:
– How to create an ontology (common vocabulary) for a life science domain.
– How to map available data sources to this ontology.
– How to query the database using the terms in the ontology.
– How to check consistency of the data sources w.r.t. the ontology
MobiDE’2012, Phoenix, AZ, United States, 20 May, 2012Charith Perera
Charith Perera, Arkady Zaslavsky, Peter Christen, Ali Salehi, Dimitrios Georgakopoulos, Connecting Mobile Things to Global Sensor Network Middleware using System-generated Wrappers, Proceedings of the 11th ACM International Workshop on Data Engineering for Wireless and Mobile Access (ACM SIGMOD/PODS-Workshop-MobiDE), Scottsdale, Arizona, USA, May, 2012
In this era, there are need to secure data in distributed database system. For collaborative data
publishing some anonymization techniques are available such as generalization and bucketization. We consider
the attack can call as “insider attack” by colluding data providers who may use their own records to infer
others records. To protect our database from these types of attacks we used slicing technique for anonymization,
as above techniques are not suitable for high dimensional data. It cause loss of data and also they need clear
separation of quasi identifier and sensitive database. We consider this threat and make several contributions.
First, we introduce a notion of data privacy and used slicing technique which shows that anonymized data
satisfies privacy and security of data which classifies data vertically and horizontally. Second, we present
verification algorithms which prove the security against number of providers of data and insure high utility and
data privacy of anonymized data with efficiency. For experimental result we use the hospital patient datasets
and suggest that our slicing approach achieves better or comparable utility and efficiency than baseline
algorithms while satisfying data security. Our experiment successfully demonstrates the difference between
computation time of encryption algorithm which is used to secure data and our system.
zkStudyClub: Zero-Knowledge Proofs Security, in Practice [JP Aumasson, Taurus]Alex Pruden
Slides accompanying zkStudyClub talk: Zero-Knowledge Proofs Security, in Practice. JP Aumasson (co-creator of the BLAKE hash function family) will share his experience doing security auditing for various projects that use zero-knowledge proofs. He will describe his approach, the common pitfalls in the different components of a proof system, as well as a catalog of bugs that have been discovered in various projects
Talk at the Royal Society Privacy in Statistical Analysis Workshop at Imperial College -- May 3, 2017
http://wwwf.imperial.ac.uk/~nadams/events/ic-rss2017/ic-rss2017.html
Introduction to the Open Source HPCC Systems Platform by Arjuna ChalaHPCC Systems
An introduction to HPCC Systems and our data analytics processing platform with use case examples in analyzing insurance fraud cases using our open source platform.
Similar to Privacy-preserving Information Sharing: Tools and Applications (20)
A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on...Emiliano De Cristofaro
Presentation at the Computational Social Science (CSS) Initiative London, 27 March 2017.
Based on the paper by Gabriel Hine, Jeremiah Onaolapo, Emiliano De Cristofaro, Nicolas Kourtellis, Ilias Leontiadis, Riginos Samaras, Gianluca Stringhini, and Jeremy Blackburn, "Kek, Cucks, and God Emperor Trump: A Measurement Study of 4chan’s Politically Incorrect Forum and Its Effects on the Web." To appear at 11th International AAAI Conference on Web and Social Media (ICWSM 2017)
https://arxiv.org/abs/1610.03452
MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral ...Emiliano De Cristofaro
E. Mariconti, L. Onwuzurike, P. Andriotis, E. De Cristofaro, G. Ross, G. Stringhini. MaMaDroid: Detecting Android Malware by Building Markov Chains of Behavioral Model. To appear at NDSS 2017.
Cancer cell metabolism: special Reference to Lactate PathwayAADYARAJPANDEY1
Normal Cell Metabolism:
Cellular respiration describes the series of steps that cells use to break down sugar and other chemicals to get the energy we need to function.
Energy is stored in the bonds of glucose and when glucose is broken down, much of that energy is released.
Cell utilize energy in the form of ATP.
The first step of respiration is called glycolysis. In a series of steps, glycolysis breaks glucose into two smaller molecules - a chemical called pyruvate. A small amount of ATP is formed during this process.
Most healthy cells continue the breakdown in a second process, called the Kreb's cycle. The Kreb's cycle allows cells to “burn” the pyruvates made in glycolysis to get more ATP.
The last step in the breakdown of glucose is called oxidative phosphorylation (Ox-Phos).
It takes place in specialized cell structures called mitochondria. This process produces a large amount of ATP. Importantly, cells need oxygen to complete oxidative phosphorylation.
If a cell completes only glycolysis, only 2 molecules of ATP are made per glucose. However, if the cell completes the entire respiration process (glycolysis - Kreb's - oxidative phosphorylation), about 36 molecules of ATP are created, giving it much more energy to use.
IN CANCER CELL:
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
Unlike healthy cells that "burn" the entire molecule of sugar to capture a large amount of energy as ATP, cancer cells are wasteful.
Cancer cells only partially break down sugar molecules. They overuse the first step of respiration, glycolysis. They frequently do not complete the second step, oxidative phosphorylation.
This results in only 2 molecules of ATP per each glucose molecule instead of the 36 or so ATPs healthy cells gain. As a result, cancer cells need to use a lot more sugar molecules to get enough energy to survive.
introduction to WARBERG PHENOMENA:
WARBURG EFFECT Usually, cancer cells are highly glycolytic (glucose addiction) and take up more glucose than do normal cells from outside.
Otto Heinrich Warburg (; 8 October 1883 – 1 August 1970) In 1931 was awarded the Nobel Prize in Physiology for his "discovery of the nature and mode of action of the respiratory enzyme.
WARNBURG EFFECT : cancer cells under aerobic (well-oxygenated) conditions to metabolize glucose to lactate (aerobic glycolysis) is known as the Warburg effect. Warburg made the observation that tumor slices consume glucose and secrete lactate at a higher rate than normal tissues.
Seminar of U.V. Spectroscopy by SAMIR PANDASAMIR PANDA
Spectroscopy is a branch of science dealing the study of interaction of electromagnetic radiation with matter.
Ultraviolet-visible spectroscopy refers to absorption spectroscopy or reflect spectroscopy in the UV-VIS spectral region.
Ultraviolet-visible spectroscopy is an analytical method that can measure the amount of light received by the analyte.
This presentation explores a brief idea about the structural and functional attributes of nucleotides, the structure and function of genetic materials along with the impact of UV rays and pH upon them.
THE IMPORTANCE OF MARTIAN ATMOSPHERE SAMPLE RETURN.Sérgio Sacani
The return of a sample of near-surface atmosphere from Mars would facilitate answers to several first-order science questions surrounding the formation and evolution of the planet. One of the important aspects of terrestrial planet formation in general is the role that primary atmospheres played in influencing the chemistry and structure of the planets and their antecedents. Studies of the martian atmosphere can be used to investigate the role of a primary atmosphere in its history. Atmosphere samples would also inform our understanding of the near-surface chemistry of the planet, and ultimately the prospects for life. High-precision isotopic analyses of constituent gases are needed to address these questions, requiring that the analyses are made on returned samples rather than in situ.
A brief information about the SCOP protein database used in bioinformatics.
The Structural Classification of Proteins (SCOP) database is a comprehensive and authoritative resource for the structural and evolutionary relationships of proteins. It provides a detailed and curated classification of protein structures, grouping them into families, superfamilies, and folds based on their structural and sequence similarities.
Multi-source connectivity as the driver of solar wind variability in the heli...Sérgio Sacani
The ambient solar wind that flls the heliosphere originates from multiple
sources in the solar corona and is highly structured. It is often described
as high-speed, relatively homogeneous, plasma streams from coronal
holes and slow-speed, highly variable, streams whose source regions are
under debate. A key goal of ESA/NASA’s Solar Orbiter mission is to identify
solar wind sources and understand what drives the complexity seen in the
heliosphere. By combining magnetic feld modelling and spectroscopic
techniques with high-resolution observations and measurements, we show
that the solar wind variability detected in situ by Solar Orbiter in March
2022 is driven by spatio-temporal changes in the magnetic connectivity to
multiple sources in the solar atmosphere. The magnetic feld footpoints
connected to the spacecraft moved from the boundaries of a coronal hole
to one active region (12961) and then across to another region (12957). This
is refected in the in situ measurements, which show the transition from fast
to highly Alfvénic then to slow solar wind that is disrupted by the arrival of
a coronal mass ejection. Our results describe solar wind variability at 0.5 au
but are applicable to near-Earth observatories.
(May 29th, 2024) Advancements in Intravital Microscopy- Insights for Preclini...Scintica Instrumentation
Intravital microscopy (IVM) is a powerful tool utilized to study cellular behavior over time and space in vivo. Much of our understanding of cell biology has been accomplished using various in vitro and ex vivo methods; however, these studies do not necessarily reflect the natural dynamics of biological processes. Unlike traditional cell culture or fixed tissue imaging, IVM allows for the ultra-fast high-resolution imaging of cellular processes over time and space and were studied in its natural environment. Real-time visualization of biological processes in the context of an intact organism helps maintain physiological relevance and provide insights into the progression of disease, response to treatments or developmental processes.
In this webinar we give an overview of advanced applications of the IVM system in preclinical research. IVIM technology is a provider of all-in-one intravital microscopy systems and solutions optimized for in vivo imaging of live animal models at sub-micron resolution. The system’s unique features and user-friendly software enables researchers to probe fast dynamic biological processes such as immune cell tracking, cell-cell interaction as well as vascularization and tumor metastasis with exceptional detail. This webinar will also give an overview of IVM being utilized in drug development, offering a view into the intricate interaction between drugs/nanoparticles and tissues in vivo and allows for the evaluation of therapeutic intervention in a variety of tissues and organs. This interdisciplinary collaboration continues to drive the advancements of novel therapeutic strategies.
Nutraceutical market, scope and growth: Herbal drug technologyLokesh Patil
As consumer awareness of health and wellness rises, the nutraceutical market—which includes goods like functional meals, drinks, and dietary supplements that provide health advantages beyond basic nutrition—is growing significantly. As healthcare expenses rise, the population ages, and people want natural and preventative health solutions more and more, this industry is increasing quickly. Further driving market expansion are product formulation innovations and the use of cutting-edge technology for customized nutrition. With its worldwide reach, the nutraceutical industry is expected to keep growing and provide significant chances for research and investment in a number of categories, including vitamins, minerals, probiotics, and herbal supplements.
2. Prologue
Privacy-Enhancing Technologies (PETs):
Increase privacy of users, groups, and/or organizations
PETs often respond to privacy threats
Protect personally identifiable information
Support anonymous communications
Privacy-respecting data processing
Another angle: privacy as an enabler
Actively enabling scenarios otherwise impossible w/o
clear privacy guarantees
2
3. Sharing Information w/ Privacy
When parties with limited mutual trust willing or
required to share information
Only the required minimum amount of information should
be disclosed in the process
3
4. Secure Computation
Alice (a) Bob (b)
f(a,b)
f(a,b)f(a,b)
Map information sharing to f(·,·)?
Realize secure f(·,·) efficiently?
Quantify information disclosure from output of f(·,·)?
4
5. Private Set Intersection (PSI)
Server Client
S ={s1, ,sw} C ={c1, ,cv}
Private
Set Intersection
SÇC
5
9. Authorized Private Set Intersection
Server Client
S ={s1, ,sw} C ={c1, ,cv}
Private
Set Intersection
SÇC
What if the client
populates C with its best
guesses for S?
Client needs to prove that inputs satisfy
a policy or be authorized
Authorizations issued by appropriate authority
Authorizations need to be verified implicitly
9
10. Size-Hiding Private Set Intersection
Server Client
S ={s1, ,sw} C ={c1, ,cv}
Private
Set Intersection
SÇC
v w
10
11. Garbled-Circuit Based PSI
Very fast implementations
[HKE12, DCW13, PSZ14], possibly more
Based on pipelining
Exploits advances in OT Extension
Do not support all functionalities
Still high communication overhead
11
12. Special-purpose PSI
[DT10]: scales efficiently to very large sets
First protocol with linear complexities and fast crypto
[DKT10]: extends to arbitrarily malicious adversaries
Works also for Authorized Private Set Intersection
[DJLLT11]: PSI-based database querying
Won IARPA APP challenge, basis for IARPA SPAR
[DT12]: optimized toolkit for PSI
Privately intersect sets – 2,000 items/sec
[ADT11]: size-hiding PSI
12
14. OPRF-based PSI
Server Client
fk (x)
OPRF
k
Unless sj is in the intersection fk(sj)
indistinguishable from random
ci
fk (sj )
fk (ci )
14
15. OPRF from Blind-RSA Signatures
RSA Signatures:
PRF: fd (x)= H(sigd (x))
e×d º1mod(p-1)(q-1)(N = p×q, e), d
Sigd (x)= H(x)d
modN,
Ver(Sig(x), x)=1ÛSig(x)e
= H(x)modN
Server (d) Client (x)
(H one way function)
a = H(x)×re r Î ZN
(= H(x)d
red
)
sigd (x) = b /rb = ad
fd (x)= H(sigd (x))
15
16. Authorized Private Set Intersection (APSI)
Server Client
S ={s1, ,sw} C ={(c1,auth(c1)), ,(cv,auth(cv ))}
Authorized Private
Set Intersection
SÇC =
def
sj Î S $ci Î C :ci = sj Ùauth(ci )is valid{ }
Court
16
17. OPRF w/ Implicit Signature Verification
Server Client
fk (x)
OPRF with ISV
k sig(x)
fk (x) if Ver(sig(x), x)=1
$ otherwise
17
18. A simple OPRF-like with ISV
Court issues authorizations:
OPRF: fk (x)= F(H(x)2k
modN)
Sig(x)= H(x)d
modN
Server (k) Client (H(x)d)
a = H(x)d
gr r Î ZN
(b = H(x)2edk
g2rek
)
H(x)2k
= b/g2erk
b = a2×e×k
;gk
fk (x)= F(H(x)2k
)(Implicit Verification)
18
19. OPRF with ISV – Malicious Security
OPRF: fk (x)= F(H(x)2k
)
Server (k) Client (H(x)d)
a = H(x)d
gr r Î ZN
(b = H(x)2edk
g2rek
)
H(x)2k
= b/g2erkb = a2ek
fk (x)= F(H(x)2k
)
a = H(x)(g')r
p = ZKPK{r :a2e
/a2
=(ge
/g')2r
}
gk
p'=ZKPK{k:b=a2ek
}
19
20. Other Building Blocks
[DGT12]: Private Set Intersection Cardinality-only
[BDG12]: Private Sample Set Similarity
[DFT13]: Private Substring/Pattern Matching
20
27. The Bad News
Sensitivity of human genome:
Uniquely identifies an individual
Discloses ethnicity, disease predispositions (including mental)
Progress aggravates fears of discrimination
Once leaked, it cannot be “revoked”
De-identification and obfuscation are not effective
More info:
[ADHT13] Chills and Thrills of Whole Genome Sequencing. IEEE
Computer Magazine.
27
28. Secure Genomics?
Privacy:
Individuals remain in control of their genome
Allow doctors/clinicians/labs to run genomic tests, while
disclosing the required minimum amount of information, i.e.:
(1) Individuals don’t disclose their entire genome
(2) Testing facilities keep test specifics (“secret sauce”)
confidential
[BBDGT11]: Secure genomics via *-PSI
Most personalized medicine tests in < 1 second
Works on Android too
28
30. doctor
or lab
genome
individual
test specifics
Secure
Function
Evaluation
test result test result
• Private Set Intersection (PSI)
• Authorized PSI
• Private Pattern Matching
• Homomorphic Encryption
• Garbled Circtuis
• […]
Output reveals nothing beyond
test result
• Paternity/Ancestry Testing
• Testing of SNPs/Markers
• Compatibility Testing
• Disease Predisposition […]
30
31. Open Problems
Where do we store genomes?
Encryption can’t guarantee security past 30-50 yrs
Reliability and availability issues?
Cryptography
Efficiency overhead
Data representation assumptions
How much understanding required from users?
31
32. Collaborative Anomaly Detection
Anomaly detection is hard
Suspicious activities deliberately mimic normal behavior
But, malevolent actors often use same resources
Wouldn’t it be better if organizations
collaborated?
It’s a win-win, no?
“It is the policy of the United States
Government to increase the volume, timelines,
and quality of cyber threat information shared
with U.S. private sector entities so that these
entities may better protect and defend
themselves against cyber attacks.”
Barack Obama
2013 State of the Union Address
32
33. Problems with Collaborations
Trust
Will others leak my data?
Legal Liability
Will I be sued for sharing customer data?
Will others find me negligible?
Competitive concerns
Will my competitors outperform me?
Shared data quality
Will data be reliable?
33
35. Training Machine Learning Models
The Big Data “Hype”
Large-scale collection of contextual information often
essential to gather statistics, train machine learning models,
and extract knowledge from data
Doing so privately…
35
36. Efficient Private Statistics [MDD16]
Real-world problems:
1. Recommender systems for online streaming services
2. Statistics about mass transport movements
3. Traffic statistics for the Tor Network
Available tools for computing private statistics are
impractical for large streams collection
Intuition: Approximate statistics are acceptable in
some cases?
36
37. Preliminaries: Count-Min Sketch
An estimate of an item’s frequency in a stream
Mapping a stream of values (of length T) into a matrix of size
O(logT)
The sum of two sketches results in the sketch of the union of
the two data streams
37
38. ItemKNN Recommender Systems
Predict favorite TV programs based on their own
ratings and those of “similar” users
Consider N users, M programs and binary ratings
Build a co-views matrix C, where Cab is the number of
views for the pair of programs (a,b)
Compute the Similarity Matrix
Identify K-Neighbors based on the Similarity Matrix
38
39. Private Recommender System
We build a global matrix of co-views for training
ItemKNN in a privacy-friendly way by relying on:
Private data aggregation based on [Kursawe et al. 2011]
Count-Min Sketch to reduce overhead
System Model
Users (in groups)
Tally Server (e.g, the BBC)
39
40. Security & Implementation
Security
In the honest-but-curious model under the CDH assumption
Prototype implementation:
Tally as a Node.js web server
Users run in the browser or as a mobile cross-
platform application (Apache Cordova)
Transparency, ease of use, ease of deployment
40
There are a few possible ways to instantiate OPRFs and apply them to PSI. I’d like to show one that is relatively simple and, incidentally, turns out to yield the most efficient PSI protocol. It’s based on Blind RSA.
First, notice that, in the random oracle model, the hash of a signature is a PRF. Therefore, we can construct an OPRF using Blind RSA Signatures, whereby the client obtains, from the server, a signature on a message x without disclosing x and without the server disclosing the RSA private signing key to the client. We can do so with this simple two round protocol.
ZKPK of equality…
using pipeling and multi-threading
So progress in sequencing can help researchers gain better understanding of diseases and their relationship with genetic features, but it is also already bearing fruits in the clinical setting as we hear more and more news of how doctors have been able to diagnose and/or cure patients thanks to whole genome sequencing
Another important point I’d like to make is that the genomics revolution is not only happening within the walls of our research labs and our hospitals. Take as an example the case of Angelina Jolie, who was diagnosed … her choice of undergoing … really put genetic testing in the spotlight, if not for the debate it generated
Another important phenomenon is the emergence of direct-to-consumer genomics, i.e., genetic tests that are marketed directly to customers, without necessarily involving a doctor or a health professional. One of the most popular companies in this market is 23andme, a US company now operating in the UK as well, that provides individuals with reports on a number of genetic traits, inherited risk factors, etc.
Not clear if cryptographic techniques can scale to (full) genomes
The ultimate ID, invariant, the most precise way of authentication, it survives time, even death
Different sizes
Users keep sequenced genomes (encrypted)
But still allow doctors/clinicians to run tests
Only disclose minimum amount of information
E.g. privacy-preserving version of paternity, ancestry, genealogy, personalized medicine tests