Personal data: Any information relating to an identified or identifiable natural person.
The 4 Data Types
Anonymous data: Data from which all identifying elements have been removed to the extent
that an individual's identity cannot be determined.
Pseudonymous Data: Information that no longer allows the identification of an individual
without additional information.
Personal Data: Information that relates to an identifiable person.
Sensitive data: The General Data Protection Regulation (GDPR) identifies special categories
of data which merit higher protection.
These include: racial or ethnic origin, religious, political or philosophical beliefs, trade
union membership, genetic and biometric data, health data, sex life, sexual orientation, and
criminal offences.
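The difference between personal and pseudonymous data can be illustrated with a minimal sketch. It assumes a keyed hash (HMAC-SHA256) as the pseudonymisation technique, which is one common choice and not something the GDPR prescribes. The key name, the record fields and the sample values are hypothetical. The key plays the role of the "additional information": whoever holds it can link a pseudonym back to a known identifier, so it must be kept apart from the dataset.

```python
import hashlib
import hmac

# Hypothetical secret key: the "additional information" that must be
# stored separately from the pseudonymised dataset.
PSEUDONYM_KEY = b"example-key-kept-apart-from-the-dataset"

def pseudonymize(identifier: str, key: bytes = PSEUDONYM_KEY) -> str:
    """Replace a direct identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()

# Hypothetical personal-data record.
record = {"email": "alice@example.com", "diagnosis": "asthma"}

# Pseudonymous version: the direct identifier is replaced by a pseudonym.
pseudonymous_record = {
    "subject_id": pseudonymize(record["email"]),
    "diagnosis": record["diagnosis"],
}
```

The same input always yields the same pseudonym, so records can still be joined per subject without exposing the email address itself.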
The 6 data protection principles
1. Lawfulness, fairness and transparency
2. Purpose limitation
3. Data minimization
4. Accuracy
5. Storage Limitation
6. Integrity and confidentiality
1. Lawfulness, fairness and transparency
Data should be processed lawfully, fairly and in a transparent manner in relation to
individuals.
The data controller should define a legal basis for the data they process and communicate
openly with data subjects about data processing activities (e.g. the collection of data, how it
is stored and what it is used for) by issuing a privacy notice to data subjects.
Transparency principle
The transparency principle requires data controllers to provide information about the personal
data they process in:
+ An intelligible and easily accessible form
+ Clear and plain language
+ Concise communication
Privacy Notice
A statement to the data subject which describes how the organisation collects, uses, retains
and discloses personal data.
+ A physical notice, or
+ A link to a webpage
2. Purpose limitation
Data can only be collected and processed for a defined, explicit and legitimate purpose and
should not be used for processing which falls outside the stated purpose.
Data controllers must first identify the purpose for which the personal data will
be processed. This will form the boundary of what can be done with the data.
If the organisation wishes to complete processing which falls outside this boundary, they
must determine if the processing is compatible with the original purpose.
3. Data minimization
Data should be adequate, relevant and limited to what is necessary in relation to the purposes
for which they are processed.
Only personal data that is needed for the defined purpose should be collected and processed.
The GDPR requires that data which is not needed for the processing is not collected. This is
known as data minimisation.
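As a rough illustration of data minimisation, the sketch below keeps only the fields needed for a stated purpose and discards the rest before storage. The purpose (sending a newsletter), the field names and the sample values are all assumptions made for the example.

```python
# Hypothetical purpose: sending a newsletter only needs these two fields.
REQUIRED_FIELDS = {"email", "first_name"}

def minimise(record: dict, required: set = REQUIRED_FIELDS) -> dict:
    """Keep only the fields required for the defined purpose."""
    return {field: value for field, value in record.items() if field in required}

# Hypothetical form submission containing more data than the purpose needs.
submitted = {
    "email": "alice@example.com",
    "first_name": "Alice",
    "date_of_birth": "1990-01-01",
    "phone": "555-0100",
}

stored = minimise(submitted)
```

Filtering against an explicit allow-list means that a new form field is not stored unless someone deliberately adds it to the purpose definition.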
4. Accuracy
Data should be accurate and, where necessary, kept up to date; every reasonable step must
be taken to ensure that inaccurate data is erased or rectified without delay.
It is the responsibility of the data controller to ensure data is accurate, and to action requests
by data subjects to amend or remove inaccurate data.
In addition, the data controller must notify anyone else who has access to the data, or
processes the data, that it has changed, unless it requires "disproportionate effort".
The third-party recipient should then update their record accordingly.
If requested, the data controller must inform the data subject about any data recipients.
5. Storage Limitation
Data should be kept in a form which permits identification of data subjects for no longer than
is necessary for the purposes for which personal data are processed.
Data should be kept for the duration of the processing and, if necessary, for a period
thereafter, as defined by law.
Organisations should have a data retention policy which states how long each category of
data is retained and how it will be destroyed after this period.
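A data retention policy of this kind can be sketched as a lookup from data category to retention period. The categories and periods below are invented for illustration; real periods depend on the applicable law and the organisation's own policy.

```python
from datetime import date, timedelta

# Hypothetical retention policy: category -> how long the data may be kept.
RETENTION = {
    "marketing_contact": timedelta(days=365),
    "invoice": timedelta(days=7 * 365),
}

def is_expired(category: str, collected_on: date, today: date) -> bool:
    """Return True when a record has outlived its retention period."""
    return today - collected_on > RETENTION[category]

# Example check with fixed dates.
expired = is_expired("marketing_contact", date(2023, 1, 1), date(2024, 6, 1))
```

Records flagged as expired would then be destroyed or anonymised in whatever way the policy defines.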
6. Integrity and confidentiality
Data should be processed in a manner that ensures appropriate security of the personal data,
including protection against unauthorised or unlawful processing and against accidental loss,
destruction or damage, using appropriate technical or organisational measures.
The GDPR requires organisations to complete a risk analysis for their systems and implement
appropriate technical and organisational measures to mitigate those risks and ensure the
security of the data.
Subject rights
The GDPR provides rights for individuals:
1. The right to be informed
2. The right of access
3. The right to rectification
4. The right to erasure
5. The right to restrict processing
6. The right to data portability
7. The right to object
8. Automated decision making & profiling
1. The right to be informed
The data subject has the right to know that another person or organisation has their data and
intends to use it, plus:
- Why they need it,
- How long it will be kept,
- and their rights regarding the organisation's use of the data
Data controllers need to be transparent in their handling of data and cannot pass personal data
to third parties without the data subject being aware and having the right to object.
2. The right of access
The data subject has the right to know what type of data is being stored (e.g. name, address,
national insurance number) and also has the right to request a copy of that data, as long as
the request is reasonable.
Organisations cannot charge for providing data unless the request is for multiple copies of the
data, is excessive, or the data has been requested an unreasonable
number of times.
3. The right to rectification
The right to have incorrect data amended on all systems where that data is held.
For example, if a data subject realises that their email address is incorrect when reviewing
their profile on an organisation's website, they can ask the controller to update their data.
4. The right to erasure
Individuals have the right to be forgotten, although this right is tempered by the
organisation's right to keep information if they are required to do so by law, or if they may
need to defend themselves against legal claims. If an organisation declines to erase data, it
must inform the data subject of the reasons.
5. The right to restrict processing
The individual has the right to ask the controller to restrict processing if they have contested
the accuracy of data, have objected to the processing, or if the processing is unlawful.
6. The right to data portability
The right for an individual to obtain and reuse their personal data for their own purposes, or
to move data from one IT environment to another.
This right only applies to data that is processed electronically and that has been provided to a
controller by the data subject based on consent or a contract.
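One way to picture data portability is an export of a subject's records in a structured, machine-readable format that another system can import. The sketch below assumes JSON as that format and a hypothetical in-memory record list; the field names are illustrative only.

```python
import json

def export_subject_data(records: list, subject_id: str) -> str:
    """Return one subject's records as a JSON document."""
    own_records = [r for r in records if r["subject_id"] == subject_id]
    return json.dumps(own_records, indent=2, sort_keys=True)

# Hypothetical records held by a controller.
records = [
    {"subject_id": "s1", "playlist": "Morning"},
    {"subject_id": "s2", "playlist": "Workout"},
    {"subject_id": "s1", "playlist": "Focus"},
]

exported = export_subject_data(records, "s1")
```

The receiving environment can parse the result with any JSON library, which is what makes the data reusable outside the original system.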
7. The right to object
Individuals have the right to object to processing of data based on legitimate interest, for direct
marketing, or if it is used for scientific or historical research.
In all cases data subjects have the right to question if the processing is legitimate and
necessary. Data controllers cannot "assume" they have the right to use personal data.
8. Automated decision making & profiling
If an organisation makes decisions based on purely automated processing, or uses automated
profiling (automated evaluation of an individual), the person has the right to object and ask
for human intervention.
Cryptography = Crypto (Secret) + Graphy (Write, Study)
Cryptography is the science of writing or creating secrets.
P, M = Plaintext, Message
CT = Cipher Text
K, PK, SK, MSK = Key, Public Key, Secret Key, Master Secret Key
ENC = Encryption
DEC = Decryption
Encryption: The process of encoding a message into a cipher text.
Decryption: The process of decoding a cipher text into a message.
Symmetric Cryptography Aka Secret Key Cryptography or Private Key Cryptography.
Key Size (bits): 128, 256, 512, 1024, 2048, 4096, …
Strength: randomness
Algorithm: Caesar Cipher, Vigenère Cipher, Enigma, DES, AES, RC4, …
Mode of operation: ECB, CBC, CFB, CTR, GCM, …
Help: $ openssl help
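To make the ENC and DEC notation concrete, here is a minimal sketch of the Caesar cipher from the algorithm list. It is a classroom example only and offers no real security. The function names enc and dec and the sample message are chosen for this illustration. The same key K is used in both directions, which is what makes the scheme symmetric.

```python
import string

ALPHABET = string.ascii_uppercase

def enc(m: str, k: int) -> str:
    """ENC: shift every letter of message M forward by key K."""
    return "".join(
        ALPHABET[(ALPHABET.index(c) + k) % 26] if c in ALPHABET else c
        for c in m.upper()
    )

def dec(ct: str, k: int) -> str:
    """DEC: shift every letter of cipher text CT back by the same key K."""
    return enc(ct, -k)

K = 3
CT = enc("HELLO", K)   # "KHOOR"
M = dec(CT, K)         # "HELLO"
```

Characters outside A to Z, such as spaces and digits, are passed through unchanged.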
Advantages and disadvantages of symmetric cryptography?
Advantage: secure and fast computation.
Disadvantage: the secret key must be transferred every time, which creates a high risk of the
key being compromised
=> Diffie Hellman Key Exchange [1]
Disadvantage: the secret key leaves the owner of the message and reaches other receivers, or
possibly an attacker, which means the same key is in the hands of many people.
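The idea behind the Diffie Hellman key exchange can be sketched with toy numbers: both parties derive the same secret without ever sending it. The parameters below (prime 23, generator 5) are deliberately tiny and insecure, and the helper names are hypothetical. Real deployments use much larger, standardised parameters and a vetted library.

```python
import secrets

# Toy public parameters, far too small for real use.
P = 23  # public prime modulus
G = 5   # public generator

def keypair(p: int = P, g: int = G) -> tuple:
    """Pick a random private value and compute the matching public value."""
    private = secrets.randbelow(p - 2) + 1   # in the range 1 .. p-2
    public = pow(g, private, p)              # g^private mod p
    return private, public

def shared_secret(own_private: int, other_public: int, p: int = P) -> int:
    """Combine our private value with the other side's public value."""
    return pow(other_public, own_private, p)

# Only the public values cross the network.
a_private, a_public = keypair()
b_private, b_public = keypair()

secret_a = shared_secret(a_private, b_public)
secret_b = shared_secret(b_private, a_public)
```

Both sides end up with the same value because (g^a)^b and (g^b)^a are equal modulo p, so that value can serve as the symmetric key.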
Asymmetric Cryptography Aka Public Key Cryptography
What is asymmetric cryptography?
An algorithm that uses two different keys, one for encryption and another for decryption.
Public Key (PK) is used for encryption and can be shared with the public; everyone can see it.
Secret Key (SK), aka Private Key, is used for decryption, and this key needs to be kept secret
in the owner's hands only.
Key Size (bits): 128, 256, 512, 1024, 2048, 4096, …
Strength: randomness
Algorithm: RSA, El Gamal, ECC, Pairing…
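The PK and SK roles can be illustrated with textbook RSA on toy numbers. This is a sketch for intuition only: the primes are tiny, there is no padding, and it must not be used to protect real data. The variable names follow the notation of these notes; pow with a negative exponent needs Python 3.8 or later.

```python
# Toy primes, far too small for real use.
p, q = 61, 53
n = p * q                      # modulus, part of both keys
phi = (p - 1) * (q - 1)        # Euler's totient of n
e = 17                         # public exponent, coprime with phi
d = pow(e, -1, phi)            # private exponent: inverse of e modulo phi

PK = (e, n)                    # public key, can be shared with everyone
SK = (d, n)                    # secret key, stays with the owner

def enc(m: int, pk: tuple) -> int:
    """ENC: anyone holding PK can encrypt."""
    exponent, modulus = pk
    return pow(m, exponent, modulus)

def dec(ct: int, sk: tuple) -> int:
    """DEC: only the holder of SK can decrypt."""
    exponent, modulus = sk
    return pow(ct, exponent, modulus)

M = 65                         # message encoded as an integer smaller than n
CT = enc(M, PK)
recovered = dec(CT, SK)
```

Unlike the symmetric case, nothing secret has to be transferred: the sender only needs PK.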
IBE: Identity-Based Encryption
What is Identity-Based Encryption [3,4]?
An algorithm that encrypts the message with the receiver's identity, such as a name or email
address.
Only the receiver who possesses the key for the encrypted identity can decrypt the cipher text.
The encryption can be done even before the receiver exists.
The receiver needs to get the secret key from the Key Generation Center (KGC) to decrypt the
cipher text.
ABE: Attribute-Based Encryption
FUZZY IBE[5]
In Fuzzy IBE, the authors extend a single identity into multiple identities.
The total set of identities is called the Identity Universe.
The message is encrypted with some identities.
The receiver whose key contains identities that match the encrypted identities is able to
decrypt the cipher text.
Suppose:
• U: Identity Universe
• ω ⊆ U: Identities encrypted in the cipher text
• ώ (where |ω ∩ ώ| ≥ d, for a threshold d): Identities in the key that can decrypt the cipher text
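The threshold condition on the overlap between the cipher text identities and the key identities can be written as a simple set check. The sketch below models only that matching rule, not the underlying cryptography, and the identity names and threshold are invented for the example.

```python
def can_decrypt(ct_identities: set, key_identities: set, d: int) -> bool:
    """True when the two identity sets share at least d elements."""
    return len(ct_identities & key_identities) >= d

# Hypothetical identities drawn from the Identity Universe U.
omega = {"iris", "fingerprint", "voice", "face"}      # used for encryption
omega_key = {"iris", "fingerprint", "face", "gait"}   # held in the receiver's key

allowed = can_decrypt(omega, omega_key, d=3)   # overlap of 3 meets the threshold
denied = can_decrypt(omega, {"gait"}, d=3)     # no overlap, below the threshold
```

The threshold is what makes the scheme "fuzzy": the key does not need to match every encrypted identity, only enough of them.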
What is Attribute-Based Encryption?
In ABE, the authors change the idea of multiple identities into attributes.
The total set of attributes is called the Attribute Universe.
The message is encrypted with some attributes in the form of a policy access structure.
The receiver whose key contains attributes that match the encrypted attribute policy is
able to decrypt the cipher text.
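A policy access structure can be pictured as a small tree of AND and OR nodes over attributes. The sketch below only evaluates whether a set of attributes satisfies such a policy; it does not perform any encryption. The tuple encoding of the policy and the attribute names are assumptions made for this example.

```python
def satisfies(policy, attributes: set) -> bool:
    """Check whether a set of attributes satisfies an AND/OR policy tree."""
    if isinstance(policy, str):          # leaf: a single attribute
        return policy in attributes
    operator, *children = policy         # inner node: ("AND" | "OR", child, ...)
    results = (satisfies(child, attributes) for child in children)
    if operator == "AND":
        return all(results)
    if operator == "OR":
        return any(results)
    raise ValueError(f"unknown operator: {operator}")

# Hypothetical policy: a doctor from cardiology or from emergency.
policy = ("AND", "doctor", ("OR", "cardiology", "emergency"))

doctor_ok = satisfies(policy, {"doctor", "emergency"})
nurse_ok = satisfies(policy, {"nurse", "cardiology"})
```

In an actual ABE scheme the same decision is enforced mathematically, so a key whose attributes do not satisfy the policy cannot recover the message.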
Additional Question
Principles of GDPR for Data Scientists:
1. Lawfulness, Fairness, and Transparency
○ Data must be collected, processed, and stored legally, fairly, and transparently.
○ Data scientists must ensure the purpose of data collection is clear and
communicated to individuals.
2. Purpose Limitation
○ Data must only be collected for specific, explicit, and legitimate purposes.
○ Avoid using data for purposes beyond those stated at the time of collection
without obtaining consent.
3. Data Minimization
○ Collect only the data necessary for the intended purpose.
○ Avoid collecting irrelevant or excessive data.
4. Accuracy
○ Data must be kept accurate and up to date.
○ Implement procedures for correcting or deleting inaccurate information.
5. Storage Limitation
○ Retain data only for as long as necessary for the intended purpose.
○ Afterward, securely delete or anonymize the data.
6. Integrity and Confidentiality (Security)
○ Protect data from unauthorized access, loss, or breaches using appropriate
security measures (encryption, access control, etc.).
○ Data scientists must prioritize secure storage and transmission.
7. Accountability
○ Data scientists and organizations must demonstrate compliance with GDPR
principles through documentation and processes.
Practical Steps for Data Scientists:
● Consent: Obtain clear, informed consent before collecting personal data.
● Anonymization: Use anonymization or pseudonymization to minimize risks.
● Secure Data Storage: Use encryption and limit access to sensitive data.
● Transparency: Clearly explain to users how their data will be used.
● Data Breach Response: Have a protocol in place to notify authorities and affected
individuals in case of a breach.
● Compliance Monitoring: Regularly audit data processing activities to ensure GDPR
compliance.