Pseudonymization and Anonymization in the ZZPL and GDPR

Let’s examine pseudonymization and anonymization under the Serbian Law on Personal Data Protection (Zakon o zaštiti podataka o ličnosti, ZZPL) and the General Data Protection Regulation (GDPR), and clarify their importance and how they mitigate risks arising from the processing of personal data. 

Brief Introduction – Personal Data and Identification

The ZZPL defines personal data as any information relating to an identified or identifiable natural person. In other words, if a natural person can be identified from certain information, even indirectly, that information is considered personal data.

Such a broad definition of personal data is not accidental. In the digital environment, an individual can very easily be singled out and identified on the basis of data that traditionally would not have been associated with the identity of natural persons, such as IP addresses, location data, details about the device used for access, and similar identifiers commonly used in electronic communication networks.

It is important to emphasize that most of this data, on its own, is insufficient to identify a natural person. However, if such data were combined with other data, determining the identity of the natural person would be possible, which is why each such data point is treated as personal data even in isolation.

Pseudonymization and Anonymization – Key Differences

In the simplest terms, the difference between pseudonymization and anonymization comes down to the following question: after the safeguard has been applied, is it still possible to determine the identity of the data subject?

Here is how these two measures differ in terms of the possibility of identification:

  • Anonymization: personal data is processed in such a way that identifying the data subject is no longer possible through the use of reasonable measures.
  • Pseudonymization: personal data is processed in such a way that elements that enable direct identification are replaced with indirect identifiers (such as codes or tokens), but identifying the data subject remains possible using additional information.

Accordingly, if an unauthorized person were to gain access to anonymized data, they would not be able to re-identify the data subject by using reasonable measures.

On the other hand, if someone were to gain access to pseudonymized data, identification would still be possible if they also obtained the additional information, i.e., the “key” that would enable them to “decrypt” the data and identify the data subjects. Although, without the key, pseudonymized data is useless for attribution, the fact that it can be “unlocked” when combined with additional information and used to reveal the identity of the data subjects means that pseudonymized data is still considered personal data.

Pseudonymization in the ZZPL – The Golden Mean of Protection Measures

The Serbian Law on Personal Data Protection, in Article 4, which defines the meaning of expressions used in the law, defines pseudonymization as follows:

“Pseudonymization” means the processing in such a manner that the personal data can no longer be attributed to a specific person without the use of additional information, provided that such additional information is kept separately and is subject to technical, organizational, and staff-related safeguards to ensure that the personal data cannot be attributed to an identified or identifiable person.

This definition points to the key elements of pseudonymization:

  • It prevents identification without additional information,
  • Such additional information is stored separately,
  • It requires implementing technical, organizational, and staff-related measures.

Pseudonymization as a Safeguard

Pseudonymization is also expressly mentioned in Article 42 of the ZZPL, which sets out the safeguards that the controller must implement when determining the means of processing and during processing itself. There, pseudonymization is classified as one of the technical, organizational, and staff-related safeguards aimed at ensuring the effective implementation of personal data protection principles, such as data minimization.

Pseudonymization is also mentioned in several other provisions of the ZZPL, including:

  • Processing for other purposes, Article 6: where processing is carried out for a purpose different from the one for which the data was originally collected, the controller is required to take into account the application of appropriate safeguards, such as encryption and pseudonymization.
  • Security of processing, Article 50: pseudonymization and encryption are expressly listed as technical, organizational, and staff-related safeguards that the controller and processor should consider in order to achieve an appropriate level of security in relation to the risk of processing.
  • Codes of conduct, Article 59: in the provision governing the drafting of codes of conduct by associations and other bodies representing categories of controllers or processors, pseudonymization is listed as one of the safeguards that should particularly be taken into account when drafting such codes.
  • Archiving in the public interest, scientific or historical research purposes, and statistical purposes, Article 92: pseudonymization is again mentioned as one of the safeguards for complying with the principle of data minimization.

Considering that the domestic legislator avoided explicitly referring to anonymization, privacy by design, and privacy by default, the fact that pseudonymization is both defined and expressly mentioned multiple times throughout the law indicates that the legislator considers pseudonymization a safeguard that controllers and processors must take into account.

How Pseudonymization Is Achieved

The European Data Protection Board (EDPB) Guidelines 01/2025 on pseudonymization state that three actions must be carried out in order to achieve the effect of pseudonymization:

  • Personal data must be modified or transformed into a different form – this is most commonly done by replacing part of the personal data with one or more pseudonyms (hence the term pseudonymization), i.e. new identifiers that can be attributed to data subjects only through the use of additional information.
  • The additional information required to attribute pseudonymized data to a specific person must be kept separately – this information enables pseudonymized data to be attributed to identifiable or identified natural persons, and keeping it separate is essential for the effectiveness of pseudonymization as a safeguard.
  • Appropriate technical and organizational safeguards must be implemented in order to ensure that pseudonymized personal data cannot be attributed without authorization to an identified or identifiable natural person –  this is usually achieved by restricting access to cryptographic keys, pseudonym tables, and similar technical means, i.e., additional information that enables attribution and, consequently, the identification of data subjects.
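
The three actions above can be illustrated with a minimal Python sketch. It is not taken from the Guidelines; the class PseudonymVault, the function pseudonymize_records, and the role names are hypothetical and exist only for illustration. The vault stands in for a separately kept system holding the additional information, and a simple role check stands in for the access restrictions.

```python
import secrets


class PseudonymVault:
    """Holds the additional information (pseudonym -> identity).

    Hypothetical helper: in practice this would be a separate system
    with its own access controls, not an in-memory object.
    """

    def __init__(self, authorized_roles):
        self._identity_by_pseudonym = {}
        self._pseudonym_by_identity = {}
        self._authorized = set(authorized_roles)

    def pseudonymize(self, identifier):
        # Reuse an existing pseudonym so one person maps to one pseudonym.
        if identifier not in self._pseudonym_by_identity:
            pseudonym = secrets.token_hex(8)
            self._pseudonym_by_identity[identifier] = pseudonym
            self._identity_by_pseudonym[pseudonym] = identifier
        return self._pseudonym_by_identity[identifier]

    def reidentify(self, pseudonym, role):
        # Attribution is allowed only for explicitly authorized roles.
        if role not in self._authorized:
            raise PermissionError(f"role {role!r} may not attribute data")
        return self._identity_by_pseudonym[pseudonym]


def pseudonymize_records(records, vault, id_field="email"):
    """Return copies of the records with the direct identifier replaced."""
    result = []
    for record in records:
        copy = dict(record)
        copy["pseudonym"] = vault.pseudonymize(copy.pop(id_field))
        result.append(copy)
    return result


vault = PseudonymVault(authorized_roles={"dpo"})
working_copy = pseudonymize_records(
    [{"email": "pera@perapera.com", "plan": "basic"}], vault
)
```

In this sketch, the working copy can be handed to analysts, while only a caller with the assumed "dpo" role can turn a pseudonym back into an email address.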

The Golden Mean of Safeguards

The fact that pseudonymization is both defined and expressly mentioned in the ZZPL is not accidental – the legislator likely considered this safeguard to be applicable in domestic practice as well. Although privacy by design and privacy by default, anonymization, and encryption all have their advantages, implementing these methods is not always feasible and often conflicts with companies’ commercial interests. Pseudonymization significantly enhances data security while preserving the data’s commercial value, making it a suitable option for broader implementation.

If someone were to gain access to pseudonymized data, but not to the additional information needed to identify the data subjects, the identities of those persons would remain undisclosed. This does not mean that pseudonymized data ceases to be personal data, but processing such data is certainly a safer option than processing traditional “raw” personal data.

The Possibility of Identification – A Key Characteristic of Pseudonymization

Pseudonymization is often implemented in a manner that enables data to be quickly restored to its original form, which may at some point allow for the further processing of raw personal data.

It is important to emphasize that even where pseudonymized data is held by one party, and the additional information necessary to carry out identification is held by another party, the pseudonymized data is still considered personal data because identification remains possible.

The Case of SRB v. EDPS (Case T-557/20) — The Element of Subjectivity in Defining Personal Data

The decision of the General Court of the European Union in the case of the Single Resolution Board (SRB) v. the European Data Protection Supervisor (EDPS) (Case T-557/20) caused significant turbulence in the field of personal data protection. The Court held that a mere theoretical possibility of identification is insufficient; rather, what matters is whether the controller or processor has any legal or practical means to attribute the data to the data subject. Where the controller or processor has no means of independently carrying out such attribution, or of obtaining the additional information necessary to do so, the Court held that the data in question does not constitute personal data.

This legal precedent had a significant impact on the proposed amendments to the regulations governing modern technologies, known as the Digital Omnibus. One of the key proposed changes to the GDPR was specifically aimed at modifying the definition of personal data.

If adopted, the proposed change would have completely relativized the concept of personal data: data would be considered personal data only when held by an entity capable of attributing it to the data subject. This would mean that the same piece of data within the same chain of recipients could simultaneously be considered and not considered personal data, depending on whether the recipient possesses the identification “key”.

Fortunately, this proposal to amend the definition of personal data encountered resistance from regulatory bodies and the broader public, and was therefore removed from further consideration. Nevertheless, the decision in Case T-557/20 – SRB v. EDPS – remains, and will undoubtedly influence future case law in the field of personal data protection; it remains to be seen to what extent.

Pseudonymized Data Retains Commercial Value

Although anonymized data is no longer considered personal data, which excludes the application of the ZZPL and the GDPR (as discussed later in the text), this form of “total” data privacy is rarely useful for business entities that process personal data for commercial purposes.

This is why pseudonymization represents a kind of golden mean: it reduces the risks associated with processing raw personal data, while preserving the commercial benefits of processing and remaining within the scope of the ZZPL and the GDPR, thereby ensuring that data subjects continue to enjoy protection and can exercise their rights.

EDPB Guidelines and the “Pseudonymization Domain” Framework

The aforementioned EDPB Guidelines 01/2025 introduce the concept of a “pseudonymization domain” as a framework intended to reduce the risk that pseudonymized data can be attributed to data subjects by preventing the use of additional information that could enable such attribution.

The pseudonymization domain encompasses the processing of pseudonymized data, as well as the persons (e.g., employees of controllers and processors), systems, and environments in which it is processed, all to prevent unauthorized attribution.

The fundamental requirement of pseudonymization imposed by the ZZPL and the GDPR is that the additional information enabling attribution must be kept separately and protected by appropriate technical and organizational safeguards. In that regard, the effectiveness of pseudonymization depends on whether actors within the pseudonymization domain, using means reasonably available to them, can access the additional information and carry out attribution.

The pseudonymization domain defines the circle of actors within which access to pseudonymized data and additional information is regulated and restricted; i.e., it determines who may and may not access such data, and under what conditions. At the same time, the domain also includes potential unauthorized actors, such as cyber attackers or employees acting contrary to instructions, whose capabilities and means are taken into account when assessing risks and designing technical and organizational safeguards, all to prevent unauthorized attribution of data.

The Guidelines state that the domain does not need to encompass the entire organization of the controller, but may include only the personnel processing pseudonymized data, as well as the information and systems available to them. At the same time, the additional information required for identification must remain outside the domain to prevent unauthorized attribution of data.

The EDPB Guidelines further state that, when sharing data with processors or other recipients, controllers must ensure that pseudonymized data remains within the defined domain and that unauthorized attribution is prevented, which may be achieved through appropriate contractual clauses.

Types of Pseudonymization (with Examples)

Pseudonymization does not eliminate the possibility of attribution, so pseudonymized data remains personal data. Below are the most common pseudonymization techniques:

  • Encryption – data is encrypted and can only be restored to its original form through the use of the appropriate cryptographic key. If the data were intercepted, it would be practically impossible to decrypt it without the key.
  • Hash functions – original data values are transformed by a hash function into a fixed-length string of characters, and the same input always produces the same output. For example, the email address pera@perapera.com may be hashed to X7A91K…, so that the same email address always produces the same hash. Still, the original value, i.e., the email address, cannot be directly obtained from the hash.
    • Salted hash – before hashing, a random value (salt) is added to the input, making any attempt to guess the original value from the hash significantly more difficult.
    • Keyed-hash function – a hash computed with a secret key is a more secure variant because only the organization that applies the hash has access to the key. Without that key, it is practically impossible for an attacker to reconstruct the hashed data.
  • Tokenization – original data values are replaced with random identifiers. For example, instead of a credit card number that follows a recognizable structure, a random token such as A1B3C45 is assigned.
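
The hashing and tokenization techniques listed above can be sketched with Python's standard library. This is an illustration under assumptions, not a production design: the function names and the Tokenizer class are hypothetical, SHA-256 is chosen only as a common example, and encryption is left out because the standard library does not include a symmetric cipher.

```python
import hashlib
import hmac
import secrets


def plain_hash(value: str) -> str:
    # Deterministic: the same input always yields the same digest.
    return hashlib.sha256(value.encode("utf-8")).hexdigest()


def salted_hash(value: str, salt: bytes) -> str:
    # The salt is mixed in before hashing and must be kept
    # in order to recompute the same digest later.
    return hashlib.sha256(salt + value.encode("utf-8")).hexdigest()


def keyed_hash(value: str, key: bytes) -> str:
    # HMAC-SHA256: without the secret key the digest cannot be recomputed.
    return hmac.new(key, value.encode("utf-8"), hashlib.sha256).hexdigest()


class Tokenizer:
    """Replaces values with random tokens.

    The mapping between tokens and values is the additional
    information and would have to be stored separately.
    """

    def __init__(self):
        self._token_by_value = {}
        self._value_by_token = {}

    def tokenize(self, value: str) -> str:
        if value not in self._token_by_value:
            token = secrets.token_urlsafe(8)
            self._token_by_value[value] = token
            self._value_by_token[token] = value
        return self._token_by_value[value]

    def detokenize(self, token: str) -> str:
        return self._value_by_token[token]


email = "pera@perapera.com"
digest = plain_hash(email)
salted = salted_hash(email, secrets.token_bytes(16))
keyed = keyed_hash(email, secrets.token_bytes(32))
token = Tokenizer().tokenize(email)
```

A plain hash of a guessable value, such as an email address, can be reversed in practice by hashing candidate values and comparing the results, which is the weakness that the salted and keyed variants are meant to address.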

It is important to emphasize that these pseudonymization methods reduce the direct connection with identity, but data subjects may still be:

  • singled out from a group because they often have a unique pseudonym,
  • linked across different datasets,
  • identified through a combination with other information.
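
The second point, linking across datasets, can be shown with a short Python sketch. The key, the function names, and the sample rows are all hypothetical. The sketch assumes that two datasets were pseudonymized with the same keyed hash and the same key, so the same person receives the same pseudonym in both.

```python
import hashlib
import hmac

# Hypothetical key, assumed to be reused for both datasets.
SHARED_KEY = b"example-key-not-for-production"


def pseudonym(email: str, key: bytes = SHARED_KEY) -> str:
    digest = hmac.new(key, email.encode("utf-8"), hashlib.sha256)
    return digest.hexdigest()[:12]


def link(dataset_a, dataset_b):
    """Join two pseudonymized datasets on their shared pseudonym."""
    by_pseudonym = {row["pseudonym"]: row for row in dataset_a}
    return [
        {**by_pseudonym[row["pseudonym"]], **row}
        for row in dataset_b
        if row["pseudonym"] in by_pseudonym
    ]


# Neither dataset contains the email address itself.
purchases = [{"pseudonym": pseudonym("pera@perapera.com"), "item": "insulin"}]
locations = [{"pseudonym": pseudonym("pera@perapera.com"), "city": "Belgrade"}]

# The merged row combines a purchase and a location for one person.
profile = link(purchases, locations)
```

Using a different key for each dataset would give the same person unrelated pseudonyms and make this kind of join considerably harder, at the cost of no longer being able to combine the datasets for legitimate purposes without the keys.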

Since data subjects can still be identified, the pseudonymized data retains its commercial value, but it remains personal data.

Anonymization — Exiting the Regulatory Framework of the ZZPL and the GDPR

Unlike pseudonymized data, anonymized data is not personal data.

Anonymization is a process that permanently prevents the identification of natural persons. Once the anonymization process has been successfully carried out, the natural person is no longer identifiable, even when anonymized data is combined with other data. Unlike pseudonymization, anonymization does not involve any “key” that would enable the controller or processor to identify the data subject.

Because anonymized data is not considered personal data, the provisions of the ZZPL and the GDPR do not apply to such data.

Although anonymization is not expressly mentioned in the ZZPL, Recital 26 of the GDPR clearly states that the GDPR does not apply to anonymized data:

The principles of data protection should therefore not apply to anonymous information, namely information which does not relate to an identified or identifiable natural person or to personal data rendered anonymous in such a manner that the data subject is not or no longer identifiable.

This Regulation does not therefore concern the processing of such anonymous information, including for statistical or research purposes.

Although a large portion of the ZZPL text was adopted from the GDPR, the recitals were unfortunately omitted. Nevertheless, the GDPR recitals may assist in interpreting the ZZPL, as is the case here.

In short, anonymization is the irreversible removal of the connection between data and the natural person to whom it relates. Once anonymization has been carried out, the information can no longer constitute personal data, thereby excluding any further compliance obligations under the ZZPL and the GDPR.

Never Say Never

Although the essence of anonymization is the permanent severance of the connection between a natural person and the data, the impossibility of identification is nevertheless not absolute.

The previously cited Recital 26 of the GDPR further states:

To determine whether a natural person is identifiable, account should be taken of all the means reasonably likely to be used, such as singling out, either by the controller or by another person to identify the natural person directly or indirectly.

The emphasis is on the phrase “reasonably likely to be used,” which suggests excluding any unreasonable methods or those unlikely to be utilized for identification from the scope of the GDPR.

Opinion 05/2014 of the Article 29 Working Party — Anonymization and the Possibility of Attribution

Opinion 05/2014 of the Article 29 Working Party addresses anonymization techniques, but also touches upon the very concept of anonymization and the possibility of identification. The Article 29 Working Party clarifies what may reasonably be expected to be used for re-identification, with particular attention being paid to the state of the art at the time of anonymization, as well as to the costs of potential re-identification and the available know-how.

If re-identification were possible with a “reasonable” amount of effort, using means that it would be “reasonably likely” third parties would use, then data processed in such a manner would still constitute personal data.

However, if re-identification would require disproportionate effort and the use of means that, at the time of data anonymization, could not reasonably have been expected to be used by third parties, then even if re-identification later became possible, it would still be considered that the data controller and processor had carried out anonymization in a diligent manner.

Of course, there is no such example in practice at present. Still, it is conceivable that technological developments (AI, quantum computing) may, in the future, enable re-identification in ways currently unforeseeable. It would be unfair to expect controllers and processors to anticipate technologies or capabilities that do not exist when safeguards are implemented, assuming they will ever come into existence.

However, the Article 29 Working Party states that anonymization is not a one-time (“release and forget”) process, but rather that anonymized data requires periodic reassessments of residual risks. For example, we are all aware of the advances in AI technology. If a controller anonymized data before the emergence of LLMs, it would be prudent to verify whether such anonymization remains effective, given that these tools are now widely available to the general public and, consequently, to malicious actors as well.

Types of Anonymization (with Examples)

Opinion 05/2014 of the Article 29 Working Party distinguishes between two main groups of anonymization techniques: randomization and generalization.

Types of Randomization

Randomization changes the actual values of the data to weaken the link between the data and the data subject.

The most common forms of randomization are:

  • Noise addition – values are slightly altered, preserving statistical usefulness while reducing precision. For example, instead of recording each person’s body weight in exact kilograms, the value is randomly adjusted (e.g., + or − a few kilograms). This allows the data to remain useful for research and statistical analysis while reducing the likelihood of individual isolation, especially in large datasets.
  • Permutation – values are shuffled between data subjects so that they no longer correspond to the individuals to whom they originally belonged. For example, in a dataset containing employee salaries, the salary values are mixed, so they no longer match the correct person. The dataset still contains the same number of salaries and the same values, which remain statistically meaningful. Still, it is no longer possible to link a specific person to a specific salary.
  • Differential privacy – instead of publishing the full dataset, responses to queries are masked by adding controlled noise that significantly hinders reconstruction of the identity of the data subject who provided a specific answer. This can be used, for example, to track app usage and see which emojis are most popular. At the same time, the added noise makes it much more difficult to attribute a specific emoji to the individual who used it, thereby preventing singling out.
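
Noise addition and permutation, as described above, can be sketched in a few lines of Python. The function names and the sample values are hypothetical, and the size of the noise (plus or minus 3 kg) is an arbitrary assumption. Whether such a transformation is sufficient for anonymization depends on the dataset and cannot be judged from the code alone.

```python
import random


def add_noise(values, max_shift, rng):
    """Shift each value by a random amount in [-max_shift, +max_shift]."""
    return [v + rng.uniform(-max_shift, max_shift) for v in values]


def permute_column(rows, column, rng):
    """Shuffle one column across rows, leaving the other columns in place."""
    shuffled = [row[column] for row in rows]
    rng.shuffle(shuffled)
    return [{**row, column: value} for row, value in zip(rows, shuffled)]


# A fixed seed is used only so that the sketch is repeatable;
# real use would need an unpredictable source of randomness.
rng = random.Random(42)

weights_kg = [61.0, 74.5, 82.0, 90.5, 68.0]
noisy_weights = add_noise(weights_kg, max_shift=3.0, rng=rng)

staff = [
    {"name": "A", "salary": 1000},
    {"name": "B", "salary": 1500},
    {"name": "C", "salary": 2200},
    {"name": "D", "salary": 3100},
]
shuffled_staff = permute_column(staff, "salary", rng)
```

After permutation, the set of salaries and therefore their average is unchanged, but a salary in a given row no longer necessarily belongs to the person named in that row. A random shuffle can leave some values in their original position by chance.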

Although randomization reduces the risk of attribution, if it is not properly implemented or the sample is not large enough, it may still be possible to identify an individual within a group.
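
The trade-off between noise and sample size can be seen in a sketch of a differentially private counting query. The example below uses the Laplace mechanism, a standard way of adding controlled noise to query answers; it is an assumption for illustration and not a technique prescribed by the Opinion. The function names, the parameter epsilon value, and the sample records are hypothetical.

```python
import random


def laplace_noise(scale, rng):
    # The difference of two independent exponential draws with the
    # same rate follows a Laplace distribution centered on zero.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def private_count(records, predicate, epsilon, rng):
    """Answer a counting query with the Laplace mechanism.

    A count changes by at most 1 when one person is added or removed,
    so the sensitivity is 1 and the noise scale is 1 / epsilon.
    """
    true_count = sum(1 for record in records if predicate(record))
    return true_count + laplace_noise(1.0 / epsilon, rng)


rng = random.Random()
usage = [{"emoji": "smile"}, {"emoji": "heart"}, {"emoji": "smile"}]
noisy_answer = private_count(
    usage, lambda r: r["emoji"] == "smile", epsilon=0.5, rng=rng
)
```

A smaller epsilon means more noise and stronger protection. With only a handful of records, as in this sketch, the noise can be as large as the count itself, whereas over a large dataset the same noise barely affects the result, which is why sample size matters.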

Types of Generalization

Generalization changes the level of detail in the data, effectively “hiding” the individual within a larger group.

Some forms include:

  • Aggregation and k-anonymity – data is grouped so that each person shares the same characteristics with at least k − 1 other individuals, i.e., every group contains at least k people (for example, using year of birth instead of an exact date of birth).
  • l-diversity – if we take the example of a group of male patients from Belgrade, l-diversity would be achieved if the group had multiple diagnoses. In this way, even if we know that a person belongs to the group, we cannot determine which diagnosis applies to them. Conversely, if we knew that everyone in the group had, for example, sinusitis, we would also know that any individual in that group has the same condition, thereby revealing personal data.
  • t-closeness – the distribution of data within a group must be sufficiently similar to the distribution in the overall population, with an allowed deviation “t”. For example, if 1% of the population in a country has HIV, then the dataset of health records should also contain approximately 1% of patients with HIV, with a maximum deviation defined by t. In this way, knowing that 1% of patients in the dataset have HIV does not reveal meaningful information, because the same proportion exists in the general population.
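
The three measures above can be computed for a small table with the following Python sketch. The function names, column names, and sample rows are hypothetical. The t-closeness function uses total variation distance between distributions, which is a simplification suited to categorical values such as diagnoses.

```python
from collections import Counter, defaultdict


def group_by(rows, quasi_identifiers):
    groups = defaultdict(list)
    for row in rows:
        groups[tuple(row[q] for q in quasi_identifiers)].append(row)
    return groups


def k_anonymity(rows, quasi_identifiers):
    """Size of the smallest group sharing the same quasi-identifier values."""
    return min(len(g) for g in group_by(rows, quasi_identifiers).values())


def l_diversity(rows, quasi_identifiers, sensitive):
    """Smallest number of distinct sensitive values found in any group."""
    return min(
        len({row[sensitive] for row in g})
        for g in group_by(rows, quasi_identifiers).values()
    )


def t_closeness(rows, quasi_identifiers, sensitive):
    """Largest distance between a group's distribution and the overall one."""
    overall = Counter(row[sensitive] for row in rows)
    total = len(rows)
    worst = 0.0
    for g in group_by(rows, quasi_identifiers).values():
        local = Counter(row[sensitive] for row in g)
        distance = 0.5 * sum(
            abs(local[v] / len(g) - overall[v] / total) for v in overall
        )
        worst = max(worst, distance)
    return worst


patients = [
    {"birth_year": 1980, "city": "Belgrade", "diagnosis": "sinusitis"},
    {"birth_year": 1980, "city": "Belgrade", "diagnosis": "asthma"},
    {"birth_year": 1975, "city": "Belgrade", "diagnosis": "sinusitis"},
    {"birth_year": 1975, "city": "Belgrade", "diagnosis": "sinusitis"},
]
qi = ["birth_year", "city"]
k = k_anonymity(patients, qi)               # 2
l = l_diversity(patients, qi, "diagnosis")  # 1
t = t_closeness(patients, qi, "diagnosis")  # 0.25
```

In this sample, every group has two people (k = 2), but the 1975 group contains only one diagnosis (l = 1), so knowing that someone belongs to that group reveals their condition, exactly as in the sinusitis example above.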

Pseudonymization and Anonymization are Personal Data Processing Operations

It is important to emphasize that pseudonymization and anonymization are types of personal data processing and fall under processing carried out for a purpose other than the one for which the data were originally collected.

For example, when signing a contract with a mobile operator, data subjects do not provide their data so that it can later be anonymized, but rather to obtain access to a phone number, device, etc.

If the controller has not obtained the data subject’s consent for processing for new purposes, it must rely on another legal basis for further processing to be lawful. This is most commonly a legitimate interest, but it may also include direct legal obligations.

For instance, the ZZPL and the GDPR provide for the principle of storage limitation, which requires that personal data be kept in a form that permits identification of data subjects only for as long as necessary for the purposes for which the data are processed. In other words, controllers and processors are required to prevent identification after the retention period has expired, and anonymization is one technique that can achieve this, for example, where deletion of data is not feasible.

Since anonymization and pseudonymization fall under personal data processing operations, they must comply with legal requirements, particularly those relating to the principles of processing under Article 5 of the ZZPL, and meet one of the conditions for lawful processing under Article 12 of the ZZPL.

After successful pseudonymization, the pseudonymized data remains personal data. Controllers and processors who have carried out the process must continue to ensure appropriate technical and organizational measures and enable data subjects to exercise their rights defined under the ZZPL, except in cases of processing that does not require identification.

The situation with anonymization is different – after a properly executed process, anonymized data is no longer considered personal data and falls outside the scope of the ZZPL and the GDPR.

Legal Practice and the “Motivated Intruder” Test

A recent decision of the French Conseil d’État (the highest administrative court), No. 498628 from February 2026, illustrates why pseudonymized data is still considered personal data.

In the case at hand, the Court examined healthcare databases containing information on patients’ health conditions and treatments, partially pseudonymized by replacing names with specific codes, but still containing numerous indirect identifiers that could be used to attribute data to data subjects. The defendant companies argued that the data could not be considered personal data because the database did not contain direct identifiers that could reveal identity.

The Court’s key position was that such data still constitutes personal data because there is a real possibility of attribution and re-identification using reasonably available means. The Court based its reasoning on the GDPR standard, which requires consideration of all means “reasonably likely to be used” for identification, including the costs and time required to perform identification, as well as available technology.

Although names were replaced with codes, the database still contained data such as patients’ age and gender, timestamps of medication purchases, and detailed health information (diagnoses, therapies, etc.), which constitute special categories of personal data. The Court emphasized that by combining certain pieces of information (e.g., rare disease + location and time of examination + attending physician), it may be possible to identify a patient even without direct identifiers. Due to this, and by using standard tools such as spreadsheet software and publicly available registers of healthcare professionals, it is relatively easy to combine data from the database and reconstruct individuals’ identities, or at least single them out from the group.
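
The kind of combination described above can be illustrated with a short Python sketch. It does not reproduce the data or the method examined in the case; the records, the auxiliary list, the field names, and the function name are all invented for illustration. It assumes that an intruder holds some outside knowledge about individuals (age, gender, and region) and compares it with the coded records.

```python
def reidentify(pseudonymized, auxiliary, keys):
    """Return {code: name} for records matching exactly one known person."""
    matches = {}
    for record in pseudonymized:
        candidates = [
            person["name"]
            for person in auxiliary
            if all(person[k] == record[k] for k in keys)
        ]
        # A single candidate means the record has been singled out.
        if len(candidates) == 1:
            matches[record["code"]] = candidates[0]
    return matches


# Names were replaced with codes, but indirect identifiers remain.
records = [
    {"code": "P-001", "age": 47, "gender": "F", "region": "north",
     "diagnosis": "rare disease"},
    {"code": "P-002", "age": 33, "gender": "M", "region": "south",
     "diagnosis": "flu"},
]

# Hypothetical outside knowledge available to an intruder.
known_people = [
    {"name": "Person A", "age": 47, "gender": "F", "region": "north"},
    {"name": "Person B", "age": 33, "gender": "M", "region": "south"},
    {"name": "Person C", "age": 33, "gender": "M", "region": "south"},
]

found = reidentify(records, known_people, keys=["age", "gender", "region"])
```

Here the record P-001 matches only one known person and is therefore attributed to them together with its diagnosis, while P-002 remains ambiguous because two people share the same characteristics.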

Although the controllers argued otherwise, the Court concluded that the data in question was pseudonymized and that the GDPR therefore applied. Furthermore, pseudonymization of names was insufficient to eliminate re-identification risks, as attribution to data subjects remained possible even without direct identifiers.

Finally, the Court emphasized that it is not relevant whether the controller actually performs attribution; what matters is whether identification is objectively possible using reasonable means, including publicly available sources of information.

This case shows that regulators clearly distinguish between anonymization and pseudonymization, and that these measures must be implemented in a way that truly achieves their intended purpose. Otherwise, “attempted” pseudonymization or anonymization is merely an additional cost that achieves nothing, as demonstrated in this case: the Court upheld the first-instance decisions and the imposed financial penalties.

Strategic Benefits That Pseudonymization and Anonymization Bring to Organizations

The aforementioned case of improperly implemented pseudonymization does not mean that it is merely an unnecessary cost – quite the opposite. Successful implementation of pseudonymization and anonymization can bring numerous strategic benefits to companies.

Key Benefits of Pseudonymization

Below is an overview of the main benefits of pseudonymization as outlined in EDPB Guidelines 01/2025:

  • Reduction of security risks for data subjects – pseudonymization mitigates the risks and consequences of a potential personal data breach, such as data leaks and unauthorized access. Pseudonymized data itself is useless if the attacker does not also possess the additional information, i.e., the “key” needed for attribution. For this reason, additional information must be stored separately.
  • Improved processing accuracy – accuracy is one of the principles of the ZZPL, and pseudonymization reduces the number of processing errors. Using specific pseudonyms for individuals with similar personal data (for example, the names Ivan Ivanović and Ivana Ivanović can easily be confused) reduces the risk of incorrect attribution.
  • Barrier to unauthorized processing – if the pseudonymization domain is properly established, individuals with access to pseudonymized data will not be able to use it for unauthorized purposes. Even if someone within the domain intends to misuse the data, they will be technically prevented from doing so because they do not possess the additional information, i.e., the key required for identification.
  • Legal certainty in processing – pseudonymization allows attribution only in strictly defined cases and only to the necessary extent, while the rest of the database remains pseudonymized and therefore protected. In this way, the controller can demonstrate that it has implemented technical, organizational, and staff-related safeguards in accordance with the accountability principle under the ZZPL.

Key Benefits of Anonymization

Anonymization goes beyond pseudonymization. According to the ICO guidance on anonymization (Information Commissioner’s Office, the UK supervisory authority for data protection), the main benefit is that effectively anonymized data ceases to be personal data. Therefore, the ZZPL and GDPR no longer apply to its further use. More specifically:

  • Legal freedom and exit from the scope of the ZZPL and GDPR – since anonymized data no longer falls under the ZZPL/GDPR, organizations may store it indefinitely, making anonymization an ideal alternative to deletion, which is not always desirable and is often not even feasible.
  • Reduced security risks – since anonymization does not involve any “key” for decryption, even the most severe cyberattacks, data breaches, or internal misuse should not put at risk the individuals to whom the data originally related. Of course, controllers and processors still have an obligation to periodically assess whether new technologies or publicly available data could re-link anonymized data to data subjects, although with properly anonymized data this is considered unlikely.
  • Support for innovation and AI development – training artificial intelligence models and machine learning systems requires data, but not necessarily personal data, as these systems are primarily interested in patterns and trends. Anonymization makes this possible, enabling the development of modern solutions without constantly seeking new consent from data subjects and without the legal complexity and risks associated with potential further processing for other purposes.

Conclusion – Pseudonymization and Anonymization as Safeguards

I hope this article has helped clarify, at least to some extent, the difference between pseudonymization and anonymization and their treatment under the ZZPL and the GDPR.

In short, the essential difference is that pseudonymized data remains personal data because it can still be attributed. In contrast, anonymized data falls outside the scope of the Serbian Law on Personal Data Protection and the General Data Protection Regulation.

Of course, it is unlikely that controllers will always be able to choose between pseudonymization and anonymization, as these safeguards are rarely applicable in the same situations.

Pseudonymization and anonymization are not just “nice-to-have” high-tech options; they are safeguards that the law expects and that regulators take into account. These processes significantly enhance security and reduce the risks associated with personal data processing, especially in the wake of recent, far-too-common data breaches.