Privacy is often talked about as a personal preference — something that matters to cautious or private people, while bolder, more modern individuals simply accept that living online means sharing data. This framing does a disservice to what privacy actually is. Privacy is not about hiding things you are ashamed of. It is about maintaining the conditions under which autonomy, authentic relationships, and political freedom are possible. When information about you is collected, analyzed, and acted upon without your knowledge or meaningful consent, the power in the relationship between you and the organizations that hold that data shifts in ways that have concrete consequences.

The scale of modern data collection is genuinely difficult to comprehend. Every app you install, every website you visit, every search you make, every loyalty card you swipe, every location ping from your phone, every television program you watch through a streaming service — these generate data that is collected, aggregated, sold, and used to build profiles that are often more detailed and accurate than anything you have consciously provided about yourself. This is not abstract: these profiles are used to make decisions about what you pay for insurance, whether you receive a job interview, what political advertising you see, and what news reaches you.

Understanding data privacy requires engaging with its regulatory dimensions (what GDPR and CCPA require), its technical dimensions (what companies actually collect and how), its economic dimensions (the data broker industry), and its philosophical dimensions (why privacy is a collective rather than purely individual concern). This article addresses all four, with attention to where the language of 'privacy as preference' systematically obscures what is actually at stake.

"Surveillance is the business model of the internet." — Bruce Schneier

"Data is not the new oil — it is the new means of behavioral control." — adapted from Shoshana Zuboff, The Age of Surveillance Capitalism


Key Definitions

Personal data: Any information that relates to an identified or identifiable individual. Under GDPR, this is defined broadly — including name, email, IP address, location data, cookie identifiers, and inferred attributes.

GDPR: The General Data Protection Regulation, a European Union law effective since 2018 that governs the collection, processing, and storage of personal data of EU residents. Applies globally to any organization that processes EU residents' data.

CCPA: The California Consumer Privacy Act, in effect since 2020, providing California residents with rights to know what personal data is collected, to opt out of its sale, and to request deletion.

Data broker: A company that collects personal data from multiple sources, aggregates it into profiles, and sells or licenses those profiles to third parties. Typically operates without a direct relationship with the individuals whose data it holds.

Differential privacy: A mathematical framework providing formal privacy guarantees by adding calibrated noise to data analyses, ensuring that individual participation in a dataset cannot meaningfully increase the risk of that individual's data being revealed.


Regulation | Jurisdiction     | Key Rights Granted                                   | Applies To
-----------|------------------|------------------------------------------------------|------------------------------------------------
GDPR       | European Union   | Access, correction, deletion, portability, objection | Any org processing EU residents' data
CCPA/CPRA  | California, USA  | Know, delete, opt out of sale, non-discrimination    | Businesses above revenue/data thresholds
LGPD       | Brazil           | Similar to GDPR with local variations                | Organizations processing Brazilian residents' data
PIPEDA     | Canada           | Access, correction, accountability principles        | Private sector organizations in Canada
PDPA       | Singapore        | Access, correction, data portability (partial)       | Organizations processing Singapore residents' data

What Companies Actually Collect

The Visible Collection

Some data collection is visible and expected. When you create an account, you provide your name and email. When you make a purchase, you provide payment and shipping information. When you post on social media, you provide the content you choose to share. These forms of collection are straightforward, and most users are aware they are happening.

The privacy concern with visible collection is about what happens to that data after collection: how long it is retained, who it is shared with, how it is combined with other data, and what decisions it informs. A health app that asks for your date of birth, weight, and fitness goals has legitimate use for that data to provide its service — but the same data sold to insurance companies or employers creates entirely different implications.

The Invisible Collection

The more consequential data collection is largely invisible. Cross-site tracking through cookies and fingerprinting follows your behavior across websites you visit, building a behavioral profile that includes every category of content you read, every product you browse without buying, and your patterns of interest over time. Mobile advertising identifiers (the IDFA on iOS and GAID on Android) link your app usage to your device identity, enabling cross-app tracking.
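As a rough sketch of why fingerprinting works (the attribute names and values below are illustrative, not any real tracker's schema): individually innocuous browser and device attributes are near-unique in combination, so hashing them yields a stable identifier that survives cookie deletion.

```python
import hashlib

def fingerprint(attributes: dict) -> str:
    """Combine browser attributes into a stable pseudo-identifier.

    Illustrative only: real fingerprinting scripts hash dozens of
    signals (canvas rendering, installed fonts, audio stack, etc.).
    """
    canonical = "|".join(f"{k}={attributes[k]}" for k in sorted(attributes))
    return hashlib.sha256(canonical.encode()).hexdigest()[:16]

visitor = {
    "user_agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "screen": "2560x1440",
    "timezone": "America/New_York",
    "language": "en-US",
    "fonts": "Arial,Calibri,Helvetica",
}
# The same browser produces the same hash on every site that runs the
# script, so visits can be linked without any cookie being set.
print(fingerprint(visitor))
```

Because the identifier is derived rather than stored, clearing cookies does nothing; only changing the underlying attributes breaks the link.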

Location data is particularly revealing. From GPS data captured by apps and sold to data brokers, researchers and journalists have repeatedly reconstructed highly sensitive behavioral patterns: attendance at medical clinics, places of worship, political events, and immigration lawyers' offices. In 2019, The New York Times Privacy Project published location data analyses ('One Nation, Tracked') demonstrating that even 'anonymized' location datasets allowed individual identification with straightforward techniques.
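The re-identification logic is simple enough to sketch. The device trace and coordinates below are invented for illustration, but the approach mirrors what such reporting described: nighttime pings reveal a home, daytime pings a workplace, and the (home, work) pair is close to unique per person.

```python
from collections import Counter

# Hypothetical "anonymized" trace for one device: (hour_of_day, lat, lon).
trace = [
    (2, 40.7308, -73.9973), (3, 40.7308, -73.9973), (23, 40.7308, -73.9973),
    (10, 40.7580, -73.9855), (11, 40.7580, -73.9855), (15, 40.7580, -73.9855),
]

def infer_home_work(trace):
    """Most frequent nighttime location ~ home; daytime ~ workplace."""
    night = Counter((lat, lon) for h, lat, lon in trace if h >= 21 or h <= 6)
    day = Counter((lat, lon) for h, lat, lon in trace if 9 <= h <= 17)
    return night.most_common(1)[0][0], day.most_common(1)[0][0]

home, work = infer_home_work(trace)
# Joining the (home, work) pair against public address and employer
# records re-identifies the "anonymous" device with no special tooling.
```

Removing names from a location dataset does not anonymize it; the movement pattern itself is the identifier.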

Third-party pixels and SDKs (software development kits) embedded in apps send data to advertising networks, analytics companies, and data brokers whenever the app is used — often without the app developer's full knowledge of what is being transmitted. A 2018 study by researchers at Oxford found that the average app had 12 third-party trackers embedded.

The Inferred Data

Beyond what you explicitly share and what is invisibly collected, data brokers and advertising platforms generate inferred attributes — predictions about characteristics you have never disclosed. These include: estimated income and net worth, political affiliation, religious views, health conditions (inferred from search and purchase patterns), sexual orientation (inferred from app usage and content engagement), relationship status, and psychological traits.

Inferred attributes may be wrong — and when used in consequential decisions, errors are harmful. But there is no systematic mechanism for individuals to know what has been inferred about them, challenge incorrect inferences, or prevent those inferences from being used.


GDPR vs. CCPA: Rights and Limits

GDPR's Framework

The General Data Protection Regulation, effective May 2018, represents the most comprehensive data protection framework currently in force globally. Its core principles require: a lawful basis for processing (one of six enumerated bases, most commonly freely given, informed consent or a demonstrated legitimate interest); data minimization (collecting only what is necessary); purpose limitation (using data only for the purpose for which it was collected); storage limitation (retaining data only as long as necessary); and data security.

GDPR grants individuals enforceable rights: the right to access data held about them, the right to correction, the right to erasure (right to be forgotten), the right to data portability, the right to object to processing, and rights related to automated decision-making. Violations can result in fines of up to 20 million euros or 4% of global annual revenue, whichever is higher.
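The fine cap described above reduces to a one-line calculation (amounts per GDPR Article 83(5); this is an arithmetic illustration, not legal advice):

```python
def gdpr_max_fine(global_annual_revenue_eur: float) -> float:
    """Upper bound on a GDPR fine for the most serious violations:
    20 million EUR or 4% of global annual revenue, whichever is higher."""
    return max(20_000_000, 0.04 * global_annual_revenue_eur)

# For a company with 100 billion EUR global revenue, the cap is 4 billion:
print(gdpr_max_fine(100e9))  # 4000000000.0
# For a company with 50 million EUR revenue, the 20 million floor applies:
print(gdpr_max_fine(50_000_000))  # 20000000
```

The "whichever is higher" structure is the point: the percentage term means the cap scales with company size, so the largest platforms cannot treat fines as a fixed cost.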

GDPR enforcement has been uneven — Ireland's Data Protection Commission, which is responsible for many of the largest technology companies due to their European headquarters being in Dublin, has been criticized for slow and weak enforcement. But the regulation has had real effects: cookie consent requirements changed the web's user experience globally, data breach notification became standard, and the GDPR framework influenced privacy legislation worldwide.

The largest GDPR fine to date was 1.2 billion euros against Meta (Facebook) by the Irish DPC in 2023, related to transfers of EU user data to the United States without adequate protection mechanisms.

CCPA's Approach

The California Consumer Privacy Act, effective January 2020 and strengthened by the California Privacy Rights Act (CPRA) in 2023, takes a different approach. Where GDPR is comprehensive and rights-based, CCPA is narrower and opt-out oriented. It gives California residents the right to know what personal information businesses collect about them and why, to delete personal information held by businesses, to opt out of the sale of their personal information, and not to be discriminated against for exercising these rights.

CCPA applies only to for-profit businesses that meet threshold criteria (annual revenue above $25 million, or that buy/sell/receive/share personal information on more than 100,000 consumers or households annually). Many small businesses and nonprofit organizations are exempt.
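The applicability test just described can be sketched as a simple predicate (simplified: the statute contains further criteria and exemptions not modeled here):

```python
def ccpa_applies(annual_revenue_usd: float, consumers_data_traded: int) -> bool:
    """Simplified sketch of the CCPA/CPRA applicability thresholds:
    annual revenue above $25 million, OR personal information bought,
    sold, received, or shared on more than 100,000 consumers or
    households annually. Real applicability analysis needs counsel.
    """
    return annual_revenue_usd > 25_000_000 or consumers_data_traded > 100_000

print(ccpa_applies(30_000_000, 5_000))   # True: revenue threshold met
print(ccpa_applies(1_000_000, 50_000))   # False: small business, modest data volume
```

Note the disjunction: a tiny company that trades heavily in personal data is covered even with negligible revenue, while a large business cannot escape coverage by outsourcing its data handling.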

The limitation of the opt-out model is behavioral: most people never exercise opt-out rights, either because they are unaware of them, because the process is made deliberately inconvenient, or because the interface design (dark patterns) makes opting out more difficult than staying opted in. Research by Lorrie Faith Cranor and colleagues at Carnegie Mellon has documented extensively how interface design affects privacy choice exercise rates.


The Data Broker Industry

Scale and Structure

The data broker industry processes and sells information on hundreds of millions of individuals. The largest players — Acxiom, Experian, Equifax (through consumer-marketing operations distinct from its credit-reporting function), Epsilon, Nielsen, LexisNexis, and Verisk — are relatively well known within the industry. Below them are thousands of smaller brokers specializing in particular data types or markets.

A typical Acxiom profile may include over 3,000 data attributes per individual: name, address history, phone numbers, email addresses, vehicle ownership, estimated income, household composition, political affiliation, religious affiliation, purchase history categories, health interest categories, media consumption patterns, and more. These profiles are sold to direct marketers, financial institutions, healthcare marketers, political campaigns, and — increasingly — law enforcement agencies.

Law Enforcement Use

The Fourth Amendment to the U.S. Constitution protects against unreasonable search and seizure and generally requires a warrant for law enforcement to obtain communications content. It does not apply to information voluntarily shared with third parties — the 'third-party doctrine' established in the Supreme Court cases Smith v. Maryland (1979) and United States v. Miller (1976). Carpenter v. United States (2018) narrowed the doctrine for compelled disclosure of cell-site location records, but it did not address data the government simply buys.

Law enforcement agencies have exploited this by purchasing data from data brokers rather than seeking warrants. The purchase of location data, purchase histories, and behavioral profiles from commercial data brokers requires no judicial oversight. Several agencies including Immigration and Customs Enforcement (ICE) and the Defense Intelligence Agency have contracted directly with data brokers for access to commercial data pools, a practice that the Electronic Frontier Foundation and ACLU have challenged as an unconstitutional workaround to warrant requirements.


Differential Privacy: A Technical Approach

The Privacy-Utility Tradeoff

Traditional statistical disclosure limitation techniques — publishing aggregated data, suppressing small cell counts, generalizing geographic information — provide imperfect privacy protection because sufficiently motivated analysts can often reconstruct individual records through combination attacks.
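A classic combination attack is the differencing attack, in which two innocent-looking aggregate queries combine to reveal a single individual's record. The names and salaries below are invented for illustration:

```python
# Differencing attack: aggregate-only access still leaks one person's record.
salaries = {"alice": 82_000, "bob": 61_000, "carol": 75_000, "dave": 58_000}

def avg_salary(exclude=None):
    """An 'aggregates only' interface: returns an average, never a row."""
    vals = [v for k, v in salaries.items() if k != exclude]
    return sum(vals) / len(vals)

n = len(salaries)
# Query 1: average over everyone. Query 2: average over everyone but Alice.
leaked_alice = avg_salary() * n - avg_salary(exclude="alice") * (n - 1)
# leaked_alice is (up to floating point) exactly 82000: Alice's salary,
# reconstructed without ever querying an individual record.
```

This is why publishing only aggregates is not, by itself, a privacy guarantee: any mechanism that answers queries exactly can be differenced.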

Differential privacy, developed formally by Cynthia Dwork and colleagues at Microsoft Research in 2006, provides a mathematically rigorous framework. A mechanism satisfies differential privacy if the output of the mechanism is essentially the same whether or not any single individual's data is included. This is achieved by adding calibrated random noise to query results — the noise is chosen to be large enough to mask individual contributions but small enough to preserve aggregate accuracy.
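A minimal sketch of the Laplace mechanism for a counting query, where the sensitivity is 1 because adding or removing one person changes the true count by at most 1. This is illustrative, not a production implementation; real deployments must also track a privacy budget across repeated queries.

```python
import math
import random

def private_count(true_count: int, epsilon: float) -> float:
    """Laplace mechanism for a counting query (sensitivity 1).

    Adds Laplace(0, 1/epsilon) noise, sampled via the inverse CDF, so
    that adding or removing any one person's record changes the output
    distribution by at most a factor of e**epsilon.
    """
    u = random.random() - 0.5  # uniform on [-0.5, 0.5)
    noise = -(1.0 / epsilon) * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return true_count + noise

# "How many patients visited the clinic?" True answer 1234. With
# epsilon = 0.5 the published answer is typically off by only a few
# counts, enough to mask any single person's presence or absence.
noisy = private_count(1234, epsilon=0.5)
```

Smaller epsilon means more noise and stronger privacy; larger epsilon means more accuracy and weaker privacy. That single parameter is the privacy-utility tradeoff made explicit.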

Real-World Implementations

Apple has used differential privacy since iOS 10 to collect aggregate usage statistics — which keyboard words are most common, which emoji are used most, which websites use the most battery — without being able to attribute any individual usage pattern to a specific user.
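Apple's production mechanisms are considerably more elaborate, but the classic primitive behind local differential privacy of this kind is randomized response, which can be sketched in a few lines (the probability parameter here is illustrative):

```python
import random

def randomized_response(truth: bool, p_honest: float = 0.75) -> bool:
    """Report honestly with probability p_honest; otherwise flip a fair coin.

    Any single report is plausibly deniable (it may be coin noise), yet
    the population rate remains recoverable from many reports.
    """
    if random.random() < p_honest:
        return truth
    return random.random() < 0.5

def estimate_rate(reports, p_honest: float = 0.75) -> float:
    """Invert the noise: E[report] = p_honest * rate + (1 - p_honest) * 0.5."""
    observed = sum(reports) / len(reports)
    return (observed - (1.0 - p_honest) * 0.5) / p_honest
```

With, say, 100,000 simulated users of whom 30% truly have some attribute, the estimate typically lands within a fraction of a percentage point of 0.30, even though no individual report can be trusted. The noise is added on the device, so the collector never sees a raw answer at all — the defining property of the local model.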

The U.S. Census Bureau deployed differential privacy for the 2020 Census, using it to protect individual household records in published tabulations while preserving accuracy for aggregate statistics. The transition was controversial among researchers who relied on Census data, as the added noise reduced accuracy for small geographic areas and subpopulations. The episode illustrated that differential privacy involves genuine tradeoffs between privacy protection and data utility.

Google's RAPPOR system uses a local form of differential privacy (built on randomized response) to collect statistics from Chrome users' browser settings, and Google drew on related techniques in its Federated Learning of Cohorts (FLoC) advertising proposal — though FLoC was abandoned in 2022 following significant criticism and replaced within the Privacy Sandbox by the Topics API.


Why Privacy Is a Power Issue

Surveillance Capitalism

Shoshana Zuboff, professor at Harvard Business School, developed the concept of surveillance capitalism in her 2019 book of the same name. Her argument is that the major internet platforms discovered a new economic logic: behavioral data generated by users' online activities has value not just for improving services but as raw material for predicting — and influencing — behavior. The product sold to advertisers is not just ad placement; it is access to a behavioral modification apparatus.

The power asymmetry this creates is significant: users generate the behavioral data, companies analyze it, and the resulting predictions are used to influence user behavior in ways users do not see and cannot easily resist. Zuboff argues this represents a fundamental challenge to human autonomy, not merely a privacy concern.

Privacy and Political Freedom

Surveillance enables repression at the political level. Knowing who communicates with whom, what people believe, and where they gather allows authoritarian governments to identify and preemptively suppress opposition. The same tools and data infrastructure that enable commercial behavioral targeting can be repurposed for political targeting.

The journalist Glenn Greenwald, who reported on Edward Snowden's NSA revelations in 2013, argued that surveillance has a chilling effect on behavior even when people are doing nothing wrong — the knowledge of observation changes what people say, search, and associate with. This chilling effect is the mechanism by which surveillance constrains freedom even absent direct punishment.

Privacy protection, in this frame, is not about individual preference but about maintaining the structural conditions for free society: the ability to think, communicate, assemble, and dissent without systematic behavioral monitoring.


Practical Takeaways

For individuals: understanding what rights apply in your jurisdiction (GDPR if you are in the EU, CCPA if you are in California, with similar regulations emerging in other states and countries) and knowing that you can exercise them — submitting data access requests, deletion requests, and opt-outs from data sales — creates some measure of practical control. Tools like browser extensions (uBlock Origin, Privacy Badger), privacy-focused search engines (DuckDuckGo), and privacy-respecting email providers (Proton Mail) reduce the volume of tracking data generated.

The deeper point is structural: meaningful privacy in the current environment requires both individual practice and collective action — regulatory pressure, enforcement, and norms that treat data collection with genuine scrutiny rather than passive acceptance.


References

  1. Zuboff, S. (2019). The Age of Surveillance Capitalism: The Fight for a Human Future at the New Frontier of Power. PublicAffairs.
  2. European Union. (2016). Regulation (EU) 2016/679 (GDPR). Official Journal of the European Union.
  3. California Attorney General. (2020). California Consumer Privacy Act: Text of Statute. California DOJ.
  4. Dwork, C., et al. (2006). 'Calibrating noise to sensitivity in private data analysis.' Theory of Cryptography Conference, 265-284.
  5. New York Times Privacy Project. (2019). One Nation, Tracked. The New York Times, December 19.
  6. Cranor, L. F. (2012). 'Necessary but not sufficient: Standardized mechanisms for privacy notice and choice.' Journal on Telecommunications and High Technology Law, 10(2), 273-308.
  7. Reardon, J., et al. (2019). '50 Ways to Leak Your Data: An Exploration of Apps' Circumvention of the Android Permissions System.' USENIX Security Symposium.
  8. Electronic Frontier Foundation. (2023). Government Purchases of Personal Data. EFF Deeplinks.
  9. Greenwald, G. (2014). No Place to Hide: Edward Snowden, the NSA, and the U.S. Surveillance State. Metropolitan Books.
  10. Solove, D. J. (2011). Nothing to Hide: The False Tradeoff Between Privacy and Security. Yale University Press.
  11. Irish Data Protection Commission. (2023). Decision on Meta (Facebook) Data Transfer. DPC.ie.
  12. Warren, S. D., & Brandeis, L. D. (1890). 'The right to privacy.' Harvard Law Review, 4(5), 193-220.

Frequently Asked Questions

What is the difference between GDPR and CCPA?

GDPR (General Data Protection Regulation) is a European Union regulation that applies to any organization processing the personal data of EU residents, regardless of where the organization is located. It requires a lawful basis such as explicit, informed consent for data collection, gives individuals rights to access, correct, and delete their data, mandates data breach notification within 72 hours, and imposes significant fines for violations (up to 4% of global annual revenue). CCPA (California Consumer Privacy Act) is a California state law with similar principles but narrower scope: it protects only California residents and applies only to businesses that meet certain revenue or data-volume thresholds. GDPR is generally considered more comprehensive and stringent; CCPA is seen as more business-friendly, with broader exemptions.

What are data brokers and why are they a privacy concern?

Data brokers are companies that collect, aggregate, and sell personal data about individuals, often without those individuals' knowledge. They compile information from public records, social media, loyalty card programs, mobile apps, website tracking, and purchased datasets into detailed profiles covering names, addresses, phone numbers, income estimates, political views, health interests, relationship status, and behavioral patterns. Companies like Acxiom, Experian, and LexisNexis maintain profiles on hundreds of millions of people. This data is sold to advertisers, employers, insurers, landlords, and law enforcement. The privacy concern is that people have no meaningful ability to see, correct, or remove information that is used to make consequential decisions about them.

What is differential privacy?

Differential privacy is a mathematical framework for analyzing datasets in ways that protect individual privacy. It works by adding carefully calibrated statistical noise to results, ensuring that the output of an analysis does not change meaningfully whether or not any specific individual's data is included. This gives individuals a formal privacy guarantee: their participation in a dataset does not meaningfully increase the risk of their specific information being revealed. Apple uses differential privacy to collect aggregate usage statistics from devices, and the U.S. Census Bureau deployed differential privacy for the 2020 Census to protect individual records while preserving the accuracy of aggregate statistics. It represents the most rigorous technical approach to the privacy-utility tradeoff.

What is the right to be forgotten?

The right to be forgotten (formally called the 'right to erasure' in GDPR Article 17) gives individuals the right to request that an organization delete their personal data when it is no longer necessary for the original purpose, when consent is withdrawn, or when the data has been processed unlawfully. The concept gained major legal standing through the 2014 European Court of Justice ruling in Google Spain v. AEPD and Mario Costeja Gonzalez, which established that individuals could request search engines to de-index certain results about them. The right is not absolute — it can be overridden by public interest, freedom of expression, or legal obligations — but it represents a significant check on permanent digital records and has been exercised millions of times against major search engines.

Why is privacy described as a power issue rather than just a preference?

Privacy as a preference frames it as a matter of personal comfort — some people care about it, others do not. Privacy as a power issue recognizes that information asymmetry creates control. When companies or governments know far more about you than you know about them, they can predict and influence your behavior in ways you are unaware of. Shoshana Zuboff's concept of 'surveillance capitalism' describes how behavioral data is used not just to predict but to modify behavior at scale. At a political level, surveillance enables repression: knowing who communicates with whom, what people believe, and where they go allows authoritarian governments to identify and target dissidents. Privacy is a precondition for autonomy, political freedom, and the ability to develop ideas and identities without constant observation.