A Theory of Data with Practical Legal Implications

October 18, 2021


Data is considered one of the most valuable asset classes for the twenty-first-century enterprise.[1] This awareness will intensify as data continues to fuel the digital economy and stimulate multi-billion-dollar transactions. Yet, data’s value proposition is just one concern in debates about data’s role in modern life. Other concerns include personal and social issues related to privacy, access to information,[2] and freedom of expression.[3] These concerns reflect conflicting perspectives, which meld together to obscure data’s proper legal treatment.[4] As a result, recent legal discourse has focused on how to strike the right balance between competing interests when formulating a data governance regime and, more specifically, accompanying rights in data.[5]

Rights in data are widely misunderstood by attorneys and non-attorneys alike.[6] This misunderstanding is excusable given politicians’ and scholars’ failure to develop a comprehensive legal framework for dealing with rights in data (properly understood).[7] Without such a framework, the legal community cannot articulate coherent, practical and theoretically sound data rights, because the intellectual scaffolding for doing so is missing. This article aims to fill that void.

This article examines data’s nature, data’s relation to its associations (i.e., data’s elements, means of expression, processing, and accuracy), and data’s legal treatment in practice. In doing so, this article’s primary contribution is to present a systematic way to think about data and its associations that is useful for delineating related rights and conducting further legal analysis. More broadly, this article provides a logical framework upon which coherent data policies can be built.

This article proceeds by presenting a definition for data and distinguishing data from its associations to facilitate legal analysis.


Policymakers must articulate data’s nature and relation to other concepts (labeled “associations”)[8] to successfully evaluate the merits of its legal treatment, both in practice and in theory. That effort is nothing new, as philosophers have wrestled with models of information since at least the Hellenistic period.[9] Since then, enduring questions have tended to center on how information (or, more precisely, data),[10] as a distinct concept, stands in relation to its content, expression, use, and accuracy.[11] The discussion below tackles these questions for a clear functional purpose: to distill data’s meaning, distinguish its many associations, and thereby outline key concepts to assist further legal inquiries.

A. Data Defined

In a legal sense,[12] data is an abstract proposition regarding an object of consideration’s state. The state may designate, for example, that object’s act, composition, kind, quality, relation to other objects, or position within a system for coordinating perceptions.[13] When the object of consideration is placed in that context, the resulting idea is subject to factual examination in theory but must be expressed in a comprehensible way to be utilized in practice. For this reason, data is often expressed in a physical medium that facilitates processing activities, such as storage and transmission. While such expressions are what intelligent beings and processes technically interact with when dealing with data, data is an idea independent of any physical manifestation.[14]

B. Data’s Associations

The above notion is useful for delineating rights and obligations related to data because it distinguishes data from data’s legally significant associations, including: (1) data’s elements, especially the data subject; (2) the means of expressing data, both in a general sense (i.e., mode of expression) and in a particular instance (i.e., medium of expression); (3) the analytic operations performed on data directly (i.e., reasoning) and indirectly via its physical manifestations (i.e., processing); and (4) the connection between data and the empirical universe or some defined logical system (i.e., accuracy). Each such association and its basic legal significance are discussed below, in turn.

1. The Elements of Data

For an idea to be considered data under the above definition,[15] it must both: (i) reference an object of consideration,[16] and (ii) posit something about that object’s state.[17] Both of these elements are essential. Without context, the object of consideration is a mere theoretical placeholder.[18] Without reference to an object, a potential state is a mere instrument of comprehension.[19] Thus, the object of consideration must be put in context (or, said differently, the context must be anchored by the object of consideration) to conceive a proposition subject to factual examination that is properly called “data.”
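For readers who think in data-modeling terms, the two-element test above can be sketched in a few lines of Python. This is only an illustrative sketch, not part of the article's legal argument: the names `DataPoint`, `subject`, `posited_state`, and `is_data` are hypothetical labels chosen for the example, and representing a state as a simple mapping of categories to options is an assumption made for brevity.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class DataPoint:
    """Hypothetical model: an object of consideration plus a posited state."""
    subject: Optional[str]                      # the object of consideration
    posited_state: Optional[Mapping[str, str]]  # what is claimed about that object


def is_data(candidate: DataPoint) -> bool:
    """Return True only when both essential elements are present."""
    has_subject = bool(candidate.subject)
    has_state = bool(candidate.posited_state)
    return has_subject and has_state


# "The red truck": an object placed in context.
red_truck = DataPoint(subject="truck", posited_state={"color": "red"})
# A standalone variable: an object with no context (a mere placeholder).
naked_variable = DataPoint(subject="x", posited_state=None)
# A standalone concept of red: a potential state with no object.
standalone_red = DataPoint(subject=None, posited_state={"color": "red"})
```

Under these assumptions, `is_data(red_truck)` is true, while the standalone variable and the standalone color each fail the test because one of the two elements is missing.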

(i) The Data Subject

For particular data points,[20] the object of consideration is the data subject.[21] The data subject may reference someone or something that exists in reality or that acts as a reference point in some hypothetical system.[22] For this reason, a data point may not completely articulate every feature of the data subject’s discoverable state and, thus, may not serve to fully identify the data subject in a broader sense.[23] However, when a data point is combined with a set of related data points and considered in the right context, enough information may be presented to examine the data subject in more detail and with superior empirical accuracy. This feature expands a data point’s potential uses by increasing its explanatory power.[24]

The relationship between a data point and its data subject is legally significant due to its potential real-world connections. The data subject may be an actual person or a thing in which an actual person has an interest.[25] So, data points may have direct links to actual persons, who hold legal rights and have politically significant expectations. This link is a concern for two diverging reasons: (a) information revealed by data points may be used in a way that harms related persons; and (b) another’s access to that information may serve some acceptable, or even socially desirable, function.[26] Whether these reasons for concern apply to a particular data point ultimately depends on the information it reveals about the data subject (i.e., the posited state) and how that information can be used.

(ii) The Posited State

Like the data subject, the posited state articulated by a data point may or may not align with reality or some logical system. That alignment ultimately depends on the data subject, as the data subject anchors the posited state to the empirical or logical environment in which the hypothesized situation is examined. Consequently, this examination into the posited state is often secondary to an inquiry into the nature of the data subject.

The relationship between a data point and the posited state is legally significant due to the real-world implications of its accuracy. Actual persons or processes may rely on the posited state’s factual accuracy when utilizing the related data. As a result, processors (or their controllers) may be harmed when they process data that does not align with expectations for factual accuracy. Similarly, another’s processing of inaccurate data may unfairly harm data subjects or those with interests in data subjects. Thus, concerns regarding data subjects and posited states are intertwined.

In short, data’s elements each bring about legal concerns that help to shape the broader issues revolving around data, namely use and accuracy. However, before fully considering how data is used or the extent to which it can be considered accurate, it is important to consider how data is expressed and, thereby, comprehended.

2. The Means of Expressing Data

Data is an abstract proposition that must be expressed to be comprehended. There are two levels to this requirement. First, in a general sense, data must be expressed in a mode of expression that is theoretically comprehensible, like linguistic or pictorial representation. Second, in a particular instance, data must be expressed through an experiential/tangible medium of expression that facilitates actual comprehension, such as a digital file or painting. Each of these levels presents unique concerns from a policy standpoint.

(i) The Mode of Expression

As for the mode of expression, its connection to data is tenuous. Firstly, the same data may be expressed through various modes of expression and, thus, no particular mode of expression is essential. Secondly, a particular mode of expression may not convey the same data to all processors in all contexts and, thus, modes of expression are imperfect. Finally, a mode of expression may provide tangential utility independent of the data it conveys (or is designed to convey), such as aesthetic or functional utility.

For the above reasons, the social and economic value of modes of expression is not necessarily tied to the data such modes convey (or are designed to convey). Instead, modes of expression, as a distinct form of association, tend to be valued for their aesthetic or functional qualities.[27] Thus, legal regimes governing the modes of expression are often unconcerned with the protection of related data.[28]

The modes of expressing data are legally significant because their creation necessarily involves some effort and resource allocation.[29] Despite such an investment, an author who expresses data in a particular mode may not realize the full benefit of that expression, because a mode of expression (distinct from any physical manifestation) is technically non-excludable, can be enjoyed by others at relatively little marginal cost, and has utility associated with positive externalities. Thus, the law may need to step in to incentivize the creation of modes of expression to achieve an optimal level of development.

(ii) The Medium of Expression

As for the medium of expression, its connection to data is secondary to practical considerations regarding its tangible form. Firstly, a medium of expression is subject to possession, allowing for potential access to or exclusion of the expressed data.[30] Secondly, a medium of expression facilitates isolated processing activities, with no bearing on data processing that utilizes other mediums of expression. Finally, the medium of expression’s connection to the data it conveys (or is meant to convey) is dependent on the embodied mode of expression, especially to the extent it embodies a mode of expression of particular aesthetic or functional qualities. Thus, the medium of expression presents concerns that precede (and may supersede) concerns regarding expressed data.

For the above reasons, a medium of expression’s social and economic value may correlate with the expressed data’s use value, but such correlation depends on extraneous factors. This feature complicates the task of coordinating policies for protecting data with policies for protecting the mediums of expressing data. Thus, legal regimes directly governing mediums of expression as property are distinguishable from legal regimes indirectly governing expressed data based on tort law.[31]

Any medium of expressing data is legally significant because it is composed of tangible resources that are susceptible to allocation.[32] Like all other tangible resources, a medium of expression: (a) is finite; (b) may be possessed, transferred, and excluded from access; and (c) has inherent value, at least in terms of opportunity costs. Thus, the law must provide a scheme for allocating such resources and governing their use.

3. Data Processing

Data’s connection to its medium of expression facilitates processing activities, in the broad sense. Processing activities include: (1) capturing data in a medium of expression; (2) storing the captured data in one or more mediums of expression; (3) transferring expressed data to another’s possession through a medium of expression; (4) destroying an instance of expressed data by destroying its medium of expression; and (5) using expressed data to make a decision, capture new data, or for some other purpose. Each of these processing activities presents unique policy concerns and is discussed below, in turn.

(i) Data Capture

Data is captured when it is expressed through a comprehensible medium. Data capture may occur through different processes, such as automated processes (e.g., IoT device recordings), intentional processes (e.g., someone taking a photograph), or biological processes (e.g., human experience). However, regardless of the process, data capture’s result is always (by definition) of the same kind: the representation of data in a comprehensible manner.

Data capture, as a distinct processing activity, is legally significant because it may involve an invasion of privacy or proprietary rights.[33] In this sense, the concern is that the capture method may involve an invasion of another’s reasonable expectation of privacy or a physical trespass on another’s tangible property.[34] This concern may not fully capture an aggrieved person’s ultimate worry about the content of captured data and how it will be used, but the concrete actions associated with data capture necessarily precede and dictate potential data uses. Thus, the legal concern with data capture (at the point of capture) remains with how data is captured (rather than the captured data’s content), albeit with an eye towards why data is captured and how the captured data will be further processed.[35]

(ii) Data Storage

Captured data may be stored in one or more mediums of expression. In this sense, data storage is the fixation of a mode of expression in a tangible medium that facilitates an intelligent being’s or intelligent process’s comprehension of expressed data. Although data is technically stored at the moment of capture, this article focuses on the types of data storage that enable expressed data to be comprehended by more than one individual or process through experiential or technical means.[36] This focus places processing activities like creating copies of stored data (whether perfect or imperfect) under the umbrella of data storage.

These types of data storage, as distinct processing activities, are legally significant because they cause the expressed data to be susceptible to unauthorized processing.[37] In this sense, these types of data storage present security concerns related to expressed data. For this reason, those who control or possess stored data may be required to implement administrative, physical or technical safeguards to protect it.[38] Thus, the legal concern here is with stored data’s susceptibility to processing by different processors.[39]

(iii) Data Transfer

Data may be transferred among processors through mediums of expression. Data transfers may involve the physical transfer of data storage media (e.g., handing over a USB flash drive containing a digital file) or the use of one or more transient mediums of expression (e.g., sending a smoke signal).[40] This conception of data transfers includes any manner in which data may be intentionally communicated to another processor, including verbal disclosures, but does not include mere alterations in data storage (alone). Thus, the primary focus of data transfers is on the processors (i.e., transferors and transferees), not the utilized transfer method.[41]

Data transfer, as a distinct processing activity, is legally significant because it allows intended transferees to further process the expressed data in unauthorized or unanticipated ways.[42] The concern is not that the transfer may not be secure (which is technically a data storage concern) but, instead, that the intentional transfer itself may be unauthorized by the rightful controller or otherwise induced by improper means.[43] For example, the transferor may be contractually prohibited from disclosing the expressed data to its intended transferee, or an intended transferee may have received disclosure authorization through fraudulent means. Thus, the legal concern here is with control over who is given the ability to further process expressed data.

(iv) Data Destruction

An instance of expressed data is destroyed when the medium of its expression ceases to exist or is no longer capable of facilitating its comprehension (in an irreversible way). This concept is not to say that the abstract proposition is destroyed or that the expressed data cannot survive in other mediums of expression. Instead, the concept is that an intelligent being or process can no longer comprehend the data that was once expressed through a now non-existent or irreversibly altered medium of expression.

Data destruction, as a distinct processing activity, is legally significant because it may cause the loss of all reliable forms of expressed data.[44] As a practical matter, it may be costly or impossible to capture the same data by alternate, acceptable means. Such a scenario can be problematic when the lost data helps processors to understand a broader context or to complete a desired task. Thus, the legal concern here is with loss of the once-expressed data’s processing potential.

(v) Data Use

Data capture, storage, transfer and destruction are all secondary to data use, which involves inputting expressed data into an analytical process that yields some ultimate result (e.g., a decision). Consequently, the important feature that distinguishes use from other processing activities is that use (i.e., intentional reasoning with particular data points) operates on data directly, rather than indirectly through data’s medium of expression. Although tangible expression is a prerequisite to data use, data (the abstract proposition) is the ultimate raw material for such usage via analytical processes. In this sense, data’s end is usage.

The above understanding is legally significant because it pinpoints the primary concern regarding data processing in general as well as the practical concern regarding data itself in debates over the proper legal regime for governing it. In each case, the concern is control over how data (regardless of its factual accuracy) is used or can be used.[45] While there may be an impulse to attempt to ground rights in data separated from its medium of expression, doing so ignores the fact that concerns rest with data usage, not data in a vacuum, and usage relies on mediums of expression. Thus, the legal concern here is with the practical control of outcomes of processing data, independent of its content or its factual accuracy.

4. Factual Examinations of Data

Data is subject to factual examination in theory. As an abstract proposition, a data point presents discrete positive (or negative) claims regarding a data subject’s state. Such claims may be tested in the empirical or logical environment in which the data subject is positioned.[46] The focus of such an examination depends on the sense in which data may be considered accurate.

(i) Empirical Accuracy

Data may be accurate in the empirical sense, meaning the data subject’s posited state comports with experiential reality. Experiential reality, however, does not lend itself to infallible examination.[47] Instead, the empirical accuracy of any given data point is often clouded by diverse or incomplete perceptions regarding the expressed data or relevant circumstances.[48]

This type of accuracy is also somewhat misleading in practice because it assumes that instances of expressed data will be interpreted in a certain way. However, data must be expressed to be comprehended, and expressions often convey different messages to different processors.[49] Consequently, when discussing the empirical accuracy of data, it is often important to view the expressed data from an identified audience’s perspective.[50] This shift in focus has the odd result of bringing a hypothetical inquiry to bear on an empirical question. Thus, for practical purposes, empirical accuracy may not be as definitive as the term suggests and often cannot be simplified to a binary choice between “true” and “false.”[51]

The above understanding is legally significant for three main reasons. First, expectations regarding the empirical accuracy of data often determine how such data will be processed, raising the concern that false data will be used to a processor’s or other related person’s detriment. Second, different processors may attribute different data to the same mode of expression, creating an opening for ambiguous or misleading modes of expression to propagate. Third, a practical evaluation of empirical accuracy is inherently procedural and does not lend itself to absolute truths.[52] For these reasons, the law may be concerned about the empirical accuracy of both the data meant to be conveyed and the data actually conveyed with a given mode of expression, but must address such concerns indirectly through procedural means.

(ii) Logical Accuracy

Data may be accurate in the logical sense, meaning the data subject’s posited state comports with the logical implications of given assumptions.[53] Such assumptions may specify parameters for some conclusion that triggers a pre-determined consequence outside of the logical system.[54] Thus, logical implications of data may yield practical outcomes.

This type of accuracy places the assumptions and processes of the underlying logical system at the forefront of consideration. Assuming empirical accuracy (if necessary), logical accuracy or inaccuracy will necessarily follow from given assumptions and processes. Because that accuracy may have practical consequences, the design of the logical system will be of utmost concern.

The above understanding is legally significant, because logical systems are often used to make economically, politically and socially important decisions, albeit sometimes covertly.[55] The assumptions and processes used by such systems may treat a data subject’s characteristic in a way that society views as illegitimate or unfair.[56] Thus, the law may have an interest in restricting the types of assumptions or processes used to make important decisions or, at least, exposing such assumptions and processes.[57]

C. Policy Questions

The above discussion regarding the theoretical nature of data and its associations presents important questions regarding how data should be treated in practice. Central to these questions has been the issue of whether data can be considered a form of property and, thus, the object of rights distinct from any rights regarding its associations.[58] This issue, however, yields an even deeper, often ignored, issue: whether data is even susceptible to a rational legal regime’s direct governance. I plan to explore this issue and related legal concerns in a future publication.

Parker Smith
CoreServe Legal
Technology Law Committee Chair

Written on Behalf of the Technology Law Committee



© 2021 Parker N. Smith

* - Parker N. Smith is an attorney and the founder of CoreServe Legal, LLC, a law firm based in New Orleans, LA. Parker’s practice primarily focuses on helping clients with intellectual property and information technology transactions and advising clients on ancillary matters related to technological innovation, data privacy, and analytics. He currently serves as the Chair of the New Orleans Bar Association’s Technology Law Committee.

[1]       The World’s Most Valuable Resource is No Longer Oil, but Data, The Economist (May 6, 2017), https://www.economist.com/leaders/2017/05/06/the-worlds-most-valuable-resource-is-no-longer-oil-but-data.

[2]       See, e.g., Rebecca Wexler, Privacy as Privilege: The Stored Communications Act and Internet Evidence, 134 Harv. L. Rev. 2721 (2021) (discussing case law regarding access to social media communications under the Stored Communications Act (SCA), 18 U.S.C. §§ 2701–2712). Another example is the difficulty of certifying the reproducibility of scientific studies that use restricted data. Christophe Pérignon et al., Certify Reproducibility with Confidential Data, Sci., July 12, 2019, at 127.

[3]       See, e.g., Rebecca Tushnet, Trademark Law as Commercial Speech Regulation, 58 S.C. L. Rev. 737 (2007).

[4]       See generally Lothar Determann, No One Owns Data, 70 Hastings L.J. 1, 6 (2018).

[5]       See, e.g., Alexander Tsesis, Data Subject’s Privacy Rights: Regulation of Personal Data Retention and Erasure, 90 U. Colo. L. Rev. 595 (2019) (discussing the tension between European-style privacy rights and the First Amendment).

[6]       See Thomas M. Boyd & Tara Sugiyama Potashnik, Data Ownership: The Suitability of a Consumer Property Right in a 21st Century Economy, at 3 (Sept. 29, 2020), available at https://www.venable.com/insights/publications/2020/09/data-ownership (“The frequent comingling of ‘control’ and ‘ownership’ in data discussions reveals a fundamental misunderstanding by some policymakers and the public at large about the actual legal meaning of ‘ownership.’”).

[7]       Although some legal regimes deal with data privacy issues comprehensively, data privacy issues are merely a subset of potential data rights issues, more generally.

[8]       This article uses the term “associations” to discuss concepts that are dependent on the abstract concept of data (as defined in Part I.A) but are not, strictly speaking, within that concept’s scope. In this sense, data and its associations are both distinct and interdependent.

[9]       See, e.g., Aristotle, Introductory Readings (Terence Irwin & Gail Fine trans., Hackett Publishing Co. ed. 1996).

[10]     Here, data essentially represents discrete forms of information that have a propositional feature, ignoring broader conceptions of information that include the universe of potentially discoverable facts and conceivable ideas.

[11]     See generally Epistemology, History of, Encyclopedia of Philosophy, available at https://www.encyclopedia.com/humanities/encyclopedias-almanacs-transcripts-and-maps/epistemology-history (last updated Sept. 23, 2021) (providing a historical account of epistemology, including concepts related to information’s content, features, and truth).

[12]     This article aims to articulate an understanding of data that is both theoretically rigorous and practically useful for delineating legal rights. This article does not step into larger philosophical debates regarding the differences among concepts like data, information, knowledge, and wisdom. See, e.g., Saša Baškarada & Andy Koronios, Data, Information, Knowledge, Wisdom (DIKW): A Semiotic Theoretical and Empirical Exploration of the Hierarchy and Its Quality Dimension, 18 Australasian J. Info. Sys. 5 (2013) (providing a comparative analysis of the use of those terms). The author views such debates as over-complications of a relatively straightforward, atomized view of cognition, in which inputs (i.e., data) are plugged into an analytic process (i.e., reasoning) and yield an output (e.g., a decision), and has not been convinced that successive applications of analytical processes to resulting data provide anything that is different in kind from the initial data from which so-called “knowledge” or “wisdom” derives.

[13]     Even space and time should be considered aspects of an object of consideration’s state when those concepts are merely considered to be tools of human cognition. Immanuel Kant, Critique of Pure Reason 65-91 (Norman Kemp Smith trans. 1929, reissued ed. 2007). This is to say that potential states do not exist without reference to some object of consideration, whether explicit or implied.

[14]     Determann, supra note 4, at 6 (“Information can be or relate to diverse things, such as memories, thoughts, discoveries, insights, opinions, perceptions, fictions, or answers to questions. Information can be stored in physical forms, such as human brains and data servers, or physically expressed in books or on road markings. It can also be communicated via smoke signals, blinking lights, measurable radio waves, digital cable connections, or writings on a wall. But the informational content, that is, data itself, exists separately from its context of a larger data base or work of authorship or its physical embodiment.”). In this sense, data can be thought of as a “non-thing” because its abstract nature does not lend itself to exclusive possession or control.

[15]     See supra Part I.A, para. 1. This article uses that definition for references to “data,” unless otherwise stated.

[16]     The term “object of consideration” is used here to emphasize a thinker’s critical role in data’s existence. While the “object of consideration” is the subject of the data (an abstract proposition about that subject’s state), data’s very existence depends on a thinking thing’s act of comprehension. Thus, while the “object of consideration” is the data’s subject, it is also the object of the thinking thing’s consideration (or simply the “object of consideration,” for brevity).

[17]     The term “state” is used here as a generic term for one of multiple possibilities across multiple dimensions. In other words, it is a discrete set of circumstances within a larger universe of potential circumstances. It can be as simple as a single option from a single category (e.g., the color red) or as complex as simultaneous options from an incalculable number of different categories (e.g., all the features that define a natural ecosystem at a given point in time). It can even be compounded across time, because time is just another possible category or dimension for defining the overall state.

[18]     Consider a naked variable in mathematics. That object of consideration is only useful when it is used in an equation, a data set, or some other broader context. Without such context, the variable is a mere placeholder. A standalone variable is not data.

[19]     Consider the color red. That potential state is only useful when applied to a thing (i.e., an object of consideration). Without such application, red is just a potential quality for distinguishing something based on color. A standalone concept of red is not data.

[20]     The term “data points” is used here to distinguish between data as a concept and a particular example of that concept. Thus, the idea of “the red truck” meets the definition of data and can be thought of as a single data point.

[21]     See supra notes 16 & 17.

[22]     For example, consider a reference to Athena. Athena may act as the subject of different propositions, like “Athena is a real-life human alive today (perhaps the daughter of Greek mythology enthusiasts),” “Athena is a character in a fictional play about the struggle for women’s suffrage,” or “Athena is the goddess of wisdom herself.” In each case, the term “Athena” is used as the subject of a claim about her state.

[23]     This feature helps to explain efforts to transform personal data into forms of data that can be used to gain valuable insights on a macro level without compromising human data subjects’ privacy. See Steven M. Bellovin, Preetam K. Dutta & Nathan Reitinger, Privacy and Synthetic Datasets, 22 Stan. Tech. L. Rev. 1, 2-7 (2019).

[24]     The expansion of a data point’s potential uses is not necessarily a net positive for society. Id. at 15-17 (discussing the risks associated with the reidentification potential of de-identified personal data).

[25]     The term “data subject” is used in the broad sense and is not restricted to the narrower concept of data subjects as individuals associated with personal data under some privacy laws.

[26]     For example, public records laws serve to ensure transparency so citizens can hold their government accountable. David C. Vladeck, Information Access—Surveying the Current Legal Landscape of Federal Right-to-Know Laws, 39 Environ. L. & Pol. Annual Rev. 10773 (2009).

[27]     This tendency is best seen in the legal treatment of works of authorship under copyright law.

[28]     See, e.g., 17 U.S.C. § 102(b) (“In no case does copyright protection for an original work of authorship extend to any idea, procedure, process, system, method of operation, concept, principle, or discovery, regardless of the form in which it is described, explained, illustrated, or embodied in such work.”).

[29]     This feature is best seen in copyright law justifications. See generally Nikos Koutras, From Property Right to Copyright: A Conceptual Approach and Justifications for the Emergence of Open Access, 12 Erasmus L. Rev. 144-152 (2019).

[30]     See Determann, supra note 4, at 13-14.

[31]     Determann, supra note 4, at 7-15, 18-19.

[32]     The idea of wholly-uncaptured data is incoherent. Of course, the scientific community can recognize that there may be some far-off planet with an atmospheric composition similar to Earth’s atmosphere. In this sense, one may propose that the planet’s circumstances reflect uncaptured data. However, it would be more precise to say that the planet’s potential circumstances, as imagined by scientists, are captured data, albeit not of the empirical variety.

[33]     Determann, supra note 4, at 7-13, 22-25.

[34]     Id.

[35]     A simple example can illustrate this point. Suppose you saw a news headline that read: “Man takes pictures of women in the nude.” This headline may stir up initial feelings of disgust and outrage. But consider how those feelings might change if, upon further reading, you discover that the pictured women were voluntarily protesting unhealthy sexual norms surrounding body image in a public space. What this hypothetical shows is that there is not some inherent concern for the content of data at the point of capture (the “what”), but there may be significant concerns with the methods used to capture such data (the “how”) or, more remotely, how such data may be used in the future (the “why”). Of course, some capture methods may be considered so harmful that it is easy to conflate concerns about capture with concerns about content, such as when capture necessarily involves a data subject that is incapable of consenting to what society considers an activity that requires consent. However, drilling down into concerns about content typically yields deeper issues regarding accuracy, use or other forms of processing activities, in each case independent of the content in a vacuum.

[36]     Data residing in a human’s internal memory, given modern technological restraints, would not be considered stored data in this limited sense of the term.

[37]     William McGeveran, The Duty of Data Security, 103 Minn. L. Rev. 1135, 1141 (2019).

[38]     Id. at 1175-1193.

[39]     Mariel Borowitz, Government Data, Commercial Cloud: Will Public Access Suffer?, Sci., Feb. 9, 2019, at 588.

[40]     Determann, supra note 4. Notably, data transfer assumes the receiving party’s capture of the transferred data.

[41]     This priority fits well within the justifications for fiduciary duties of confidentiality. See Jack M. Balkin, The Fiduciary Model of Privacy, 134 Harv. L. Rev. Forum 11 (2020). The utilized transfer method is technically a storage issue.

[42]     Camille Calman, Bigger Phish to Fry: California’s Anti-Phishing Statute and Its Potential Imposition of Secondary Liability on Internet Service Providers, 13 Rich. J.L. & Tech. 2, 3-4 (2006).

[43]     See id.

[44]     This concern is seen in data retention laws and laws imposing liability for evidence spoliation. See generally Daniel Borel, The Land of Oz: Spoliation of Evidence in Louisiana, 74 La. L. Rev. 507 (2014).

[45]     See generally Determann, supra note 4.

[46]     Such environments are built on certain assumptions, such as the existence of objective reality for real-world data or fundamental rules for artificial information environments.

[47]     Caryn Devins, et al., The Law and Big Data, 27 Cornell J.L. & Pub. Pol. 357, 372 (2017) (“Data is inherently both subjective and incomplete, rather than objective and determinant. Without being filtered and theoretically-driven, mere data only produces a meaningless sea of correlations and must be simplified in order to be understood. This act of simplification (and aggregation), like legal interpretation, requires theory. Even the very act of deciding what data to gather in the first place—what to measure and observe, when and how—necessitates a theory.” (footnotes omitted)).

[48]     Id. This point calls into question the very existence of truly objective criteria in the same tradition as Albert Einstein’s explanation of space-time variability. Albert Einstein, Relativity: The Special and General Theory (1920).

[49]     This concern is fundamental to the interaction between trademark law and First Amendment protections for commercial speech. See Rebecca Tushnet, Trademark Law as Commercial Speech Regulation, 58 S.C. L. Rev. 737, 739-744 (2007).

[50]     See Zauderer v. Office of Disciplinary Counsel, 471 U.S. 626, 651 (1985).

[51]     Court systems acknowledge this reality by imposing burdens of proof less than absolute certainty. See generally Ronald J. Allen, Burdens of Proof, 13 L. Probability & Risk 195 (2014).

[52]     See generally id.

[53]     Mathematical proofs represent the purest form of logical accuracy.

[54]     For example, companies are increasing their use of artificial intelligence applications for determining or informing important business decisions, such as hiring decisions. Filippo Raso, Artificial Intelligence & Human Rights: Opportunities & Risks 44-45 (2018).

[55]     Kristin Johnson, Frank Pasquale, & Jennifer Chapman, Artificial Intelligence, Machine Learning, and Bias in Finance: Toward Responsible Innovation, 88 Fordham L. Rev. 499, 510-513 (2019).

[56]     See id.

[57]     Id. Concerns about logical accuracy should not be confused with concerns about data use. Data use concerns involve the issue of whether expressed data should be used for some end by inputting that data into an analytical process. In contrast, logical accuracy concerns involve the issue of whether the analytical process is designed in an acceptable manner (assuming the appropriateness of data use). Thus, whereas data use concerns revolve around purpose, logical accuracy concerns revolve around process.

[58]     See Alina Ng, Rights, Privileges, and Access to Information, 42 Loy. U. Chi. L.J. 89, 92-93 (2010).