Artificial Intelligence and Health Privacy
By Shannon Britton Hartsfield*
Artificial intelligence (“AI”) may change our health care delivery system in powerful ways. It may be able to promote efficiency and quality, but innovation also carries risk. Health privacy is a primary concern as AI plays an increasingly important role in health care. Ongoing improvement of AI systems requires the acquisition and processing of large amounts of health data to train models, yet in a speech on February 27, 2024, Federal Trade Commission Chair Lina Khan said that “some data, particularly people’s sensitive health data . . . is simply off limits for model training.”[1] Use of health care data in AI development brings privacy challenges and requires careful attention to the federal privacy and security regulations implementing portions of HIPAA.[2]
The U.S. government has devoted resources in recent years to using AI to advance health care. According to the U.S. Department of Health and Human Services (“HHS”), machine learning and AI are critical components of “achiev[ing its] mission to enhance health and well-being of all Americans.”[3] AI allows “computer systems to perform tasks normally requiring human intelligence,” and machine learning is a type of AI that allows computers to learn “without being programmed by humans.”[4] HHS states that it will partner with those “in academia, industry and government” to “leverage AI and machine learning to solve previously unsolvable problems.”[5] In March 2021, HHS established the Office of the Chief Artificial Intelligence Officer (“OCAIO”).[6] This office is tasked with implementing HHS’s AI strategy, developing an AI governance structure, coordinating how HHS will respond to federal AI mandates, and collaborating across other HHS offices and agencies.[7]
HHS published the Trustworthy AI (TAI) Playbook in September 2021.[8] The Playbook lists security and privacy breaches among the top risks of AI identified by HHS.[9] One example given by HHS involves an AI model that uses protected health information (“PHI”)[10] to help develop public health measures.[11] If the data is not properly secured, a cyber attack could compromise the PHI and harm the individuals who are its subjects.[12] The TAI Playbook lists the following four key privacy considerations:
Data sensitivity – whether the data contains personally identifiable information, PHI, or other sensitive data, and whether it has been de-identified or encrypted;
Individual privacy rights – whether individuals are aware of how their data is used, and whether they can opt in or out of having their data shared;
Legal requirements – whether the AI solution complies with applicable privacy laws and rules; and
Data sharing – whether the tradeoffs between protecting privacy and releasing data for the public good and scientific purposes have been adequately considered.[13]
Before beginning an AI project, the TAI Playbook recommends that a number of principles be considered, including privacy.[14] HHS’s recommendations include conducting a privacy impact assessment to examine whether sensitive data must be involved and how it will be collected, shared, and used.[15]
Use of Health Information for AI Training and Development
When HIPAA-covered entities[16] and business associates[17] hold data needed to train AI models, those entities must ensure that any uses or disclosures of PHI for AI purposes conform to the HIPAA Privacy Rule[18] and Security Rule.[19] HIPAA generally requires written patient authorization for uses and disclosures of PHI, subject to exceptions that include uses and disclosures for treatment, payment, and health care operations.[20]
Research-related uses and disclosures of PHI will, in many cases, fall outside the definitions of treatment, payment, or health care operations. HIPAA’s privacy regulations define research as a “systematic investigation” that is “designed to develop or contribute to generalizable knowledge.”[21] Knowledge can “be generalizable when it can be applied to either a population inside or outside the population served” by a HIPAA covered entity.[22] Development of AI tools may fall within HIPAA’s broad definition of research, which can include commercial research, if the development is systematic in nature and develops or contributes to generalizable knowledge.[23] If AI development constitutes research, then patient authorizations, or a waiver of those authorizations (in whole or in part) approved by an institutional review board (“IRB”) or privacy board, may be required before PHI can be used for such purposes.[24]
De-identified Data
The “health care operations” exception to HIPAA’s written authorization requirement includes the creation of de-identified data and “limited data sets.”[25] Information is de-identified if it “does not identify an individual” and “there is no reasonable basis to believe that the information could be used to identify an individual.”[26] A “limited data set” is still PHI, but certain direct identifiers have been removed.[27] Once PHI is completely de-identified in accordance with HIPAA, the Privacy Rule and Security Rule no longer apply.[28] PHI may be used to create de-identified information regardless of whether the covered entity itself intends to use the resulting de-identified data.[29] Therefore, HIPAA would not restrict the use and disclosure of properly de-identified data for AI development.
The HIPAA regulations provide two methods for de-identifying PHI: a safe harbor method and an expert determination method.[30] The safe harbor requires removal of the specific identifiers listed in 45 C.F.R. § 164.514(b)(2), including all elements of dates (other than the year) directly related to an individual, such as dates appearing on laboratory reports.[31]
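For illustration only, the following minimal Python sketch shows the general idea behind the safe harbor approach: stripping enumerated identifiers from a simple, hypothetical patient record. The field names, record format, and date handling are assumptions made for this example; an actual safe harbor determination requires removing all eighteen categories of identifiers listed in 45 C.F.R. § 164.514(b)(2) (including appropriate handling of geographic data and ages over 89) and requires that the covered entity have no actual knowledge that the remaining information could be used to identify the individual.

```python
# Illustrative sketch only -- not a complete HIPAA safe harbor implementation.
# Field names are hypothetical; the safe harbor requires removing all eighteen
# identifier categories in 45 C.F.R. 164.514(b)(2), plus a "no actual
# knowledge" condition that code alone cannot satisfy.

# Hypothetical direct identifiers to drop entirely.
DIRECT_IDENTIFIERS = {
    "name", "street_address", "phone", "email", "ssn",
    "medical_record_number", "health_plan_id", "account_number",
    "device_id", "url", "ip_address", "photo",
}

# Hypothetical date fields; the safe harbor removes all date elements (except
# the year) that are directly related to the individual, e.g., lab report dates.
DATE_FIELDS = {"date_of_birth", "admission_date", "discharge_date", "lab_report_date"}


def strip_safe_harbor_identifiers(record: dict) -> dict:
    """Return a copy of a hypothetical record with direct identifiers removed,
    dates reduced to year only, and zip codes truncated to three digits.
    (The safe harbor further requires "000" for low-population zip code areas
    and special handling of ages over 89, both omitted here for brevity.)"""
    cleaned = {}
    for field, value in record.items():
        if field in DIRECT_IDENTIFIERS:
            continue  # drop direct identifiers entirely
        if field in DATE_FIELDS and isinstance(value, str):
            cleaned[field] = value[:4]  # keep only the year (assumes "YYYY-MM-DD")
        elif field == "zip_code" and isinstance(value, str):
            cleaned[field] = value[:3]  # keep only the initial three digits
        else:
            cleaned[field] = value
    return cleaned


if __name__ == "__main__":
    sample = {
        "name": "Jane Doe",
        "date_of_birth": "1957-03-14",
        "lab_report_date": "2023-08-02",
        "zip_code": "32301",
        "diagnosis_code": "E11.9",
    }
    print(strip_safe_harbor_identifiers(sample))
```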
Data sets falling outside the safe harbor can still, in certain cases, be determined to be de-identified. Under the expert determination method, a statistician or another appropriately qualified person may determine that the information is de-identified.[32] The expert must document the methods and results of an analysis concluding that there is a very small risk that an anticipated recipient could use the information, alone or in combination with other reasonably available data, to identify individuals.[33]
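The expert determination method is a statistical and scientific judgment rather than a checklist, but the following toy sketch illustrates one kind of analysis an expert might document: measuring how many records in a data set are unique on a set of quasi-identifiers, since unique records are the easiest to link to outside data sources. The quasi-identifier fields are assumptions made for this example, and the regulation sets no numeric threshold for “very small” risk; that judgment belongs to the expert.

```python
from collections import Counter

# Toy illustration of a distinguishability analysis an expert might document.
# The quasi-identifier fields below are illustrative assumptions only.
QUASI_IDENTIFIERS = ("birth_year", "zip3", "sex")


def uniqueness_rate(records: list) -> float:
    """Fraction of records that are unique on the chosen quasi-identifiers."""
    keys = [tuple(r.get(f) for f in QUASI_IDENTIFIERS) for r in records]
    counts = Counter(keys)
    unique = sum(1 for key in keys if counts[key] == 1)
    return unique / len(records) if records else 0.0


if __name__ == "__main__":
    data = [
        {"birth_year": "1957", "zip3": "323", "sex": "F"},
        {"birth_year": "1957", "zip3": "323", "sex": "F"},
        {"birth_year": "1984", "zip3": "331", "sex": "M"},
    ]
    # A high uniqueness rate signals elevated re-identification risk; an actual
    # determination would also consider what data an anticipated recipient holds.
    print(f"Uniqueness rate: {uniqueness_rate(data):.2f}")
```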
Limited Data Sets
Unlike properly de-identified data, a “limited data set,” as defined in the HIPAA regulations,[34] would be subject to restrictions if used or disclosed for AI development. Limited data sets are still considered PHI, so HIPAA limits how they can be used and disclosed. Specifically, limited data sets may be used and disclosed for research, public health, or health care operations purposes in certain circumstances.[35] Creating a limited data set requires removal of many of the same data elements that must be removed under the de-identification safe harbor, but certain elements can remain, including town or city, state, zip code, and dates related to the individual.[36] Using or disclosing a limited data set for research or other limited purposes requires a written data use agreement between the covered entity and the recipient.[37]
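By way of a compact illustration, the sketch below contrasts a limited data set with the safe harbor example above: direct identifiers are removed, but dates and broader geographic fields may remain. The field names are hypothetical, and, unlike the de-identified output above, the result is still PHI that may be shared for the permitted purposes only under a written data use agreement.

```python
# Illustrative sketch only: a limited data set removes the direct identifiers
# listed in 45 C.F.R. 164.514(e)(2) but may retain town or city, state, zip
# code, and dates related to the individual. Field names are hypothetical.
# The output remains PHI and requires a written data use agreement to share.

DIRECT_IDENTIFIERS = {
    "name", "street_address", "phone", "email", "ssn",
    "medical_record_number", "health_plan_id", "account_number",
    "device_id", "url", "ip_address", "photo",
}


def make_limited_data_set(record: dict) -> dict:
    """Drop direct identifiers; retain dates and broader geographic fields."""
    return {k: v for k, v in record.items() if k not in DIRECT_IDENTIFIERS}


if __name__ == "__main__":
    sample = {
        "name": "Jane Doe",              # removed
        "city": "Tallahassee",           # may remain
        "state": "FL",                   # may remain
        "zip_code": "32301",             # may remain
        "admission_date": "2023-08-02",  # may remain
        "diagnosis_code": "E11.9",
    }
    print(make_limited_data_set(sample))
```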
Impermissible Sales of PHI
Health care entities should use extreme caution when receiving anything of value in exchange for providing third parties with access to PHI, including limited data sets, for AI development or other purposes. HIPAA prohibits the sale of PHI unless an exception applies.[38] The district court discussed this concept in the context of machine learning in Dinerstein v. Google.[39] The plaintiff, a former patient of a hospital, asserted that the hospital violated HIPAA by selling limited data sets to Google.[40] The hospital provided data to Google to assist the company in developing a machine-learning algorithm to help identify certain health problems and predict future medical events.[41] In exchange for providing the data, Google gave the hospital a perpetual license to use the algorithms it developed “for internal non-commercial research purposes.”[42] Although HIPAA allows a covered entity to provide PHI for research purposes in exchange for a reasonable cost-based fee when HIPAA otherwise allows the research, the plaintiff asserted that the license the hospital received was remuneration exceeding the cost-based fee HIPAA permits.[43] In dicta, the district court agreed with the plaintiff and found that a sale occurred that exceeded the fee permitted by the Privacy Rule.[44] Ultimately, however, both the district court and the appellate court found in favor of Google and the hospital because the plaintiff failed to demonstrate a sufficient injury.[45]
Because health data is needed for AI development, companies will continue to seek that data from health care providers. Health care providers, in turn, may view compensation for providing data as a potentially substantial new revenue stream. However, any transfer of data in exchange for remuneration must be permissible under HIPAA and, in most cases, will require complete de-identification of the PHI before any sale.
Business Associate Agreement Restrictions
A “business associate,” as defined in the HIPAA regulations, is a person or entity that “creates, receives, maintains, or transmits” PHI for a HIPAA-regulated function or activity.[46] Business associates that seek to use and disclose PHI for AI development are subject to the same HIPAA restrictions as the covered entities that provide them with PHI. Business associates may be subject to even more stringent restrictions depending on the wording of the written HIPAA business associate agreements that must be in place with the covered entities.[47] Business associates cannot use or disclose PHI for purposes that may otherwise be permissible under HIPAA, such as de-identification or disclosure of PHI for the business associate’s own proper management and administration, unless the business associate agreement specifically permits such activities.[48] A business associate may not use or disclose PHI for AI-related purposes unless permitted by the business associate agreement and applicable law.
State Privacy Law Restrictions
The HIPAA privacy and security regulations are not the only laws and rules that could affect the use and disclosure of PHI for AI development. HIPAA preempts state privacy laws only in certain situations. The state law must be contrary to HIPAA,[49] meaning, for example, that it would be impossible to comply with both HIPAA and the state law.[50] Even a contrary state law is not preempted, however, if it is “more stringent” than HIPAA,[51] meaning that the state law provides the individual patient with greater rights, is more protective of the information, or meets certain other standards.[52] For example, for AI programs that capture ambient audio to transcribe a patient visit, state law may require the permission of all parties to the conversation, both patients and practitioners, before the conversation may be recorded.[53]
Patient Rights and Transparency
In addition to complying with the requirements discussed above, covered entities should refrain from using and disclosing PHI for purposes the patient would not reasonably expect. For example, if a covered entity’s HIPAA Notice of Privacy Practices[54] indicates that PHI will never be used for research without patient authorization, then conducting AI-related research in reliance on an institutional review board’s waiver of that authorization[55] would likely be impermissible.
Conclusion
The Federal Trade Commission (“FTC”) has indicated that those building AI tools “should always keep privacy and security in mind, such as in their treatment of the training data.”[56] The FTC has also noted that “it is critical that the research community keep privacy in mind” and that implementing sufficient privacy protection for data used to train AI “may be difficult in practice and may require creative solutions.”[57] If health data is, in fact, “off limits” for AI training, developers will need new mechanisms to improve the safety and efficacy of AI health care tools. The American Medical Association has observed that “protecting information gathered in association with the care of the patient is a core value in health care” and that patient privacy is “fundamental, as an expression of respect for patient autonomy and a prerequisite for trust.”[58] In the race to develop health-related AI tools, privacy must be a primary focus of compliance efforts.
Footnotes
* Shannon B. Hartsfield is the executive partner of Holland & Knight LLP's Tallahassee office. She is Board Certified in Health Law by The Florida Bar Board of Legal Specialization and Education, and she is the co-author of a book entitled HIPAA: A Practical Guide to the Privacy and Security of Health Data, Second Edition, published by the American Bar Association.
[1] RemedyFest, Live: Remedy Fest, YouTube (Feb. 27, 2024), https://www.youtube.com/watch?v=DuykIQ15Lag&t=20s.
[2] The Health Insurance Portability and Accountability Act of 1996, Pub. L. No. 104-191, 42 U.S.C. §§ 1320d–1320d-9 (2024).
[3] Artificial Intelligence (AI) at HHS, U.S. Dept. of Health and Hum. Servs., https://www.hhs.gov/about/agencies/asa/ocio/ai/index.html (last visited Apr. 10, 2024).
[4] Id.
[5] Id.
[6] About the HHS Office of the Chief Artificial Intelligence Officer (OCAIO), U.S. Dept. of Health and Hum. Servs., https://www.hhs.gov/about/agencies/asa/ocio/ai/ocaio/index.html (last visited Jan. 7, 2024).
[7] Id.
[8] Trustworthy AI (TAI) Playbook, U.S. Dept. of Health and Hum. Servs. (2021), https://www.hhs.gov/sites/default/files/hhs-trustworthy-ai-playbook.pdf.
[9] Id. at 5.
[10] See 45 C.F.R. § 160.103 (defining protected health information).
[11] See TAI Playbook, supra note 8, at 5.
[12] See id.
[13] Id. at 22.
[14] Id. at 30.
[15] Id.
[16] § 160.103 (defining a “covered entity” as a health plan, health care clearinghouse, or a health care provider who transmits health information in electronic form in connection with specific transactions listed in the HIPAA regulations).
[17] Id. (defining “business associate” broadly to include a number of entities that need to use or disclose protected health information on behalf of covered entities or other business associates for certain activities, including data analysis).
[18] 45 C.F.R. Part 160 and Part 164, Subparts A and E (2013).
[19] 45 C.F.R. Part 160 and Part 164, Subparts A and C (2013).
[20] 45 C.F.R. § 164.502(a)(1)(ii) (2013). State law may be more stringent, however.
[21] 45 C.F.R. § 164.501.
[22] Standards for Privacy of Individually Identifiable Health Information, 65 Fed. Reg. 82462, 82625 (Dec. 28, 2000).
[23] Adam Greene & Rasheed McWilliams, Is AI Development “Research” Under HIPAA?, JD Supra (June 27, 2023), https://www.jdsupra.com/legalnews/is-ai-development-research-under-hipaa-3640046/.
[24] See id.
[25] § 164.501.
[26] 45 C.F.R. § 164.514(a).
[27] § 164.514(e)(2).
[28] See § 164.514(a).
[29] 45 C.F.R. § 164.502(d).
[30] See Guidance Regarding Methods for De-identification of Protected Health Information in Accordance with the Health Insurance Portability and Accountability Act (HIPAA) Privacy Rule, U.S. Dept. of Health and Hum. Servs. (Nov. 26, 2012), https://www.hhs.gov/sites/default/files/ocr/privacy/hipaa/understanding/coveredentities/De-identification/hhs_deid_guidance.pdf [hereinafter De-Identification of PHI Methods].
[31] Id. at 7.
[32] 45 C.F.R. § 164.514(b)(1).
[33] See De-Identification of PHI Methods, supra note 30, at 12.
[34] 45 C.F.R. § 164.514(e)(2).
[35] § 164.514(e)(3).
[36] § 164.514(e)(2).
[37] § 164.514(e)(4).
[38] § 164.502(a)(5)(ii)(A).
[39] See Dinerstein v. Google, 484 F. Supp. 3d 561, 584-86 (N.D. Ill. 2020).
[40] Id. at 566, 569.
[41] Id. at 568.
[42] Id.
[43] Id. at 586.
[44] Id. at 586-87.
[45] See id. at 579; Dinerstein v. Google, 73 F.4th 502, 522-23 (7th Cir. 2023).
[46] 45 C.F.R. § 160.103.
[47] §§ 164.308(b)(1), 164.314, 164.504(e).
[48] § 164.504(e)(2).
[49] § 160.203.
[50] § 160.202.
[51] § 160.203(b).
[52] § 160.202.
[53] See, e.g., Fla. Stat. § 934.03(2)(d) (2023) (indicating that it is lawful “for a person to intercept a wire, oral, or electronic communication when all of the parties to the communication have given prior consent to such interception”).
[54] 45 C.F.R. § 164.520.
[55] See § 164.512(i)(1)(i).
[56] See Fed. Trade Comm’n, Combatting Online Harms through Innovation: Report to Congress 7-8 (2022) (“It may be that these responsibilities need to be imposed on executives overseeing development and deployment of these tools, not merely pushed as ethical precepts.”).
[57] Id. at 61.
[58] See Am. Med. Ass’n Code of Med. Ethics Opinion 3.1.1, Privacy in Healthcare, https://code-medical-ethics.ama-assn.org/sites/amacoedb/files/2022-08/3.1.1.pdf (last visited Feb. 24, 2024).