

@hugodutka
Last active May 15, 2024 22:18

Below you have a regulation called the EU AI Act (AI Act).

Legal Document

<!--
The content of the document is truncated for readability.
Here's the full, unedited prompt: https://gist.github.com/hugodutka/6ef19e197feec9e4ce42c3b6994a919d/764e5b8f67b1e3dfefc81b7b96f506f29bc55ca3
-->

Note on the AI Act

Please note that you're working with a draft version of the AI Act. Draft versions of regulations use an unusual convention to signify Articles or Recitals added between drafts. For example, an article called "Article 43a" is the first ("a") article inserted right after the Article 43 that appeared in the previous version of the draft. Similarly, "Article 43b" means the second article added in the new version of the draft, right after "Article 43a". This is not to be confused with "Article 43(a)", which is a legal reference to an element of the list within Article 43. Pay special attention to this when referring to Articles and Recitals while preparing the plan.

Task Description

You're working for a law firm as a senior lawyer. Your task is to draft a plan for a junior lawyer who will work on answering a legal question. The junior lawyer's scope is limited to working with the EU AI Act.

Legal Question

Does the development and deployment of a Language Model acting as a proxy to interpret and optimize SQL queries for better performance on database engines, which may potentially be employed by clients within the European Union and handle various types of data, including personal and sensitive information, and whose optimization could influence database query outcomes, indirectly impacting decision-making processes, fall within the regulatory scope of the EU's AI Act, and if so, what are the necessary compliance measures, specifically in the areas of transparency, accuracy, and human oversight?
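For orientation only, here is a minimal sketch of the kind of system the question describes: a language model sitting between an application and the database engine and rewriting queries before they are executed. The function name, prompt wording, and model call below are illustrative assumptions, not part of the question.

```python
# Illustrative sketch of an LLM-based SQL optimization proxy.
# call_llm and the prompt wording are hypothetical, not taken from the question.

def call_llm(prompt: str) -> str:
    """Placeholder for a call to a language model API (assumption)."""
    raise NotImplementedError

def optimize_query(original_sql: str) -> str:
    """Ask the model to rewrite a query for performance while preserving semantics."""
    prompt = (
        "Rewrite the following SQL query so it runs faster on the target "
        "database engine without changing the result set. Return only SQL.\n\n"
        + original_sql
    )
    rewritten = call_llm(prompt)
    # The database executes the rewritten query, not the original, which is why
    # the optimization can indirectly influence query outcomes downstream.
    return rewritten
```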

Instructions

User's Legal Question

First, quote the user's original question to provide context for the plan.

Understanding the Legal Question

Next, look at the question and explain how you understand it. What is the context of the question? What goal do you infer from the question?

Ambiguities in the Legal Question

Then, list any ambiguities in the question. What would you need to know to provide a more precise answer?

Assumptions for the Legal Analysis and the Plan for the Junior Lawyer

Then, compose a list of assumptions that will shape the legal analysis in response to the user's question. The assumptions will also guide the plan for the junior lawyer. The goal is to clearly scope the answer and make the factual situation explicit. In other words, the aim is to make the legal analysis more focused and avoid excessive branching into likely irrelevant directions.

When it comes to assumptions:

  • Be Concise: Aim for clarity and brevity in your analysis. Each section should be direct and to the point, avoiding unnecessary details that do not contribute directly to understanding the legal question or the assumptions for analysis.
  • Use Clear Language: Avoid legal jargon where possible. Aim for clear, understandable language that can be easily followed by non-specialists.
  • Avoid Legal References: Do not refer to specific legal provisions, articles, or recitals in the assumptions. The goal is to provide a clear context for the legal analysis without delving into the specifics of the legal text.
  • Keep the Original Format: Maintain the structure provided in the template, including lists, bolded text, and lack of introduction. This will help maintain consistency and clarity in the response.
  • Reference Each Ambiguity: Ensure that each ambiguity in the legal question is addressed in the assumptions. This will help in providing a comprehensive and focused legal analysis.

Plan for the Junior Lawyer:

Finally, write a detailed plan (in Markdown) for a junior lawyer with step-by-step instructions on how to analyze the AI Act and answer the legal question.

Rules for writing the plan:

  • When writing the steps, use legal references at the granularity of a specific Article, Recital, or Annex; do not use higher-level structures (like Chapters or Titles).
  • The junior lawyer won't have access to the whole document, but will query the system for specific Article or Recital numbers (a minimal sketch of such a lookup appears after this list); the plan has to be detailed enough that the junior lawyer knows which parts of the document to fetch from the system.
  • If in the plan you refer to important definitions or terms from the AI Act, create a section listing these definitions so the reader has the full context needed to understand your answer.
  • If there's an ambiguity in the legal question, make a note of it and construct the plan around the various possibilities; you have no option for follow-up questions.
  • When analyzing the question and constructing the plan, be as helpful as possible; consider indirect ways of addressing the goal behind the question.
  • Write the whole output in Markdown for easier reading.
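As noted in the rules above, the junior lawyer retrieves individual provisions on demand rather than reading the whole text. Below is a minimal sketch of such a lookup, assuming the regulation has been pre-split into one Markdown file per Article, Recital, or Annex; the directory layout and function name are assumptions, not part of this prompt.

```python
# Minimal sketch of the provision-level lookup the plan relies on. Assumes the
# regulation was split into files like sections/article-43a.md or
# sections/recital-85.md (hypothetical layout).
from pathlib import Path

SECTIONS_DIR = Path("sections")  # hypothetical location of the split document

def fetch_provision(kind: str, number: str) -> str:
    """Return the text of a single provision, e.g. ("article", "43a")."""
    path = SECTIONS_DIR / f"{kind.lower()}-{number.lower()}.md"
    if not path.exists():
        raise FileNotFoundError(f"No stored section for {kind} {number}")
    return path.read_text(encoding="utf-8")

# Example: a plan step such as "Review Article 6" maps to one retrieval call:
# article_6 = fetch_provision("article", "6")
```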

Examples

Example 1

An example of an expected response (for a regulation called the General Data Protection Regulation):

### User's Legal Question
> there was a data leak involving the theft of a database containing the following categories of personal data: name, surname, e-mail address, delivery address. Do I have to inform the supervisory authority about this event? Do I have to inform data subjects about the event?

### Understanding the Legal Question
The legal question pertains to the obligations of a data controller or processor in the event of a data breach involving personal data. The user needs to know whether they must notify a supervisory authority about the breach and if they are also obligated to communicate the breach to the affected data subjects. The inferred goal here is to ensure compliance with the notification requirements of the General Data Protection Regulation (GDPR).

### Ambiguities in the Legal Question
1. **Jurisdiction and applicable law**: The question assumes the application of the General Data Protection Regulation (GDPR), but it doesn't specify the jurisdiction in which the data leak occurred. The GDPR is applicable within the European Union (EU) and the European Economic Area (EEA), and also affects organizations outside these areas that process data of EU/EEA residents. The jurisdiction could impact the specific obligations and procedures to follow.

2. **Nature of the entity involved**: It's unclear whether the entity suffering the data leak is a data controller, data processor, or a third party. Under GDPR, data controllers have specific obligations, including notifying supervisory authorities and data subjects of data breaches, whereas processors have different responsibilities.

3. **Severity and risks associated with the data leak**: GDPR mandates notification based on the risk to individuals' rights and freedoms. The severity and potential harm of the data leak, such as identity theft or fraud, are not specified. This ambiguity affects the assessment of whether the breach must be reported to authorities and to the individuals concerned.

4. **Previous data protection and security measures**: Information about the data protection and security measures in place before the breach is not provided. The adequacy of such measures can be relevant in the supervisory authority's assessment of the breach and any potential sanctions.

5. **Details about the stolen database**: The question lacks details on whether the stolen database contained data that was encrypted, anonymized, or otherwise protected, which can influence the need to notify the supervisory authority and the data subjects.

6. **Data subjects' location**: The locations or residency of the data subjects affected by the data leak are not specified. GDPR protections apply to data subjects in the EU/EEA, but if the data subjects reside outside these regions, different notification requirements might apply.

### Assumptions for the Legal Analysis and the Plan for the Junior Lawyer
1. **Jurisdiction and applicable law**: It is assumed that the entity involved in the data leak is subject to the General Data Protection Regulation (GDPR) and operates within the jurisdiction of the European Union or processes data of individuals residing in the EU/EEA.

2. **Nature of the entity involved**: For the purpose of this analysis, it is assumed that the entity is acting as either a data controller or a data processor under the GDPR, and therefore has specific obligations regarding data breaches.

3. **Severity and risks associated with the data leak**: As the question does not provide specific details about the severity or potential risks associated with the data leak, it is assumed that the breach poses a significant risk to the rights and freedoms of the affected individuals, triggering the notification requirements under GDPR.

4. **Previous data protection and security measures**: In the absence of details about pre-existing data protection and security measures, it is assumed that the entity had implemented reasonable measures to protect personal data, but the breach occurred due to circumstances beyond their control or unforeseen vulnerabilities.

5. **Details about the stolen database**: The data leak involved personal data categories such as names, surnames, email addresses, and delivery addresses. It is assumed that this data is not subject to additional protections like encryption or anonymization.

6. **Data subjects' Location**: In the absence of information about the data subjects' locations, it is assumed that the affected individuals are residents of the EU/EEA, and therefore, the GDPR notification requirements apply.

### Plan for the Junior Lawyer:
1. **Introduction to GDPR and Data Breach Notification**:
    - Begin by reading Article 4 to understand the definitions important to this case, particularly the definitions of "personal data" and "data breach".

2. **Identify Obligations for Notification of Supervisory Authority**:
    - Review the obligations related to the notification of a data breach to a supervisory authority articulated in Article 33.

3. **Examine Criteria for Notifying Data Subjects**:
    - Assess the circumstances under which data subjects need to be informed about a data breach as per Article 34.

4. **Study of Recitals for Contextual Interpretation**:
    - Consult Recital 85 to comprehend the rationale behind notifying supervisory authorities and the potential impact of a data breach on data subjects.
    - Read Recital 87 for additional context on communicating a breach to data subjects, particularly in relation to high risk to their rights and freedoms.

5. **Evaluating the Data Breach**:
    - Based on the information provided in the legal question and the guidance found in Articles 33 and 34 and the relevant Recitals, decide whether the data leak meets the GDPR threshold for notification to the supervisory authority and communication to the data subjects.

6. **Judging the Necessity to Notify Data Subjects**:
    - Evaluate if the breach likely results in a high risk to the rights and freedoms of natural persons. This assessment will help determine if direct communication to data subjects is mandatory according to Article 34.

7. **Action Plan for Notification**:
    - If the breach meets the criteria for notifying both the supervisory authority and the data subjects, outline the steps and information required for making these notifications effectively and within the stipulated timeframe as set out in Articles 33 and 34.

8. **Documentation and Record Management**:
    - According to Article 33(5), ensure there is an understanding of the obligation to document the breach and keep records, which may be necessary to demonstrate compliance with the notification requirements.

9. **Completing the Task**:
    - Compile a comprehensive report based on the findings from the above steps, including recommendations on whether the supervisory authority and data subjects should be notified, and the procedures to follow for each type of notification.

### Definitions and Terms from the General Data Protection Regulation:

- **Personal Data**: Information relating to an identifiable natural person.
- **Data Breach**: A breach of security leading to accidental or unlawful destruction, loss, alteration, unauthorized disclosure of, or access to, personal data.
- **Data Subject**: An identifiable natural person to whom personal data relate.
- **Controller**: The entity that determines the purposes and means of the processing of personal data.

Example 2

Another example of an expected response (for a regulation called the AI Act):

### User's Legal Question
> I have two AI products in mind that I'm considering building:
> 1. An AI-based Q&A where users ask legal questions and the AI analyzes the regulation's text and surrounding documents (e.g. court rulings) and answers the question. This will be marketed as legal research, not as legal advice.
> 2. An AI-based tool that aids doctors in creating medical documentation. It would help summarize voice conversations and fill out internal documents for doctors based on rough notes provided by a doctor.
>
> Would either of the two fall into the high-risk category according to the AI Act?

### Understanding the Legal Question
The legal question relates to the classification of two AI-based product ideas under the potential risk categories of the EU AI Act. The user wants to ascertain whether an AI Q&A tool for legal research and an AI tool for aiding in medical documentation creation are subject to the high-risk category regulations. The user's goal is to understand the regulatory obligations that may affect the development and deployment of these AI products.

### Ambiguities in the Legal Question
1. **Functionalities and capabilities**: The specific functionalities and capabilities of the AI-based Q&A and medical documentation tools are not fully detailed. Knowing what the AI systems can and cannot do is necessary to assess potential legal implications.

2. **Data types and sensitivity**: The types and sensitivity of data processed by both AI systems are not described. Different legal requirements apply depending on whether the data is personal, sensitive, or otherwise protected.

3. **Scope of surrounding documents**: The scope of the "surrounding documents" that the legal Q&A AI would analyze is not clear. The contents and nature of these documents could impact the AI's outputs and any resulting liabilities.

4. **Autonomy in decision-making**: The extent of the autonomy of both AI systems in decision-making is not indicated. More autonomous systems may have different legal considerations compared to those with more human oversight.

5. **Context and applications**: The specific context and applications of both AI products within their respective industries are not fully detailed. The legal implications could vary depending on how and where the AI is being applied.

### Assumptions for the Legal Analysis and the Plan for the Junior Lawyer
1. **Functionalities and capabilities**: The AI-based Q&A tool will analyze legal documents to provide informative responses, but will not offer definitive legal advice. The AI-based medical documentation tool will summarize voice conversations and auto-fill documents based on doctors' notes, without making autonomous medical decisions.

2. **Data types and sensitivity**: The legal Q&A tool will process publicly available legal documents and regulations, without storing personal or sensitive data. The medical documentation tool will handle necessary patient data for creating medical records, but will not process additional sensitive data.

3. **Scope of surrounding documents**: The "surrounding documents" analyzed by the legal Q&A tool will be limited to official, publicly available legal documents directly relevant to the queried regulations.

4. **Autonomy in decision-making**: Both AI systems will operate as assistive tools under the control of human professionals in their respective fields, without making autonomous decisions that have legal consequences or independent medical diagnoses.

5. **Context and applications**: The AI-based legal Q&A tool will be applied in the context of legal research and analysis, not as a substitute for professional legal advice. The AI-based medical documentation tool will support healthcare professionals in administrative tasks, not as a replacement for human medical judgment.

### Plan for the Junior Lawyer:
1. **Research High-Risk AI Categories**:
    - Examine Article 6 of the AI Act, focusing on the classification rules for high-risk AI systems, to see if the two AI product ideas fit within those rules.

2. **Determine the Relevancy of AI Systems to Specific Use Cases**:
    - Review Annex III which lists specific high-risk AI systems and use cases to identify if either product idea is explicitly mentioned or closely related to the use cases addressed in the Annex.

3. **Assess Involvement in Biometric Identification or Categorisation**:
    - Check if the legal Q&A tool implicates Annex III, point 1 concerning AI systems used for biometric identification or categorization.
    - Investigate if the medical documentation tool falls under Annex III, point 2, section (aa) as it may involve processing data concerning health.

4. **Evaluate the AI Systems in the Context of Healthcare and Educational Services**:
    - For the medical documentation tool, assess Annex III, points 3, 4, and 5, which discuss the implications of AI in the education, vocational training, employment, management, and healthcare sectors.

5. **Scrutinize the AI Systems within the Law Enforcement Context**:
    - Determine if the legal Q&A tool could be seen as providing a service that might relate to the profiling activities covered by Annex III, points 6 and 7.

6. **Consider Ethics and Rights**:
    - Read Recital 85, for additional context on the rights and freedoms of natural persons, which could be a concern for the legal Q&A tool.
    - Check Recital 87 for Privacy and Data Governance insights concerning the medical documentation tool.

7. **Clarify Transparency Obligations**:
    - Analyze Article 52 to understand the transparency obligations that might apply to either AI system, particularly if there's a remote chance they generate or manipulate content for users.

8. **Submit Findings and Recommendations**:
    - Prepare a report summarizing findings related to the applicability of high-risk categories to both AI products. Include recommended actions or features to avoid classification as high-risk under the AI Act.

### Definitions and Terms from the EU AI Act:

- **High-Risk AI System**: A system that presents significant risks to health, safety, or fundamental rights. The AI Act contains specific criteria and use cases for such systems.
- **Biometric Identification System**: An AI system that processes biological or behavioral data to uniquely identify individuals.
- **Medical Documentation Tool**: An AI application intended to aid healthcare professionals by processing data for documentation purposes. Not explicitly named in the AI Act but could be inferred from context under health-related provisions.
- **Legal Q&A Tool**: An AI application that provides answers based on the legal documents. Not specifically mentioned in the AI Act, but could relate to provisions on information society services.
- **Data Governance**: The process of managing the availability, usability, integrity, and security of data used by an AI system.
- **Fundamental Rights**: Rights recognized under the EU Charter of Fundamental Rights, such as privacy, non-discrimination, and access to justice, which may be affected by AI systems.

bzmrgonz commented May 1, 2024

Any chance I can see how the 226 page regulation document was structured with markdown? Does this mean we need to start adding markdown to our legal drafts? Or are we finally gonna teach our lawyers markdown and forget about the traps of docx??

@hugodutka (Author)

> Any chance I can see how the 226 page regulation document was structured with markdown?

Here: https://gist.github.com/hugodutka/6ef19e197feec9e4ce42c3b6994a919d/764e5b8f67b1e3dfefc81b7b96f506f29bc55ca3


bzmrgonz commented May 1, 2024 via email

@M-JULIANI

@hugodutka great experiment! question: the above 226 page regulation structured in markdown, how many tokens did that end up using in the end?

i'm doing a similar experiment where i have a 207 page pdf (which i am converting to markdown), yet I am getting the following error (I am on the free usage tier)

Limit 30000, Requested 309107. The input or output tokens must be reduced in order to run successfully.

What usage tier are you on?

Thanks in advance,
M
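A rough pre-flight token count can tell you whether a converted document will fit under a per-request limit before you send it. The sketch below is an approximation only: it uses tiktoken's cl100k_base encoding, which is not the tokenizer Claude uses, and the 30,000 figure is simply the limit quoted in the error above; the file name is hypothetical.

```python
# Approximate token count for a Markdown file against a per-request limit.
# cl100k_base is used only as a rough proxy for the provider's actual tokenizer.
import tiktoken

REQUEST_TOKEN_LIMIT = 30_000  # the limit quoted in the error message above

def estimate_tokens(path: str) -> int:
    with open(path, encoding="utf-8") as f:
        text = f.read()
    return len(tiktoken.get_encoding("cl100k_base").encode(text))

if __name__ == "__main__":
    n = estimate_tokens("regulation.md")  # hypothetical file name
    print(f"~{n} tokens; fits under limit: {n < REQUEST_TOKEN_LIMIT}")
```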
