Opt-Out Step Count Codebook

Methodological grounding

This codebook operationalizes consent friction measurement for insurance privacy documents. Step counting follows the approach established by Nouwens et al. (2020) in their analysis of GDPR consent interfaces, extended to cover non-digital opt-out pathways predominant in health insurance documents. Friction categories align with the European Data Protection Board's deceptive design patterns taxonomy (EDPB Guidelines 03/2022, version 2.0) and the FTC's 2022 Staff Report Bringing Dark Patterns to Light.

Definition

A step is any discrete action a member must take to exercise a single privacy right. Steps are counted from the moment a member decides to act to the moment the request is submitted. Waiting periods and system limitations are recorded separately as flags.

Step types

Each step is classified as one of three types.

Locate: finding information needed to act, such as a phone number, mailing address, web portal, or form location.

Act: taking an action, such as calling, navigating to a web page, mailing, submitting, or completing a form.

Verify: providing identity confirmation, such as a member ID, date of birth, answers to security questions, or a written signature.

What counts as a step

  1. Locating contact information [Locate]
  2. Initiating contact, such as calling, navigating to a web page, or opening mail [Act]
  3. Verifying identity [Verify]
  4. Stating the request verbally or in writing [Act]
  5. Completing a form [Act]
  6. Mailing or submitting a document [Act]
  7. Navigating to a separate system; each separate platform or opt-out link counts once [Act]
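
The step types and the list above can be tallied mechanically once a coder has assigned a type to each step. The following is a minimal sketch, not part of the original pipeline: the `StepType` enum, the `tally_steps` helper, and the example pathway are all hypothetical names and data introduced for illustration.

```python
from collections import Counter
from enum import Enum


class StepType(Enum):
    """The three step types defined in the codebook."""
    LOCATE = "Locate"
    ACT = "Act"
    VERIFY = "Verify"


def tally_steps(steps):
    """Count steps by type for one pathway.

    `steps` is a list of (StepType, description) pairs in document order.
    """
    counts = Counter(step_type for step_type, _ in steps)
    return {
        "total_steps": len(steps),
        "locate_steps": counts[StepType.LOCATE],
        "act_steps": counts[StepType.ACT],
        "verify_steps": counts[StepType.VERIFY],
    }


# Hypothetical mail-in pathway, invented for illustration only.
example = [
    (StepType.LOCATE, "Find the mailing address in the notice"),
    (StepType.ACT, "Complete the written request form"),
    (StepType.VERIFY, "Sign the form"),
    (StepType.ACT, "Mail the form"),
]

print(tally_steps(example))
# {'total_steps': 4, 'locate_steps': 1, 'act_steps': 2, 'verify_steps': 1}
```

Keeping the type on each step, rather than storing only the totals, makes the breakdown reproducible from the coding log.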

Asymmetry measure

For each document, record separately the steps required to enroll or opt in versus the steps required to opt out. Asymmetry exists when opt-out requires more steps than opt-in. Asymmetry is itself a documented dark pattern per Nouwens et al. (2020) and EDPB Guidelines 03/2022.

Where enrollment steps are not described in the document, record as not specified rather than zero.
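
The comparison described above, including the "not specified" rule, could be expressed as follows. This is a sketch under one stated assumption: when enrollment steps are not described, the function returns `None` (undetermined) rather than `False`. The function name and the sentinel constant are hypothetical.

```python
NOT_SPECIFIED = "not specified"


def asymmetry_exists(opt_in_steps, opt_out_steps):
    """Compare opt-in and opt-out step counts for one pathway.

    Returns True when opt-out takes more steps than opt-in, False when it
    does not, and None when the document does not describe enrollment.
    Returning None here is an assumption of this sketch, chosen so that a
    missing opt-in count is never silently treated as zero.
    """
    if opt_in_steps is None or opt_in_steps == NOT_SPECIFIED:
        return None
    return opt_out_steps > int(opt_in_steps)


print(asymmetry_exists(1, 3))              # True
print(asymmetry_exists(3, 3))              # False
print(asymmetry_exists(NOT_SPECIFIED, 2))  # None
```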

Flags (recorded separately, not as steps)

  • Waiting period (note duration if specified)
  • Insurer may decline the request
  • Prior disclosures cannot be undone
  • Opt-out not available for this data type
  • Do Not Track signals ignored
  • Default opt-in (member is enrolled automatically without action)
  • No digital opt-out exists (mail or phone only)
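
Because flags are recorded as free text, it may help to check each recorded flag against the list above before counting it. The sketch below is illustrative only: `CODEBOOK_FLAGS` copies the short labels used in the codebook, while `split_flags` and its case-insensitive exact-match rule are assumptions, not part of the original pipeline.

```python
# Canonical flag labels, copied from the codebook list above.
CODEBOOK_FLAGS = (
    "Waiting period",
    "Insurer may decline the request",
    "Prior disclosures cannot be undone",
    "Opt-out not available for this data type",
    "Do Not Track signals ignored",
    "Default opt-in",
    "No digital opt-out exists",
)


def split_flags(raw_flags):
    """Separate flags that match a codebook label from those that do not.

    Matching is case-insensitive and ignores surrounding whitespace; this
    exact-match rule is an assumption made for the sketch.
    """
    lookup = {label.lower(): label for label in CODEBOOK_FLAGS}
    recognized, unrecognized = [], []
    for flag in raw_flags:
        key = flag.strip().lower()
        if key in lookup:
            recognized.append(lookup[key])
        else:
            unrecognized.append(flag)
    return recognized, unrecognized


# Hypothetical coder output, invented for illustration.
known, unknown = split_flags(["default opt-in", "Waiting period ", "Fee required"])
print(known)    # ['Default opt-in', 'Waiting period']
print(unknown)  # ['Fee required']
```

Unrecognized flags are returned rather than discarded so that a human coder can decide whether they map onto an existing category.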

Coding rules

  • Code the most burdensome pathway described in the document
  • If multiple opt-out mechanisms exist, code each separately
  • If a step is ambiguous, note it and apply conservative judgment
  • Record step type (Locate / Act / Verify) for each step
  • Each document coded independently
  • Record asymmetry where both enrollment and opt-out pathways are described
  • Do not infer steps not described in the document
  • Document the exact language used to justify each step count in the coding log

Scoring

Record the following separately for each document. Do not combine into a single score.

  • Total steps (Locate, Act, and Verify steps combined)
  • Step type breakdown (number of Locate / Act / Verify steps)
  • Total flags
  • Asymmetry measure (opt-in steps vs opt-out steps)
  • Notes on ambiguous coding decisions

Output format

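Each coded pathway produces one row in the coding log CSV, so a document with several opt-out mechanisms yields several rows. The script in this notebook also writes a leading run_timestamp column. The codebook columns are: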

insurer, state, doc_type, opt_out_pathway, total_steps, locate_steps, act_steps, verify_steps, total_flags, flags_list, opt_in_steps, asymmetry_exists, ambiguous_notes, coder, date_coded
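
As an illustration of this format, the sketch below writes rows with exactly the columns listed above and rejects a row that is missing any of them. The helper name `write_log` and the example row are hypothetical; the values are placeholders, not coded results.

```python
import csv
import io

# Column order as listed in the codebook.
COLUMNS = [
    "insurer", "state", "doc_type", "opt_out_pathway", "total_steps",
    "locate_steps", "act_steps", "verify_steps", "total_flags", "flags_list",
    "opt_in_steps", "asymmetry_exists", "ambiguous_notes", "coder", "date_coded",
]


def write_log(rows, handle):
    """Write coding-log rows to an open text handle, one row per pathway."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for row in rows:
        # DictWriter fills missing keys with "" by default, so check explicitly.
        missing = [col for col in COLUMNS if col not in row]
        if missing:
            raise ValueError(f"row is missing columns: {missing}")
        writer.writerow(row)


# Placeholder row, invented for illustration only.
example_row = {
    "insurer": "Example Insurer", "state": "Georgia", "doc_type": "HIPAA Notice",
    "opt_out_pathway": "Written revocation", "total_steps": 3,
    "locate_steps": 1, "act_steps": 2, "verify_steps": 0,
    "total_flags": 1, "flags_list": "Prior disclosures cannot be undone",
    "opt_in_steps": "not specified", "asymmetry_exists": False,
    "ambiguous_notes": "", "coder": "example-coder", "date_coded": "2025-01-01",
}

buffer = io.StringIO()
write_log([example_row], buffer)
print(buffer.getvalue())
```

Writing to an in-memory buffer keeps the example self-contained; passing a file opened with `newline=""` would write to disk instead.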

Scope and limitations

This codebook measures described friction, meaning the steps a member is instructed to take according to the document text, rather than experienced friction, meaning what users actually encounter when attempting to exercise privacy rights. Described friction is a conservative measure. Actual burden may exceed what documents describe due to navigation difficulty, unclear instructions, and interface design not captured in text alone (Habib et al., 2020).

Document analysis is appropriate as a primary method here because the documents themselves are the consent instrument. Members are presented with these texts as the basis for their consent. The friction embedded in the document description is the friction that governs the legal relationship regardless of whether users successfully navigate it.

In [1]:
# pip install anthropic pdfplumber python-docx
In [5]:
import anthropic
import pdfplumber
import csv
import json
import re
import os
from docx import Document
from datetime import datetime


CODEBOOK = """
A step is any discrete action a member must take to exercise a single privacy right.

Step types:
- Locate: finding information needed to act
- Act: taking an action
- Verify: providing identity confirmation

What counts as a step:
1. Locating contact information [Locate]
2. Initiating contact [Act]
3. Verifying identity [Verify]
4. Stating the request verbally or in writing [Act]
5. Completing a form [Act]
6. Mailing or submitting a document [Act]
7. Navigating to a separate system [Act]

Flags (recorded separately):
- Waiting period
- Insurer may decline the request
- Prior disclosures cannot be undone
- Opt-out not available for this data type
- Do Not Track signals ignored
- Default opt-in
- No digital opt-out exists

Coding rules:
- Code the most burdensome pathway described
- Do not infer steps not described in the document
- If multiple opt-out mechanisms exist, code each separately
"""

documents = [
    {"insurer": "Aetna", "state": "Georgia", "doc_type": "Web Privacy Policy",
     "path": r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\Aetna privacy.docx"},
    {"insurer": "Anthem BCBS", "state": "Georgia", "doc_type": "HIPAA Notice",
     "path": r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\anthem BCBS privacy practices.pdf"},
    {"insurer": "Anthem BCBS", "state": "Georgia", "doc_type": "HIPAA Notice (Spanish)",
     "path": r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\anthem BCBS privacy spanish.pdf"},
    {"insurer": "Cigna", "state": "Georgia", "doc_type": "Data Sharing Notice",
     "path": r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\Cigna privacy data sharing.docx"},
    {"insurer": "Cigna", "state": "Georgia", "doc_type": "Global Health Benefits Notice",
     "path": r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\cigna-global-health-benefits-privacy-notice-eng_copy.pdf"},
    {"insurer": "Cigna", "state": "Georgia", "doc_type": "HIPAA Notice",
     "path": r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\cigna-health-care-and-cigna-supplemental-benefits-privacy-notice-eng_copy.pdf"},
    {"insurer": "Cigna", "state": "Georgia", "doc_type": "GLB Notice",
     "path": r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\gramm-leach-bliley-act-privacy-notice_copy.pdf"},
    {"insurer": "Humana", "state": "Georgia", "doc_type": "HIPAA Notice",
     "path": r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\humana privacy practices.pdf"},
    {"insurer": "UnitedHealthcare", "state": "Georgia", "doc_type": "Web Privacy Policy",
     "path": r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\UHC privacy.docx"},
    {"insurer": "UnitedHealthcare", "state": "Georgia", "doc_type": "HIPAA Notice",
     "path": r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\united hipaa privacy.pdf"},
]

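def extract_text(path):
    """Return the plain text of a .pdf or .docx file, or "" for other types."""
    suffix = os.path.splitext(path)[1].lower()  # case-insensitive extension check
    if suffix == ".pdf":
        with pdfplumber.open(path) as pdf:
            # Extract each page once; pages with no extractable text are skipped.
            page_texts = (page.extract_text() for page in pdf.pages)
            return " ".join(t for t in page_texts if t)
    if suffix == ".docx":
        doc = Document(path)
        # Body paragraphs only; text inside tables is not included.
        return " ".join(para.text for para in doc.paragraphs if para.text)
    return ""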

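def parse_response(raw):
    """Extract and parse the JSON object contained in a model reply."""
    # Drop markdown code fences the model may wrap around the JSON.
    clean = raw.replace("```json", "").replace("```", "").strip()
    # Discard non-ASCII characters, matching the ASCII-only rule in the prompt.
    clean = clean.encode("ascii", "ignore").decode("ascii")
    # Greedy match from the first "{" to the last "}" in the reply.
    match = re.search(r'\{.*\}', clean, re.DOTALL)
    if not match:
        raise ValueError("No JSON object found in response")
    return json.loads(match.group(0))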

def repair_json(broken_text, client):
    print("  Attempting JSON repair...")
    repair_prompt = f"""The following text is supposed to be a JSON object but has syntax errors.
Fix it and return valid JSON only. No explanation, no markdown, just the JSON object.

{broken_text[:4000]}"""
    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=1000,
        messages=[{"role": "user", "content": repair_prompt}]
    )
    return parse_response(response.content[0].text.strip())

def code_document(insurer, state, doc_type, text, client, text_limit=8000, max_tokens=1000):
    prompt = f"""You are a systematic research coder applying a privacy policy codebook.

CODEBOOK:
{CODEBOOK}

DOCUMENT:
Insurer: {insurer}
State: {state}
Document type: {doc_type}
Text: {text[:text_limit]}

Identify ALL opt-out pathways described. Return ONLY a valid JSON object.

STRICT JSON RULES:
- ASCII characters only
- No newlines inside string values
- No double quotes inside string values, use single quotes instead
- Keep all string values under 60 characters
- flags_list items must be under 40 characters each
- ambiguous_notes under 80 characters

{{
  "insurer": "{insurer}",
  "state": "{state}",
  "doc_type": "{doc_type}",
  "pathways": [
    {{
      "opt_out_pathway": "pathway name",
      "total_steps": 0,
      "locate_steps": 0,
      "act_steps": 0,
      "verify_steps": 0,
      "steps_detail": [
        {{
          "step_number": 1,
          "type": "Locate or Act or Verify",
          "description": "what member must do"
        }}
      ],
      "total_flags": 0,
      "flags_list": ["flag name"],
      "opt_in_steps": "not specified",
      "asymmetry_exists": false,
      "ambiguous_notes": "brief note"
    }}
  ]
}}"""

    response = client.messages.create(
        model="claude-sonnet-4-6",
        max_tokens=max_tokens,
        messages=[{"role": "user", "content": prompt}]
    )

    raw = response.content[0].text.strip()

    try:
        return parse_response(raw)
    except (json.JSONDecodeError, ValueError) as e:
        print(f"\n=== JSON ERROR ===")
        print(e)
        print(f"\n=== RAW OUTPUT (first 500 chars) ===")
        print(raw[:500])
        print("=================\n")
        try:
            return repair_json(raw, client)
        except Exception as e2:
            print(f"  Repair also failed: {e2}")
            raise

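# `client` is not defined in the cells shown here; Anthropic() reads the
# ANTHROPIC_API_KEY environment variable.
client = anthropic.Anthropic()

run_timestamp = datetime.now().strftime("%Y-%m-%d %H:%M:%S")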
all_rows = []
raw_outputs = []

for doc in documents:
    print(f"Coding: {doc['insurer']} - {doc['doc_type']}")
    try:
        text = extract_text(doc["path"])

        # Use reduced text limit and higher token budget for complex documents
        if doc["doc_type"] == "Data Sharing Notice":
            text_limit = 4000
            max_tokens = 1500
        else:
            text_limit = 8000
            max_tokens = 1000

        result = code_document(
            doc["insurer"], doc["state"], doc["doc_type"],
            text, client, text_limit, max_tokens
        )
        raw_outputs.append(result)

        for pathway in result["pathways"]:
            row = {
                "run_timestamp": run_timestamp,
                "insurer": result["insurer"],
                "state": result["state"],
                "doc_type": result["doc_type"],
                "opt_out_pathway": pathway["opt_out_pathway"],
                "total_steps": pathway["total_steps"],
                "locate_steps": pathway["locate_steps"],
                "act_steps": pathway["act_steps"],
                "verify_steps": pathway["verify_steps"],
                "total_flags": pathway["total_flags"],
                "flags_list": "; ".join(pathway["flags_list"]),
                "opt_in_steps": pathway["opt_in_steps"],
                "asymmetry_exists": pathway["asymmetry_exists"],
                "ambiguous_notes": pathway.get("ambiguous_notes", ""),
                "coder": "claude-sonnet-4-6",
                "date_coded": run_timestamp,
            }
            all_rows.append(row)
        print(f"  Found {len(result['pathways'])} pathway(s)")

    except Exception as e:
        print(f"  Failed: {e}")

csv_path = r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\opt_out_step_counts.csv"
json_path = r"C:\Users\victo\OneDrive\Desktop\Privacy Policies\opt_out_raw_outputs.json"

if all_rows:
    with open(csv_path, "w", newline="", encoding="utf-8") as f:
        fieldnames = [
            "run_timestamp", "insurer", "state", "doc_type", "opt_out_pathway",
            "total_steps", "locate_steps", "act_steps", "verify_steps",
            "total_flags", "flags_list", "opt_in_steps", "asymmetry_exists",
            "ambiguous_notes", "coder", "date_coded"
        ]
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(all_rows)

    with open(json_path, "w", encoding="utf-8") as f:
        json.dump(raw_outputs, f, indent=2)

    print(f"\nDone. {len(all_rows)} pathways coded across {len(documents)} documents.")
    print(f"CSV: {csv_path}")
    print(f"JSON: {json_path}")
else:
    print("No rows to save.")
Coding: Aetna - Web Privacy Policy
  Found 1 pathway(s)
Coding: Anthem BCBS - HIPAA Notice
  Found 3 pathway(s)
Coding: Anthem BCBS - HIPAA Notice (Spanish)
  Found 3 pathway(s)
Coding: Cigna - Data Sharing Notice
  Found 2 pathway(s)
Coding: Cigna - Global Health Benefits Notice
  Found 1 pathway(s)
Coding: Cigna - HIPAA Notice
  Found 1 pathway(s)
Coding: Cigna - GLB Notice
  Found 1 pathway(s)
Coding: Humana - HIPAA Notice
  Found 1 pathway(s)
Coding: UnitedHealthcare - Web Privacy Policy
  Found 1 pathway(s)
Coding: UnitedHealthcare - HIPAA Notice
  Found 3 pathway(s)

Done. 17 pathways coded across 10 documents.
CSV: C:\Users\victo\OneDrive\Desktop\Privacy Policies\opt_out_step_counts.csv
JSON: C:\Users\victo\OneDrive\Desktop\Privacy Policies\opt_out_raw_outputs.json
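
Since the step and flag counts in each pathway are produced by the model, it may be worth checking them for internal consistency before they enter the CSV. The sketch below is not part of the run above: `check_pathway` is a hypothetical helper, the field names follow the JSON template in the prompt, and the example pathway is invented to show what a failed check looks like.

```python
def check_pathway(pathway):
    """Return a list of consistency problems found in one coded pathway."""
    problems = []

    # The three type counts should add up to the reported total.
    typed_total = (
        pathway["locate_steps"] + pathway["act_steps"] + pathway["verify_steps"]
    )
    if typed_total != pathway["total_steps"]:
        problems.append(
            f"type counts sum to {typed_total}, total_steps is {pathway['total_steps']}"
        )

    # If per-step detail is present, it should list one entry per step.
    detail = pathway.get("steps_detail")
    if detail is not None and len(detail) != pathway["total_steps"]:
        problems.append(
            f"steps_detail has {len(detail)} entries, total_steps is {pathway['total_steps']}"
        )

    # The flag count should match the number of listed flags.
    if pathway["total_flags"] != len(pathway["flags_list"]):
        problems.append(
            f"total_flags is {pathway['total_flags']}, flags_list has {len(pathway['flags_list'])}"
        )
    return problems


# Invented pathway with deliberately inconsistent counts.
hypothetical = {
    "opt_out_pathway": "Example pathway",
    "total_steps": 3, "locate_steps": 1, "act_steps": 1, "verify_steps": 0,
    "steps_detail": [
        {"step_number": 1, "type": "Locate", "description": "find phone number"},
        {"step_number": 2, "type": "Act", "description": "call the number"},
    ],
    "total_flags": 1, "flags_list": ["Default opt-in"],
}

for problem in check_pathway(hypothetical):
    print(problem)
```

A pathway that returns an empty list passes all three checks; anything else can be routed to manual review.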

Preliminary Findings: Consent Friction in Health Insurance Privacy Notices

This analysis identified 17 opt-out or consent-revocation pathways across 10 privacy notices from Aetna, Anthem Blue Cross Blue Shield, Cigna, Humana, and UnitedHealthcare. Documented procedural burden was generally low (0–3 steps per pathway), but many pathways lacked sufficient information for members to determine how the right could be exercised. The most common forms of friction were absent mechanisms, insurer discretion to deny requests, inability to reverse prior disclosures, and potential asymmetries between consent and withdrawal processes. Many notices also describe rights incompletely: several pathways rely on inferred actions, external websites, phone calls, or unspecified procedures, making it difficult to determine how a member would actually exercise the right.

The most common sources of friction were not procedural complexity but limitations on consumer control. Frequently observed barriers included: (1) prior disclosures could not be reversed after an opt-out or revocation request; (2) insurers retained discretion to deny or decline certain requests; and (3) no opt-out option was available for some categories of data use or sharing. These barriers appeared more frequently than extensive step requirements.

Several pathways describe rights without providing enough information for members to exercise them. Four pathways contained no documented opt-out mechanism, and one additional pathway (Aetna) described a privacy contact process without clearly specifying an opt-out procedure. Two UnitedHealthcare pathways reserve discretion to deny or decline requests despite requiring members to complete multiple procedural steps.

Six pathways were flagged by the coding framework as potential consent asymmetry cases. Because opt-in procedures were often not fully described in the source documents, these flags should be interpreted as indicators for further review rather than definitive evidence of asymmetry. In these cases, information sharing or use appeared to occur by default, or withdrawal of consent required affirmative action while the corresponding opt-in process was not fully described. This pattern suggests that the burden of privacy management is often placed on consumers rather than embedded in default protections.

Notably, no pathways contained explicitly documented verification requirements. Because privacy notices often omit operational details, this finding should not be interpreted as evidence that verification procedures are absent. Rather, verification requirements were generally not described in the materials analyzed and therefore could not be coded.

Overall, these preliminary results suggest that privacy-related friction in health insurance notices may arise less from lengthy procedures and more from limited transparency, restricted consumer choice, incomplete procedural disclosure, and potential asymmetries between consent and withdrawal processes. Given the small sample size and document-specific nature of the coding, these findings should be interpreted as exploratory and hypothesis-generating.

No Opt-Out Mechanism

  • Cigna Third-Party App Data Authorization
  • Cigna Marketing Use of PHI
  • Cigna GLB Notice
  • Humana Opt-out of health-related contacts

Mechanism Unclear (or not explicitly described)

  • Aetna

Asymmetry Flags

  • Anthem HIPAA: Cancel written authorization
  • Anthem Spanish: Revoke written authorization
  • Cigna Data Sharing: Provider Access
  • Cigna Global Health: Marketing use of PHI
  • Cigna HIPAA: Disclosure to individuals involved in care
  • UnitedHealthcare HIPAA: Revoke written permission
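
Flag frequencies of the kind summarized above could be tallied from the `flags_list` column, which the script joins with "; ". The sketch below uses invented rows rather than the coded results, and `flag_frequencies` is a hypothetical helper. Because the model writes flag names as free text, near-duplicate labels may need manual merging before counts like these are reported.

```python
from collections import Counter


def flag_frequencies(rows):
    """Count how many pathways carry each flag.

    Each row is a dict with a "flags_list" string of "; "-separated flags.
    """
    counts = Counter()
    for row in rows:
        for flag in row["flags_list"].split(";"):
            flag = flag.strip()
            if flag:  # skip the empty string left by an empty flags_list
                counts[flag] += 1
    return counts


# Invented rows for illustration; these are not the study's results.
rows = [
    {"opt_out_pathway": "Pathway A",
     "flags_list": "Default opt-in; Prior disclosures cannot be undone"},
    {"opt_out_pathway": "Pathway B",
     "flags_list": "Prior disclosures cannot be undone"},
    {"opt_out_pathway": "Pathway C", "flags_list": ""},
]

for flag, n in flag_frequencies(rows).most_common():
    print(n, flag)
# 2 Prior disclosures cannot be undone
# 1 Default opt-in
```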

Generative AI Statement

Portions of this codebook were developed with assistance from Claude (Anthropic, claude-sonnet-4-6). All methodological decisions, coding judgments, and interpretive conclusions are the author's own.

References

Amos, R., Acar, G., Lucherini, E., Kshirsagar, M., Narayanan, A., & Mayer, J. (2021). Privacy policies over time: Curation and analysis of a million-document dataset. Proceedings of the Web Conference 2021. https://doi.org/10.1145/3442381.3450048

European Data Protection Board. (2023). Guidelines 03/2022 on deceptive design patterns in social media platform interfaces: How to recognise and avoid them. Version 2.0. Adopted February 14, 2023. https://www.edpb.europa.eu/system/files/2023-02/edpb_03-2022_guidelines_on_deceptive_design_patterns_in_social_media_platform_interfaces_v2_en_0.pdf

Federal Trade Commission. (2022). Bringing dark patterns to light. FTC Staff Report. September 2022. https://www.ftc.gov/reports/bringing-dark-patterns-light

Habib, H., Pearman, S., Wang, J., Zou, Y., Acquisti, A., Cranor, L. F., Sadeh, N., & Schaub, F. (2020). "It's a scavenger hunt": Usability of websites' opt-out and data deletion choices. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–12. https://doi.org/10.1145/3313831.3376317

Nouwens, M., Liccardi, I., Veale, M., Karger, D., & Kagal, L. (2020). Dark patterns after the GDPR: Scraping consent pop-ups and demonstrating their influence. Proceedings of the 2020 CHI Conference on Human Factors in Computing Systems, 1–13. https://doi.org/10.1145/3313831.3376321