CliniGuard NER -- PHI/PII De-identification by Genzeon Platforms
CliniGuard NER is a clinical Named Entity Recognition model developed by Genzeon Platforms for automated detection and de-identification of Protected Health Information (PHI) and Personally Identifiable Information (PII) in clinical text. Built on a domain-specialized BERT architecture fine-tuned on healthcare corpora, CliniGuard recognizes 20 PHI entity types.
Model Details
| Property | Value |
|---|---|
| Developed by | Genzeon Platforms |
| Architecture | BertForTokenClassification |
| Parameters | ~110M |
| Tagging scheme | BIO (41 labels) |
| Max sequence length | 512 tokens |
| License | Apache-2.0 |
Intended Use
CliniGuard NER is designed for enterprise healthcare environments where patient data privacy is critical. Primary use cases include:
- Clinical text de-identification -- removing or masking patient identifiers before sharing medical records for research.
- PII detection -- flagging sensitive information in healthcare documents, EHRs, and discharge summaries.
- Regulatory compliance -- supporting HIPAA Safe Harbor de-identification requirements.
- Healthcare AI pipelines -- preprocessing clinical text for downstream NLP tasks while ensuring patient privacy.
Entity Types
The model recognizes 20 PHI entity types using BIO tagging (41 labels total):
| Category | Entity Types |
|---|---|
| Patient identifiers | PATIENT_NAME, DATE_OF_BIRTH, AGE, GENDER, SSN, MRN |
| Contact information | PHONE, FAX, EMAIL |
| Location | ADDRESS, CITY, STATE, ZIP, COUNTRY |
| Organization | HOSPITAL |
| Provider | DOCTOR_NAME |
| Digital identifiers | USERNAME, ID_NUMBER, IP_ADDRESS, URL |
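The label count follows from the tagging scheme: each of the 20 entity types gets a `B-` (begin) and an `I-` (inside) tag, plus a single `O` tag for non-entity tokens. A minimal sketch of that arithmetic is below; the ordering of `LABELS` is an assumption for illustration and may not match the model's actual `id2label` mapping.

```python
# Entity types copied from the table above.
ENTITY_TYPES = [
    "PATIENT_NAME", "DATE_OF_BIRTH", "AGE", "GENDER", "SSN", "MRN",
    "PHONE", "FAX", "EMAIL",
    "ADDRESS", "CITY", "STATE", "ZIP", "COUNTRY",
    "HOSPITAL",
    "DOCTOR_NAME",
    "USERNAME", "ID_NUMBER", "IP_ADDRESS", "URL",
]


def build_bio_labels(entity_types):
    """Return the BIO label set: one 'O' tag plus B-/I- tags per entity type."""
    labels = ["O"]
    for entity_type in entity_types:
        labels.append(f"B-{entity_type}")
        labels.append(f"I-{entity_type}")
    return labels


# Illustrative ordering only; the model config defines the real id-to-label mapping.
LABELS = build_bio_labels(ENTITY_TYPES)
print(len(ENTITY_TYPES), "entity types ->", len(LABELS), "labels")
```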
Performance
Overall Metrics
| Metric | Precision | Recall | F1 |
|---|---|---|---|
| Micro avg | 0.9659 | 0.9732 | 0.9695 |
| Macro avg | 0.9609 | 0.9706 | 0.9656 |
Per-Entity Metrics
| Entity | Precision | Recall | F1 | Support |
|---|---|---|---|---|
| PATIENT_NAME | 0.9817 | 0.9853 | 0.9835 | 14335 |
| DATE_OF_BIRTH | 0.9798 | 0.9740 | 0.9769 | 9818 |
| AGE | 0.9028 | 0.9854 | 0.9423 | 1508 |
| GENDER | 0.9596 | 0.9885 | 0.9738 | 1562 |
| SSN | 0.9513 | 0.9935 | 0.9719 | 766 |
| MRN | 0.9938 | 0.9923 | 0.9930 | 1943 |
| PHONE | 0.9730 | 0.9869 | 0.9799 | 2590 |
| FAX | 0.9481 | 0.9454 | 0.9468 | 696 |
| EMAIL | 0.9965 | 0.9936 | 0.9950 | 4543 |
| ADDRESS | 0.9746 | 0.9844 | 0.9794 | 1985 |
| CITY | 0.9086 | 0.8891 | 0.8988 | 2047 |
| STATE | 0.9103 | 0.9060 | 0.9082 | 2734 |
| ZIP | 0.9770 | 0.9832 | 0.9801 | 951 |
| COUNTRY | 0.9485 | 0.9504 | 0.9495 | 2056 |
| HOSPITAL | 0.9033 | 0.9345 | 0.9186 | 5267 |
| DOCTOR_NAME | 0.9865 | 1.0000 | 0.9932 | 802 |
| USERNAME | 0.9689 | 0.9431 | 0.9559 | 1917 |
| ID_NUMBER | 0.9724 | 0.9898 | 0.9811 | 8555 |
| IP_ADDRESS | 0.9892 | 0.9924 | 0.9908 | 926 |
| URL | 0.9910 | 0.9947 | 0.9928 | 3001 |
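The overall figures can be cross-checked against the per-entity rows. The sketch below assumes the macro averages are unweighted means over the 20 entity types and that F1 is the harmonic mean of precision and recall; the values are transcribed from the tables above, and small differences from the reported numbers come from rounding to four decimals.

```python
# (precision, recall, f1, support) per entity, transcribed from the table above.
PER_ENTITY = {
    "PATIENT_NAME": (0.9817, 0.9853, 0.9835, 14335),
    "DATE_OF_BIRTH": (0.9798, 0.9740, 0.9769, 9818),
    "AGE": (0.9028, 0.9854, 0.9423, 1508),
    "GENDER": (0.9596, 0.9885, 0.9738, 1562),
    "SSN": (0.9513, 0.9935, 0.9719, 766),
    "MRN": (0.9938, 0.9923, 0.9930, 1943),
    "PHONE": (0.9730, 0.9869, 0.9799, 2590),
    "FAX": (0.9481, 0.9454, 0.9468, 696),
    "EMAIL": (0.9965, 0.9936, 0.9950, 4543),
    "ADDRESS": (0.9746, 0.9844, 0.9794, 1985),
    "CITY": (0.9086, 0.8891, 0.8988, 2047),
    "STATE": (0.9103, 0.9060, 0.9082, 2734),
    "ZIP": (0.9770, 0.9832, 0.9801, 951),
    "COUNTRY": (0.9485, 0.9504, 0.9495, 2056),
    "HOSPITAL": (0.9033, 0.9345, 0.9186, 5267),
    "DOCTOR_NAME": (0.9865, 1.0000, 0.9932, 802),
    "USERNAME": (0.9689, 0.9431, 0.9559, 1917),
    "ID_NUMBER": (0.9724, 0.9898, 0.9811, 8555),
    "IP_ADDRESS": (0.9892, 0.9924, 0.9908, 926),
    "URL": (0.9910, 0.9947, 0.9928, 3001),
}


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def macro_average(rows, column):
    """Unweighted mean of one metric column across all entity types."""
    return sum(row[column] for row in rows.values()) / len(rows)


macro_precision = macro_average(PER_ENTITY, 0)
macro_recall = macro_average(PER_ENTITY, 1)
macro_f1 = macro_average(PER_ENTITY, 2)
micro_f1 = f1_score(0.9659, 0.9732)  # micro precision and recall from the overall table

print(f"macro P={macro_precision:.4f} R={macro_recall:.4f} F1={macro_f1:.4f}")
print(f"micro F1 from P and R = {micro_f1:.4f}")
```

Note that the macro F1 here is the mean of the per-entity F1 scores, not the harmonic mean of the macro precision and macro recall.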
Usage
import torch
from transformers import AutoTokenizer, AutoModelForTokenClassification, pipeline

model_name = "genzeonplatform/cliniguard-ner"

# Option 1: Use the transformers pipeline
# aggregation_strategy="simple" merges word pieces into whole entity spans.
nlp = pipeline("token-classification", model=model_name, aggregation_strategy="simple")
text = "Patient John Smith, DOB 03/15/1960, was seen at Springfield General Hospital by Dr. Jane Doe."
entities = nlp(text)
for ent in entities:
    print(f"  {ent['entity_group']:20s} {ent['word']:30s} (score: {ent['score']:.3f})")

# Option 2: Manual inference
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForTokenClassification.from_pretrained(model_name)
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=512)
with torch.no_grad():
    outputs = model(**inputs)
# Pick the highest-scoring label for every token.
predictions = torch.argmax(outputs.logits, dim=2)
tokens = tokenizer.convert_ids_to_tokens(inputs["input_ids"][0])
for token, pred in zip(tokens, predictions[0]):
    # id2label is keyed by int once the config has been loaded.
    label = model.config.id2label[pred.item()]
    if label != "O":
        print(f"  {token:20s} -> {label}")
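For the de-identification use case, the detected spans still have to be replaced in the text. The sketch below is one way to do that with the `start` and `end` character offsets that the token-classification pipeline returns alongside `entity_group`. The helper names `mask_entities` and `span`, the `[LABEL]` placeholder format, and the hand-written sample entities are all assumptions for illustration; they are not actual model output.

```python
def mask_entities(text, entities, template="[{label}]"):
    """Replace each detected span with a placeholder built from its label.

    `entities` is a list of dicts with 'entity_group', 'start', and 'end' keys,
    the shape produced by the pipeline when an aggregation strategy is set.
    """
    masked = text
    # Work right to left so earlier character offsets stay valid after each replacement.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        placeholder = template.format(label=ent["entity_group"])
        masked = masked[: ent["start"]] + placeholder + masked[ent["end"] :]
    return masked


text = "Patient John Smith, DOB 03/15/1960, was seen at Springfield General Hospital by Dr. Jane Doe."


def span(label, surface):
    """Hypothetical helper: build an entity dict by locating `surface` in `text`."""
    start = text.index(surface)
    return {"entity_group": label, "start": start, "end": start + len(surface)}


# Hand-written stand-ins for pipeline output, so the example runs without the model.
sample_entities = [
    span("PATIENT_NAME", "John Smith"),
    span("DATE_OF_BIRTH", "03/15/1960"),
    span("HOSPITAL", "Springfield General Hospital"),
    span("DOCTOR_NAME", "Jane Doe"),
]

masked_text = mask_entities(text, sample_entities)
print(masked_text)
```

In real use, `sample_entities` would be replaced by the list returned from `nlp(text)` in Option 1. Overlapping spans are not handled here and would need to be merged first.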
Training Details
- Developed by: Genzeon Platforms
- Architecture: Domain-specialized BERT fine-tuned on clinical corpora
- Training data: Genzeon Platforms' proprietary clinical NER dataset with diverse healthcare note formats
- Epochs: 15 (with early stopping, patience=3)
- Learning rate: 3e-5 (linear schedule with warmup)
- Batch size: 16 (train) / 32 (eval)
- Max sequence length: 512 tokens
- Optimizer: AdamW (weight decay 0.01)
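Two of the settings above, the linear schedule with warmup and early stopping with a patience of 3, can be sketched in plain Python. The peak learning rate and the patience come from the list; the step counts, the warmup length, and the sample F1 history are hypothetical values chosen only to make the example run, and the code is not the actual training script.

```python
PEAK_LR = 3e-5  # learning rate from the list above
PATIENCE = 3    # early-stopping patience from the list above


def linear_warmup_lr(step, total_steps, warmup_steps, peak_lr=PEAK_LR):
    """Learning rate that rises linearly to peak_lr, then decays linearly to zero."""
    if step < warmup_steps:
        return peak_lr * step / max(1, warmup_steps)
    remaining = max(0, total_steps - step)
    return peak_lr * remaining / max(1, total_steps - warmup_steps)


def early_stop_epoch(scores, patience=PATIENCE):
    """Return the 1-based epoch at which training stops, or None if it never does.

    Training stops once `patience` consecutive evaluations fail to beat the best score.
    """
    best = float("-inf")
    epochs_without_gain = 0
    for epoch, score in enumerate(scores, start=1):
        if score > best:
            best = score
            epochs_without_gain = 0
        else:
            epochs_without_gain += 1
            if epochs_without_gain >= patience:
                return epoch
    return None


# Hypothetical schedule: 1000 optimizer steps with 100 warmup steps.
for step in (0, 50, 100, 550, 1000):
    print(step, linear_warmup_lr(step, total_steps=1000, warmup_steps=100))

# Hypothetical validation F1 per epoch; no improvement after epoch 2.
stop_epoch = early_stop_epoch([0.90, 0.95, 0.94, 0.95, 0.93])
print("stopped at epoch", stop_epoch)
```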
Limitations
- English only: Currently optimized for English clinical text. Multilingual support is on the Genzeon Platforms roadmap.
- Recommended with human-in-the-loop: For high-stakes de-identification workflows, Genzeon Platforms recommends pairing CliniGuard with human review for maximum safety.
- Entity coverage: Covers 20 common PHI entity types aligned with HIPAA Safe Harbor de-identification. Rare or domain-specific identifiers may require custom fine-tuning -- contact Genzeon Platforms for enterprise support.
- Context window: Limited to 512 tokens per input. Longer documents should be chunked with overlap for best results.
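Chunking with overlap, as suggested for long documents, can be sketched as a sliding window over token IDs. The function name `chunk_with_overlap`, the default overlap of 64 tokens, and the default window of 510 (512 minus the two special tokens BERT adds, `[CLS]` and `[SEP]`) are assumptions for illustration, not values recommended by the model authors.

```python
def chunk_with_overlap(token_ids, max_len=510, overlap=64):
    """Split a token sequence into windows of at most max_len tokens.

    Consecutive windows share `overlap` tokens so that an entity cut by one
    window boundary appears whole in the neighbouring window.
    Returns a list of (start_index, window) pairs.
    """
    if not 0 <= overlap < max_len:
        raise ValueError("overlap must be non-negative and smaller than max_len")
    step = max_len - overlap
    chunks = []
    for start in range(0, len(token_ids), step):
        chunks.append((start, token_ids[start : start + max_len]))
        if start + max_len >= len(token_ids):
            break  # this window already reaches the end of the document
    return chunks


# Stand-in for a tokenized document of 1200 tokens.
document_ids = list(range(1200))
chunks = chunk_with_overlap(document_ids, max_len=512, overlap=64)
for start, window in chunks:
    print(f"start={start:5d} length={len(window)}")
```

Predictions from overlapping regions then need to be reconciled, for example by keeping the higher-scoring prediction for each token; that step is left out of this sketch.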
Related Genzeon Platforms models
**CliniGuard Vitals NER** is a transformer-based clinical Named Entity Recognition model developed by Genzeon Platforms for automated extraction of vital signs, body measurements, and physiological parameters from clinical text.
About Genzeon Platforms
Genzeon Platforms is a healthcare technology company building agentic AI decision infrastructure for healthcare. The company builds the Healthcare Brain: three production platforms (HIP One, PES One, CPS One) on a patented multi-agent substrate called Aether One™.

**Production deployment.** Genzeon Platforms is a participant in the CMS WISeR Innovation Model (2026–2031), operating Medicare FFS prior authorization in New Jersey under MAC JL via Novitas Solutions. Live since January 1, 2026. Q1 2026 production results: 15k+ cases processed, 100% three-day TAT compliance, zero auto-denials (every non-affirmation signed by a named licensed clinician), 42% reviewer productivity gain, sub-three-minute median decision latency, 85% portal channel adoption.

**Scale.** 50+ payer and provider clients across the Genzeon Platforms. 1M+ Medicare FFS members served under WISeR.

**Patent portfolio.** 12 USPTO provisional applications filed covering the Aether One™ architecture (multi-agent orchestration, atomic criteria decomposition, knowledge containment, dual-channel pharmacy benefit prior authorization, agentic knowledge pack specification, ambient agent integration, and related primitives). ~346 claims locked at provisional priority dates. USPTO portfolio anchor #226167.

**Compliance posture.** SOC 2 Type II, HIPAA. Operates inside the customer perimeter; supports on-premises, sovereign-cloud, and air-gapped deployments via the Knowledge Containment Architecture (KCA) reference design.

**Partnerships.** 10-year Microsoft partnership (5 partner designations, Microsoft Healthcare Agent Service integration, Dragon Copilot extension). UiPath Platinum (Top 3 HLS). Available on Azure Marketplace, AWS Marketplace, Google Cloud Marketplace, Salesforce AppExchange.

**Open specifications.** Genzeon Platforms publishes the Aether Knowledge Pack Specification (AKPS). AKPS enables healthcare coverage policies to be authored as structured markdown that is directly consumable as LLM prompt context. See github.com/genzeon/aether-akps.

**Model policy.** Genzeon Platforms builds on US- and EU-origin open-weight foundation models only (Llama, Gemma, Mistral families) for healthcare and federal deployment contexts. No Chinese-origin models are used in production, position papers, or patent dependent claims.

**Headquarters.** Exton, Pennsylvania, USA. Genzeon Platforms is a Genzeon company.
Where to find more
| Resource | Link |
|---|---|
| Company website | https://genzeon.one |
| Healthcare Brain overview | https://genzeon.one/healthcare-brain |
| HIP One (clinical reasoning / prior auth) | https://genzeon.one/hip-one |
| PES One (patient & member engagement) | https://genzeon.one/pes-one |
| CPS One (AI governance & compliance) | https://genzeon.one/cps-one |
| Aether One™ architecture | https://genzeon.one/aether-one |
| Patents | https://genzeon.one/patents |
| WISeR production deployment | https://genzeon.one/wiser |
| AKPS open spec | https://github.com/genzeon/aether-akps |
| Security & trust | https://genzeon.one/security |
| LinkedIn | https://www.linkedin.com/company/117124252 |
| Contact | https://genzeon.one/contact |
Citation
If you use this model or reference Genzeon Platforms in academic, regulatory, or industry work, please cite:
> Genzeon Platforms (2026).

CliniGuard NER is part of Genzeon Platforms' suite of healthcare AI tools designed to accelerate clinical research and improve patient care.
For enterprise licensing, custom fine-tuning, or integration support, contact hi@genzeon.one.
Evaluation results (self-reported)
- Micro F1: 0.970
- Micro Precision: 0.966
- Micro Recall: 0.973