The Sponsor entity in TOP is per Organization, per Study. One Sponsor entity captures one Organization's role on one Study. A company that sponsors fifty Studies has fifty Sponsor entities, all anchored through belongsToOrganization to the same corporate Organization. The "all studies Pfizer sponsors" enterprise view is served from Organization.playsSponsorRole, not from a single Sponsor entity. This shape resolves the cases the OOUX map's enterprise-level Sponsor cannot: CRO-as-proxy on a single Study, multi-jurisdictional sponsor of record (Pfizer Inc under FDA, Pfizer Ireland under EMA on the same trial), M&A transfer of one Study without moving the rest of the portfolio, and the academic IIT where the legal sponsor is a single PI.
What you see if you query the model: three boolean responsibility flags per Sponsor (regulatory, financial, operational) plus an isInitiator flag plus an isSponsorOfRecord flag, a 1..1 link to the corporate Organization, a 1..1 link to the Study, an optional self-reference (actsOnBehalfOf) for delegation chains, an optional self-reference (parentSponsor) for structural lineage, an optional 0..N link to RegulatoryAuthority for jurisdiction scoping, and validFrom / validUntil temporal bounds for handoffs and transfers. That set is the entire architectural payload. Everything else (Contracts, Documents, Audits, Submissions, IP, Sites, Systems, Persons) hangs off the Sponsor through standard relationships and accumulates over the Study lifecycle.
The shape lets a regulator query "show me the legal Sponsor of record on Study ONCO-423 in my jurisdiction" and get exactly one entity back. It lets a site query "who do I call when the eTMF goes down on this Study" and get a Sponsor entity with operational responsibility. It lets a sponsor portfolio analyst query "show me every Sponsor entity backed by Pfizer Inc that is the SoR under FDA and is currently in active enrollment" without crossing layer boundaries. The model preserves clean projection into USDM and FHIR because every predicate carries one target type and every entity sits at one architectural scope.
| Attribute | Type | Optional | Doc |
|---|---|---|---|
| sponsorId | ngsi-ld:URI | no | Globally unique NGSI-LD identifier |
| sponsorName | xsd:string | no | Display name |
| legalName | xsd:string | no | Registered legal name |
| sponsorType | enum | no | PHARMACEUTICAL, BIOTECH, ACADEMIC, GOVERNMENT, INVESTIGATOR_SPONSOR, CRO_AS_SPONSOR, OTHER |
| duns | xsd:string | yes | Dun & Bradstreet DUNS number |
| orcid | xsd:string | yes | ORCID, populated for investigator-sponsors |
| address | xsd:object | no | Per-Study contact address (line1, city, region, postalCode) |
| country | xsd:string | no | ISO 3166-1 alpha-3 |
| phone | xsd:string | no | Primary phone |
| email | xsd:string | no | Primary email |
| website | xsd:anyURI | no | Sponsor website |
| status | enum | no | ACTIVE, INACTIVE, MERGED, ACQUIRED |
| isSponsorOfRecord | xsd:boolean | no | True if this Sponsor entity holds regulatory accountability per 21 CFR 312 (or jurisdictional equivalent). Exactly one Sponsor per (Study × jurisdiction) has this flag set to true. |
| hasRegulatoryResponsibility | xsd:boolean | no | Often coincident with isSponsorOfRecord; separated for the rare case where regulatory submission is delegated without transfer of legal SoR status. |
| hasFinancialResponsibility | xsd:boolean | no | True if this Sponsor provides funding. Distinct from SoR (an academic medical center may be the financial sponsor while a PI is the legal sponsor of an IIT). |
| hasOperationalResponsibility | xsd:boolean | no | True if this Sponsor manages day-to-day conduct. The CRO-as-proxy case. |
| isInitiator | xsd:boolean | no | True if this Sponsor initiated the Study. |
| validFrom | xsd:dateTime | yes | NGSI-LD temporal: from when this Sponsor relationship is operationally valid. |
| validUntil | xsd:dateTime | yes | NGSI-LD temporal: until when this Sponsor relationship is operationally valid. |
| primaryTherapeuticArea | xsd:string | yes | Per-Study TA focus from this Sponsor's role view (oncology, cardiology, rare disease, etc.). Corporate TA spread lives on Organization.primaryTherapeuticAreas. |
| Predicate | Target | Cardinality | Operational meaning |
|---|---|---|---|
| belongsToOrganization | Organization | 1..1 | The corporate Organization this Sponsor is backed by; carries the parentOrganization hierarchy for J&J / Janssen / Innovative Medicine cases. |
| runs | Study | 1..1 | The single Study this Sponsor entity is scoped to (per-Org-per-Study pattern). |
| actsOnBehalfOf | Sponsor | 0..1 | Self-ref for operational delegation; CRO operational sponsor → sponsor of record. |
| parentSponsor | Sponsor | 0..1 | Self-ref for structural lineage: M&A successor, lead/co-sponsor, EU CTR Article 74 legal-rep parent. Distinct from actsOnBehalfOf (operational) and belongsToOrganization (corporate). |
| regulatoryAuthorityScope | RegulatoryAuthority | 0..N | Jurisdictional scoping for multi-jurisdictional sponsor of record. |
| engages | Site | 0..N | Sites engaged for this Study (accumulates over startup). |
| supplies | InvestigationalProduct | 0..N | IP supplied for this Study (observational studies have none). |
| interfacesWith | OversightBody | 0..N | IRB, EC, DSMB, IDMC for this Study. |
| files | RegulatorySubmission | 0..N | Submissions filed under this sponsorship (accumulates over lifecycle). |
| operatesSystem | System | 0..N | Per-Study operational ownership of Systems (the "who do you call when the EDC breaks on this Study" view). Distinct from corporate tenancy at Organization.operatesSystem. |
| publishesDocument | Document | 0..N | Sponsor-controlled documents on this Study (TMF and beyond). |
| publishesPublication | Publication | 0..N | Resulting publications. Split from "publishes" for clean projection. |
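As a rough illustration of the architectural payload described above, the following sketch builds one Sponsor entity in NGSI-LD normalized form. The Sponsor identifier, the helper names, and the use of `datasetId` to carry the 0..N `regulatoryAuthorityScope` instances are assumptions made for this example, not the emitter's confirmed output; the descriptive attributes (names, address, status) are omitted.

```python
FLAG_NAMES = (
    "isSponsorOfRecord",
    "hasRegulatoryResponsibility",
    "hasFinancialResponsibility",
    "hasOperationalResponsibility",
    "isInitiator",
)


def prop(value):
    """Wrap a value as an NGSI-LD Property."""
    return {"type": "Property", "value": value}


def rel(target, dataset_id=None):
    """Wrap a target URI as an NGSI-LD Relationship."""
    out = {"type": "Relationship", "object": target}
    if dataset_id is not None:
        out["datasetId"] = dataset_id
    return out


def make_sponsor(sponsor_id, org_id, study_id, flags, authorities=()):
    """Build the flags and core links of one Sponsor entity (other attributes omitted)."""
    entity = {"id": sponsor_id, "type": "Sponsor"}
    for name in FLAG_NAMES:
        entity[name] = prop(bool(flags.get(name, False)))
    entity["belongsToOrganization"] = rel(org_id)  # 1..1
    entity["runs"] = rel(study_id)                 # 1..1
    if authorities:
        # Assumption: one relationship instance per authority, told apart by datasetId.
        entity["regulatoryAuthorityScope"] = [
            rel(a, dataset_id=f"urn:ngsi-ld:Dataset:scope-{i}")
            for i, a in enumerate(authorities)
        ]
    return entity


pfizer_fda = make_sponsor(
    "urn:ngsi-ld:Sponsor:pfizer-inc-onco-423",  # hypothetical identifier
    "urn:ngsi-ld:Organization:pfizer",
    "urn:ngsi-ld:Study:ONCO-423",
    {name: True for name in FLAG_NAMES},  # in-house case; isInitiator is assumed
    authorities=["urn:ngsi-ld:RegulatoryAuthority:fda"],
)
```

Everything that accumulates over the Study lifecycle (Sites, Submissions, Documents) would be added as further relationships on the same entity.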
Three decisions shape the Sponsor model. Each was made in dialogue and is reversible if downstream evidence argues against it.
The OOUX map's original Sponsor was implicitly enterprise-level (one Sponsor entity = one company; runs: Study (1..N)). We flipped this in v0.2-TOP to per-Organization-per-Study (runs: Study (1..1)). Reason: the operator's mental model of "the Sponsor of this study" is per-study contextual. The "all studies Pfizer sponsors" enterprise view is served from Organization.playsSponsorRole, not from a single Sponsor entity. This matches USDM's ProductOrganizationRole pattern and resolves Gemini's "show me trials where Org X is the payer but not the regulator" query cleanly. Trade-off: the per-Study scope means a company sponsoring fifty trials has fifty Sponsor entities. That is fine; they are cheap, they index well, and they make the SHACL invariants crisp.
Each Sponsor entity carries five booleans: isSponsorOfRecord, hasRegulatoryResponsibility, hasFinancialResponsibility, hasOperationalResponsibility, isInitiator. Gemini originally proposed a junction class to model the five-way Sponsor-role matrix; we collapsed it into flags after stress-testing the IIT case (PI = legal sponsor, AMC = financial sponsor, no separate initiator) and the CRO-as-proxy case (Pfizer = SoR + regulatory + financial; IQVIA = operational only). The flags express the matrix without introducing a junction-class indirection that would slow every query and confuse operators reading the entity. Trade-off: the flags require explicit SHACL invariants to encode the rule "exactly one isSponsorOfRecord = true per Study × jurisdiction." That invariant lives in §7.
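To make the flag matrix concrete, here is a minimal in-memory sketch of the two stress-test cases and the "payer but not the regulator" question. The `SponsorRow` class, the short organization keys, and the `isInitiator` assignments are illustrative assumptions; the other flag values follow the IIT and CRO-as-proxy descriptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SponsorRow:
    org: str
    study: str
    sor: bool = False          # isSponsorOfRecord
    regulatory: bool = False   # hasRegulatoryResponsibility
    financial: bool = False    # hasFinancialResponsibility
    operational: bool = False  # hasOperationalResponsibility
    initiator: bool = False    # isInitiator (assumed values below)


ROWS = [
    # IIT case: the PI is the legal sponsor, the AMC funds and operates.
    SponsorRow("dr-smith", "IIT-001", sor=True, regulatory=True, initiator=True),
    SponsorRow("md-anderson", "IIT-001", financial=True, operational=True),
    # CRO-as-proxy case: Pfizer holds SoR, regulatory, financial; IQVIA operates.
    SponsorRow("pfizer", "ONCO-423", sor=True, regulatory=True, financial=True, initiator=True),
    SponsorRow("iqvia", "ONCO-423", operational=True),
]


def payer_not_regulator(rows, org):
    """Studies where `org` funds the Study without holding regulatory responsibility."""
    return sorted(
        {r.study for r in rows if r.org == org and r.financial and not r.regulatory}
    )


iit_hits = payer_not_regulator(ROWS, "md-anderson")
pfizer_hits = payer_not_regulator(ROWS, "pfizer")
```

The question resolves with two flag comparisons on a single row, which is the property the junction-class design would have traded away.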
Pfizer Inc is the SoR under FDA on Study ONCO-423. Pfizer Ireland is the SoR under EMA on the same Study. Two different legal entities, two different Sponsor entities, both carrying isSponsorOfRecord = true, distinguished by regulatoryAuthorityScope (the FDA entry vs the EMA entry). The SHACL invariant relaxes from "exactly one SoR per Study" to "exactly one SoR per Study × jurisdiction." This matches how multi-region clinical trials actually file and how regulators read the entity at audit. Trade-off: the RegulatoryAuthority horizontal is currently flagged-missing in OOUX v0.2; the relationship resolves through the target-missing minCount-zero pattern until the horizontal is specified.
Pfizer Inc sponsors Study ONCO-423 under FDA. Pfizer runs ops in-house. No CRO, no co-sponsor, no academic involvement at the sponsorship layer.
Pfizer is the legal SoR under FDA. IQVIA runs day-to-day operations as proxy. Two Sponsor entities, both linked to the same Study, related by actsOnBehalfOf.
Pfizer Inc is SoR under FDA. Pfizer Ireland is SoR under EMA on the same Study. IQVIA operates the trial globally as proxy for both. Three Sponsor entities, two SoRs, one operational sponsor.
PI Dr. Smith at MD Anderson runs IIT-001 with departmental funds. Smith is the legal sponsor (sponsorType = INVESTIGATOR_SPONSOR, orcid populated, duns empty). MD Anderson is the financial sponsor and the operational owner.
Parexel is contracted as legal sponsor on a Phase I study for a small biotech that lacks the regulatory capacity to file an IND itself. Parexel is SoR; the biotech retains financial responsibility and IP ownership.
Pfizer acquires Arena Pharmaceuticals on 2026-03-15. Arena's Study ARENA-LEGACY-001 transfers to Pfizer sponsorship on the IND-transfer effective date 2026-04-01. The Arena Sponsor entity gets validUntil = 2026-04-01T00:00:00Z; a new Pfizer Sponsor entity is created with validFrom = 2026-04-01T00:00:00Z. Both records remain queryable for audit. The successor links via parentSponsor.
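The handoff can be sketched as two plain records, assuming validity is read as the half-open interval [validFrom, validUntil) so that exactly one record answers at the transfer instant. The Sponsor and Organization identifiers and both helper functions are hypothetical.

```python
from datetime import datetime, timezone

EFFECTIVE = datetime(2026, 4, 1, tzinfo=timezone.utc)  # IND-transfer effective date


def transfer_sponsorship(old, successor_id, successor_org, effective):
    """Close the outgoing Sponsor and create its successor at the effective instant."""
    closed = {**old, "validUntil": effective}
    successor = {
        **old,
        "id": successor_id,
        "belongsToOrganization": successor_org,
        "validFrom": effective,
        "validUntil": None,
        "parentSponsor": old["id"],  # structural lineage back to the predecessor
    }
    return closed, successor


def sponsor_as_of(records, when):
    """Records valid at `when`, with validity treated as [validFrom, validUntil)."""
    return [
        r for r in records
        if (r["validFrom"] is None or r["validFrom"] <= when)
        and (r["validUntil"] is None or when < r["validUntil"])
    ]


arena = {
    "id": "urn:ngsi-ld:Sponsor:arena-legacy-001",  # hypothetical identifier
    "runs": "urn:ngsi-ld:Study:ARENA-LEGACY-001",
    "belongsToOrganization": "urn:ngsi-ld:Organization:arena",  # hypothetical
    "validFrom": None,
    "validUntil": None,
    "parentSponsor": None,
}
arena_closed, pfizer_new = transfer_sponsorship(
    arena,
    "urn:ngsi-ld:Sponsor:pfizer-arena-legacy-001",  # hypothetical identifier
    "urn:ngsi-ld:Organization:pfizer",
    EFFECTIVE,
)
history = [arena_closed, pfizer_new]  # both records stay queryable for audit
```

Neither record is deleted or overwritten; an audit query simply picks the instant it cares about.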
Elevate Research is an SMO running sites across many Studies for many sponsors. Question: "show me every Sponsor entity whose engaged sites include any Elevate location, grouped by Sponsor parent Organization." This is a graph traversal, not a Sponsor-attribute query.
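Because this question crosses Sponsor → Site → site network, it is easiest to see as a small traversal. The sketch below runs over in-memory data; the site and sponsor keys are invented placeholders, and a real deployment would resolve the hops through the broker or a graph store rather than Python dictionaries.

```python
from collections import defaultdict

ELEVATE = "urn:ngsi-ld:Organization:elevate-research"

# Site -> partOfSiteNetwork (placeholder site keys)
SITE_NETWORK = {
    "site-a": ELEVATE,
    "site-b": ELEVATE,
    "site-c": "urn:ngsi-ld:Organization:other-network",
}

# Placeholder Sponsor entities with their engaged Sites
SPONSORS = [
    {"id": "sponsor-1", "belongsToOrganization": "org-x", "engages": ["site-a", "site-c"]},
    {"id": "sponsor-2", "belongsToOrganization": "org-x", "engages": ["site-b"]},
    {"id": "sponsor-3", "belongsToOrganization": "org-y", "engages": ["site-c"]},
    {"id": "sponsor-4", "belongsToOrganization": "org-y", "engages": []},
]


def sponsors_by_org_for_network(sponsors, site_network, network_id):
    """Group Sponsors that engage at least one Site of the network by parent Organization."""
    grouped = defaultdict(list)
    for sponsor in sponsors:
        if any(site_network.get(site) == network_id for site in sponsor["engages"]):
            grouped[sponsor["belongsToOrganization"]].append(sponsor["id"])
    return {org: sorted(ids) for org, ids in grouped.items()}


elevate_view = sponsors_by_org_for_network(SPONSORS, SITE_NETWORK, ELEVATE)
```

No Sponsor attribute is consulted in the filter; the result depends entirely on the two relationship hops.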
GET /entities?type=Sponsor&q=runs=="urn:ngsi-ld:Study:ONCO-423";hasOperationalResponsibility==true&options=keyValues
GET /entities?type=Sponsor&q=runs=="urn:ngsi-ld:Study:ONCO-423";isSponsorOfRecord==true;regulatoryAuthorityScope=="urn:ngsi-ld:RegulatoryAuthority:fda"
GET /entities?type=Sponsor&q=runs=="urn:ngsi-ld:Study:ONCO-423";isSponsorOfRecord==true;regulatoryAuthorityScope=="urn:ngsi-ld:RegulatoryAuthority:fda"&options=sysAttrs
GET /entities?type=Sponsor&q=engages.partOfSiteNetwork=="urn:ngsi-ld:Organization:elevate-research"&options=keyValues
GET /entities?type=Sponsor&q=belongsToOrganization=="urn:ngsi-ld:Organization:pfizer";isSponsorOfRecord==true;regulatoryAuthorityScope=="urn:ngsi-ld:RegulatoryAuthority:fda";runs.studyStatus=="active"
GET /temporal/entities?type=Sponsor&q=runs=="urn:ngsi-ld:Study:ARENA-LEGACY-001"&timerel=between&timeAt=2024-01-01T00:00:00Z&endTimeAt=2026-12-31T23:59:59Z&options=temporalValues
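A sketch of assembling such requests programmatically, assuming the broker follows the NGSI-LD convention of a single `q` parameter whose AND-ed terms are separated by `;`. The broker host and the `sponsor_query` helper are hypothetical; only the standard library is used, and the URL is built but not sent.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

BASE = "https://broker.example.org/ngsi-ld/v1"  # hypothetical broker root


def sponsor_query(conditions, path="/entities", **params):
    """Build a percent-encoded Sponsor query URL; conditions are AND-ed with ';'."""
    query = {"type": "Sponsor", "q": ";".join(conditions), **params}
    return f"{BASE}{path}?{urlencode(query)}"


url = sponsor_query(
    [
        'runs=="urn:ngsi-ld:Study:ONCO-423"',
        "isSponsorOfRecord==true",
        'regulatoryAuthorityScope=="urn:ngsi-ld:RegulatoryAuthority:fda"',
    ],
    options="sysAttrs",
)

# Round-trip the URL to confirm the filter survives encoding intact.
decoded = parse_qs(urlsplit(url).query)
```

Letting `urlencode` handle the quotes, colons, and semicolons avoids hand-escaping the URNs inside the filter expression.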
Most Sponsor-related questions an operator types into a search bar collapse to a small number of structural patterns. The translation table below shows how natural-language phrasings rewrite into NGSI-LD query shapes, and how the boolean responsibility flags carry the discriminating logic.
| Operator phrasing | Translated query shape | Discriminating predicate |
|---|---|---|
| "Who's the sponsor of Study X?" | type=Sponsor & runs==X & isSponsorOfRecord==true | isSponsorOfRecord |
| "Who's running the trial day-to-day?" | type=Sponsor & runs==X & hasOperationalResponsibility==true | hasOperationalResponsibility |
| "Who's funding it?" | type=Sponsor & runs==X & hasFinancialResponsibility==true | hasFinancialResponsibility |
| "Who started this trial?" | type=Sponsor & runs==X & isInitiator==true | isInitiator |
| "Who holds the IND in the US?" | type=Sponsor & runs==X & isSponsorOfRecord==true & regulatoryAuthorityScope==fda | regulatoryAuthorityScope |
| "Show me Pfizer's active oncology trials" | type=Sponsor & belongsToOrganization==pfizer & primaryTherapeuticArea==oncology & runs.studyStatus==active | primaryTherapeuticArea + studyStatus |
| "Who replaced Arena as sponsor?" | type=Sponsor & parentSponsor==Sponsor:arena-* & validFrom!=null | parentSponsor + validFrom |
| "Who's the CRO running this trial?" | type=Sponsor & runs==X & (sponsorType==CRO_AS_SPONSOR for a legal CRO sponsor, OR actsOnBehalfOf!=null for an operational proxy) | sponsorType vs actsOnBehalfOf |
| "Who's responsible for this Study's safety reporting?" | type=Sponsor & runs==X & hasRegulatoryResponsibility==true (then traverse to RegulatorySubmission) | hasRegulatoryResponsibility |
| "Has this Study had a sponsor change?" | temporal/entities, type=Sponsor & runs==X & count>1 OR any record where validUntil!=null | validUntil + temporal endpoint |
The pattern that earns its keep: every operator question reduces to a Sponsor-entity query plus one or two boolean-flag or scoped-relationship constraints. No junction-class hops, no derived-view materializations. The verb is always runs for "what Study" and the discriminator is always a flag, an enum, or a scoped relationship. This is what makes NLP-to-NGSI-LD translation tractable for a thin LLM layer.
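The reduction can be illustrated with a deliberately naive translator. Keyword matching stands in here for the LLM layer, and the phrase table and `translate` function are hypothetical; the point is only that the output always has the same shape, with one flag swapped in.

```python
# Phrase fragment -> discriminating flag, taken from the first four table rows.
FLAG_FOR_PHRASE = {
    "sponsor of": "isSponsorOfRecord",
    "day-to-day": "hasOperationalResponsibility",
    "funding": "hasFinancialResponsibility",
    "started": "isInitiator",
}


def translate(question, study_urn):
    """Map an operator question to a Sponsor query string for one Study."""
    text = question.lower()
    for phrase, flag in FLAG_FOR_PHRASE.items():
        if phrase in text:
            return f'type=Sponsor&q=runs=="{study_urn}";{flag}==true'
    raise ValueError(f"no Sponsor pattern matches: {question!r}")


STUDY = "urn:ngsi-ld:Study:ONCO-423"
funding_query = translate("Who's funding it?", STUDY)
ops_query = translate("Who's running the trial day-to-day?", STUDY)
```

A model-backed layer would replace the phrase lookup, but the target grammar it has to produce stays this small.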
The SHACL shapes in shapes/clinical-trials-shapes.ttl are emitted from the source intermediate by tools/build_shacl.py. The Sponsor NodeShape has 47 attribute property shapes and 24 relationship property shapes covering Sponsor and the Organization horizontal it backs; the full graph (Sponsor + Study + Site + Organization + Document) emits 84 property shapes total.
| Invariant | SHACL mechanism | Status |
|---|---|---|
| Every Sponsor must belong to exactly one Organization. | sh:minCount 1 ; sh:maxCount 1 on belongsToOrganization, plus sh:class top:Organization | Encoded |
| Every Sponsor must run exactly one Study. | sh:minCount 1 ; sh:maxCount 1 on runs, plus sh:class top:Study | Encoded |
| Required attributes (sponsorId, sponsorName, legalName, sponsorType, address, country, phone, email, website, status, the five booleans) must all be present. | sh:minCount 1 ; sh:maxCount 1 on each, plus sh:datatype for typed literals. | Encoded |
| sponsorType, status, identifierScheme are bounded by enum. | sh:in (...) | Encoded |
| Aspirational relationships (Audits, Submissions, IP, Documents, etc.) accumulate over time and may legitimately be empty. | 0..N cardinalities on every aspirational relationship, set via the v0.1.1 cardinality realism pass. | Encoded |
| Relationships pointing at flagged-missing target classes (RegulatoryAuthority, CRO, Country, Publication, SOP, TrainingProgram, DataTransferAgreement) automatically relax minCount to zero until the target is specified. | build_shacl.py forces minCount=0 when _targetMissing is set; emits a Turtle comment recording the relaxation. | Encoded (v0.1.1) |
| Predicate names are unambiguous within a focus class (no polysemous verbs). | Pre-emission guard check_no_polysemous_verbs in both build_context.py and build_shacl.py raises ValueError if a focus class has duplicate relationship or attribute names. | Encoded (v0.1.2) |
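The last two rows describe emitter behavior that is easy to picture in code. The following is only a sketch of what such a guard and relaxation could look like: the dictionary layout of the source intermediate and the function names are assumptions, and the actual logic in build_shacl.py may differ.

```python
from collections import Counter


def find_polysemous_names(focus_class):
    """Return attribute or relationship names that occur more than once."""
    names = [m["name"] for m in focus_class.get("attributes", [])]
    names += [m["name"] for m in focus_class.get("relationships", [])]
    return sorted(name for name, count in Counter(names).items() if count > 1)


def guard_names(focus_class):
    """Raise before emission if the focus class carries duplicate names."""
    dupes = find_polysemous_names(focus_class)
    if dupes:
        raise ValueError(f"{focus_class['name']}: polysemous names {dupes}")


def effective_min_count(relationship):
    """Relax minCount to 0 while the target class is flagged missing."""
    if relationship.get("_targetMissing"):
        return 0
    return relationship.get("minCount", 0)


before_split = {
    "name": "Sponsor",
    "relationships": [
        {"name": "publishes", "target": "Document"},
        {"name": "publishes", "target": "Publication"},
    ],
}
after_split = {
    "name": "Sponsor",
    "relationships": [
        {"name": "publishesDocument", "target": "Document"},
        {"name": "publishesPublication", "target": "Publication"},
    ],
}
```

Calling `guard_names(before_split)` raises, while `guard_names(after_split)` passes silently, which mirrors the split of "publishes" into two single-target predicates.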
Domain invariants (encoded as SHACL-SPARQL constraints by tools/build_shacl.py, validated by pyshacl in --advanced mode):
| Invariant | Severity | Status |
|---|---|---|
| Soft warning: isSponsorOfRecord = true should imply hasRegulatoryResponsibility = true. The rare 21 CFR 312.52 transfer-of-obligations case is permitted; surfaces to a human reviewer rather than blocking validation. | sh:Warning | Encoded (v0.1.4) |
| Hard violation: every Study must have at least one Sponsor with isSponsorOfRecord = true. | sh:Violation | Encoded (v0.1.4) |
| Hard violation: at most one Sponsor with isSponsorOfRecord = true per (Study × RegulatoryAuthority). Sponsors lacking explicit regulatoryAuthorityScope group together as the implicit single-jurisdiction bucket, so two unscoped SoRs on the same Study correctly fire. | sh:Violation | Encoded (v0.1.4) |
| Hard violation: every Study must have at least one Sponsor with hasOperationalResponsibility = true. If nobody is running it day-to-day, the trial cannot be conducted. | sh:Violation | Encoded (v0.1.4) |
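For readers who want the logic without reading SPARQL, the four invariants can be restated as a plain-Python check over the Sponsor entities of a single Study. This is an illustrative restatement, not the emitted constraints; the record layout and the short identifiers are assumptions.

```python
from collections import defaultdict


def validate_study(sponsors):
    """Return (severity, message) findings for the Sponsors of one Study."""
    findings = []
    sors = [s for s in sponsors if s.get("isSponsorOfRecord")]

    # Soft warning: SoR should imply regulatory responsibility.
    for s in sors:
        if not s.get("hasRegulatoryResponsibility"):
            findings.append(("Warning", f"{s['id']}: SoR without regulatory responsibility"))

    # Hard violation: at least one SoR.
    if not sors:
        findings.append(("Violation", "no Sponsor of record"))

    # Hard violation: at most one SoR per authority; unscoped SoRs share one bucket.
    buckets = defaultdict(list)
    for s in sors:
        for authority in s.get("regulatoryAuthorityScope") or [None]:
            buckets[authority].append(s["id"])
    for authority, ids in buckets.items():
        if len(ids) > 1:
            findings.append(("Violation", f"multiple SoRs for {authority or 'unscoped'}: {sorted(ids)}"))

    # Hard violation: at least one operational Sponsor.
    if not any(s.get("hasOperationalResponsibility") for s in sponsors):
        findings.append(("Violation", "no operational Sponsor"))
    return findings


GOOD = [
    {"id": "pfizer-inc", "isSponsorOfRecord": True, "hasRegulatoryResponsibility": True,
     "regulatoryAuthorityScope": ["fda"]},
    {"id": "pfizer-ireland", "isSponsorOfRecord": True, "hasRegulatoryResponsibility": True,
     "regulatoryAuthorityScope": ["ema"]},
    {"id": "iqvia", "hasOperationalResponsibility": True},
]
BAD = [
    {"id": "a", "isSponsorOfRecord": True, "hasRegulatoryResponsibility": True},
    {"id": "b", "isSponsorOfRecord": True},
]
good_findings = validate_study(GOOD)
bad_findings = validate_study(BAD)
```

The multi-jurisdictional example passes cleanly, while the two unscoped SoRs fall into the same bucket and are reported alongside the missing operational Sponsor.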
| Invariant | SHACL mechanism | Status |
|---|---|---|
| Temporal coverage continuity: no gaps in Sponsor coverage across handoffs (the Sponsor at validUntil=T1 should be succeeded by a Sponsor with validFrom=T1). | Custom SPARQL constraint or post-validation check. | Optional, low priority |
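If the post-validation route is chosen, the check could be as small as the sketch below. It assumes every record in the chain carries a validFrom and that records are compared pairwise in start order; the function name and the 2024 start date are hypothetical. It reports any handoff where validUntil and the next validFrom differ, which covers overlaps as well as gaps.

```python
from datetime import datetime, timezone


def handoff_mismatches(records):
    """Return (validUntil, next validFrom) pairs that do not line up exactly."""
    ordered = sorted(records, key=lambda r: r["validFrom"])
    mismatches = []
    for prev, nxt in zip(ordered, ordered[1:]):
        if prev["validUntil"] != nxt["validFrom"]:
            mismatches.append((prev["validUntil"], nxt["validFrom"]))
    return mismatches


def utc(year, month, day):
    return datetime(year, month, day, tzinfo=timezone.utc)


clean_chain = [
    {"id": "arena", "validFrom": utc(2024, 1, 1), "validUntil": utc(2026, 4, 1)},
    {"id": "pfizer", "validFrom": utc(2026, 4, 1), "validUntil": None},
]
gapped_chain = [
    {"id": "arena", "validFrom": utc(2024, 1, 1), "validUntil": utc(2026, 4, 1)},
    {"id": "pfizer", "validFrom": utc(2026, 4, 2), "validUntil": None},
]
clean_result = handoff_mismatches(clean_chain)
gapped_result = handoff_mismatches(gapped_chain)
```

An open-ended validUntil on the final record is not reported, since there is no successor to compare it against.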
| Standard | Equivalent class / property | Notes |
|---|---|---|
| FHIR R5 | fhir:Organization | FHIR collapses sponsor into a generic Organization with no sponsor-specific role discriminator. TOP's per-Org-per-Study Sponsor projects onto FHIR by populating Organization.type with codes for the sponsor role and binding the Study via ResearchStudy.associatedParty with a sponsor role (ResearchStudy.sponsor in R4). |
| USDM v3 | usdm:Sponsor | USDM models Sponsor as a Code type on Organization (one of several roles an Organization can play). TOP keeps Sponsor as an operator-grade top-level and uses belongsToOrganization to anchor it back to the corporate truth USDM expresses. Projection is two-hop: Sponsor → belongsToOrganization → Organization (USDM-shaped). |
| CDISC ODM-XML | odm:Study/GlobalVariables/StudyName | ODM-XML's Sponsor representation is shallow (a study-name attribute and a contact block). TOP's full Sponsor entity projects forward into ODM as a flattened Study-level metadata block. |
| CDASH v2.1 | (not directly mapped) | CDASH is a clinical data acquisition standard at the case-report-form level; Sponsor lives above its scope. |
| W3C PROV | prov:Organization (a subclass of prov:Agent) | PROV models Sponsor as an Agent associated with the activity (Study). The boolean responsibility flags map onto PROV qualified-association roles (prov:hadRole). |
| NCIt | NCIT:C70793 (Trial Sponsor) | NCIt's Trial Sponsor concept is the closest single-term match. |
Items called out for resolution before v0.1 of the clinical-research reference graph publishes. Each is tracked; none block the architectural decisions in §3.
| Issue | Status | Owner |
|---|---|---|
| Cardinality realism pass (1..N → 0..N on aspirational Sponsor relationships). | Closed in v0.1.1-strawman. Eighteen aspirational 1..N relationships relaxed to 0..N. The Sponsor invariant is: belongs to exactly one Organization, runs exactly one Study. Everything else accumulates over the Study lifecycle. Worked example validates clean against pyshacl. | Bo (Q9), shipped |
| Target-missing minCount handling in build_shacl.py. | Closed in v0.1.1-strawman. When a relationship's target class is flagged with _targetMissing the emitter relaxes minCount to 0 and emits a Turtle comment recording the relaxation. Constraints restore automatically once the target class lands. | Shipped |
| Polysemous verb split: "publishes" against Document and Publication; "operates" against TrainingProgram and System. | Closed in v0.1.2-strawman. Renamed to publishesDocument, publishesPublication, operatesTrainingProgram, operatesSystem. Path B (split, not collapse-with-sh:or) chosen to preserve clean projection into USDM and FHIR. Pre-emission guard check_no_polysemous_verbs added to both emitters; regression cannot slip back in. | Bo (Q7), shipped |
| Q3 parentSponsor (keep with note). | Closed in v0.1.3-strawman. Added parentSponsor (0..1) self-ref for structural lineage cases (M&A successor, lead/co-sponsor in co-development, EU CTR Article 74 legal-rep parent). Distinct from actsOnBehalfOf (operational delegation) and belongsToOrganization (corporate identity). | Bo (Q3), shipped |
| Q4 attribute completeness. | Closed in v0.1.3-strawman. primaryTherapeuticArea added on Sponsor (per-Study TA lens). employeeSizeBand, fiscalYearEnd, primaryTherapeuticAreas added on the Organization horizontal (corporate-scope facts). Regulatory authority of record left unchanged: already covered by isSponsorOfRecord + regulatoryAuthorityScope. | Bo (Q4), shipped |
| Q8 CTAs (operator-grade gaps). | Closed in v0.1.3-strawman. Added: Reassign Primary Contact, Transfer Studies Between Programs, Request Public Listing on ClinicalTrials.gov. | Bo (Q8), shipped |
| System ownership vs system use, three-layer modeling. Bo's eTMF insight (2026-05-07): a Sponsor may own multiple eTMFs but the operationally meaningful binding is per-Study; ownership matters because that is who you call when the system breaks. | Resolved architecturally; parked for Study spec implementation. Three layers: (1) corporate tenancy at Organization.operatesSystem (0..N), the master-contract view; (2) per-Org-per-Study operational ownership at Sponsor.operatesSystem (0..N), the call-when-it-breaks view (already in v0.1.2); (3) per-Study use binding at Study.usesEtmfSystem / usesEdcSystem / usesCtmsSystem / usesIrtSystem / usesSafetyDatabaseSystem / usesRandomizationSystem / usesEproSystem (0..1 each), the which-instance-is-this-Study-on view. System becomes a horizontal carrying vendor (→ Organization with type=VENDOR), productName, instanceId, baseUrl, systemType. The Sponsor v0.1.3 source does not change. | Parked for Study spec; tracked. Logged 2026-05-07. |
| RegulatoryAuthority horizontal needs full spec (currently flagged-missing in OOUX v0.2 #60). | Parked | Working group |
| CRO horizontal needs full spec (currently flagged-missing in OOUX v0.2). | Parked | Working group |
| USDM parentOrganization equivalent needs verification with David Iverson Hurst. | Outreach pending | Bo |
| SHACL "exactly one isSponsorOfRecord per (Study × jurisdiction)". | Closed in v0.1.4-strawman. Encoded as SHACL-SPARQL constraint via sh:SPARQLConstraint. The implementation uses OPTIONAL on regulatoryAuthorityScope so unscoped SoRs group together as the implicit single-jurisdiction bucket, correctly catching dual-SoR-without-scoping. Pfizer-IQVIA worked example was enriched with regulatoryAuthorityScope triples (FDA, EMA) so the multi-jurisdictional pattern actually validates rather than just being claimed in a comment. Requires pyshacl --advanced mode for SHACL-SPARQL. | Shipped |
| Boolean responsibility flag combinations: "isSponsorOfRecord implies hasRegulatoryResponsibility" and the cross-entity invariants (Study must have ≥1 SoR, ≥1 operational sponsor). | Closed in v0.1.4-strawman (Bo, soft-warning chosen). Implication encoded as sh:Warning; cross-entity rules encoded as sh:Violation. Plus a related fix: build_shacl.py now also suppresses sh:class when the target class is _targetMissing, on the same logic as the minCount relaxation (cannot validly assert a class against a class that does not yet exist). | Shipped |
| Temporal coverage continuity SHACL invariant (no gaps across handoffs). | Optional, low priority | Translator scaffold v3 |