key: cord-0286237-t105qww0
authors: Kramer, M. A.
title: FHIR Profiliferation: A Data Science Approach
date: 2022-03-12
journal: nan
DOI: 10.1101/2022.03.09.22272163
sha: f25cc1755404801816e109cd17852e60a1ddf04e
doc_id: 286237
cord_uid: t105qww0

Profiling is a mechanism for shaping Fast Healthcare Interoperability Resources (FHIR) for particular use cases. Profiliferation (profile + proliferation) is a coinage referring to the explosive growth in the number of FHIR profiles over the past few years. By reviewing a broad sample of almost 3000 FHIR profiles from 125 implementation guides, it was determined that just two items, Observation and Extension, accounted for half the profiles. FHIRs 80/20 rule was determined to be closer to 65/35, revealing that FHIR is more dependent on profiling than initially intended. Use of the Observation resource was inconsistent, hinting that the resource is either poorly designed, or that users lack proper guidance. Better management of reusable items and certain changes in FHIR and profiling practices could improve the consistency of FHIR artifacts and reduce unnecessary and potentially incompatible profiles.

Health Level 7 Fast Healthcare Interoperability Resources 1 (HL7® FHIR®) is a framework for interoperability. FHIR defines a data exchange syntax, an application programming interface (API), a set of health-related resources (information objects), some terminology, and an extensibility mechanism. By design, FHIR omits many details needed to implement health-related use cases.

Those details are provided by Implementation Guides (IGs) 2 . Among the most important items in IGs are FHIR profiles 3 . Profiles provide additional information for the use case's data structures, including constraints on data element cardinality, value sets, additional data elements ("extensions"), and more. Profiles can also be used at run time to check the validity of exchanged data.

Over the past few years, the FHIR community has built thousands of profiles 4 , leading to challenges in discovery and re-use of these resources. Given the difficulties of finding existing profiles, scarce knowledge of their provenance, and lack of control over third party development, IG authors frequently elect to create their own profiles rather than reusing existing ones, thus compounding the problem.

Before formulating approaches to deal with profile proliferation ("profiliferation"), it is worthwhile to analyze what specifically is being done with profiles. By studying extant profiles, it is possible to transform the problem from simply "too many profiles" to one that spotlights specific areas and practices that can be addressed.

The current study is a meta-analysis of the contents of IGs to identify patterns of profile specification. The sample of IGs was selected from the population of IGs listed in the FHIR Package Registry 5 . When accessed in February 2022, the Registry contained over 450 distinct IG packages based on FHIR Release 4 (R4). Eliminating packages uploaded for testing purposes, those created prior to 2020, and those containing serious errors, 125 IGs were selected (Appendix A). The majority of these IGs are cross-listed in the FHIR Implementation Guide Registry 6 , a more selective source that indicates a degree of maturity in the content of those IGs.

The selected IGs were broad-based. A total of 11 countries were represented, the most numerous being 53 IGs from the US, 24 from Germany, 9 from Switzerland, and 5 from Canada, with 24 considered universal (not countryspecific). The domains covered by the IGs included financial management, public health, administration, patient care, quality measurement, clinical decision support, clinical documents, medical records access, medications, and more. The StructureDefinitions 7 in each IG were analyzed. StructureDefinition is the FHIR resource used to define profiles of resources and datatypes. A small number of StructureDefinitions defining logical models (custom resources) were excluded. Other FHIR conformance resources 8 such as ValueSet, OperationDefinition, and SearchParameter were left out of scope.

StructureDefinitions contain a "differential" that captures all the changes made by the profile, relative to the profile's immediate parent. These changes include adding new data elements, restricting cardinality (for example changing an optional element to a required element), assigning value sets to data elements, and constraining datatypes.

For each StructureDefinition, a "deep differential" was constructed by recursively combining the differential of the target profile with the differentials of its parents ( Figure 1 ). The deep differential captures every modification made by the target profile on its base resource or datatype. Finally, for each data element, the types of profiling modifications were noted. Figure 1 . Formulation of the deep differential across multiple levels of profiles (generalizes to any depth).

To help organize and interpret the results, the data elements were subdivided into three categories:

1. Native Elements. These are elements defined in the core FHIR specification. Native elements include an element's sub-elements to any depth (whether explicit or implicit) and any inherited data elements. There are two types of extensions: standalone and inline 10 Regarding the subdivision of elements into these three categories, there might be an argument to classify slices as native since they are constraints on native elements. However, the identical argument would also apply to extensions, which are constraints on native extension elements. It is more informative to view slices as their own category, falling between the more rigid, predefined meaning of some native elements and the flexible semantics of extensions. For example, Observation.component is composed of arbitrary name-value pairs, like extensions, but the semantics are more constrained, since each component is supposed to represent an observation concurrent with the primary observation.

It is worth noting that native elements are much more numerous than the FHIR documentation indicates. For example, the documentation page for the Patient resource lists 28 data elements 11 , while the StructureDefinition contains 73 12 , yet this study observed 126 distinct native data elements. The difference is that neither the documentation nor the StructureDefinition fully accounts for nested and inherited elements.

The 125 selected IGs contained a total of 2,931 profiles. The IG with the most profiles (hl7.fhir.us.covid19library) had 150, while the median was 16 profiles, and the average 23.5 profiles. About one-third of the IGs (43 of 125) had 10 or fewer profiles.

The 15 most frequently profiled resources and datatypes are shown in Table 1 . Overall, 105 distinct FHIR resources and 13 complex datatypes were profiled. The remaining 40 resources and 50 datatypes defined in FHIR R4 were never profiled. Of the 40 unprofiled resources, 27 were "draft" status (FHIR Maturity Model 13 level 0), indicating very little use in the FHIR community. Only two unprofiled resources were maturity level 3 or higher (AppointmentResponse and OperationDefinition).

In total, there were 4864 slices defined in the profiles, about 1.7 per profile. The ten most frequently sliced elements were: Composition In total, the IGs used extensions 2645 times ( Figure 2 ). Of these, 2,291 were standalone extensions and 354 were inline extensions. Of the standalone extensions, 526 (23%) were standard extensions (included in base FHIR).

The standard extensions represented 124 unique extensions. The ten most frequently used standard extensions were: data-absent-reason (used 49 times), condition-assertedDate (42), bodySite (24), workflow-episodeOfCare (18), designNote (18), observation-secondaryFinding (14) , narrativeLink (13), patient-birthPlace (12), rendering-style (11) , and rendering-xhtml (11). . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted March 12, 2022 

Multiple profiling actions can be applied to one data element. Ten of the most common profiling actions are: changing the minimum or maximum cardinality, assigning a "must support" (MS) flag, constraining the datatype, assigning a fixed or pattern value, binding a value set or changing the binding strength, and adding slice(s) or extension(s). One of more of these actions were taken on each of the 52,397 profiled data elements across all IGs. Other profiling actions, such as adding descriptions, providing metadata about the profile, specifying upper and lower bounds on quantitative elements, etc., were not tracked. Table 2 shows the percentage of profiled data elements profiled each way. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted March 12, 2022. ; https://doi.org/10.1101/2022.03.09.22272163 doi: medRxiv preprint

Labeling an element "must support" is significant because it indicates the data element is of particular importance in the IG 14 . Assigning these flags is the most common profiling action. Table 3 shows which native elements were most likely to be MS, for the top 15 resources and data types. Top level elements not listed in the Table 3 are rarely designated MS in IGs and perhaps could have been omitted from the resource in question. It may be noted, however, that some IGs do not specify MS flags. Table 4 shows the same calculation based on all elements profiled across all IGs. The results are similar, but several more resources fall below the 80% line. According to this analysis, FHIR's 80/20 rule is more like 65/35. . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted March 12, 2022. ; https://doi.org/10.1101/2022.03.09.22272163 doi: medRxiv preprint Table 4 . Evaluation of FHIR's 80/20 rule according to the percentage of native (non-slice, non-extension) MS elements in all MS elements, by resource. For comparison, the percentage based on all profiled elements also shown. 

As the most profiled resource, Observation deserves special attention. Table 5 shows which top-level elements were profiled most frequently, with each count including profiled sub-elements. Observation.component was by far the most frequently profiled element because numerous slices created on this element. 0  7  dataAbsentReason  341  320  21  0  status  329  320  0  9  bodySite  295  257  12  26  method  248  240  0  8  extension  234  80  0  154  referenceRange  192  166  25  1  hasMember  173  85  88  0  interpretation  133  104  25  4  meta  132  104  28  0  device  104  104  0  0  specimen  104  104  0  0 . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted March 12, 2022. ; https://doi.org/10.1101/2022.03.09.22272163 doi: medRxiv preprint Focusing on Observation.component, what stands out is that many components defined in profiles do not appear consistent with that element's definition. Components are supposed to be observations that accompany or group with the primary observation. However, in practice, components were used to capture a wide variety of information, including circumstances surrounding the observation, methods, follow-up actions, interpretations, and more. Without judging the appropriateness of any particular one, here are some examples of observation components (names formatted for readability):

The variety of top-level extensions was also conspicuous, presenting things that might align with certain elements in the Observation resource, or expressed in another resource.

Here are examples of top-level extensions in Observation:

Some other observations:

• Some Observations with many components appeared to be more like questionnaires, with a series of questions and answers. • When profiling elements such as category, identifier, code, and performer, users were split between using fixed values, slicing, and value pattern approaches.

Base FHIR provides approximately 2/3 of the necessary elements in the IGs studied here. One reason for profiliferation is that profiling is used to define 33-38% of new data elements, rather than the 20% suggested by FHIR's 80/20 rule.

With almost 3000 profiles gathered from 125 IGs, only two profiled items, Extension and Observation, accounted for nearly half (49%) of all profiles. These are the prime targets for reducing profiliferation.

Although reusable "standard extensions" are included with FHIR, they accounted for only one-fifth of all extensions used in the IGs. There are clearly opportunities to increase this proportion. Currently, standard extensions are only updated when a new FHIR version is released. Release can be several years apart, which is far too slow for IG authors.

An alternative source of existing extensions is Simplifier.net 4 , which currently contains about 1500 active extensions for R4. A major drawback of Simplifier is that it is not curated, so its contents come with no guarantee of quality, stability, or uniqueness. There is no standardization when users can choose from 31 birth sex extensions.

Since medicine has many thousands of potential observables, extensive profiling of Observation is to be expected. However, it is apparent from this survey that the use of this resource is inconsistent. This has been the subject of previous discussions 16 . It could be that the resource is poorly designed, or that users lack proper guidance. Regardless, if the current situation persists, implementers will be faced with adapting their input and output processing to deal with many idiosyncratic Observation profiles.

Although not a direct source of profiliferation, slices are a major contributor to variation within profiles. In addition, slices are even less reusable than extensions because they are typically defined inline. To allow any reuse, the user must first create a standalone profile of the required datatype, and then constrain the slice datatype to an instance of . CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity.

The copyright holder for this preprint this version posted March 12, 2022. ; https://doi.org/10.1101/2022.03.09.22272163 doi: medRxiv preprint that profile. However, this approach does not apply to BackboneElement 17 , a hierarchical grouping element where slicing often occurs. FHIR does not allow profiling of nested structures based directly on BackboneElement, such as Observation.component or Patient.contact, and yet one can profile very similar elements, such as Observation.identifier or Patient.address. The difference is that the latter are based on normally-behaving data types (Identifier and Address, respectively). Inability to profile BackboneElement structures limits the reusability of slices. 1

Based on the current study, the following recommendations may help reduce profiliferation:

1. Evaluate what data elements should be added to each resource, so the coverage of native elements reaches 80% or more. Studying the extensions added to each resource may provide guidance on missing data elements. (The data here also show that some of the existing native elements are unnecessary, or at least, utilized at very low levels, but removing existing data elements is more problematic than adding new ones.)

2. Focus initial anti-profiliferation efforts on the most frequently profiled resources, Extension and Observation. Also, examine methods for standardizing and reusing slices.

3. Create a curated registry for collecting, sorting, standardizing, and sharing reusable items. The registry should not only contain extensions, but also specialized identifiers, addresses, and other profiled datatypes that may be useful outside of the context of a full resource profile.

4. Within registries, provide better mechanisms for searching within StructureDefinitions to help users and tools find definitions with certain properties, such as the use of a certain code. Improve authoring tools to detect when authors have created a profile similar to an existing profile.

5. Since slicing contributes substantially to variation among profiles, it is worth exploring how slices could be shared and reused. For example, it might be possible to create additional complex datatypes derived from BackboneElement, like Timing or Dosage. This will allow standalone profiling of these elements.

6. Create a release cycle for the registry that leads to rapid publication of reusable items. The current practice of linking the official extension registry to the FHIR release cycle is far too slow for IG developers.

7. Follow up this study with a closer look at how Observations have been profiled and create better guidance on using this critical resource. Address the proper use Observation.component and point out alternative means for expressing ancillary information associated with observations. It may be helpful to create abstract profiles to serve as parent profiles for different types of observations. There could be more clarity around when to use a questionnaire, and when to use Observations with many components. 8 . Strengthen education or governance that encourages IG creators to reuse existing extensions or create and contribute standalone extensions that have the best chance for reuse.

9. Educate IG developers on when to use fixed values, value patterns, and slicing approaches. Update old IGs and profiles to use value patterns when possible.

10. Conduct a similar study focused on the terminology landscape across a large sample of IGs.

This paper was intended to shed light on how profiles are being used across the FHIR community by analyzing a diverse set of 125 implementation guides. Strong trends were revealed, including the concentration of profiling on a small number of resources and datatypes. Areas where current practice is creating non-reusable items and contributing to the proliferation of profiles were uncovered, and specific recommendations for creating more usable IGs were proposed.

The current study only examined StructureDefinitions, and a full study of profiling should include other conformance resources, especially those related to terminology. While the current study analyzed a large set of IGs, it might be possible to broaden the set, or focus on particular types of IGs, for additional insights. 1 To be fully transparent, FHIR 4.6.0 provides the extension elementdefinition-profile-element for the purpose of reusing a profiled BackboneElement. This is an obscure feature, not used in any IG analyzed here.

. CC-BY-ND 4.0 International license It is made available under a is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. (which was not certified by peer review)

The copyright holder for this preprint this version posted March 12, 2022. ; https://doi.org/10.1101/2022.03.09.22272163 doi: medRxiv preprint

Health Level Seven. HL7 FHIR Release 4

Health Level Seven. HL7 FHIR Release 4

Health Level Seven. FHIR Package Registry

FHIR Implementation Guide Registry

Health Level Seven. FHIR Resource StructureDefinition

Health Level Seven. FHIR Conformance Module

Health Level Seven. HL7 FHIR Release 4

Health Level Seven. HL7 FHIR Release 4

Health Level Seven. HL7 FHIR Release 5 Draft Ballot

Health Level Seven. HL7 FHIR Release 4

Health Intersections

FHIR Observations -design considerations, the proliferation of profiles, the role of validation

Health Level Seven. HL7 FHIR Release 4

IGs are listed by their formal identifiers. More information on each package can be found at https://registry.fhir.org/