HL7 V3 Datatypes Implementation Notes

From IHE Wiki

Introduction

This page discusses the HL7 V3 data types and their XML representation. It is not tied to a particular IHE profile, but it should be useful to implementers of any CDA Release 2 based content profile, or of any integration profile that uses HL7 V3 messages.

Background

The HL7 V3 data types are defined as an abstract specification, part of the HL7 V3 standard. The current version is Release 1, with Release 2 being in the final stages of development. This discussion is strictly about Release 1 of the HL7 V3 data types.

The abstract data types specification defines the properties of the data types. The XML Implementable Technology Specification (XML ITS) defines how the data types are represented in XML. HL7 publishes XML schemas for the V3 data types, which express most of the properties defined in the abstract specification; however, certain rules cannot be expressed in the XML Schema language. This is one of the reasons that the schemas published by HL7 are informative artifacts and are not sufficient to implement the full semantics of the V3 data types.

The schemas provided by HL7 are part of the V3 normative editions. Those from the 2008 Normative Edition are available at ftp://ftp.ihe.net/TF_Implementation_Material/ITI/schema/HL7V3/NE2008/coreschemas/

Implementation notes

The HL7 V3 Data Types specification is one of the more stable and robust parts of the HL7 V3 standard. For anyone tasked with implementing and supporting HL7 V3 messaging, or CDA Release 2 document creation or processing, it will be quite useful to have a single common reusable implementation of the HL7 V3 data types.

The abstract specification describes the relationships among the various data types, and the XML ITS tries to follow these relationships as closely as possible. The HL7 V3 data types have rich semantics, which results in a quite complicated schema and in a complex object-oriented implementation. Note that HL7 has an additional [UML ITS for the HL7 V3 data types], which may be useful for certain development environments, and the HL7 RIM Based Application Architecture (RIMBAA) workgroup has a [Java-based implementation for V3 data types] as part of its Java API for the V3 RIM.

While all major development platforms have tools to automatically generate code based on XML schema, most are not sophisticated enough to properly handle the recursive definitions and generic type extensions used by the HL7 V3 data types. To illustrate this point, the following sections go through the definition of a particular data type to see how its properties are defined.

The Starting Point

The following question was asked on the 2009 NA Connectathon mailing list:

according to the basic schema datatypes-base.xsd, an element of type EN may have sub-elements 
like given, family, prefix and so on.
 
now,
 - "given" is of type "en.given"
 - "en.given" restricts "ENXP"
 - "ENXP" extends "ST"
 - "ST" restricts "ED"
 - "ED" extends "BIN"
 - "BIN" extends "ANY"
 - "ANY" doesn't restrict or extend anything

question: which XMLSchema simple type has to be used for "given"?

In order to answer the question, let's follow the steps of the definition of an element of type en.given.
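
The chain listed in the question can also be written down as data and walked programmatically. The following is a minimal Python sketch, not part of any HL7 toolkit; the DERIVATION table and the ancestry function are hypothetical names, and the table contains only the steps quoted above.

```python
# Derivation steps exactly as listed in the question: type -> (kind, base type).
DERIVATION = {
    "en.given": ("restriction", "ENXP"),
    "ENXP": ("extension", "ST"),
    "ST": ("restriction", "ED"),
    "ED": ("extension", "BIN"),
    "BIN": ("extension", "ANY"),
    "ANY": (None, None),  # ANY does not restrict or extend anything
}


def ancestry(type_name):
    """Return the list of types from type_name up to the root of the chain."""
    chain = [type_name]
    while DERIVATION[type_name][1] is not None:
        type_name = DERIVATION[type_name][1]
        chain.append(type_name)
    return chain


print(" -> ".join(ancestry("en.given")))
```

Every step in the chain derives from a complex type, so the walk never arrives at an XML Schema simple type.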

Defining an XML Element

The simplest element definition in an XML schema may look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="urn:example:org" targetNamespace="urn:example:org">
<xs:element name="el1"/>
</xs:schema>

This simply defines an element with name el1, which belongs to the urn:example:org namespace. No other restrictions or properties are defined for this element. A minimal XML document consisting only of this element may look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<!--  We can have just the element here -->
<el1 xmlns="urn:example:org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample1.xsd"/>

Since there were no restrictions placed on the element in the schema, it can have text content, and any number of attributes or child elements:

<?xml version="1.0" encoding="UTF-8"?>
<!--  We can have the element with content-->
<el1 xmlns="urn:example:org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample1.xsd">Some content</el1>

<?xml version="1.0" encoding="UTF-8"?>
<!--  We can have the element with arbitrary attributes-->
<el1 xmlns="urn:example:org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample1.xsd" attr1="1"/>

<?xml version="1.0" encoding="UTF-8"?>
<!--  We can have the element even with arbitrary child elements-->
<el1 xmlns="urn:example:org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample1.xsd"><el2/></el1>

<?xml version="1.0" encoding="UTF-8"?>
<!--  We can have the element with arbitrary attributes, child elements and mixed content-->
<el1 xmlns="urn:example:org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample1.xsd" attr1="1">
    <el2/>
    Some text
    <el3>some other text</el3>
    more text
    <el4/>
</el1>

All of the above examples are valid with respect to the schema.

Defining a Complex Data Type in the Schema

If we take the definition of the ANY data type from the HL7 V3 schema, and remove the abstract attribute, we can see what will happen if we define our element to be of the ANY data type. The schema will be something like this:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="urn:example:org"
    targetNamespace="urn:example:org">
    <xs:element name="el1" type="ANY"/>
     <!-- This definition of the ANY data type is identical to the one in the HL7 datatypes-base.xsd schema,
     except that it is not defined as an abstract data type. This allows the associated examples to illustrate
     the effect of defining a complex data type for an element -->
    <xs:complexType name="ANY">
        <xs:annotation>
            <xs:documentation> Defines the basic properties of every data value. This is an abstract
                type, meaning that no value can be just a data value without belonging to any
                concrete type. Every concrete type is a specialization of this general abstract
                DataValue type. </xs:documentation>
        </xs:annotation>
        <xs:attribute name="nullFlavor" type="NullFlavor" use="optional">
            <xs:annotation>
                <xs:documentation> An exceptional value expressing missing information and possibly
                    the reason why the information is missing. </xs:documentation>
            </xs:annotation>
        </xs:attribute>
    </xs:complexType>
    <xs:simpleType name="cs">
        <xs:annotation>
            <xs:documentation> Coded data in its simplest form, consists of a code. The code system
                and code system version is fixed by the context in which the value occurs. is used
                for coded attributes that have a single HL7-defined value set. </xs:documentation>
        </xs:annotation>
        <xs:restriction base="xs:token">
            <xs:pattern value="[^\s]+"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="NullFlavor">
        <xs:annotation>
            <xs:documentation>vocSet: T10609 (C-0-T10609-cpt)</xs:documentation>
        </xs:annotation>
        <xs:union memberTypes="NoInformation"/>
    </xs:simpleType>
    <xs:simpleType name="NoInformation">
        <xs:annotation>
            <xs:documentation>specDomain: S10610 (C-0-T10609-S10610-cpt)</xs:documentation>
        </xs:annotation>
        <xs:union memberTypes="Other Unknown">
            <xs:simpleType>
                <xs:restriction base="cs">
                    <xs:enumeration value="NI"/>
                    <xs:enumeration value="MSK"/>
                    <xs:enumeration value="NA"/>
                    <xs:enumeration value="UNC"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:union>
    </xs:simpleType>
    <xs:simpleType name="Other">
        <xs:annotation>
            <xs:documentation>specDomain: S10616 (C-0-T10609-S10610-S10616-cpt)</xs:documentation>
        </xs:annotation>
        <xs:restriction base="cs">
            <xs:enumeration value="OTH"/>
            <xs:enumeration value="NINF"/>
            <xs:enumeration value="PINF"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="Unknown">
        <xs:annotation>
            <xs:documentation>specDomain: S10612 (C-0-T10609-S10610-S10612-cpt)</xs:documentation>
        </xs:annotation>
        <xs:union memberTypes="AskedButUnknown">
            <xs:simpleType>
                <xs:restriction base="cs">
                    <xs:enumeration value="UNK"/>
                    <xs:enumeration value="QS"/>
                    <xs:enumeration value="NASK"/>
                    <xs:enumeration value="TRC"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:union>
    </xs:simpleType>
    <xs:simpleType name="AskedButUnknown">
        <xs:annotation>
            <xs:documentation>specDomain: S10614
            (C-0-T10609-S10610-S10612-S10614-cpt)</xs:documentation>
        </xs:annotation>
        <xs:restriction base="cs">
            <xs:enumeration value="ASKU"/>
            <xs:enumeration value="NAV"/>
        </xs:restriction>
    </xs:simpleType>
</xs:schema>

The bulk of the schema is taken up by the vocabulary definitions of the allowable values for the single attribute, nullFlavor, defined by the ANY data type. Besides this new optional attribute, nothing else is defined for this data type, but this actually implies certain things about it.

The first example shows the use of the optional nullFlavor attribute:
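
Because NullFlavor is built from nested unions, the complete set of allowed values is easiest to see when the unions are flattened. The Python sketch below does this by hand, using only the enumerations from the schema above; the constant and function names are illustrative assumptions, not part of any HL7 API.

```python
# Each set mirrors one simpleType in the schema above; a union becomes a set union.
ASKED_BUT_UNKNOWN = {"ASKU", "NAV"}
UNKNOWN = ASKED_BUT_UNKNOWN | {"UNK", "QS", "NASK", "TRC"}
OTHER = {"OTH", "NINF", "PINF"}
NO_INFORMATION = OTHER | UNKNOWN | {"NI", "MSK", "NA", "UNC"}
NULL_FLAVOR = NO_INFORMATION  # NullFlavor is a union with a single member type


def is_null_flavor(value):
    """Return True if value is one of the codes enumerated in the schema."""
    return value in NULL_FLAVOR


print(sorted(NULL_FLAVOR))
```

The codes are case sensitive, so a value such as "oth" is not accepted.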

<?xml version="1.0" encoding="UTF-8"?>
<!--  el1 is now of type ANY which means that the only valid addition to 
    the element is the nullFlavor attribute -->
<el1 xmlns="urn:example:org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample2.xsd" nullFlavor="OTH"/>

There is nothing in the definition of the ANY data type describing the content of an element of this data type. This lack of description, however, implies that there is no content allowed (i.e. it defines an empty content model). The following examples are invalid for various reasons:

<?xml version="1.0" encoding="UTF-8"?>
<!-- el1 is now of type ANY, which has an empty content type, so no text 
    content is allowed. Therefore this example is invalid -->
<el1 xmlns="urn:example:org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample2.xsd">Some content</el1>

<?xml version="1.0" encoding="UTF-8"?>
<!--  el1 is now of type ANY, which has only one defined attribute, "nullFlavor", so no 
    other attributes are allowed. Therefore this example is invalid -->
<el1 xmlns="urn:example:org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample2.xsd" attr1="1"/>

<?xml version="1.0" encoding="UTF-8"?>
<!-- el1 is now of type ANY, which has an empty content type, so no children are allowed. 
    Therefore this example is invalid -->
<el1 xmlns="urn:example:org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample2.xsd"><el2/></el1>

<?xml version="1.0" encoding="UTF-8"?>
<!--  el1 is now of type ANY, which has an empty content type, so no children, text content, 
    or attributes (other than nullFlavor) are allowed. 
    Therefore none of the additional XML attributes or children are valid -->
<el1 xmlns="urn:example:org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample2.xsd" attr1="1">
    <el2/>
    Some text
    <el3>some other text</el3>
    more text
    <el4/>
</el1>
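
The three structural rules implied by the ANY definition (one permitted attribute, no text, no child elements) can be checked by hand when no validating parser is available. The following is a minimal Python sketch using only the standard library; any_type_violations is a hypothetical helper, it looks only at the root element, and it does not check the value of nullFlavor.

```python
import xml.etree.ElementTree as ET

XSI_NS = "http://www.w3.org/2001/XMLSchema-instance"


def any_type_violations(xml_text):
    """List the ways in which the root element breaks the ANY content model."""
    root = ET.fromstring(xml_text)
    problems = []
    for name in root.attrib:
        # xsi attributes such as xsi:schemaLocation are always permitted.
        if name.startswith("{" + XSI_NS + "}"):
            continue
        if name != "nullFlavor":
            problems.append("unexpected attribute: " + name)
    if len(root) > 0:
        problems.append("child elements are not allowed")
    # Text before the first child is in .text; text after a child is in its .tail.
    # An empty content model admits no character data, not even whitespace.
    texts = [root.text or ""] + [child.tail or "" for child in root]
    if any(texts):
        problems.append("text content is not allowed")
    return problems


print(any_type_violations('<el1 xmlns="urn:example:org" nullFlavor="OTH"/>'))
print(any_type_violations('<el1 xmlns="urn:example:org" attr1="1"><el2/>text</el1>'))
```

A check like this does not replace schema validation; it only makes the implied rules explicit.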

Extending a Complex Data Type

In order to take the next step along the type hierarchy, we can now restore the ANY data type to be an abstract data type, and use the BIN data type in the definition of our element.

In the HL7 V3 schema, the BIN data type is also defined as an abstract data type, so the definition is modified here to make it a regular data type. Other than that, the definition is identical to the one in the HL7 V3 schema. The BIN data type defines two additions to the ANY data type: the representation attribute, and mixed content for elements of this data type. In general, mixed content means that an element can have child elements and/or text content. In this particular case BIN has no child elements defined, so the mixed content definition simply allows text content to be present for elements of this data type.

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="urn:example:org"
    targetNamespace="urn:example:org">
    <xs:element name="el1" type="BIN"/>
     <!-- This definition of the BIN data type is identical to the one in the HL7 datatypes-base.xsd schema,
     except that it is not defined as an abstract data type. This allows the associated examples to illustrate
     the effect of defining an extension of a complex data type for an element -->
    <xs:complexType name="BIN" mixed="true">
        <xs:annotation>
            <xs:documentation>
                Binary data is a raw block of bits. Binary data is a
                protected type that MUST not be used outside the data
                type specification.
            </xs:documentation>
        </xs:annotation>
        <xs:complexContent>
            <xs:extension base="ANY">
                <xs:attribute name="representation" use="optional" type="BinaryDataEncoding" default="TXT">
                    <xs:annotation>
                        <xs:documentation>
                            Specifies the representation of the binary data that
                            is the content of the binary data value.
                        </xs:documentation>
                    </xs:annotation>
                </xs:attribute>
            </xs:extension>
        </xs:complexContent>
    </xs:complexType>
    <xs:simpleType name="BinaryDataEncoding">
        <xs:restriction base="xs:NMTOKEN">
            <xs:enumeration value="B64"/>
            <xs:enumeration value="TXT"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:complexType name="ANY" abstract="true">
        <xs:annotation>
            <xs:documentation> Defines the basic properties of every data value. This is an abstract
                type, meaning that no value can be just a data value without belonging to any
                concrete type. Every concrete type is a specialization of this general abstract
                DataValue type. </xs:documentation>
        </xs:annotation>
        <xs:attribute name="nullFlavor" type="NullFlavor" use="optional">
            <xs:annotation>
                <xs:documentation> An exceptional value expressing missing information and possibly
                    the reason why the information is missing. </xs:documentation>
            </xs:annotation>
        </xs:attribute>
    </xs:complexType>
    <xs:simpleType name="cs">
        <xs:annotation>
            <xs:documentation> Coded data in its simplest form, consists of a code. The code system
                and code system version is fixed by the context in which the value occurs. is used
                for coded attributes that have a single HL7-defined value set. </xs:documentation>
        </xs:annotation>
        <xs:restriction base="xs:token">
            <xs:pattern value="[^\s]+"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="NullFlavor">
        <xs:annotation>
            <xs:documentation>vocSet: T10609 (C-0-T10609-cpt)</xs:documentation>
        </xs:annotation>
        <xs:union memberTypes="NoInformation"/>
    </xs:simpleType>
    <xs:simpleType name="NoInformation">
        <xs:annotation>
            <xs:documentation>specDomain: S10610 (C-0-T10609-S10610-cpt)</xs:documentation>
        </xs:annotation>
        <xs:union memberTypes="Other Unknown">
            <xs:simpleType>
                <xs:restriction base="cs">
                    <xs:enumeration value="NI"/>
                    <xs:enumeration value="MSK"/>
                    <xs:enumeration value="NA"/>
                    <xs:enumeration value="UNC"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:union>
    </xs:simpleType>
    <xs:simpleType name="Other">
        <xs:annotation>
            <xs:documentation>specDomain: S10616 (C-0-T10609-S10610-S10616-cpt)</xs:documentation>
        </xs:annotation>
        <xs:restriction base="cs">
            <xs:enumeration value="OTH"/>
            <xs:enumeration value="NINF"/>
            <xs:enumeration value="PINF"/>
        </xs:restriction>
    </xs:simpleType>
    <xs:simpleType name="Unknown">
        <xs:annotation>
            <xs:documentation>specDomain: S10612 (C-0-T10609-S10610-S10612-cpt)</xs:documentation>
        </xs:annotation>
        <xs:union memberTypes="AskedButUnknown">
            <xs:simpleType>
                <xs:restriction base="cs">
                    <xs:enumeration value="UNK"/>
                    <xs:enumeration value="QS"/>
                    <xs:enumeration value="NASK"/>
                    <xs:enumeration value="TRC"/>
                </xs:restriction>
            </xs:simpleType>
        </xs:union>
    </xs:simpleType>
    <xs:simpleType name="AskedButUnknown">
        <xs:annotation>
            <xs:documentation>specDomain: S10614
            (C-0-T10609-S10610-S10612-S10614-cpt)</xs:documentation>
        </xs:annotation>
        <xs:restriction base="cs">
            <xs:enumeration value="ASKU"/>
            <xs:enumeration value="NAV"/>
        </xs:restriction>
    </xs:simpleType>
</xs:schema>

The first example shows the new attribute added to the el1 element. Note that while the example is valid according to the schema above, the presence of the representation attribute doesn't really make sense, since it refers to the text content of the element, and el1 has no content. Since this is valid XML, however, it is important for implementers to realize that there are semantic properties which have to be handled beyond XML schema validation. In this case, the representation attribute must be ignored, and the nullFlavor attribute must be processed as denoting that the element is indeed empty, and possibly the reason for it.

<?xml version="1.0" encoding="UTF-8"?>
<!--  el1 is now of type BIN which means that in addition to the nullFlavor attribute
there is another attribute, called representation-->
<el1 xmlns="urn:example:org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample3.xsd" nullFlavor="OTH" representation="TXT"/>

The second example shows el1 with text content. Note again that, purely in XML Schema terms, one could add the nullFlavor attribute to the element, but this would make for an invalid HL7 data type: the data types specification clearly defines the meaning of the nullFlavor attribute, and it must not be used when an element of a particular data type is not null.

Another thing to notice is that the representation attribute is not present. This is valid from both a purely XML Schema point of view, and from an HL7 point of view, because the attribute is optional, and it has a default value of TXT, which is, in fact, the representation shown in the example:

<?xml version="1.0" encoding="UTF-8"?>
<!-- el1 is now of type BIN, which is defined with mixed content, so text content is allowed. Therefore this example is valid -->
<el1 xmlns="urn:example:org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample3.xsd">Some content</el1>
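
The semantic rules described for the two examples above lie outside what the schema can enforce, so application code has to apply them. The following Python sketch shows one way to do that, under stated assumptions: read_bin is a hypothetical helper, plain text is treated as UTF-8, and the default value TXT is applied in code because a non-validating parser such as ElementTree does not supply schema defaults.

```python
import base64
import xml.etree.ElementTree as ET


def read_bin(xml_text):
    """Return (null_flavor, data) for a BIN-like root element."""
    root = ET.fromstring(xml_text)
    null_flavor = root.get("nullFlavor")
    text = root.text or ""
    if null_flavor is not None:
        if text.strip():
            # Schema-valid, but not a valid HL7 value: a null must have no content.
            raise ValueError("nullFlavor must not be combined with content")
        # The representation attribute describes content, so it is ignored here.
        return null_flavor, None
    representation = root.get("representation", "TXT")  # default from the schema
    if representation == "B64":
        return None, base64.b64decode(text)
    return None, text.encode("utf-8")  # assumption: text is handled as UTF-8


print(read_bin('<el1 xmlns="urn:example:org" nullFlavor="OTH" representation="TXT"/>'))
print(read_bin('<el1 xmlns="urn:example:org">Some content</el1>'))
```

Returning the null flavor and the data as separate values keeps the caller from mistaking an empty element for empty data.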

An example showing Base 64 encoded data is presented as part of the ED data type discussion.

The rest of the examples show how adding extra attributes and/or child elements makes the XML invalid with respect to the BIN data type as defined in the schema.

<?xml version="1.0" encoding="UTF-8"?>
<!--  el1 is now of type BIN, which has only two defined attributes, "nullFlavor" and "representation", 
    so no other attributes are allowed. Therefore this example is invalid -->
<el1 xmlns="urn:example:org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample3.xsd" attr1="1"/>

<?xml version="1.0" encoding="UTF-8"?>
<!--  el1 is now of type BIN, which has a mixed content type, but no child elements are defined.
    This practically restricts the BIN content type to only text content. Therefore this example is invalid -->
<el1 xmlns="urn:example:org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample3.xsd"><el2/></el1>

<?xml version="1.0" encoding="UTF-8"?>
<!--  el1 is now of type BIN, which has a mixed content type, but no child elements are defined.
    This practically restricts the BIN content type to only text content, so no child elements or 
    additional attributes (other than nullFlavor or representation) are allowed. 
    Therefore this example is invalid -->
<el1 xmlns="urn:example:org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample3.xsd" attr1="1" representation="TXT">
    <el2/>
    Some text
    <el3>some other text</el3>
    more text
    <el4/>
</el1>

Further Extensions

The ED data type further extends BIN by adding more properties to the abstract data type. Since ED is not abstract itself, the sample schema can now use the HL7 V3 data types directly, which is done by including the relevant schema. It is important to notice that the sample schema and all the examples use the urn:example:org namespace, and this can be done even when we include the HL7 V3 data types schema. This is possible because the data types XML schemas are defined without a target namespace. Such a schema is sometimes referred to as a chameleon schema, because all of its definitions become part of the namespace of the including schema.

Defining el1 to be of type ED would look something like this:

<?xml version="1.0" encoding="UTF-8"?>
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns="urn:example:org"
    targetNamespace="urn:example:org">
    <xs:include schemaLocation="datatypes-base.xsd"/>
    <xs:element name="el1" type="ED"/>
</xs:schema>
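
Whether a schema can be included in this chameleon fashion comes down to whether its root element declares a targetNamespace. The following Python sketch makes that check explicit; has_no_target_namespace is a hypothetical helper, and the two inline schemas are made-up stand-ins rather than the HL7 files.

```python
import xml.etree.ElementTree as ET

XS_SCHEMA = "{http://www.w3.org/2001/XMLSchema}schema"


def has_no_target_namespace(xsd_text):
    """Return True if the document is an xs:schema without a targetNamespace."""
    root = ET.fromstring(xsd_text)
    return root.tag == XS_SCHEMA and root.get("targetNamespace") is None


# Stand-in for a schema written like the data types schemas: no target namespace.
chameleon = '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema"/>'
# Stand-in for the sample schema above, which does declare one.
including = (
    '<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema" '
    'targetNamespace="urn:example:org"/>'
)

print(has_no_target_namespace(chameleon), has_no_target_namespace(including))
```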

The following example shows the el1 element with content appropriate for the ED data type. There are two child elements defined, reference and thumbnail, and several new attributes are added. The HL7 V3 Data Types specification and the published HL7 V3 data types schema provide complete details of the meaning and definitions of these extensions. One notable detail is the data type of the thumbnail element - it is a restriction on ED itself, which is done to explicitly prohibit a thumbnail from having a thumbnail, while keeping the thumbnail's data representation the same as the data represented by ED. Such recursion in the definition of the HL7 V3 data types is common, and this feature of XML Schema is supported by most toolkits.

The example shows the data both as content in the ED element, and as a reference. In most cases either one or the other method would be used, although using both is valid with regard to the XML Schema and to the HL7 V3 Data Types specification.

<?xml version="1.0" encoding="UTF-8"?>
<!--  el1 is now of type ED which means that in addition to the attributes from the BIN data type, 
    there are other attributes, and two child elements defined. The ED data type has mixed content, so the child
    elements can co-exist with the text content -->
<el1 xmlns="urn:example:org" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample4.xsd" representation="B64"  mediaType="application/pdf"
    integrityCheck="ODMwZDZjOTE5NjFiNzNjZDM1OTkwMWY2M2QwZDlkMDM1MzNiZGU5MjU4NTYwNDQ0MDcwMmNhODZmZGFlNjM3Nw==" 
    integrityCheckAlgorithm="SHA-256">
    <reference value="http://wiki.ihe.net/images/d/de/Sample.pdf"/>
    JVBERi0xLjQKJcOkw7zDtsOfCjIgMCBvYmoKPDwvTGVuZ3RoIDMgMCBSL0ZpbHRlci9GbGF0ZURl
    Y29kZT4+CnN0cmVhbQp4nHVRu2rEMBDs/RVbB6Ts6n0gBPGdXaQ7MKQI6fLoArkmvx+NZDlFCIZl
    [...snip...]
    Q0IzNzI3NDkzOUJFQzlGPiBdCi9Eb2NDaGVja3N1bSAvQ0JFMkI5RkFFM0I2RDE1RDI4MUQxMzZD
    NjU0MTM2MEQKPj4Kc3RhcnR4cmVmCjQ0MDM1CiUlRU9GCg== 
    <thumbnail representation="B64" mediaType="image/png">
        iVBORw0KGgoAAAANSUhEUgAAAEwAAABhCAIAAADzx9CUAAAAAXNSR0IArs4c6QAAAARnQU1BAACx
        jwv8YQUAAAAgY0hSTQAAeiYAAICEAAD6AAAAgOgAAHUwAADqYAAAOpgAABdwnLpRPAAAAuJJREFU
        [...snip...]
        UuWxSSry6F42iSZKlccmqcije9kkmihVHpukIo/u/V9NarvNd35/lqS5P/dUW6ft/AFlHiFy1UdU
        qQAAAABJRU5ErkJggg== 
    </thumbnail>
</el1>
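
Reading the inline data from such an element needs some care, because the text sits between child elements. The following Python sketch shows one way to collect and decode it and to compare it with integrityCheck. It rests on assumptions that should be checked against the data types specification: integrityCheck is taken to be the Base 64 encoding of the raw digest of the decoded data, plain text is treated as UTF-8, and the helper names, the payload and the example.org URL are made up for illustration.

```python
import base64
import hashlib
import xml.etree.ElementTree as ET

# Assumed mapping from integrityCheckAlgorithm values to hashlib names.
HASHLIB_NAMES = {"SHA-1": "sha1", "SHA-256": "sha256"}


def ed_inline_data(ed):
    """Return the inline data of an ED element as bytes."""
    # ElementTree keeps text before the first child in .text and text that
    # follows a child in that child's .tail. The content of the children
    # themselves, such as the thumbnail data, is deliberately left out.
    text = (ed.text or "") + "".join(child.tail or "" for child in ed)
    if ed.get("representation", "TXT") == "B64":
        # With the default validate=False, b64decode discards whitespace.
        return base64.b64decode(text)
    return text.encode("utf-8")  # assumption: text is handled as UTF-8


def integrity_ok(ed):
    """Compare integrityCheck with a digest of the decoded inline data."""
    algorithm = HASHLIB_NAMES[ed.get("integrityCheckAlgorithm")]
    expected = base64.b64decode(ed.get("integrityCheck"))
    return hashlib.new(algorithm, ed_inline_data(ed)).digest() == expected


# A small stand-in for the example above, built around a made-up payload.
payload = b"example payload"
sample = (
    '<el1 xmlns="urn:example:org" representation="B64" mediaType="text/plain" '
    'integrityCheck="{check}" integrityCheckAlgorithm="SHA-256">'
    '<reference value="http://example.org/sample.txt"/>\n    {data}\n    '
    '<thumbnail representation="B64" mediaType="image/png">AAAA</thumbnail>\n'
    '</el1>'
).format(
    check=base64.b64encode(hashlib.sha256(payload).digest()).decode("ascii"),
    data=base64.b64encode(payload).decode("ascii"),
)
ed = ET.fromstring(sample)
print(ed_inline_data(ed) == payload, integrity_ok(ed))
```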

The next example shows a much simpler use of the ED data type. Because the data in this case is simple text, both the representation and mediaType attributes can be omitted: their default values (TXT and text/plain respectively) represent plain text.

<?xml version="1.0" encoding="UTF-8"?>
<!-- el1 is now of type ED, which is defined with mixed content, so text content is allowed. Therefore this example is valid -->
<el1 xmlns="urn:example:org"
    xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
    xsi:schemaLocation="urn:example:org sample4.xsd">Some content</el1>

In fact, this use of the ED data type is so common that a separate type was created to represent plain text: the ST data type.

Restricting a Complex Data Type