Sharing Value Sets

Proposed Profile: Sharing Value Sets in a Vocabulary Domain

Proposal Editor: Christel Daniel (AP-HP, INSERM, Paris), Karima Bourquard (GMSIH), François Gareil (Thales), Jean Delahousse (Mondeca), Norbert Lipszyc (DBmotion), Pierre Zweigenbaum (LIMSI, CNRS), Ana Estelrich (GIP-DMP), Charles Rica (GIP-DMP)
Profile Editor: Ana Estelrich
Date: N/A (Wiki keeps history)
Version: N/A (Wiki keeps history)
Domain: ITI

Summary

Federal healthcare facilities, RHIOs, and national EHRs need an effective way to share their health information using the same clinical vocabulary. The vocabulary used to encode patient data is not uniform, resulting in erroneous data capture and a lack of semantic interoperability (1). The issue becomes important when an application tries to implement a new nomenclature (or part of one), as is the case for a newly installed system. It also applies when a system needs to update an existing set of nomenclatures.

The HL7 v3 Reference Information Model (RIM) version 2.14n, as well as the HL7 Common Terminology Services (HL7 CTS) version 1.2, are good standards to use. These standards are mature and ready to use. CTS 2 is addressing some interesting service issues that can be looked at in parallel with this profile.

An end-user clinical application, such as a Content Creator/Consumer Actor or simply a medical device, will need a Value Set Consumer Actor in order to encode, create or consume structured, coded content. This would be the case with CDA r2-based documents, or with DICOM objects needing to be standardized. This is the access function of the Value Set. Moreover, a Value Set will contain values derived from one or more code systems, and will sometimes need to be updated to ensure that all applications using the same Value Set hold an identical version. This is the synchronisation function of the Value Set. This profile intends to address both cases.

IHE is the perfect venue to solve this problem because all the aforementioned interoperability efforts either aim to use or already use the IHE-ITI-XDS infrastructure. Moreover, the IHE PCC content profiles use the Clinical Document Architecture (CDA r2) as an established standard for the exchange of clinical documents, which need consistent values at the section or entry level (7). An example would be CDA CCD. The XDS-I profile metadata also needs a common Value Set (for example for “body parts”).

Since IHE-XDS is content-neutral, the profiles concerned are the content profiles. The need to have a common national terminology is of paramount importance when functional and semantic interoperability are at stake.

The Problem

Today’s terminologies are becoming more and more complex. Encoding is necessary to enable automated processing and not just human interpretation of ideas and concepts in the context of structured documents, namely the content profiles using the HL7 Clinical Document Architecture or applications using DICOM objects. Some of the benefits of encoded information are:

• The organization of information meant for human interpretation (classification of document types and section headings, data filtering and exploitation, easier navigation to related information)

• Effective indexing and retrieval of information (specific types of records or data)

• Automated translation to a different human language for human presentation (6).

A doctor or a technologist in a healthcare facility will try to use some type of encoding for filling out the details of a report, or the final results of an interpretation. The technologist would need a standardized picklist to indicate the body part involved in the procedure. The physician would need a special officially standardized nomenclature so that when s/he sends the Discharge Summary across the country with the patient, the application of the attending physician at the other end would be able to interpret it and extract the useful information.

Because of a lack of an officially standardized Value Set to be used in encoding or because of its lack of update, most healthcare facilities revert to using textual information or internal coding, which results in a lack of semantic interoperability.

Distributing an official Value Set from a Terminology Server would solve this problem, but it is a challenging task. This would have to be done when a new system is installed, or when a system upgrades its nomenclature. Loading a terminology from a disk can be time-consuming, and it has to be repeated each time an updated version becomes available.

Certain concepts in a Value Set used clinically will change or become obsolete, and new ones will be added. Most of the time the charge technologist searches the internet, or calls the system vendor or colleagues, to find out whether a new version has become available and where to get it. If the ValueSet is not obtained quickly enough and the changes are not extensive, they are usually entered by hand, leading to potential data entry errors. A method of synchronization with the official terminologies (updating) would reduce the workload involved in such tasks.
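The kinds of change described above (added, obsolete and modified concepts) can be computed mechanically once two versions of a Value Set are available. The following is a minimal sketch, assuming each version is represented as a mapping from concept code to display name. The function name `diff_value_sets` and the sample codes are hypothetical and are not taken from any real terminology.

```python
def diff_value_sets(old, new):
    """Compare two versions of a value set, each a dict of code -> display name."""
    # Codes present only in the new version.
    added = {code: new[code] for code in new.keys() - old.keys()}
    # Codes that disappeared, i.e. candidates for "obsolete" handling.
    removed = {code: old[code] for code in old.keys() - new.keys()}
    # Codes kept in both versions whose display name changed.
    renamed = {
        code: (old[code], new[code])
        for code in old.keys() & new.keys()
        if old[code] != new[code]
    }
    return {"added": added, "removed": removed, "renamed": renamed}


# Hypothetical codes, for illustration only.
version_1 = {"A01": "Organism A", "B02": "Organism B", "C03": "Organism C"}
version_2 = {"A01": "Organism A", "B02": "Organism B (revised)", "D04": "Organism D"}

changes = diff_value_sets(version_1, version_2)
```

A report of this kind could be reviewed by the charge technologist instead of re-entering changes by hand.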

Keeping a terminology up to date is important for the sake of interoperability. If an institution is using a different version of the values than the institution to which the document is sent, medical errors might result. To close the loop, as soon as a new Value Set is uploaded or updated, an internal mapping should be established between the data elements that the clinical application is using and the data definitions used in the HL7 specifications, since this will ensure user compliance and ease of use within the coding process.
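One way to picture such an internal mapping is a lookup table from the application's local terms to coded concepts. The sketch below is only an illustration under stated assumptions: the local term, the table `LOCAL_TO_CODED`, the function `encode` and the concept code `CODE-0001` are all hypothetical placeholders. Only the SNOMED-CT OID is taken from this proposal, and the dictionary keys follow the attribute names of the HL7 v3 CD data type.

```python
SNOMED_CT_OID = "2.16.840.1.113883.6.96"

# Local term (lower case) -> (code system OID, concept code, display name).
# "CODE-0001" is a placeholder, not a real SNOMED-CT identifier.
LOCAL_TO_CODED = {
    "anthrax bacillus": (SNOMED_CT_OID, "CODE-0001", "Bacillus Anthracis"),
}


def encode(local_term):
    """Return a CD-like dict for a local term, or text only if it is unmapped."""
    coded = LOCAL_TO_CODED.get(local_term.strip().lower())
    if coded is None:
        # No mapping yet: keep the narrative so that nothing is lost.
        return {"originalText": local_term}
    oid, code, display = coded
    return {
        "code": code,
        "codeSystem": oid,
        "displayName": display,
        "originalText": local_term,
    }


mapped = encode(" Anthrax Bacillus ")
unmapped = encode("unknown organism")
```

Terms that come back without a `code` are the ones that still need to be mapped after a Value Set update.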

Key Use Case

'Use-Case 1: Needing Consistent Encoding Terms for Pathology'

A patient is seen in a regional healthcare network A by a group of healthcare professionals (HCPs) such as oncologists, general practitioners, laboratory practitioners, pharmacists, and nurses. All these HCPs will want to capture relevant medical information required for the continuity of patient care, and they all need to use a common encoding terminology. For example, they would all need access to a pick list containing the Value Set Concept "Bacillus Anthracis" from the Value Set "Microorganism", derived from the most recent version of the terminology SNOMED-CT (8). If the system used by the laboratory practitioners has access to this encoding term but the application that the general practitioner uses does not, the latter application will not be able to extract this information. The general practitioner will be able to obtain the information by reading the narrative part of the document, but it will be lost for further processing by the application.

'Use-Case 2: Needing Consistent Encoding Terms for a Body Part on a CT scanner'

In hospital A, an imaging technologist is about to start a procedure. S/he chooses the protocol and "guesses" which body part to enter in the “body part” field on the machine, since nothing was preconfigured. Depending on the vendor and the implementation, the modality will sometimes override the information that the RIS administrator has configured, and at other times use the existing RIS information. Also, the body part information in one particular RIS might not be consistent with the encoding chosen by the RIS in hospital B, in another state. When the study is transmitted via XDS-I or even on a CD and imported into the local system of hospital B, the local PACS and RIS administrators need to manually reconcile this information with that present in their system, both for the comparison of reports and to ensure the same display (body parts influence the hanging protocols for the radiologists).

'Use-Case 3: Updating a List of Encoding Terms'

The regional healthcare network A has correctly encoded “Bacillus Anthracis” as the pathology. This result is sent to a primary care physician’s office that does not have the most recent Value Set from the Terminology Server in its area. The physician will be able to read the information, but the application will not be able to store it locally for later use. A mechanism is needed so that all the applications are synchronised with the Terminology Server.

Standards & Systems

The HL7 v3 Reference Information Model (RIM) version 2.14n, and the terminology models are interdependent. The HL7 v3 Data Types describe the structure and properties of the data types pertaining to the Value Set. The HL7 v3 RIM, Data Type definitions and the HL7 Vocabulary are all good parts of the standard to use.

The HL7 CTS version 1.2 - November 2004 (2) specifies the common functional characteristics that an external terminology must be able to provide and defines an Application Programming Interface (API) that can be used by HL7 version 3 software when accessing terminological content. The standard states that there are two layers between the HL7 message processing applications and the target vocabularies. The standard can be downloaded from: http://informatics.mayo.edu/LexGrid/downloads/CTS/specification/ctsspec/cts.htm

The upper layer, the Message API, communicates with the messaging software, and it does so in terms of vocabulary domains, contexts, value sets, coded attributes, and other artifacts of the HL7 message model.

The lower layer, the Vocabulary API, communicates with the terminology service software, and does so in terms of code systems, concept codes, designations, relationships and other terminology specific entities.

The Message API is specific to HL7. It allows a wide variety of message processing applications to create, validate and translate CD-derived data types in a consistent and reproducible fashion.

The Vocabulary API intends to be generic. It allows applications to query different terminologies in a consistent, well-defined fashion. The Message API uses the Vocabulary API.
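The layering described above, with a generic lower layer and an HL7-specific upper layer that delegates to it, can be sketched as follows. The class and method names (`VocabularyService`, `MessageService`, `validate_coded_attribute`) and the concept codes are illustrative assumptions only. They are not the operation names or signatures defined by the CTS specification.

```python
class VocabularyService:
    """Lower layer: speaks in code systems, concept codes and designations."""

    def __init__(self, code_systems):
        # code_systems: {code_system_oid: {concept_code: designation}}
        self._code_systems = code_systems

    def is_valid_code(self, code_system_oid, code):
        return code in self._code_systems.get(code_system_oid, {})

    def designation(self, code_system_oid, code):
        return self._code_systems[code_system_oid][code]


class MessageService:
    """Upper layer: speaks in vocabulary domains and value sets."""

    def __init__(self, vocabulary, domains):
        self._vocabulary = vocabulary  # the upper layer uses the lower layer
        # domains: {vocabulary_domain_name: set of (code_system_oid, code)}
        self._domains = domains

    def validate_coded_attribute(self, domain, code_system_oid, code):
        # The code must belong to the domain's value set and exist in its code system.
        in_value_set = (code_system_oid, code) in self._domains.get(domain, set())
        return in_value_set and self._vocabulary.is_valid_code(code_system_oid, code)


SNOMED = "2.16.840.1.113883.6.96"
# "CODE-0001" and "CODE-9999" are placeholders, not real SNOMED-CT identifiers.
vocabulary = VocabularyService({SNOMED: {"CODE-0001": "Bacillus Anthracis"}})
messages = MessageService(vocabulary, {"Microorganism": {(SNOMED, "CODE-0001")}})

accepted = messages.validate_coded_attribute("Microorganism", SNOMED, "CODE-0001")
rejected = messages.validate_coded_attribute("Microorganism", SNOMED, "CODE-9999")
```

Keeping the lower layer free of HL7 concepts is what would let other applications query the same terminologies without going through the message layer.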

A list of valid concept codes is referred to as a value set. The key terms regarding this proposal are:

• Common CTS Message Elements

• Service Identification Section – common to both message runtime and browsing API

• CTS Message Browsing API (such as looking up a vocabulary domain and looking up a value set).

The CTS 2 is addressing some interesting service issues that can be looked at in parallel with this profile.

Definitions (which will go into the Glossary)

A Vocabulary Domain is an abstract conceptual space such as "countries of the world" or "the gender of a person used for administrative purposes". Each Vocabulary Domain has a unique name along with a description of the conceptual space that it represents. Before values from this conceptual space can be used for an attribute, an actual list of concept codes needs to be defined.

Hence, a value set is ‘a collection of concepts drawn from one or more vocabulary code systems and grouped together for a specific purpose’ (e.g. the "Microorganism" value set derived from the SNOMED-CT code system) (8).

A value set also has Value Set Concepts. A Value Set Concept is the name for an object or abstract idea that provides a pointer to the code system concept code and/or name (e.g. "Bacillus Anthracis" is a concept in the "Microorganism" value set derived from the SNOMED-CT code system).

A value set will also have an OID (Value Set OID - Unique Object Identifier for a Value Set). The metadata and the associations of a value set are presented in Table 1:

A representation of a value set can be:

• Value Set Name: Infectious Agent (Microorganism)

• Value Set Code: PHVS_InfectiousAgent_CDC

• Value Set OID: 2.16.840.1.114222.4.11.908

• Code System Name: SNOMED-CT

• Code System Code: PH_SNOMED-CT

• Code System OID: 2.16.840.1.113883.6.96

(source: Public Health Information Network)(4).
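The representation above maps naturally onto a small data structure. The sketch below is one possible in-memory form, assuming a value set holds its own metadata, one or more code systems, and its concepts keyed by code system OID and concept code. The class and field names are hypothetical. The metadata values are those listed above, while the concept code `CODE-0001` is a placeholder rather than a real SNOMED-CT identifier.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class CodeSystem:
    name: str
    code: str
    oid: str


@dataclass
class ValueSet:
    name: str
    code: str
    oid: str
    code_systems: list  # one or more CodeSystem objects
    # (code system OID, concept code) -> Value Set Concept name
    concepts: dict = field(default_factory=dict)

    def contains(self, code_system_oid, concept_code):
        return (code_system_oid, concept_code) in self.concepts


snomed = CodeSystem("SNOMED-CT", "PH_SNOMED-CT", "2.16.840.1.113883.6.96")
infectious_agent = ValueSet(
    name="Infectious Agent (Microorganism)",
    code="PHVS_InfectiousAgent_CDC",
    oid="2.16.840.1.114222.4.11.908",
    code_systems=[snomed],
)
# Placeholder concept code, for illustration only.
infectious_agent.concepts[(snomed.oid, "CODE-0001")] = "Bacillus Anthracis"
```

Keying concepts by the pair of code system OID and code keeps them unambiguous when a value set draws from more than one code system.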

Since XDS.b uses Web Services, the Technical Committee might consider the use of Web Services APIs, such as the Java API for XML-based RPC (JAX-RPC) 1.1, which is an API for building and deploying SOAP+WSDL web service clients and endpoints.

Also, the Java API for XML Registries (JAXR) 1.0.4 can be used to access different kinds of XML registries. It provides a single set of APIs to access a variety of XML registries, including UDDI and the ebXML Registry, without having to know each registry's information model (9).

Technical Approach

An approach similar to ITI-XDS is adopted for the distribution of the terminologies, with a focus on the ValueSets used in a common clinical setting.

The HL7 CTS message elements, the metadata of the value set and its associations with the other components of the Vocabulary Domain must be investigated. The Service Identification Section and the Message Browsing API must also be examined to see exactly how they will affect the transactions between the proposed actors. Since XDS.b uses Web Services, it might also be of interest to examine the use of Web Services APIs.

Ultimately, the aim would be to treat a whole code system, as complex as SNOMED, for example, and covering vocabulary domains, contexts, and relations between terminologies such as interface terminologies, including the “processing” terminologies for data mining - Natural Language Processing technologies. These will be treated separately in the White Paper "Sharing Terminologies".

Existing actors

Content creator/consumers

New actors

'ValueSet Repository'

An actor whose role is to store the new ValueSets that the ValueSet Source has sent, as well as their subsequent updates. It is also responsible for registering the metadata of each new or updated ValueSet it receives from the ValueSet Source.

'ValueSet Registry'

An actor that keeps track of the metadata belonging to the ValueSets existing in the Value Set Repository. The registered metadata can be queried, namely on the name, the OID and the Assigning Auth Version. Each new entry or update in the ValueSet Repository will create an entry in the ValueSet Registry.
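The Registry's role can be sketched as a collection of metadata entries that can be filtered on any combination of the three criteria. This is a minimal illustration, not a proposed interface: the class names, the `query` signature and the version values "1" and "2" are assumptions, and the Assigning Auth Version is treated as free text because its format is not defined here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ValueSetMetadata:
    name: str
    oid: str
    version: str  # the "Assigning Auth Version"; format assumed to be free text


class ValueSetRegistry:
    """Keeps one metadata entry per Value Set version held by the Repository."""

    def __init__(self):
        self._entries = []

    def register(self, entry):
        # Called for each new or updated Value Set placed in the Repository.
        if entry not in self._entries:
            self._entries.append(entry)

    def query(self, name=None, oid=None, version=None):
        # Criteria left as None are ignored; all the others must match.
        return [
            e for e in self._entries
            if (name is None or e.name == name)
            and (oid is None or e.oid == oid)
            and (version is None or e.version == version)
        ]


registry = ValueSetRegistry()
registry.register(ValueSetMetadata(
    "Infectious Agent (Microorganism)", "2.16.840.1.114222.4.11.908", "1"))
registry.register(ValueSetMetadata(
    "Infectious Agent (Microorganism)", "2.16.840.1.114222.4.11.908", "2"))

all_versions = registry.query(oid="2.16.840.1.114222.4.11.908")
second_version = registry.query(version="2")
```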

'ValueSet Consumer'

An actor that queries the ValueSet Registry for a specific new Value Set or for an updated one, and then retrieves it from the ValueSet Repository. The ValueSet Consumer queries the ValueSet Registry on the name, the OID and the Assigning Auth Version so that it can update to the latest version if needed.

The ValueSet Consumer will interact with the Content Creator/Consumer (application, CT scanner) so that the latter can use the Value Sets required for encoding (either a brand new one or an updated one). This point of interaction between the Value Set Consumer and the application needing the values still has to be defined.

Existing transactions

The actors would need to authenticate to one another and also comply with the Consistent Time profile (coupled with ATNA and CT transactions).

New transactions (standards used)

ValueSet = VS:

Three transactions are proposed:

[Register VS]

The Value Set Repository registers the metadata associated with the Value Set it received from the Value Set Source - new or updated - with the Value Set Registry so that it will be available for query by the Value Set Consumer.

[Query VS Registry]

(The ValueSet Consumer asks the Value Set Registry for a list of the Value Sets available for uploading and access for the various versions of the applications using them). This interaction will also have to be adapted so that it can support a query for the most up-to-date Value Set version for retrieval by the ValueSet Consumer.

[Retrieve VS]

(The Value Set Consumer retrieves a new Value Set from the Value Set Repository). This interaction will also have to be adapted so that it can support a query for the most up-to-date version of the Value Set existing in the ValueSet Repository.
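The sequence of the three transactions, including the check for the most up-to-date version, can be sketched with plain in-memory structures. This is a sketch under stated assumptions, not a description of the eventual transactions: the function names are hypothetical, the version is assumed to be an integer that increases with each update, the concept codes are placeholders, and no transport (such as Web Services) is modelled. `provide_and_register_vs` stands in for the Value Set Source delivering content to the Repository.

```python
registry = []    # Value Set Registry: list of metadata records
repository = {}  # Value Set Repository: (oid, version) -> {code: display name}


def register_vs(oid, name, version):
    """[Register VS]: the Repository publishes metadata to the Registry."""
    registry.append({"oid": oid, "name": name, "version": version})


def provide_and_register_vs(oid, name, version, concepts):
    """Store the content received from the Source, then register its metadata."""
    repository[(oid, version)] = dict(concepts)
    register_vs(oid, name, version)


def query_vs_registry(oid):
    """[Query VS Registry]: return the metadata of the newest version, or None."""
    candidates = [m for m in registry if m["oid"] == oid]
    return max(candidates, key=lambda m: m["version"], default=None)


def retrieve_vs(oid, version):
    """[Retrieve VS]: fetch the content of one specific version."""
    return repository[(oid, version)]


def synchronise(oid, local_version):
    """Consumer side: return (version, concepts) if an update exists, else None."""
    latest = query_vs_registry(oid)
    if latest is None or latest["version"] <= local_version:
        return None
    return latest["version"], retrieve_vs(oid, latest["version"])


OID = "2.16.840.1.114222.4.11.908"
NAME = "Infectious Agent (Microorganism)"
# Placeholder concept codes, for illustration only.
provide_and_register_vs(OID, NAME, 1, {"X1": "Concept one"})
provide_and_register_vs(OID, NAME, 2, {"X1": "Concept one", "X2": "Concept two"})

update = synchronise(OID, local_version=1)
```

A Consumer that already holds version 2 would receive `None` from `synchronise` and skip the retrieval.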

Two other transactions can give a more complete picture, but they are left to the discretion of the ITI Technical Committee, being also mentioned in the White Paper treating the subject on a larger scale.

[Provide & Register VS]

(The Value Set Source provides a Value Set – new or an update – to the Value Set Repository).

[Notification VS] (optional)

The Value Set Registry notifies the VS Consumer that an update of the VS is available.

The transactions between the actors can be seen in Figures 1 and 2, within the context of the white paper. However, only the interactions in pink will be dealt with:
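As an alternative to having the Consumer poll the Registry, the optional notification could follow a simple subscribe-and-callback pattern. The sketch below is only one possible shape for it. The class `NotifyingRegistry`, its methods and the integer versions are assumptions made for illustration, and the actual mechanism is left open by this proposal.

```python
class NotifyingRegistry:
    """Registry that calls back subscribed Consumers when a newer version arrives."""

    def __init__(self):
        self._latest = {}       # oid -> latest registered version
        self._subscribers = {}  # oid -> list of callbacks

    def subscribe(self, oid, callback):
        self._subscribers.setdefault(oid, []).append(callback)

    def register(self, oid, version):
        previous = self._latest.get(oid)
        # Notify only when the version is genuinely newer than the known one.
        if previous is None or version > previous:
            self._latest[oid] = version
            for callback in self._subscribers.get(oid, []):
                callback(oid, version)


OID = "2.16.840.1.114222.4.11.908"
received = []

notifier = NotifyingRegistry()
notifier.subscribe(OID, lambda oid, version: received.append((oid, version)))
notifier.register(OID, 1)
notifier.register(OID, 2)
notifier.register(OID, 2)  # same version again: no notification is sent
```

With polling, the Consumer decides when to check. With notification, the Registry must keep track of its subscribers, which is part of the trade-off between the two options.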

Fig. 1: Creating/updating a Value Set and consuming it by an application

The focus proposed is on the activity diagram:

Fig. 2: Importing a value set (whole value set or update from a value set)

Impact on existing integration profiles

Needs to be paired with profiles CT and ATNA. Will improve the efficiency of coding in the content profiles.

New integration profiles needed

ITI-ST (ITI Sharing Terminologies)

Breakdown of tasks that need to be accomplished

• Detailed description of the current situation in a clinical setting with regards to terminology use.

• Detailed description of what this profile will bring

• Brief description of the whole terminology model

• Characteristics of encoded data

• Implementer’s responsibilities

• Figure out the feasibility of the transactions

• Detailed description of the ValueSet and its characteristics – metadata, content, structure

• Determine which metadata is necessary for registering the Value Set and which metadata is needed for the update.

• Handling Value Set Changes

• Description of the use of obsolete values in the Value Set and how the application should handle them

• Detail the connection between the ValueSet Consumer and the Content Creator/Consumer

• What terminology services functions does this profile offer – name exchange, identifier translation, local information, track version changes (10)

• How will the mapping be done?

• Conformance testing on the Value Set once retrieved

• Create table of dependencies on other profiles

• Follow Context

• Trigger Events

• Message Semantics

• Expected Actions

• Security considerations

• What issues will be addressed not here but in the white paper?

Support & Resources

France is in the midst of installing a PHR (Personal Health Record) at a national scale and is willing to contribute national efforts to the Profile development. Researchers from INSERM (the National Institute of Health and Medical Research), CNRS (National Center for Scientific Research), the GMSIH (a Public Interest Group in charge of the Modernization of the Healthcare Information Systems), and the Hospitals of Paris (AP-HP), comprising more than 40 hospitals, are willing to put effort into this profile. The industry is also interested in participating, namely companies such as Thales, Mondeca, and DBmotion.

Risks

To be discussed.

Open Issues

• What is the connection between a Content Creator/Consumer and a Value Set Consumer? How does the clinical application (Content Creator/Consumer) obtain the new codes or updated values from the Value Set Consumer?

• Should there be a VS Registry Stored Query or just a Query VS Registry?

• Should there be a notification mechanism when a new version is available or should the Value Set Consumer do a query?

Tech Cmte Evaluation

<The technical committee will use this area to record details of the effort estimation, etc.>

Effort Evaluation (as a % of Tech Cmte Bandwidth):

35% for ...

Responses to Issues:

See italics in Risk and Open Issue sections

Candidate Editor:

TBA

References

1. Semantic Issues in Integrating Data from Different Models to Achieve Data Interoperability. Rahil Qamar, Alan Rector. Medical Informatics Group, University of Manchester, Manchester, U.K. Consulted on line October 21, 2007, http://www.cs.man.ac.uk/~qamarr/papers/Medinfo_Paper_RQamar.pdf

2. HL7 Common Terminology Services. HL7® Version 3 Standard, © 2004 Health Level Seven®. Consulted on line October 21, 2007, http://informatics.mayo.edu/LexGrid/downloads/CTS/specification/ctsspec/cts.htm

3. United States Department of Health and Human Services. Office of the National Coordinator for Health Information Technology (ONC). Presidential Initiatives. Consulted on line October 22, 2007, http://www.hhs.gov/healthit/chiinitiative.html.

4. PHIN Vocabulary Access and Distribution System (VADS). Centers for Disease Control and Prevention. Frequently Asked Questions. Consulted on line October 22, 2007. http://www.cdc.gov/PhinVSBrowser/html/static/Help_html/what_is.htm#q14

5. The Lexical Grid, Shared Terminology Resources. Consulted on line October 22, 2007, http://informatics.mayo.edu/LexGrid/index.php?page=aboutlg

6. pan-Canadian iEHR Standards. iEHR Terminology Overview. Version 1.3. October 23, 2006. Canada Health Infoway.

7. IHE Patient Care Coordination Technical Framework, Volume II (PCC TF-2), Integration Profiles, Revision 3.0, 2007-2008. Consulted on line October 22, 2007. http://www.ihe.net/Technical_Framework/upload/IHE_PCC_TF_Vol_2_TI_2007_08_15.pdf.

8. PHIN Vocabulary Access and Distribution System (VADS). Centers for Disease Control and Prevention. Frequently Asked Questions. Consulted on line October 22, 2007. http://www.cdc.gov/PhinVSBrowser/html/static/Help_html/what_is.htm#q14

9. Service-Oriented Architecture (SOA) and Web Services: The Road to Enterprise Application Integration (EAI). Qusay H. Mahmoud, April 2005. Consulted on October 23, 2007. http://java.sun.com/developer/technicalArticles/WebServices/soa/

10. pan-Canadian iEHR Standards. Terminology Services. September 5, 2006. Canada Health Infoway.