XDS-SD Harmonization

From IHE Wiki
Revision as of 14:02, 22 May 2008 by Seknoop (talk | contribs) (→‎XDS Metadata)


Several IHE profiles rely on HL7 specifications, in particular HL7 CDA R2. Implementers of these profiles face many stumbling blocks because of inconsistencies between the HL7 CDA based IHE profiles and the artifacts published by HL7 that implementers rely on for their implementations. These implementation issues are significant enough to threaten the interoperability goals of IHE and should be addressed across the relevant committees.

This issue is in the early discussion phase among the co-chairs of IHE. The effort with XDS-SD is the beginning of an alignment campaign in which all IHE content profiles that are based on CDA use the PCC TF and, in particular, use the content module approach where appropriate. This will provide greater implementation and documentation consistency.

Scanned Documents Content Integration Profile

A variety of legacy paper, film, electronic and scanner outputted formats are used to store and exchange clinical documents. These formats are not designed for healthcare documentation, and furthermore, do not have a uniform mechanism to store healthcare metadata associated with the documents, including patient identifiers, demographics, encounter, order or service information. The association of structured, healthcare metadata with this kind of document is important to maintain the integrity of the patient health record as managed by the source system. It is necessary to provide a mechanism that allows such source metadata to be stored with the document. This profile defines how to couple such information, represented within a structured HL7 CDA R2 header, with a PDF or plaintext formatted document containing clinical information. Furthermore, this profile defines elements of the CDA R2 header necessary to minimally annotate these documents. Such header elements include information regarding patient identity, patient demographics, scanner operator identity, scanning technology, scan time as well as best available authoring information. Portions of CDA R2 header, along with supplemental document registration information, are then used to populate XDS Document Entry metadata.

The content of this profile is intended for use in XDS, XDR and XDM. Content is created by a Content Creator and is to be consumed by a Content Consumer. The Content Creator can be embodied by a Document Source Actor or a Portable Media Creator, and the Content Consumer by a Document Consumer, a Document Recipient or a Portable Media Importer. Obligations imposed on the Content Creator and the Content Consumer by this profile are understood to be fulfilled by the software that creates the final document for submission and/or consumes profile conformant documents rather than any particular scanning technology.

Use Cases

Content Use Cases

Text Chart Notes: Examples of this content include handwritten, typed or word processed clinical documents and/or chart notes. These documents are typically multi-page, narrative text. They include preprinted forms with handwritten responses, printed documents, typed and/or word processed documents, and documents saved in various word processing formats. Appropriate formats are PDF, derived from the word processing format, or plaintext, if the text structure is all that needs to be conveyed. PDF is desirable because it most faithfully renders word processed document content.
Graphs, Charts and/or Line Drawings: Examples of this content include Growth Charts and Fetal Monitoring Graphs. Line drawings such as these are best rendered using PDF rather than an image based compression such as JPEG. However, when computer generated PDFs include lines, PDF vector encoding should be used.
Optical Character Recognition (OCR) Scanned Documents: Clinical documents can contain text and annotations that cannot be fully processed by optical character recognition (OCR). We call attention to the fact that the OCR text content may only partially represent the document content. These documents are best supported by converting to PDF format, which can mix the use of OCR'd text, compressed scanned text, and scanned image areas.
Electronic Documents: Existing clinical documents that are electronically transmitted or software created (e.g. PDF or plaintext) can be considered as actually scanned, previously scanned or virtually scanned before they are shared. In this context, "actually scanned" refers to electronic documents newly created via some scanning technology from legacy paper or film for the purposes of sharing. "Previously scanned" refers to electronic documents that were previously produced via some scanning technology from legacy paper or film, but have existed in their own right for a period of time. "Virtually scanned" electronic documents are existing electronic documents not derived from legacy paper or film that either are in PDF/A or plaintext format or have been converted to one of these formats for the purposes of sharing. This content is covered by this profile.

Content Creator Use Cases

Content is created by a Content Creator. Impact on application function and workflow is implementation specific and out of scope of this content profile, though we note that a Content Creator is compliant with this content profile if it can produce CDA wrapped PDF, CDA wrapped plaintext or both. The following example use case is included to aid in the scoping of this content profile.

Legacy Clinic is a small two-physician clinic. They presently store their patients' medical records on paper. The Clinic is trying to figure out what to do with its paper and word processing documents as it converts over to an electronic system. They would like to be able to view the files over their local intranet.

Presently, most records are handwritten on preprinted paper forms that are inserted into specific sections of the patient's chart. More detailed encounter reports are dictated and sent to a transcription company that returns them in a word processing format. The medical records clerk at Legacy Clinic receives these files via e-mail, decrypts them, prints them out, and adds them to the patient's chart in the correct section. Over the years, Legacy Clinic has used a number of different transcription companies, and the documents are stored in a variety of word processing formats. Several years ago, they began to require that returned documents be in RTF format in an attempt to reduce frustrations induced by dealing with discrepant word processing formats. Only in some cases was patient and encounter metadata stored within the word processing document in a regular format, depending upon the transcription company used at the time.

A third party presently handles labs for the clinic. These are usually returned to the Clinic as printed documents. The clerk inserts these into the labs section in the patient's chart.

In the case of Legacy Clinic, the link between the word processing documents and the patient has been maintained for many of its documents, since the existing manual process maintains that association, and some of the files also contain the encounter metadata. However, the link to the specific encounter will need to be reestablished by interpreting the document content. This will require a great deal of manual effort for the documents that do not contain encounter metadata, and custom handling, depending upon the format used to store this metadata, for those that do.

Legacy Clinic uses a transcription provider that can generate PDF documents wrapped in a CDA Release 2.0 header. These are sent to Legacy Clinic via e-mail. While the same manual process is used, these documents are now in a format that is ready to be used by their new EHR system.

Content Consumer Use Cases

Content is consumed by a Content Consumer. Impact on application function and workflow is implementation specific and out of scope of this content profile. However, we note that adoption of this profile will necessitate the Content Consumer, upon document receipt, support the processing of both CDA wrapped PDF and CDA wrapped plaintext.


There are two actors in the XDS-SD profile, the Content Creator and the Content Consumer. Content is created by a Content Creator and is to be consumed by a Content Consumer. A Document Source or a Portable Media Creator may embody the Content Creator Actor. A Document Consumer, a Document Recipient or a Portable Media Importer may embody the Content Consumer Actor. The sharing or transmission of content or updates from one actor to the other is out of scope of this profile; it is addressed by the use of appropriate IHE profiles described in the section on Content Bindings with XDS, XDM and XDR.

XDS-SD Actor Diagram


XDS-SD Options

Actor               Option
Content Consumer    View Option (1)
                    Document Import Option (1)
                    Section Import Option (1)
                    Discrete Data Import Option (1)

Note 1: The Actor shall support at least one of these options.

Content Consumer Options

View Option

This option defines the processing requirements placed on Content Consumers for providing access, rendering and management of the medical document. See the View Option in PCC TF-2 for more details on this option.

A Content Creator Actor should provide access to a style sheet that ensures consistent rendering of the medical document content as was displayed by the Content Consumer Actor.

The Content Consumer Actor shall be able to present a view of the document using this style sheet if present.

Document Import Option

This option defines the processing requirements placed on Content Consumers for providing access to, and importing, the entire medical document and managing it as part of the patient record. See the Document Import Option in PCC TF-2 for more details on this option.

Section Import Option

This option defines the processing requirements placed on Content Consumers for providing access to, and importing, selected sections of the medical document and managing them as part of the patient record. See the Section Import Option in PCC TF-2 for more details on this option.

Discrete Data Import Option

This option defines the processing requirements placed on Content Consumers for providing access to, and importing, discrete data from selected sections of the medical document and managing them as part of the patient record. See the Discrete Data Import Option in PCC TF-2 for more details on this option.

Cross Enterprise Document Sharing, Media Interchange and Reliable Messaging

Actors from the ITI XDS, XDM and XDR profiles embody the Content Creator and Content Consumer sharing function of this profile. A Content Creator or Content Consumer may be grouped with appropriate actors from the XDS, XDM or XDR profiles to exchange the content described therein. The metadata sent in the document sharing or interchange messages has specific relationships or dependencies (which we call bindings) to the content of the clinical document described in the content profile.

The Patient Care Coordination Technical Framework defines the bindings to use when grouping the Content Creator of this Profile with actors from the IHE ITI XDS, XDM or XDR Integration Profiles.

Scanned Documents Bindings

Content              Binding                           Actor               Optionality
Scanned Documents    Medical Document Binding to XD*   Content Creator     R
                                                       Content Consumer    R

Content Creation Process

This profile assumes the following sequence of events in creation of an XDS-SD document.

  1. A legacy paper document is scanned and a PDF/A is rendered. Alternatively, an electronic document is converted, if necessary, to PDF/A or plaintext format.
  2. Software conformant to this profile, most likely with the aid of user input (e.g. to provide the document title, confidentiality code and original author), renders the CDA R2 header pertaining to the PDF or plaintext produced. The document is wrapped and the XDS-SD document is completed.
  3. XDS metadata is produced from data contained in the CDA header and from supplemental information.
  4. The completed XDS-SD document and corresponding metadata are sent via the Provide and Register Document Set Transaction [ITI-15 or ITI-41] of XDS/XDR, or the Distribute Document Set on Media Transaction [ITI-32] of XDM.

Content Standards

  • PDF RFC 3778, The application/pdf Media Type (informative)
  • PDF/A ISO 19005-1b. Document management - Electronic document file format for long-term preservation - Part 1: Use of PDF (PDF/A)
  • HL7 CDA Release 2.0 (denoted HL7 CDA R2, or just CDA, in subsequent text)
  • IETF (Internet Engineering Task Force) RFC 3066

Discussion of Content Standards

PDF and plaintext documents intended for wrapping can consist of multiple pages. Encoding of multiple page PDF documents is subject to the PDF/A standard. This ISO standard, PDF/A, is a subset of Adobe PDF version 1.4 intended to be suitable for long-term preservation of page-oriented documents. PDF/A attempts to maximize:

  • Device independence
  • Self-containment
  • Self-documentation

The constraints imposed by PDF/A include:

  • Audio and video content are forbidden
  • JavaScript and executable file launches are prohibited
  • All fonts must be embedded and also must be legally embeddable for unlimited, universal rendering
  • Colorspaces must be specified in a device-independent manner
  • Encryption is disallowed (although the enclosing document and transport may provide encryption external to the PDF content)
  • Compression methods are restricted to a standard list
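Some of the constraints above can be spot-checked before wrapping. The following is a rough sketch only, not a PDF/A validator: it assumes that searching the raw bytes for the PDF names `/Encrypt`, `/JavaScript` and `/Launch` is an acceptable first-pass heuristic, and the function name `find_pdfa_red_flags` is hypothetical. It can miss escaped names and says nothing about fonts, colorspaces or compression.

```python
from typing import List

# PDF names whose presence suggests a violation of the constraints listed above.
# Heuristic assumption: the names appear literally in the file's bytes.
FORBIDDEN_MARKERS = {
    b"/Encrypt": "encryption dictionary",
    b"/JavaScript": "JavaScript action",
    b"/Launch": "launch action",
}


def find_pdfa_red_flags(pdf_bytes: bytes) -> List[str]:
    """Return descriptions of the forbidden markers found in the raw PDF bytes."""
    return [
        description
        for marker, description in FORBIDDEN_MARKERS.items()
        if marker in pdf_bytes
    ]
```

An empty result does not establish PDF/A conformance; a dedicated validator is still needed for that.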

The PDF/A approach has several advantages over TIFF or JPEG. First, PDF offers more image compression options and format flexibility, so that image file sizes can be kept smaller. There are many simple programs available for converting TIFF and JPEG into PDF, with various other features for improving compression or adding other information. PDF/A also accommodates devices that produce vectorized output. Unlike TIFF, JPEG or BMP, a PDF/A image can provide several "layers" of information. This allows the creation of PDF searchable images.

A PDF searchable image is a PDF document with an exact bitmapped replica of the scanned paper pages and with text information stored behind the bitmap image of the page. This approach retains the look of the original pages while enabling text searchability and computer analysis. This approach is especially suitable for documents that have to be searchable while retaining the original scan details. The text layer is created by an Optical Character Recognition (OCR) application that scans the text on each page. It then creates a PDF file with the recognized text stored in a layer beneath the image of the text. Unrecognized graphics areas and annotations are preserved with full fidelity in the image. The text form may be incomplete or the OCR confused by some words, but the original image is preserved and available.

Plaintext as well as PDF/A documents shall be base-64 encoded before being wrapped in an HL7 CDA R2 header. The PDF/A documents shall conform to PDF/A-1b. Creators are encouraged to conform to PDF/A-1a to the maximum extent possible, but a simple document scanner may be unable to fully conform to PDF/A-1a. Other profiles may require PDF/A-1a conformance.
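As an illustration of the base-64 wrapping step, the sketch below places an encoded payload in a CDA R2 `nonXMLBody`. It is a minimal sketch under stated assumptions: the header elements this profile requires (patient, author, scanner operator and so on) are omitted, the function names are hypothetical, and `text/plain` is assumed as the media type for plaintext.

```python
import base64
import xml.etree.ElementTree as ET

CDA_NS = "urn:hl7-org:v3"  # HL7 Version 3 namespace used by CDA R2


def wrap_scanned_content(payload: bytes, media_type: str) -> ET.Element:
    """Build a skeletal ClinicalDocument whose nonXMLBody carries the payload.

    Only the body is produced here; the CDA header required by the
    profile is deliberately left out of this sketch.
    """
    if media_type not in ("application/pdf", "text/plain"):
        raise ValueError(f"unsupported media type: {media_type}")
    doc = ET.Element(f"{{{CDA_NS}}}ClinicalDocument")
    component = ET.SubElement(doc, f"{{{CDA_NS}}}component")
    body = ET.SubElement(component, f"{{{CDA_NS}}}nonXMLBody")
    text = ET.SubElement(
        body,
        f"{{{CDA_NS}}}text",
        {"mediaType": media_type, "representation": "B64"},
    )
    text.text = base64.b64encode(payload).decode("ascii")
    return doc


def unwrap_scanned_content(doc: ET.Element) -> bytes:
    """Recover the original bytes from a document built by wrap_scanned_content."""
    path = f"{{{CDA_NS}}}component/{{{CDA_NS}}}nonXMLBody/{{{CDA_NS}}}text"
    text = doc.find(path)
    if text is None or text.text is None:
        raise ValueError("no nonXMLBody/text element found")
    return base64.b64decode(text.text)
```

A Content Consumer would perform the reverse step, decoding the `text` element before rendering or importing the PDF or plaintext.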

The HL7 CDA R2 document is constrained so that pertinent metadata values and scanning facility, technology and operator information shall be present.

Medical imagery and photographs are outside the scope of this profile. Diagnostic or intervention medical imagery will be supported through DICOM (which includes the use of JPEG and MPEG). Additionally, audio and video recorded content is not covered by this profile.

XDS Metadata

XDS-SD is a CDA R2 document and thus conforms to the XDS Metadata requirements in the PCC-TF, volume 2, Section 5 unless otherwise specified below.


XDS-SD leverages the XDS DocumentEntry Metadata requirements in the PCC-TF, volume 2, Section and in PCC_TF-2/Bindings unless otherwise specified below.


The XDSDocumentEntry.formatCode shall be urn:ihe:iti:xds-sd:pdf:2008 when the document is scanned PDF and urn:ihe:iti:xds-sd:text:2008 when the document is scanned text.
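The formatCode rule can be expressed as a small lookup. In this sketch the two URNs come from the text above, while keying the lookup on the media types `application/pdf` and `text/plain` and the name `format_code_for` are assumptions made for illustration.

```python
# formatCode values stated in this profile, keyed by the wrapped content's
# media type (the choice of key is an assumption of this sketch).
FORMAT_CODES = {
    "application/pdf": "urn:ihe:iti:xds-sd:pdf:2008",
    "text/plain": "urn:ihe:iti:xds-sd:text:2008",
}


def format_code_for(media_type: str) -> str:
    """Return the XDSDocumentEntry.formatCode for a wrapped PDF or plaintext body."""
    try:
        return FORMAT_CODES[media_type.lower()]
    except KeyError:
        raise ValueError(f"no XDS-SD formatCode for media type {media_type!r}") from None
```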


The XDSDocumentEntry.uniqueId value shall be the ClinicalDocument/id in the HL7 CDA R2 header. The root attribute is required, and the extension attribute is optional. In accordance with the XDS.a profile, total length is limited to 128 characters; for XDS.b the limit is 256 characters. Additionally see PCC-TF, volume 2, Section or PCC_TF-2/Bindings for further content specification.
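The mapping from ClinicalDocument/id and the length limits can be sketched as follows. The limits (128 and 256 characters) are taken from the text above; joining root and extension with a `^` separator is an assumption of this sketch that should be checked against the PCC-TF bindings, and the function name is hypothetical.

```python
from typing import Optional

# Length limits stated above for each XDS variant.
MAX_UNIQUE_ID_LENGTH = {"XDS.a": 128, "XDS.b": 256}


def unique_id_from_document_id(
    root: str,
    extension: Optional[str] = None,
    profile: str = "XDS.b",
    separator: str = "^",  # assumed separator; confirm against PCC-TF bindings
) -> str:
    """Derive a metadata value from ClinicalDocument/id and enforce the length limit."""
    if not root:
        raise ValueError("ClinicalDocument/id/@root is required")
    unique_id = root if not extension else f"{root}{separator}{extension}"
    limit = MAX_UNIQUE_ID_LENGTH[profile]
    if len(unique_id) > limit:
        raise ValueError(f"value exceeds the {limit}-character limit for {profile}")
    return unique_id
```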

Relating instances of XDS-SD documents

In general, most instances of XDS-SD will not have parent documents. It is possible, however, in some specific use cases that instances of XDS-SD documents are related. For example, for a particular document it may be the case that both the PDF scanned content and somewhat equivalent plaintext need to be wrapped and submitted. Each document would correspond to separate XDSDocumentEntries linked via an XFRM Association that indicates one document is a transform of the other. These can be submitted in a single submission set, or in separate ones. Other specific examples may exist and this profile does not preclude the notion of a parent document for these cases.
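The PDF and plaintext example can be pictured as two document entries joined by one association. The sketch below is illustrative only: the `Association` class, its field names and the entry identifiers are hypothetical, and treating the transformed document as the source of the association and the original as its target is an assumption about direction.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """A link between two XDSDocumentEntries (hypothetical structure)."""
    association_type: str
    source_entry: str  # entry of the document that is the transform
    target_entry: str  # entry of the document it was transformed from


def link_transform(original_entry_id: str, transformed_entry_id: str) -> Association:
    """Relate a transformed document to its original with an XFRM association."""
    if original_entry_id == transformed_entry_id:
        raise ValueError("a document cannot be a transform of itself")
    return Association(
        association_type="XFRM",
        source_entry=transformed_entry_id,
        target_entry=original_entry_id,
    )
```

For instance, `link_transform("Document01-pdf", "Document02-text")` would describe the plaintext entry as a transform of the scanned PDF entry, whether the two are submitted in one submission set or in separate ones.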


No additional constraints apply to submission set metadata. Particular to this profile, a legitimate use of submission sets would be to maintain a logical grouping of multiple XDS-SD documents. We encourage such usage. For more information, see PCC-TF-2 Section or PCC_TF-2/Bindings.


No additional requirements. For more information, see PCC-TF-2 Section or PCC_TF-2/Bindings.

CDA Document Content Modules

XDS Scanned Document Specification (XDS-SD OID)

CDA Header Content Modules

CDA Section Content Modules


CDA and HL7 Version 3 Entry Content Modules



  • Metadata CP to PCC (CDA Bindings to XDS Metadata - see here), along with any resolution of what is stated in XDS-SD for metadata mapping (especially - subtypes)
  • how and where to document? - movement towards a domain independent CDA R2-based content framework?