AIR Datasets and Root Results - Proposal
1. Proposed Workitem: AIR Datasets and Root Results
- Proposal Editor: Kevin O’Donnell
- Editor: Kevin O’Donnell
- Domain: Radiology
The adoption of IHE Profiles in general, and AIR in particular, is harder without a robust set of example data to support development and testing.
Anticipating a growing quantity and sophistication of AI-generated image analysis results, effective interoperable organization methods are needed to help the clinicians navigate and absorb the information.
This workitem could both develop a robust set of example data and specify how to organize related result sets so they can be effectively used and navigated.
Both pieces of work would facilitate adoption of standardized AI Result encoding.
2. The Problem
The AIR Profile specifies predictable encoding of image analysis results for reliable receipt, parsing and display by consumers (Image Displays).
First, while the AIR Profile includes some text examples at the end of Annex A, it can be challenging for implementers (per comments from Lynn Felhofer and Herman Oosterwijk) to understand how the specification applies in their case and to correctly create conformant objects without a fuller set of examples as digital datasets.
Second, broad adoption of image analysis AI poses the next problem for Image Displays, which is how to navigate large result datasets:
- Expanded use of AI algorithms may produce very large collections of results for a given study
- Image Displays need to present that information to radiologists/clinicians
- Large result sets have logical structure/hierarchy that would help clinicians navigate and review the data
- Where to start; “Root”/“Summary” findings
- “Summary” findings are supported/derived from sub-sets of “sub-findings” (“drill-down”, “next layer”)
- The open-ended nature of the results being provided for display means Image Displays will be hard pressed to organize them on their own
- Without such organization, navigating large result sets will be labor intensive for radiologists, leading them to either waste time or ignore information.
- The Image Display needs a simple way to access & leverage the hierarchy/structure
- The Creator likely knows that structure but needs a way to communicate it
- Advanced “data organizing” software could also create summaries and structure for results from multiple algorithms
3. Key Use Case
CT Lung Screening Example:
- An AI analysis package detects 8 nodules
- For each detected nodule,
- a segmentation algorithm generates a segmentation and a centroid location
- a third algorithm estimates the size, the solidity, the margin, and the LungRADS assessment
- it also generates an overall LungRADS™ score.
- another algorithm (outside the LungRADS package) generates a result that pneumonia is present
- and stores an associated saliency map
These 40+ results are stored to the study (cardiac calcification and other screenings were not run on this study).
- The Lung Screening package stores a Root Result object with a root "layer 1 finding" of (LungRADS = Category 3)
- the root result references the "layer 2" findings (the 8 nodule locations and their LungRADS values)
- the root result references the "layer 3" findings for each nodule (segmentation, and assessments of the size, solidity, and margin)
- The Pneumonia application stores a Root Result object with a summary finding of (“pneumonia present”)
- that root result references the saliency map instance.
- The Image Display identifies two Root Result instances in the study and presents the layer 1 findings in the initial overlay
- LungRADS = Category 3
- Pneumonia present
- The radiologist may want to gain confidence in a layer 1 "summary finding", or to comprehend more details/nuances of the finding.
- The radiologist selects a layer 1 finding (LungRADS = Category 3) and its layer 2 findings are presented
- 8 nodule locations annotated with individual LungRADS scores
- The radiologist selects a layer 2 finding (one of the nodules) and its layer 3 findings are presented
- a nodule segmentation
- the nodule size, solidity, and margin assessment
It might be useful for the results to include a confidence and a "potential significance" to assist in filtering, layering, organizing, prioritizing, and progressive disclosure. Other navigation paradigms can be discussed. Some Display behaviors would likely be customizable to suit radiologist preferences. Ideally, some Displays will develop much more sophisticated analysis and logic, more advanced configurations, and more advanced navigation and display, while the Root Results provide a first simple step up from the flat list of findings.
The goal is to facilitate some basically useful navigation without Displays having to be customized for each AI algorithm, similar to the use of primitives.
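The layered navigation in the CT Lung Screening example can be sketched in a few lines of Python. This is only an illustrative in-memory model, not a proposed DICOM encoding: the `Finding` class and the functions `build_lung_screening_root`, `build_pneumonia_root`, `initial_overlay`, and `drill_down` are hypothetical names, and nesting layer 3 findings under their nodule is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Finding:
    """One result in the hierarchy (hypothetical model, not a DICOM encoding)."""
    label: str
    children: List["Finding"] = field(default_factory=list)


def build_lung_screening_root() -> Finding:
    """Layer 1 LungRADS finding -> layer 2 nodules -> layer 3 details."""
    nodules = []
    for n in range(1, 9):  # the 8 detected nodules from the example
        details = [
            Finding(f"nodule {n} segmentation"),
            Finding(f"nodule {n} size"),
            Finding(f"nodule {n} solidity"),
            Finding(f"nodule {n} margin"),
        ]
        nodules.append(Finding(f"nodule {n} location and LungRADS", details))
    return Finding("LungRADS = Category 3", nodules)


def build_pneumonia_root() -> Finding:
    """Layer 1 pneumonia finding that references the saliency map."""
    return Finding("Pneumonia present", [Finding("saliency map")])


def initial_overlay(roots: List[Finding]) -> List[str]:
    """Layer 1 findings the Display would present first."""
    return [root.label for root in roots]


def drill_down(finding: Finding) -> List[str]:
    """Next-layer findings for whatever the radiologist selected."""
    return [child.label for child in finding.children]


if __name__ == "__main__":
    roots = [build_lung_screening_root(), build_pneumonia_root()]
    print(initial_overlay(roots))
    print(drill_down(roots[0]))              # layer 2: the nodules
    print(drill_down(roots[0].children[0]))  # layer 3: details of nodule 1
```

Note that `initial_overlay` and `drill_down` contain no algorithm-specific logic, which is the point of the approach: the Display only follows the structure the creator supplied.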
4. Standards and Systems
Consider a DICOM SR object analogous to Key Object Selection.
One piece of proposed work will be to create example datasets (piloting the proposed improvement from the last retrospective).
A second piece of proposed work will be to assess how to make dataset organization tractable for relatively simple Image Displays.
5. Technical Approach
See Breakdown of Tasks details in the Prioritization Tab of the Evaluation Worksheet
This problem was identified during development of the AIR Profile and a “Root result” object was proposed.
- Some public comments supported the need and value
- Other comments challenged that the proposed mechanism had not been fully thought through or the potential complexities mapped out
Some form of DICOM SR object seems like a valid approach. Initial exploration ruled out simply using SR document titles, but there was not time to devise an appropriate structure.
An organizing object will be evaluated. If it seems workable, creating and using said object could be added as a Named Option in the AIR Profile for the Evidence Creator and Image Display actors.
There are expected to be multiple Root Result objects. Mandating a single catalog/index object for all the results in the study would require revising that object each time one of many algorithms stored new results to the study. That would be challenging, particularly in handling competing updates when multiple algorithms happen to complete at the same time.
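The competing-update concern can be illustrated with a small deterministic simulation. This is a toy sketch under stated assumptions: `StudyStore` is a hypothetical in-memory stand-in for the study's stored objects, and the "race" is modeled by having both algorithms read the catalog before either writes its revision.

```python
from typing import Dict, List


class StudyStore:
    """Toy stand-in for a study's stored objects (hypothetical, in-memory)."""

    def __init__(self) -> None:
        self.catalog: List[str] = []                  # single shared catalog object
        self.root_results: Dict[str, List[str]] = {}  # one root object per creator

    def read_catalog(self) -> List[str]:
        return list(self.catalog)

    def replace_catalog(self, revised: List[str]) -> None:
        self.catalog = list(revised)

    def store_root_result(self, creator: str, findings: List[str]) -> None:
        self.root_results[creator] = list(findings)


def single_catalog_race(store: StudyStore) -> List[str]:
    """Two algorithms revise one catalog after reading the same old version."""
    seen_by_lung = store.read_catalog()
    seen_by_pneumonia = store.read_catalog()
    store.replace_catalog(seen_by_lung + ["LungRADS = Category 3"])
    # The second revision is based on the stale read and overwrites the first.
    store.replace_catalog(seen_by_pneumonia + ["Pneumonia present"])
    return store.read_catalog()


def separate_roots(store: StudyStore) -> List[str]:
    """Each algorithm stores its own root object; nothing is revised."""
    store.store_root_result("lung screening", ["LungRADS = Category 3"])
    store.store_root_result("pneumonia", ["Pneumonia present"])
    return sorted(store.root_results)


if __name__ == "__main__":
    print(single_catalog_race(StudyStore()))  # the LungRADS entry is lost
    print(separate_roots(StudyStore()))       # both root results survive
```

With independent Root Result objects, each creator only ever adds a new object, so no coordination between algorithms is needed.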
6. Support & Resources
<List groups that have expressed support for the proposal and resources that would be available to accomplish the tasks listed above.>
<Identify anyone who has indicated an interest in implementing/prototyping the Profile if it is published this cycle.>
7. Risks
<List real-world practical or political risks that could impede successfully fielding the profile.>
<Technical risks should be noted above under Uncertainties.>
8. Tech Cmte Evaluation
Effort Evaluation (as a % of Tech Cmte Bandwidth):
- xx% for MUE
- yy% for MUE + optional
- Kevin O'Donnell
- TBA <typically with a technical editor, the Subject Matter Expert will bring clinical expertise; in the (unusual) case of a clinical editor, the SME will bring technical expertise>