AIR Datasets and Root Results - Proposal

1. Proposed Workitem: AIR Datasets and Root Results

  • Proposal Editor: Kevin O’Donnell
  • Editor: Kevin O’Donnell
  • Domain: Radiology

Summary

The adoption of IHE Profiles in general, and AIR in particular, is harder without a robust set of example data to support development and testing.

Anticipating a growing quantity and sophistication of AI-generated image analysis results, effective interoperable organization methods are needed to help clinicians navigate and absorb the information.

This workitem could both develop a robust set of example data and specify how to organize related result sets so they can be effectively used and navigated.

Doing this would facilitate adoption of standardized AI Result encoding in two ways: example data supports development and testing, and organized result sets are easier to use and navigate.

2. The Problem

The AIR Profile specifies predictable encoding of image analysis results for reliable receipt, parsing and display by consumers (Image Displays).

First, while the AIR Profile includes some text examples at the end of Annex A, without a fuller set of examples as digital datasets it can be challenging for implementers (per comments from Lynn Felhofer and Herman Oosterwijk) to understand how the specification applies in their case and to correctly create conformant objects.

Second, broad adoption of image analysis AI poses the next problem for Image Displays, which is how to navigate large result datasets:

  • Expanded use of AI algorithms may produce very large collections of results for a given study
  • Image Displays need to present that information to radiologists/clinicians
  • Large result sets have a logical structure/hierarchy that would help clinicians navigate and review the data
  • Where to start: “Root”/“Summary” findings
  • “Summary” findings are supported/derived from sub-sets of “sub-findings” (“drill-down”, “next layer”)
  • The open-ended nature of the results being provided for display means Image Displays will be hard pressed to organize them on their own
  • Without such organization, navigating large result sets will be labor intensive for radiologists, leading them to either waste time or ignore information.

So:

  • The Image Display needs a simple way to access & leverage the hierarchy/structure
  • The Creator likely knows that structure but needs a way to communicate it
  • Advanced “data organizing” software could also create summaries and structure for results from multiple algorithms

3. Key Use Case

CT Lung Screening Example:

  • An AI analysis package detects 8 nodules
  • For each detected nodule,
      ◦ a segmentation algorithm generates a segmentation and a centroid location
      ◦ a third algorithm estimates the size, the solidity, the margin, and the LungRADS assessment
  • it also generates an overall LungRADS™ score.
  • another algorithm (outside the LungRADS package) generates a result that pneumonia is present
      ◦ and stores an associated saliency map

These 40+ results are stored to the study (cardiac calcification and other screenings were not run on this study).


Strawman concept (a code sketch follows the list):

  • The Lung Screening package stores a Root Result object with a root "layer 1 finding" of (LungRADS = Category 3)
      ◦ the root result references the "layer 2" findings (the 8 nodule locations and their LungRADS values)
      ◦ the root result references the "layer 3" findings for each nodule (segmentation, and assessments of the size, solidity, and margin)
  • The Pneumonia application stores a Root Result object with a summary finding of (“pneumonia present”)
      ◦ that root result references the saliency map instance.
  • The Image Display identifies two Root Result instances in the study and presents the layer 1 findings in the initial overlay
      ◦ LungRADS = Category 3
      ◦ Pneumonia present
  • The radiologist may want to gain confidence in a layer 1 "summary finding", or to comprehend more details/nuances of the finding.
  • The radiologist selects a layer 1 finding (LungRADS = Category 3) and its layer 2 findings are presented
      ◦ 8 nodule locations annotated with individual LungRADS scores
  • The radiologist selects a layer 2 finding (one of the nodules) and its layer 3 findings are presented
      ◦ a nodule segmentation
      ◦ the nodule size, solidity, and margin assessment
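
As a concreteness check, here is a minimal pydicom sketch of what the Lung Screening package's Root Result might look like, assuming the layering is carried in an SR content tree. The document title reuses the placeholder code (Newcode0, 99IHE, "Analysis Root Result") from an earlier draft of this idea; all other codes, and the tree shape itself, are invented for illustration, since defining the real structure is precisely the open work.

from pydicom.dataset import Dataset
from pydicom.uid import generate_uid

def code(value, scheme, meaning):
    """Build a DICOM code-sequence item."""
    item = Dataset()
    item.CodeValue = value
    item.CodingSchemeDesignator = scheme
    item.CodeMeaning = meaning
    return item

def composite_ref(sop_class_uid, sop_instance_uid):
    """CONTAINS/COMPOSITE content item pointing at a subordinate result instance."""
    ref = Dataset()
    ref.ReferencedSOPClassUID = sop_class_uid
    ref.ReferencedSOPInstanceUID = sop_instance_uid
    item = Dataset()
    item.RelationshipType = "CONTAINS"
    item.ValueType = "COMPOSITE"
    item.ReferencedSOPSequence = [ref]
    return item

# Stand-ins for the per-nodule result objects the analysis package stored earlier.
per_nodule_sr_uids = [generate_uid() for _ in range(8)]

# Root Result object (patient/study modules and file meta omitted for brevity).
root = Dataset()
root.SOPClassUID = "1.2.840.10008.5.1.4.1.1.88.33"  # Comprehensive SR
root.SOPInstanceUID = generate_uid()
root.Modality = "SR"
root.ValueType = "CONTAINER"
root.ContinuityOfContent = "SEPARATE"
# Placeholder document title a Display could filter on to find root results.
root.ConceptNameCodeSequence = [code("Newcode0", "99IHE", "Analysis Root Result")]

# Layer 1 finding: the overall LungRADS category (invented codes).
layer1 = Dataset()
layer1.RelationshipType = "CONTAINS"
layer1.ValueType = "CODE"
layer1.ConceptNameCodeSequence = [code("RR001", "99IHE", "Summary finding")]
layer1.ConceptCodeSequence = [code("LR3", "99IHE", "LungRADS Category 3")]
# Layer 2: references to the 8 per-nodule results, each of which in turn
# references its own layer 3 details (segmentation, size, solidity, margin).
layer1.ContentSequence = [
    composite_ref("1.2.840.10008.5.1.4.1.1.88.22", uid)  # Enhanced SR
    for uid in per_nodule_sr_uids
]
root.ContentSequence = [layer1]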


It might be interesting for the results to also include a confidence and a "potential significance" to assist in filtering, layering, organizing, prioritizing, and progressive disclosure. Other navigation paradigms can be discussed. Some Display behaviors would likely be customizable to suit radiologist preferences. Ideally, some displays will develop much more sophisticated analysis, logic, configurations, navigation, and display, while the Root Results provide a first simple step up from the flat list of findings.
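
If adopted, such confidence/significance attributes could ride along as properties of each finding. Extending the sketch above with a hypothetical confidence item (the concept name code is invented):

# Hypothetical confidence property attached to the layer 1 finding.
conf_item = Dataset()
conf_item.RelationshipType = "HAS PROPERTIES"
conf_item.ValueType = "NUM"
conf_item.ConceptNameCodeSequence = [code("RR010", "99IHE", "Algorithm confidence")]
value = Dataset()
value.NumericValue = "0.87"
value.MeasurementUnitsCodeSequence = [code("1", "UCUM", "no units")]
conf_item.MeasuredValueSequence = [value]
layer1.ContentSequence.append(conf_item)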

The goal is to facilitate some basically useful navigation without Displays having to be customized for each AI algorithm, similar to the use of primitives.
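
On the Display side, the hoped-for behavior is mechanical: filter the study's SR instances on the root document title, present the layer 1 findings, and resolve one layer of references at a time as the radiologist drills down. A sketch, again assuming the placeholder title code used above:

import pydicom
from pydicom.dataset import Dataset

ROOT_TITLE = ("Newcode0", "99IHE")  # placeholder root-result document title

def is_root_result(ds):
    """True if the SR's document title flags it as a Root Result."""
    for title in getattr(ds, "ConceptNameCodeSequence", []):
        if (title.CodeValue, title.CodingSchemeDesignator) == ROOT_TITLE:
            return True
    return False

def next_layer_uids(content_item):
    """One drill-down step: UIDs referenced by COMPOSITE children of a finding."""
    return [
        ref.ReferencedSOPInstanceUID
        for child in getattr(content_item, "ContentSequence", [])
        if child.ValueType == "COMPOSITE"
        for ref in child.ReferencedSOPSequence
    ]

# study_paths: file paths for the study's instances, assumed already retrieved.
study_paths = []
study = [pydicom.dcmread(p) for p in study_paths]
for root in (ds for ds in study
             if getattr(ds, "Modality", "") == "SR" and is_root_result(ds)):
    summary = root.ContentSequence[0]  # layer 1 finding for the initial overlay
    print(summary.ConceptCodeSequence[0].CodeMeaning)  # e.g. "LungRADS Category 3"
    layer2_uids = next_layer_uids(summary)  # fetched/shown only on user selection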

4. Standards and Systems

Consider a DICOM SR object used analogously to Key Object Selection.
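
For comparison, Key Object Selection already uses the single-level version of this pattern: a coded document title (from CID 7010) declares the document's purpose, and the content tree lists references to the selected instances. A minimal sketch reusing the helpers above (the SOP Class and title code are real; the referenced instance is illustrative):

# Key Object Selection: the existing flat analogue of a Root Result.
kos = Dataset()
kos.SOPClassUID = "1.2.840.10008.5.1.4.1.1.88.59"  # Key Object Selection Document
kos.SOPInstanceUID = generate_uid()
kos.Modality = "KO"
kos.ValueType = "CONTAINER"
kos.ContinuityOfContent = "SEPARATE"
kos.ConceptNameCodeSequence = [code("113000", "DCM", "Of Interest")]  # CID 7010
# A flat list of references; Root Results would add the layered structure.
kos.ContentSequence = [
    composite_ref("1.2.840.10008.5.1.4.1.1.2", generate_uid())  # a CT image
]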

One piece of proposed work will be to create example datasets (piloting the proposed improvement from the last retrospective).

A second piece of proposed work will be to assess how to make dataset organization tractable for relatively simple Image Displays.

5. Technical Approach

See Breakdown of Tasks details in the Prioritization Tab of the Evaluation Worksheet: https://docs.google.com/spreadsheets/d/18wETLZhLYcXq5pOMfAhH8-Q4M5cKztj6QNUEOAcOnng/edit?usp=sharing

This problem was identified during development of the AIR Profile and a “Root result” object was proposed.

  • Some public comments supported the need and value
  • Other comments challenged that the proposed mechanism had not been fully thought through and its potential complexities had not been mapped out

Some form of DICOM SR object seems like a valid approach. Initial exploration ruled out simply using SR document titles, but there was not time to devise an appropriate structure.

An organizing object will be evaluated. If it seems workable, creating and using said object could be added as a Named Option in the AIR Profile for the Evidence Creator and Image Display actors.

There are expected to be multiple Root Result objects. Mandating a single catalog/index object for all the results in the study would require continually revising it each time one of many algorithms stored new results to the study, which would be challenging, not to mention handling competing updates when multiple algorithms happen to complete at the same time.

6. Support & Resources

<List groups that have expressed support for the proposal and resources that would be available to accomplish the tasks listed above.>

<Identify anyone who has indicated an interest in implementing/prototyping the Profile if it is published this cycle.>

7. Risks

<List real-world practical or political risks that could impede successfully fielding the profile.>

<Technical risks should be noted above under Uncertainties.>

8. Tech Cmte Evaluation

Effort Evaluation (as a % of Tech Cmte Bandwidth):

  • xx% for MUE
  • yy% for MUE + optional

Editor:

Kevin O'Donnell

SME/Champion:

TBA <typically with a technical editor, the Subject Matter Expert will bring clinical expertise; in the (unusual) case of a clinical editor, the SME will bring technical expertise>