Collaboration Between University of Tennessee and UT Medical Center Could Revolutionize Breast Cancer Diagnosis
Researchers in the University of Tennessee, Knoxville's Tickle College of Engineering are using artificial intelligence and machine learning to provide treatment plans to breast cancer patients more quickly by analyzing pathology reports and other clinical records to determine how much breast cancer is in the body. Breast cancer is the second most common cancer in Tennessee and the second most common cancer among women in the United States.
After someone is diagnosed with breast cancer, doctors must determine the extent of the disease: whether it has spread and, if so, where. This process is called staging. The stage describes how much cancer is present and determines how serious the cancer is and how best to treat it.
"It often takes humans up to two hours to stage one case of breast cancer, and now we can do it in one click," said Xueping Li, a professor and Dan Doulet Faculty Fellow in the Department of Industrial and Systems Engineering, co-director of the Health Innovation Technology and Simulation Lab and director of the Ideation Laboratory.
Current process could delay diagnosis
All tests run on a breast cancer patient end up in a multiple-page report. To stage the cancer, an oncology data specialist must pull up the report on one screen, open a cancer registry software program on another and manually sift through huge amounts of text.
The process can take up to two hours for a single pathology report—and when the pipeline slows due to workforce imbalances, it can take weeks for patients to learn what they're facing.
"Imagine the anxiety experienced by the patient and the patient's family after the cancer diagnosis," Li said. "They want you to tell them what's going on so they can think about a treatment plan, but they often have to wait days to learn more."
And the process isn't just inefficient, he added. It's also error-prone, with one in 10 diagnoses containing a pathological staging error and one in five a clinical staging error.
"Humans make mistakes," Li said. "It happens if they are stressed or if they're tired. And they're looking at five pages of reports, so it's easy to make an error."
Solving the problem
The collaboration between the university and UT Medical Center began after surgical oncologist John Bell, former director of the UT Medical Center Cancer Institute and a professor of surgery in the UT Health Science Center's College of Medicine, Knoxville, shared the issue with Li and two colleagues: Bing Yao, the Dan Doulet Early Career Assistant Professor in ISE and director of the Reliability and Maintainability Engineering Program, and Tom Berg, an assistant professor in the College of Nursing who holds a joint appointment in the Tickle College of Engineering.
The UT researchers began with almost 300,000 pathology reports from across Tennessee, with personally identifying information like names and addresses removed. They needed to develop a process to accurately convert the scanned documents into a format that machines could interpret.
The first step, Li explained, was pre-processing the reports to remove any visual "noise" that made the text difficult for machines to read. A five-step procedure is required to convert a blurry scanned report into a cleaned binary image with enhanced quality suitable for an optical character recognition engine, which transforms image-based text into a format that machines can readily interpret and process.
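The article does not spell out the five preprocessing steps, but the end product it describes, a cleaned binary image ready for OCR, typically involves a thresholding step like the one sketched below. This is an illustrative stand-in, not the researchers' actual pipeline; it uses plain Python lists as a tiny grayscale image and Otsu's classic method to pick the threshold that separates ink from background.

```python
# Illustrative sketch of one clean-up step (binarization) from a
# scanned-report preprocessing pipeline. Hypothetical, not the
# researchers' actual five-step procedure.

def otsu_threshold(pixels):
    """Pick the gray level that maximizes between-class variance."""
    hist = [0] * 256
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    total_sum = sum(i * h for i, h in enumerate(hist))
    best_t, best_var = 0, -1.0
    w_bg, sum_bg = 0, 0
    for t in range(256):
        w_bg += hist[t]           # pixels at or below threshold t
        if w_bg == 0:
            continue
        w_fg = total - w_bg       # pixels above threshold t
        if w_fg == 0:
            break
        sum_bg += t * hist[t]
        mean_bg = sum_bg / w_bg
        mean_fg = (total_sum - sum_bg) / w_fg
        var = w_bg * w_fg * (mean_bg - mean_fg) ** 2
        if var > best_var:
            best_var, best_t = var, t
    return best_t

def binarize(image):
    """Threshold a grayscale image (list of rows) into 0 = ink, 1 = background."""
    flat = [p for row in image for p in row]
    t = otsu_threshold(flat)
    return [[1 if p > t else 0 for p in row] for row in image]
```

On a page fragment with dark text on a light background, for example `[[250, 30], [25, 247]]`, the threshold lands between the two clusters and the result is a clean two-tone image an OCR engine can read.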
The researchers used pretrained deep learning models (the CRAFT text-detection framework and the CRNN text-recognition framework) to make character recognition more robust and reliable for clinical applications. By locating characters and words on the page and then recognizing the text within those regions, the models helped the researchers handle documents that remained blurry or poorly formatted even after conversion to a cleaned binary image, Li said. Next, the team wrote a program to extract key information about cancer characteristics from the machine-readable text and combine it with the staging rules defined by the American Joint Committee on Cancer (AJCC).
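The extraction program itself is not shown in the article. A minimal sketch of what rule-based extraction from OCR'd pathology text can look like is below; the field names, patterns, and the crude negation check are all illustrative assumptions, not the researchers' code.

```python
import re

# Hypothetical sketch: pull staging-relevant fields out of OCR'd
# pathology text with regular expressions. Patterns and field names
# are illustrative, not the researchers' actual program.

def extract_staging_fields(text):
    fields = {}
    # Tumor size, e.g. "Tumor size: 2.3 cm"
    m = re.search(r"tumor\s+size[:\s]+([\d.]+)\s*cm", text, re.I)
    if m:
        fields["tumor_size_cm"] = float(m.group(1))
    # Node involvement, e.g. "1 of 12 lymph nodes positive"
    m = re.search(r"(\d+)\s+of\s+(\d+)\s+lymph\s+nodes?\s+positive", text, re.I)
    if m:
        fields["positive_nodes"] = int(m.group(1))
        fields["nodes_examined"] = int(m.group(2))
    # Naive metastasis check: mentioned and not explicitly negated.
    # Real clinical text needs far more careful negation handling.
    negated = re.search(r"no\s+(evidence\s+of\s+)?(distant\s+)?metastas", text, re.I)
    mentioned = re.search(r"metastas", text, re.I)
    fields["metastasis_mentioned"] = bool(mentioned) and not bool(negated)
    return fields
```

Running it on a sample sentence such as "Tumor size: 2.3 cm. 1 of 12 lymph nodes positive. No evidence of distant metastasis." yields a small dictionary of the three quantities the staging rules need.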
Finally, they developed a user interface that integrates all the information for end-to-end cancer staging directly from the scanned pathology reports. The program uses the same parameters set forth by the AJCC: the size and extent of the main or primary tumor, the number of lymph nodes involved with the cancer, and whether the cancer has metastasized. But it does the work much faster.
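Combining those three parameters into a stage group is, at its core, a rule lookup, which is what makes it automatable. The sketch below is a heavily simplified illustration of that idea; real AJCC breast cancer staging has many more categories, subcategories, and biomarker-based rules than this.

```python
# Heavily simplified, illustrative stage lookup from the three AJCC
# parameters: tumor category (t), node category (n), metastasis (m).
# NOT the actual AJCC rules, which are far more detailed.

def simplified_stage(t, n, m):
    """t: tumor category 1-4, n: node category 0-3, m: 0 or 1."""
    if m == 1:
        return "IV"   # any distant metastasis is Stage IV
    if t == 1 and n == 0:
        return "I"    # small tumor, no node involvement
    if n >= 2 or t == 4:
        return "III"  # extensive node involvement or locally advanced tumor
    return "II"
```

Because each report reduces to one such lookup once its parameters are extracted, staging an entire archive becomes a batch computation rather than hours of manual reading per case.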
"If you were to diagnose cancer stages manually from these 300,000 pages—if you did it by hand, and you didn't eat and didn't sleep—it would take about 14 years," Li said. "With these algorithms, we were able to do it in under one hour. "
The tool is not intended to replace humans, the researchers stressed, but to help ensure that the staging results manually reached by cancer registrars are correct.
"The technology we developed will do the staging automatically instead of manually. However, it also provides medical professionals the flexibility to review and validate the computer-generated results," Yao said. "We are providing a tool to support their work instead of replacing them."
More hope for cancer patients
The researchers have published one paper on their work and have a few more underway.
In the meantime, they are considering licensing and potentially commercializing the new technology. UT Medical Center may pilot a project with the researchers' prototype as well.
"The outcomes we are witnessing can only happen when you bring together the compassionate and visionary expertise of the College of Nursing with the problem-solving skills of the Tickle College of Engineering—two forces united by a shared mission and strengthened through partnership with UTMC," said Brad Day, associate vice chancellor for research innovation initiatives at UT. "These breakthroughs are not the product of siloed efforts but of bold interdisciplinary leadership committed to turning knowledge into impact and discovery into care."
Berg noted that while the program looks only at breast cancer reports, many kinds of cancers go through the same diagnosis methodology, "so the impact of this work for cancer patients could be considerable."
In a state with above-average rates of cancer incidence and cancer mortality, solutions like these could save lives.
"Can you imagine the day when a computer is able to assist the clinicians and the oncology data specialist in real time, accurately staging any cancer, helping to quickly suggest treatment options, and also giving the patient an idea about success of treatment and risk of recurrence?" asked Bell. "Now that would be a game changer!"
And the researchers don't want to stop there. Yao and her students are exploring how artificial intelligence and machine learning could help physicians predict whether a patient's cancer will return.
"We have a wide range of tools that can be applied to address many complex problems," said Yao. "Dr. Bell introduced this problem to us, and we know the solutions are clinically meaningful. That makes this work great for us."
Provided by University of Tennessee at Knoxville