
Editor's Note: At the highly anticipated International Lymphoma Congress, top experts from around the globe gathered to discuss the latest advancements and future trends in lymphoma treatment. Among them, an innovative study on deep learning to predict the outcomes of chimeric antigen receptor T-cell (CAR-T) therapy garnered widespread attention. This research, led and presented by Professor Stephen Schuster from the University of Pennsylvania, was titled "FDG-PET/CT Imaging Analysis Based on Deep Learning Can Predict 12-Month Efficacy Outcomes of CAR-T-Cell Therapy."
Challenges of CAR-T-Cell Therapy and the Urgency of Prognostic Prediction
Chimeric antigen receptor T-cell (CAR-T) therapy has become one of the standard treatment options for patients with relapsed/refractory large B-cell lymphoma (R/R LBCL). However, despite its remarkable efficacy, not all patients achieve long-term remission. For instance, the pivotal Phase II JULIET trial of the CD19-targeted CAR-T therapy tisagenlecleucel showed that approximately one-third of patients treated in the third-line or later setting achieved long-term disease-free survival. Research data indicate that achieving complete remission (CR) after treatment is crucial for durable remission and long-term survival, but CR can occur late and its timing is difficult to predict. For patients whose best overall response is CR at any time point, the long-term survival rate is approximately 84%, with the curve plateauing around 12 months. Professor Stephen Schuster and his team therefore consider complete remission at 12 months a reasonable surrogate for long-term disease-free survival. However, the clinical ability to predict, before treatment, which patients will achieve durable remission remains limited.
Professor Schuster quoted the famous line from Hamlet, “There are more things in heaven and earth, Horatio, than are dreamt of in your philosophy,” to convey how many mysteries in medicine remain unresolved, and he regards radiomics as a promising way to explore them. He emphasized Dr. Gillies’s view that “Images are not just pictures, they are data,” which provides a theoretical basis for exploiting the rich hidden information in medical images.
Radiomics: From “Pictures” to “Data”
Based on this concept, eight years ago, Professor Schuster’s team hypothesized that the detailed quantitative data contained in radiology images extend far beyond the information used to reconstruct recognizable anatomical and functional images of the human body. These additional data may carry prognostic information that aids clinical decision-making; in other words, there may be clinically significant features imperceptible to the naked eye. The study aimed to extract such data, which do not depend on how the image is visually represented, from fluorodeoxyglucose positron emission tomography/computed tomography (FDG PET/CT) or diagnostic CT scans and to use machine learning methods to correlate them with patient treatment outcomes. The research team chose CAR-T therapy as their starting point, not because it is unique, but because they had accumulated a large volume of rigorously screened CAR-T patient imaging data. Their goal was to develop a computer-aided decision support system by retraining a publicly available neural network, AlexNet, using transfer learning and incremental machine learning techniques. Ultimately, they hoped to analyze pre-treatment scans blindly, predicting patient outcomes from the model alone without any clinical information, and then to validate the approach prospectively.
Construction and Preliminary Validation of the Deep Learning Model
Professor Schuster detailed the model development process, which was based on retraining and incremental learning of the AlexNet neural network. The research team used historical data from CAR-T patients treated at the University of Pennsylvania, including diagnostic CT, low-dose CT, and FDG PET-only images, to build a model that predicts the response of individual lesions. This process was extremely time-consuming, involving over 3,000 experiments. These experiments did not correspond to 3,000 different patients; rather, the researchers analyzed 770 lymph node lesions from 39 patients in CAR-T-cell studies to develop their model.
Various input methods were attempted in the experiments, including all axial slices of a lesion, the central slice of a lesion, and the central three slices, and the network was continuously retrained. The model’s objective was to output a binary classification result: whether complete remission was achieved at 12 months after CAR-T treatment. In predicting the response of individual lesions to CAR-T treatment, the model reached an accuracy of approximately 85% to 90%, regardless of whether diagnostic CT, low-dose CT, or PET scans were used. The area under the receiver operating characteristic (ROC) curve (AUC) was as high as 0.90, indicating strong predictive capability at the lesion level.
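To make the three input strategies concrete, here is a minimal Python sketch of how slices might be picked from a lesion's axial extent. It is an illustration only: the function name `central_slices`, the representation of a lesion as a list of slice indices, and the choice of the lower-middle slice when the count is even are all assumptions, since the report does not describe how the team implemented slice selection.

```python
from typing import List, Sequence


def central_slices(lesion_slices: Sequence[int], k: int = 1) -> List[int]:
    """Return the k slices centred on the middle of a lesion's axial extent.

    lesion_slices: axial slice indices covering the lesion, in order
                   (hypothetical representation for this example).
    k:             number of slices to keep (1 = central slice,
                   3 = central three slices).
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n = len(lesion_slices)
    if k >= n:
        # Asking for at least as many slices as exist returns the whole lesion.
        return list(lesion_slices)
    # Integer division picks the lower-middle start when (n - k) is odd.
    start = (n - k) // 2
    return list(lesion_slices[start:start + k])


# Hypothetical lesion spanning seven axial slices.
lesion = [40, 41, 42, 43, 44, 45, 46]
all_slices = central_slices(lesion, k=len(lesion))  # complete set of slices
center_only = central_slices(lesion, k=1)           # central slice
center_three = central_slices(lesion, k=3)          # central three slices
```

Each resulting slice set would then be passed to the network as one input variant, so that the variants can be compared on the same lesions.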
From Lesion-Level to Patient Outcome Prediction: Application of Rule-Based Reasoning
Given that clinical practice requires evaluating the entire patient rather than individual lesions, the research team introduced “rule-based reasoning,” specifically the “majority rule” principle, to extend lesion-level predictions to patient-level outcome predictions. They experimented with different rules, such as “all lymph nodes respond,” “more than 60% of lesions respond,” or “at least 70% of lesions respond.” With this method, patient-level accuracy was lower than lesion-level accuracy but still reached a respectable level. For example, with diagnostic CT scans and the 70% rule, accuracy reached 81%. This preliminary result greatly encouraged Professor Schuster and supported the method’s potential for clinical application.
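The aggregation step described above can be sketched in a few lines of Python. This is a hedged illustration, not the team's implementation: the function name `patient_predicted_cr`, its parameters, and the example lesion predictions are hypothetical. Only the three rules themselves come from the talk.

```python
from typing import Sequence


def patient_predicted_cr(
    lesion_predictions: Sequence[bool],
    threshold_percent: int,
    strict: bool = False,
) -> bool:
    """Aggregate per-lesion predictions into one patient-level prediction.

    lesion_predictions: True where the lesion-level model predicts response.
    threshold_percent:  share of lesions that must be predicted to respond.
    strict:             True for "more than X%", False for "at least X%".
    """
    total = len(lesion_predictions)
    if total == 0:
        raise ValueError("a patient needs at least one evaluable lesion")
    responding = sum(lesion_predictions)
    # Integer arithmetic avoids floating-point surprises at the threshold.
    if strict:
        return responding * 100 > threshold_percent * total
    return responding * 100 >= threshold_percent * total


# Hypothetical patient: 3 of 5 lesions predicted to respond (60%).
lesions = [True, True, True, False, False]
rule_all = patient_predicted_cr(lesions, 100)             # "all lymph nodes respond"
rule_over_60 = patient_predicted_cr(lesions, 60, strict=True)  # "more than 60%"
rule_at_least_70 = patient_predicted_cr(lesions, 70)      # "at least 70%"
```

In this example, all three rules classify the patient as a predicted non-responder, because exactly 60% of lesions are predicted to respond. Loosening the rule to "at least 60%" would flip the result, which shows why the choice of threshold affects patient-level accuracy.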
Large-Scale External Validation: Preliminary Demonstration of Model Generalization
To further validate the model’s generalization ability and clinical utility, the research team obtained 102 pre-treatment datasets from the Phase II JULIET study of tisagenlecleucel, provided by the study sponsor. All patient information from the University of Pennsylvania was removed from these data, and the researchers were completely blinded to patient outcomes. After screening, some cases were excluded because of image quality or a lack of discrete lymph node lesions, leaving 62 patients for algorithm evaluation. Notably, these imaging datasets came from 15 different scanner models, 3 manufacturers, 27 hospitals, and 10 countries, making this a demanding test of the model’s exportability and generalization potential.
In this external validation, Professor Schuster’s team focused primarily on the combined analysis of low-dose CT and PET images (three slices per lesion for each). The model predicted CR at 12 months after CAR-T treatment with a sensitivity of 77% and a specificity of 49%. Because non-responders outnumbered responders, the team reported balanced accuracy, which was 63%. The positive predictive value (PPV) was 37%, and the negative predictive value (NPV) was 85%. Professor Schuster acknowledged that this is still a “work in progress,” but an NPV of 85% means that when the model predicts a patient will not be in CR at 12 months, that prediction is correct about 85% of the time, which has significant implications for clinical decision-making.
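The relationship between these figures can be checked with a short Python sketch. Balanced accuracy is simply the mean of sensitivity and specificity, while PPV and NPV also depend on the proportion of responders in the cohort. That proportion is not given in this report, so the value 0.28 used below is an illustrative assumption, chosen because it is roughly consistent with the reported PPV and NPV. The function names are likewise hypothetical.

```python
from typing import Tuple


def balanced_accuracy(sensitivity: float, specificity: float) -> float:
    """Mean of sensitivity and specificity; robust to class imbalance."""
    return (sensitivity + specificity) / 2


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (PPV, NPV) for a given responder prevalence (Bayes' rule)."""
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Sensitivity and specificity as reported for the external validation.
bal_acc = balanced_accuracy(0.77, 0.49)

# ASSUMPTION: a responder prevalence of 0.28 is not stated in the source;
# it is used here only to show how PPV and NPV follow from prevalence.
ppv, npv = predictive_values(0.77, 0.49, prevalence=0.28)

print(f"balanced accuracy: {bal_acc:.2f}")
print(f"PPV: {ppv:.2f}, NPV: {npv:.2f}")
```

The sketch also shows why a model with modest specificity can still have a high NPV: when responders are the minority and sensitivity is reasonably high, few of the patients predicted not to respond are actually responders.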
Comparison with Traditional Biomarkers and Future Optimization Directions
The research team also compared the deep learning model’s predictions with serum lactate dehydrogenase (LDH) levels, one of the most reliable prognostic factors in CAR-T therapy. Multivariate analysis from the JULIET trial showed that LDH was one of only two statistically independent factors. Kaplan-Meier curves demonstrated significantly worse progression-free survival (PFS) or overall survival (OS) for patients with elevated LDH, with rates falling below 10%. While LDH showed 100% specificity in predicting non-responders, as it had no false positives, its sensitivity was poor, and its positive predictive value was low, meaning many patients with normal LDH also failed to achieve remission. In contrast, the deep learning model showed higher sensitivity (77%), implying fewer false negatives, so it could identify more potential responders. Its specificity (49%) was lower, leading to more false positives, but its potential for early screening of non-responders remains significant. Professor Schuster noted that this work is ongoing and that the team will continue to analyze diagnostic CT images and complete the evaluation of all cases. He believes that combining clinical information such as LDH, histological type (e.g., transformed or non-transformed), and extranodal lesions with radiomic features extracted by deep learning will significantly improve the model’s predictive accuracy.
Expert Opinion: Value and Outlook of AI in CAR-T Prognostic Prediction
Professor Stephen Schuster concluded that AI-based prediction of CAR-T-cell therapy outcomes from pre-treatment imaging is feasible and cost-effective, requiring only the patient’s imaging disc. The current model’s average prediction accuracy for individual lymph node lesions is very high, reaching 85-90%, and can be extended to patient outcome prediction through rule-based reasoning. He emphasized the disruptive potential of this technology: computers “see” things that humans often cannot understand, because neural networks may analyze data in 16 dimensions, far beyond the human capacity for three- or even four-dimensional thinking. During the Q&A session, an expert asked whether the model could predict which lymph nodes would respond and which would not, thereby guiding radiation oncologists to focus local treatment on non-responding lesions. Professor Schuster considered this a “brilliant idea” and pointed out that future algorithms might be lymphoma-specific and treatment-specific. He firmly believes that applying artificial intelligence and machine learning to radiomics will revolutionize our ability to predict treatment outcomes. He further envisioned that if biomarkers like LDH were incorporated into the model, the balanced accuracy could reach 80-90%, which would be a “remarkable” achievement.
Conclusion and Future Outlook
The research by Professor Stephen Schuster’s team opens new avenues for precise prognostic prediction in CAR-T-cell therapy. By analyzing pre-treatment FDG PET/CT images with deep learning, clinicians may be able not only to identify, before treatment, the patients most likely to benefit from CAR-T therapy but also to recognize those who may require other interventions or closer monitoring. This innovative “non-invasive diagnostic” strategy could help move CAR-T-cell therapy toward a more precise and individualized era. With the integration of more clinical data and continuous model optimization, we have reason to expect more breakthrough results from AI in oncology imaging analysis and clinical decision support within the next two years.
Contribution/Interview Source: Oncology Outlook – Oncology News