Introduction
Artificial intelligence (AI) is transforming healthcare by advancing early disease detection, personalized treatment, and operational efficiency [1-6]. To fully leverage the potential of AI, it is crucial to differentiate between two primary categories of healthcare applications: prediction and causal inference, each requiring different methodologies.
Currently, AI is primarily used for prediction tasks [7]. Predictive models forecast future outcomes by identifying patterns and correlations in historical data. These models are valuable for predicting patient readmission and chronic disease progression [8,9]. However, they have limitations, particularly in healthcare, where understanding the root cause of a condition is vital for effective treatment. As the role of AI in healthcare grows, the importance of causal inference is increasingly being recognized [10]. Unlike prediction, which focuses on what may happen, causal inference seeks to determine why something happens by identifying cause-and-effect relationships. This understanding is key for developing targeted interventions [11-13]. Although prediction and causal inference are complementary, they are not interchangeable; each offers unique advantages in specific contexts (Fig. 1). Confusing predictions with causal inferences can lead to significant errors and thus compromise patient care [14].
By appropriately applying both prediction and causal inference, we can enhance patient outcomes, improve decision-making, and advance healthcare systems. This review explores how AI, particularly predictive modeling and causal inference, can transform nephrology by advancing personalized healthcare. It highlights the strengths and limitations of these approaches, aiming to improve clinical decision-making and outcomes for pediatric kidney patients.
Prediction
Prediction methodology is not designed for inferring causal relationships [15]. Instead, it focuses solely on accurately forecasting outcomes, without necessarily understanding the underlying causes that drive them. In predictive modeling, the primary objective is to develop models that deliver accurate predictions by identifying patterns and correlations in data, often without regard for whether these correlations represent true causal relationships.
In the context of healthcare, predictive AI algorithms are widely employed to anticipate various outcomes, such as the likelihood of a patient developing a particular disease or the potential for a specific adverse event occurring during treatment. Methods including random forests [16], boosting algorithms [17], support vector machines [18], and deep learning models [19] are commonly employed for these purposes. These algorithms are particularly valuable in scenarios involving large datasets, as they allow for the identification of complex patterns that might not be immediately evident when using traditional statistical methods. For instance, a random forest algorithm can analyze vast amounts of patient data to predict which individuals are at an elevated risk for certain diseases based on patterns identified in their medical histories, demographic data, and other relevant information [20]. Similarly, deep learning models, which are capable of learning intricate representations of data, are increasingly utilized for image recognition tasks, such as identifying abnormalities in medical imaging that might indicate the presence of tumors or other conditions [21].
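As a minimal sketch of this predictive workflow, the example below trains a random forest classifier on synthetic patient data to estimate disease risk. The feature names, data-generating process, and outcome are entirely hypothetical stand-ins for real medical histories and demographics.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 1000

# Synthetic stand-in for patient records: age, BMI, systolic BP, HbA1c
X = np.column_stack([
    rng.normal(55, 12, n),    # age (years)
    rng.normal(27, 4, n),     # BMI (kg/m^2)
    rng.normal(130, 15, n),   # systolic blood pressure (mmHg)
    rng.normal(5.8, 0.8, n),  # HbA1c (%)
])
# Outcome correlated with the features, plus noise (purely illustrative)
logit = 0.03 * (X[:, 0] - 55) + 0.1 * (X[:, 3] - 5.8) + rng.normal(0, 1, n)
y = (logit > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Fit the forest and evaluate discrimination on held-out data
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)
risk = model.predict_proba(X_test)[:, 1]
print(f"Held-out AUROC: {roc_auc_score(y_test, risk):.2f}")
```

Note that the model outputs a risk score for each patient without offering any account of why a given patient is high risk, which is precisely the limitation discussed next.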
However, while these predictive models may be highly accurate in forecasting outcomes, they do not inherently provide insights into the reasons underlying a particular outcome [22]. These models excel at identifying "what" might happen rather than "why" it happens. For instance, a model might predict that patients with certain characteristics are at high risk for developing diabetes, but it may not clarify which specific factors are causally contributing to the development of the disease. This is because predictive models are fundamentally correlation-based; they identify associations between variables, and such associations do not necessarily imply causation. Furthermore, while predictive models are useful for identifying high-risk groups for certain conditions, they do not facilitate an understanding of the causal pathways that lead to these risks. This limitation is particularly significant in the medical field, where understanding causality is typically crucial for effective intervention. Without knowing the underlying causes, interventions based solely on predictions may not effectively address the root problems [23]. For example, predicting that a patient is at high risk of heart disease based on lifestyle factors is valuable, but without understanding the causal impact of each lifestyle factor, developing targeted strategies for prevention or treatment would be challenging.
Notably, many predictive methods, such as those mentioned above, can be adapted for causal inference purposes under certain conditions. For instance, machine learning models can be employed to estimate causal effects if they are combined with appropriate statistical techniques and study designs, such as instrumental variable analysis or propensity score matching [24,25]. However, such applications require a different set of methodologies and considerations that are beyond the scope of this review. Taken together, while prediction models are indispensable tools for identifying potential risks and outcomes in the medical field, they should not be conflated with causal inference models. The former focus on correlation and pattern recognition to forecast outcomes, whereas the latter seek to understand the cause-and-effect relationships that drive those outcomes. Distinguishing between these two approaches is essential to avoid misinterpretation and ensure that the most appropriate methodologies are applied to address specific clinical questions.
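To make the instrumental variable idea concrete without going beyond a sketch, the example below performs a manual two-stage least squares on simulated data in which an unmeasured confounder biases the naive estimate. The instrument, treatment, and effect size are all synthetic, and only the point estimate is illustrated, not a full inferential workflow.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(1)
n = 5000

# Simulated setting: an unobserved confounder u affects both treatment and
# outcome; the instrument z affects treatment but not the outcome directly.
u = rng.normal(size=n)                    # unmeasured confounder
z = rng.binomial(1, 0.5, size=n)          # instrument (e.g., randomized encouragement)
treatment = (0.8 * z + u + rng.normal(size=n) > 0.5).astype(float)
outcome = 2.0 * treatment + 1.5 * u + rng.normal(size=n)  # true causal effect = 2.0

# Naive regression is biased by the confounder
naive = LinearRegression().fit(treatment.reshape(-1, 1), outcome)
print(f"Naive estimate: {naive.coef_[0]:.2f}")

# Stage 1: predict treatment from the instrument alone
stage1 = LinearRegression().fit(z.reshape(-1, 1), treatment)
t_hat = stage1.predict(z.reshape(-1, 1))

# Stage 2: regress the outcome on the predicted treatment
stage2 = LinearRegression().fit(t_hat.reshape(-1, 1), outcome)
print(f"2SLS estimate: {stage2.coef_[0]:.2f}")  # close to the true effect of 2.0
```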
Causal inference
Causal inference is a methodology aimed at identifying cause-and-effect relationships, which is crucial for understanding the impact of specific interventions in various domains, including healthcare. In medicine, randomized controlled trials (RCTs) are considered the gold standard for causal inference [26]. The fundamental principle underlying RCTs is the random allocation of participants to different intervention arms, thereby creating comparable groups that can be directly contrasted to estimate the effect of the intervention. This randomization ensures that, on average, the two groups are statistically equivalent with respect to both observed and unobserved confounders, thereby allowing for an accurate estimation of causal effects.
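The logic of randomization can be illustrated with a short simulation: because assignment is random, a simple difference in group means is an unbiased estimate of the average treatment effect, even in the presence of factors that influence the outcome. All values below are synthetic.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 2000

# Baseline covariate (e.g., disease severity) influences the outcome
severity = rng.normal(size=n)

# Random assignment makes the arms comparable on observed AND unobserved factors
treated = rng.binomial(1, 0.5, size=n).astype(bool)

# Simulated outcome with a true average treatment effect of -1.0
outcome = 2.0 * severity - 1.0 * treated + rng.normal(size=n)

# Difference in means estimates the average treatment effect
ate = outcome[treated].mean() - outcome[~treated].mean()
print(f"Estimated ATE: {ate:.2f} (true value: -1.00)")
```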
Although the RCT design is robust and is considered the most reliable method for causal inference, it is not without limitations. One significant constraint is that RCTs can only estimate the average treatment effect across the study population [27]. This approach does not account for the variability in treatment responses among individuals within the same group; that is, it cannot estimate the treatment effect for each individual (the individual treatment effect). In clinical practice, where personalized medicine is becoming increasingly important, understanding how individual patients might respond differently to the same treatment is critical. To address this, interest in estimating heterogeneous treatment effects, which aim to capture the variations in treatment response among subgroups or even at the individual level, is growing [28].
To overcome the limitations of RCTs in estimating heterogeneous treatment effects, various methodologies have been proposed, many of which leverage advanced AI techniques. For example, machine learning models, such as decision trees, random forests, and neural network algorithms, have been adapted to estimate treatment effects across different subgroups defined by covariates.
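One simple, widely used strategy of this kind is the "T-learner": separate outcome models are fit in the treated and control arms, and the difference between their predictions serves as an estimate of the conditional (individual-level) treatment effect. The sketch below implements this with random forests on synthetic data; the covariates and effect structure are hypothetical.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(7)
n = 4000

# Synthetic covariates (age, biomarker level); treatment is randomized
X = np.column_stack([rng.normal(60, 10, n), rng.normal(1.0, 0.3, n)])
treated = rng.binomial(1, 0.5, n).astype(bool)

# True effect is heterogeneous: benefit grows with the biomarker level
true_effect = -2.0 * X[:, 1]
outcome = 0.05 * X[:, 0] + true_effect * treated + rng.normal(0, 0.5, n)

# T-learner: one outcome model per arm; effect = difference of predictions
model_t = RandomForestRegressor(n_estimators=200, random_state=0).fit(
    X[treated], outcome[treated])
model_c = RandomForestRegressor(n_estimators=200, random_state=0).fit(
    X[~treated], outcome[~treated])
cate = model_t.predict(X) - model_c.predict(X)

print(f"Correlation with true effect: {np.corrcoef(cate, true_effect)[0, 1]:.2f}")
```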
Local explanation methods
Recent developments in interpretable AI methods have provided tools for understanding and explaining the predictions of these complex models. Several local explanation methods, including Shapley values and Local Interpretable Model-Agnostic Explanations (LIME), have been developed to provide granular insights into how AI models arrive at their predictions [29,30]. Table 1 summarizes Shapley values and LIME.
Shapley values
Shapley values, which originated in cooperative game theory, provide a solution for fairly distributing the "payout" (in this case, the model's prediction) among different features based on their contributions to the prediction [29]. When applied to causal inference, Shapley values can help identify which variables (or features) are most influential in determining the predicted treatment effect for a particular individual. The strength of this method lies in its theoretically sound approach to feature attribution, which ensures that the contributions of all possible feature combinations are considered. However, the use of Shapley values is limited by their computational complexity, which can become prohibitively expensive for models with a large number of features or for extensive datasets. Moreover, the results should be interpreted with caution, as they are specific to a particular dataset and do not imply universal interpretability; careful consideration is therefore required when generalizing these findings. For example, Oh et al. [20] used Shapley values to identify key demographic factors in predicting coronary calcium scores and selected the most influential factors based on their contributions.
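As a sketch of how Shapley values are computed in practice, the example below uses the open-source shap package (assumed to be installed) to attribute a tree model's predictions to hypothetical clinical features; the feature names and data are synthetic.

```python
import numpy as np
import shap  # assumes the shap package is installed
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n = 500

# Hypothetical clinical features: age, eGFR, systolic BP
X = np.column_stack([rng.normal(50, 15, n),
                     rng.normal(75, 20, n),
                     rng.normal(125, 12, n)])
y = 0.02 * X[:, 0] - 0.05 * X[:, 1] + rng.normal(0, 0.5, n)

model = RandomForestRegressor(n_estimators=100, random_state=0).fit(X, y)

# TreeExplainer computes Shapley values efficiently for tree ensembles
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X)  # shape: (n_samples, n_features)

# Mean absolute Shapley value per feature gives a global importance ranking
for name, v in zip(["age", "eGFR", "systolic_BP"], np.abs(shap_values).mean(axis=0)):
    print(f"{name}: {v:.3f}")
```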
Local Interpretable Model-Agnostic Explanations
LIME is another popular technique for interpreting complex models. It approximates a complex model with a simple, interpretable model locally around the prediction of interest [30]. For instance, LIME can fit a linear model around a specific prediction to approximate the behavior of a more complex model in the local vicinity. This approach is particularly useful for understanding the local decision boundaries of black-box models. For example, Li et al. [31] developed and validated a machine learning model to predict mortality in critically ill patients with sepsis-associated acute kidney injury; the XGBoost algorithm performed best, and the authors used LIME to interpret individualized predictions and enhance the model's transparency. However, LIME relies on the assumption that a linear model can adequately approximate the complex model locally, which may not always hold, particularly when features interact in highly non-linear ways.
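A minimal LIME sketch (assuming the lime package is installed) is shown below: a local linear surrogate explains a single prediction of a black-box classifier trained on synthetic tabular data with hypothetical feature names.

```python
import numpy as np
from lime.lime_tabular import LimeTabularExplainer  # assumes lime is installed
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(5)
n = 1000

# Synthetic tabular data with hypothetical clinical feature names
feature_names = ["age", "creatinine", "lactate"]
X = rng.normal(size=(n, 3))
y = (X[:, 1] + 0.5 * X[:, 2] + rng.normal(0, 0.5, n) > 0).astype(int)

model = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)

# LIME fits a simple linear surrogate around one instance of interest
explainer = LimeTabularExplainer(X, feature_names=feature_names,
                                 class_names=["low risk", "high risk"],
                                 mode="classification")
explanation = explainer.explain_instance(X[0], model.predict_proba, num_features=3)
for feature, weight in explanation.as_list():
    print(f"{feature}: {weight:+.3f}")
```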
Local explanation methods have other constraints. They often provide insights that are specific to a particular instance or individual prediction, which may not be generalizable across a broad population [32]. Furthermore, while they can suggest factors that are most influential in a model's prediction, they do not necessarily provide causal explanations or indicate which interventions might lead to desired outcomes [33].
Emulated RCTs in observational studies
Given the limitations and practical constraints of traditional RCTs, particularly in settings where randomization is not feasible or ethical, interest in emulated RCTs, also known as "quasi-experimental designs" or "observational causal inference," has increased [34,35]. These methods aim to replicate the conditions of an RCT using observational data by creating comparable groups that mimic the treatment and control arms in a randomized study.
A key strategy in emulated RCTs is to employ advanced statistical techniques, such as propensity score matching, inverse probability weighting, or regression discontinuity designs, to balance the treatment and control groups with respect to observed covariates. This approach attempts to account for confounding variables and to simulate the random assignment used in an RCT, thereby enabling more reliable estimation of causal effects from non-randomized data. The potential of emulated RCTs in the medical field is considerable. They offer a practical solution for evaluating the effectiveness of interventions in real-world settings, where conducting an RCT may be logistically challenging or ethically problematic. For example, using electronic health records, researchers can construct a retrospective cohort study that closely resembles an RCT, making it possible to assess treatment effects across different patient populations and clinical settings [36,37]. Moreover, with the integration of AI and machine learning, emulated RCTs can be further refined to improve their accuracy and validity. Machine learning algorithms can be employed to identify complex, non-linear relationships between variables and to enhance the precision of propensity score models, thereby reducing residual confounding. Additionally, AI techniques can facilitate the identification of subpopulations for which certain treatments may be particularly effective, thus supporting personalized approaches to healthcare.
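To illustrate one of these techniques, the sketch below applies inverse probability weighting to synthetic observational data: a logistic regression estimates each subject's propensity to receive treatment, and weighting by the inverse of that probability re-balances the groups before the effect is estimated. The confounding structure and effect size are invented for the example.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(11)
n = 5000

# Observed confounder drives both treatment uptake and the outcome
severity = rng.normal(size=n)
p_treat = 1 / (1 + np.exp(-severity))       # sicker patients get treated more often
treated = rng.binomial(1, p_treat).astype(bool)
outcome = 1.5 * severity - 1.0 * treated + rng.normal(size=n)  # true effect = -1.0

# Naive comparison is confounded by severity
naive = outcome[treated].mean() - outcome[~treated].mean()

# Step 1: estimate propensity scores from the observed confounder
ps = LogisticRegression().fit(severity.reshape(-1, 1), treated).predict_proba(
    severity.reshape(-1, 1))[:, 1]

# Step 2: inverse probability weights re-create a pseudo-randomized sample
w = np.where(treated, 1 / ps, 1 / (1 - ps))
ipw = (np.average(outcome[treated], weights=w[treated])
       - np.average(outcome[~treated], weights=w[~treated]))

print(f"Naive estimate: {naive:+.2f}, IPW estimate: {ipw:+.2f} (true: -1.00)")
```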
Taken together, while traditional RCTs remain the gold standard for causal inference, the emergence of emulated RCTs and advanced AI methodologies provides valuable alternatives for estimating treatment effects in situations where RCTs are not feasible. By leveraging these innovative approaches, deeper insights into causal relationships in healthcare can be achieved, ultimately improving patient outcomes and allowing effective clinical decision-making.
Conclusion
AI has rapidly emerged as a transformative force in healthcare, generating innovative approaches to disease prediction, diagnosis, and patient management. As highlighted in this review, AI applications in the medical field primarily belong to one of two categories: prediction or causal inference. Although predictive models excel at forecasting outcomes from historical data, they rely on identifying patterns and correlations without understanding the underlying causative factors, which limits their utility. Causal inference, on the other hand, seeks to address this gap by focusing on the cause-and-effect relationships that drive these outcomes. This inference provides a framework for understanding the mechanisms underlying observed patterns, enabling the development of targeted interventions that can directly influence patient health.
Emulated RCTs leverage observational data to mimic the conditions of randomized trials, providing a promising avenue for causal inference in scenarios where traditional RCTs are impractical. Combined with sophisticated AI and machine learning algorithms, these approaches can identify complex, non-linear relationships between variables, enhance the precision of causal estimates, and reduce residual confounding.
The convergence of prediction and causal inference in AI holds great promise for the future of healthcare. By integrating these approaches, we can better understand why outcomes happen and how to intervene effectively, leading to more personalized patient care. However, if prediction and causal inference are not properly distinguished, significant issues can arise. The future of medicine lies in harnessing AI’s potential not only for prediction and diagnosis but also for understanding and transforming patient care for the better.