Difference between rungs two and three in the Ladder of Causation























In Judea Pearl's "Book of Why" he talks about what he calls the Ladder of Causation, which is essentially a hierarchy composed of different levels of causal reasoning. The lowest is concerned with patterns of association in observed data (e.g., correlation, conditional probability), the next focuses on intervention (what happens if we deliberately change the data-generating process in some prespecified way?), and the third is counterfactual (what would have happened in another possible world if something had or had not happened?).



What I'm not understanding is how rungs two and three differ. If we ask a counterfactual question, are we not simply asking a question about intervening so as to negate some aspect of the observed world?


































  • Is this really on topic? Asking out of curiosity
    – Firebug
    Dec 1 at 19:53






  • @Firebug is causality on topic? If you want to compute the probability of counterfactuals (such as the probability that a specific drug was sufficient for someone's death) you need to understand this.
    – Carlos Cinelli
    Dec 1 at 19:55








  • twitter.com/yudapearl/status/1069533953223155713 !
    – Tim
    Dec 3 at 10:15



































causality














edited Dec 1 at 19:14

























asked Dec 1 at 18:25









dsaxton























1 Answer



accepted










There is no contradiction between the factual world and the action of interest at the interventional level. For example, smoking until today and being forced to quit smoking starting tomorrow are not in contradiction with each other, even though you could say one “negates” the other. But now imagine the following scenario. You know Joe, a lifetime smoker who has lung cancer, and you wonder: what if Joe had not smoked for thirty years, would he be healthy today? In this case we are dealing with the same person, at the same time, imagining a scenario where action and outcome are in direct contradiction with known facts.



Thus, the main difference between interventions and counterfactuals is that, whereas with interventions you are asking what will happen on average if you perform an action, with counterfactuals you are asking what would have happened had you taken a different course of action in a specific situation, given that you have information about what actually happened. Note that, since you already know what happened in the actual world, you need to update your information about the past in light of the evidence you have observed.



These two types of queries are mathematically distinct because they require different levels of information to be answered (counterfactuals need more) and an even more elaborate language to be articulated.



With the information needed to answer Rung 3 questions you can answer Rung 2 questions, but not the other way around. More precisely, you cannot answer counterfactual questions with just interventional information. Examples where interventions and counterfactuals come apart have already been given here on CV (see this post and this post). However, for the sake of completeness, I will include an example here as well.



The example below can be found in Causality, section 1.4.4.



Consider that you have performed a randomized experiment where patients were randomly assigned (50% / 50%) to treatment ($x=1$) and control conditions ($x=0$), and in both treatment and control groups 50% recovered ($y=0$) and 50% died ($y=1$). That is, $P(y|x) = 0.5~~~\forall x,y$.



The result of the experiment tells you that the average causal effect of the intervention is zero. This is a rung 2 quantity: $P(Y = 1|do(X = 1)) - P(Y = 1|do(X = 0)) = 0$.
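As a small illustration of that arithmetic, here is a sketch in Python. The counts are hypothetical (the sample size of 1000 is made up); only the 50/50 proportions come from the example. Because treatment was randomized, the death rate in each arm is read as an estimate of $P(Y = 1|do(X = x))$.

```python
from fractions import Fraction

# Hypothetical trial counts matching the 50/50 split described above
# (y = 1 means died, y = 0 means recovered); the sample size is made up.
counts = {
    (1, 1): 250,  # treated, died
    (1, 0): 250,  # treated, recovered
    (0, 1): 250,  # control, died
    (0, 0): 250,  # control, recovered
}

def p_death_given_arm(x):
    """Estimate P(Y = 1 | do(X = x)) as the death rate in arm x.

    Randomization is what licenses reading the arm-wise rate as an
    interventional quantity.
    """
    died = counts[(x, 1)]
    total = counts[(x, 1)] + counts[(x, 0)]
    return Fraction(died, total)

# Average causal effect on the probability of death.
ate = p_death_given_arm(1) - p_death_given_arm(0)
print(p_death_given_arm(1), p_death_given_arm(0), ate)
```

Both arms have the same death rate, so the difference is zero regardless of the sample size chosen.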



But now let us ask the following question: what percentage of those patients who died under treatment would have recovered had they not taken the treatment? Mathematically, you want to compute $P(Y_{0} = 0|X =1, Y = 1)$.



This question cannot be answered just with the interventional data you have. The proof is simple: I can create two different causal models that will have the same interventional distributions, yet different counterfactual distributions. The two are provided below:



[Image: the two causal models, Model 1 and Model 2]



Here, $U$ stands for unobserved factors that explain how the patient reacts to the treatment. You can think of factors that explain treatment heterogeneity, for instance. Note that the joint distribution $P(y, x)$ is the same in both models.



Note that, in the first model, no one is affected by the treatment, so the percentage of patients who died under treatment and would have recovered had they not taken it is zero.



However, in the second model, every patient is affected by the treatment, and we have a mixture of two populations in which the average causal effect turns out to be zero. Here the counterfactual quantity goes to 100%: in Model 2, all patients who died under treatment would have recovered had they not taken the treatment.
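To make the "same rung 2, different rung 3" point concrete, here is a minimal Python sketch. The structural equations are an assumption, since the figure with the two models is not reproduced here: I take $U$ and $X$ to be independent fair coins, with $Y = U$ in Model 1 and $Y = XU + (1-X)(1-U)$ in Model 2. These choices match the verbal description above (no one affected versus everyone affected) but may not be exactly the equations in the figure. The function names are illustrative.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# Assumed structural equations (y = 1 means died, y = 0 means recovered).
def model_1(x, u):
    return u  # the outcome ignores the treatment

def model_2(x, u):
    return x * u + (1 - x) * (1 - u)  # the treatment flips the outcome

def joint(f):
    """Observational P(x, y) when X and U are independent fair coins."""
    dist = {}
    for x, u in product((0, 1), repeat=2):
        key = (x, f(x, u))
        dist[key] = dist.get(key, 0) + HALF * HALF
    return dist

def p_y_do_x(f, y, x):
    """P(Y = y | do(X = x)): fix x by hand and average over U."""
    return sum(HALF for u in (0, 1) if f(x, u) == y)

for f in (model_1, model_2):
    print(f.__name__, joint(f), [p_y_do_x(f, 1, x) for x in (0, 1)])
```

Under these assumptions both models give probability 1/4 to every $(x, y)$ cell and a death probability of 1/2 under either intervention, so no observational or experimental data can tell them apart.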



Thus, there is a clear distinction between rung 2 and rung 3. As the example shows, you can't answer counterfactual questions with just information and assumptions about interventions. This is made clear by the three steps for computing a counterfactual:





  1. Step 1 (abduction): update the probability of the unobserved factors $P(u)$ in light of the observed evidence $e$, obtaining $P(u|e)$.

  2. Step 2 (action): perform the action in the model (for instance $do(x)$).

  3. Step 3 (prediction): predict $Y$ in the modified model.


This will not be possible to compute without some functional information about the causal model, or without some information about latent variables.
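The three steps can be sketched in Python for the query $P(Y_{0} = 0|X = 1, Y = 1)$. This is an illustration under assumed structural equations, not necessarily the ones in the original figure: $U$ is a fair coin independent of $X$, Model 1 has $Y = U$, and Model 2 has $Y = XU + (1-X)(1-U)$. The helper `counterfactual` and its parameter names are hypothetical.

```python
from fractions import Fraction

# Assumed structural equations (y = 1 means died, y = 0 means recovered).
def model_1(x, u):
    return u

def model_2(x, u):
    return x * u + (1 - x) * (1 - u)

PRIOR_U = {0: Fraction(1, 2), 1: Fraction(1, 2)}  # assumed P(u)

def counterfactual(f, x_obs, y_obs, x_cf, y_cf):
    """P(Y_{x_cf} = y_cf | X = x_obs, Y = y_obs) via the three steps."""
    # Step 1 (abduction): P(u | e) is proportional to
    # P(u) * 1[f(x_obs, u) == y_obs]; X is independent of U here,
    # so P(x_obs) cancels in the normalization.
    weights = {u: p for u, p in PRIOR_U.items() if f(x_obs, u) == y_obs}
    total = sum(weights.values())
    if total == 0:
        raise ValueError("evidence has probability zero under this model")
    posterior = {u: p / total for u, p in weights.items()}
    # Step 2 (action): set X to x_cf, overriding the observed value.
    # Step 3 (prediction): evaluate Y in the modified model under P(u | e).
    return sum(p for u, p in posterior.items() if f(x_cf, u) == y_cf)

for f in (model_1, model_2):
    print(f.__name__, counterfactual(f, x_obs=1, y_obs=1, x_cf=0, y_cf=0))
```

Note that the function needs `f` itself, that is, how $Y$ responds to $X$ for each value of $u$. That functional information is exactly what the interventional distribution alone does not supply.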





























  • Interesting answer! A couple of follow-ups: 1) You say "With Rung 3 information you can answer Rung 2 questions, but not the other way around". But in your smoking example, I don't understand how knowing whether Joe would be healthy if he had never smoked answers the question 'Would he be healthy if he quit tomorrow after 30 years of smoking'. They seem like distinct questions, so I think I'm missing something.
    – mkt
    Dec 3 at 9:24










  • Also, your subsequent worked example relies on the 2 unobserved variables being nonrandomly distributed between the treatment and control. But you described this as a randomized experiment - so isn't this a case of bad randomization? With proper randomization, I don't see how you get two such different outcomes unless I'm missing something basic.
    – mkt
    Dec 3 at 9:26












  • @mkt from last to first. The unobserved variable is randomly distributed between treated and control: you have exactly 50% of each category of u in both arms. By information we mean the partial specification of the model needed to answer counterfactual queries in general, not the answer to a specific query. To answer counterfactual queries you need the causal structure + some functional information or information about the distribution of the latent variables.
    – Carlos Cinelli
    Dec 3 at 9:34






































1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
11
down vote



accepted










There is no contradiction between the factual world and the action of interest in the interventional level. For example, smoking until today and being forced to quit smoking starting tomorrow are not in contradiction with each other, even though you could say one “negates” the other. But now imagine the following scenario. You know Joe, a lifetime smoker who has lung cancer, and you wonder: what if Joe had not smoked for thirty years, would he be healthy today? In this case we are dealing with the same person, in the same time, imagining a scenario where action and outcome are in direct contradiction with known facts.



Thus, the main difference of interventions and counterfactuals is that, whereas in interventions you are asking what will happen on average if you perform an action, in counterfactuals you are asking what would have happened had you taken a different course of action in a specific situation, given that you have information about what actually happened. Note that, since you already know what happened in the actual world, you need to update your information about the past in light of the evidence you have observed.



These two types of queries are mathematically distinct because they require different levels of information to be answered (counterfactuals need more information to be answered) and even more elaborate language to be articulated!.



With the information needed to answer Rung 3 questions you can answer Rung 2 questions, but not the other way around. More precisely, you cannot answer counterfactual questions with just interventional information. Examples where the clash of interventions and counterfactuals happens were already given here in CV, see this post and this post. However, for the sake of completeness, I will include an example here as well.



The example below can be found in Causality, section 1.4.4.



Consider that you have performed a randomized experiment where patients were randomly assigned (50% / 50%) to treatment ($x =1$) and control conditions ($x=0$), and in both treatment and control groups 50% recovered ($y=0$) and 50% died ($y=1$). That is $P(y|x) = 0.5~~~forall x,y$.



The result of the experiment tells you that the average causal effect of the intervention is zero. This is a rung 2 question, $P(Y = 1|do(X = 1)) - P(Y=1|do(X =0) = 0$.



But now let us ask the following question: what percentage of those patients who died under treatment would have recovered had they not taken the treatment? Mathematically, you want to compute $P(Y_{0} = 0|X =1, Y = 1)$.



This question cannot be answered just with the interventional data you have. The proof is simple: I can create two different causal models that will have the same interventional distributions, yet different counterfactual distributions. The two are provided below:



enter image description here



Here, $U$ amounts to unobserved factors that explain how the patient reacts to the treatment. You can think of factors that explain treatment heterogeneity, for instance. Note the marginal distribution $P(y, x)$ of both models agree.



Note that, in the first model, no one is affected by the treatment, thus the percentage of those patients who died under treatment that would have recovered had they not taken the treatment is zero.



However, in the second model, every patient is affected by the treatment, and we have a mixture of two populations in which the average causal effect turns out to be zero. In this example, the counterfactual quantity now goes to 100% --- in Model 2, all patients who died under treatment would have recovered had they not taken the treatment.



Thus, there's a clear distinction of rung 2 and rung 3. As the example shows, you can't answer counterfactual questions with just information and assumptions about interventions. This is made clear with the three steps for computing a counterfactual:





  1. Step 1 (abduction): update the probability of unobserved factors $P(u)$ in light of the observed evidence $P(u|e)$


  2. Step 2 (action): perform the action in the model (for instance $do(x))$.


  3. Step 3 (prediction): predict $Y$ in the modified model.


This will not be possible to compute without some functional information about the causal model, or without some information about latent variables.






share|cite|improve this answer























  • Interesting answer! A couple of follow-ups: 1) You say "With Rung 3 information you can answer Rung 2 questions, but not the other way around". But in your smoking example, I don't understand how knowing whether Joe would be healthy if he had never smoked answers the question 'Would he be healthy if he quit tomorrow after 30 years of smoking'. They seem like distinct questions, so I think I'm missing something.
    – mkt
    Dec 3 at 9:24










  • Also, your subsequent worked example relies on the 2 unobserved variables being nonrandomly distributed between the treatment and control. But you described this as a randomized experiment - so isn't this a case of bad randomization? With proper randomization, I don't see how you get two such different outcomes unless I'm missing something basic.
    – mkt
    Dec 3 at 9:26












  • @mkt from last to first. The unobserved variable is randomly distrubuted between treated and control, you have exactly 50% of each category of u in both arms. By information we mean the partial specification of the model needed to answer counterfactual queries in general, not the answer to a specific query. To answer counterfactual queries you need the causal structure + some functional information or information of the distribution of the latent variables.
    – Carlos Cinelli
    Dec 3 at 9:34

















up vote
11
down vote



accepted










There is no contradiction between the factual world and the action of interest in the interventional level. For example, smoking until today and being forced to quit smoking starting tomorrow are not in contradiction with each other, even though you could say one “negates” the other. But now imagine the following scenario. You know Joe, a lifetime smoker who has lung cancer, and you wonder: what if Joe had not smoked for thirty years, would he be healthy today? In this case we are dealing with the same person, in the same time, imagining a scenario where action and outcome are in direct contradiction with known facts.



Thus, the main difference of interventions and counterfactuals is that, whereas in interventions you are asking what will happen on average if you perform an action, in counterfactuals you are asking what would have happened had you taken a different course of action in a specific situation, given that you have information about what actually happened. Note that, since you already know what happened in the actual world, you need to update your information about the past in light of the evidence you have observed.



These two types of queries are mathematically distinct because they require different levels of information to be answered (counterfactuals need more information to be answered) and even more elaborate language to be articulated!.



With the information needed to answer Rung 3 questions you can answer Rung 2 questions, but not the other way around. More precisely, you cannot answer counterfactual questions with just interventional information. Examples where the clash of interventions and counterfactuals happens were already given here in CV, see this post and this post. However, for the sake of completeness, I will include an example here as well.



The example below can be found in Causality, section 1.4.4.



Consider that you have performed a randomized experiment where patients were randomly assigned (50% / 50%) to treatment ($x =1$) and control conditions ($x=0$), and in both treatment and control groups 50% recovered ($y=0$) and 50% died ($y=1$). That is $P(y|x) = 0.5~~~forall x,y$.



The result of the experiment tells you that the average causal effect of the intervention is zero. This is a rung 2 question, $P(Y = 1|do(X = 1)) - P(Y=1|do(X =0) = 0$.



But now let us ask the following question: what percentage of those patients who died under treatment would have recovered had they not taken the treatment? Mathematically, you want to compute $P(Y_{0} = 0|X =1, Y = 1)$.



This question cannot be answered just with the interventional data you have. The proof is simple: I can create two different causal models that will have the same interventional distributions, yet different counterfactual distributions. The two are provided below:



enter image description here



Here, $U$ amounts to unobserved factors that explain how the patient reacts to the treatment. You can think of factors that explain treatment heterogeneity, for instance. Note the marginal distribution $P(y, x)$ of both models agree.



Note that, in the first model, no one is affected by the treatment, thus the percentage of those patients who died under treatment that would have recovered had they not taken the treatment is zero.



However, in the second model, every patient is affected by the treatment, and we have a mixture of two populations in which the average causal effect turns out to be zero. In this example, the counterfactual quantity now goes to 100% --- in Model 2, all patients who died under treatment would have recovered had they not taken the treatment.



Thus, there's a clear distinction of rung 2 and rung 3. As the example shows, you can't answer counterfactual questions with just information and assumptions about interventions. This is made clear with the three steps for computing a counterfactual:





  1. Step 1 (abduction): update the probability of unobserved factors $P(u)$ in light of the observed evidence $P(u|e)$


  2. Step 2 (action): perform the action in the model (for instance $do(x))$.


  3. Step 3 (prediction): predict $Y$ in the modified model.


This will not be possible to compute without some functional information about the causal model, or without some information about latent variables.






share|cite|improve this answer























  • Interesting answer! A couple of follow-ups: 1) You say "With Rung 3 information you can answer Rung 2 questions, but not the other way around". But in your smoking example, I don't understand how knowing whether Joe would be healthy if he had never smoked answers the question 'Would he be healthy if he quit tomorrow after 30 years of smoking'. They seem like distinct questions, so I think I'm missing something.
    – mkt
    Dec 3 at 9:24










  • Also, your subsequent worked example relies on the 2 unobserved variables being nonrandomly distributed between the treatment and control. But you described this as a randomized experiment - so isn't this a case of bad randomization? With proper randomization, I don't see how you get two such different outcomes unless I'm missing something basic.
    – mkt
    Dec 3 at 9:26












  • @mkt from last to first. The unobserved variable is randomly distrubuted between treated and control, you have exactly 50% of each category of u in both arms. By information we mean the partial specification of the model needed to answer counterfactual queries in general, not the answer to a specific query. To answer counterfactual queries you need the causal structure + some functional information or information of the distribution of the latent variables.
    – Carlos Cinelli
    Dec 3 at 9:34















up vote
11
down vote



accepted







up vote
11
down vote



accepted






There is no contradiction between the factual world and the action of interest in the interventional level. For example, smoking until today and being forced to quit smoking starting tomorrow are not in contradiction with each other, even though you could say one “negates” the other. But now imagine the following scenario. You know Joe, a lifetime smoker who has lung cancer, and you wonder: what if Joe had not smoked for thirty years, would he be healthy today? In this case we are dealing with the same person, in the same time, imagining a scenario where action and outcome are in direct contradiction with known facts.



Thus, the main difference of interventions and counterfactuals is that, whereas in interventions you are asking what will happen on average if you perform an action, in counterfactuals you are asking what would have happened had you taken a different course of action in a specific situation, given that you have information about what actually happened. Note that, since you already know what happened in the actual world, you need to update your information about the past in light of the evidence you have observed.



These two types of queries are mathematically distinct because they require different levels of information to be answered (counterfactuals need more information to be answered) and even more elaborate language to be articulated!.



With the information needed to answer Rung 3 questions you can answer Rung 2 questions, but not the other way around. More precisely, you cannot answer counterfactual questions with just interventional information. Examples where the clash of interventions and counterfactuals happens were already given here in CV, see this post and this post. However, for the sake of completeness, I will include an example here as well.



The example below can be found in Causality, section 1.4.4.



Consider that you have performed a randomized experiment where patients were randomly assigned (50% / 50%) to treatment ($x =1$) and control conditions ($x=0$), and in both treatment and control groups 50% recovered ($y=0$) and 50% died ($y=1$). That is $P(y|x) = 0.5~~~forall x,y$.



The result of the experiment tells you that the average causal effect of the intervention is zero. This is a rung 2 question, $P(Y = 1|do(X = 1)) - P(Y=1|do(X =0) = 0$.



But now let us ask the following question: what percentage of those patients who died under treatment would have recovered had they not taken the treatment? Mathematically, you want to compute $P(Y_{0} = 0|X =1, Y = 1)$.



This question cannot be answered just with the interventional data you have. The proof is simple: I can create two different causal models that will have the same interventional distributions, yet different counterfactual distributions. The two are provided below:



enter image description here



Here, $U$ amounts to unobserved factors that explain how the patient reacts to the treatment. You can think of factors that explain treatment heterogeneity, for instance. Note the marginal distribution $P(y, x)$ of both models agree.



Note that, in the first model, no one is affected by the treatment, thus the percentage of those patients who died under treatment that would have recovered had they not taken the treatment is zero.



However, in the second model, every patient is affected by the treatment, and we have a mixture of two populations in which the average causal effect turns out to be zero. In this example, the counterfactual quantity now goes to 100% --- in Model 2, all patients who died under treatment would have recovered had they not taken the treatment.



Thus, there's a clear distinction of rung 2 and rung 3. As the example shows, you can't answer counterfactual questions with just information and assumptions about interventions. This is made clear with the three steps for computing a counterfactual:





  1. Step 1 (abduction): update the probability of unobserved factors $P(u)$ in light of the observed evidence $P(u|e)$


  2. Step 2 (action): perform the action in the model (for instance $do(x))$.


  3. Step 3 (prediction): predict $Y$ in the modified model.


This will not be possible to compute without some functional information about the causal model, or without some information about latent variables.






share|cite|improve this answer














There is no contradiction between the factual world and the action of interest in the interventional level. For example, smoking until today and being forced to quit smoking starting tomorrow are not in contradiction with each other, even though you could say one “negates” the other. But now imagine the following scenario. You know Joe, a lifetime smoker who has lung cancer, and you wonder: what if Joe had not smoked for thirty years, would he be healthy today? In this case we are dealing with the same person, in the same time, imagining a scenario where action and outcome are in direct contradiction with known facts.



Thus, the main difference of interventions and counterfactuals is that, whereas in interventions you are asking what will happen on average if you perform an action, in counterfactuals you are asking what would have happened had you taken a different course of action in a specific situation, given that you have information about what actually happened. Note that, since you already know what happened in the actual world, you need to update your information about the past in light of the evidence you have observed.



These two types of queries are mathematically distinct because they require different levels of information to be answered (counterfactuals need more information to be answered) and even more elaborate language to be articulated!.



With the information needed to answer Rung 3 questions you can answer Rung 2 questions, but not the other way around. More precisely, you cannot answer counterfactual questions with just interventional information. Examples where the clash of interventions and counterfactuals happens were already given here in CV, see this post and this post. However, for the sake of completeness, I will include an example here as well.



The example below can be found in Causality, section 1.4.4.



Suppose you have performed a randomized experiment where patients were randomly assigned (50% / 50%) to treatment ($x=1$) and control ($x=0$) conditions, and in both treatment and control groups 50% recovered ($y=0$) and 50% died ($y=1$). That is, $P(y|x) = 0.5~~~\forall x,y$.



The result of the experiment tells you that the average causal effect of the intervention is zero. This is a rung 2 question: $P(Y = 1|do(X = 1)) - P(Y = 1|do(X = 0)) = 0$.



But now let us ask the following question: what percentage of those patients who died under treatment would have recovered had they not taken the treatment? Mathematically, you want to compute $P(Y_{0} = 0|X =1, Y = 1)$.



This question cannot be answered just with the interventional data you have. The proof is simple: I can create two different causal models that will have the same interventional distributions, yet different counterfactual distributions. The two are provided below:



[Figure: the two causal models, Model 1 and Model 2]



Here, $U$ stands for unobserved factors that explain how the patient reacts to the treatment. You can think of factors that explain treatment heterogeneity, for instance. Note that the marginal distributions $P(y, x)$ of both models agree.
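
The agreement between the two models at rungs 1 and 2 can be checked by brute-force enumeration. The following is only a minimal sketch: the figure is not reproduced here, so the structural equations are assumptions chosen to match the description in this answer, namely $y = u$ for Model 1 and $y = xu + (1-x)(1-u)$ for Model 2, with $u$ a fair coin independent of the randomized $x$. The function names `model_1`, `model_2`, `joint_xy` and `p_death_do_x` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

HALF = Fraction(1, 2)

# Assumed structural equations for Y (the original figure is not available).
def model_1(x, u):
    # Treatment has no effect on anyone: the outcome is fixed by u alone.
    return u

def model_2(x, u):
    # Treatment flips the outcome of every patient.
    return x * u + (1 - x) * (1 - u)

def joint_xy(f):
    """Observed joint P(x, y), with x and u independent fair coins."""
    dist = {}
    for x, u in product((0, 1), repeat=2):
        key = (x, f(x, u))
        dist[key] = dist.get(key, Fraction(0)) + HALF * HALF
    return dist

def p_death_do_x(f, x):
    """Interventional probability P(Y = 1 | do(X = x))."""
    return sum((HALF for u in (0, 1) if f(x, u) == 1), Fraction(0))

for name, f in (("Model 1", model_1), ("Model 2", model_2)):
    ace = p_death_do_x(f, 1) - p_death_do_x(f, 0)
    print(name, "P(x, y):", joint_xy(f), "average causal effect:", ace)
```

Under these assumed equations, both models produce the same observed joint distribution and the same average causal effect, so no amount of observational or experimental data on $X$ and $Y$ alone can tell them apart.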



Note that, in the first model, no one is affected by the treatment, thus the percentage of those patients who died under treatment that would have recovered had they not taken the treatment is zero.



However, in the second model, every patient is affected by the treatment, and we have a mixture of two populations in which the average causal effect turns out to be zero. In this example, the counterfactual quantity now goes to 100%: in Model 2, all patients who died under treatment would have recovered had they not taken the treatment.



Thus, there's a clear distinction between rung 2 and rung 3. As the example shows, you can't answer counterfactual questions with just information and assumptions about interventions. This is made clear by the three steps for computing a counterfactual:





  1. Step 1 (abduction): update the probability of the unobserved factors from $P(u)$ to $P(u|e)$ in light of the observed evidence $e$.


  2. Step 2 (action): perform the action in the model (for instance, $do(x)$).


  3. Step 3 (prediction): predict $Y$ in the modified model.


This will not be possible to compute without some functional information about the causal model, or without some information about latent variables.
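
The three steps can be sketched in code for the binary example above. This is a hedged illustration rather than a general implementation: the structural equations `model_1` ($y = u$) and `model_2` ($y = xu + (1-x)(1-u)$) and the uniform prior on $u$ are assumptions chosen to match the description of the two models, and `counterfactual` is a hypothetical helper name.

```python
from fractions import Fraction

# Assumed prior over the latent factor: a fair coin.
PRIOR_U = {0: Fraction(1, 2), 1: Fraction(1, 2)}

# Assumed structural equations for Y (illustrative, matching the text).
def model_1(x, u):
    return u  # no patient is affected by treatment

def model_2(x, u):
    return x * u + (1 - x) * (1 - u)  # every patient is affected

def counterfactual(f, x_obs, y_obs, x_cf, y_cf, prior=PRIOR_U):
    """P(Y_{x_cf} = y_cf | X = x_obs, Y = y_obs) for a model y = f(x, u)."""
    # Step 1 (abduction): condition the latent u on the evidence.
    # X is randomized, so only the outcome equation constrains u.
    weights = {u: p for u, p in prior.items() if f(x_obs, u) == y_obs}
    total = sum(weights.values(), Fraction(0))
    if total == 0:
        raise ValueError("evidence has probability zero under this model")
    posterior = {u: w / total for u, w in weights.items()}
    # Step 2 (action): replace the mechanism for X by the constant x_cf.
    # Step 3 (prediction): push the posterior over u through the modified model.
    return sum((p for u, p in posterior.items() if f(x_cf, u) == y_cf),
               Fraction(0))

# Among patients who died under treatment (X = 1, Y = 1), the share that
# would have recovered (Y = 0) without treatment (X set to 0):
for name, f in (("Model 1", model_1), ("Model 2", model_2)):
    print(name, counterfactual(f, x_obs=1, y_obs=1, x_cf=0, y_cf=0))
```

Note that the abduction step needs both the functional form `f` and the prior over `u`, which is exactly the information that interventional data alone do not provide.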















edited Dec 3 at 9:50

























answered Dec 1 at 19:08









Carlos Cinelli

5,94442350
















  • Interesting answer! A couple of follow-ups: 1) You say "With Rung 3 information you can answer Rung 2 questions, but not the other way around". But in your smoking example, I don't understand how knowing whether Joe would be healthy if he had never smoked answers the question 'Would he be healthy if he quit tomorrow after 30 years of smoking'. They seem like distinct questions, so I think I'm missing something.
    – mkt
    Dec 3 at 9:24










  • Also, your subsequent worked example relies on the 2 unobserved variables being nonrandomly distributed between the treatment and control. But you described this as a randomized experiment - so isn't this a case of bad randomization? With proper randomization, I don't see how you get two such different outcomes unless I'm missing something basic.
    – mkt
    Dec 3 at 9:26












  • @mkt From last to first: the unobserved variable is randomly distributed between treated and control; you have exactly 50% of each category of u in both arms. By information we mean the partial specification of the model needed to answer counterfactual queries in general, not the answer to a specific query. To answer counterfactual queries you need the causal structure plus some functional information or information about the distribution of the latent variables.
    – Carlos Cinelli
    Dec 3 at 9:34







































