Everything you learned about causal inference in academia is true. It's also not enough, and most of us doing applied causal inference experience this.
What's different is the gravity of the decisions that lean on the analysis: not every decision deserves the same level of evidence. Match the rigour of your causal inference to the gravity of the decision, or waste resources.
Take product discovery. Before building and shipping, many assumptions need validation at multiple steps. Aiming to nail each answer with perfect causal inference; for what? It moves you up one square on a board of many related, even necessary, but individually insufficient decisions. The risk is already spread, hedged, across many decisions, thanks to a process that values incremental evidence, learning, and iteration.
At the same time, causal inference comes with a material opportunity cost: the rigour delays time-to-impact, while there may have been a project waiting for you where that rigour was actually needed to improve decision quality (reduce risk, improve accuracy and reliability).
Final vs. constructive decisions is my go-to framing to make this idea simple:
- Constructive decisions move you forward in a process. "Should we explore this feature further?", "Is this user problem worth investigating?" Getting it wrong costs you a sprint, maybe two, while getting it right doesn't change the company, yet.
- Final decisions commit resources or change direction, and getting them wrong is costly or hard to reverse: "Should we invest $2M in building this out?" "Should we kill this product line?", "Should we allocate more marketing budget to this or that channel?"
In tech, the volume and pace of decisions is unparalleled. Sometimes, these are final decisions. But far more common are constructive ones.
As data scientists we're involved in both types, and failing to recognise which one we are dealing with leads to posing the wrong questions or chasing the wrong answers, ultimately wasting resources.
In this article I want to surface three rules that I keep coming back to when embarking on causal inference projects:
- Start with the problem, not with the answer
- If you can solve it more easily without causal inference, do it
- Do 80/20 in your causal inference project too
Rules rarely sound fun. But these have genuinely improved my impact, a lot.
Let's unpack that.
1. Start with the problem, not the answer
Every causal inference project starts with the problem you're trying to solve; not with the identification strategy and the estimator. It's the perfect example of doing the right thing over doing things right. Your methods can be on point, but what's the value if you're solving for the wrong thing? Nudge yourself to kick off a project with a crystal-clear business problem backing it up, and 50% of the work is done before you even start.
If you're highly technical, chances are you know the anatomy of a causal inference project: from DAG to model, to inference, to sensitivity analysis, and answers.
But do you know the anatomy of problem solving in organisations?
The problem behind the problem
Big problems get broken down into smaller ones. That's simply more workable for a team that needs to find solutions. And it allows us to mobilise multiple teams to solve different parts of the bigger (sub)problem. The same goes across roles within one team: you're estimating churn drivers; your PM needs that to decide whether to invest in retention or acquisition.
That's the catch: the problem you, the data scientist, are solving is often not the endgame.
Your problem is nested inside someone else's. Other people, around you and above you, need your answer as one input to their solution. Recognise that dependency, and you can tailor your causal inference to what actually matters upstream. The wins are concrete: tighter alignment on the causal estimand of interest, or quicker discarding of causal inference altogether. Bottom line: shorter time-to-insight.
At one point I was into network theory (Markov Random Fields were what made me understand DAGs back in 2018). Everything was a network in my head. So I went and built a network of our internal BI capability usage. All dashboards were nodes, with thicker edges between them when they were used by the same users. I calculated all sorts of centrality metrics; I identified influential dashboards: dashboards that brought departments together; and much more. I built a whole story around it, but actions never followed. The issue was that I had never paid attention to the problem my stakeholders were trying to solve. Perhaps I assumed the decision was of the final kind, while it was a constructive one all along. A simple count of dashboard usage could've done the job, but I treated it as a research project.
That was me then. And it wasn't the last time something like that happened. But the lesson learned is to start with the problem, not with the answers.
The anti-rule: looking at the wrong problems
If you want a quick way to throw away money, go solve the wrong problems. Not only will the solutions have no material outcome, but the opportunity cost of not solving the right problem in that time will add up.
So, in being eager to find the problem behind the problem, be critical about whether it's the right one to begin with, once you find it.
In that sense, starting with the answers does offer a remedy. But it works slightly differently. Ask yourself:
- If we do get these answers, what do we know that we didn't know before?
- If we know that, then so what?
If the answer to the so-what question makes a lot of sense, not only to you but also to your manager and their manager (presumably), then you're on the right problem.
Magical.
2. If you can solve it more easily without causal inference, then do it
There's no cookie-cutter causal inference. Methods become canonical because we've mapped their assumptions well; not because applying them is mechanical. Every situation can violate those assumptions in its own way, and each one deserves full rigour.
The trouble with that, though, is that we can't justify doing so for all of them, resource-wise.
That's when applying causal inference becomes a cost-effectiveness exercise: how much of our resources do we put in, so that we reach the desired outcome with the necessary level of confidence?
Ask yourself that question next time.
Thankfully, not every analysis needs to be as rigorous as a full causal inference project for the return on investment to tip over to the positive side.
The alternatives: common sense, domain knowledge, and associative analysis can derive good-enough answers too.
It definitely hurts a bit to say this; the principled and rigorous me hates me now. But I've learned that it pays to approach the trade-off as a strategic choice.
Right here’s an instance to deliver it house:
The query is: ought to we make investments additional in function A? Now, I can simply flip this round to: what’s the impression of function A on person acquisition/retention? (a quite common angle to absorb a SaaS scenario; and a causal query at its coronary heart)
If it’s excessive, then we spend money on it, in any other case not.
That phrase impression alone places me straight right into a causal inference mode, as a result of impression ≠ affiliation. However we all know that’s pricey. Is the issue value it? What’s the choice?
One strategy is to grasp how many customers are utilizing this function in any respect. How frequent do they use it, provided that they selected to make use of it? That signifies how useful a function might be, and sign that we are able to additional make investments on this function. No diff-in-diff, nor IPSW, nor A/B take a look at: but when these solutions return unfavourable, would a exact causal inference matter nonetheless?
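To make the descriptive alternative concrete, here's a minimal sketch of the two numbers it boils down to: what share of users touch the feature at all, and how intensely adopters use it. The event table and column names are invented for illustration.

```python
import pandas as pd

# Hypothetical usage log: one row per user with their feature-A usage count
events = pd.DataFrame({
    "user_id": [1, 2, 3, 4, 5, 6],
    "feature_a_uses": [0, 0, 0, 5, 12, 1],
})

# Users who touched the feature at least once
adopters = events[events["feature_a_uses"] > 0]

# Share of all users who adopted the feature
adoption_rate = len(adopters) / len(events)

# Usage intensity, conditional on having adopted
median_uses = adopters["feature_a_uses"].median()

print(f"adoption rate: {adoption_rate:.0%}, median uses among adopters: {median_uses}")
```

If both numbers come back low, the "should we invest further?" question may already be answered without any identification strategy.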
The truth may be in the middle; answers to those questions may be more indicative than decisive, and the main question may still feel open. But surely less open than when you started: if these answers ignite deeper research, then the product team is in motion, and likely in the right direction. Perhaps more rigorous causal inference follows.
The anti-rule: skipping causal inference is dangerous
Say the product team picks up the signals from your analysis and makes some material "improvements" to the feature. The sample size is low and they're short on time, so they skip the A/B test and launch it immediately.
Enthusiastic experimenters lose it at this point. I think it could very well be the right decision, if somebody did the math and concluded there's more at stake in experimenting than in not experimenting. Of course, I've kept the case so generic that nobody can really defend either side. That would go beyond the point.
But then, while the team jumps onto the next sprint, product management still stresses how important it is to learn something from what was launched. They still want to a) get a feel for the impact, and b) know whether some segments were impacted more or less than others.
You're glad, because learnings -> iterations is exactly the mentality you're trying to foster. But you're also in pain for at least three reasons:
- Lack of exchangeability: the users that went on to use the feature are a highly self-selected set. Contrasting them against non-users. Really?
- Interaction effects: assume that one segment was indeed impacted more than others. Now recall the first point: we're conditioning on highly engaged users. It could be that that segment displayed a higher impact simply because its users were also highly engaged. The same segment may not show that differential impact once we consider less engaged users. But you can't know. Your working data is skewed towards highly engaged users only.
- Collider bias: in a worse case, conditioning on high engagement may flip the relationship between segments and the outcome of interest. The analysis would steer the team in the wrong direction.
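The collider point is easy to demonstrate with simulated data. In this sketch (all numbers invented), segment and outcome are truly independent in the full population, but engagement depends on both; restricting the analysis to engaged users manufactures a segment gap out of nothing.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Segment and outcome are independent in the full population
segment = rng.binomial(1, 0.5, n)   # e.g. SMB vs enterprise
outcome = rng.normal(0, 1, n)       # e.g. retention propensity

# Engagement is a collider: both segment and outcome feed into it
engaged = (segment + outcome + rng.normal(0, 1, n)) > 1.0

# Unconditional: segment tells you nothing about the outcome
diff_all = outcome[segment == 1].mean() - outcome[segment == 0].mean()

# Conditioning on engaged users opens the collider path
diff_engaged = (outcome[(segment == 1) & engaged].mean()
                - outcome[(segment == 0) & engaged].mean())

print(f"full population gap: {diff_all:+.3f}")   # near zero
print(f"among engaged users: {diff_engaged:+.3f}")  # clearly negative
```

Nothing causal happened between segment and outcome, yet the engaged-only analysis shows a sizeable gap; that's the bias you inherit when the launch skips randomisation and you analyse self-selected users.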
3. Do 80/20 in your causal inference project too
The title is a false friend. I'm not saying half-bake your analysis: when the question demands full rigour, give it. The 80/20 is about where your effort goes across a decision, not how deep you drill into the causal piece.
Recall the nested-problems idea. Your causal inference project often sits inside a larger business decision, and it is rarely the only dimension that matters. The stakeholder has to weigh cost, timing, strategic fit, reversibility; alongside your estimate. Causal inference is not everything we need to know.
If your causal answer carries 30% of the weight in that decision, treating it like 100% is a waste. Worse: it's a waste with an opportunity cost, because the other 70% sits unanswered.
This is where the final-vs-constructive framing earns its keep. For constructive decisions, spreading effort across dimensions almost always beats drilling into one. For final decisions, the causal dimension often is the core, and the math tips the other way.
Rules 1, 2, and 3 overlap but they are not the same. Rule 1 asked whether you're tackling the right problem. Rule 2 asked whether you need causal inference at all. Rule 3 assumes you've cleared both. Now the question is: within the project, are you answering the right questions, plural, and letting causal inference carry only the weight that's actually on it?
Ship the decision, not the estimate
A recent project: estimate the effect of a new pricing tier on revenue per user. Instinctively, I reached for the cleanest identification strategy I could deploy. Difference-in-differences with parallel-trends sensitivity, placebo tests, maybe a synthetic control for good measure. A month's work, easily.
But when I zoomed out, the PM had three open questions, not one:
- What's the effect on revenue per user? (causal)
- Are we cannibalising the existing tier? (causal, different outcome)
- How reversible is this if it tanks? (not causal; an ops and product question)
Spending a month on question 1 would have left 2 and 3 half-answered. The decision needed all three to be roughly right, not one to be precisely right. So: a tighter diff-in-diff on question 1 in two weeks, with explicit caveats, and the remaining time on 2 and 3. The stakeholder walked into the decision meeting with a balanced picture rather than one number and two shrugs.
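For question 1, the "tighter diff-in-diff" can be as small as a two-period comparison of changes. A toy sketch with simulated data (all names and numbers hypothetical; the true effect is set to +1.5 on top of a common +1.0 time trend):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(42)
n = 5_000  # users per group

# Hypothetical panel: revenue per user before/after the pricing-tier launch,
# for a rollout group (treated=1) vs a comparison group (treated=0)
df = pd.DataFrame({"treated": np.repeat([0, 1], n)})
df["pre"] = 20 + 2 * df["treated"] + rng.normal(0, 3, 2 * n)  # baseline gap is fine
df["post"] = df["pre"] + 1.0 + 1.5 * df["treated"] + rng.normal(0, 3, 2 * n)

# Diff-in-diff: difference of the within-group before/after changes,
# which nets out both the baseline gap and the common time trend
df["change"] = df["post"] - df["pre"]
group_change = df.groupby("treated")["change"].mean()
did = group_change.loc[1] - group_change.loc[0]

print(f"diff-in-diff estimate: {did:.2f}")  # should land near the true +1.5
```

The explicit caveat that travels with this number: it only identifies the effect if the comparison group's trend is a valid counterfactual (parallel trends), which is exactly what the sensitivity and placebo work would probe if the decision warranted the extra weeks.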
The anti-rule: when the causal question is the decision
If you 80/20 a causal inference project where the causal estimate is the whole decision, you've hollowed out the analysis.
This is the final-decision scenario. "Should we invest $2M in this channel?" "Does this treatment cause a meaningful reduction in churn?" When the other dimensions are either already nailed down or genuinely secondary, the causal estimate is not one of many inputs; it's the input. Cutting corners there to free up time for work that doesn't change the decision inverts the original rule: now you're misallocating the other way.
The skill is knowing which situation you're in. A quick test: if you can't list three dimensions your stakeholder needs besides your estimate, your causal answer probably is the decision. Don't 80/20 that one.
So, what now?
These rules apply across all analytical work, not just causal inference. But causal inference is where I've felt it hardest in my past roles.
Whenever I feel the pull of a clean synthetic control for a question nobody asked, these are the reminders I tape to my own forehead.
The methods come from studying them. That's something I won't stop doing. But out there, on the battlefield, let's be sharp about when applying them does good, and when it doesn't.
If one of these rules saves you a sprint next time, or an argument with a PM, that's already a win; and these wins compound. Rigour shows up when it matters. The rest of your time goes to problems that also matter.
I'd be glad to have a dose of healthy debate with you about all of the above. Connect with me on LinkedIn, or follow my personal website for content like this!

