
An Intuitive Explanation of Inverse Propensity Weighting in Causal Inference | by Murat Unal | Jan, 2023


Photo by Diego PH on Unsplash

One of the well-established methods for causal inference is based on Inverse Propensity Weighting (IPW). In this post we'll use a simple example to build an intuition for IPW. Specifically, we'll see how IPW is derived from a simple weighted average in order to account for different treatment assignment rates in causal analysis.

Let's consider a simple example where we want to estimate the average effect of running a marketing coupon campaign on customer spending. We run the campaign in two stores by randomly assigning a coupon to existing customers. Suppose both stores have the same number of customers and, unknown to us, spending among treated customers is distributed as N(20, 3²) and N(40, 3²) in stores 1 and 2, respectively.

Throughout the example, Yi(1) represents an individual's spending if they receive a coupon, Ti = 1, and Yi(0) represents their spending if they don't, Ti = 0. These random variables are called potential outcomes. The observed outcome Yi is related to the potential outcomes as follows:

Yi = Ti*Yi(1) + (1 − Ti)*Yi(0)
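To make this switching concrete, here is a minimal sketch (with hypothetical numbers, not part of the original analysis) showing that each customer reveals only one of their two potential outcomes:

import numpy as np

rng = np.random.default_rng(0)
y1 = rng.normal(20, 3, 5)       # potential spending with a coupon
y0 = rng.normal(10, 2, 5)       # potential spending without a coupon
t = rng.binomial(1, 0.5, 5)     # random treatment assignment
y_obs = t * y1 + (1 - t) * y0   # the switching equation above
print(np.c_[t, y_obs])          # only one potential outcome is observed per row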

Our estimand, the thing that we want to estimate, is the population mean spending given a coupon, E[Yi(1)]. If we randomly assign coupons to the same number of customers in both stores, we can get an unbiased estimate of this by simply averaging the observed spending of the treated customers, which is 0.5*$20 + 0.5*$40 = $30.

Mathematically, this looks as follows:

E[Yi | Ti = 1] = E[Yi(1) | Ti = 1] = E[Yi(1)]

where the first equality is due to the definition of potential outcomes, and the last follows from random assignment of treatment, which makes the potential outcomes independent of treatment assignment:

(Yi(1), Yi(0)) ⊥ Ti

Simple Average

Let's define a function that generates a sample of 2000 customers, randomly assigns 50% of them to treatment in both stores, and records the treated customers' average spending. Let's also run a simulation that calls this function 1000 times.

import numpy as np
import pandas as pd
from joblib import Parallel, delayed
from tqdm import tqdm

def run_campaign(biased=False):
    true_mu1treated, true_mu2treated = 20, 40
    # number of trials, probability of each trial, number of observations
    n, p, obs = 1, .5, 2000
    store = np.random.binomial(n, p, obs) + 1
    df = pd.DataFrame({'store': store})
    probtreat1 = .5

    if biased:
        probtreat2 = .9
    else:
        probtreat2 = .5

    # assign treatment with the store-specific probability
    treat = lambda x: (
        int(np.random.binomial(1, probtreat1, 1)) if x == 1
        else int(np.random.binomial(1, probtreat2, 1))
    )

    # draw treated spending from the store-specific distribution
    spend = lambda x: (
        float(np.random.normal(true_mu1treated, 3, 1)) if (x[0] == 1 and x[1] == 1)
        else float(np.random.normal(true_mu2treated, 3, 1))
    )

    df['treated'] = df['store'].apply(treat)
    df['spend'] = df[['store', 'treated']].apply(tuple, axis=1).apply(spend)

    # simple average spending of treated customers
    simple_value_treated = np.mean(df.query('treated==1')['spend'])

    return [simple_value_treated]

sim = 1000
values = Parallel(n_jobs=4)(delayed(run_campaign)() for _ in tqdm(range(sim)))
results_df = pd.DataFrame(values, columns=['simple_treat'])

The following plot shows that the distribution of the average spending is centered around the true mean of $30.

Figure 1 by author

Now, suppose that for some reason the second store assigned coupons to 90% of its customers, while the first store assigned them to 50%. What happens if we ignore this, use the same approach as before, and take an average of all treated customers' spending? Because customers of the second store have a higher treatment rate, their average spending will carry a larger weight in our estimate and thereby result in an upward bias.

In other words, we no longer have a truly randomized experiment, because the probability of receiving a coupon now depends on the store. Moreover, because treated customers in the two stores also have considerably different average spending, the store a customer belongs to is a confounding variable, in causal inference speak.

Mathematically, if we use the simple average spending of treated customers, then instead of getting this:

E[Yi | Ti = 1] = 0.5*E[Yi(1) | Xi = 1] + 0.5*E[Yi(1) | Xi = 2] = E[Yi(1)]

we end up with this:

E[Yi | Ti = 1] = (5/14)*E[Yi(1) | Xi = 1] + (9/14)*E[Yi(1) | Xi = 2] > E[Yi(1)]

where Xi denotes the store of customer i, and the weights 5/14 and 9/14 are the shares of treated customers coming from each store under the 50% and 90% assignment rates.
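A quick back-of-the-envelope check (a sketch, using the population shares implied by the setup above) makes the size of the bias concrete:

# With equal store sizes, treated customers make up 0.5*0.5 = 25% of the
# population in store 1 and 0.5*0.9 = 45% in store 2.
share1 = 0.25 / (0.25 + 0.45)     # 5/14 of treated customers are in store 1
share2 = 0.45 / (0.25 + 0.45)     # 9/14 of treated customers are in store 2
print(share1 * 20 + share2 * 40)  # ~32.86, well above the true mean of $30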

Indeed, repeating the simulation and plotting the results, we see that the distribution of the average spending is now centered far from the true mean.

sim = 1000
values = Parallel(n_jobs=4)(delayed(run_campaign)(biased=True) for _ in tqdm(range(sim)))
results_df = pd.DataFrame(values, columns=['simple_treat'])

Figure 2 by author

Weighted Average

All is not lost, however. Since we know that our experiment was compromised by the different assignment rates between stores, we can correct for this by taking a weighted average of treated customers' spending, where the weights represent the proportion of customers in each store. In other words, we can recover random assignment of treatment once we condition on the store information:

(Yi(1), Yi(0)) ⊥ Ti | Xi

where Xi represents the store membership of customer i,

and obtain unbiased estimates of our causal estimand, E[Yi(1)].

The math now works as follows:

E[Yi(1)] = E[ E[Yi(1) | Xi] ]
         = P(Xi = 1)*E[Yi(1) | Xi = 1] + P(Xi = 2)*E[Yi(1) | Xi = 2]
         = P(Xi = 1)*E[Yi | Ti = 1, Xi = 1] + P(Xi = 2)*E[Yi | Ti = 1, Xi = 2]

where the first equality is due to the law of iterated expectations and the last is due to conditional independence.

Let n1 and n2 denote the number of customers in the two stores, with n = n1 + n2. Similarly, let n1T and n2T denote the number of treated customers in each store. Then the above estimator can be computed from the data as follows:

(n1/n) * (1/n1T) * Σ{i: Xi=1, Ti=1} Yi + (n2/n) * (1/n2T) * Σ{i: Xi=2, Ti=1} Yi
Sure enough, if we repeat the previous sampling process

def run_campaign2():
    true_mu1treated, true_mu2treated = 20, 40
    # number of trials, probability of each trial, number of observations
    n, p, obs = 1, .5, 2000
    store = np.random.binomial(n, p, obs) + 1
    df = pd.DataFrame({'store': store})

    probtreat1 = .5
    probtreat2 = .9

    treat = lambda x: (
        int(np.random.binomial(1, probtreat1, 1)) if x == 1
        else int(np.random.binomial(1, probtreat2, 1))
    )

    spend = lambda x: (
        float(np.random.normal(true_mu1treated, 3, 1)) if (x[0] == 1 and x[1] == 1)
        else float(np.random.normal(true_mu2treated, 3, 1))
    )

    df['treated'] = df['store'].apply(treat)
    df['spend'] = df[['store', 'treated']].apply(tuple, axis=1).apply(spend)

    simple_value_treated = np.mean(df.query('treated==1')['spend'])

    # store shares, used as weights
    prob1 = df.query('store==1').shape[0] / df.shape[0]
    prob2 = df.query('store==2').shape[0] / df.shape[0]

    # within-store averages of treated spending
    est_mu1treated = np.mean(df.query('treated==1 & store==1')['spend'])
    est_mu2treated = np.mean(df.query('treated==1 & store==2')['spend'])

    weighted_value_treated = prob1 * est_mu1treated + prob2 * est_mu2treated

    return [simple_value_treated, weighted_value_treated]

sim = 1000
values = Parallel(n_jobs=4)(delayed(run_campaign2)() for _ in tqdm(range(sim)))
results_df = pd.DataFrame(values, columns=['simple_treat','weighted_treat'])

we see that the average of the weighted averages is again right on the true mean.

Figure 3 by author

IPW

Let's now do some algebraic manipulation by rewriting the mean spending in store 1:

(n1/n) * (1/n1T) * Σ{i: Xi=1, Ti=1} Yi = (1/n) * Σ{i: Xi=1, Ti=1} Yi / (n1T/n1)

Doing the same for store 2 and plugging both back in, we have the following:

(1/n) * [ Σ{i: Xi=1, Ti=1} Yi / (n1T/n1) + Σ{i: Xi=2, Ti=1} Yi / (n2T/n2) ]

Denote the proportion of treated customers in store 1 as

p̂(1) = n1T/n1

and similarly p̂(2) = n2T/n2 for store 2; then we can simplify the previous equation into:

(1/n) * Σ{i=1..n} Ti*Yi / p̂(Xi)

where p(Xi) is the probability of receiving treatment conditional on the confounding variable, aka the propensity score:

p(Xi) = P(Ti = 1 | Xi)

Notice that we started with one weighted average and ended up with just another weighted average that uses

1/p̂(Xi)

as the weights on treated customers' spending. This is the well-known inverse propensity weighted estimator.
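To see the equivalence numerically, here is a small sanity check on a hypothetical toy dataset (not part of the original analysis): the stratified weighted average and the IPW form give identical answers.

import numpy as np
import pandas as pd

toy = pd.DataFrame({
    'store':   [1, 1, 1, 1, 2, 2, 2, 2],
    'treated': [1, 0, 1, 0, 1, 1, 1, 0],
    'spend':   [22., 9., 18., 11., 41., 38., 43., 10.],
})

# Weighted average: store shares times within-store treated means
weighted = sum(
    (len(g) / len(toy)) * g.query('treated==1')['spend'].mean()
    for _, g in toy.groupby('store')
)

# IPW form: p̂(X) is the treated share within each store
ps = toy.groupby('store')['treated'].transform('mean')
ipw = np.mean(toy['spend'] * toy['treated'] / ps)

print(weighted, ipw)  # identical up to floating-point error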

Running the previous analysis with this estimator

def run_campaign3():
    true_mu1treated, true_mu2treated = 20, 40
    # number of trials, probability of each trial, number of observations
    n, p, obs = 1, .5, 2000
    store = np.random.binomial(n, p, obs) + 1
    df = pd.DataFrame({'store': store})

    probtreat1 = .5
    probtreat2 = .9

    treat = lambda x: (
        int(np.random.binomial(1, probtreat1, 1)) if x == 1
        else int(np.random.binomial(1, probtreat2, 1))
    )

    spend = lambda x: (
        float(np.random.normal(true_mu1treated, 3, 1)) if (x[0] == 1 and x[1] == 1)
        else float(np.random.normal(true_mu2treated, 3, 1))
    )

    df['treated'] = df['store'].apply(treat)
    df['spend'] = df[['store', 'treated']].apply(tuple, axis=1).apply(spend)

    simple_value_treated = np.mean(df.query('treated==1')['spend'])

    # estimate the propensity score in each store:
    ps1 = df.query('treated==1 & store==1').shape[0] / df.query('store==1').shape[0]
    ps2 = df.query('treated==1 & store==2').shape[0] / df.query('store==2').shape[0]
    df['ps'] = pd.Series(np.where(df['store'] == 1, ps1, ps2))

    ipw_value_treated = np.mean((df['spend'] * df['treated']) / df['ps'])

    return [simple_value_treated, ipw_value_treated]

sim=1000
values = Parallel(n_jobs=4)(delayed(run_campaign3)() for _ in tqdm(range(sim)))
results_df = pd.DataFrame(values, columns=['simple_treat','ipw_treat'])

gives us the same unbiased estimate as before.

Figure 4 by author

Estimating the Average Treatment Effect

Now, our ultimate goal is to learn the average incremental spending that the marketing campaign has generated, aka the average treatment effect. To do that, we also need to estimate the population mean spending not given a coupon, E[Yi(0)], and compare it against E[Yi(1)]. Our estimand is now this:

τ = E[Yi(1) − Yi(0)]

Towards this, we first repeat the same argument for the non-treated customers and obtain an unbiased estimate of E[Yi(0)] as follows:

(1/n) * Σ{i=1..n} (1 − Ti)*Yi / (1 − p̂(Xi))

and finally combine the two into an estimate of the impact:

τ̂ = (1/n) * Σ{i=1..n} [ Ti*Yi / p̂(Xi) − (1 − Ti)*Yi / (1 − p̂(Xi)) ]

Let's now extend our earlier analysis to estimating the impact of the campaign. Suppose spending among non-treated customers is distributed as N(10, 2²) in both stores, so that the true effect of the campaign is 0.5*$10 + 0.5*$30 = $20.

def run_campaign4():
    true_mu1treated, true_mu2treated = 20, 40
    true_mu1control, true_mu2control = 10, 10
    # number of trials, probability of each trial, number of observations
    n, p, obs = 1, .5, 2000
    store = np.random.binomial(n, p, obs) + 1
    df = pd.DataFrame({'store': store})

    probtreat1 = .5
    probtreat2 = .9

    treat = lambda x: (
        int(np.random.binomial(1, probtreat1, 1)) if x == 1
        else int(np.random.binomial(1, probtreat2, 1))
    )

    # spending depends on both the store and the treatment status
    spend = lambda x: (
        float(np.random.normal(true_mu1treated, 3, 1)) if (x[0] == 1 and x[1] == 1)
        else float(np.random.normal(true_mu2treated, 3, 1)) if (x[0] == 2 and x[1] == 1)
        else float(np.random.normal(true_mu1control, 2, 1)) if (x[0] == 1 and x[1] == 0)
        else float(np.random.normal(true_mu2control, 2, 1))
    )

    df['treated'] = df['store'].apply(treat)
    df['spend'] = df[['store', 'treated']].apply(tuple, axis=1).apply(spend)

    prob1 = df.query('store==1').shape[0] / df.shape[0]
    prob2 = df.query('store==2').shape[0] / df.shape[0]

    # simple difference in means, ignoring the store
    simple_value_treated = np.mean(df.query('treated==1')['spend'])
    simple_value_control = np.mean(df.query('treated==0')['spend'])
    simple_tau = simple_value_treated - simple_value_control

    # weighted average estimator
    est_mu1treated = np.mean(df.query('treated==1 & store==1')['spend'])
    est_mu2treated = np.mean(df.query('treated==1 & store==2')['spend'])
    weighted_value_treated = prob1 * est_mu1treated + prob2 * est_mu2treated

    est_mu1control = np.mean(df.query('treated==0 & store==1')['spend'])
    est_mu2control = np.mean(df.query('treated==0 & store==2')['spend'])
    weighted_value_control = prob1 * est_mu1control + prob2 * est_mu2control
    weighted_tau = weighted_value_treated - weighted_value_control

    # estimate the propensity score in each store:
    ps1 = df.query('treated==1 & store==1').shape[0] / df.query('store==1').shape[0]
    ps2 = df.query('treated==1 & store==2').shape[0] / df.query('store==2').shape[0]
    df['ps'] = pd.Series(np.where(df['store'] == 1, ps1, ps2))

    ipw_value_treated = np.mean((df['spend'] * df['treated']) / df['ps'])
    ipw_value_control = np.mean((df['spend'] * (1 - df['treated'])) / (1 - df['ps']))
    ipw_tau = ipw_value_treated - ipw_value_control

    return [simple_tau, weighted_tau, ipw_tau]

sim=1000
values = Parallel(n_jobs=4)(delayed(run_campaign4)() for _ in tqdm(range(sim)))
results_df = pd.DataFrame(values, columns=['simple_tau','weighted_tau','ipw_tau'])

As shown below, both the weighted average and the IPW estimator are centered around the true effect of $20, while the distribution of the simple average, which does not control for store membership, is centered around $23, 15% larger than the true effect.

Figure 5 by author
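Where does that $23 come from? A rough expected-value calculation (a sketch, using the population shares implied by the setup, not code from the original post) reproduces it:

# Treated customers: 25% of the population from store 1, 45% from store 2
naive_treated = (0.25 * 20 + 0.45 * 40) / (0.25 + 0.45)  # ~32.86
# Control customers: 25% from store 1, 5% from store 2, both spending ~$10
naive_control = (0.25 * 10 + 0.05 * 10) / (0.25 + 0.05)  # 10.0
print(naive_treated - naive_control)                     # ~22.86 vs the true $20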

Conclusion

The IPW estimator has a long history in causal inference. The goal of this post was to develop an intuition for this well-known estimator through a simple example. Using a marketing case, we have seen that the hallmark of this method is correcting for unequal treatment assignment mechanisms. Moreover, we have shown that the method is an extension of the weighted average estimator.


Code

The code for this analysis can be found in my github repository.

Thanks for reading!

My goal is to record my own learning and share it with others who might find it useful. Please let me know if you find any errors or have any comments/suggestions.
