BayesianBLP
- class pymc_marketing.customer_choice.bayesian_blp.BayesianBLP(market_data, *, characteristics, product_col='product', market_col='market', region_col=None, share_col='share', market_size_col='n', price_col='price', instruments=None, outside_good='outside', time_col=None, n_mc_draws=None, random_coef_on=None, product_fixed_effects=True, likelihood='normal_logshare', min_share=0.0001, track_delta=False, hierarchical_parameterisation='centered', model_config=None, sampler_config=None, random_seed=None)
Bayesian random-coefficients logit on aggregate market-share panels.
- Parameters:
- market_data : pd.DataFrame
  Long-format panel. Each (region, market, product) cell is one row. Every market must contain exactly one row per inside product plus a single outside-good row whose `product_col` value matches `outside_good`. Outside-good rows should have `price`, characteristics, and instruments all set to `0`.
- product_col, market_col, region_col, share_col, market_size_col, price_col : str
  Column names. `region_col=None` (default) collapses the region hierarchy to a single bucket. `market_col` must uniquely identify a (region, period) cell.
- characteristics : list of str
  Columns holding product characteristics `x_jt`.
- instruments : list of str, optional
  Columns holding instruments `z_jt` for the price-endogeneity block. If `None`, no first-stage price equation is built and the price coefficient is not identified under endogeneity; a warning is raised.
- outside_good : str
  Row label of the outside good in `product_col`.
- time_col : str, optional
  Column holding the period (time) coordinate. When set, every `(region, period)` cell must appear exactly once and the panel must be rectangular (every region has every period). The `period` coordinate is then exposed on the InferenceData, and `counterfactual_shares()` / `elasticities()` accept `periods=` and `regions=` coord-label arguments. Default `None`: the model treats markets as unstructured and the graph is bit-identical to the pre-time-aware behaviour.
- n_mc_draws : int, optional
  Number of Owen-scrambled Halton draws used to integrate the share equation over consumer heterogeneity. Defaults to `max(200, 100 * n_random_coefs)` and warns when the chosen value looks too small for the integration dimension.
- random_coef_on : list of str, optional
  Names of dimensions that receive consumer-level random coefficients. Use the literal string `"price"` for the price coefficient and any characteristic name for that characteristic. Defaults to `["price"]`.
- product_fixed_effects : bool
  If `True` (default), the structural error decomposes as `ξ_jt = ξ_j + ξ̃_jt` with a product fixed effect. If `False`, per-product alternative-specific intercepts are used instead and `ξ_jt = ξ̃_jt`. Only one of the two is included, never both. `False` is not supported in this v1 release; pass `True`.
- likelihood : {"normal_logshare"}
  Aggregate-share likelihood. Currently only the Berry (1994) heteroskedastic Normal-on-log-share-ratio formulation is wired up.
- min_share : float
  Floor applied to observed shares to avoid `log(0)`. A warning is emitted when the floor is hit.
- track_delta : bool
  If `True`, store the mean-utility tensor `δ_jt` as a `pm.Deterministic` (memory-heavy on large panels). Default `False`.
- hierarchical_parameterisation : {"centered", "noncentered"}
  Parameterisation of the region-level hierarchy on `α_r` and `β_r`. Default `"centered"`. Use `"noncentered"` only when per-region data is sparse and the prior dominates the likelihood (e.g. many regions with very few markets each). For typical scanner panels (a handful of regions, each with informative per-region data) the centered form has a cleaner posterior geometry and avoids the Neal’s-funnel pathology that otherwise biases `τ_α` low and over-shrinks per-region coefficients.
- model_config, sampler_config : dict, optional
  Standard `ModelBuilder` overrides. The default sampler configuration targets `numpyro` at `target_accept=0.95` because the `ξ̃_jt` block is funnel-prone.
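The panel layout required for `market_data` can be sketched with plain Python records before handing them to pandas. This is a minimal illustration, not library code: the product labels, the characteristic `sugar`, the instrument `cost_shifter`, all numbers, and the helper `check_panel` are hypothetical, and the check reads "exactly one row per inside product" as meaning every product appears once in every market. The commented-out constructor call only uses arguments shown in the signature above.

```python
from collections import Counter

# Hypothetical two-market, two-product panel. "sugar" is an illustrative
# characteristic and "cost_shifter" an illustrative instrument.
records = []
for market in ["m1", "m2"]:
    records += [
        {"market": market, "product": "A", "share": 0.30, "n": 1000,
         "price": 2.5, "sugar": 10.0, "cost_shifter": 1.2},
        {"market": market, "product": "B", "share": 0.20, "n": 1000,
         "price": 3.0, "sugar": 4.0, "cost_shifter": 1.5},
        # Outside good: price, characteristics, and instruments are all 0.
        {"market": market, "product": "outside", "share": 0.50, "n": 1000,
         "price": 0.0, "sugar": 0.0, "cost_shifter": 0.0},
    ]


def check_panel(records, outside_good="outside",
                zero_cols=("price", "sugar", "cost_shifter")):
    """Return a list of layout problems (empty if none were found)."""
    problems = []
    cells = Counter((r["market"], r["product"]) for r in records)
    markets = sorted({r["market"] for r in records})
    products = sorted({r["product"] for r in records})
    if outside_good not in products:
        problems.append(f"no row labelled {outside_good!r} in the product column")
    # Every (market, product) cell should appear exactly once.
    for m in markets:
        for p in products:
            if cells[(m, p)] != 1:
                problems.append(f"market {m!r} has {cells[(m, p)]} rows for product {p!r}")
    # Outside-good rows should carry zeros in price/characteristics/instruments.
    for r in records:
        if r["product"] == outside_good and any(r[c] != 0 for c in zero_cols):
            problems.append(f"outside-good row in market {r['market']!r} has nonzero columns")
    return problems


print(check_panel(records))

# The records can then be converted into the DataFrame the constructor expects:
# import pandas as pd
# market_data = pd.DataFrame(records)
# model = BayesianBLP(market_data, characteristics=["sugar"],
#                     instruments=["cost_shifter"])
```

Running the check before constructing the model surfaces layout problems, such as a missing outside-good row, in terms of the raw data.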
Notes
Identification. Endogeneity correction uses the conditional decomposition of the joint (η_jt, ξ̃_jt) Normal: the price equation p_jt = π_0j + π_z · z_jt + η_jt is fit as a marginal likelihood, η_jt is the price residual, and ξ̃_jt | η_jt is parameterised on the slope-residual coordinates γ = ρ · σ_ξ and ω = σ_ξ · sqrt(1 − ρ²) so that ξ̃_jt = (γ/σ_η) · η_jt + ω · ε_jt. The marginal scale σ_ξ and correlation ρ_price_xi are exposed as Deterministics for downstream summaries. This is mathematically equivalent to a joint MvNormal in (ρ, σ_ξ) coordinates, but the conditional likelihood depends on ρ × σ_ξ only through γ, so the slope-residual basis avoids the multiplicative ridge that pinned diagonal-mass NUTS at the depth cap.

Sampler geometry. The ξ̃_jt and random-coefficient raw blocks are non-centered. The region-level hierarchy on α_r / β_r is centered by default. This is counterintuitive but standard advice (Betancourt & Girolami 2015): centered is preferable when per-group data is informative, while non-centered helps in sparse-data regimes. The default sampler runs `numpyro` NUTS with `target_accept=0.95`; when residual correlations between variance components push tree depth toward the cap, prefer `nutpie` (low-rank modified mass matrix by default) or pass `nuts_sampler_kwargs={"nuts_kwargs": {"dense_mass": True}}` to `fit` for `numpyro`. Set `track_delta=True` only if you actually need the per-cell mean utility in the trace; on a typical 100-week × 10-SKU panel this is ~7 MB per chain.

Notation glossary. Variable names in the trace and posterior summaries map to the model symbols as follows. See the synthetic notebook for the full index conventions and a more detailed table.
| Code name | Math | Role |
|---|---|---|
| `alpha` / `alpha_r` | α, α_r | Price coefficient (population, per-region) |
| `beta` / `beta_r` | β, β_r | Characteristic utility weights |
| `alpha_pop`, `tau_alpha`, `beta_pop`, `tau_beta` | α_pop, τ_α, β_pop, τ_β | Cross-region hyperparameters (only when `region_col` is set) |
| `sigma_random` | σ_d | Consumer heterogeneity scale per random-coefficient dimension |
| `model._halton[:, d]` | ν_id | Consumer i’s standardised N(0,1) taste shock on dimension d, drawn from the Halton grid (fixed data, not sampled) |
| `mu_dev` (internal) | μ_ijm | Consumer-level utility deviation Σ_d σ_d · ν_id · c_jmd; not exposed as a posterior variable |
| `xi` / `xi_j` / `xi_tilde` | ξ_jm, ξ_j, ξ̃_jm | Product-market quality shock decomposed as product fixed effect + centered residual |
| `sigma_xi`, `sigma_xi_j` | σ_ξ, σ_{ξ_j} | Marginal scales of ξ_jm and ξ_j |
| `eta` / `sigma_eta` | η_jm, σ_η | First-stage price residual and its scale |
| `pi_0` / `pi_z` | π_0, π_z | First-stage intercepts / instrument coefficients |
| `rho_price_xi` | ρ | Endogeneity correlation between ξ and η |
| `gamma_xi_eta`, `omega_xi` | γ, ω | Slope-residual coordinates the sampler uses; (ρ, σ_ξ) are derived Deterministics |
| `delta` | δ_jm | Mean utility (only if `track_delta=True`) |
| `s_inside` / `s_outside` | ŝ_jm, ŝ_0m | Halton-averaged predicted shares |
| `log_share_ratio` | log s_jm − log s_0m | Likelihood’s observed quantity |
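The relationship between the slope-residual coordinates (γ, ω) and the marginal quantities (ρ, σ_ξ) in the Identification note can be checked with a short round trip. This is a sketch of the algebra only, assuming ε is standard Normal and independent of η; the function names are hypothetical and are not part of the library.

```python
import math


def to_slope_residual(rho, sigma_xi):
    """Map (rho, sigma_xi) to the slope-residual coordinates (gamma, omega)."""
    gamma = rho * sigma_xi
    omega = sigma_xi * math.sqrt(1.0 - rho**2)
    return gamma, omega


def to_marginal(gamma, omega):
    """Recover (rho, sigma_xi) from (gamma, omega)."""
    # Var(xi_tilde) = (gamma / sigma_eta)^2 * sigma_eta^2 + omega^2
    #               = gamma^2 + omega^2
    sigma_xi = math.hypot(gamma, omega)
    # Cov(xi_tilde, eta) = gamma * sigma_eta, so Corr(xi_tilde, eta) = gamma / sigma_xi
    rho = gamma / sigma_xi
    return rho, sigma_xi


gamma, omega = to_slope_residual(rho=0.6, sigma_xi=0.5)
print(gamma, omega)
print(to_marginal(gamma, omega))
```

Note that σ_η cancels in both the variance and the correlation, so the mapping between the two coordinate systems does not depend on the scale of the price residual.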
Methods

| Method | Description |
|---|---|
| `BayesianBLP.__init__(market_data, *, ...[, ...])` | Initialize model configuration and sampler configuration for the model. |
| | Convert the model configuration and sampler configuration from the attributes to keyword arguments. |
| `BayesianBLP.batch_shares(alpha_M, beta_M, ...)` | Numpy-evaluate the share equation for a batch of posterior samples. |
| `BayesianBLP.build_from_idata(idata)` | Not implemented for v1. |
| `BayesianBLP.build_model(**kwargs)` | Construct the PyMC model and attach it to `self.model`. |
| | Posterior shares under a counterfactual price intervention. |
| | Serialise scalar constructor arguments onto `InferenceData.attrs`. |
| `BayesianBLP.elasticities(*[, at, periods, ...])` | Posterior price elasticities `ε[market, share, price]`. |
| `BayesianBLP.fit([progressbar, random_seed])` | Fit by sampling the joint posterior with NUTS. |
| `BayesianBLP.graphviz(**kwargs)` | Get the graphviz representation of the model. |
| | Create the model configuration and sampler configuration from the InferenceData to keyword arguments. |
| `BayesianBLP.iterate_posterior_samples(n_samples)` | Stack `chain × draw` and (optionally) subsample posterior arrays. |
| `BayesianBLP.load(fname[, check])` | Not implemented for v1. |
| `BayesianBLP.load_from_idata(idata[, check])` | Create a ModelBuilder instance from an InferenceData object. |
| | Draw from the prior predictive distribution. |
| `BayesianBLP.save(fname, **kwargs)` | Persist the fitted InferenceData (model graph is not saved). |
| `BayesianBLP.set_idata_attrs([idata])` | Set attributes on an InferenceData object. |
| `BayesianBLP.table(**model_table_kwargs)` | Get the summary table of the model. |
| | Reshape the posterior `xi` to `(region, period, inside_product)`. |

Attributes

| Attribute | Description |
|---|---|
| `default_model_config` | Default priors for every univariate / vector parameter in the model. |
| `default_sampler_config` | Default sampler kwargs: `numpyro` NUTS at `target_accept=0.95`. |
| `fit_result` | Get the posterior fit_result. |
| `id` | Generate a unique hash value for the model. |
| `output_var` | Name of the observed variable (the log-share-ratio likelihood). |
| `posterior` | Access the 'posterior' attribute of the InferenceData object. |
| `posterior_predictive` | Access the 'posterior_predictive' attribute of the InferenceData object. |
| `predictions` | Access the 'predictions' attribute of the InferenceData object. |
| `prior` | Access the 'prior' attribute of the InferenceData object. |
| `prior_predictive` | Access the 'prior_predictive' attribute of the InferenceData object. |
| `version` | |
| `idata` | |
| `sampler_config` | |
| `model_config` | |
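The Halton-averaged shares listed in the glossary (`s_inside`, `s_outside`) and the `log_share_ratio` quantity can be illustrated with a small standard-library sketch. It is not the library's implementation: it assumes the textbook random-coefficients logit share formula with the outside utility normalised to zero, a single random coefficient, and an unscrambled base-2 Halton (van der Corput) sequence in place of the Owen-scrambled draws the class documents. The function names and input numbers are hypothetical.

```python
import math
from statistics import NormalDist


def van_der_corput(i, base=2):
    """Radical inverse of the integer i >= 1 in the given base (a 1-D Halton point)."""
    x, f = 0.0, 1.0 / base
    while i > 0:
        i, digit = divmod(i, base)
        x += digit * f
        f /= base
    return x


def simulated_shares(delta, c, sigma, n_draws=200):
    """Average logit shares over standard-normal taste shocks on one dimension.

    delta: mean utilities of the inside products in one market
    c:     the column (e.g. price) that carries the random coefficient
    sigma: heterogeneity scale on that dimension
    """
    inv_cdf = NormalDist().inv_cdf
    s_inside = [0.0] * len(delta)
    s_outside = 0.0
    for i in range(1, n_draws + 1):
        nu = inv_cdf(van_der_corput(i))  # unscrambled draw, mapped to N(0, 1)
        u = [d + sigma * nu * cj for d, cj in zip(delta, c)]
        denom = 1.0 + sum(math.exp(v) for v in u)  # outside utility fixed at 0
        for j, v in enumerate(u):
            s_inside[j] += math.exp(v) / denom / n_draws
        s_outside += 1.0 / denom / n_draws
    return s_inside, s_outside


delta = [1.0, 0.5]
price = [2.5, 3.0]
s_in, s_out = simulated_shares(delta, price, sigma=0.3)
log_share_ratio = [math.log(s) - math.log(s_out) for s in s_in]
print(s_in, s_out, log_share_ratio)
```

With `sigma=0` the average collapses to the plain logit, and the log-share ratio equals `delta` exactly, which is a convenient sanity check on any share routine of this form.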