The package supports sub-national analysis through a geographic region system. Regions can scope simulations to states, constituencies, congressional districts, local authorities, and cities.
Region system¶
Region¶
A Region represents a geographic area with a unique prefixed code:
| Region type | Code format | Examples |
|---|---|---|
| National | us, uk | us, uk |
| State | state/{code} | state/ca, state/ny |
| Congressional district | congressional_district/{ST-DD} | congressional_district/CA-01 |
| Place/city | place/{ST-FIPS} | place/NJ-57000 |
| UK country | country/{name} | country/england |
| Constituency | constituency/{name} | constituency/Sheffield Central |
| Local authority | local_authority/{code} | local_authority/E09000001 |
RegionRegistry¶
Each model version has a RegionRegistry providing O(1) lookups:
from policyengine.tax_benefit_models.us import us_latest
registry = us_latest.region_registry
# Look up by code
california = registry.get("state/ca")
print(f"{california.label}: {california.region_type}")
# Get all regions of a type
states = registry.get_by_type("state")
print(f"{len(states)} states")
districts = registry.get_by_type("congressional_district")
print(f"{len(districts)} congressional districts")
# Get children of a region
ca_districts = registry.get_children("state/ca")from policyengine.tax_benefit_models.uk import uk_latest
registry = uk_latest.region_registry
# UK countries
countries = registry.get_by_type("country")
for c in countries:
print(f"{c.code}: {c.label}")Region counts¶
US: 1 national + 51 states (inc. DC) + 436 congressional districts + 333 census places = 821 regions
UK: 1 national + 4 countries. Constituencies and local authorities are available via extended registry builders.
Scoping strategies¶
Scoping strategies control how a national dataset is narrowed to represent a sub-national region. They are applied during Simulation.run(), before the microsimulation calculation.
RowFilterStrategy¶
Filters dataset rows where a household-level variable matches a specific value. Used for UK countries and US places/cities.
from policyengine.core import Simulation
from policyengine.core.scoping_strategy import RowFilterStrategy
# Simulate only California households
simulation = Simulation(
dataset=dataset,
tax_benefit_model_version=us_latest,
scoping_strategy=RowFilterStrategy(
variable_name="state_code",
variable_value="CA",
),
)
simulation.run()This removes all non-California households from the dataset before running the simulation. The remaining household weights still reflect California’s population.
# UK: simulate only England
simulation = Simulation(
dataset=dataset,
tax_benefit_model_version=uk_latest,
scoping_strategy=RowFilterStrategy(
variable_name="country",
variable_value="ENGLAND",
),
)WeightReplacementStrategy¶
Replaces household weights from a pre-computed weight matrix stored in Google Cloud Storage. Used for UK constituencies and local authorities, where the weight matrix (shape: N_regions x N_households) reweights all households to represent each region’s demographics.
from policyengine.core.scoping_strategy import WeightReplacementStrategy
simulation = Simulation(
dataset=dataset,
tax_benefit_model_version=uk_latest,
scoping_strategy=WeightReplacementStrategy(
weight_matrix_bucket="policyengine-uk-data",
weight_matrix_key="parliamentary_constituency_weights.h5",
lookup_csv_bucket="policyengine-uk-data",
lookup_csv_key="constituencies_2024.csv",
region_code="Sheffield Central",
),
)Unlike row filtering, weight replacement keeps all households but assigns region-specific weights. This is more statistically robust for small geographic areas where filtering would leave too few households.
Legacy filter fields¶
For backward compatibility, Simulation also accepts filter_field and filter_value parameters, which are auto-converted to a RowFilterStrategy:
# These two are equivalent:
simulation = Simulation(
dataset=dataset,
tax_benefit_model_version=us_latest,
filter_field="state_code",
filter_value="CA",
)
simulation = Simulation(
dataset=dataset,
tax_benefit_model_version=us_latest,
scoping_strategy=RowFilterStrategy(
variable_name="state_code",
variable_value="CA",
),
)Geographic impact outputs¶
The package provides output types that compute per-region metrics across all regions simultaneously.
CongressionalDistrictImpact (US)¶
Groups households by congressional_district_geoid and computes weighted average and relative income changes per district.
from policyengine.outputs.congressional_district_impact import (
compute_us_congressional_district_impacts,
)
baseline_sim.run()
reform_sim.run()
impact = compute_us_congressional_district_impacts(baseline_sim, reform_sim)
for d in impact.district_results:
print(f"District {d['state_fips']:02d}-{d['district_number']:02d}: "
f"avg change=${d['average_household_income_change']:+,.0f}, "
f"relative={d['relative_household_income_change']:+.2%}")Result fields per district:
district_geoid: Integer SSDD (state FIPS * 100 + district number)state_fips: State FIPS codedistrict_number: District number within stateaverage_household_income_change: Weighted mean changerelative_household_income_change: Weighted relative changepopulation: Weighted household count
ConstituencyImpact (UK)¶
Uses pre-computed weight matrices (650 x N_households) to compute per-constituency income changes without filtering.
from policyengine.outputs.constituency_impact import (
compute_uk_constituency_impacts,
)
impact = compute_uk_constituency_impacts(
baseline_simulation=baseline_sim,
reform_simulation=reform_sim,
weight_matrix_path="parliamentary_constituency_weights.h5",
constituency_csv_path="constituencies_2024.csv",
year="2025",
)
for c in impact.constituency_results:
print(f"{c['constituency_name']}: "
f"avg change={c['average_household_income_change']:+,.0f}")Result fields per constituency:
constituency_code,constituency_name: Identifiersx,y: Hex map coordinatesaverage_household_income_change,relative_household_income_changepopulation: Weighted household count
LocalAuthorityImpact (UK)¶
Works identically to ConstituencyImpact but for local authorities (360 x N_households weight matrix).
from policyengine.outputs.local_authority_impact import (
compute_uk_local_authority_impacts,
)
impact = compute_uk_local_authority_impacts(
baseline_simulation=baseline_sim,
reform_simulation=reform_sim,
weight_matrix_path="local_authority_weights.h5",
local_authority_csv_path="local_authorities_2024.csv",
year="2025",
)Using regions with economic_impact_analysis()¶
Scoping strategies compose naturally with the full analysis pipeline:
from policyengine.core.scoping_strategy import RowFilterStrategy
# State-level analysis
baseline_sim = Simulation(
dataset=dataset,
tax_benefit_model_version=us_latest,
scoping_strategy=RowFilterStrategy(
variable_name="state_code",
variable_value="CA",
),
)
reform_sim = Simulation(
dataset=dataset,
tax_benefit_model_version=us_latest,
policy=reform,
scoping_strategy=RowFilterStrategy(
variable_name="state_code",
variable_value="CA",
),
)
# Full analysis scoped to California
analysis = economic_impact_analysis(baseline_sim, reform_sim)