Regions and scoping

The package supports sub-national analysis through a geographic region system. Regions can scope simulations to states, constituencies, congressional districts, local authorities, and cities.

Region system

Region

A Region represents a geographic area with a unique prefixed code:

| Region type | Code format | Examples |
| --- | --- | --- |
| National | us, uk | us, uk |
| State | state/{code} | state/ca, state/ny |
| Congressional district | congressional_district/{ST-DD} | congressional_district/CA-01 |
| Place/city | place/{ST-FIPS} | place/NJ-57000 |
| UK country | country/{name} | country/england |
| Constituency | constituency/{name} | constituency/Sheffield Central |
| Local authority | local_authority/{code} | local_authority/E09000001 |
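
Every sub-national code in the table is a type prefix and an identifier separated by the first slash, so a code can be split mechanically. The sketch below is illustrative only: `parse_region_code` and `ParsedCode` are hypothetical names, not part of the package, and the sketch assumes that a code without a slash denotes a national region.

```python
from typing import NamedTuple


class ParsedCode(NamedTuple):
    region_type: str
    identifier: str


def parse_region_code(code: str) -> ParsedCode:
    """Split a prefixed region code into (region_type, identifier).

    Hypothetical helper for illustration; assumes codes without a
    slash (e.g. "us", "uk") denote national regions.
    """
    prefix, sep, rest = code.partition("/")
    if not sep:
        return ParsedCode("national", prefix)
    return ParsedCode(prefix, rest)


print(parse_region_code("state/ca"))
print(parse_region_code("constituency/Sheffield Central"))
print(parse_region_code("us"))
```

Splitting on the first slash only keeps identifiers that contain spaces or hyphens, such as `Sheffield Central` or `NJ-57000`, intact.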

RegionRegistry

Each model version has a RegionRegistry providing O(1) lookups:

from policyengine.tax_benefit_models.us import us_latest

registry = us_latest.region_registry

# Look up by code
california = registry.get("state/ca")
print(f"{california.label}: {california.region_type}")

# Get all regions of a type
states = registry.get_by_type("state")
print(f"{len(states)} states")

districts = registry.get_by_type("congressional_district")
print(f"{len(districts)} congressional districts")

# Get children of a region
ca_districts = registry.get_children("state/ca")

from policyengine.tax_benefit_models.uk import uk_latest

registry = uk_latest.region_registry

# UK countries
countries = registry.get_by_type("country")
for c in countries:
    print(f"{c.code}: {c.label}")
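
To make the constant-time lookup claim concrete, here is a minimal sketch of how a registry of this shape could be indexed with dictionaries. It is not the package's implementation: `ToyRegion`, `ToyRegistry`, and the `parent_code` field are assumptions made for the example, and only the method names mirror the calls shown above.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyRegion:
    code: str
    label: str
    region_type: str
    parent_code: Optional[str] = None  # assumed field, for illustration


class ToyRegistry:
    """Dict-backed registry: finding an entry is one hash-table access."""

    def __init__(self, regions):
        self._by_code = {}
        self._by_type = defaultdict(list)
        self._children = defaultdict(list)
        for region in regions:
            self._by_code[region.code] = region
            self._by_type[region.region_type].append(region)
            if region.parent_code is not None:
                self._children[region.parent_code].append(region)

    def get(self, code):
        return self._by_code.get(code)

    def get_by_type(self, region_type):
        # Return a copy so callers cannot mutate the index.
        return list(self._by_type.get(region_type, []))

    def get_children(self, code):
        return list(self._children.get(code, []))


toy_registry = ToyRegistry([
    ToyRegion("us", "United States", "national"),
    ToyRegion("state/ca", "California", "state", parent_code="us"),
    ToyRegion("congressional_district/CA-01", "California's 1st district",
              "congressional_district", parent_code="state/ca"),
])
print(toy_registry.get("state/ca").label)
print([r.code for r in toy_registry.get_children("state/ca")])
```

Building all three indexes once at construction time trades a little memory for lookups that do not scan the full region list.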

Region counts

US: 1 national + 51 states (inc. DC) + 436 congressional districts + 333 census places = 821 regions

UK: 1 national + 4 countries. Constituencies and local authorities are available via extended registry builders.

Scoping strategies

Scoping strategies control how a national dataset is narrowed to represent a sub-national region. They are applied during Simulation.run(), before the microsimulation calculation.

RowFilterStrategy

Keeps only the dataset rows where a household-level variable matches a specific value. Used for UK countries and US places/cities.

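from policyengine.core import Simulation
from policyengine.core.scoping_strategy import RowFilterStrategy
from policyengine.tax_benefit_models.us import us_latest

# Simulate only California households.
# `dataset` is assumed to be a national US dataset loaded earlier.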
simulation = Simulation(
    dataset=dataset,
    tax_benefit_model_version=us_latest,
    scoping_strategy=RowFilterStrategy(
        variable_name="state_code",
        variable_value="CA",
    ),
)
simulation.run()

This removes all non-California households from the dataset before running the simulation. The remaining households keep their original weights, so weighted totals still reflect California’s population.
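
The effect of row filtering can be illustrated without the package. The following is a plain-Python sketch on made-up data; `filter_rows` and the tuple layout are assumptions for the example, not the strategy's actual implementation.

```python
# Toy household table: (household_id, state_code, household_weight).
# All values are made up for illustration.
households = [
    (1, "CA", 1200.0),
    (2, "NY", 900.0),
    (3, "CA", 800.0),
    (4, "TX", 1500.0),
]


def filter_rows(rows, column_index, value):
    """Keep only rows whose column equals value; weights are untouched."""
    return [row for row in rows if row[column_index] == value]


california = filter_rows(households, column_index=1, value="CA")
represented = sum(weight for _, _, weight in california)
print(len(california), "households kept, representing", represented)
```

Only the row set changes here; each surviving household carries the same weight it had in the national table.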

# UK: simulate only England
simulation = Simulation(
    dataset=dataset,
    tax_benefit_model_version=uk_latest,
    scoping_strategy=RowFilterStrategy(
        variable_name="country",
        variable_value="ENGLAND",
    ),
)

WeightReplacementStrategy

Replaces household weights from a pre-computed weight matrix stored in Google Cloud Storage. Used for UK constituencies and local authorities, where the weight matrix (shape: N_regions x N_households) reweights all households to represent each region’s demographics.

from policyengine.core.scoping_strategy import WeightReplacementStrategy

simulation = Simulation(
    dataset=dataset,
    tax_benefit_model_version=uk_latest,
    scoping_strategy=WeightReplacementStrategy(
        weight_matrix_bucket="policyengine-uk-data",
        weight_matrix_key="parliamentary_constituency_weights.h5",
        lookup_csv_bucket="policyengine-uk-data",
        lookup_csv_key="constituencies_2024.csv",
        region_code="Sheffield Central",
    ),
)

Unlike row filtering, weight replacement keeps all households but assigns region-specific weights. This is more statistically robust for small geographic areas where filtering would leave too few households.
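
A small sketch may help show why weight replacement keeps every household. The data below are invented, and `weights_for` and `weighted_mean` are hypothetical helpers; the real strategy reads its matrix and lookup table from cloud storage rather than from in-memory lists.

```python
# Toy data: three households and a 2 x 3 weight matrix
# (rows = regions, columns = households). All values are made up.
household_income = [30_000.0, 50_000.0, 90_000.0]
region_names = ["Region A", "Region B"]
weight_matrix = [
    [10.0, 5.0, 1.0],  # Region A leans toward lower incomes
    [2.0, 6.0, 8.0],   # Region B leans toward higher incomes
]


def weights_for(region_name):
    """Look up the matrix row that belongs to one region."""
    return weight_matrix[region_names.index(region_name)]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


mean_a = weighted_mean(household_income, weights_for("Region A"))
mean_b = weighted_mean(household_income, weights_for("Region B"))
print(round(mean_a), round(mean_b))
```

Both regions use the same three households; only the weights differ, which is what lets a small area borrow information from the whole sample.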

Legacy filter fields

For backward compatibility, Simulation also accepts filter_field and filter_value parameters, which are auto-converted to a RowFilterStrategy:

# These two are equivalent:
simulation = Simulation(
    dataset=dataset,
    tax_benefit_model_version=us_latest,
    filter_field="state_code",
    filter_value="CA",
)

simulation = Simulation(
    dataset=dataset,
    tax_benefit_model_version=us_latest,
    scoping_strategy=RowFilterStrategy(
        variable_name="state_code",
        variable_value="CA",
    ),
)
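
One way such an auto-conversion could work is a post-construction hook that builds the strategy from the legacy fields. The sketch below uses stand-in classes (`ToySimulation`, `ToyRowFilter`) rather than the package's own, and it assumes that an explicitly supplied strategy takes precedence over the legacy fields; the source does not state how a conflict is resolved.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyRowFilter:
    variable_name: str
    variable_value: str


@dataclass
class ToySimulation:
    scoping_strategy: Optional[ToyRowFilter] = None
    filter_field: Optional[str] = None
    filter_value: Optional[str] = None

    def __post_init__(self):
        # Convert legacy fields only when no explicit strategy was given
        # (assumed precedence rule, for illustration).
        has_legacy = (
            self.filter_field is not None and self.filter_value is not None
        )
        if self.scoping_strategy is None and has_legacy:
            self.scoping_strategy = ToyRowFilter(
                variable_name=self.filter_field,
                variable_value=self.filter_value,
            )


legacy = ToySimulation(filter_field="state_code", filter_value="CA")
explicit = ToySimulation(scoping_strategy=ToyRowFilter("state_code", "CA"))
print(legacy.scoping_strategy == explicit.scoping_strategy)
```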

Geographic impact outputs

The package provides output types that compute per-region metrics across all regions simultaneously.

CongressionalDistrictImpact (US)

Groups households by congressional_district_geoid and computes weighted average and relative income changes per district.

from policyengine.outputs.congressional_district_impact import (
    compute_us_congressional_district_impacts,
)

baseline_sim.run()
reform_sim.run()

impact = compute_us_congressional_district_impacts(baseline_sim, reform_sim)

for d in impact.district_results:
    print(f"District {d['state_fips']:02d}-{d['district_number']:02d}: "
          f"avg change=${d['average_household_income_change']:+,.0f}, "
          f"relative={d['relative_household_income_change']:+.2%}")
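
The group-and-aggregate step behind per-district figures can be sketched in plain Python. The rows and district identifiers below are made up, and the definition of relative change used here (total weighted change divided by total weighted baseline income) is an assumption; the package may define it differently.

```python
from collections import defaultdict

# Toy rows: (district_id, weight, baseline_income, reform_income).
# Identifiers and values are made up for illustration.
rows = [
    (601, 100.0, 40_000.0, 41_000.0),
    (601, 300.0, 60_000.0, 60_000.0),
    (3601, 200.0, 50_000.0, 49_000.0),
]

# Accumulate weighted sums per district in a single pass.
totals = defaultdict(lambda: {"weight": 0.0, "baseline": 0.0, "reform": 0.0})
for district_id, weight, baseline, reform in rows:
    bucket = totals[district_id]
    bucket["weight"] += weight
    bucket["baseline"] += weight * baseline
    bucket["reform"] += weight * reform

toy_results = {}
for district_id, t in totals.items():
    change = t["reform"] - t["baseline"]
    toy_results[district_id] = {
        "average_change": change / t["weight"],
        # Assumed definition: weighted change over weighted baseline.
        "relative_change": change / t["baseline"],
    }

for district_id, r in sorted(toy_results.items()):
    print(district_id, f"{r['average_change']:+,.0f}",
          f"{r['relative_change']:+.2%}")
```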

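Result fields per district: each entry in impact.district_results includes state_fips, district_number, average_household_income_change, and relative_household_income_change.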

ConstituencyImpact (UK)

Uses pre-computed weight matrices (650 x N_households) to compute per-constituency income changes without filtering.

from policyengine.outputs.constituency_impact import (
    compute_uk_constituency_impacts,
)

impact = compute_uk_constituency_impacts(
    baseline_simulation=baseline_sim,
    reform_simulation=reform_sim,
    weight_matrix_path="parliamentary_constituency_weights.h5",
    constituency_csv_path="constituencies_2024.csv",
    year="2025",
)

for c in impact.constituency_results:
    print(f"{c['constituency_name']}: "
          f"avg change={c['average_household_income_change']:+,.0f}")
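
Computing every region from one weight matrix amounts to one weighted sum per matrix row. The sketch below uses invented data with 2 regions and 3 households in place of the real 650-row matrix, and its relative-change formula (weighted change over weighted baseline income) is an assumption rather than the package's documented definition.

```python
# Toy inputs: 2 regions x 3 households. All values are made up.
names = ["Region A", "Region B"]
weights = [
    [10.0, 5.0, 1.0],
    [2.0, 6.0, 8.0],
]
baseline_income = [30_000.0, 50_000.0, 90_000.0]
reform_income = [30_800.0, 50_000.0, 89_200.0]

# Per-household change is computed once and reused for every region.
change = [r - b for b, r in zip(baseline_income, reform_income)]

per_region = []
for name, row in zip(names, weights):
    total_weight = sum(row)
    total_change = sum(w * c for w, c in zip(row, change))
    total_baseline = sum(w * b for w, b in zip(row, baseline_income))
    per_region.append({
        "name": name,
        "average_change": total_change / total_weight,
        # Assumed definition: weighted change over weighted baseline.
        "relative_change": total_change / total_baseline,
    })

for r in per_region:
    print(r["name"], f"{r['average_change']:+,.0f}",
          f"{r['relative_change']:+.2%}")
```

Because the simulations run once on the full dataset, adding more regions only adds more rows to the loop, not more simulation runs.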

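Result fields per constituency: each entry in impact.constituency_results includes constituency_name and average_household_income_change.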

LocalAuthorityImpact (UK)

Works identically to ConstituencyImpact but for local authorities (360 x N_households weight matrix).

from policyengine.outputs.local_authority_impact import (
    compute_uk_local_authority_impacts,
)

impact = compute_uk_local_authority_impacts(
    baseline_simulation=baseline_sim,
    reform_simulation=reform_sim,
    weight_matrix_path="local_authority_weights.h5",
    local_authority_csv_path="local_authorities_2024.csv",
    year="2025",
)

Using regions with economic_impact_analysis()

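Scoping strategies work with the full analysis pipeline: pass the same strategy to both the baseline and the reform simulation.

from policyengine.core import Simulation
from policyengine.core.scoping_strategy import RowFilterStrategy

# State-level analysis.
# `dataset`, `us_latest`, and `reform` are assumed to be defined already.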
baseline_sim = Simulation(
    dataset=dataset,
    tax_benefit_model_version=us_latest,
    scoping_strategy=RowFilterStrategy(
        variable_name="state_code",
        variable_value="CA",
    ),
)
reform_sim = Simulation(
    dataset=dataset,
    tax_benefit_model_version=us_latest,
    policy=reform,
    scoping_strategy=RowFilterStrategy(
        variable_name="state_code",
        variable_value="CA",
    ),
)

# Full analysis scoped to California
analysis = economic_impact_analysis(baseline_sim, reform_sim)