```{include} ../README.md
:start-after: # py-statmatch
:end-before: ## Installation
```
```{toctree}
:hidden:
:maxdepth: 2

getting-started
api-reference
examples/index
methodology
contributing
changelog
```
Statistical matching, also known as data fusion or synthetic data matching, is a technique used to integrate information from different data sources that share some common variables but have no or few units in common. This is particularly useful when:
```{include} ../README.md
:start-after: ## Features
:end-before: ## Installation
```
## Quick Example
Here's a simple example of using py-statmatch for nearest neighbor matching:
```python
import pandas as pd
from statmatch import nnd_hotdeck

# Donor data has variables X and Y
donor_data = pd.DataFrame({
    'age': [25, 30, 35, 40],
    'income': [30000, 45000, 55000, 65000],
    'satisfaction': [7, 8, 6, 9]  # This will be donated
})

# Recipient data has only X variables
recipient_data = pd.DataFrame({
    'age': [28, 33, 42],
    'income': [35000, 50000, 70000]
})

# Perform matching
result = nnd_hotdeck(
    data_rec=recipient_data,
    data_don=donor_data,
    match_vars=['age', 'income']
)

# Create fused dataset
fused = recipient_data.copy()
fused['satisfaction'] = donor_data.iloc[result['noad.index']]['satisfaction'].values
```
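
To make the idea behind the quick example concrete, here is a minimal pure-Python sketch of nearest neighbor donation on the same toy data. It is an illustration only and does not reflect how py-statmatch is implemented: the helper `nearest_donor_index` is hypothetical, and the use of unscaled squared Euclidean distance is an assumption made for simplicity.

```python
# Same toy data as the quick example, as plain Python dictionaries.
donors = [
    {"age": 25, "income": 30000, "satisfaction": 7},
    {"age": 30, "income": 45000, "satisfaction": 8},
    {"age": 35, "income": 55000, "satisfaction": 6},
    {"age": 40, "income": 65000, "satisfaction": 9},
]
recipients = [
    {"age": 28, "income": 35000},
    {"age": 33, "income": 50000},
    {"age": 42, "income": 70000},
]
match_vars = ["age", "income"]


def nearest_donor_index(recipient, donors, match_vars):
    """Return the position of the donor closest to `recipient`.

    Hypothetical helper. Assumption: unscaled squared Euclidean distance
    over the matching variables (same ordering as Euclidean distance).
    """
    def squared_distance(donor):
        return sum((recipient[v] - donor[v]) ** 2 for v in match_vars)

    return min(range(len(donors)), key=lambda i: squared_distance(donors[i]))


# Copy each recipient and attach the value donated by its nearest donor.
fused = []
for rec in recipients:
    i = nearest_donor_index(rec, donors, match_vars)
    fused.append({**rec, "satisfaction": donors[i]["satisfaction"]})

for row in fused:
    print(row)
```

In this sketch the recipients receive satisfaction values 7, 6, and 9. Because income is on a much larger scale than age, it dominates the unscaled distance, so the choice of distance function and variable scaling matters when matching real data.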
- {doc}`getting-started` guide for installation and basic usage
- {doc}`api-reference` for detailed function documentation
- {doc}`examples/index` for more complex use cases
- {doc}`methodology` behind statistical matching