Changing Functions of the Taxes and Transfers System#

This tutorial focuses on the policy functions of GETTSIM, one of the two objects returned by the function set_up_policy_environment. Alongside policy parameters, these functions help GETTSIM define a date-specific policy environment based on which it can compute taxes and transfers for individual and household data.

Just like parameters, policy functions can be replaced, added or removed to make changes to the existing policy environment. This way, you can design a new tax or transfer for any specific group of people, e.g. invent a new tax for people that have income from renting an apartment, or change the conditions for receiving already existing transfers.

This tutorial showcases the policy functions using a concrete example. For a more comprehensive and abstract discussion of the feature, check out the how-to guide on Different Ways to Load Policy Functions.

[1]:
import copy

import numpy as np
import plotly.express as px
from gettsim import (
    compute_taxes_and_transfers,
    create_synthetic_data,
    set_up_policy_environment,
)

Changing and Replacing Existing Function(s)#

Example: Receiving Multiple Transfers#

In the German system, some transfers for low-income families cannot be received in combination. By default, GETTSIM always chooses the most favorable transfers and sets the others to zero. This assumption can misrepresent the behavior of households/families if they do not always choose the optimal transfers (from a monetary perspective). For example, a social stigma may be attached to certain transfers, or some people may simply not know about all available transfers.

To account for these frictions, we can turn off this aspect of GETTSIM so that we see all the transfers a family/household is entitled to, even if the transfers cannot be received in combination. This can be useful for further analysis. For example, you could make assumptions about which transfers households actually take up and implement them in GETTSIM.

Find the Function#

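Here we look for the function that implements the aspect we want to change. First, we load the policy environment for 2020; policy_functions contains the functions we can search through.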

[2]:
policy_params, policy_functions = set_up_policy_environment("2020")
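One way to look for a function by name is to filter the keys of policy_functions. The sketch below assumes that policy_functions behaves like a dictionary mapping function names to callables, which is how it is used later in this tutorial. The stand-in dictionary, the toy functions, the argument name wohngeld_vor_vorrang_m_hh and the helper find_functions are hypothetical and only keep the example self-contained.

```python
import inspect


def find_functions(functions, keyword):
    """Return the sorted names in ``functions`` that contain ``keyword``."""
    return sorted(name for name in functions if keyword in name)


# Hypothetical toy functions standing in for GETTSIM's policy functions.
def toy_alg2(arbeitsl_geld_2_vor_vorrang_m_hh):
    return arbeitsl_geld_2_vor_vorrang_m_hh


def toy_wohngeld(wohngeld_vor_vorrang_m_hh):
    return wohngeld_vor_vorrang_m_hh


# Hypothetical stand-in for the dictionary of policy functions.
stand_in_functions = {
    "arbeitsl_geld_2_m_hh": toy_alg2,
    "wohngeld_m_hh": toy_wohngeld,
}

matches = find_functions(stand_in_functions, "arbeitsl_geld_2")

# The argument names of a match show which inputs the function expects.
arguments = list(inspect.signature(stand_in_functions[matches[0]]).parameters)
print(matches, arguments)
```

With the real policy_functions object, the same helper could be called as find_functions(policy_functions, "arbeitsl_geld_2"), provided the dictionary assumption holds.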

Define Changes to the Function#

After you have found the function that you want to change, copy its source code from the website to your notebook and change it as you like:

[3]:
def arbeitsl_geld_2_m_hh(
    arbeitsl_geld_2_vor_vorrang_m_hh,
    # wohngeld_vorrang_hh,
    # kinderzuschl_vorrang_hh,
    # wohngeld_kinderzuschl_vorrang_hh,
    erwachsene_alle_rentner_hh,
):
    if (
        # wohngeld_vorrang_hh
        # | kinderzuschl_vorrang_hh
        # | wohngeld_kinderzuschl_vorrang_hh
        erwachsene_alle_rentner_hh
    ):
        out = 0.0
    else:
        out = arbeitsl_geld_2_vor_vorrang_m_hh

    return out

The lines in the cell above that start with “#” normally perform the priority check described above. With the hash, these lines become comments and no longer influence the code.
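To make the effect of commenting out the priority check concrete, the following sketch compares two plain-Python variants for a single made-up household. The variant with the check is reconstructed from the commented-out lines and is an assumption, not GETTSIM's actual source; it uses `or` instead of `|` because it works on scalar booleans. The function names alg2_with_check and alg2_without_check and the input values are hypothetical.

```python
def alg2_with_check(
    arbeitsl_geld_2_vor_vorrang_m_hh,
    wohngeld_vorrang_hh,
    kinderzuschl_vorrang_hh,
    wohngeld_kinderzuschl_vorrang_hh,
    erwachsene_alle_rentner_hh,
):
    """Assumed variant with the commented-out conditions switched back on."""
    if (
        wohngeld_vorrang_hh
        or kinderzuschl_vorrang_hh
        or wohngeld_kinderzuschl_vorrang_hh
        or erwachsene_alle_rentner_hh
    ):
        return 0.0
    return arbeitsl_geld_2_vor_vorrang_m_hh


def alg2_without_check(arbeitsl_geld_2_vor_vorrang_m_hh, erwachsene_alle_rentner_hh):
    """Variant without the priority check: only the pensioner condition remains."""
    if erwachsene_alle_rentner_hh:
        return 0.0
    return arbeitsl_geld_2_vor_vorrang_m_hh


# Made-up household: 800 € before the priority check, Wohngeld has priority,
# and not all adults are pensioners.
with_check = alg2_with_check(800.0, True, False, False, False)
without_check = alg2_without_check(800.0, False)
print(with_check, without_check)
```

With the check, the transfer is set to zero because Wohngeld takes priority; without it, the household keeps the full amount computed before the priority check.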

Make GETTSIM Incorporate your Changes#

There are different ways to make GETTSIM incorporate your edited function.

Alternative 1:#

One way is to copy the policy_functions and replace the “old” function with the function we defined before.

[4]:
policy_functions_no_check = copy.deepcopy(policy_functions)
policy_functions_no_check["arbeitsl_geld_2_m_hh"] = arbeitsl_geld_2_m_hh

Computations with the new policy_functions_no_check will now have the characteristic of showing the value of all available transfers without checking which ones cannot be received in combination and without choosing the most profitable combination.
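The reason for copying first is that the original policy_functions should stay available as the baseline. The sketch below illustrates this with a hypothetical stand-in dictionary and two made-up functions; it assumes, as above, that the container behaves like a dictionary.

```python
import copy


# Hypothetical baseline and reform functions.
def baseline_transfer(x):
    return x


def reformed_transfer(x):
    return 2 * x


# Hypothetical stand-in for policy_functions.
functions = {"some_transfer_m_hh": baseline_transfer}

# Copy first, then replace the entry in the copy only.
functions_reformed = copy.deepcopy(functions)
functions_reformed["some_transfer_m_hh"] = reformed_transfer

print(functions["some_transfer_m_hh"](100.0))
print(functions_reformed["some_transfer_m_hh"](100.0))
```

Assigning into the copy leaves the baseline dictionary untouched, so both scenarios can be computed side by side.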

Let's test if this works!

We import simulated data for households with two parents and two children. These households only vary in their income:

[5]:
# Create synthetic data for households that differ only in gross wage.
data = create_synthetic_data(
    n_adults=2,
    n_children=2,
    specs_heterogeneous={
        "bruttolohn_m": [[i, 0, 0, 0] for i in np.linspace(500, 5000, 250)]
    },
)

# Compute the sum of public and private pension income and add it to data.
sum_ges_rente_priv_rente_m = compute_taxes_and_transfers(
    data=data,
    params=policy_params,
    targets="sum_ges_rente_priv_rente_m",
    functions=policy_functions,
)

data["sum_ges_rente_priv_rente_m"] = sum_ges_rente_priv_rente_m[
    "sum_ges_rente_priv_rente_m"
]
data.head(5)
[5]:
   p_id  hh_id  tu_id  hh_typ             hat_kinder  alleinerz  weiblich  alter  kind   in_ausbildung  ...  m_pfleg_berücks_zeit  y_pflichtbeitr_ab_40  anwartschaftszeit  arbeitssuchend  m_durchg_alg1_bezug  sozialv_pflicht_5j  kind_unterh_anspr_m  kind_unterh_erhalt_m  steuerklasse  sum_ges_rente_priv_rente_m
0  0     0      0      couple_2_children  True        False      False     35     False  False          ...  1.0                   0.0                   False              False           0.0                  0.0                 0.0                  0.0                   0             0.0
1  1     0      0      couple_2_children  True        False      True      35     False  False          ...  1.0                   0.0                   False              False           0.0                  0.0                 0.0                  0.0                   0             0.0
2  2     0      0      couple_2_children  False       False      False     8      True   True           ...  1.0                   0.0                   False              False           0.0                  0.0                 0.0                  0.0                   0             0.0
3  3     0      0      couple_2_children  False       False      True      3      True   True           ...  1.0                   0.0                   False              False           0.0                  0.0                 0.0                  0.0                   0             0.0
4  4     1      1      couple_2_children  True        False      False     35     False  False          ...  1.0                   0.0                   False              False           0.0                  0.0                 0.0                  0.0                   0             0.0

5 rows × 67 columns
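The structure of the bruttolohn_m specification can be inspected without GETTSIM. The sketch below rebuilds the same nested list in plain Python (replacing np.linspace with an explicit step) and assumes, based on the output above, that each inner list describes one household with one entry per household member.

```python
n_households = 250
lowest_wage, highest_wage = 500, 5000

# Evenly spaced gross wages, equivalent to np.linspace(500, 5000, 250).
step = (highest_wage - lowest_wage) / (n_households - 1)
wages = [lowest_wage + k * step for k in range(n_households)]

# One inner list per household: the first adult earns the wage, the second
# adult and the two children earn nothing.
bruttolohn_m = [[wage, 0, 0, 0] for wage in wages]

print(len(bruttolohn_m), bruttolohn_m[0], bruttolohn_m[1])
```

Under this reading, the 250 households differ only in the gross wage of the first adult.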

For this data we can now compare the results of using GETTSIM with the policy_functions_no_check and the usual policy_functions.

We should expect to see positive values for wohngeld_m_hh, kinderzuschl_m_hh and arbeitsl_geld_2_m_hh at the same time if we do not check which combination of transfers is optimal (policy_functions_no_check).

On the other hand, if we use the default version of the policy_functions, wohngeld_m_hh and kinderzuschl_m_hh should be zero as long as arbeitsl_geld_2_m_hh is positive (and the other way around), as it is a characteristic of the German taxes and transfers system that Wohngeld and Kinderzuschlag cannot be received in combination with Arbeitslosengeld 2.

[6]:
targets = ["wohngeld_m_hh", "kinderzuschl_m_hh", "arbeitsl_geld_2_m_hh"]
[7]:
policies = {
    "Checked Favorability": policy_functions,
    "No Check of Favorability": policy_functions_no_check,
}
[8]:
# Loop through keys to plot both scenarios.
for k in policies:
    # Compute taxes and transfers.
    result = compute_taxes_and_transfers(
        data=data,
        functions=policies[k],
        params=policy_params,
        targets=targets,
        columns_overriding_functions=["sum_ges_rente_priv_rente_m"],
    )
    # Add earnings and index to result DataFrame.
    result["bruttolohn_m"] = data["bruttolohn_m"]
    result.index = data["hh_id"]
    # Create DataFrame that contains the maximum value of the target variables
    # in the household and the household gross income.
    result = (
        result.groupby("hh_id")[targets]
        .max()
        .join(result.groupby("hh_id")["bruttolohn_m"].sum())
    )
    # Plot the results.
    fig = px.line(
        data_frame=result,
        x="bruttolohn_m",
        y=targets,
        title=k,
    )
    fig.update_layout(
        xaxis_title="Monthly gross income in € (per household)",
        yaxis_title="€ per month",
    )
    fig.show()

At first glance, both figures look quite confusing because of the complexity of the German taxes and transfers system. But if we take a closer look, the figures confirm our expectations. If we let GETTSIM check for the most favorable combination of transfers, wohngeld_m_hh and kinderzuschl_m_hh are zero as long as arbeitsl_geld_2_m_hh is positive (i.e. the best option for the household) and the other way around.

If we do not let GETTSIM do this check, this does not hold any longer and all transfers can be positive at the same time (which is what we were trying to achieve).
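Besides reading the figures, the expectation can also be checked numerically. The sketch below counts households in which Arbeitslosengeld 2 is positive together with Wohngeld or Kinderzuschlag. The helper violates_exclusivity is hypothetical, and the rows are made-up numbers chosen for illustration, not GETTSIM output.

```python
def violates_exclusivity(row):
    """True if ALG 2 is positive together with Wohngeld or Kinderzuschlag."""
    return row["arbeitsl_geld_2_m_hh"] > 0 and (
        row["wohngeld_m_hh"] > 0 or row["kinderzuschl_m_hh"] > 0
    )


# Made-up household rows for the scenario with the favorability check.
checked = [
    {"wohngeld_m_hh": 0.0, "kinderzuschl_m_hh": 0.0, "arbeitsl_geld_2_m_hh": 900.0},
    {"wohngeld_m_hh": 300.0, "kinderzuschl_m_hh": 150.0, "arbeitsl_geld_2_m_hh": 0.0},
]

# Made-up household row for the scenario without the check.
unchecked = [
    {"wohngeld_m_hh": 300.0, "kinderzuschl_m_hh": 150.0, "arbeitsl_geld_2_m_hh": 900.0},
]

n_violations_checked = sum(violates_exclusivity(row) for row in checked)
n_violations_unchecked = sum(violates_exclusivity(row) for row in unchecked)
print(n_violations_checked, n_violations_unchecked)
```

If result is a pandas DataFrame, its rows could be passed to the same helper via result.to_dict("records").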

Alternative 2:#

Another way is to pass the changed function directly to the functions argument of compute_taxes_and_transfers. This works as follows:

[9]:
result_no_check_p = compute_taxes_and_transfers(
    data=data,
    params=policy_params,
    functions=[policy_functions, arbeitsl_geld_2_m_hh],
    targets=[
        "wohngeld_m_hh",
        "kinderzuschl_m_hh",
        "arbeitsl_geld_2_m_hh",
    ],
    columns_overriding_functions=["sum_ges_rente_priv_rente_m"],
)

Executing this cell will allow you to reproduce the same analysis we did above. We do not want to do it twice, so we skip it.

There are three important points:

  1. Note that arbeitsl_geld_2_m_hh has the same function name as a pre-defined function inside GETTSIM. Thus, the internal function will be replaced with this version.

  2. In general, if there are multiple functions with the same name, internal functions have the lowest precedence. After that, the elements in the list passed to the functions argument are evaluated element by element. The functions in the leftmost element have the lowest precedence and the functions in the rightmost element have the highest.

  3. If policy_functions were not necessary for this example, you could also pass the arbeitsl_geld_2_m_hh function directly to the functions argument.
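The precedence rules can be pictured as a left-to-right merge in which later entries overwrite earlier ones. The following sketch illustrates that ordering only; merge_functions and make_function are hypothetical helpers and do not reproduce GETTSIM's internal implementation.

```python
def merge_functions(*sources):
    """Merge dictionaries and single functions; later sources take precedence."""
    merged = {}
    for source in sources:
        if callable(source):
            # A single function is registered under its own name.
            merged[source.__name__] = source
        else:
            merged.update(source)
    return merged


def make_function(name, label):
    """Create a toy function with the given name that returns ``label``."""

    def function():
        return label

    function.__name__ = name
    return function


# Hypothetical collections: internal functions come first (lowest precedence).
internal = {
    "transfer_a": make_function("transfer_a", "internal"),
    "transfer_b": make_function("transfer_b", "internal"),
}
user_dict = {"transfer_a": make_function("transfer_a", "user dict")}
user_function = make_function("transfer_a", "user function")

merged = merge_functions(internal, user_dict, user_function)
print(merged["transfer_a"](), merged["transfer_b"]())
```

Here the rightmost source wins for transfer_a, while transfer_b falls back to the internal version because no later source redefines it.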

Multiple Functions#

You can use exactly the same approach if you want to change more than one function of GETTSIM. But first, for our example, we need to invent some changes to another function of GETTSIM. Imagine we want to double the amount of Kindergeld every household receives, in addition to the previously implemented function change.

[10]:
def kinderzuschl_m_hh(
    kindergeld_anspruch, kumulativer_kindergeld_anspruch_tu, kindergeld_params
):
    """Calculate the preliminary kindergeld.

    Parameters
    ----------
    kindergeld_anspruch
        See :func:`kindergeld_anspruch`.
    kumulativer_kindergeld_anspruch_tu
        See :func:`kumulativer_kindergeld_anspruch_tu`.
    kindergeld_params
        See params documentation :ref:`kindergeld_params <kindergeld_params>`.

    Returns
    -------
    float
        Twice the regular Kindergeld amount, or 0.0 without entitlement.

    """
    # Make sure that only eligible children get assigned kindergeld
    if not kindergeld_anspruch:
        out = 0.0
    else:
        # kumulativer_kindergeld_anspruch_tu is the cumulative count of eligible
        # children; cap it at the highest key defined in the parameters.
        kumulativer_anspruch_wins = min(
            kumulativer_kindergeld_anspruch_tu, max(kindergeld_params["kindergeld"])
        )
        out = kindergeld_params["kindergeld"][kumulativer_anspruch_wins]
    return out * 2

If you edit arbeitsl_geld_2_m_hh and kinderzuschl_m_hh, your two options to make GETTSIM incorporate your changes would be:

Alternative 1:

[11]:
policy_functions_reformed = copy.deepcopy(policy_functions)
policy_functions_reformed["arbeitsl_geld_2_m_hh"] = arbeitsl_geld_2_m_hh
policy_functions_reformed["kinderzuschl_m_hh"] = kinderzuschl_m_hh

Alternative 2:

[12]:
df = compute_taxes_and_transfers(
    data=data,
    params=policy_params,
    functions=[policy_functions, arbeitsl_geld_2_m_hh, kinderzuschl_m_hh],
    targets=[
        "wohngeld_m_hh",
        "kinderzuschl_m_hh",
        "arbeitsl_geld_2_m_hh",
    ],
    columns_overriding_functions=["sum_ges_rente_priv_rente_m"],
)

Adding a New Function#

Instead of replacing existing functions, we can similarly define completely new functions and add them to the policy environment.
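As a minimal sketch of what such a new function could look like, the example below defines a hypothetical target transfers_gesamt_m_hh that adds up the three transfers used in this tutorial. The function name is invented for illustration. The sketch assumes that, as in the functions above, argument names refer to other columns or functions, and that the container of policy functions behaves like a dictionary.

```python
import inspect


def transfers_gesamt_m_hh(wohngeld_m_hh, kinderzuschl_m_hh, arbeitsl_geld_2_m_hh):
    """Hypothetical new target: sum of the three household transfers."""
    return wohngeld_m_hh + kinderzuschl_m_hh + arbeitsl_geld_2_m_hh


# Hypothetical stand-in for a copy of policy_functions.
stand_in_functions = {}
stand_in_functions["transfers_gesamt_m_hh"] = transfers_gesamt_m_hh

# The argument names document which inputs the new function needs.
required_inputs = list(inspect.signature(transfers_gesamt_m_hh).parameters)

# Made-up amounts for a single household.
total = transfers_gesamt_m_hh(300.0, 150.0, 0.0)
print(required_inputs, total)
```

Because the name does not clash with an existing function, nothing is replaced; the new name would simply be requested as an additional target.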

Aggregation Functions#

Functions which aggregate a column on the tax unit or household level are treated differently in GETTSIM.

If we would like to add (or replace) such functions, we need to specify them in a dictionary which we provide to compute_taxes_and_transfers via the aggregation_specs argument. An example dictionary is as follows:

aggregation_specs = {
    "anz_erwachsene_tu": {"source_col": "erwachsen", "aggr": "sum"},
    "haushaltsgröße_hh": {"aggr": "count"},
}

See GEP 4 for more information on aggregation functions.
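To illustrate what the two entries in the example dictionary describe, the sketch below mimics a "sum" and a "count" aggregation in plain Python. It assumes that the grouping level follows from the name suffix (_tu or _hh) and that the aggregated value is written back to every row of the group; GEP 4 is the authoritative description. The helper aggregate and the data are hypothetical.

```python
from collections import defaultdict


def aggregate(group_ids, aggr, source_col=None):
    """Aggregate within groups and return one value per original row."""
    if aggr not in ("sum", "count"):
        raise ValueError(f"Unsupported aggregation: {aggr}")
    totals = defaultdict(int)
    for position, group in enumerate(group_ids):
        # "count" counts rows; "sum" adds up the source column (True counts as 1).
        totals[group] += 1 if aggr == "count" else source_col[position]
    return [totals[group] for group in group_ids]


# Made-up data: one household of four and one household of two.
tu_id = [0, 0, 0, 0, 1, 1]
hh_id = [0, 0, 0, 0, 1, 1]
erwachsen = [True, True, False, False, True, False]

# Mirrors {"source_col": "erwachsen", "aggr": "sum"}.
anz_erwachsene_tu = aggregate(tu_id, "sum", erwachsen)

# Mirrors {"aggr": "count"}, which needs no source column.
household_size = aggregate(hh_id, "count")

print(anz_erwachsene_tu, household_size)
```

The "count" entry needs no source column because it only counts the rows that belong to each group.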
