Low overhead error detection using spacetime codes

In this tutorial, we will implement a workflow to sample from a Clifford circuit and use the qiskit-paulice package to boost the fidelity of the measured distribution by applying spacetime Pauli checks throughout the circuit. Spacetime Pauli checks are similar to coherent Pauli checks (CPC) in that a Clifford "payload" circuit is entangled to some ancillary qubits in order to "check" certain invariants in the circuit. Measurement of the ancillas produces a syndrome that is used to determine whether a logical error was detected during execution of the circuit. Postselecting only samples where no error was detected can improve the fidelity of sampled distributions at the expense of decreasing postselection rates. The key difference between CPC and spacetime checks is that measuring CPC syndromes requires measuring time localized, high weight operators. On qubit topologies with limited connectivity, such as heavy hex, implementing these checks requires many SWAP gates, often making the circuit too deep to run in practice. Implementing Pauli checks as spacetime codes allows them to be distributed throughout the payload circuit, allowing for a hardware efficient qubit encoding that remains effective at detecting logical errors during circuit execution. The primary role of the qiskit-paulice package is to find and insert valid spacetime Pauli checks into locations in the circuit that both maximize error detection effectiveness and minimize qubit overhead.

This tutorial implements a workflow for boosting the fidelity of the distribution sampled from a quantum circuit using spacetime checks. Since we will use a Clifford circuit, we can estimate the fidelity straightforwardly from the averaged expectation values of the stabilizers of the circuit. We will follow these steps:

Create a circuit: Shallow 1D brickwork random Clifford circuit used to visualize the checks.
Identify qubits as targets for checks: Map the circuit to physical qubits on the ibm_boston Heron QPU and identify qubits with neighboring ancillas.
Transpile the circuit to the backend: Select a qubit layout and translate the circuit gates into the native gate set of ibm_boston.
Create a noise model from backend data: Generate a rough noise model based on backend benchmark data. This noise data will be used to score the effectiveness of each additional check.
Visualize Pauli checks: Add some Pauli checks to the shallow circuit, so we can visualize how the checks are added.
Determine stabilizers to measure: Increase the depth of the circuit to increase the impact of gate noise. Randomly sample a subset of stabilizers of the deeper circuit.
Find good checks for each measured stabilizer: Each Pauli check must commute with the measured observable, so we find checks with respect to each stabilizer.
Sample from the noisy circuit: Perform noisy samplings of the original (un-checked) circuit with respect to each stabilizer. Perform an additional noisy sampling of the state including the Pauli checks.
Postselect checked circuit samples with no detected errors: Remove samples which fail at least one check. Calculate expectation values for the bare noisy circuit and the circuit containing checks.
Compare fidelities of original and postselected noisy states: Show that postselecting only samples with no detected errors improves the estimated fidelity of the noisy state.

1. Create the circuit

First, we will create a shallow Clifford circuit with its entangling gates laid out in a brickwork connectivity. We will use this small circuit to visualize how checks are added to the circuit.

import numpy as np
from qiskit import QuantumCircuit


def random_clifford_circuit(
    num_qubits: int, depth: int, rng: np.random.Generator
) -> QuantumCircuit:
    """Brickwork random Clifford on `num_qubits`, with `depth` CZ layers."""
    qc = QuantumCircuit(num_qubits)
    qc.h(range(num_qubits))
    for d in range(depth):
        for i in range(d % 2, num_qubits - 1, 2):
            qc.cz(i, i + 1)
        for q in range(num_qubits):
            if rng.integers(0, 2):
                qc.sx(q)
            if rng.integers(0, 2):
                qc.s(q)
            if rng.integers(0, 2):
                qc.sx(q)
    return qc


num_qubits = 12
depth = 4
seed = 1764
rng = np.random.default_rng(seed)
np.random.seed(seed)
circuit = random_clifford_circuit(num_qubits, depth, rng)
circuit.measure_all()
circuit.draw("mpl", fold=-1, scale=0.6)

Output:

2. Identify qubits as targets for checks

Next we choose a 1D qubit layout on the Heron r3 QPU, ibm_boston. We choose a layout that zig-zags diagonally through the coupling graph and maximizes the number of available ancilla qubits. To find available ancillas, we use the get_check_qubits function, which returns lists of target/ancilla qubit pairs. A Pauli check placed on target_qubits[i] will be entangled with the ancilla, ancilla_qubits[i].

In the coupling graph below, the green qubits are the payload qubits, and the orange qubits are the ancillas we will use to implement the checks. Qubits 68, 69, 89, 91, 111, 113, and 133 have adjacent ancillas, so we will use those qubits (corresponding to virtual qubits 0, 1, 3, 5, 7, 9, and 11) as target qubits for implementing checks. Note that qubit 134 is adjacent to qubit 133 in the payload but not highlighted as an ancilla. This is because 133 was already (arbitrarily) chosen to implement a check with ancilla 132.

from qiskit.visualization import plot_coupling_map
from qiskit_ibm_runtime import QiskitRuntimeService
from qiskit_paulice.layout import get_check_qubits

service = QiskitRuntimeService()
backend = service.backend("ibm_boston")

# Choose a layout and pair each available target qubit with a neighboring ancilla
layout = [68, 69, 78, 89, 90, 91, 98, 111, 112, 113, 119, 133]
target_qubits, ancilla_qubits = get_check_qubits(backend, layout)

print(f"Target qubits:  {target_qubits}")
print(f"Ancilla qubits: {ancilla_qubits}")
plot_coupling_map(
    num_qubits=backend.num_qubits,
    qubit_coordinates=getattr(
        backend.configuration(), "qubit_coordinates", None
    ),
    coupling_map=backend.configuration().coupling_map,
    figsize=(12, 12),
    qubit_color=[
        "#4CAF50"
        if i in set(layout)
        else "#FF9800"
        if i in set(ancilla_qubits)
        else "#DDDDDD"
        for i in backend.coupling_map.graph.node_indices()
    ],
    qubit_size=220,
    line_width=2,
    font_size=90,
)

Output:

Target qubits:  [68, 69, 89, 91, 111, 113, 133]
Ancilla qubits: [67, 70, 88, 92, 110, 114, 132]

3. Transpile the circuit to the backend

Since the backend has already been chosen, we can go ahead and transpile our virtual circuit into an ISA circuit. Here, we just set the qubit layout and translate the gates into the native gate set of ibm_boston.

from qiskit.transpiler import generate_preset_pass_manager

pm = generate_preset_pass_manager(
    optimization_level=0, backend=backend, initial_layout=layout
)
circuit_isa = pm.run(circuit)
circuit_isa.draw("mpl", fold=-1, scale=0.6)

Output:

4. Create a noise model from backend data

Now that we know how the circuit will be laid out on the QPU, we can model how the gate and readout noise on the backend will affect circuit execution. The noise model is used to determine optimal places in the circuit to place Pauli checks. We want to place checks in positions where they capture the maximum amount of error. While a more accurate noise model can improve the detection capabilities of the checks, it is usually not necessary to learn a noise model by sampling the QPU. Here we infer uniform depolarizing noise models for gate and readout noise using only backend benchmark data from qiskit-ibm-runtime. See the qiskit_paulice.NoiseModel documentation for more information on specifying gate and readout noise. For users who wish to use a learned Pauli-Lindblad noise model, see NoiseModel.from_pauli_lindblad_maps.

Additionally, we transpile the virtual circuit to the backend's qubit topology and native gate set.

from qiskit_paulice.noise_models import NoiseModel

# Model the gate and readout noise as a uniform depolarizing channel based on backend benchmark data
noise_model = NoiseModel.from_backend(
    backend, layout, uniform_gate_noise=True
)
print(noise_model)

Output:

NoiseModel(gate_noise=0.001072968081103818, readout_noise=0.004313151041666667, idling_noise=None)

5. Visualize Pauli checks

Now let's add some checks to the circuit and see how they look. We can use the qiskit_paulice.add_pauli_checks function to find a good set of valid checks. We provide the Clifford payload circuit as input, along with a list of payload qubits on which to add checks (target qubits) and the noise model we made in the previous step. In this example, we will add checks to all 7 target qubits with available ancillas. Checks are added to qubits in the order they appear in target_qubits, which makes it easy to reason about the final layout of the checked circuit: final_layout = layout + ancilla_qubits. If a user decides they would prefer to run one of the output circuits with fewer (i) checks, the final layout would be final_layout = layout + ancilla_qubits[:i+1].

The circuit visualization shows that the checks were correctly implemented using the specified target/ancilla pairs.

Details on qiskit_paulice.add_pauli_checks:

The first input should be a Clifford circuit with at least one terminal measurement.
The second positional argument is a list of target qubit indices. Each target qubit will share entangling gates with an ancilla qubit to form a check. Checks will be found and added to the circuit in the order they are given in this argument.
The final positional argument is a noise model representing the noise expected to affect the circuit during execution. This model is used to evaluate the effectiveness of a given check.
ancilla_qubits: If the input circuit has a layout, this kwarg instructs add_pauli_checks on how to assign physical qubits to newly added ancillas. ancilla_qubits should be the same length as target_qubits (2nd positional arg), and ancilla_qubits[i] should be used to implement checks with target_qubits[i].
cost: The cost function used to evaluate checks. Can be gamma or LER (logical error rate), which are all ways to describe how much of the noise channel is left undetected by the checks
method: The method used to find valid checks. We use windowed which randomly samples many spatial window subsets of a target qubit's support (wires). Other valid values are genetic and windowed_genetic.
The output of add_pauli_checks contains circuits with increasing numbers of checks, from a circuit with no checks to a circuit with checks on every target qubit (assuming valid checks are found).

See Sections II-IV of the supplementary information for detailed information on finding good spacetime Pauli checks

import matplotlib.pyplot as plt
from qiskit_paulice import add_pauli_checks

# Find checks on circuit
checked = add_pauli_checks(
    circuit_isa,
    target_qubits,
    noise_model,
    ancilla_qubits=ancilla_qubits,
    cost="gamma",
    method="windowed",
    seed=seed,
)

print(f"Physical layout of payload+ancillas: {layout + ancilla_qubits}")
print("Checked circuit:")
checked[-1].circuit.draw("mpl", fold=-1, idle_wires=False)

Output:

Physical layout of payload+ancillas: [68, 69, 78, 89, 90, 91, 98, 111, 112, 113, 119, 133, 67, 70, 88, 92, 110, 114, 132]
Checked circuit:

6. Determine stabilizers to measure

To test the performance of the error detection we will estimate the fidelity of the stabilizer state, $|\psi\rangle = U|0\rangle^{\otimes n}$ , that the circuit ideally prepares against the noisy state, $\rho$ , the hardware actually outputs. The projector onto the pure stabilizer state, $|\psi\rangle$ , is defined to be the uniform average over the $2^n$ elements of its stabilizer group $\mathcal{S}$ :

$|\psi\rangle\langle\psi| = \frac{1}{2^n}\sum_{G \in \mathcal{S}} G$

Plugging this into the fidelity equation, we see that the fidelity of noisy state, $\rho$ , is equivalent to the average expectation value of all stabilizers, $G \in \mathcal{S}$ , with respect to $\rho$ :

$F = \mathrm{Tr}(\rho|\psi\rangle\langle\psi|) = \mathrm{Tr}({\frac{1}{2^n}\sum_{G \in \mathcal{S}} \rho G}) = \frac{1}{2^n} \sum_{G \in \mathcal{S}} \mathrm{Tr}(\rho G) = \frac{1}{2^n} \sum_{G \in \mathcal{S}} \langle G \rangle_\rho$

For larger problems, measuring (or even enumerating) all $2^n$ stabilizers is infeasible, so fidelity is often estimated by measuring a random sample of stabilizers. Sampling $M$ stabilizers $G_1, \ldots, G_M$ uniformly at random from $\mathcal{S}$ gives an unbiased estimator of the fidelity, $\hat F_M$ ,

$\hat F_M = \frac{1}{M} \sum_{i=1}^M \langle G_i \rangle_\rho$ For this experiment we will make the circuit much deeper to enhance the effect of gate noise, and we will add checks to every qubit with a neighboring ancilla qubit. While there are more sophisticated ways to sample stabilizers to estimate fidelity, in this example we will sample $100$ stabilizers with uniform probability.

from qiskit.quantum_info import Clifford, Pauli, PauliList

num_stabilizers = 100
num_shots = 1_000
num_checks = 7
depth = 72

circuit = random_clifford_circuit(num_qubits, depth, rng)
# Create all 2^N stabilizers
circ_no_meas = circuit.remove_final_measurements(inplace=False)
stabilizers = PauliList([Pauli("I" * num_qubits)])
for generator in (
    Pauli(label) for label in Clifford(circ_no_meas).to_labels(mode="S")
):
    stabilizers = stabilizers + stabilizers.compose(generator)

# Choose some random stabilizers from the set
keep = np.where(stabilizers.x.any(axis=1) | stabilizers.z.any(axis=1))[0]
chosen = np.random.default_rng(seed).choice(
    keep, size=min(num_stabilizers, len(keep)), replace=False
)
stabilizers = [stabilizers[int(i)] for i in chosen]
depth = circuit.depth(lambda x: x.operation.num_qubits == 2)
print(
    f"Sampled {len(stabilizers)} stabilizers of {circuit.num_qubits}-qubit circuit with two-qubit depth of {depth}: {{{stabilizers[0]}, {stabilizers[1]}, ...}}"
)

Output:

Sampled 100 stabilizers of 12-qubit circuit with two-qubit depth of 72: {-YIIYZZZIZIYY, -XIXYIIZXYIZX, ...}

7. Find good checks for each measured stabilizer

In general, two random stabilizers will not qubit-wise commute, so a set of checks will not usually be valid for two circuits implementing both sets of basis rotations. One could group stabilizers into qubit-wise commuting sets and find checks which represent the whole group, but since we are choosing stabilizers uniformly at random, we will likely not be able to group many stabilizers together for simultaneous sampling. Instead, we will just find a good set of checks for each stabilizer independently. A set of checks is "good" if they are valid and capture a large fraction of logical errors during execution of the circuit.

For this experiment, we will shuffle the order in which we specify target_qubits so that we can study whether the order seems to make a big impact on how the quality of checks progresses as more are added. Remember, checks are added and committed to the circuit sequentially and in the order specified in target_qubits. Once the first check is set, it is not changed as more checks are added.

import random
import time

from tqdm import tqdm


def append_basis_rotation(c, G):
    """Strip measurements, append basis rotations, and re-measure_all."""
    out = c.remove_final_measurements(inplace=False)
    n = G.num_qubits
    for q in range(n):
        if G.x[q]:
            if G.z[q]:
                out.sdg(q)
            out.h(q)
    out.measure_all()
    return out


noisy_circuits = []
checked_circuits = []
t0 = time.time()
depths_2q = []
for i, G in enumerate(tqdm(stabilizers)):
    # Transpile each basis-rotated circuit onto the backend before adding checks
    noisy_circuits.append(pm.run(append_basis_rotation(circuit, G)))
    # Shuffle target/ancilla pairs together so each target keeps its ancilla
    targets, ancillas = zip(
        *random.sample(
            list(zip(target_qubits, ancilla_qubits, strict=True)),
            k=len(target_qubits),
        ),
        strict=True,
    )
    checked_circuits.append(
        add_pauli_checks(
            noisy_circuits[-1],
            list(targets),
            noise_model,
            ancilla_qubits=list(ancillas),
            cost="gamma",
            method="windowed",
            seed=seed + 1 + i,
        )
    )
    depths_2q.append(
        checked_circuits[-1][-1].circuit.depth(lambda x: len(x.qubits) == 2)
    )
print(
    f"Added {len(target_qubits)} checks to {len(stabilizers)} circuits in {(time.time() - t0):.0f}s."
)
print(
    f"On average, 2-qubit depth increased from {circuit.depth(lambda x: len(x.qubits) == 2)} to {int(np.mean(depths_2q))} when adding {num_checks} checks."
)

Output:

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [05:47<00:00,  3.48s/it]

Added 7 checks to 100 circuits in 348s.
On average, 2-qubit depth increased from 72 to 87 when adding 7 checks.

8. Sample from the noisy circuit

We use Qiskit Aer to produce noisy samples for the bare payload circuit as well as the checked circuit. We use the same depolarizing model for simulation of noisy samples that we used to find checks. Once the simulations are complete, we will calculate the average postselection rate and expectation values to get a feel for how well the error detection worked before visualizing the data.

from qiskit_aer import AerSimulator
from qiskit_aer.noise import NoiseModel as AerNoiseModel
from qiskit_aer.noise import ReadoutError, depolarizing_error

# Create a noisy simulator with the same info we used in the noise model for check picking
aer_nm = AerNoiseModel()
aer_nm.add_all_qubit_quantum_error(
    depolarizing_error(noise_model.gate_noise, 2), ["cz"]
)
p = noise_model.readout_noise
aer_nm.add_all_qubit_readout_error(ReadoutError([[1 - p, p], [p, 1 - p]]))
noisy_sim = AerSimulator(method="stabilizer", noise_model=aer_nm)

counts = []
t0 = time.time()
for i, checked_circ_result in enumerate(tqdm(checked_circuits)):
    noisy_counts = (
        noisy_sim.run(
            noisy_circuits[i], shots=num_shots, seed_simulator=seed * i + 1
        )
        .result()
        .get_counts()
    )
    checked_counts_per_variant = []
    for k, ck in enumerate(checked_circ_result):
        cnts = (
            noisy_sim.run(
                ck.circuit, shots=num_shots, seed_simulator=seed * i + 2 + k
            )
            .result()
            .get_counts()
        )
        checked_counts_per_variant.append(cnts)
    counts.append((noisy_counts, checked_counts_per_variant))

Output:

100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 100/100 [05:50<00:00,  3.50s/it]

9. Postselect checked circuit samples

In this package a check is implemented using entangling gates between one ancilla qubit and one target qubit. Each ancilla starts in $|0\rangle$ , so $Z_\text{anc}$ stabilizes its input state. Forward propagating $Z_\text{anc}$ from the beginning of the ancilla through the entire checked circuit yields a Pauli operator on the output which may be higher weight and extend into the payload circuit. The qubit indices on which this output operator has non-identity terms are called the check's support, and the check passes if the bits, $b$ , in the check's support have even parity: $\bigoplus_{i=1} b_i = 0$ . A sample is kept if each check produces $0$ for its parity check.

The CheckedCircuit instances output from add_pauli_checks provide a get_postselection_method method, which returns a method that takes in a bitstring or bit array and returns a syndrome vector which reports the syndrome associated with each check. As shown below, samples which have 0 syndrome for every check are kept, and the rest are discarded.

In the chart, we see that adding more checks causes the postselection rate to go down. While the goal of error detection is to catch as many errors as possible, as postselection rate goes down more shots may be required to achieve some target accuracy. If the postselection rate drops too low, the number of shots needed to collect one logical sample may become prohibitive. In this example, the postselection rate seems to converge, implying that additional checks are contributing less additional detection capability. Although adding more Pauli checks generally provides greater error detection capability, it comes at a cost of increased circuit depth and an increase in sampling cost.

psr_per_variant = []
kept_per_stab = []
for i, (_, checked_counts_per_variant) in enumerate(counts):
    psrs = []
    kept_at_num_checks = None
    for k, cnts in enumerate(checked_counts_per_variant):
        ps_fn = checked_circuits[i][k].get_postselection_method()
        kept = {bs: n for bs, n in cnts.items() if not ps_fn(bs).any()}
        psrs.append(sum(kept.values()) / num_shots)
        if k == num_checks:
            kept_at_num_checks = kept
    psr_per_variant.append(psrs)
    kept_per_stab.append(kept_at_num_checks)

max_len = max(len(s) for s in psr_per_variant)
psrs_arr = np.full((len(psr_per_variant), max_len), np.nan)
for i, s in enumerate(psr_per_variant):
    psrs_arr[i, : len(s)] = s
ks = np.arange(max_len)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(ks, psrs_arr.T, color="darkorange", alpha=0.15, linewidth=1)
ax.plot(
    ks,
    np.nanmedian(psrs_arr, axis=0),
    color="black",
    linewidth=1,
    linestyle="--",
    label="median",
)
ax.set(
    xlabel="Checks committed",
    ylabel="Post-selection rate",
    ylim=(0, 1.05),
    title=f"Per-stabilizer post-selection rate ({len(psrs_arr)} stabilizers)",
)
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Output:

10. Compare fidelities of original and postselected noisy states

We see that postselecting only samples with no detected errors greatly improves the expectation value for every stabilizer, and thus improves the estimated fidelity of the stabilizer state. Although the postselected values were calculated using about 60% the number of samples as the raw, noisy values, the expectation values are much more accurate and the sampled variance is lower. We can also now see that the postselection rate above was actually converging to a value similar to the noisy circuit fidelity. The postselection rate was converging because the checks were catching the vast majority of errors actually occurring during circuit execution.

from qiskit.result import sampled_expectation_value


def expectation(counts, G):
    """EV of G from counts measured in Z-basis.

    Pads with `I` on any qubits beyond G's support (e.g. check ancillas in the postselected counts).
    """
    if not counts:
        return float("nan")
    n = G.num_qubits
    sign = -1 if int(G.phase) % 4 == 2 else 1
    total = len(next(iter(counts)))
    label = "".join(
        "Z" if q < n and (G.x[q] or G.z[q]) else "I"
        for q in range(total - 1, -1, -1)
    )
    return sign * sampled_expectation_value(counts, label)


results = []
for i, ((noisy_counts, _), kept) in enumerate(
    zip(counts, kept_per_stab, strict=True)
):
    results.append(
        (
            expectation(noisy_counts, stabilizers[i]),
            expectation(kept, stabilizers[i]),
            sum(kept.values()) / num_shots,
        )
    )

fidelity_noisy = float(np.nanmean([r[0] for r in results]))
fidelity_postsel = float(np.nanmean([r[1] for r in results]))
psr = np.mean([r[2] for r in results])
print(
    f"ideal fidelity: 1.0\n"
    f"noisy fidelity: {fidelity_noisy:.4f}\n"
    f"postselected fidelity: {fidelity_postsel:.4f}\n"
    f"mean post-sel rate: {psr:.3f}\n"
)

evs_ideal = np.array([1.0 for r in results])
evs_noisy = np.array([r[0] for r in results])
evs_post = np.array([r[1] for r in results])
idx = np.arange(len(results))


def cum_mean_sem(ys):
    valid = ~np.isnan(ys)
    s = np.cumsum(np.where(valid, ys, 0.0))
    s2 = np.cumsum(np.where(valid, ys**2, 0.0))
    n = np.maximum(np.cumsum(valid).astype(float), 1)
    mean = s / n
    sem = np.sqrt(np.maximum(s2 / n - mean**2, 0) / n)
    return np.where(np.cumsum(valid) > 0, mean, np.nan), sem


def strip(ax, ys, color, label):
    m, s = np.nanmean(ys), np.nanstd(ys)
    ax.axhspan(
        m - s, m + s, color=color, alpha=0.15, label=f"{label} mean ± std"
    )
    ax.axhline(m, color=color, linewidth=1, linestyle="--")


fig, ax = plt.subplots(figsize=(8, 4))
ax.axhline(np.nanmean(evs_ideal), color="black", linewidth=1.5, label="ideal")
strip(ax, evs_noisy, "red", "noisy")
strip(ax, evs_post, "green", "post-sel")
ax.scatter(
    idx, evs_noisy, color="red", s=22, alpha=0.7, zorder=3, label="noisy <G>"
)
ax.scatter(
    idx,
    evs_post,
    color="green",
    s=22,
    alpha=0.7,
    zorder=3,
    label="post-sel <G>",
)
ax.set(
    xlabel="stabilizer index",
    ylabel=r"$\langle G \rangle$",
    ylim=(-0.1, 1.1),
    title="Per-stabilizer EVs",
)
ax.legend(loc="lower left", ncol=2)
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

M = np.arange(1, len(results) + 1)
fig, ax = plt.subplots(figsize=(8, 4))
for ys, color, label in [
    (evs_ideal, "black", "ideal"),
    (evs_noisy, "red", "noisy"),
    (evs_post, "green", "post-sel"),
]:
    cm, sem = cum_mean_sem(ys)
    ax.plot(M, cm, color=color, linewidth=1.5, label=label)
    ax.fill_between(M, cm - sem, cm + sem, color=color, alpha=0.15)
ax.set(
    xlabel="number of stabilizers averaged",
    ylabel="running fidelity estimate",
    title="Fidelity convergence vs. number of stabilizers",
)
ax.legend()
ax.grid(True, alpha=0.3)
plt.tight_layout()
plt.show()

Output:

ideal fidelity: 1.0
noisy fidelity: 0.6251
postselected fidelity: 0.9864
mean post-sel rate: 0.605

Here we see that although we randomized the order in which we added checks to the target qubits, the convergence of gamma follows roughly the same trend for all stabilizer experiments. For other problems on a different set of physical qubits, and with an imperfect noise model (remember, our noise model in this tutorial exactly modeled the noise affecting the samples), one might see more pronounced differences in convergence depending on the order in which checks are added.

This gamma curve converging to 1.0 implies that additional checks will not be able to detect a great deal of additional error, as there is very little modeled error that is uncovered by the checks. This reinforces our earlier observation regarding convergence in postselection rate being due to a lack of uncovered error in the circuit.

stab_scores = [
    [v.cost for v in checked_circ_result]
    for checked_circ_result in checked_circuits
]
max_len = max(len(s) for s in stab_scores)
scores = np.full((len(stab_scores), max_len), np.nan)
for i, s in enumerate(stab_scores):
    scores[i, : len(s)] = s
ks = np.arange(max_len)

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(ks, scores.T, color="steelblue", alpha=0.15, linewidth=1)
ax.plot(
    ks,
    np.nanmedian(scores, axis=0),
    color="black",
    linewidth=1,
    linestyle="--",
    label="median",
)
ax.set(
    xlabel="Checks committed",
    ylabel="Gamma",
    yscale="log",
    title=f"Per-stabilizer gamma curves ({len(scores)} stabilizers)",
)
ax.legend()
ax.grid(True, alpha=0.3, which="both")
plt.tight_layout()
plt.show()

Output: