Modeling Notes & Pitfalls

Click the badge below to try this tutorial interactively in your browser:

Launch Binder

You can also run this tutorial inGoogle Colab. It takes a one-time setup per session: follow theColab instructions.

  • Prepared by:

  • Learning objectives. After this tutorial, you will be able to:

    • Recognize common QSDsan modeling surprises by their symptoms.

    • Diagnose whether a surprising result is a misconfiguration, a unit-design signal, or expected platform behavior.

    • Apply fix patterns across streams and components, unit design, and TEA/LCA.

  • Prerequisites:

  • Covered topics:

      1. Streams and components

      1. Unit design and simulation

      1. TEA and LCA

Looking for install or environment errors? See the FAQ Common Errors page.

This tutorial documents practical notes and common pitfalls a user may run into when using QSDsan. Each entry below follows the same shape: What you see, Why it happens, and the Fix. The code cells use minimal toy components and streams so each entry runs on its own.

Setup

Import QSDsan and confirm the installed version.

[1]:
import qsdsan as qs
print(f'This tutorial was made with qsdsan v{qs.__version__}.')
This tutorial was made with qsdsan v1.5.3.

1. Streams and components

1.1. WasteStream vs. SanStream vs. Stream

What you see. A unit configured for wastewater (e.g., a clarifier) is fed a plain qs.Stream, and the unit’s outlet drops the solids and concentration fields that downstream WW-aware code depends on.

[2]:
import qsdsan as qs

cmps = qs.Components.load_default()
qs.set_thermo(cmps)

plain = qs.Stream('plain', H2O=1000, units='kg/hr')
print(type(plain).__name__)
print('has TSS attribute:', hasattr(plain, 'get_TSS'))
Stream
has TSS attribute: False

Why. Stream is the BioSTEAM base type for mass/energy flows. SanStream adds construction/transportation/impact bookkeeping for LCA. WasteStream adds the component-aggregation properties (TSS, COD, BOD, and so on) used by every WW characterization workflow. Mixing types is not a hard error; it silently drops the richer attributes when a stream falls back to the base class.

Fix. Use WasteStream for anything that participates in WW characterization; SanStream if you only need LCA bookkeeping without WW properties; Stream only for utility or heat-medium flows where neither matters.

[3]:
ww = qs.WasteStream('ww', H2O=1000, X_OHO=0.5, units='kg/hr')
print(type(ww).__name__)
print(f'TSS: {ww.get_TSS():.1f} mg/L')
WasteStream
TSS: 387.1 mg/L

1.2. Component is not Chemical

What you see. A user builds a plain Thermosteam Chemical and assumes it behaves like a Component, then finds the WW-classification attributes (degradability, i_COD, and the rest) that QSDsan characterization relies on are simply not there.

[4]:
from qsdsan import Component
from thermosteam import Chemical

glucose_chem = Chemical('Glucose')
print('type:', type(glucose_chem).__name__)
print('Chemical has degradability?', hasattr(glucose_chem, 'degradability'))
print('Chemical has i_COD?', hasattr(glucose_chem, 'i_COD'))
type: Chemical
Chemical has degradability? False
Chemical has i_COD? False

Why. Component subclasses Thermosteam’s Chemical and adds the WW-classification attributes (particle_size, degradability, organic, i_COD, and so on) that QSDsan’s characterization and process models read. A plain Chemical carries the thermodynamics but none of that bookkeeping, so those attributes are absent and anything downstream that expects them fails.

Fix. Build a Component (not a Chemical) for anything that participates in WW workflows, and give it the classification attributes at construction. The thermo properties you already know from Chemical (such as Tb) are still there.

[5]:
glucose = Component('Glucose', search_ID='Glucose',
                    particle_size='Soluble',
                    degradability='Readily',
                    organic=True)
print('type:', type(glucose).__name__)
print('degradability:', glucose.degradability)
print(f'i_COD: {glucose.i_COD:.3f}')
print(f'Tb (inherited from Chemical): {glucose.Tb} K')
type: Component
degradability: Readily
i_COD: 1.066
Tb (inherited from Chemical): 844.68 K

1.3. Stream IDs and the registry

What you see. Re-running a cell that creates a stream prints a replaced ... in registry warning, and a later System resolves a stream by ID and pulls the wrong one.

[6]:
import qsdsan as qs

cmps = qs.Components.load_default()
qs.set_thermo(cmps)

feed = qs.WasteStream('feed', H2O=1000)
feed = qs.WasteStream('feed', H2O=2000)  # warns: replaced in registry
print(qs.main_flowsheet.stream.feed.F_mass)
2000.0
C:\Users\Yalin\Documents\Coding\QSDsan-platform\.venv\Lib\site-packages\thermosteam\_stream.py:407: RuntimeWarning: <WasteStream: feed> has been replaced in registry
  self._register(ID)

Why. Streams (and units) register themselves by ID in the flowsheet registry. Reusing an ID overwrites the prior entry, so anything that resolved the old object by ID is now stale. Auto-IDs (ws1, ws2, …) do not create replacement warnings, but they may change from simulation to simulation and drift from the variable name a reader sees (e.g., ws3’s real ID might be ws5).

[7]:
# Omit the ID and QSDsan assigns one automatically (ws1, ws2, ...).
qs.main_flowsheet.set_flowsheet('auto_id_demo')

feed = qs.WasteStream(H2O=1000)
print(f'variable "feed" -> registry ID {feed.ID!r}')

# Re-building the stream (for example, by re-running the cell) bumps the global
# counter, so the same variable lands under a different ID each time:
for _ in range(3):
    feed = qs.WasteStream(H2O=1000)
    print(f'variable "feed" -> registry ID {feed.ID!r}')
variable "feed" -> registry ID 'ws1'
variable "feed" -> registry ID 'ws2'
variable "feed" -> registry ID 'ws3'
variable "feed" -> registry ID 'ws4'

Fix. For streams you would like to track, pass an explicit ID that matches the variable name. To overwrite intentionally, del the old one first or call qs.main_flowsheet.stream.clear().

[8]:
qs.main_flowsheet.stream.clear()
feed = qs.WasteStream('feed', H2O=2000)
print(f"feed mass: {qs.main_flowsheet.stream.feed.F_mass}")
feed mass: 2000.0

2. Unit design and simulation

2.1. Absurd geometry overlooked

What you see. Very high costs and/or environmental impacts associated with the construction of a reactor. The user assumes the design code is buggy.

[9]:
# Toy CSTR sized far outside its intended HRT range. A real qs unit would do
# this inside _design using its own validated correlations; the toy keeps the
# point self-contained.
class ToyCSTR:
    _design_HRT_range = (0.5, 24)  # hours
    def __init__(self, HRT):
        self.HRT = HRT
    def _design(self):
        lo, hi = self._design_HRT_range
        if not (lo <= self.HRT <= hi):
            return {'note': f'HRT={self.HRT} h is outside the valid range [{lo}, {hi}]'}
        return {'D_m': 4.0, 'H_m': 8.0}

print(ToyCSTR(HRT=200)._design())
{'note': 'HRT=200 h is outside the valid range [0.5, 24]'}

Why. If you are sure that the design code is correct, then check the geometry. A reactor sized for a normal volumetric loading may have a “pancake” shape with very wide diameter but very shallow height (or the reverse). The absurd geometry is the model’s way of flagging an unphysical operating point.

Fix. When a unit returns weird dimensions, check the unit’s docstring or the design references it cites for the valid operating range. If you have a legitimate reason to operate outside that range, either subclass the unit and extend _design with a custom correlation, or insert a second stage so each unit stays inside its window.

[10]:
print(ToyCSTR(HRT=12)._design())
{'D_m': 4.0, 'H_m': 8.0}

2.2. Recycle convergence with biokinetic models

What you see. A System with an ASM-based bioreactor and a sludge recycle loop times out or returns a recycle convergence error at tight tolerance.

[11]:
# Conceptual demonstration of the symptom (kept dependency-free so the
# tutorial runs fast). The point is the convergence behavior, not the numbers.
print("symptom: tight tolerance + a far-from-solution initial guess => stall")
print("a real reproducer is a 2-unit bioreactor + clarifier recycle loop")
symptom: tight tolerance + a far-from-solution initial guess => stall
a real reproducer is a 2-unit bioreactor + clarifier recycle loop

Why. Biokinetic stoichiometry is stiff: small composition changes in the recycle loop produce large kinetic responses, which in turn produce large composition swings. Aitken/Wegstein acceleration helps near the solution but can amplify oscillations when the initial guess is far off. The default tolerance may be tighter than necessary for many wastewater analyses.

Fix.

  1. Loosen the System’s molar_tolerance and temperature_tolerance first (a one-line change; a larger number is looser).

  2. Improve the initial guess for the recycle stream (set realistic concentrations explicitly).

  3. Only if the first two fail, increase maxiter (the maximum number of iterations).

See 6. System for the full convergence-control surface.

[12]:
print("# defaults: molar_tolerance = 1.0 kmol/hr, temperature_tolerance = 0.1 K")
print("sys.molar_tolerance = 5.0        # looser than the default")
print("sys.temperature_tolerance = 0.5  # looser than the default")
print("# then sys.simulate(); raise sys.maxiter only if it still will not converge")
# defaults: molar_tolerance = 1.0 kmol/hr, temperature_tolerance = 0.1 K
sys.molar_tolerance = 5.0        # looser than the default
sys.temperature_tolerance = 0.5  # looser than the default
# then sys.simulate(); raise sys.maxiter only if it still will not converge

2.3. Dynamic simulation not updating states

What you see. A dynamic simulation runs to completion, but every state stays at its initial value. The user assumes the ODE solver is broken.

[13]:
# Conceptual demonstration of the symptom.
print("symptom: simulate() runs, but the state never moves")
print("inspect: did the subclass override _init_state? did _compile_ODE wire dstate?")
symptom: simulate() runs, but the state never moves
inspect: did the subclass override _init_state? did _compile_ODE wire dstate?

Why. Dynamic SanUnit subclasses must override _init_state to populate self._state from ins, and _compile_ODE to populate self._dstate from self._state and ins. If _init_state is missing, the state starts as zeros (or whatever the base class set); if _compile_ODE is missing, dstate is zero, so the state never moves. nbsphinx_allow_errors does not help here, because the simulation does not error, it just does not move.

Fix. Implement both methods in any dynamic subclass, and cross-check state and dstate at t=0 before running the full simulation. See 11. Dynamic Simulation for the full state/dstate contract.

[14]:
print("verify before t-stepping:")
print("u._init_state(); print(u._state)")
print("u._compile_ODE(); u._ode(t=0, y=u._state); print(u._dstate)")
verify before t-stepping:
u._init_state(); print(u._state)
u._compile_ODE(); u._ode(t=0, y=u._state); print(u._dstate)

2.4. simulate does more than _run

What you see. A parameter sweep over a kinetic constant changes the unit’s outlet concentrations from cycle to cycle, but installed_cost is the same every cycle. The user assumes the cost correlation ignores composition.

[15]:
print("symptom: outlet streams change across sweep iterations, installed_cost does not")
symptom: outlet streams change across sweep iterations, installed_cost does not

Why. The sweep is calling unit._run() (or assigning into the unit’s parameters and then reading outlets) without invoking _design and _cost. unit.simulate() calls _run, then _summary, which itself calls _design and _cost. Otherwise the cost results from the previous full simulation stay cached.

Fix. Always go through unit.simulate() (or system.simulate()) when downstream code reads design or cost results. See Putting them together with simulate in tutorial 5 for the full call graph.

[16]:
print("for k in k_range:")
print("    unit.k = k")
print("    unit.simulate()  # not just unit._run()")
print("    record(unit.installed_cost)")
for k in k_range:
    unit.k = k
    unit.simulate()  # not just unit._run()
    record(unit.installed_cost)

3. TEA and LCA

3.1. CEPCI year drifts costs silently

What you see. A benchmark unit’s installed cost prints about 20% off from a published reference, or you suddenly notice a large change in cost metrics despite no recent updates.

[17]:
import qsdsan as qs

print(f'qs.CEPCI = {qs.CEPCI}')
print(f'CEPCI[2023] = {qs.CEPCI_by_year[2023]}')
print(f'CEPCI[2017] = {qs.CEPCI_by_year[2017]}')
print(f'ratio = {qs.CEPCI_by_year[2023] / qs.CEPCI_by_year[2017]:.3f}')
qs.CEPCI = 567.5
CEPCI[2023] = 797.9
CEPCI[2017] = 567.5
ratio = 1.406

Why. qs.CEPCI is module-level state set once at import. If a unit’s purchase-cost correlation was developed for a different year than the current qs.CEPCI, costs are scaled by the ratio of those two indices. When a user runs several analyses targeting different reference years without setting qs.CEPCI explicitly, costs silently drift.

Fix. Set qs.CEPCI deliberately at the top of every analysis, and assert it immediately afterward so a misconfiguration fails loudly:

[18]:
target = qs.CEPCI_by_year[2023]
qs.CEPCI = target
assert qs.CEPCI == target
print(f'CEPCI locked to {qs.CEPCI} (year 2023)')
CEPCI locked to 797.9 (year 2023)

3.2. Purchase vs. installed vs. total capital cost

What you see. Three numbers from one unit (unit.purchase_cost, unit.installed_cost, and the contribution to TEA.installed_equipment_cost) do not tally with the user’s hand calculation.

[19]:
print("purchase_cost: cost of the equipment 'as bought' (FOB).")
print("installed_cost: purchase_cost * F_BM (bare-module factor).")
print("TEA.installed_equipment_cost: sum of installed_cost across units.")
purchase_cost: cost of the equipment 'as bought' (FOB).
installed_cost: purchase_cost * F_BM (bare-module factor).
TEA.installed_equipment_cost: sum of installed_cost across units.

Why. _cost populates purchase_cost; the bare-module factor F_BM (per unit type, often 2 to 3) inflates it to installed_cost; the TEA aggregator may add further capital markups on top (see the TEA tutorial). Most “CAPEX does not add up” reports come from comparing the wrong two fields.

Fix. Match the level: cite purchase_cost for raw-equipment comparisons, installed_cost for delivered-and-installed, and the TEA aggregate for total capital investment. Each is correct at its own level. See 7. TEA for the full capital build-up.

[20]:
print("for unit in sys.units:")
print("    print(unit.ID, unit.purchase_cost, unit.installed_cost, unit.F_BM)")
for unit in sys.units:
    print(unit.ID, unit.purchase_cost, unit.installed_cost, unit.F_BM)

3.3. Static values for varying inventory quantities

What you see. An LCA item whose quantity depends on the simulation (for example, electricity use) is given a fixed number computed at baseline. Later runs that change the system, such as each sample in an uncertainty or sensitivity analysis, produce different electricity use, but the LCA keeps reporting the baseline impact.

[21]:
import qsdsan as qs
from qsdsan.utils import create_example_system

qs.main_flowsheet.set_flowsheet('lca_inventory_demo')  # isolate this entry
sys = create_example_system()
sys.simulate()
lifetime = 10

GWP = qs.ImpactIndicator('GlobalWarming', alias='GWP', unit='kg CO2-eq')
e_item = qs.ImpactItem('e_item', 'kWh', GWP=0.7)  # 0.7 kg CO2-eq per kWh

# Freeze the electricity use computed at this baseline (a plain number).
power_kWh = sys.power_utility.rate * 24 * 365 * lifetime
lca = qs.LCA(system=sys, lifetime=lifetime, simulate_system=False,
             indicators=(GWP,), e_item=power_kWh)
print(f'baseline power: {sys.power_utility.rate:.3f} kW')
print(f'baseline GWP:   {lca.get_total_impacts()[GWP.ID]:,.0f} kg CO2-eq')

# Change a parameter (as a UA/SA sample would) and re-simulate.
sys.flowsheet.stream.salt_water.F_mass *= 2
sys.simulate()
print(f'after change:   {sys.power_utility.rate:.3f} kW (electricity roughly doubled)')
print(f'LCA still says: {lca.get_total_impacts()[GWP.ID]:,.0f} kg CO2-eq  <- unchanged!')
baseline power: 0.916 kW
baseline GWP:   56,150 kg CO2-eq
after change:   1.973 kW (electricity roughly doubled)
LCA still says: 56,150 kg CO2-eq  <- unchanged!

Why. LCA stores each “other item” quantity as a function and re-evaluates it every time results are requested. If you pass a plain number, it is frozen as a constant (lambda: value), so it never reflects later changes to the system. The stream and construction impacts, which are read from the live system, do update, so the electricity term silently falls out of sync with the rest of the inventory, and every UA/SA sample reuses the baseline electricity.

Fix. Pass a callable (or a (callable, unit) tuple) instead of a number, so the quantity is recomputed from the current system state each time impacts are requested. Anything that varies with the simulation (electricity, chemical dosing, emissions) should be supplied this way. See 8. LCA for the full inventory surface.

[22]:
# Pass a callable so the quantity is recomputed from the current system each time.
lca_fixed = qs.LCA(system=sys, lifetime=lifetime, simulate_system=False,
                   indicators=(GWP,),
                   e_item=lambda: sys.power_utility.rate*24*365*lifetime)
print(f'doubled-flow GWP: {lca_fixed.get_total_impacts()[GWP.ID]:,.0f} kg CO2-eq')

# Revert the change and the callable tracks it back down.
sys.flowsheet.stream.salt_water.F_mass /= 2
sys.simulate()
print(f'reverted GWP:     {lca_fixed.get_total_impacts()[GWP.ID]:,.0f} kg CO2-eq  <- tracks the change')
doubled-flow GWP: 120,974 kg CO2-eq
reverted GWP:     80,776 kg CO2-eq  <- tracks the change

↑ Back to top