The Rouse model
In the Rouse model, a polymer chain is described as point particles in viscous medium, chained to each other by harmonic bonds. The force exerted by such a harmonic bond is simply linear in the separation between the two monomers, such that the equation of motion for a linear Rouse polymer reads \begin{equation}\label{eq:vanilla} \dot{x}_i(t) = k(x_{i-1}(t) - x_i(t)) + k(x_{i+1}(t) - x_i(t)) + \xi_i(t)\,, \tag{1} \end{equation} where \(x_i(t)\) is the position of the \(i\)-th monomer at time \(t\), \(k\) is the spring constant of the harmonic bonds, and \(\xi(t)\) is thermal noise; its strength is given by the free diffusion constant \(D\) of an individual monomer: \begin{equation} \left\langle \xi_i(t)\xi_j(t') \right\rangle = 2D\delta_{ij}\delta(t-t')\,. \end{equation}
In principle the \(x_i\) are vectors in \(d\) spatial dimensions; however, due to the linear equation of motion, driven by dimensionally uncorrelated white noise, the spatial dimensions fully decouple, such that we can specialize the analytical treatment to 1D for convenience.
The solution to the model is most naturally found by recognizing eq. \eqref{eq:vanilla} as a (multi-dimensional) linear differential equation; we thus write it in the following form, therebey achieving also some generalization: \begin{equation}\label{eq:eom} \dot{x}(t) = -kAx(t) + F(t) + \xi(t)\,, \tag{2} \end{equation} where now \(x \equiv (x_1, x_2, \ldots, x_N)\), \(F(t)\), and \(\xi \equiv (\xi_1, \xi_2, \ldots, \xi_N)\) are vectors, encoding: the full conformation of the polymer; any deterministic force acting on it; and the thermal fluctuations of each monomer, respectively. \(k\) is the “default” bond strength (see below).
The matrix \(A\) encodes the connectivity of the chain, i.e. which monomer is bound to which and how strongly. From eq. \eqref{eq:vanilla} we can see that for a simple linear chain the connectivity \(A\) would be \begin{equation} A = \left(\begin{array}{ccccc} 1 & -1 & & & & \\ -1 & 2 & -1 & & & \\ & -1 & 2 & -1 & & \\ & & & \ddots & & \\ & & & -1 & 2 & -1 \\ & & & & -1 & 1 \\ \end{array}\right)\,. \end{equation} We can insert more bonds into this matrix, or remove some; to insert a bond of strength \(rk\) between monomers \(i\) and \(j\), we do the following substitutions: \begin{equation} A_{ij} \gets A_{ij} - r \,;\qquad A_{ji} \gets A_{ji} - r \,;\qquad A_{ii} \gets A_{ii} + r \,;\qquad A_{jj} \gets A_{jj} + r \,. \end{equation} Of course this means we can remove an existing bond by “inserting” a bond with \(r < 0\).
The solution to eq. \eqref{eq:eom} is relatively straight-forward and forms the basis of the rouse module: \begin{equation}\label{eq:sol}
x(t) = \mathrm{e}^{-kAt}x(0) + \int_0^t\mathrm{d}\tau\, \mathrm{e}^{-kA(t-\tau)}\left(F(\tau) + \xi(\tau)\right)\,. \tag{3}
\end{equation}
The rouse module now essentially provides various useful ways to evaluate eq. \eqref{eq:sol} (and a few related things):
evolving a conformation \(x(t)\) forward for some time \(\Delta t\)
calculating the steady state ensemble of conformations
propagating Gaussian ensembles of conformations forward in time
evaluating the analytical expression for mean squared displacement (MSD) of a given combination of loci
Seeing how all of these work is essentially the to-do list for this tutorial. So let’s start by setting up a rouse.Model.
[1]:
import numpy as np
from scipy import stats
from matplotlib import pyplot as plt
import rouse
Creating a rouse.Model
We start by creating a simple model with 10 monomers:
[2]:
model = rouse.Model(N=10, d=2)
print(model)
rouse.Model(N=10, D=1.0, k=1.0, d=2)
By default, rouse.Model initializes a free, linear polymer. We can check this by taking a look at the connectivity matrix A and the external force F:
[3]:
print("Connectivity A")
print(model.A)
print()
print("External force F")
print(model.F)
Connectivity A
[[ 1. -1. 0. 0. 0. 0. 0. 0. 0. 0.]
[-1. 2. -1. 0. 0. 0. 0. 0. 0. 0.]
[ 0. -1. 2. -1. 0. 0. 0. 0. 0. 0.]
[ 0. 0. -1. 2. -1. 0. 0. 0. 0. 0.]
[ 0. 0. 0. -1. 2. -1. 0. 0. 0. 0.]
[ 0. 0. 0. 0. -1. 2. -1. 0. 0. 0.]
[ 0. 0. 0. 0. 0. -1. 2. -1. 0. 0.]
[ 0. 0. 0. 0. 0. 0. -1. 2. -1. 0.]
[ 0. 0. 0. 0. 0. 0. 0. -1. 2. -1.]
[ 0. 0. 0. 0. 0. 0. 0. 0. -1. 1.]]
External force F
[[0. 0.]
[0. 0.]
[0. 0.]
[0. 0.]
[0. 0.]
[0. 0.]
[0. 0.]
[0. 0.]
[0. 0.]
[0. 0.]]
Set up like this, the model represents a linear polymer, which in equilibrium forms a random walk coil. We can sample these equilibrium conformations with conf_ss():
[4]:
# Define a function to plot a few polymer conformations
# we'll use this repeatedly below
def plot_conformations(conformations):
n = len(conformations)
fig, axs = plt.subplots(1, n, figsize=[3*n, 3])
for i, ax in enumerate(axs):
conf = conformations[i]
ax.plot(conf[:, 0], conf[:, 1],
marker='o',
color=f"C{i}",
)
ax.axis('square')
ax.axis('off')
plt.show()
[5]:
conformations = [model.conf_ss() for _ in range(5)]
plot_conformations(conformations)
As outlined above, we can modify the connectivity of the chain: by adding a bond between the first and last monomer we get a ring polymer.
[6]:
model.add_bonds([(0, -1)])
conformations = [model.conf_ss() for _ in range(5)]
plot_conformations(conformations)
We can also build a Brownian bridge by tethering the two ends of the chain to fixed positions in space:
[7]:
model.add_bonds([(0, -1)], k_rel=-1) # remove the circularization from previous example
model.add_tether(mon=0, point=(0, 0))
model.add_tether(mon=-1, point=(10, 10))
conformations = [model.conf_ss() for _ in range(5)]
plot_conformations(conformations)
Evolving conformations in time
Let’s watch a linearly extended conformation collapse to the random walk steady state we saw above:
[8]:
model = rouse.Model(N=10)
init = np.array([np.arange(10), np.arange(10)]).T # diagonally stretched linear conformation
t = np.array([0, 1, 2, 5, 10, 20])
for _ in range(3):
# Get time trace of conformations
conformations = [init]
for dt in np.diff(t):
current = conformations[-1]
evolved = model.evolve(current, dt=dt)
conformations.append(evolved)
# Plot
n = len(t) # number of subplots
fig, axs = plt.subplots(1, n,
figsize=[n*3, 3],
sharex=True,
sharey=True,
)
for i, (ax, conf, t_cur) in enumerate(zip(axs, conformations, t)):
ax.plot(conf[:, 0], conf[:, 1],
marker='o',
color=f"C{i}",
)
ax.axis('square')
ax.axis('off')
ax.set_title(f"t = {t_cur}")
plt.show()
Note that the logarithmically changing time step here is a bit untypical. For many applications, we are more interested in evolving repeatedly for the same time step; this allows to speed up computations significantly, as explained below.
Cached computations
The evolution of a given conformation \(x(t)\) is given by the analytical solution to the Rouse model as \begin{equation} x(t+\Delta t) = \text{e}^{-kA\Delta t} x(t) + \eta(t)\,, \end{equation} where \(\eta(t)\) is a Gaussian random variable defined by \(\eta(t) = \int_t^{t+\Delta t}\text{d}\tau\, \text{e}^{-kA(t-\tau)} \xi(\tau)\); we set the external force \(F = 0\) for the time being.
The key point here is that evolution over a time step \(\Delta t\) amounts to matrix multiplication with \(\text{e}^{-kA\Delta t}\) (the “propagator”) and sampling of Gaussian random variables with a given correlation structure. Executing these steps is very fast, provided that we have numerical representations of the propagator and the noise correlation matrix; calculating those two, on the other hand, is computationally costly.
To avoid recalculating propagator and noise correlation for every single evolution step, rouse.Model caches these internally. You can update this cache by calling update_dynamics(), which takes the time step dt as argument. This function is also called upon initialization of a rouse.Model, with a time step of dt = 1. In many cases there is no reason to use another time step; see the side note on scale
invariance below.
This caching—while speeding up time evolution significantly—has the disadvantage that the cached data has to be kept up to date when modifying the model. rouse.Model tries to keep track of changes and detect them with check_dynamics():
[9]:
model = rouse.Model(N=10, k=1,
setup_dynamics=True, # setup_dynamics = True is the default; just mentioning it for clarity
) # as the name suggests, this sets up (i.e. caches) the
# dynamics with the current settings
model.k = 5 # change model; the cached values are now incorrect
try:
model.check_dynamics(run_if_necessary=False)
except RuntimeError:
print("model.check_dynamics() detected a change in the model")
model.check_dynamics() detected a change in the model
Note the use of run_if_necessary = False: by default, check_dynamics(), upon detecting a mismatch between cached and current values, does not actually throw an error (as in the example), but just calls update_dynamics() to bring the cache up to date.
Most of this cache-checking is usually done internally, so you as the user won’t have to be concerned with it too much. There are, however, two things to take away from this: + check_dynamics() is not all-encompassing. It will not detect, for example, edits you make directly to model.A (the connectivity matrix). Doing so is therefore strongly discouraged, use add_bonds() instead. Direct edits of model.F are also not detected; remember to call update_dynamics() (or potentially
update_F_only(), if that was the only edit). + functions like evolve and propagate (the latter will be introduced below) by default run check_dynamics() on every call, which might be inefficient if you can be sure that the model has not been modified. In these cases, we recommend to run check_dynamics() manually once and then pass the keyword argument check_dynamics = False to these functions.
Side note on scale invariance
The Rouse model is scale invariant, i.e. the only time scale in the system is \(1/k\), while the only length scale is \(\sqrt{D/k}\). Thus, evolving a model with given \((D, k)\) over a short time step \(r\Delta t\) is exactly the same as evolving a modified model \((D', k')\) over the time step \(\Delta t\), where \(D' = rD'\), \(k' = rk'\). You can thus choose to either keep \(\Delta t\) constant and adjust the model parameters as needed, or fix e.g. \(k = 1\) and adjust \(\Delta t\) to achieve the desired behavior.
Steady state ensemble
In the examples above, we were concerned with single conformations of the polymer at a given time. From a statistical physics perspective, we are frequently not so much interested in single conformations, but rather in the behavior of the whole ensemble. To that end, we run many repeats of the same simulation and estimate the probability density over various variables of interest.
The Rouse model being analytically solvable saves us from this laborious task, since we can just write down the full probability distribution over chain conformations analytically. In steady state, this distribution will always be Gaussian, such that we need to be concerned only with its mean \(\mu_i\equiv\left\langle x_i\right\rangle\) and covariance \(\Sigma_{ij}\equiv\left\langle x_i x_j\right\rangle_c\). Using the notation introduced above, one finds the steady state ensemble to be given by
\begin{align} \left\langle x\right\rangle_\text{ss} {}={}& (kA)^{-1} F \,, \\ \left\langle x^2\right\rangle_\text{c, ss} {}={}& (2kA)^{-1} \,. \end{align}
Note the use of \(A^{-1}\); \(A\) is not necessarily invertible. As a matter of fact, it rarely is: check the connectivity matrix for a free chain written above. But then, technically speaking, a free chain also does not actually have a steady state: once the internal degrees of freedom equilibrate, the whole polymer coil will just keep diffusing in the medium, eventually drifting off to infinity. A free chain of \(N\) monomers therefore has one degree of freedom (the center of mass location) that does not equilibrate, while the remaining \(N-1\) degrees of freedom (sometimes called “Rouse modes”) are internal and equilibrate in a finite time. When calculating the “steady state” for a free chain then, what we really would like to do is to get the steady state for all internal degrees of freedom and pin the center of mass to some fixed point in space. This prescription is exactly achieved by using the Moore-Penrose inverse of \(A\) whenever we write \(A^{-1}\).
The Moore-Penrose inverse is best understood in terms of the eigenvalues of the matrix. Normally, to invert a diagonalizable matrix \(A\), one simply replaces all eigenvalues \(\lambda_i\) by their inverse \(\lambda_i^{-1}\). This clearly fails when \(\lambda_i = 0\) for some \(i\), i.e. the matrix is not invertible. The Moore-Penrose inverse of a diagonalizable matrix \(A\) now simply follows that prescription “as far as possible”: replace all eigenvalues \(\lambda_i \neq 0\) by their inverse, while simply keeping any zero eigenvalue.
Using the Moore-Penrose inverse in the expressions for the steady state ensemble means that we calculate the steady state for all equilibrating degrees of freedom, while pinning everything that does not equilibrate at zero.
As an example, consider a system with two independent chains. This is easy to set up by removing any bond from the free chain, thus “cutting” it into two pieces. With the two centers of mass of the chains, the system now has two non-equilibrating degrees of freedom. The steady state calculated by rouse.Model will thus place both coils at the origin.
Finally, let’s see what these steady states look like:
[10]:
model = rouse.Model(N=10, d=1)
mu, Sigma = model.steady_state()
# Plot μ and Σ
fig, axs = plt.subplots(1, 2,
figsize=[7, 4],
gridspec_kw={'width_ratios' : [2, 5], 'wspace' : 0.2},
sharey=True,
)
ax = axs[0]
ax.plot(mu, np.arange(model.N))
ax.set_title(r'$\mu_i = \left\langle x_i\right\rangle$')
ax.set_xlabel('mean position')
ax.set_ylabel('monomer index i')
ax.set_yticks(np.arange(model.N))
ax = axs[1]
ax.imshow(Sigma)
ax.set_title(r'$\Sigma_{ij} = \left\langle x_i x_j\right\rangle_c$')
ax.set_xlabel('monomer index j')
ax.set_ylabel('monomer index i')
ax.set_xticks(np.arange(model.N))
ax.yaxis.set_tick_params(which='both', labelleft=True)
plt.show()
Since we initialized a free chain (no external forcing or tethering), the expectation value for all monomers is exactly the chain center of mass. The covariance is highest at the ends of the chain, since these tend to be furthest from the center of mass.
We can utilize this Gaussian ensemble to calculate probability distributions over observable quantities, such as the end-to-end separation, \(R_\text{ee} \equiv x_N - x_0 = w\cdot x\), where the measurement vector \(w\) is defined below. Since \(x\) is Gaussian, so is \(R_\text{ee}\), with zero mean and covariance \(\left\langle R_\text{ee}^2\right\rangle_c = w\Sigma w^T\). Thus:
[11]:
w = np.zeros(model.N)
w[0] = -1
w[-1] = 1
R2 = w @ Sigma @ w.T
# This is a well-known result
print(f"Mean squared end-to-end distance with N = {model.N} monomers: N-1 = {R2:.3f}")
# Plot
xplot = np.linspace(-4*np.sqrt(R2), 4*np.sqrt(R2), 100)
plt.plot(xplot, stats.norm(loc=0, scale=np.sqrt(R2)).pdf(xplot))
plt.xlabel('End-to-end separation R_ee')
plt.ylabel('density')
plt.show()
Mean squared end-to-end distance with N = 10 monomers: N-1 = 9.000
Propagating Gaussian ensembles
We will now repeat the example above, where we watched a linearly extended chain collapse, except that now we are calculating the evolution of the whole ensemble. We initialize the system with a Brownian bridge:
[12]:
model = rouse.Model(N=10, d=2)
model.add_tether(mon=0, point=(0, 0))
model.add_tether(mon=-1, point=(10, 10))
M, C = model.steady_state()
# Plot
fig, axs = plt.subplots(1, 2,
figsize=[8, 4],
gridspec_kw={'width_ratios' : [4, 4], 'wspace' : 0.2},
)
ax = axs[0]
ax.plot(M[:, 0], M[:, 1], marker='o')
ax.axis('square')
ax.set_title('mean conformation')
ax = axs[1]
ax.imshow(C)
ax.set_title('Covariance C_ij')
ax.set_xlabel('monomer index j')
ax.set_ylabel('monomer index i')
ax.set_xticks(np.arange(model.N))
ax.set_yticks(np.arange(model.N))
plt.show()
Plotting the full covariance matrix might not be the most intuitive. Instead, let’s look at the monomer (i.e. 1-point) density in space; we expect this to be scattered around the mean (which is shown in red below), and concentrating towards the ends, where the monomers are tethered. Note that we use only the diagonal elements of the covariance matrix to calculate this density, omitting any correlation between the monomers.
[13]:
def visualize_ensemble(M, C, ax=None):
x, y = np.meshgrid(np.linspace(0, 10, 100), np.linspace(0, 10, 100))
density = np.zeros_like(x)
for i, m in enumerate(M):
std = np.sqrt(C[i, i])
pdf = stats.norm(loc=m[0], scale=std).pdf(x)
pdf *= stats.norm(loc=m[1], scale=std).pdf(y)
density += pdf
if ax is None:
ax = plt.gca()
ax.pcolormesh(x, y, density, shading='nearest')
# ax.plot(M[:, 0], M[:, 1], marker='o', color='tab:red')
ax.axis('square')
ax.axis('off')
visualize_ensemble(M, C)
plt.show()
Now let’s look at the collapse of this chain, using the propagate() function:
[14]:
# Remove the tethers
model.add_tether(mon=0, point=(0, 0), k_rel=-1)
model.add_tether(mon=-1, point=(10, 10), k_rel=-1)
# Run the calculation
t = np.array([0, 1, 2, 5, 10, 20])
ensembles = [[M, C]]
for dt in np.diff(t):
M, C = model.propagate(*ensembles[-1], dt=dt)
ensembles.append([M, C])
# Plot
fig, axs = plt.subplots(1, len(t), figsize=[3*len(t), 3])
for t_cur, ens, ax in zip(t, ensembles, axs):
visualize_ensemble(*ens, ax=ax)
ax.set_title(f't = {t_cur}')
plt.show()
The same remarks about cached computations as mentioned below the examples for evolution of single trajectories apply: try to avoid changing the time step, especially for bigger models.
Note the return values of steady_state(): while the mean conformation M, carries the correct number of spatial dimensions (it is an N x d array) we only get an N x N covariance matrix, instead of the (Nd) x (Nd) matrix that we should get in full generality for N monomers in d spatial dimensions. This is technically a simplification that this implementation of the Rouse model makes, but a very fair one at that: under the Rouse model, the spatial dimensions fully
decouple; as long as they are uncorrelated in the initial conditions, they will remain independent going forward. You can, however, use separate covariance matrices C for the different spatial dimensions. In this case, use propagate_C() and propagate_M() to propagate mean and covariance separately. (The evolution equations for M and C are independent; propagate() is just a convenience function
calling both propagate_M() and propagate_C() in succession).
Mean squared displacement (MSD)
The Rouse model allows us to calculate MSD curves analytically. For example, the MSD of the center monomer of a short chain:
[15]:
# Setup
model = rouse.Model(N=21)
w = np.zeros(model.N)
w[model.N//2] = 1
dts = np.logspace(-1, 3, 100)
# Calculation
msd = model.MSD(dts, w)
# Plotting (including scaling laws)
plt.loglog(dts, msd, color='k', linewidth=2, label='analytical MSD', zorder=5)
dt = np.logspace(-1, 0, 10); plt.plot(dt, 2*model.d*model.D*dt, label='free monomers')
dt = np.logspace(-1, 3, 10); plt.plot(dt, model.Gamma()*np.sqrt(dt), label='Rouse scaling')
dt = np.logspace(1, 3, 10); plt.plot(dt, 2*model.d*(model.D/model.N)*dt, label='whole coil diffusion')
plt.legend()
plt.xlabel('lag time')
plt.ylabel('MSD')
plt.title('MSD of center monomer of 21 monomer chain')
plt.show()
Note the “measurement vector” w, indicating which monomer we are looking at. Generally, MSD() returns the MSD for an observable \(w\cdot x\); so we can for example also look at the MSD of the vector between two monomers in the chain:
[16]:
# Setup
model = rouse.Model(N=61)
w = np.zeros(model.N)
w[20] = -1
w[40] = 1
dts = np.logspace(-1, 4, 100)
L = np.diff(np.nonzero(w))[0] # number of bonds between the two labelled monomers
# Calculation
msd = model.MSD(dts, w)
# Plotting (including scaling laws)
plt.loglog(dts, msd, color='k', linewidth=2, label='analytical MSD', zorder=5)
dt = np.logspace(-1, 0, 10); plt.plot(dt, 4*model.d*model.D*dt, label='free monomers')
dt = np.logspace(-1, 3, 10); plt.plot(dt, 2*model.Gamma()*np.sqrt(dt), label='Rouse scaling')
plt.hlines(model.MSD(np.inf, w), 1e2, 1e4, color='C2', label='steady state')
plt.legend()
plt.xlabel('lag time')
plt.ylabel('MSD')
plt.title('2-point MSD of monomers (20, 40) in chain of 61 monomers')
plt.show()
In this example, note the use of MSD(np.inf) to calculate the saturation value of the MSD. Also, compare the above plot to rouse.twoLocusMSD(), which evaluates the analytical expression for the 2-point MSD of two loci on an infinite & continuous Rouse chain.
Contact frequency maps
From the steady state distribution, we can also estimate what a contact frequency map à la Hi-C would look like:
[17]:
model = rouse.Model(N=100)
F0 = model.contact_frequency()
# for comparison: insert a loop
model.add_bonds([(33, 66)])
F1 = model.contact_frequency()
# Plot
fig, axs = plt.subplots(1, 2, figsize=[10, 5])
for F, ax, title in zip([F0, F1],
axs,
['free chain', 'fixed loop'],
):
ax.imshow(np.log10(F),
cmap='inferno_r',
vmin=-3, vmax=0,
)
ax.axis('off')
ax.set_title(title)
plt.show()