Normalizing Flows (NFs) are generative models that keep every layer invertible. They learn a mapping $f: x \rightarrow z$ that pushes complex data $x$ into a simple base distribution $p(z)$ (usually a standard Gaussian), and the inverse $f^{-1}$ serves as a sampler. Because $f$ is invertible, we get an exact likelihood via the change-of-variables formula:

$$\log p(x) = \log p(z) + \log \left|\det \frac{\partial f(x)}{\partial x}\right|, \quad z = f(x).$$

That determinant term is the “cost of twisting space,” so the art of NFs is designing layers where the Jacobian determinant is cheap to compute but the transformation is expressive.
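
To make the formula concrete, here is a minimal sketch (not from the post; names are illustrative) that evaluates $\log p(x)$ for an elementwise affine flow $z = x \cdot e^{s} + t$. Its Jacobian is diagonal, so the log-determinant is just the sum of the log-scales $s$:

```python
import math
import torch

def log_prob_standard_normal(z):
    # log density of N(0, I), summed over dimensions
    return (-0.5 * z ** 2 - 0.5 * math.log(2 * math.pi)).sum(dim=-1)

def flow_log_likelihood(x, s, t):
    z = x * torch.exp(s) + t        # forward map f(x), elementwise affine
    log_det = s.sum(dim=-1)         # log|det df/dx| for a diagonal Jacobian
    return log_prob_standard_normal(z) + log_det

x = torch.randn(4, 2)                    # a batch of 2-D points
s, t = torch.zeros(2), torch.zeros(2)    # identity flow: recovers the base density exactly
print(flow_log_likelihood(x, s, t))
```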

A quick timeline (beginner-friendly)

  • NICE (2015): Additive coupling layers whose Jacobian determinant is exactly 1, so the log-determinant term costs nothing and likelihoods are trivial to compute.
  • RealNVP (2016): Affine couplings and a multi-scale structure; greater expressiveness while keeping determinants easy (see the coupling-layer sketch after this list).
  • Glow (2018): Invertible $1 \times 1$ convolutions to mix channels, leading to sharper image synthesis.
  • Beyond 2018: Spline flows, continuous-time flows (FFJORD), and flows for audio, text, and 3D.
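
The coupling idea is easiest to see in code. Below is a minimal PyTorch sketch of a RealNVP-style affine coupling layer (an illustration under assumed names, not any paper's reference implementation): half of the dimensions pass through untouched and condition a scale and shift for the other half, so the Jacobian is block-triangular, the log-determinant is a simple sum, and the inverse is available in closed form.

```python
import torch
import torch.nn as nn

class AffineCoupling(nn.Module):
    """RealNVP-style coupling: transform half the dims conditioned on the other half."""
    def __init__(self, dim, hidden=64):
        super().__init__()
        self.half = dim // 2
        # small network predicting log-scale (s) and shift (t) from the untouched half
        self.net = nn.Sequential(
            nn.Linear(self.half, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * (dim - self.half)),
        )

    def forward(self, x):
        x1, x2 = x[:, :self.half], x[:, self.half:]
        s, t = self.net(x1).chunk(2, dim=-1)
        s = torch.tanh(s)                  # keep scales bounded for numerical stability
        z2 = x2 * torch.exp(s) + t         # only the second half is transformed
        log_det = s.sum(dim=-1)            # block-triangular Jacobian: log|det| = sum(s)
        return torch.cat([x1, z2], dim=-1), log_det

    def inverse(self, z):
        z1, z2 = z[:, :self.half], z[:, self.half:]
        s, t = self.net(z1).chunk(2, dim=-1)
        s = torch.tanh(s)
        x2 = (z2 - t) * torch.exp(-s)      # exact inverse, no iterative solve needed
        return torch.cat([z1, x2], dim=-1)
```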

Why use flows?

  • Exact likelihoods: Unlike GANs, you can compute $\log p(x)$ directly, which gives a principled maximum-likelihood training objective and a direct way to evaluate and compare models.
  • Two-way mapping: The same model handles inference (data $\rightarrow$ latent) and sampling (latent $\rightarrow$ data).
  • Modularity: Coupling layers, permutations, and invertible convolutions can be stacked like Lego bricks; a stacking-and-training sketch follows this list.
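
Here is an illustrative sketch of that modularity and of training with the exact likelihood (assumed names; `AffineCoupling` refers to the coupling sketch above, and `Permute` is a fixed, volume-preserving shuffle whose log-determinant is zero):

```python
import math
import torch
import torch.nn as nn

class Permute(nn.Module):
    """Fixed feature permutation; volume-preserving, so its log|det| is 0."""
    def __init__(self, dim):
        super().__init__()
        self.register_buffer("perm", torch.randperm(dim))

    def forward(self, x):
        return x[:, self.perm], x.new_zeros(x.shape[0])

class Flow(nn.Module):
    """A stack of invertible layers; the per-layer log-dets simply add up."""
    def __init__(self, layers):
        super().__init__()
        self.layers = nn.ModuleList(layers)

    def log_prob(self, x):
        log_det = x.new_zeros(x.shape[0])
        for layer in self.layers:
            x, ld = layer(x)
            log_det = log_det + ld
        base = (-0.5 * x ** 2 - 0.5 * math.log(2 * math.pi)).sum(dim=-1)  # N(0, I) base
        return base + log_det

# one training step: minimize the exact negative log-likelihood
flow = Flow([AffineCoupling(2), Permute(2), AffineCoupling(2)])
opt = torch.optim.Adam(flow.parameters(), lr=1e-3)
batch = torch.randn(128, 2)            # stand-in for a real data batch
loss = -flow.log_prob(batch).mean()
loss.backward()
opt.step()
```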

Design knobs and open challenges

  • Base distribution: Typically $\mathcal{N}(0, I)$; richer priors can encode domain bias.
  • Coupling transform: Additive/affine vs. spline-based vs. neural ODE flows—trading off flexibility and compute.
  • Dimensionality handling: Squeezing, channel mixing, and multi-scale splits balance detail and efficiency (a squeeze sketch follows this list).
  • Scaling to long sequences and video: Determinants and inverses must stay cheap as dimensionality grows.
  • Hybrid models: Combining flows with diffusion or autoregressive priors to get both likelihoods and strong samples.
  • Physics- and geometry-aware flows: Injecting structure for scientific and 3D domains.
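
For the "squeezing" mentioned above, here is a small illustrative sketch (assumed shapes, not any library's API) of the reshape used in multi-scale image flows: each 2x2 spatial block is folded into the channel dimension, which is trivially invertible and lets later layers mix information at a coarser scale.

```python
import torch

def squeeze(x):
    # x: (batch, channels, height, width) with even height and width
    b, c, h, w = x.shape
    x = x.reshape(b, c, h // 2, 2, w // 2, 2)
    x = x.permute(0, 1, 3, 5, 2, 4)            # move the 2x2 factors next to channels
    return x.reshape(b, c * 4, h // 2, w // 2)

def unsqueeze(z):
    b, c, h, w = z.shape
    z = z.reshape(b, c // 4, 2, 2, h, w)
    z = z.permute(0, 1, 4, 2, 5, 3)            # put the 2x2 factors back into space
    return z.reshape(b, c // 4, h * 2, w * 2)

x = torch.randn(1, 3, 4, 4)
assert torch.equal(unsqueeze(squeeze(x)), x)   # invertible by construction, log|det| = 0
```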

My angle (BiFlow)

In Bidirectional Normalizing Flow (BiFlow), we push for better forward–inverse symmetry: keeping invertibility while improving expressiveness in both directions. The goal is to retain exact likelihoods and efficient sampling, but make the learned map robust for high-dimensional data and downstream tasks. This direction aims to make flows practical for modern generative workloads rather than only small or toy datasets.
