# Quantum Circuit Born Machine

Reference: Jin-Guo Liu and Lei Wang (2018), *Differentiable Learning of Quantum Circuit Born Machine*, arXiv:1804.04168

```
using Yao, Yao.Blocks
using LinearAlgebra
```

## Training target

A Gaussian distribution:

```
function gaussian_pdf(x, μ::Real, σ::Real)
    pl = @. 1 / sqrt(2pi * σ^2) * exp(-(x - μ)^2 / (2 * σ^2))
    pl / sum(pl)
end
pg = gaussian_pdf(1:1<<6, 1<<5-0.5, 1<<4);
```
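As a quick check (illustrative, not part of the original code), `pg` is a normalized distribution over the $2^6 = 64$ computational-basis outcomes:

```
sum(pg) ≈ 1.0   # true: gaussian_pdf normalizes by sum(pl)
length(pg)      # 64 outcomes, one per basis state of 6 qubits
```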

This distribution is a discretized Gaussian centered at $\mu = 31.5$ with $\sigma = 16$, normalized over the $64$ bins.

## Build Circuits

### Building Blocks

Gates are grouped into layers of a circuit; a layer can be an **arbitrary rotation** layer or a **CNOT entangler**. These are the basic building blocks of our **Born machine**.

#### Arbitrary Rotation

An arbitrary rotation is built from a **rotation gate on Z**, a **rotation gate on X** and a **rotation gate on Z**: $Rz(\theta) \cdot Rx(\theta) \cdot Rz(\theta)$.

Since our input is the $|0\dots 0\rangle$ state, the first layer of arbitrary rotations can just use $Rx(\theta) \cdot Rz(\theta)$, because the leading $Rz$ acting on $|0\rangle$ only contributes a global phase. Likewise, the last layer can just use $Rz(\theta) \cdot Rx(\theta)$, because a trailing $Rz$ does not change measurement probabilities in the computational basis.
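As a minimal illustration of the first simplification (assuming a single-qubit register), applying `Rz` to $|0\rangle$ leaves the measurement probabilities untouched:

```
# Rz is diagonal, so on |0⟩ it only contributes a phase factor
zero_state(1) |> Rz(0.5) |> probs   # ≈ [1.0, 0.0]
```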

In **幺** (Yao), every Hilbert-space operator is a **block** type; this includes all **quantum gates** and **quantum oracles**. In general, operators appearing in a quantum circuit can be divided into **Composite Blocks** and **Primitive Blocks**.

We follow the low abstraction principle: each block represents a specific approach to calculation. The simplest **Composite Block** is a **Chain Block**, which chains other blocks (oracles) with the same number of qubits together; it is simply the mathematical composition of operators of the same size. For example, we can construct an arbitrary rotation block by chaining $Rz$, $Rx$, $Rz$ together:

`chain(Rz(0), Rx(0), Rz(0))`

```
Total: 1, DataType: Complex{Float64}
chain
├─ Rot Z gate: 0.0
├─ Rot X gate: 0.0
└─ Rot Z gate: 0.0
```
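The angles of the chained rotations can be assigned later; as a small sketch (using `dispatch!` and `parameters`, which appear again later in this tutorial):

```
g = chain(Rz(0), Rx(0), Rz(0))
dispatch!(g, [0.1, 0.2, 0.3])   # assign the three rotation angles at once
parameters(g)                   # returns [0.1, 0.2, 0.3]
```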

`Rx`, `Ry` and `Rz` construct new rotation gates; they are just shorthands for `rot(X, 0.0)`, etc.

Then, let's chain them up

```
layer(nbit::Int, x::Symbol) = layer(nbit, Val(x))
layer(nbit::Int, ::Val{:first}) = chain(nbit, put(i=>chain(Rx(0), Rz(0))) for i = 1:nbit);
```

Here, we do not need to feed the first `nbit` parameter into `put`. All factory methods can **lazily** evaluate **their first argument**, which is the number of qubits: they return a lambda that takes a single integer input, and the desired block is only constructed once all the information is filled in. When you have given all the information somewhere in the declaration, 幺 is able to infer the rest. We will now define the remaining rotation layers.

```
layer(nbit::Int, ::Val{:last}) = chain(nbit, put(i=>chain(Rz(0), Rx(0))) for i = 1:nbit)
layer(nbit::Int, ::Val{:mid}) = chain(nbit, put(i=>chain(Rz(0), Rx(0), Rz(0))) for i = 1:nbit);
```
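As noted above, factory methods are lazy in the qubit count; a quick illustration of what `put` returns with and without `nbit`:

```
g = put(2=>X)   # lazy: returns a lambda awaiting the total number of qubits
g(4)            # equivalent to put(4, 2=>X), a concrete 4-qubit block
```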

#### CNOT Entangler

Another component of the quantum circuit Born machine is a layer of **CNOT** operators applied to different pairs of qubits, which entangles them.

`entangler(pairs) = chain(control([ctrl, ], target=>X) for (ctrl, target) in pairs);`

We can then define such a Born machine:

```
function build_circuit(n::Int, nlayer::Int, pairs)
    circuit = chain(n)
    push!(circuit, layer(n, :first))
    for i = 1:(nlayer - 1)
        push!(circuit, cache(entangler(pairs)))
        push!(circuit, layer(n, :mid))
    end
    push!(circuit, cache(entangler(pairs)))
    push!(circuit, layer(n, :last))
    circuit
end;
```

We use the method `cache` here to tag the entangler block so that it is cached after its first run; this is worthwhile because the entangler is a constant oracle. Let's see what will be constructed:

```
julia> build_circuit(4, 1, [1=>2, 2=>3, 3=>4]) |> autodiff(:QC)
Total: 4, DataType: Complex{Float64}
chain
├─ chain
│  ├─ put on (1)
│  │  └─ chain
│  │     ├─ [̂∂] Rot X gate: 0.0
│  │     └─ [̂∂] Rot Z gate: 0.0
│  ├─ put on (2)
│  │  └─ chain
│  │     ├─ [̂∂] Rot X gate: 0.0
│  │     └─ [̂∂] Rot Z gate: 0.0
│  ├─ put on (3)
│  │  └─ chain
│  │     ├─ [̂∂] Rot X gate: 0.0
│  │     └─ [̂∂] Rot Z gate: 0.0
│  └─ put on (4)
│     └─ chain
│        ├─ [̂∂] Rot X gate: 0.0
│        └─ [̂∂] Rot Z gate: 0.0
├─ [↺] chain
│  ├─ control(1)
│  │  └─ (2,)=>X gate
│  ├─ control(2)
│  │  └─ (3,)=>X gate
│  └─ control(3)
│     └─ (4,)=>X gate
└─ chain
   ├─ put on (1)
   │  └─ chain
   │     ├─ [̂∂] Rot Z gate: 0.0
   │     └─ [̂∂] Rot X gate: 0.0
   ├─ put on (2)
   │  └─ chain
   │     ├─ [̂∂] Rot Z gate: 0.0
   │     └─ [̂∂] Rot X gate: 0.0
   ├─ put on (3)
   │  └─ chain
   │     ├─ [̂∂] Rot Z gate: 0.0
   │     └─ [̂∂] Rot X gate: 0.0
   └─ put on (4)
      └─ chain
         ├─ [̂∂] Rot Z gate: 0.0
         └─ [̂∂] Rot X gate: 0.0
```

`RotationGate`s inside this circuit are automatically marked with [̂∂], which means the parameters inside are differentiable. `autodiff` has two modes: `autodiff(:QC)` performs quantum differentiation with simulation complexity $O(M^2)$ ($M$ is the number of parameters), while `autodiff(:BP)` performs classical backpropagation with simulation complexity $O(M)$.

Let's define a circuit to use later

```
circuit = build_circuit(6, 10, [1=>2, 3=>4, 5=>6, 2=>3, 4=>5, 6=>1]) |> autodiff(:QC)
dispatch!(circuit, :random);
```

Here, the function `autodiff(:QC)` automatically marks the rotation gates in the circuit as differentiable.

## MMD Loss & Gradients

The MMD (maximum mean discrepancy) loss is

$$\mathcal{L} = \sum_{x, y} \left(p(x) - \pi(x)\right) K(x, y) \left(p(y) - \pi(y)\right),$$

where $p$ is the circuit's output distribution, $\pi$ is the target distribution and $K$ is a kernel. We will use a squared exponential (RBF) kernel here, $K(x, y) = \exp\left(-\frac{(x - y)^2}{2\sigma}\right)$.

```
struct RBFKernel
    sigma::Float64
    matrix::Matrix{Float64}
end

"""get kernel matrix"""
kmat(mbf::RBFKernel) = mbf.matrix

"""statistic functional for kernel matrix"""
kernel_expect(kernel::RBFKernel, px::Vector, py::Vector=px) = px' * kmat(kernel) * py;
```
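To see what `kernel_expect` computes, here is a tiny illustrative example: with an all-ones kernel matrix, the quadratic form reduces to `sum(px) * sum(py)`:

```
k = RBFKernel(1.0, ones(2, 2))
kernel_expect(k, [0.5, 0.5], [0.3, 0.7])   # = 1.0
```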

Now let's define the RBF kernel matrix used in calculation

```
function rbf_kernel(basis, σ::Real)
    dx2 = (basis .- basis').^2
    RBFKernel(σ, exp.(-1/2σ * dx2))
end
kernel = rbf_kernel(0:1<<6-1, 0.25);
```

Next, we build a QCBM setup, which is a combination of the `circuit`, the `kernel` and the target probability distribution `ptrain`. Its loss function is the MMD loss, which is 0 if and only if the output probability of the circuit matches `ptrain` exactly.

```
struct QCBM{BT<:AbstractBlock}
    circuit::BT
    kernel::RBFKernel
    ptrain::Vector{Float64}
end

"""get the wave function"""
psi(qcbm::QCBM) = zero_state(qcbm.circuit |> nqubits) |> qcbm.circuit

"""extract the probability distribution directly"""
Yao.probs(qcbm::QCBM) = qcbm |> psi |> probs

"""the loss function"""
function mmd_loss(qcbm, p=qcbm |> probs)
    p = p - qcbm.ptrain
    kernel_expect(qcbm.kernel, p, p)
end;
```

The problem setup:

`qcbm = QCBM(circuit, kernel, pg);`
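With the setup in place, the initial loss can be evaluated directly (the exact value depends on the random parameters dispatched above):

```
mmd_loss(qcbm)   # a positive number for the randomly initialized circuit
```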

### Gradients

Expanding $\mathcal{L} = p^T K p - 2\, p^T K \pi + \pi^T K \pi$ and noting that the last term is a constant, the gradient of the MMD loss with respect to the circuit parameters is $\partial_\theta \mathcal{L} = \partial_\theta (p^T K p) - 2\, \partial_\theta (p^T K \pi)$, which we evaluate with `statdiff`:

```
function mmdgrad(qcbm::QCBM, dbs; p0::Vector)
    statdiff(()->probs(qcbm) |> as_weights, dbs, StatFunctional(kmat(qcbm.kernel)), initial=p0 |> as_weights) -
        2*statdiff(()->probs(qcbm) |> as_weights, dbs, StatFunctional(kmat(qcbm.kernel)*qcbm.ptrain))
end;
end;
```

## Optimizer

We will use the Adam optimizer. Since we don't want to require another package just for this, the following optimizer code is copied from Knet.jl.

Reference: Kingma, D. P., & Ba, J. L. (2015). Adam: a Method for Stochastic Optimization. International Conference on Learning Representations, 1–13.

```
mutable struct Adam
    lr::AbstractFloat
    gclip::AbstractFloat
    beta1::AbstractFloat
    beta2::AbstractFloat
    eps::AbstractFloat
    t::Int
    fstm
    scndm
end

Adam(; lr=0.001, gclip=0, beta1=0.9, beta2=0.999, eps=1e-8) = Adam(lr, gclip, beta1, beta2, eps, 0, nothing, nothing)

function update!(w, g, p::Adam)
    gclip!(g, p.gclip)
    if p.fstm === nothing; p.fstm = zero(w); p.scndm = zero(w); end
    p.t += 1
    # update biased first and second moment estimates
    lmul!(p.beta1, p.fstm)
    BLAS.axpy!(1-p.beta1, g, p.fstm)
    lmul!(p.beta2, p.scndm)
    BLAS.axpy!(1-p.beta2, g .* g, p.scndm)
    # bias correction
    fstm_corrected = p.fstm / (1 - p.beta1 ^ p.t)
    scndm_corrected = p.scndm / (1 - p.beta2 ^ p.t)
    # parameter update
    BLAS.axpy!(-p.lr, @.(fstm_corrected / (sqrt(scndm_corrected) + p.eps)), w)
end

function gclip!(g, gclip)
    if gclip == 0
        g
    else
        gnorm = norm(g)   # `vecnorm` was renamed to `norm` in Julia 1.0
        gnorm <= gclip ? g : rmul!(g, gclip/gnorm)
    end
end

optim = Adam(lr=0.1);
```
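As a quick sanity check of the optimizer (an illustrative toy problem, not part of the original tutorial): minimizing $f(w) = \|w\|^2/2$, whose gradient is $w$ itself, should drive `w` towards zero.

```
w = [1.0, -2.0]
opt = Adam(lr=0.1)
for _ in 1:200
    update!(w, copy(w), opt)   # the gradient of ‖w‖²/2 is w itself
end
w   # entries should now be close to zero
```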

## Start Training

We define an iterator called `QCBMOptimizer`. We want to realize an interface like

```
for x in qo
# runtime result analysis
end
```

Although this design makes the code a bit more complicated, it pays off during runtime analysis, for example when keeping track of the loss.

```
struct QCBMOptimizer
    qcbm::QCBM
    optimizer
    dbs
    params::Vector
    QCBMOptimizer(qcbm::QCBM, optimizer) = new(qcbm, optimizer, collect(qcbm.circuit, AbstractDiff), parameters(qcbm.circuit))
end
```

When constructing a `QCBMOptimizer` instance, we collect all differentiable units into a sequence `dbs` for future use.

**Iterator interface.** To support iteration, `Base.iterate` should be implemented:

```
function Base.iterate(qo::QCBMOptimizer, state::Int=1)
    p0 = qo.qcbm |> probs
    grad = mmdgrad.(Ref(qo.qcbm), qo.dbs, p0=p0)
    update!(qo.params, grad, qo.optimizer)
    dispatch!(qo.qcbm.circuit, qo.params)
    (p0, state+1)
end
```

In each iteration, the iterator returns the probability distribution generated at the current step. During each step, we broadcast the `mmdgrad` function over `dbs` to obtain all gradients; to prevent the QCBM instance itself from being broadcast over, we wrap it in `Ref`. Note that `iterate` never returns `nothing`, so the iterator is infinite and the training loop must `break` explicitly. Training the quantum circuit is then simple: just iterate through the steps.

```
history = Float64[]
for (k, p) in enumerate(QCBMOptimizer(qcbm, optim))
    curr_loss = mmd_loss(qcbm, p)
    push!(history, curr_loss)
    k % 5 == 0 && println("k = ", k, " loss = ", curr_loss)
    k >= 50 && break
end
```

```
k = 5 loss = 0.00847749636271937
k = 10 loss = 0.0022075107632702835
k = 15 loss = 0.0013664488262862158
k = 20 loss = 0.0008221894171723352
k = 25 loss = 0.0004509647774503069
k = 30 loss = 0.0002799245762106908
k = 35 loss = 0.0001433244922560841
k = 40 loss = 9.266614925835051e-5
k = 45 loss = 4.383088287793888e-5
k = 50 loss = 3.559024250813189e-5
```

The training history (MMD loss versus step) and the learnt distribution can now be inspected from `history` and `qcbm |> probs`.
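For instance, a minimal plotting sketch (assuming `Plots.jl` is installed; not part of the original code):

```
using Plots
plot(history; yscale=:log10, xlabel="step", ylabel="MMD loss", legend=false)
bar(0:1<<6-1, hcat(qcbm |> probs, pg); label=["learnt" "target"], alpha=0.7)
```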

*This page was generated using Literate.jl.*