machine learning
resnet
residual block (skip connection)
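A minimal sketch of a residual block with a skip connection (channel counts and layer choices are illustrative, in the same PyTorch style as the CLIP code below):

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    # a minimal residual block; dimensions are illustrative
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1),
        )
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        # skip connection: add the input back onto the transformed features,
        # so the block only has to learn a residual on top of identity
        return self.relu(self.body(x) + x)
```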
transformer
attention
self attention
transformer encoder
cross attention
transformer decoder
causal attention
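A minimal sketch tying these together: scaled dot-product attention, where self-attention uses q, k, v all from one sequence, cross-attention takes k, v from another sequence (e.g. the encoder output), and causal attention masks out future positions (names and shapes are illustrative):

```python
import math
import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v, causal: bool = False):
    # q: (b, tq, d); k, v: (b, tk, d)
    scores = q @ k.mT / math.sqrt(q.shape[-1])  # (b, tq, tk)
    if causal:
        # causal attention: position i may only attend to positions <= i
        tq, tk = scores.shape[-2], scores.shape[-1]
        mask = torch.ones(tq, tk, dtype=torch.bool, device=scores.device).triu(1)
        scores = scores.masked_fill(mask, float("-inf"))
    return F.softmax(scores, dim=-1) @ v

x = torch.randn(2, 5, 16)  # one sequence
y = torch.randn(2, 7, 16)  # another sequence (e.g. encoder output)
self_attn   = scaled_dot_product_attention(x, x, x)               # self-attention
cross_attn  = scaled_dot_product_attention(x, y, y)               # cross-attention
causal_attn = scaled_dot_product_attention(x, x, x, causal=True)  # causal attention
```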
CLIP
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch import Tensor

class Clip(nn.Module):
    def __init__(self, motion_dim=75, music_dim=438, feature_dim=256):
        super().__init__()
        # MotionEncoder / MusicEncoder are assumed to be defined elsewhere in these notes
        self.motion_encoder = MotionEncoder(input_channels=motion_dim, feature_dim=feature_dim)
        self.music_encoder = MusicEncoder(input_channels=music_dim, feature_dim=feature_dim)
        self.motion_project = nn.Linear(feature_dim, feature_dim)
        self.music_project = nn.Linear(feature_dim, feature_dim)
        # learnable scale applied to the similarity logits
        self.temperature = nn.Parameter(torch.tensor(1.0))
        self.criterion = nn.CrossEntropyLoss()

    def forward(self, motion: Tensor, music: Tensor):
        # motion/music: (batch, seq_len, channels); the sequences must be time-aligned
        assert motion.shape[1] == music.shape[1]
        b, s, c = motion.shape
        motion_features = self.motion_encoder(motion)
        music_features = self.music_encoder(music)
        # project into the shared space and L2-normalize, so dot products are cosine similarities
        motion_features = F.normalize(self.motion_project(motion_features), p=2, dim=-1)
        music_features = F.normalize(self.music_project(music_features), p=2, dim=-1)
        # earlier variant: relation = (motion_features @ music_features.T) * (1.0 / math.sqrt(c))
        # batch matrix multiplication; .mT transposes the last two dims
        logits = torch.bmm(motion_features, music_features.mT) * self.temperature  # (b, s, s)
        # matching pairs lie on the diagonal, so the target class for position i is i
        labels = torch.arange(s, device=motion.device).repeat(b, 1)
        # symmetric cross-entropy over both axes of the similarity matrix
        loss_motion = self.criterion(logits, labels)
        loss_music = self.criterion(logits.mT, labels)
        loss = (loss_motion + loss_music) / 2
        return (motion_features, music_features), (loss, loss_motion, loss_music)
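Note on the temperature: the released CLIP implementation learns this scale in log space and applies `logits * logit_scale.exp()` (with clamping), which keeps the scale positive; a raw learnable scalar like `temperature` above can in principle drift negative.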
CAN (GAN-based)
AE

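A minimal autoencoder sketch (dimensions and names are illustrative): compress the input into a bottleneck, then reconstruct from it.

```python
import torch.nn as nn

class AutoEncoder(nn.Module):
    def __init__(self, input_dim=784, latent_dim=32):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(input_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
        self.decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, input_dim))

    def forward(self, x):
        z = self.encoder(x)        # compress to the bottleneck
        return self.decoder(z), z  # reconstruct from the bottleneck
```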
VAE

reparameterization trick
Assuming the output distribution is Gaussian, the model only predicts the mean ($\mu$) and std ($\sigma$). We then sample the latent variable as $z = \mu + \sigma \odot \epsilon$ with $\epsilon \sim \mathcal{N}(0, I)$, so the randomness lives in $\epsilon$ and gradients can flow through $\mu$ and $\sigma$. The predicted latent distribution should match the prior $p(z) = \mathcal{N}(0, I)$, which is enforced using the KL divergence.
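A minimal sketch of the trick (names are illustrative; predicting log-variance instead of std is a common numerical convenience):

```python
import torch

def reparameterize(mu: torch.Tensor, log_var: torch.Tensor) -> torch.Tensor:
    # z = mu + sigma * eps with eps ~ N(0, I); sampling is moved into eps,
    # so gradients can flow through mu and sigma back to the encoder
    sigma = torch.exp(0.5 * log_var)  # log-variance keeps sigma > 0
    eps = torch.randn_like(sigma)
    return mu + sigma * eps
```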
VAE loss
Evidence Lower Bound (ELBO)
We want to replace the true posterior $p(z|x)$ with an approximate posterior $q(z|x)$, since we don't have the ground truth of $p(z|x)$.
Based on different conditions, the KL divergence can be simplified into different terms (the identity after this list underlies all three routes):
- Variational Inference
- Importance Sampling to ELBO
- Variational EM
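For reference, this is the standard identity behind all three derivations (not specific to these notes):

$$
\log p(x) = \underbrace{\mathbb{E}_{q(z|x)}\!\left[\log \frac{p(x, z)}{q(z|x)}\right]}_{\mathrm{ELBO}} + \mathrm{KL}\big(q(z|x) \,\|\, p(z|x)\big)
$$

Since $\mathrm{KL} \ge 0$, the first term lower-bounds $\log p(x)$; maximizing it over $q$ tightens the bound, and it expands into the familiar VAE loss terms $\mathbb{E}_{q(z|x)}[\log p(x|z)] - \mathrm{KL}\big(q(z|x) \,\|\, p(z)\big)$ (reconstruction plus KL).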
refs
VQ-VAE (dVAE)
Variational Autoencoder (Kingma & Welling, 2014)
quantise bottleneck

- randomly initialize the centroids (codebook vectors)
- find the nearest centroid for each unquantised vector (the encoder output)
- if a centroid has low usage, randomly re-initialize it as a new centroid
- compute the average of the unquantised vectors assigned to each centroid
- apply an Exponential Moving Average (EMA) update between the new centroid and the old centroid (see the sketch after this list)
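A minimal sketch of the nearest-centroid assignment plus the EMA codebook update described above (decay, shapes, and names are illustrative; the low-usage re-init step is omitted for brevity):

```python
import torch
import torch.nn.functional as F

@torch.no_grad()
def ema_codebook_update(z_e, codebook, cluster_size, ema_embed, decay=0.99, eps=1e-5):
    # z_e: (n, d) unquantised encoder outputs; codebook: (k, d) centroids
    # cluster_size: (k,) EMA usage counts; ema_embed: (k, d) EMA of assigned vector sums
    dist = torch.cdist(z_e, codebook)                 # (n, k) pairwise L2 distances
    idx = dist.argmin(dim=1)                          # nearest centroid per vector
    onehot = F.one_hot(idx, codebook.shape[0]).to(z_e.dtype)  # (n, k) assignments
    # EMA of per-code usage counts and of the summed assigned vectors
    cluster_size.mul_(decay).add_(onehot.sum(0), alpha=1 - decay)
    ema_embed.mul_(decay).add_(onehot.T @ z_e, alpha=1 - decay)
    # new centroid = EMA average of its assigned vectors (eps avoids division by zero)
    codebook.copy_(ema_embed / (cluster_size.unsqueeze(1) + eps))
    return codebook[idx], idx                         # quantised vectors + code indices
```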
loss
- VQ loss: the L2 error between the embedding space and the encoder outputs, $\|\mathrm{sg}[z_e(x)] - e\|_2^2$.
- Commitment loss: a measure to encourage the encoder output to stay close to the embedding space and to prevent it from fluctuating too frequently from one code vector to another, $\beta \|z_e(x) - \mathrm{sg}[e]\|_2^2$.
- where $\mathrm{sg}[\cdot]$ is the stop-gradient operator.
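A minimal sketch of the two loss terms plus the straight-through estimator that lets gradients bypass the non-differentiable quantisation ($\beta$ value is illustrative; with EMA codebook updates, the VQ loss term is typically dropped):

```python
import torch.nn.functional as F

def vq_losses(z_e, z_q, beta=0.25):
    # z_e: encoder output; z_q: its nearest codebook vector (same shape)
    vq_loss = F.mse_loss(z_q, z_e.detach())                  # pull codes toward encoder outputs
    commitment_loss = beta * F.mse_loss(z_e, z_q.detach())   # pull encoder toward codes
    # straight-through estimator: forward pass uses z_q, gradient flows to z_e
    z_q_st = z_e + (z_q - z_e).detach()
    return z_q_st, vq_loss + commitment_loss
```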
CVAE
reinforcement learning
actor-critic
origin
- $s_t$: state at time $t$
- $V(s_t)$: value function; predicts the expected reward given $s_t$
- $\pi(a_t \mid s_t)$: action (policy) function; returns action probabilities based on $s_t$
- top1: select the action with the max probability
- $R$: reward function; takes in a sequence of actions and outputs a float
- advantage value $A_t$: positive if the action $a_t$ can get more reward than the value function predicts, negative otherwise
with Temporal Difference error (TD-error)
- Hope $V(s_t)$ can predict the reward that may be obtained in the future, discounted by a proportion $\gamma$: $\delta_t = r_t + \gamma V(s_{t+1}) - V(s_t)$
- note: you should add stop-gradient to $V(s_{t+1})$ (aka detach in PyTorch); see the sketch below
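A minimal sketch of the TD-error actor-critic losses described above (all tensor names and shapes are illustrative, not from these notes):

```python
import torch.nn.functional as F

def actor_critic_losses(policy_logits, value_t, value_tp1, action, reward, gamma=0.99):
    # policy_logits: (b, num_actions); value_t, value_tp1, reward: (b,); action: (b,) long
    # TD target bootstraps from the next state's value; detach() is the
    # stop-gradient noted above, so the critic regresses toward a fixed target
    td_target = reward + gamma * value_tp1.detach()
    td_error = td_target - value_t                 # also serves as the advantage estimate
    critic_loss = td_error.pow(2).mean()
    # actor: raise the log-probability of actions with positive advantage;
    # detach the advantage so actor gradients do not flow into the critic
    log_prob = F.log_softmax(policy_logits, dim=-1).gather(1, action.unsqueeze(1)).squeeze(1)
    actor_loss = -(log_prob * td_error.detach()).mean()
    return actor_loss, critic_loss
```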