Today I'm recreating the learner framework from the FastAI course. It's a flexible and quite powerful abstraction around the optimization of a DNN model, which streamlines the user experience. For example, it makes it very easy to add different logging capabilities, a learning rate finder, etc. It is built during the …
Other articles
A naive autoencoder on FashionMNIST
Today we'll recreate the fastai notebook on autoencoders, where we train a vanilla autoencoder on FashionMNIST. Even though the autoencoder actually does a pretty bad job, it will be good practice for working with HuggingFace datasets, CNNs and autoencoders.
Getting the data
import datasets
from torch.utils.data import …
Building up PyTorch abstractions: Part 1
Today we will retrace the notebook from lessons 13-14 that "builds up" PyTorch abstractions from scratch. As a first step we'll rederive everything in hardcore NumPy (maybe hardcore should be reserved for C). Then we'll start building the abstractions.
First up we load mnist data:
from pathlib import Path
from …
Musings on the reparametrization trick
Reading the variational autoencoder chapter from the "Understanding Deep Learning" book (which is available for free!). Not trivial, which is why I never got around to learning it, I guess. There are a lot of moving math parts to figure out. One of them is called "the reparametrization trick". So …
Debugging session: Logseq Omnivore plugin
I'm trying to debug a weird issue with the Logseq Omnivore plugin, where it takes forever to sync and seemingly creates and deletes pages needlessly.
My first step was to properly set up a dev env (pnpm dev), which didn't work out of the box, instead of just building …
RNN generations
On advice from my uncle, I'm continuing to fall back on task difficulty with RNNs.
Unc's tips:
- Switch to generation task
- Try residuals
- Go deeper
- Add projections
- No dropout?
Let's recreate Karpathy's classic post and train a language model on tiny-shakespeare. We can get the entire dataset which is a text …
Recreating Stable Diffusion's Pipeline
Today I'm going to recreate the pipeline shown in lesson 10 of the fast.ai course. We'll go through what's needed at a high level, using pretrained models for everything. The pipeline is fed a text prompt and produces an image. A prompt means we need a tokenizer to …
Shifting to translation with RNNs
I'm pivoting the RNN summarization code to an easier example: machine translation. Easier in the sense of the dataset, which consists of much shorter en-de sentence pairs compared to the summarization task. I have some suspicion that there is a bug or something in my code, so today, after …
CNN summarization task
Today we're gonna dip our fingers into the first generative NLP task - text summarization. We're gonna use the CNN/Daily Mail dataset as done in this paper. Let's get to it.
Data prep
I started by doing all the preprocessing of the files myself, but then found the dataset …