Sam Nhut Nguyen

What I've been working on.

Lip-sync, video dubbing, and talking head generation — from research to production at scale.

🎬

EditYourself — How It Works

Interactive · Flow Matching · Realtime · Parallel · Audio-to-Video · Image-to-Video · Text-to-Video

An interactive guide to EditYourself's audio-driven talking head pipeline — selective masking, audio conditioning, sliding window denoising, and multi-scale refinement.

Read article →

🎙️

Mirage — Seeing Voices

Interactive · Flow Matching · Audio-to-Video

How Mirage generates complete A-roll video from audio — asymmetric self-attention, learned RoPE, flow matching, and the data pipeline behind it.

Read article →

👄

LipDub — How It Works

Interactive · Landmarks · GAN · Audio-to-Video · Identity

An interactive guide to LipDub's two-stage talking face pipeline — a Landmark VAE and Diffusion Transformer with flow matching for audio-driven landmark generation, then photo-realistic rendering with SPADE alignment and AdaIN audio injection.

Read article →

🧑‍🎤

AI Creator — 3D Talking Avatar

Interactive · NeRF · 3D Avatar · Audio-to-Video · Blendshapes · Relighting

How to build a 3D talking avatar from a single video — audio-visual lip sync, blendshape expressions, head pose optimization, hash-grid NeRF rendering, portrait restoration, and relighting from reference images.

Read article →

🔄

Transformer + RoPE: Full Pipeline

Interactive · Transformer

Step-by-step walkthrough of the full Transformer pipeline with Rotary Position Embeddings — from raw text to output, with actual numbers and visualizations.

Read article →

📉

Optimizer Evolution — SGD to Muon

Interactive · Optimization · SGD · Adam · Muon

Why each optimizer was invented — from SGD's zig-zagging to Adam's adaptive moments to Muon's orthogonalization. Interactive training simulations show how each optimizer navigates a loss surface differently.

Read article →

2025 — Present

Pipio AI — Building Next Round (In Progress)

$6.75M raised to date · 4,000+ companies · 100K+ videos

Joined as Senior Research Engineer, contributing to research on EditYourself and multi-modal video synthesis. Building core product demos and technical vision to support the next fundraising milestone.

Role: Senior Research Engineer

Jul 2024

Captions AI — Series C

$60M raised · $500M valuation

Built core demos and technical presentations showcasing LipDub and Mirage capabilities for the fundraising process. The round was led by Index Ventures, with Kleiner Perkins, a16z, and Sequoia returning.

Role: Senior Member of Technical Staff

Jun 2023

Captions AI — Series B

$25M raised · $250M valuation

Developed product demos and pitch materials for the LipDub video dubbing engine. The round was led by Kleiner Perkins, with a16z and Sequoia participating.

Role: Senior Member of Technical Staff

2022

VTC Academy / Onlinica — Fundraise

$20M raised

Led the ML team building AI-powered e-learning features, including virtual lecturers and voice cloning, for the Onlinica platform, in support of the $20M fundraise for VTC Academy's digital education ecosystem.

Role: Lead Machine Learning

2022

Dizim AI — Techfest 2022 & Pre-Seed

Top 10 Techfest · $110K from Antler

Co-founded and built the core AI engine, demos, and pitch video for Techfest Vietnam 2022 (Top 10 nationally). Secured pre-seed funding from Antler, a global early-stage VC.

Role: Co-Founder & Head of AI

2024

Ausynclab AI — Microsoft for Startups

Up to $150K in Azure credits

Advised on voice cloning technology and product development. Ausynclab was selected for the Microsoft for Startups program and reached 100K+ users within months of launch.

Role: Technical Advisor
