What I've been working on. Lip-sync, video dubbing, and talking head generation, from research to production at scale.
Mirage Studio
AI-powered video creation studio enabling users to generate cinematic-quality videos from text prompts with advanced motion and style control.
Lipdub
Video dubbing engine that translates and lip-syncs videos into 28+ languages. Led development of the core lip-synchronization pipeline.
AI Creator (3D Avatar)
The world's first 3D avatar designed for content creation. Generates photorealistic talking-head videos from text with natural lip-sync, head movement, and emotional expression in 30+ languages.
AI Twin
Digital clone technology that creates a virtual version of the user from a short recording. Generates talking-head videos from text in 29 languages with AI voice cloning and natural expressions.
Ausynclab AI
Voice cloning technology achieving best local Vietnamese voice quality. Reached 100k+ users within months of launch.
Dizim AI
Virtual presenter platform serving 200k+ users. Top 10 at Techfest 2022. Generates AI-driven talking head videos from text and slides.
Onlinica / OnliCV
First AI-powered online education platform in Vietnam. OnliCV connects your professional network. Led 8-person ML team building the core AI features.
VisibilityPro
Computer vision for retail automation. AI-powered shelf monitoring and product recognition that reduced manual review by 70%. Silver Winner SAP SME SEEDx 2020.
Face Detection & e-KYC
Facial recognition system for e-KYC and access control across Banking, Retail, and Security sectors. Real-time face detection and verification pipeline.
Japanese OCR Engine
OCR engine specialized for complex Japanese Kanji/Kana character recognition. Built for document digitization workflows serving Japanese enterprise clients.
Wav2Lip 288x288
Open Source
High-resolution Wav2Lip implementation at 288x288 with complete training pipeline. Based on my thesis "Talking Face via Audio Driven".
Split Mean Flow
Open Source
Unofficial implementation of Split Mean Flow from ByteDance. One-step generative modeling using flow matching techniques for efficient sampling.
Learnable Speech
Open Source
Text-to-speech with a learnable audio encoder that does not require alignment with a reference transcript. Novel approach to speech synthesis using learned representations.
DAC-VAE: Descript Audio Codec VAE
Open Source
Variational Autoencoder variant of Descript Audio Codec for high-fidelity audio compression. Enables efficient audio encoding for generative models.
EditYourself – How It Works
An interactive guide to EditYourself's audio-driven talking head pipeline: selective masking, audio conditioning, sliding window denoising, and multi-scale refinement. Read article →
Mirage – Seeing Voices
How Mirage generates complete A-roll video from audio: asymmetric self-attention, learned RoPE, flow matching, and the data pipeline behind it. Read article →
LipDub – How It Works
An interactive guide to LipDub's two-stage talking face pipeline: a Landmark VAE and Diffusion Transformer with flow matching for audio-driven landmark generation, then photo-realistic rendering with SPADE alignment and AdaIN audio injection. Read article →
AI Creator – 3D Talking Avatar
How to build a 3D talking avatar from a single video: audio-visual lip sync, blendshape expressions, head pose optimization, hash-grid NeRF rendering, portrait restoration, and relighting from reference images. Read article →
Transformer + RoPE: Full Pipeline
Step-by-step walkthrough of the full Transformer pipeline with Rotary Position Embeddings, from raw text to output, with actual numbers and visualizations. Read article →
Optimizer Evolution – SGD to Muon
Why each optimizer was invented, from SGD's zig-zagging to Adam's adaptive moments to Muon's orthogonalization. Interactive training simulations show how each optimizer navigates a loss surface differently. Read article →
2025 – Present
Joined as Senior Research Engineer, contributing to research on EditYourself and multi-modal video synthesis. Building core product demos and technical vision to support the next fundraising milestone. Role: Senior Research Engineer
Jul 2024
Captions AI – Series C
Built core demos and technical presentations showcasing Lipdub and Mirage capabilities for the fundraising process. Led by Index Ventures, with Kleiner Perkins, a16z, and Sequoia returning. Role: Senior Member of Technical Staff
Jun 2023
Captions AI – Series B
Developed product demos and pitch materials for the Lipdub video dubbing engine. Led by Kleiner Perkins, with a16z and Sequoia participating. Role: Senior Member of Technical Staff
2022
VTC Academy / Onlinica – Fundraise
Led ML team building AI-powered e-learning features including virtual lecturers and voice cloning for the Onlinica platform, supporting the $20M fundraise for VTC Academy's digital education ecosystem. Role: Lead Machine Learning
2022
Dizim AI – Techfest 2022 & Pre-Seed
Co-founded and built the core AI engine, demos, and pitch video for Techfest Vietnam 2022 (Top 10 nationally). Secured pre-seed funding from Antler, a global early-stage VC. Role: Co-Founder & Head of AI
2024
Ausynclab AI – Microsoft for Startups
Advised on voice cloning technology and product development. Ausynclab was selected into the Microsoft for Startups program, reaching 100K+ users within months of launch. Role: Technical Advisor