Resume

My experience, education, and professional skills

Do Hoang Vu

AI Engineer

Hoang Van Thu Street

Ho Chi Minh City, Phu Nhuan District

Vietnam

Education

The Saigon International University

Bachelor of Artificial Intelligence

Expected graduation: 2027

GPA: 3.62/4.00

Certifications

Machine Learning - Deep Learning Foundation

cole.vn

Exploratory Data Analysis for Machine Learning

IBM - Coursera

Machine Learning Specialization

Stanford - Coursera

Languages

English: Intermediate
Vietnamese: Native

Projects & Competitions

VIZQUEST: ENHANCED VIDEO EVENT RETRIEVAL USING FUSION AND TEMPORAL MODELING

Co-authored a research paper accepted at SOICT24, introducing a novel framework combining spatio-temporal attention with hierarchical feature fusion to optimize long-range video event detection.

SciHorizone - IELTS Exam Generator

AI-powered application that converts scientific papers (PDF) into IELTS/TOEIC reading comprehension exams using Google Gemini AI. Features intelligent PDF processing, automated question generation, and interactive exam interface.

  • Google Gemini AI integration for intelligent question generation
  • Advanced PDF processing with docling-serve and fallback methods
  • Support for IELTS (band 4.0-9.0) and TOEIC (400-900 points) formats
  • Interactive exam interface with timer and navigation system
  • Automated grading with detailed explanations and analysis
  • Multiple question types: Multiple Choice, True/False/Not Given, Matching, Fill in the Blanks
  • Professional exam simulation with responsive design
  • Real-time PDF extraction from uploads or URLs
  • Comprehensive result analysis with improvement recommendations
  • Production deployment with Docker containerization

Deeplearning-Practice

A comprehensive collection of deep learning implementations coded from scratch, covering everything from computer vision to natural language processing. This repository showcases high-quality implementations of fundamental and advanced deep learning algorithms.

  • Computer Vision models including CNN architectures and ResNet implementations
  • Natural Language Processing models including LSTM with attention mechanisms
  • Regression models with detailed logging for housing price prediction
  • Sentiment analysis on IMDb reviews with deep learning approaches
  • Classification models for Vietnamese news articles
  • All implementations feature clean, well-documented code with detailed explanations

EzClip

A desktop application for downloading videos from online platforms including YouTube, Facebook, and TikTok. Built with Electron.js, it uses yt-dlp for wide format support.

  • Support for multiple platforms including YouTube, Facebook, TikTok
  • Download videos in various formats and resolutions
  • Simple and intuitive user interface with modern design
  • Offline functionality - no server required
  • Built with Electron.js for cross-platform compatibility

Decision Tree Visualization

A web application for building and visualizing decision trees from CSV data. Features include customizing model parameters, interactive visualization of decision trees, and performance metrics calculation.

  • Built with Python, FastAPI, and scikit-learn for the backend
  • Interactive UI with HTML, CSS, JavaScript, and TailwindCSS
  • Supports custom model parameters (max depth, min samples split, criterion)
  • Visualizes decision trees as hierarchical structures
  • Calculates and displays model evaluation metrics
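
A minimal sketch of the kind of backend step described above: fitting a scikit-learn decision tree from CSV text with user-chosen parameters, rendering it as a hierarchical text structure, and computing a metric. The function name `train_tree_from_csv`, the toy CSV, and the choice of accuracy on the training rows are assumptions for illustration, not the project's actual code.

```python
import csv
import io

from sklearn.metrics import accuracy_score
from sklearn.tree import DecisionTreeClassifier, export_text

# Hypothetical toy CSV; the last column is assumed to be the class label.
CSV_TEXT = """height,weight,label
150,50,0
160,55,0
155,52,0
180,85,1
175,80,1
185,90,1
"""


def train_tree_from_csv(csv_text, max_depth=3, min_samples_split=2, criterion="gini"):
    """Fit a decision tree on CSV text and return (model, tree_text, accuracy)."""
    rows = [row for row in csv.reader(io.StringIO(csv_text)) if row]
    header, data = rows[0], rows[1:]
    features = [[float(value) for value in row[:-1]] for row in data]
    labels = [int(row[-1]) for row in data]

    # The three parameters mirror the ones listed in the project description.
    model = DecisionTreeClassifier(
        max_depth=max_depth,
        min_samples_split=min_samples_split,
        criterion=criterion,
        random_state=0,
    )
    model.fit(features, labels)

    # export_text renders the fitted tree as an indented, hierarchical outline.
    tree_text = export_text(model, feature_names=header[:-1])

    # Accuracy on the training rows only; a real app would use a held-out split.
    accuracy = accuracy_score(labels, model.predict(features))
    return model, tree_text, accuracy


if __name__ == "__main__":
    _, outline, score = train_tree_from_csv(CSV_TEXT, max_depth=2, criterion="entropy")
    print(outline)
    print(f"training accuracy: {score:.2f}")
```

In a FastAPI backend, a function like this could sit behind an upload endpoint, with the three parameters taken from the request form.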

PlugAgentX: Adaptive Plugin-Based LLM Agent Framework

Developing a plugin-based framework for LLM agents, enabling dynamic integration of external tools and services. Features include plugin management, context management, and adaptive agent behavior based on task requirements.

  • Plugin management system for adding, removing, and updating external tools
  • Context management to track and update agent state
  • Adaptive agent behavior based on task requirements
  • Integration with external APIs and services
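
The plugin and context management ideas above can be sketched roughly as follows. All names here (`PluginRegistry`, `Agent`, `run`) are hypothetical and chosen for illustration; this is a simplified outline under those assumptions, not PlugAgentX's actual design.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Tool = Callable[[str], str]


@dataclass
class PluginRegistry:
    """Holds named tools that can be added, updated, or removed at runtime."""

    _plugins: Dict[str, Tool] = field(default_factory=dict)

    def add(self, name: str, tool: Tool) -> None:
        if name in self._plugins:
            raise ValueError(f"plugin already registered: {name}")
        self._plugins[name] = tool

    def update(self, name: str, tool: Tool) -> None:
        if name not in self._plugins:
            raise KeyError(name)
        self._plugins[name] = tool

    def remove(self, name: str) -> None:
        del self._plugins[name]  # raises KeyError if the plugin is unknown

    def call(self, name: str, payload: str) -> str:
        return self._plugins[name](payload)

    def names(self) -> List[str]:
        return sorted(self._plugins)


@dataclass
class Agent:
    """Calls tools through the registry and records each call in its context."""

    registry: PluginRegistry
    context: List[Tuple[str, str, str]] = field(default_factory=list)

    def run(self, tool_name: str, payload: str) -> str:
        result = self.registry.call(tool_name, payload)
        # Context management: keep a history of (tool, input, output).
        self.context.append((tool_name, payload, result))
        return result


if __name__ == "__main__":
    registry = PluginRegistry()
    registry.add("shout", str.upper)
    agent = Agent(registry)
    print(agent.run("shout", "hello"))
```

Keeping the registry separate from the agent means tools can be swapped while an agent is running, and the recorded context gives later steps something to condition on.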

ZALO AI CHALLENGE 2023

Built deep generative models for symbolic and audio-based music generation using Transformer-based architectures. Fine-tuned temporal coherence and structural consistency to align with evaluation metrics in generative audio tasks.

AIC24 COMPETITION

Developed scalable video understanding pipelines for event retrieval, leveraging contrastive learning and multimodal embeddings to enhance temporal-semantic alignment in untrimmed video datasets.

  • Integrated multi-head self-attention, temporal convolutional networks, and cross-modal fusion to raise mAP and reduce retrieval latency on benchmark datasets.

Technical Skills

Programming

Python
JavaScript
SQL
Bash
C/C++

Libraries/Frameworks

PyTorch
LangChain
FastAPI
Transformers
Node.js
Next.js

AI Expertise

LLMs
Multi-Agent Systems
Neuro-Symbolic AI
Graph-RAG
Computer Vision

Tools

Git
Docker
Jenkins
Cloudflared
VS Code

Soft Skills & Interests

Soft Skills

Analytical & Problem-solving
Effective collaboration
Personable communication
Presentation
Time management
Responsible AI Usage

Interests

Generative AI
LLMs
Multi-Agent Systems
Voice Transformation
Mathematics & Physics