Do Hoang Vu
AI Engineer
Hoang Van Thu Street
Phu Nhuan District, Ho Chi Minh City
Vietnam
Education
The Saigon International University
Bachelor of Artificial Intelligence
Expected graduation: 2027
GPA: 3.62/4.00
Certifications
Machine Learning - Deep Learning Foundation
cole.vn
Exploratory Data Analysis for Machine Learning
IBM - Coursera
Machine Learning Specialization
Stanford - Coursera
Languages
Projects & Competitions
VIZQUEST: ENHANCED VIDEO EVENT RETRIEVAL USING FUSION AND TEMPORAL MODELING
Co-authored a research paper accepted at SOICT24, introducing a novel framework combining spatio-temporal attention with hierarchical feature fusion to optimize long-range video event detection.
SciHorizone - IELTS Exam Generator
AI-powered application that converts scientific papers (PDF) into IELTS/TOEIC reading comprehension exams using Google Gemini AI. Features include PDF processing, automated question generation, and an interactive exam interface.
- Google Gemini AI integration for question generation
- PDF processing with docling-serve and fallback methods
- Support for IELTS (band 4.0-9.0) and TOEIC (400-900 points) formats
- Interactive exam interface with timer and navigation system
- Automated grading with detailed explanations and analysis
- Multiple question types: Multiple Choice, True/False/Not Given, Matching, Fill in the Blanks
- Exam simulation with responsive design
- Real-time PDF extraction from uploads or URLs
- Result analysis with improvement recommendations
- Production deployment with Docker containerization
Deeplearning-Practice
A collection of deep learning implementations coded from scratch, covering fundamental and advanced algorithms from computer vision to natural language processing.
- Computer vision models including CNN architectures and ResNet implementations
- Natural language processing models including LSTM with attention mechanisms
- Regression models with detailed logging for housing price prediction
- Sentiment analysis on IMDb reviews with deep learning approaches
- Classification models for Vietnamese news articles
- All implementations feature clean, well-documented code with detailed explanations
EzClip
A desktop application for downloading videos from online platforms including YouTube, Facebook, and TikTok. Built with Electron.js; uses yt-dlp for wide format support.
- Support for multiple platforms including YouTube, Facebook, and TikTok
- Download videos in various formats and resolutions
- Simple and intuitive user interface with modern design
- Runs entirely on the local machine - no backend server required
- Built with Electron.js for cross-platform compatibility
Decision Tree Visualization
A web application for building and visualizing decision trees from CSV data. Features include customizable model parameters, interactive tree visualization, and performance metric calculation.
- Built with Python, FastAPI, and scikit-learn for the backend
- Interactive UI with HTML, CSS, JavaScript, and TailwindCSS
- Supports custom model parameters (max depth, min samples split, criterion)
- Visualizes decision trees as hierarchical structures
- Calculates and displays model evaluation metrics
PlugAgentX: Adaptive Plugin-Based LLM Agent Framework
Developing a plugin-based framework for LLM agents, enabling dynamic integration of external tools and services. Features include plugin management, context management, and adaptive agent behavior based on task requirements.
- Plugin management system for adding, removing, and updating external tools
- Context management to track and update agent state
- Adaptive agent behavior based on task requirements
- Integration with external APIs and services
ZALO AI CHALLENGE 2023
Built deep generative models for symbolic and audio-based music generation using Transformer-based architectures. Tuned the models for temporal coherence and structural consistency to align with the evaluation metrics for generative audio tasks.
AIC24 COMPETITION
Developed scalable video understanding pipelines for event retrieval, leveraging contrastive learning and multimodal embeddings to enhance temporal-semantic alignment in untrimmed video datasets.
- Integrated multi-head self-attention, temporal convolutional networks, and cross-modal fusion to improve mAP and reduce retrieval latency on benchmark datasets.