Duc (Daniel) Ngo

Currently at TORILAB · Hanoi, Vietnam

Hi, I'm Duc (Daniel) Ngo.

I build analytics systems, BI dashboards, and AI-powered data workflows that help teams measure performance, understand users, and make better decisions.

Where I've Worked

March 2025 – Present · Senior Data / Business Intelligence Analyst @ TORILAB
February 2024 – March 2025 · Team Leader, AI R&D @ VNDIRECT
July 2022 – November 2023 · Marketing Data Analyst @ In4mation Insights

TORILAB · March 2025 – Present

At TORILAB I support three AI-powered apps with data analytics, BI dashboards, and product intelligence.

Twomi

AI personalization, born from a data insight

Twomi exists because of data. Through bi-weekly business reporting, I identified a rising trend in AI Companion apps and personalization features across the competitive market. I surfaced this signal to senior stakeholders, and the insight directly triggered the decision to build and launch Twomi as TORILAB's newest AI personalization app.

My contribution

Discovered the AI Companion and personalization trend through competitive market analysis in bi-weekly reports. Presented the findings to the C-suite, which informed the product decision to create Twomi. I now track Twomi's growth and engagement post-launch.

Key outcomes

  • Trend discovery → product launch
  • 15% marketing ROI uplift
  • 50% DAU increase

Selected Case Studies

TORILAB · March 2025 – Present · Case study 1

Building BI Dashboards and LookML Models on GCP

Reduced report preparation time by 80% through Looker and SQL automation.

Context

Business teams across Soulriza, AI Avatar, and Twomi needed faster, more reliable reporting on marketing, revenue, and product performance. Reporting was slow, manual, and inconsistent, making it hard for stakeholders to trust the data or act quickly.

My Role

I led the development of 15+ Looker dashboards and LookML models on GCP, designed the semantic data model, wrote complex SQL queries, and built Python scripts to automate core BI workflows across all three products.

Approach

I audited existing reporting workflows to find the biggest time sinks, then modeled data in LookML to create a reusable, consistent semantic layer. Python automation handled repetitive data transformation and delivery tasks.
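The automation pattern can be sketched in a few lines of Python. This is a minimal illustration, not the production code: the table, project, and column names are hypothetical. The idea is that one shared query template per metric keeps definitions identical across products, so report preparation becomes a parameter sweep rather than hand-edited SQL.

```python
from string import Template

# One template per metric: every product's report uses the same definition
# of DAU, so numbers stay consistent across dashboards. Names are illustrative.
DAU_QUERY = Template(
    "SELECT event_date, COUNT(DISTINCT user_id) AS dau\n"
    "FROM `$project.analytics.events`\n"
    "WHERE app = '$app' AND event_date BETWEEN '$start' AND '$end'\n"
    "GROUP BY event_date"
)

def build_queries(apps, start, end, project="my-gcp-project"):
    """Generate one per-product query from the shared template."""
    return {
        app: DAU_QUERY.substitute(project=project, app=app, start=start, end=end)
        for app in apps
    }

queries = build_queries(["Soulriza", "AI Avatar", "Twomi"],
                        "2025-03-01", "2025-03-14")
```

The same approach generalizes: swap the template for any metric, loop over products and date ranges, and the "report prep" step is fully scripted.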

Impact

Report preparation time dropped by 80%. Business and product teams now access self-serve dashboards instead of waiting for manual reports. The LookML semantic layer ensures consistent metric definitions across all dashboards.

Key Metrics

  • 15+ dashboards
  • 80% time reduction
  • Self-serve reporting

Tools

Looker · LookML · GCP · SQL · Python · BigQuery
TORILAB · March 2025 – Present · Case study 2

Bi-Weekly Business Reporting that Launched a New Product

Discovered the AI Companion and personalization trend that led to the creation and launch of Twomi.

Context

Senior stakeholders needed clear, recurring visibility into marketing performance, revenue, product growth, and competitive AI market trends, all in one place, presented bi-weekly. No structured framework existed for this.

My Role

I created and owned the bi-weekly business reporting cycle, including data preparation, market trend analysis, visualization, and direct presentation to C-suite and senior stakeholders.

Approach

I connected marketing spend data, product engagement metrics, and revenue figures into a unified reporting framework. Each report told a clear story: what changed, why, and what to do next. I layered in competitive AI market analysis to identify emerging product opportunities.

Impact

The recurring reporting cadence contributed to a 15% uplift in marketing ROI and a 50% increase in Daily Active Users. Through the bi-weekly reports, I identified the emerging AI Companion and personalization trend, a discovery that directly informed the decision to create and launch Twomi.

Key Metrics

  • 15% marketing ROI uplift
  • 50% DAU increase
  • Led to Twomi launch

Tools

Looker · SQL · Python · GCP · Business Reporting · Market Analysis
TORILAB · March 2025 – Present · Case study 3

Designing Event Tracking for 60+ Product Features

Improved feature-adoption visibility by 40% across Soulriza, AI Avatar, and Twomi.

Context

Product and engineering teams had limited visibility into which features users were actually adopting across all three products. Without tracking, it was impossible to measure feature impact or prioritize roadmap decisions.

My Role

I championed end-to-end event tracking across 60+ product features, from defining key metrics with the design team to aligning instrumentation with engineering and validating the data flowing into LookML dashboards.

Approach

I created a tracking taxonomy, aligned with design and engineering on instrumentation standards, wrote validation queries in GCP, and built LookML dashboards to surface adoption trends for product managers.
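The validation idea behind the taxonomy can be sketched in Python. The event and field names below are illustrative, not the real taxonomy: each event name maps to a required schema, and any event that doesn't match is flagged before it reaches a dashboard.

```python
# Illustrative taxonomy: event name -> required fields.
TAXONOMY = {
    "feature_opened": {"user_id", "feature_name", "app"},
    "feature_completed": {"user_id", "feature_name", "app", "duration_ms"},
}

def validate_event(event):
    """Check one raw event against the taxonomy; return (ok, message)."""
    name = event.get("event_name")
    required = TAXONOMY.get(name)
    if required is None:
        return False, f"unknown event: {name}"
    missing = required - event.keys()
    if missing:
        return False, f"missing fields: {sorted(missing)}"
    return True, "ok"

ok, msg = validate_event({"event_name": "feature_opened",
                          "user_id": "u1",
                          "feature_name": "avatar_editor",
                          "app": "AI Avatar"})
```

In practice the same checks ran as SQL against the raw event tables, but the contract is identical: every event either matches its declared schema or gets surfaced as an instrumentation bug.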

Impact

Feature-adoption visibility improved by 40% across all three apps. Product managers now have real-time dashboards to measure engagement for every shipped feature and use that data to inform roadmap prioritization.

Key Metrics

  • 60+ features tracked
  • 40% visibility boost
  • Real-time adoption dashboards

Tools

Product Analytics · GCP · LookML · Looker · SQL · BigQuery
VNDIRECT · February 2024 – March 2025 · Case study 4

Building a Social Listening System for Market Intelligence

Monitored retail investor sentiment at scale to give VNDIRECT customers a broader view of the market.

Context

VNDIRECT customers relied on price data and news headlines for investment decisions. Social forums like F319 contained valuable retail sentiment signals that were impossible to read at scale. The team needed a system to surface this signal reliably.

My Role

I designed and led the Social Listening project from data collection through model deployment, building the crawling infrastructure, NLP pipeline, and customer-facing sentiment product.

Approach

I built a web scraping pipeline using Selenium and BeautifulSoup to collect and clean forum discussion data from Vietnamese financial communities. I then applied an LLM-based classifier to extract and aggregate sentiment signals, evaluated performance against labeled benchmarks, and tuned the pipeline for production.
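The aggregation step downstream of the classifier can be sketched like this. It's a simplified Python illustration (the production scoring was more involved): per-post sentiment labels from the LLM roll up into a net daily score in [-1, 1].

```python
from collections import Counter

def daily_sentiment(labels):
    """Roll per-post labels ('positive' | 'negative' | 'neutral')
    up into a net daily sentiment score in [-1, 1]."""
    counts = Counter(labels)
    net = counts["positive"] - counts["negative"]
    total = sum(counts.values())
    return net / total if total else 0.0

# One day's worth of classified forum posts (made-up labels).
score = daily_sentiment(["positive", "positive", "negative", "neutral"])
# → 0.25
```

Normalizing by total post volume matters on forums: a raw positive-minus-negative count would conflate "more bullish" with "more active", while the ratio stays comparable across quiet and busy trading days.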

Impact

The model reached 90% accuracy on financial sentiment classification and was integrated into VNDIRECT's analytics product, giving customers access to aggregated market sentiment alongside traditional price data.

Key Metrics

  • 90% accuracy
  • Forum sentiment at scale
  • Production deployment

Tools

Python · LLM · Selenium · BeautifulSoup · NLP · Sentiment Analysis
VNDIRECT · February 2024 – March 2025 · Case study 5

Leading AI R&D for PDF Information Extraction

Achieved 95% accuracy and 90% faster extraction with a 5-person AI team.

Context

VNDIRECT processes large volumes of financial PDF documents. Manual extraction was slow, error-prone, and couldn't scale. The team needed an automated solution that could reliably detect and extract structured text from complex financial PDFs.

My Role

I led a team of 5 AI engineers, set the technical direction, coordinated model development, and ensured delivery on time. I also contributed directly to model architecture decisions and evaluation.

Approach

We used PaddleOCR for word detection combined with PyMuPDF for document parsing. I structured the project in phases: baseline evaluation, model fine-tuning on financial documents, post-processing for structured output, and production integration.
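One representative post-processing problem is turning OCR word boxes back into reading-order lines. Here is a minimal Python sketch of that step (coordinates, tolerance, and sample text are illustrative, not the production logic): words whose vertical positions fall within a tolerance are grouped into one line, then sorted left to right.

```python
def group_into_lines(words, y_tol=5):
    """Group OCR word boxes (text, x, y) into reading-order lines.

    Words within y_tol pixels of a line's anchor y join that line;
    each line is then emitted left-to-right."""
    lines = []  # each entry: [anchor_y, [(x, text), ...]]
    for text, x, y in sorted(words, key=lambda w: (w[2], w[1])):
        if lines and abs(y - lines[-1][0]) <= y_tol:
            lines[-1][1].append((x, text))
        else:
            lines.append([y, [(x, text)]])
    return [" ".join(t for _, t in sorted(ws)) for _, ws in lines]

# Made-up boxes from a two-row financial table.
words = [("Revenue", 10, 100), ("1,250", 80, 102),
         ("Net", 10, 130), ("340", 80, 131)]
print(group_into_lines(words))  # → ['Revenue 1,250', 'Net 340']
```

Real financial PDFs need more than this (multi-column layouts, rotated pages, merged cells), which is where the fine-tuning and post-processing phases spent most of their effort.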

Impact

The model achieved 95% accuracy, extraction speed improved by 90%, and error rates dropped by 25%. The team delivered on schedule, and the model went into production use.

Key Metrics

  • 95% accuracy
  • 90% faster extraction
  • 25% error reduction

Tools

Python · PaddleOCR · PyMuPDF · OCR · AI Model Development
In4mation Insights · July 2022 – November 2023 · Case study 6

Forecasting Transaction Trends for 10,000+ Stores

90% forecast accuracy across 1M+ historical transactions to support a loyalty payment program.

Context

A client was evaluating the costs and ROI of a new loyalty payment program. They needed reliable transaction forecasts for 10,000+ store locations to model the program's financial impact before launch.

My Role

I built the forecasting workflow end-to-end: data cleaning, EDA, feature engineering, model development, and stakeholder presentation.

Approach

I analyzed 1M+ historical transactions from 2020 to 2022 using Python and SQL, performed extensive EDA to identify seasonal patterns and store-level trends, then built an interactive Python script that produced store-level transaction forecasts.
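To give a flavor of the store-level setup, here is a seasonal-naive baseline in Python. This is only a sketch of the workflow's shape, not the model that reached 90% accuracy: forecast the next few weeks by repeating the same weeks from last season.

```python
def seasonal_naive(history, season=52, horizon=4):
    """Baseline weekly forecast: repeat last season's values.

    history: chronological weekly transaction counts for one store.
    Returns the next `horizon` forecast values."""
    if len(history) < season:
        raise ValueError("need at least one full season of history")
    last_season = history[-season:]
    return [last_season[i % season] for i in range(horizon)]

# Two years of made-up weekly counts for one store.
history = list(range(104))
fc = seasonal_naive(history, season=52, horizon=4)
# → [52, 53, 54, 55]
```

A baseline like this is useful even when a richer model wins: running it per store across 10,000+ locations gives a floor that any feature-engineered model has to beat before its forecasts feed a cost analysis.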

Impact

The model achieved 90% forecast accuracy. The cost analysis output directly informed the client's decision-making strategy around the loyalty program rollout.

Key Metrics

  • 90% forecast accuracy
  • 10,000+ stores
  • 1M+ transactions analyzed

Tools

Python · SQL · EDA · Predictive Modeling · Stakeholder Reporting

Where I've Studied

Macalester College

Data Science Major, Computer Science Minor

Saint Paul, Minnesota, United States

August 2018 – May 2022
  • GPA 3.9 / 4.0
  • Best In-Show & Best Insight, 2022 DataFest Competition
  • #1, May 2021 Kaggle Tabular Playground

Hanoi Amsterdam High School for the Gifted

High School Diploma, Natural Science Track

Hanoi, Vietnam

August 2011 – May 2018
  • One of Vietnam's top selective high schools
  • Natural Science specialization

Personal Projects

Writing, music, and academic projects built outside of work.

February 2026 – Present

Writing

Still Here, Duc (Substack)

I write about analytics, learning, career, and things I'm figuring out as I go. A place for longer reflections that don't fit in a dashboard or a slide deck.


2023 – Present

Music

Piano Recordings (YouTube)

I play piano and sometimes record pieces I'm working through. A completely different kind of practice from data work.


March 2022 – May 2022

Python · Finance · Machine Learning

Stock Picking with Machine Learning

Built a financial dataset for S&P 500 companies from 1999 to 2021 by scraping Yahoo Finance. Evaluated Lasso, Random Forest, and Stacking models based on backtested returns. The top 20 picks returned 47.35% in 2021, beating the S&P 500 by 21%.
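The backtest step can be sketched as an equal-weight return over the model's top-k picks. The tickers and numbers below are made up for illustration; the project's real evaluation ran this over held-out years of the scraped dataset.

```python
def topk_return(predicted, realized, k=20):
    """Equal-weight realized return of the k highest-scored tickers.

    predicted: ticker -> model score; realized: ticker -> annual return."""
    picks = sorted(predicted, key=predicted.get, reverse=True)[:k]
    return sum(realized[t] for t in picks) / len(picks)

# Made-up scores and returns for four tickers.
predicted = {"AAA": 0.9, "BBB": 0.7, "CCC": 0.2, "DDD": 0.5}
realized = {"AAA": 0.30, "BBB": 0.10, "CCC": -0.05, "DDD": 0.15}
r = topk_return(predicted, realized, k=2)
# → 0.2
```

Comparing this number against the benchmark index return for the same period is what produced the "beat the S&P 500" figure; ranking models by backtested top-k return (rather than raw prediction error) is what made Lasso, Random Forest, and Stacking directly comparable.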

Python · Lasso · Random Forest · Stacking · Selenium

March 2022 – May 2022

R · Sports Analytics

Mapping Spatial Patterns in Soccer

Used the Wyscout dataset covering five major European leagues in 2018. Applied linear mixed-effects models and created 100+ animated graphs with ggsoccer and gganimate to reveal stylistic differences between teams and leagues.

R · ggsoccer · gganimate · Wyscout

Want to see more?

Still curious? Check out all of my projects: machine learning, finance, sports analytics, geospatial work, and more.