
© 2026 Charlie Cheng. Built with ChengAI.


Projects

A collection of my work in AI, full-stack development, and more.


ChengAI

AI-Powered Personal Website

• Built an AI portfolio enabling recruiters to explore candidate background via RAG chat & automated JD matching. • Developed hybrid retrieval combining pgvector similarity search & full-text search with sub-2.5s p95 latency. • Implemented JD parsing engine that extracts requirements, maps to skill graph, and generates analysis reports. • Engineered streaming chat via SSE with 3 modes: Auto, Tech Deep Dive, and Behavioral (STAR). • Built knowledge ingestion pipeline for PDF/DOCX/TXT with smart chunking & batch embedding. • Designed Admin CMS with CRUD, CSRF protection, analytics dashboard; deployed at chengai-tianle.ai-builders.space
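A minimal sketch of the hybrid-retrieval idea described above: fuse a vector-similarity ranking (as pgvector would return) with a full-text-search ranking by normalizing each score set and mixing them. The function name, the `alpha` weight, and the document IDs are illustrative, not taken from the project.

```python
# Sketch of hybrid retrieval: blend vector-similarity scores with
# full-text-search scores after min-max normalization, so the two
# signals live on a comparable 0..1 scale before mixing.

def hybrid_rank(vector_hits, fts_hits, alpha=0.5):
    """vector_hits / fts_hits: {doc_id: raw score}. Returns doc ids
    sorted by the weighted blend of the two normalized scores."""
    def normalize(scores):
        if not scores:
            return {}
        lo, hi = min(scores.values()), max(scores.values())
        span = (hi - lo) or 1.0
        return {d: (s - lo) / span for d, s in scores.items()}

    v, f = normalize(vector_hits), normalize(fts_hits)
    fused = {d: alpha * v.get(d, 0.0) + (1 - alpha) * f.get(d, 0.0)
             for d in set(v) | set(f)}
    return sorted(fused, key=fused.get, reverse=True)

vector_hits = {"doc_a": 0.92, "doc_b": 0.80, "doc_c": 0.40}
fts_hits = {"doc_b": 12.0, "doc_d": 9.0}
print(hybrid_rank(vector_hits, fts_hits))  # doc_b first: it scores in both lists
```

In a real deployment the two score sets would come from a pgvector `<=>` distance query and a `ts_rank` query respectively; the blend step stays the same.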


LuxPricer: AI-Powered Luxury Bag Valuation App

• Built an AI appraisal platform for luxury handbags; users upload photos to get instant valuations & resale insights. • Engineered FastAPI orchestration with LangGraph parallel workflows, reducing appraisal latency by 40%. • Integrated Gemini 3 Pro vision models for image-based brand/model identification, achieving 95%+ detection accuracy. • Built Node.js + Puppeteer scraper aggregating real-time pricing from 3 luxury resale platforms with CAPTCHA handling. • Leveraged Perplexity API for market analysis & built Plotly charts with outlier removal & similarity scoring. • Developed React web & React Native mobile apps with Supabase auth & data; deployed via Docker & Coolify
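The latency reduction above comes from running independent appraisal steps in parallel rather than in sequence. A hedged sketch of that fan-out/join pattern, using plain `asyncio` as a stand-in for a LangGraph parallel workflow; the step names and their outputs are illustrative placeholders:

```python
# Sketch of the parallel-workflow idea: three independent appraisal
# steps run concurrently and are joined, so wall-clock time tracks the
# slowest step (~0.1s here) instead of the sum (~0.3s sequential).
import asyncio
import time

async def identify_brand(image):      # stand-in for a vision-model call
    await asyncio.sleep(0.1)
    return "brand"

async def scrape_prices(image):       # stand-in for the resale-price scraper
    await asyncio.sleep(0.1)
    return [100, 120]

async def market_analysis(image):     # stand-in for the market-analysis call
    await asyncio.sleep(0.1)
    return "report"

async def appraise(image):
    # Fan out the independent steps, then join their results.
    brand, prices, report = await asyncio.gather(
        identify_brand(image), scrape_prices(image), market_analysis(image))
    return {"brand": brand, "prices": prices, "report": report}

start = time.perf_counter()
result = asyncio.run(appraise("bag.jpg"))
print(result, f"{time.perf_counter() - start:.2f}s")
```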


Ask AI: AI-Native Debug Knowledge Vault

• Built a structured debugging knowledge base where AI coding assistants can search & record fixes via MCP protocol. • Integrated with 4+ AI tools (Cursor, Claude, Windsurf, Gemini) enabling a “search before code, record after fix” workflow. • Built RRF fusion search engine combining Postgres FTS + pg_trgm fuzzy matching, achieving sub-50ms query latency. • Implemented 3-layer sanitization pipeline (secret scan → pattern redact → AI semantic) blocking 7 secret types. • Architected ledger-based credit system with “no-hit-no-charge” billing model supporting 10+ transaction types. • Deployed production stack via Docker Compose with pgvector, Redis, and Caddy for automatic HTTPS.
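Reciprocal Rank Fusion, the "RRF" named above, is a standard way to merge ranked lists from heterogeneous retrievers (here, Postgres FTS and pg_trgm fuzzy matching): each document scores the sum of 1/(k + rank) over the lists it appears in. A minimal sketch; the doc IDs are invented and k=60 is the conventional constant, not necessarily the project's setting:

```python
# Reciprocal Rank Fusion: documents ranked highly by multiple
# retrievers accumulate the largest summed reciprocal-rank scores.

def rrf_fuse(rankings, k=60):
    """rankings: list of ranked doc-id lists, best first.
    Returns doc ids sorted by their fused RRF score."""
    scores = {}
    for ranking in rankings:
        for rank, doc in enumerate(ranking, start=1):
            scores[doc] = scores.get(doc, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

fts = ["fix_42", "fix_17", "fix_99"]        # full-text-search order
fuzzy = ["fix_42", "fix_03", "fix_17"]      # fuzzy-match order
print(rrf_fuse([fts, fuzzy]))  # fix_42 wins: top-ranked in both lists
```

Because only ranks matter, RRF needs no score normalization between the two retrievers, which is why it suits mixing FTS rank scores with trigram similarity scores.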


Abstraction AI

# Abstraction AI: From Long, Messy Context to an Elegant Build

Project: https://abstractai.ai-builders.space/

When I start a new project, a workflow I often fall into is: I talk with ChatGPT for many rounds until I end up with a very long context. In the past, I would throw that entire conversation directly into a coding AI (like Augment Code or Claude Code) and ask it to implement the project “based on all the context”. It *feels* like there’s enough information—after so many back-and-forth rounds, the AI should understand what we want and how to do it.

Over time, I realized this is not the ideal workflow.

## Two Problems With “Just Dump the Whole Context”

### 1) The context is too long, too messy

The same point may be discussed, overturned, and rebuilt multiple times. That makes the AI confused about what the final version actually is, and it’s easy for the implementation to drift off course.

### 2) We often don’t know what we don’t know

One of the hardest parts of using a coding AI agent today is that many people—especially those without a technical background—don’t truly know what a “complete system” includes. Jumping straight from an idea to a “full implementation”, skipping PRDs, system design, architecture, and engineering documentation, and expecting the AI to write a system that satisfies everything from a fuzzy starting point is genuinely difficult.

## What Abstraction AI Adds

Abstraction AI deliberately inserts a crucial step between “long context” and “actual development”: **it turns the context into a complete set of documents, and produces a clear design for the whole system.**

This is a bit like manually inserting a deliberate “long thinking” step into the workflow—forcing a round of high-quality system-level thinking, structuring, and design before any code is written.

## Flexible Inputs, Practical Outputs

In practice, the tool turned out to be very flexible. It can take:

- Any length of text
- AI chat logs
- Meeting transcripts
- A long project description you wrote yourself

No matter the user’s background, it generates a set of documents that you can hand to an AI engineer. With those documents, the AI can build the system more reliably, with higher success rates and more stable outcomes.

I intentionally made the documents beginner-friendly: usable for coding AIs, but also readable for people who aren’t very technical. Before you pay an AI to “do the work”, you can read the system description yourself, edit it according to your understanding, and then hand it off for implementation. After the docs are generated, you can also use a set of prompts I prepared to build a complete product directly from these documents.

Currently, it supports switching between GPT-5 and Gemini 2.5 Pro. The Gemini 2.5 Pro frontend visualization still has some rough edges, and I’ll keep improving it.

## Cost, and an Unexpected Effect: Saving Money

The project itself was built with Augment Code, and the core prompt was written with help from GPT-5 Pro. End-to-end—building, iterating, debugging—the total cost was about $20.

Interestingly, this project was built by “having AI read long context”, not by starting with structured documentation. But my next startup project was implemented *on top of the structured docs generated by this tool*. That project was much more complex, closer to a complete system, and the total cost still came out to roughly $20.

That showed me a very direct effect: **it saves money.** Before execution, the AI already has a clear “instruction manual”. It can follow it, instead of repeatedly trial-and-erroring in ambiguous context and reworking mistakes. And my next project is, in essence, also about making coding AI agents faster, better, and cheaper.

If you also tend to talk with AI for a long time before bringing in a coding AI, you might want to try converting “long context” into a complete, elegant, executable product document set first—then handing it off to your AI engineer.

This is my first time sharing a project publicly. If it helps anyone, that would mean a lot. Please try it, share it, and give feedback—those are incredibly valuable for someone like me who’s still learning how to work with users. Deploying on Builder Space was important for this project, and I’m grateful to the AI Architect course for making it so easy to share.

## Comments

**10 likes · 9 comments**

### mier20 (Full-Stack Engineer)

Thanks a lot for sharing—this is a great project. I’m curious: for users with a technical background, this can save some concrete implementation time. But for users without a technical background, how can they judge whether the AI-generated docs are correct, or whether they’re truly what the user needs?

**Charlie** That’s a crucial question. What I tried to do inside the system—based on my experience prompting AIs to explain things—is to make the generated docs as friendly as possible to people of any background, so more people can actually read them. I also include a glossary to help with terminology. So I think “help the user understand” is the first direction. The second direction is “learn with AI”: after downloading these docs, use a coding agent chat to ask it to explain things more clearly—ask wherever you don’t understand—until you feel confident you have a solid grasp. If we’re using natural language to orchestrate compute, then helping users understand more of the natural language they didn’t understand before also expands the range of language they can use—so they can gradually obtain enough information to make the important judgments.

**Charlie** Thanks for the kind words!

### Xu Jia

I feel your tool solves the core problem of maximizing the effectiveness of collaboration between a person and AI tools. The efficiency improvement you mentioned is just the result. The deeper point is: your tool draws clear boundaries for the AI tool, and the AI tool explores and optimizes within those boundaries. I’d love to discuss further and learn from each other. Thank you very much for sharing.

**Charlie** Thanks for trying it and for the feedback! The efficiency gain was indeed something I discovered unexpectedly—my main goal was still to help AI develop better things in a better way. Your description—optimizing within boundaries—is very accurate and inspiring. For example, Claude Opus 4.5 will look back at the docs at the right time to check whether requirements are met and what tests might be missing. The development process shifts from “brute-force, messy exploration” to an optimization process with clear ground truth, a way to compute loss, and a path for backpropagation. After multiple iterations, it tends to converge to what we want. Developing software like training an optimized model—this has been how AI coding has felt to me for a while, and this project really makes that path smoother.

### Xu Jia

My experience is very similar to what you described. I’ve discussed a potential project for a long time with multiple LLMs, kept a large amount of text discussion records, and ended up generating multiple PRD versions—but the development process becomes more and more chaotic.

**Charlie** Yeah—if you want to leverage different LLMs’ strengths, it can definitely lead to that kind of difficulty.

### Xu Jia

Could I try it? How do I use it? Thanks.

**Charlie** Here’s the link: https://abstractai.ai-builders.space/ Thanks for your interest!


Multimodal LLM Fall-Risk Assessment for Aging-in-Place

• Built LangGraph workflow: 4-step pipeline for scene detection, RAG retrieval, safety assessment, and image editing. • Used GPT-4o / Gemini VLM to analyze home images, detecting room type, hazards, and safety features. • Built Hybrid RAG (FAISS + BM25) retrieving 105 curated CDC/HSSAT guidelines with risk-level citations. • Fused deterministic scoring rules with LLM reasoning to output 0-10 safety score and cited recommendations. • Generated safety improvement previews via Gemini Nano Banana showing recommended modifications. • Shipped FastAPI + React app with Docker deployment, supporting 11 scene types and real-time analysis.
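The "fused deterministic scoring rules with LLM reasoning" bullet can be sketched as a rule table blended with a model's own 0–10 assessment. This is a hedged illustration only: the hazard names, penalty weights, and blend ratio are invented for the example, not the project's actual values.

```python
# Sketch of rule/LLM score fusion: a deterministic penalty table gives
# a reproducible baseline, and the LLM's 0-10 judgement is blended in.

HAZARD_PENALTIES = {          # illustrative rule table
    "loose_rug": 2.0,
    "poor_lighting": 1.5,
    "no_grab_bar": 2.5,
    "cluttered_floor": 1.0,
}

def rule_score(detected_hazards):
    """Start from a perfect 10 and subtract a penalty per hazard;
    unknown hazards get a small default penalty."""
    score = 10.0 - sum(HAZARD_PENALTIES.get(h, 0.5) for h in detected_hazards)
    return max(score, 0.0)

def fused_score(detected_hazards, llm_score, rule_weight=0.6):
    """Blend the deterministic score with the LLM's 0-10 assessment,
    clamped back into the 0-10 range."""
    blended = rule_weight * rule_score(detected_hazards) + (1 - rule_weight) * llm_score
    return round(min(max(blended, 0.0), 10.0), 1)

print(fused_score(["loose_rug", "no_grab_bar"], llm_score=6.0))
```

Keeping the rule component deterministic means the same detected hazards always move the score the same way, which makes the cited recommendations auditable.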


That’sEat: Scenario-Intelligent AI Recommendation Agent

• Built a restaurant recommendation agent using RAG architecture and AWS infrastructure from scratch. • Combined RAG with custom prompts for GPT-4o, improving recommendation accuracy from 50% to 90%. • Extracted 5000+ venues via Google Maps API, vectorized with text-embedding-3-small and OpenSearch. • Deployed serverless backend with AWS Lambda and API Gateway, achieving 2-second responses at 100 QPS. • Created an iOS app with SwiftUI featuring real-time chat and one-tap Google Maps integration. • Configured VPC networking with private subnets, NAT Gateway, and IGW to enable Lambda internet access
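The retrieval core of the bullet above is cosine-similarity ranking over embedded venues, as an OpenSearch kNN query against text-embedding-3-small vectors would perform. A hedged sketch with toy 3-dimensional stand-in vectors; venue names and vectors are invented:

```python
# Sketch of the venue-retrieval step: embed the query, then return the
# k venues whose embeddings have the highest cosine similarity.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def top_k(query_vec, venues, k=2):
    """venues: {name: embedding}. Returns the k most similar names."""
    ranked = sorted(venues, key=lambda v: cosine(query_vec, venues[v]),
                    reverse=True)
    return ranked[:k]

venues = {
    "ramen_bar": [0.9, 0.1, 0.0],
    "steakhouse": [0.1, 0.9, 0.2],
    "noodle_house": [0.8, 0.2, 0.1],
}
print(top_k([1.0, 0.0, 0.0], venues))  # the two noodle-adjacent venues
```

In production the exhaustive `sorted` scan would be replaced by the vector index's approximate kNN search; the similarity measure is the same.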


Multi-Stage, Cloud-Native Recommendation Engine

• Architected a scalable Go/Python microservices stack via gRPC powering global recommendation infra. • Implemented a multi-stage recommendation engine with Two-Tower, DeepFM & LightGCN models on PyTorch. • Developed Actor-Critic RL agent enabling dynamic, multi-objective ranking and throughput optimization. • Provisioned end-to-end cloud infrastructure on AWS via Terraform (EKS, S3, MSK Kafka, IAM policies, VPCs). • Containerized 10 microservices in Docker; orchestrated deployment and autoscaling via Kubernetes and Helm. • Engineered real-time Feature Store on Kafka & Redis Cluster, and an Apache Iceberg data lake on S3.


MLOps for Churn Prediction with AWS SageMaker

• Built end-to-end MLOps pipeline for churn prediction using AWS SageMaker, EC2, and XGBoost. • Created ETL workflows with AWS Glue and Lambda to preprocess customer data stored in S3. • Optimized model using SageMaker Automatic Model Tuning, improving prediction accuracy by 15%. • Deployed trained model via Flask API backend for real-time inference with sub-second response times. • Implemented CI/CD with GitHub, CodePipeline, and CloudFormation, using Docker on ECR. • Managed model lifecycle with SageMaker Model Registry and Step Functions, monitoring performance via AUC.


Realtime Free-Viewpoint Streaming with 3D Gaussian Splatting

• Built a Unity scene with a 5×5 camera rig and third-person controller for deterministic, replayable data capture. • Scripted Cinemachine trajectories & GameRecorder to export aligned RGB and G-Buffer streams for low-res proxy scoring. • Reconstructed geometry with COLMAP and trained 3D Gaussian Splatting from poses using multi-view frames. • Achieved 34.4 dB PSNR and 0.97 SSIM on unseen cameras; analyzed ghosting under character motion. • Benchmarked QUEEN and IGS vs 3DGS, validating target-conditioned camera selection.
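The 34.4 dB figure quoted above is the standard PSNR metric, PSNR = 10·log10(MAX² / MSE). A minimal sketch over flat pixel lists in [0, 1]; the sample values are illustrative:

```python
# Peak signal-to-noise ratio between a rendered view and ground truth.
# Higher is better; identical images give infinite PSNR.
import math

def psnr(rendered, ground_truth, max_val=1.0):
    """rendered / ground_truth: equal-length pixel lists in [0, max_val]."""
    mse = sum((r - g) ** 2 for r, g in zip(rendered, ground_truth)) / len(rendered)
    if mse == 0:
        return float("inf")
    return 10.0 * math.log10(max_val ** 2 / mse)

# A 34.4 dB PSNR corresponds to a mean squared error of about 3.6e-4.
print(round(psnr([0.50, 0.52, 0.48], [0.50, 0.50, 0.50]), 1))
```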


Full-Stack Purchase and Subscription Platform in Go

• Architected a distributed e-commerce platform with Go, GoHTML, and PostgreSQL. • Implemented Docker containerization and Kubernetes orchestration for automated scaling and deployment. • Developed authentication and payment systems using JWT tokens and Stripe API for secure user transactions. • Built concurrent email service with Goroutines and channels, achieving 40% faster document processing. • Designed microservices architecture using gRPC and RabbitMQ for scalable asynchronous communication. • Optimized data structure with PostgreSQL for transactions and MongoDB for logs, boosting speed by 30%.


Full Stack E-Commerce App with SpringBoot & React

• Built a full-stack e-commerce platform from scratch with Spring Boot, MySQL, React and Docker. • Crafted responsive UI using React, TypeScript, Redux, Material-UI for seamless cross-device UX. • Implemented core flows—product browse/search/filter, cart, checkout—plus JWT auth via Spring Security. • Designed clean RESTful APIs and containerized services, enabling one-command local or cloud deployment. • Optimized persistence with Spring Data JPA & Hibernate, modelling orders, inventory and payments. • Added Redis caching for the shopping cart, cutting database load and improving checkout speed by 30%.
