ビジョン · モノレポ
04

Project 04 · 2024 · Harsh Yadav

Vision
AI
Ecosystem.

マルチモーダル知能システム

A monorepo housing multiple experimental and production-bound AI tools — all built around giving AI agents real superpowers: eyes (screenshot understanding), hands (MCP tool access), and a voice (landing pages and showcases).

Year2024–Present
RoleAI Developer
Modules5 Components
Status🚧 Active Dev
Monorepo Modules · モジュール Python · TypeScript · MCP · React · Docker
Module 01 📸

screenshot-to-code

Converts any screenshot, mockup, or wireframe into clean HTML, CSS, or React code using multimodal LLMs (GPT-4o / Claude). Drag-and-drop UI with live code preview.

FastAPI · OpenAI Vision · React
Module 02 🧠

mcp-brain

Configurable MCP (Model Context Protocol) server exposing tools to AI agents — file system, web, APIs, databases. Compatible with Antigravity, Claude Code, Cursor, Gemini CLI.

Python · MCP Protocol · JSON Schema
Module 03 🎬

animation-showcase

Curated showcase of advanced CSS and JS animation techniques — keyframes, scroll-driven animations, Intersection Observer, GSAP, Canvas API, SVG morphing.

HTML5 · CSS3 · GSAP · Vanilla JS
Module 04 🌐

antigravity-website

Landing page for Google's Antigravity AI coding IDE ecosystem. TypeScript-first, Vite-powered, containerized via Docker for deployment to any host.

TypeScript · Vite · Tailwind · Docker
[01] Overview

Giving AI agents eyes, hands — and a voice.

Vision is actively developed with an emphasis on Multimodal AI — using vision-capable LLMs to understand and reproduce UI designs with layout analysis, style extraction, and component generation in a single pipeline.

The MCP (Model Context Protocol) module implements the open standard pioneered by Anthropic, letting AI agents call real-world tools through structured JSON schema interfaces via stdio or HTTP/SSE transports.

Python FastAPI TypeScript React 18 OpenAI Vision MCP Protocol Vite Docker GSAP
Monorepo Architecture · アーキテクチャ ビジョンシステム設計
Vision/ (Monorepo Root) │ ├── screenshot-to-code/ # Screenshot → UI pipeline │ ├── backend/ # FastAPI + Vision LLM │ │ ├── vision.py # LLM image analysis module │ │ ├── generator.py # HTML/CSS/React code generation │ │ └── requirements.txt │ └── frontend/ # TypeScript React drag-drop UI │ ├── mcp-brain/ # MCP Server Infrastructure │ ├── server.py # MCP entrypoint (stdio / HTTP) │ ├── tools/ # tool registry definitions │ │ ├── web_search.py │ │ ├── file_ops.py │ │ └── code_exec.py │ └── schema/ # JSON schema tool specs │ ├── animation-showcase/ # CSS & JS animation reference │ ├── index.html │ ├── styles/ # Keyframe + animation CSS │ └── scripts/ # GSAP + Intersection Observer │ ├── antigravity-website/ # Antigravity IDE landing page │ ├── src/ # TypeScript React components │ ├── Dockerfile # Container build │ └── package.json │ └── _write_test/ # Internal LLM output eval harness

MCP Tool Registry

Tool Description Transport
web_search Fetch and scrape live web data HTTP
file_read / file_write Read and write project files stdio
code_exec Run shell commands and scripts stdio
api_bridge Connect to external services HTTP