February 20, 2026

The Science of Style: How Our AI Works

A deep dive into the neural architecture behind Tailor X — from computer vision wardrobe scanning to LLM-powered outfit generation.

style_engine.ts
const profile = await StyleEngine.analyze({
  wardrobe: scannedItems,
  ratings: userHistory,
  context: { calendar, weather, occasion }
}) // → { aestheticDNA, styleScore: 9.2, confidence: 0.97 }

Style is deeply personal and highly contextual. It's not just about color preferences or brand loyalty — it's about the relationship between dozens of variables: your coloring, your body proportions, your lifestyle, the occasions you dress for, your cultural context, and the subtle aesthetic grammar that makes a wardrobe feel cohesive.

Building an AI that understands all of that required rethinking how we represent clothing and personal taste as data.

Layer 1: The vision pipeline

Every item in your wardrobe starts as a photograph. Our computer vision pipeline — fine-tuned on a dataset of 2M+ labeled garment images — extracts over 60 attributes per photo: garment type, cut, color (full LAB color space, not just "blue"), fabric texture class, visible construction details, formality tier, and more.
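To make the attribute extraction concrete, here is a minimal sketch of what a structured garment record might look like. The field names, the LAB values, and the formality scale are illustrative assumptions, not the production schema.

```typescript
// Hypothetical shape of the attributes the vision pipeline extracts.
// Field names and scales are assumptions for illustration only.
interface GarmentAttributes {
  type: string;                       // e.g. "oxford-shirt"
  cut: string;                        // e.g. "slim"
  colorLab: [number, number, number]; // full LAB color, not a named bucket like "blue"
  fabricTexture: string;              // texture class, e.g. "oxford-weave"
  formalityTier: number;              // assumed 1 (casual) … 5 (black tie)
}

const whiteOxford: GarmentAttributes = {
  type: "oxford-shirt",
  cut: "regular",
  colorLab: [96.5, -0.2, 1.1], // near-white in LAB space
  fabricTexture: "oxford-weave",
  formalityTier: 3,
};
```

Representing color in LAB rather than as a label matters because LAB distances roughly track perceived color difference, which is what outfit coherence actually depends on.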

The result is a structured representation of each item that our style engine can reason about mathematically. A white Oxford shirt becomes a 60-dimensional vector; a navy slim trouser becomes another. From there, we can calculate precisely which items belong together, and why.
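The "belong together" calculation can be sketched as plain vector similarity. This is an illustrative stand-in: the engine presumably uses a learned compatibility metric, and the toy 3-dimensional vectors below stand in for the real 60-dimensional item representations.

```typescript
// Illustrative stand-in for item-to-item compatibility: cosine
// similarity between attribute vectors. The production metric is
// likely learned; this shows the basic vector-space idea.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error("dimension mismatch");
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Toy 3-dimensional stand-ins for real 60-dimensional item vectors.
const oxfordShirtVec = [0.8, 0.1, 0.3];
const navyTrouserVec = [0.7, 0.2, 0.4];
const compatibility = cosineSimilarity(oxfordShirtVec, navyTrouserVec);
```

A score near 1 means the two items sit close together in attribute space; scores near 0 or below suggest a clash.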

Layer 2: The taste model

Understanding your clothes is only half the problem. The other half is understanding you. We model personal taste as a high-dimensional preference vector — trained on your explicit ratings, your wear history, and contextual signals like which outfits you photograph versus which you just wear.

The model uses a transformer architecture similar to those powering modern language models, but adapted for fashion semantics. It learns to predict which item combinations you'll rate highly before you've seen them — enabling proactive outfit generation rather than reactive filtering.
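The ranking idea behind that prediction can be made concrete with a small sketch. A simple dot product stands in here for the transformer's learned scoring head; the function names and toy embeddings are assumptions, not the actual model.

```typescript
// Sketch of taste-based ranking: score each candidate combination
// against the user's preference vector, then sort. A dot product
// stands in for the transformer's learned scoring head.
function score(taste: number[], combo: number[]): number {
  return taste.reduce((sum, t, i) => sum + t * combo[i], 0);
}

function rankCombinations(
  taste: number[],
  candidates: { id: string; embedding: number[] }[]
): string[] {
  return [...candidates]
    .sort((a, b) => score(taste, b.embedding) - score(taste, a.embedding))
    .map((c) => c.id);
}

// Toy 3-dimensional taste vector and candidate embeddings.
const tasteVector = [0.9, -0.2, 0.5];
const ranked = rankCombinations(tasteVector, [
  { id: "outfit-a", embedding: [0.8, 0.1, 0.4] },
  { id: "outfit-b", embedding: [-0.3, 0.9, 0.0] },
]);
// outfit-a aligns with this taste vector, so it ranks first
```

Because scoring happens before the user ever sees a combination, the system can generate proposals proactively instead of waiting for feedback to filter against.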

Layer 3: Contextual generation

The final layer takes your taste vector and your wardrobe representations and generates complete outfit proposals, conditioned on context: today's weather, your calendar events, the time of day, and seasonal factors. This is where the LLM component operates — reasoning about occasion-appropriateness and style coherence in natural language, then translating that back into specific item selections.
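One piece of that conditioning step can be sketched as a context-aware filter over the wardrobe before generation. Everything here is a simplified assumption: the `Context` and `Item` shapes, the formality and warmth thresholds, and the idea that filtering precedes the LLM pass are illustrative, not the production pipeline.

```typescript
// Illustrative conditioning step: context narrows the candidate pool
// before outfit generation. Shapes and thresholds are assumptions.
interface Context {
  tempC: number;
  occasion: "casual" | "work" | "formal";
}
interface Item {
  id: string;
  formalityTier: number; // assumed 1 (casual) … 5 (black tie)
  warmth: number;        // assumed 0 (light) … 3 (heavy)
}

function filterByContext(items: Item[], ctx: Context): Item[] {
  const minFormality =
    ctx.occasion === "formal" ? 4 : ctx.occasion === "work" ? 3 : 1;
  const minWarmth = ctx.tempC < 10 ? 2 : 0; // cold days need warmer layers
  return items.filter(
    (i) => i.formalityTier >= minFormality && i.warmth >= minWarmth
  );
}

const wardrobe: Item[] = [
  { id: "oxford-shirt", formalityTier: 3, warmth: 1 },
  { id: "hoodie", formalityTier: 1, warmth: 3 },
];
const picks = filterByContext(wardrobe, { tempC: 18, occasion: "work" });
// the hoodie is dropped for a work occasion; the oxford shirt survives
```

Filtering first keeps the generation step cheap: the model only reasons over items that are already plausible for the day's context.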

The whole inference pipeline runs on-device in under 200 ms. Your wardrobe data never leaves your phone unless you explicitly share it.

The Tailor X Team
San Diego, CA