Drawify Publication

Explore Drawify research publications covering AI, visual design, image generation, image editing, and creative design evaluation.

All Publications
Anshu Raj
5th Jan, 2026

When Language Meets Visual Design: Can Large Language Models Improve Automated Image Composition?

This paper explores how Large Language Models can improve automated image composition by applying visual design principles such as balance, hierarchy, and spatial organization. It discusses how language-driven reasoning can guide AI image generation systems to create more structured and aesthetically coherent designs. The study also highlights challenges, evaluation methods, and future research directions for multimodal design-aware AI systems.

Tags: Large Language Models, Multimodal AI, Visual Design, Automated Image Composition

Anshu Raj
5th Jan, 2026

Can Large Language Models Understand Visual Aesthetics for Intelligent Image Editing?

This paper explores how Large Language Models can support intelligent image editing through aesthetic reasoning and multimodal understanding. It focuses on how AI systems can analyze composition, lighting, color harmony, and visual balance to generate structured editing recommendations. The study proposes a language-guided editing framework that improves image quality while aligning more closely with human aesthetic preferences and creative design principles.

Tags: Large Language Models, Multimodal AI, Visual Design, AI Image Editing

Anshu Raj
5th Jan, 2026

A Structured Three-Stage Pipeline for Compositional Text-to-Image Generation with Editable Layouts and Object-Wise Attention

This paper presents a structured three-stage pipeline for controllable text-to-image generation using language understanding, editable layouts, and object-wise attention control. The framework improves compositional grounding by separating prompt parsing, layout planning, and image synthesis into independent stages. Experimental results show stronger object accuracy, spatial consistency, and attribute fidelity compared to existing diffusion-based generation methods, while maintaining high visual quality.

Tags: Multimodal AI, Text-to-Image Generation, Diffusion Models, Structured Layout Modeling

Anshu Raj
5th Jan, 2026

DrawBench: A Benchmark for High-Level Intent Multi-Format Creative Outputs

This paper introduces DrawBench, a benchmark framework designed to evaluate how generative AI systems handle real-world design workflows across raster images, vector graphics, and editable infographic formats. The study focuses on measuring creative intent understanding, layout precision, structural consistency, and multi-step editing performance rather than only visual quality. Experimental analysis highlights the strengths and limitations of diffusion, vector-based, and instruction-tuned multimodal models in professional design-oriented tasks.

Tags: Multimodal AI, Visual Design, Design Benchmarking, Multi-Format AI
