submitted 5 days ago* (last edited 5 days ago) by [email protected] to c/[email protected]

Abstract

Scalable Vector Graphics (SVG) is an important image format widely adopted in graphic design because of its resolution independence and editability. The study of generating high-quality SVG has continuously drawn attention from both designers and researchers in the AIGC community. However, existing methods either produce unstructured outputs with huge computational cost or are limited to generating monochrome icons of over-simplified structures. To produce high-quality and complex SVG, we propose OmniSVG, a unified framework that leverages pre-trained Vision-Language Models (VLMs) for end-to-end multimodal SVG generation. By parameterizing SVG commands and coordinates into discrete tokens, OmniSVG decouples structural logic from low-level geometry for efficient training while maintaining the expressiveness of complex SVG structure. To further advance the development of SVG synthesis, we introduce MMSVG-2M, a multimodal dataset with two million richly annotated SVG assets, along with a standardized evaluation protocol for conditional SVG generation tasks. Extensive experiments show that OmniSVG outperforms existing methods and demonstrates its potential for integration into professional SVG design workflows.

Paper: https://arxiv.org/abs/2504.06263

Code: https://github.com/OmniSVG/OmniSVG/

Weights: https://huggingface.co/OmniSVG/OmniSVG

Project Page: https://omnisvg.github.io/

Demo: https://huggingface.co/spaces/OmniSVG/OmniSVG-3B

top 10 comments
[-] [email protected] 5 points 5 days ago

I am very into this if it can take a non-vector graphic as input and work to that. OpenAI's attempts at that have been complete dickfarts

[-] [email protected] 2 points 5 days ago* (last edited 5 days ago)

This is the first time I’ve seen a model target SVG drafting. Anything you have seen previously about unicorns or whatever was just someone experimenting with interesting edge case usage of models not designed for this purpose.

Feeding a language model a bunch of vector art does not seem productive to me. So it makes sense that something like GPT4 sucks at it.

[-] [email protected] 2 points 5 days ago

It can do IMG to SVG. Check out the right side of this image:

[-] [email protected] 5 points 5 days ago

Hard to judge quality when what we're seeing is practically a pixel-perfect recreation. The tricky part of automated vectorization is detecting and plotting curves in such a way that it scales correctly. Bad implementations will use too many elements, or include straight lines that should be parts of curves, etc. Those errors would not be visible in those low-res rasterizations.
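One rough way to check for those problems without eyeballing a rasterization is to count the path commands in the output. The sketch below is only a heuristic, not a quality metric from the paper, and the helper names are made up for this example. A very high total, or a low share of curve commands on artwork that is visibly curvy, would hint at the issues described above.

```python
import re
import xml.etree.ElementTree as ET
from collections import Counter

# All SVG path command letters, absolute and relative.
COMMAND_RE = re.compile(r"[MmLlHhVvCcSsQqTtAaZz]")
LINE_CMDS = {"L", "H", "V"}
CURVE_CMDS = {"C", "S", "Q", "T", "A"}


def path_command_counts(svg_text):
    """Count path commands (case-folded) across every <path> in an SVG document."""
    counts = Counter()
    for element in ET.fromstring(svg_text).iter():
        # Strip an XML namespace such as '{http://www.w3.org/2000/svg}'.
        if element.tag.rsplit("}", 1)[-1] == "path":
            commands = COMMAND_RE.findall(element.get("d", ""))
            counts.update(c.upper() for c in commands)
    return counts


def curve_ratio(counts):
    """Share of drawing commands that are curves rather than straight segments."""
    lines = sum(counts[c] for c in LINE_CMDS)
    curves = sum(counts[c] for c in CURVE_CMDS)
    total = lines + curves
    return curves / total if total else 0.0


svg = (
    '<svg xmlns="http://www.w3.org/2000/svg">'
    '<path d="M0 0 L1 1 C2 2 3 3 4 4 Z"/>'
    '<path d="M0 0 L5 5"/>'
    "</svg>"
)
counts = path_command_counts(svg)
print(dict(counts), curve_ratio(counts))
```

Comparing these numbers between the model's output and a hand-drawn reference of the same image says more than a side-by-side thumbnail does.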

[-] [email protected] 4 points 5 days ago

The project page didn't have a link to it, but there is a demo on HF.

[-] [email protected] 3 points 5 days ago

Just gave it a try. I couldn't get coherent results from img-to-svg with a few different tests of low-res pixel art and high-res cartoons. txt-to-svg also gave me incoherent blobs even with simple prompts. Something must be wrong there. Is it working for anyone else?

I might just try installing it locally when I get home.

[-] [email protected] 2 points 4 days ago

Okay, you let me tie this into a spreadsheet or something to generate charts, and there's finally a use case for this that I like.

I'm not sure it's worth needing a 5080 to make ultra pretty graphs, but, you know, smoke 'em if you got 'em.

[-] [email protected] 2 points 5 days ago

[creams self in simple features]

[-] [email protected] 1 points 5 days ago
[-] [email protected] 2 points 5 days ago

I want to fine tune this model on large geospatial datasets.

this post was submitted on 22 Jul 2025
20 points (100.0% liked)
