Controllable synthetic image generation

We have developed a novel framework for synthetic image generation that focuses on modifying existing images rather than creating them entirely from scratch. Our approach enables precise control over which regions of an image are altered and which remain unchanged. Users can either select regions to modify or mark areas that must be preserved, and then provide a prompt describing the desired transformation.
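
To make the workflow concrete, here is a minimal sketch of region-controlled editing implemented as masked inpainting with the open-source Hugging Face diffusers library. The checkpoint name, file paths, and mask convention are illustrative assumptions, not our production setup.

```python
# Minimal sketch: regenerate only the masked region of an existing image.
# Assumes the Hugging Face `diffusers` library; the checkpoint and paths
# are illustrative, not our production model.
import torch
from diffusers import StableDiffusionInpaintPipeline
from PIL import Image

pipe = StableDiffusionInpaintPipeline.from_pretrained(
    "runwayml/stable-diffusion-inpainting",
    torch_dtype=torch.float16,
).to("cuda")

image = Image.open("scene.png").convert("RGB")
# Convention here: white mask pixels are regenerated, black pixels are preserved.
mask = Image.open("edit_mask.png").convert("L")

result = pipe(
    prompt="heavy rain on a city street",
    image=image,
    mask_image=mask,
).images[0]
result.save("scene_edited.png")
```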

Key Advantages

This region-based methodology offers several advantages, particularly in data augmentation and content consistency:

Annotation Preservation

By controlling which parts of the image are modified, we ensure that existing annotations (e.g., bounding boxes, masks) remain valid post-generation. This significantly reduces the need for manual re-annotation, a common bottleneck in training data pipelines.
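
For instance, a simple pre-flight check can confirm that the edit mask leaves every annotated box untouched before generation runs. The helper below is hypothetical; the (x0, y0, x1, y1) box format and binary mask layout are assumptions for the sketch.

```python
import numpy as np

def boxes_untouched(edit_mask: np.ndarray,
                    boxes: list[tuple[int, int, int, int]]) -> bool:
    """Return True if no annotated bounding box overlaps the edit region.

    `edit_mask` is a binary HxW array where nonzero marks pixels to be
    regenerated; boxes use (x0, y0, x1, y1) pixel coordinates.
    """
    for x0, y0, x1, y1 in boxes:
        if edit_mask[y0:y1, x0:x1].any():
            return False
    return True
```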

Unlimited Data Augmentation

Leveraging existing datasets, we can generate an effectively unlimited number of variations, enriching training sets with greater diversity and edge cases.
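
As a sketch of what this looks like in practice, resampling the masked region under different random seeds yields arbitrarily many variants of a single annotated image (reusing `pipe`, `image`, and `mask` from the inpainting sketch above; the prompt is illustrative):

```python
import torch

# Each seed produces a different background while the annotated
# region, and therefore its labels, stays fixed.
for seed in range(100):
    generator = torch.Generator(device="cuda").manual_seed(seed)
    variant = pipe(
        prompt="a busy sidewalk at dusk",
        image=image,
        mask_image=mask,
        generator=generator,
    ).images[0]
    variant.save(f"augmented/variant_{seed:03d}.png")
```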

Targeted Edge Case Generation

We can easily produce rare or difficult scenarios (e.g., extreme weather conditions, lighting variations) by modifying only relevant regions of the image, while keeping the rest intact.
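
In the same sketch, targeting a specific rare condition is a matter of swapping the prompt while the mask keeps annotated objects intact (again reusing `pipe`, `image`, and `mask`; the prompt list is illustrative):

```python
# Prompt-driven edge cases: the mask protects labeled objects while
# the surrounding conditions change.
edge_case_prompts = [
    "dense fog, low visibility",
    "heavy snowfall at night",
    "harsh low-angle sunlight with long shadows",
]
for i, prompt in enumerate(edge_case_prompts):
    edited = pipe(prompt=prompt, image=image, mask_image=mask).images[0]
    edited.save(f"edge_cases/case_{i}.png")
```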

Applications Beyond Data Augmentation

Our framework is also highly applicable in industry settings where visual consistency is critical. One example is fashion e-commerce, particularly in generating "pack shots":

In such scenarios, it's essential that the clothing item (its texture, structure, and logos) remains unaltered. Traditional generative models often hallucinate or deform these critical regions. With our approach, users can lock the clothing item and regenerate only the background or surrounding context.

To improve the model further, we used model distillation to transfer knowledge from a larger but slower teacher model into a faster student. Depending on the deployment target, inference speed can be improved further with techniques such as model quantization.
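
For readers unfamiliar with distillation, the core of such a step is typically a soft-target loss in the style of Hinton et al., blending KL divergence against the teacher's softened outputs with ordinary cross-entropy on the labels. The PyTorch sketch below is the generic formulation, not our exact recipe; the temperature and weighting are illustrative.

```python
import torch.nn.functional as F

def distillation_loss(student_logits, teacher_logits, targets,
                      temperature=4.0, alpha=0.5):
    """Soft-target distillation loss: KL divergence to the teacher's
    softened distribution plus cross-entropy on ground-truth labels."""
    soft = F.kl_div(
        F.log_softmax(student_logits / temperature, dim=-1),
        F.softmax(teacher_logits / temperature, dim=-1),
        reduction="batchmean",
    ) * temperature ** 2
    hard = F.cross_entropy(student_logits, targets)
    return alpha * soft + (1 - alpha) * hard
```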

Our segmentation model is a good illustration of our training pipeline. The example model separates people from the background and can be deployed on MediaTek Genio or NVIDIA Jetson devices. The training process itself is flexible: depending on a client's needs, we can train models with different segmentation classes (e.g., vehicles) and target different hardware platforms. In such cases, we apply our architecture search solution to find the most suitable model for the selected hardware.
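
As one hedged illustration of the deployment path, exporting a trained network to ONNX is a common interchange step before compiling for targets such as NVIDIA Jetson (via TensorRT) or MediaTek Genio (via NeuroPilot). The stand-in model, shapes, and names below are placeholders, not our production artifacts:

```python
import torch
import torch.nn as nn

# Placeholder for a trained person/background segmentation network.
model = nn.Sequential(nn.Conv2d(3, 2, kernel_size=3, padding=1))
model.eval()

dummy = torch.randn(1, 3, 512, 512)  # illustrative input shape
torch.onnx.export(
    model, dummy, "segmenter.onnx",
    opset_version=17,
    input_names=["image"], output_names=["mask_logits"],
)
```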
