A multimodal large language model (MLLM) that can transform an image based on a specific instruction.