The project leverages strong visual illustrations and the gamification of interactive quizzes and AI simulations to help a wide audience understand the innovative image-generative AI technology released in 2022. The image generative AI featured in this article, “Stable Diffusion,” uses a technique called the “Latent Diffusion Model,” which “generates images from noise” in response to any words the user enters. The birth of generative AI such as Stable Diffusion and ChatGPT is a major step toward the technological singularity. In terms of science communication, this project succeeds in communicating to readers how generative AI works.
Among the world’s news media, this project is so far the most successful piece of science communication on artificial intelligence. Generative AI, by its very nature of using human-made works for training, is controversial in terms of intellectual property rights and ethics, but there are also many misunderstandings of AI technology that confuse the debate. In some cases, AI researchers and designers who use AI have been slandered on social networking sites. A typical misunderstanding is the perception that generative AI produces images directly by collaging the human-made works used for training, and that it therefore infringes intellectual property rights.
Through careful discussions with AI researchers, this project has succeeded in correcting the collage misunderstanding. It presents the mechanism by which image-generating AI “generates images from noise” through strong visual and interactive elements, achieving both comprehensibility for a wide range of readers and scientific accuracy. This success is evidenced by the response to the project: many people on social networking sites said the project made it very easy to understand how AI works, and the GitHub program used in the project to reproduce image generative AI received multiple stars from AI researchers.
The Washington Post (https://www.washingtonpost.com/technology/interactive/2022/ai-image-generator/) and other media have also launched experiential content using image generative AI since this project. Our project was the first and highest-quality experiential content, published soon after the release of Stable Diffusion. This project will remain valuable for many years to come as a snapshot of the early days of generative AI, which will be a game changer in the 2020s.
This project uses more than 1,000 images generated by the latest image-generative AI, “Stable Diffusion,” released in August and December 2022. Stable Diffusion generates elaborate images according to the words entered. In the human or AI quiz, we used a variety of creative words to create AI-generated images resembling the work of famous painters such as Van Gogh and Renoir.
We built an image-generative AI using machine learning libraries (PyTorch, Transformers, etc.). For the latter half of the project, we built an automated pipeline in Python and had the AI generate a large number of images (about 1,000) for an experience in which readers pseudo-generate images by freely selecting from 10 keywords.
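The keyword selection described above can be sketched as a job list for such a pipeline. This is a minimal illustration, not the project's actual code: it assumes readers may pick any non-empty subset of the 10 keywords, which gives 2^10 - 1 = 1,023 combinations and matches the "about 1,000" figure. Only “Kyoto” and “Paris” are named in the text; the other keywords and the function name `build_prompts` are hypothetical placeholders.

```python
from itertools import combinations


def build_prompts(keywords):
    """Return one comma-joined prompt per non-empty subset of keywords."""
    prompts = []
    for size in range(1, len(keywords) + 1):
        for subset in combinations(keywords, size):
            prompts.append(", ".join(subset))
    return prompts


# Placeholder keywords: only "Kyoto" and "Paris" appear in the text.
KEYWORDS = ["Kyoto", "Paris"] + [f"keyword{i}" for i in range(3, 11)]

prompts = build_prompts(KEYWORDS)
print(len(prompts))  # number of non-empty subsets of 10 keywords
```

Each prompt in the list would then be handed to the image generator once, so the page can look up a pre-rendered image for whatever combination a reader selects.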
To illustrate how much data was used to train the image generative AI, we obtained tens of thousands of public domain images of artworks from the Metropolitan Museum of Art, along with the AI’s training dataset “LAION-5B,” using scraping methods.
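The text does not say which tools were used for this data collection. As one possible route for the museum images, the sketch below assumes the Metropolitan Museum of Art's public Collection API, whose object records include the `isPublicDomain` and `primaryImage` fields. The helper names are hypothetical, and LAION-5B is not covered here.

```python
import json
from urllib.request import urlopen

# Base URL of the Met's public Collection API (an assumed data source).
MET_API = "https://collectionapi.metmuseum.org/public/collection/v1"


def object_url(object_id):
    """Build the API URL for a single object record."""
    return f"{MET_API}/objects/{object_id}"


def fetch_record(object_id):
    """Download and parse one object record (requires network access)."""
    with urlopen(object_url(object_id), timeout=30) as response:
        return json.load(response)


def public_domain_image(record):
    """Return the image URL if the record is public domain, else None."""
    if record.get("isPublicDomain") and record.get("primaryImage"):
        return record["primaryImage"]
    return None


# Offline example with a hand-written record of the same shape.
sample = {"isPublicDomain": True, "primaryImage": "https://example.org/a.jpg"}
print(public_domain_image(sample))
```

Filtering on the public domain flag before downloading keeps the collected set limited to images that can be shown freely.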
Google Cloud Platform:
The image generation and data acquisition described above require time and large-scale computing resources such as GPUs (Graphics Processing Units). By building a large, scalable cloud computing environment, we were able to apply AI to image generation successfully.
A program published on GitHub lets anyone run the image generative AI discussed in this project. It can also reproduce the generated images used in this project.
They are used as a front-end language to build the gamification of the human or AI quiz and the simulated experience of the image generative AI.
Context about the project:
This is an innovative project that takes the latest image-generating AI as its theme and uses actual AI for content creation. The project can output approximately 1,000 different images from combinations of 10 keywords. For example, among the roughly 1,000 possible combinations, the keywords “Kyoto” and “Paris” can be entered together to produce a strange and unique image of an Eiffel Tower-like structure in a Kyoto-like cityscape. This is an unprecedented example of journalism that makes full use of AI, using large amounts of computing resources and algorithms in a cloud environment. The generated images were created with the latest AI models, which were released between August and December 2022.
This project also carefully considered the legal issues surrounding image generative AI under the supervision of an attorney specializing in intellectual property rights. There are two major issues in image-generating AI: whether it is legally acceptable to use existing copyrighted works for training, and the legal risk that the generated images may resemble existing copyrighted works. We have confirmed that this project has no intellectual property issues under Japanese domestic law, and we have also ensured reproducibility by releasing a program on GitHub that can reproduce the image generation.
In this project, scientific accuracy was ensured not only through the supervision of professional AI researchers, but also through technical verification by the journalist himself, who carefully read and understood the paper on generative AI (https://arxiv.org/abs/2112.10752), then reproduced and implemented it in his own in-house computing environment.
The theme of the Human or AI quiz was carefully chosen to metaphorically represent the project’s theme of “how humans and AI should coexist.” Impressionist paintings such as Renoir’s and Post-Impressionist paintings such as Van Gogh’s were inspired by the birth of the pioneering technology of photography in the 19th century. The quiz metaphorically expresses the message that the birth of image-generating AI in the 21st century will spur new human creative activities, and that AI and humans should coexist productively in the future.
What can other journalists learn from this project?
Science communication is the issue that the media and politicians around the world failed miserably to address during the COVID-19 pandemic. The gamification of the human or AI quiz played an important role in attracting a wide range of readers to the article. The explanation of how AI works is the result of heated discussions among journalists, editors, engineers, designers, and researchers on an equal footing, and manages to combine scientific accuracy with ease of understanding for the reader.
The AI simulation section illustrates the importance of cloud computing in journalism. Cloud computing allows computing resources to scale semi-infinitely as needed, making it possible for a single journalist to analyze very large amounts of data and use sophisticated AI on a large scale.
At the end of the project, a reproducible program is available on GitHub, allowing anyone to easily run the image generative AI and regenerate the AI-generated images used in the article. In this way, the project also ensures transparency and verifiability, which are important to the ethics of both data journalism and science.
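The text does not describe how the published program makes regeneration repeatable. A common approach, shown here purely as an assumption, is to fix the random seed that produces the starting noise, for example by deriving it from the prompt so that the same words always map to the same seed. The function name `seed_for_prompt` is hypothetical.

```python
import hashlib


def seed_for_prompt(prompt, bits=32):
    """Derive a stable integer seed from the prompt text.

    Hashing avoids Python's built-in hash(), which is randomized
    between interpreter runs for strings.
    """
    digest = hashlib.sha256(prompt.encode("utf-8")).digest()
    return int.from_bytes(digest[: bits // 8], "big")


seed = seed_for_prompt("Kyoto, Paris")
print(seed)  # same value on every run and every machine
```

In a PyTorch-based pipeline, such a seed could be passed to `torch.Generator().manual_seed(seed)` before sampling. Bit-identical images may still depend on using the same model weights, library versions, and hardware.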
This project demonstrates the value of news media investing in R&D departments, and the high value of having newsrooms that follow the latest news and R&D departments that follow the latest technology working together.
The generative AI discussed in this project is not only an important issue for journalists in the context of fake images and AI ethics, but should also be recognized for its importance as a tool for journalists to organize and retrieve vast amounts of information.