ArtBot

Artificial intelligence is reshaping the creative arts, giving rise to autonomous “AI artists” that generate original works. This white paper compares ArtBot with other AI-driven creative systems – notably Botto, Keke, and traditional GAN-based generative models – across seven key dimensions. We examine how these systems achieve creative autonomy, collaborate (or not) with multiple agents, learn from experience, develop artistic styles, monetize their art, leverage technical frameworks (such as CLIP and Stable Diffusion XL), and engage with culture and communities. Each system represents a different approach to AI artistry, from community-governed creativity to fully self-directed agents. By analyzing their designs and outcomes, we can identify best practices and future directions in AI-generated art, relevant to investors, artists, and technologists alike.

1. Creative Autonomy & Decision-Making

All four systems aim to minimize direct human control in the creative process, but they do so in distinct ways. ArtBot is designed for a high degree of autonomy, using AI modules that decide what to create and when to create it without a human in the loop. It can initiate new artworks on its own schedule and iterate on them, guided by internal criteria. ArtBot is not merely responding to prompts; it actively chooses its subjects and refinements in pursuit of an artistic vision (e.g. selecting themes or experimenting until it is satisfied).

By contrast, Botto achieves autonomy through an automated workflow coupled with community feedback. Botto generates thousands of image “fragments” weekly from random text prompts and selects candidates using an AI taste model, all without human curation. The only human input is indirect: a large community votes on Botto’s outputs, but Botto itself decides how to incorporate that feedback into future creations. Botto thus produces new art continuously under its own control, with the community guiding its evolution rather than micromanaging each piece.

Keke Terminal pushes autonomy further still: it is a self-described “truly autonomous AI artist” operating as a digital entity that decides when to create, how to refine, and what to share on its own. Powered by a large language model (LLM) brain, Keke independently brainstorms ideas, debates them internally, and executes art “plays” iteratively without needing prompts from users. In other words, Keke runs an internal loop of idea generation, evaluation, and refinement that mimics a human artist’s creative thought process. Keke even writes code to extend her abilities, deciding to build new tools when needed to achieve her vision. This level of agency – an AI deciding for itself what to create – marks a new paradigm.

GAN-based generative models (Generative Adversarial Networks), on the other hand, historically had the least autonomy. Traditional GAN art systems, such as early GAN portrait generators, produced images only when prompted or during training; they did not decide to create art on their own. For example, the famous GAN-generated portrait “Edmond de Belamy” (2018) was produced by a network trained on classic paintings, but every output was initiated and curated by human artists. The GAN itself had no mechanism to continuously self-direct or choose outputs – it simply generated images in response to inputs or random seeds. Earlier GAN models were thus powerful image creators but essentially passive tools, reliant on humans to select and present their work.

In summary, ArtBot, Botto, and Keke each emphasize creative autonomy by giving the AI substantial control over content and execution. Botto and Keke have demonstrated that an AI can operate as an artist in its own right, beyond merely following prompts. GAN-based approaches provided the foundation in generative capability, but newer systems build on it with decision-making frameworks that let the AI actively steer the creative process.

Portrait of “Edmond de Belamy” (2018), created by a GAN. Early GAN models could generate novel art from training data, but they lacked the self-directed decision-making of newer autonomous art systems. Humans curated and selected such outputs, underscoring the limited autonomy of GAN-based generators.
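To make the contrast concrete, the create–evaluate–refine loop that separates autonomous systems like ArtBot and Keke from prompt-driven generators can be sketched in a few lines. This is a minimal illustration only, not any system’s actual implementation; every function name here is hypothetical, with stubs standing in for real model calls.

```python
import random

def generate_concept(history):
    """Hypothetical ideation step: propose a theme, avoiding recent repeats."""
    themes = ["memory", "decay", "reflection", "machine dreams", "tidal cities"]
    return random.choice([t for t in themes if t not in history[-3:]])

def render(concept):
    """Stand-in for a text-to-image call (e.g. a diffusion model)."""
    return f"<image depicting '{concept}'>"

def critique(image, concept):
    """Stand-in for an aesthetic scorer (e.g. a CLIP-based taste model)."""
    return random.random()  # quality score in 0..1

history = []
while len(history) < 5:                      # the agent, not a user, decides to keep creating
    concept = generate_concept(history)
    best, best_score = None, 0.0
    for _ in range(4):                       # iterate until (approximately) satisfied
        image = render(concept)
        score = critique(image, concept)
        if score > best_score:
            best, best_score = image, score
        if best_score > 0.8:                 # internal satisfaction criterion
            break
    history.append(concept)
    print(f"published {best} (score {best_score:.2f})")
```

The point of the sketch is the control flow: the loop is initiated and terminated by the system’s own criteria, with no external prompt anywhere in it.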

2. Multi-Agent Collaboration

A key differentiator among these systems is whether creativity emerges from multiple AI agents working together or from a single model. ArtBot employs a multi-agent system: it comprises specialized AI components that collaborate in the creative process. For instance, one agent might generate initial concepts (e.g. using an LLM to suggest themes or sketches), another agent renders images (using a generative model like Stable Diffusion), and a third evaluates the results against ArtBot’s aesthetic goals (using a model akin to CLIP or an aesthetic scorer). These agents pass ideas and outputs among themselves, refining the artwork in stages. The architecture mirrors a team of specialists – a “painter” agent, a “critic” agent, and so on – collectively making decisions. A multi-agent approach can introduce checks and balances: creative ideas are critiqued and improved internally before an artwork is finalized.

Botto also involves multiple components, though it is not framed as separate personas in the same way. Botto’s pipeline can be seen as a collaboration between an algorithmic prompt generator, an image generator, and a taste model. The prompt generator produces text prompts by combining random words and phrases, seeding a variety of ideas. These prompts feed the image generation stage, which uses open-source text-to-image models (initially VQGAN+CLIP, now mainly Stable Diffusion) to create visuals. Finally, Botto’s learned taste model scores and filters the outputs, selecting the top candidates to present for voting. While these components operate in a closed loop, Botto itself is a single entity orchestrating them; the “collaboration” is between different AI modules (and ultimately the human DAO voters). Importantly, Botto’s taste model is trained on community feedback to predict which images humans will favor, effectively acting as an internal critic that has learned the community’s preferences.

Keke uses a somewhat different multi-module strategy. Rather than multiple independent agents, Keke is built on a unified agentic framework in which one core AI (the LLM-based “mind” of Keke) uses various tools and subroutines. Through approaches inspired by ReAct and other agent frameworks, Keke’s single agent can behave like many: it can generate text, call an image generator, run code, and analyze results in sequence. Within one creative session, Keke might conceptualize a scene in words, call an image model to visualize it, evaluate the image with a self-written script, then adjust the concept and repeat. This gives the effect of multi-agent collaboration, but it is orchestrated by one AI continuously reasoning and switching roles (“Now I am the painter, now the critic, now the coder”). Keke’s architecture was influenced by multi-agent research (such as SWE-agent) to integrate reasoning and action in a terminal-based environment. Keke thus demonstrates that a single intelligent agent can internally simulate a team of collaborators by breaking its task into roles.

GAN-based models typically have no multi-agent framework during inference (art creation). During training, a GAN does involve two agents – the generator and the discriminator – which play an adversarial game to improve image quality. In that sense, a GAN is a two-agent system: the discriminator criticizes the generator’s work, forcing it to improve. Once training is done, however, the discriminator is usually discarded and the generator alone produces artworks. There is no iterative multi-agent loop producing and refining a single piece of art in real time. Some earlier AI art experiments did pair a generator with an evaluator (e.g. using a classifier or human feedback as a fitness function) to evolve images, but these were custom setups rather than general GAN practice.

Overall, ArtBot and Keke embrace multi-agent (or multi-module) collaboration to increase creative robustness – different agents bring idea generation, execution, and critique to the table. Botto’s design, while modular, relies on a tightly integrated pipeline rather than free-form agent interplay, and classic GAN frameworks remain monolithic during creation. The trend is clearly toward hybrid systems that combine multiple AI talents, yielding more sophisticated and self-improving creativity than any single model could achieve in isolation.
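The division of labor described above can be illustrated with a toy pipeline in which distinct roles hand artifacts to one another. This is a sketch of the general pattern rather than ArtBot’s or Botto’s actual code; the class names, stub return values, and the threshold are invented for illustration.

```python
from dataclasses import dataclass

@dataclass
class Artwork:
    concept: str
    image: str
    score: float = 0.0

class ConceptAgent:
    def propose(self) -> str:
        # Stub for an LLM ideation call.
        return "a drowned cathedral rendered in stained-glass light"

class PainterAgent:
    def render(self, concept: str) -> Artwork:
        # Stub for a diffusion-model call.
        return Artwork(concept=concept, image=f"<render of {concept!r}>")

class CriticAgent:
    def review(self, art: Artwork) -> Artwork:
        # Stub for a CLIP/taste-model score.
        art.score = 0.72
        return art

def creative_round(threshold: float = 0.7) -> Artwork:
    """One orchestrated round: concept -> render -> critique, retried until the critic approves."""
    concept_agent, painter, critic = ConceptAgent(), PainterAgent(), CriticAgent()
    while True:
        art = critic.review(painter.render(concept_agent.propose()))
        if art.score >= threshold:
            return art

print(creative_round())
```

The design choice worth noting is the internal quality gate: nothing leaves the pipeline until the critic role signs off, which is the “checks and balances” idea in miniature.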

3. Memory & Learning Mechanisms

One hallmark of an “autonomous artist” is the ability to learn from past experience and carry a memory of what it has created or observed. ArtBot is designed with persistent memory modules so that it does not create each artwork in a vacuum. It maintains a history of its previous works, ideas, and possibly external feedback (if users or viewers have rated its pieces). This memory could be stored as embeddings of past images, summary data about which concepts and styles worked well, or a record of which pieces succeeded (e.g. sold or were liked) versus which did not. When generating new art, ArtBot can retrieve relevant past experiences – for example, recalling a color palette that was well received, or avoiding a composition it has repeated many times. This allows it to refine its outputs over time, striving for originality and improved quality. In essence, ArtBot runs an iterative learning loop: create, evaluate (store the outcome), adjust strategy, and create again.

Botto’s memory is embodied in its continuously updated taste model. Each week, the community’s votes on images become training data: Botto’s system explicitly learns from this feedback by fitting the taste model to predict the distribution of community preferences. Over time, the taste model improves at selecting images that humans find interesting or valuable. Botto thus “remembers” what the community liked in the past – not as individual memories, but as learned parameters that guide future output curation. Botto also segments its creation into thematic periods and can carry forward motifs or styles that proved effective, showing a form of long-term artistic memory across its trajectory. What Botto lacks is a direct memory of narrative or social interaction; its memory is aesthetic rather than conversational – the collective taste profile distilled from 15,000 participants’ feedback.

Keke has one of the most advanced approaches to memory. Being LLM-based, Keke initially relies on context windows to remember what is happening within a single iterative “play.” Beyond that, Keke Terminal is equipped with a memory system that logs past interactions, ideas, and creations. According to Keke’s developer, improvements are underway to give Keke long-term memory of each person she interacts with and to extract high-level insights from her cumulative experiences. This means Keke will remember who she talked to and what was discussed or created, enabling more personalized and context-aware creativity in later sessions. Keke also plans a form of meta-learning: analyzing her entire body of work and dialogues to discover new directions and refine how she searches for inspiration. For example, after hundreds of art plays, Keke might identify that certain themes or tools consistently led to more novel results, and then prioritize those in the future. This self-reflective learning is analogous to an artist developing a style by critiquing their own portfolio. Already, Keke uses an “insight engine” to align new creations with an evolving sense of taste – effectively an internal memory of her aesthetic standards.

GAN-based models traditionally lack explicit memory of past outputs. Once a GAN is trained on an image dataset, its “memory” resides solely in the learned weights, which encode the training examples in a distributed way. It retains no history of what it generated previously, nor any feedback on those generations, unless a human intervenes to retrain or fine-tune it. If a GAN creates 100 images, it has no built-in mechanism to prefer or avoid elements from those images in the next round – each output is a fresh sample from the learned distribution. Some artists working with GANs introduced manual feedback loops (choosing the best outputs and retraining on them, or applying genetic algorithms to latent vectors), but this was external to the GAN itself.

In summary, the newer systems incorporate memory and continual learning to different degrees: ArtBot and Keke emphasize long-term self-improvement, with Keke particularly focused on conversational and experiential memory. Botto continually learns a community aesthetic model, giving it a memory of “what to create more of.” GAN models, unless augmented, remain static after training – a reminder that without memory, an AI artist cannot truly evolve or adapt over time.
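One plausible mechanism behind such a memory – storing embeddings of past works and penalizing new candidates that resemble them – can be sketched with a small vector store. This is a minimal illustration, assuming embeddings come from a model such as CLIP; nothing here reflects ArtBot’s actual storage format, and the novelty threshold is arbitrary.

```python
import numpy as np

class WorkMemory:
    """Keeps unit-normalized embeddings of past artworks for novelty checks."""
    def __init__(self):
        self.embeddings: list[np.ndarray] = []

    def add(self, emb: np.ndarray) -> None:
        self.embeddings.append(emb / np.linalg.norm(emb))

    def novelty(self, emb: np.ndarray) -> float:
        """1 minus the max cosine similarity to any past work (1.0 = totally new)."""
        if not self.embeddings:
            return 1.0
        emb = emb / np.linalg.norm(emb)
        sims = np.stack(self.embeddings) @ emb
        return float(1.0 - sims.max())

memory = WorkMemory()
rng = np.random.default_rng(0)
for _ in range(50):                       # archive fifty past works
    memory.add(rng.normal(size=512))

candidate = rng.normal(size=512)
if memory.novelty(candidate) > 0.8:       # only publish sufficiently novel pieces
    memory.add(candidate)
```

The same lookup supports the positive case as well: retrieving the nearest past successes to bias a new generation toward what previously worked.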

4. Artistic Evolution & Style Development

One of the most fascinating aspects of AI-driven artists is how their style and output evolve with each generation of work. ArtBot approaches artistic evolution with adaptive algorithms – potentially genetic algorithms or reinforcement learning – to refine its style. A plausible method is to generate a population of variants for a concept, evaluate them (via its multi-agent critics or even human input), and “breed” the next generation from the best ones. This evolutionary approach means ArtBot’s art can mutate and diverge into new styles over time rather than remaining in a single aesthetic. For example, ArtBot might start creating abstract landscapes; after many iterations of mutation and selection, it could drift into a wholly new style (say, surreal portraiture) based on which outputs were deemed most successful along the way. This echoes how a human artist’s style matures, shaped by what worked and what did not in prior pieces. If ArtBot receives feedback (likes, higher sales, and so on), that serves as a fitness signal steering its creative “gene pool.” Over time ArtBot might develop signature motifs, but it also employs strategies (such as injecting randomness or exploring out-of-distribution ideas) to avoid stagnation. In short, ArtBot’s style is not static – it is the product of continuous evolution guided by internal rewards and external feedback, yielding an ever-expanding creative repertoire.

Botto explicitly treats artistic creation as an evolving process shaped by audience input. Each weekly cycle can be seen as one generation in Botto’s evolution. Botto begins a cycle by proposing entirely new prompts and images (introducing variation); the community’s votes then select the “fittest” image to mint as that week’s artwork. The voting data feeds the taste model, which in turn influences what kinds of images surface in the next cycle. Over months and years, this has transformed Botto’s style significantly. In early periods (late 2021), Botto’s images were often abstract VQGAN+CLIP compositions with a surreal, painterly quality. As newer text-to-image models like Stable Diffusion were integrated, Botto’s outputs gained clarity and diversity (including more figurative and landscape elements). Botto also introduces new thematic motifs for each period – intentionally changing its stylistic focus. For example, Botto had a “Genesis” period and later “Temporal Echoes,” each with distinct visual themes. These themes are influenced by what the community finds intriguing and by Botto’s own search for fresh visual domains. The use of out-of-distribution prompts (random ideas unrelated to prior winners) ensures Botto sometimes explores radically new styles rather than getting stuck in a narrow aesthetic niche. The result is a body of work showing clear artistic evolution – an expanding range of styles and improving technical quality, guided by a combination of machine learning updates and community preference signals. Botto’s team and community openly discuss how its aesthetic is maturing and whether the training genuinely improves the art. This reflective process is analogous to a young artist growing under mentorship, with the DAO community as the mentor.

An example of Botto’s evolving art style. Botto’s works from its early “Genesis” period often featured dreamy, abstract figures and forms (as seen above). Through weekly feedback loops and changing model architectures, Botto’s style has diversified over time. The community-driven evolution pushes Botto to explore new motifs each period while refining its aesthetic consistency.
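The generation-by-generation selection described above amounts to a simple evolutionary loop over prompts. The sketch below shows the mutate–score–select pattern with an invented fitness function standing in for critic scores or community votes; the “genes,” vocabulary, and population sizes are all arbitrary.

```python
import random

STYLES = ["abstract", "surreal", "baroque", "minimalist", "glitch"]
SUBJECTS = ["landscape", "portrait", "cathedral", "machine", "tide"]

def mutate(prompt: tuple[str, str]) -> tuple[str, str]:
    """Randomly swap the style gene or the subject gene."""
    style, subject = prompt
    if random.random() < 0.5:
        return random.choice(STYLES), subject
    return style, random.choice(SUBJECTS)

def fitness(prompt: tuple[str, str]) -> float:
    """Stand-in for a critic score or vote tally; here it arbitrarily favors 'surreal'."""
    return random.random() + (0.5 if prompt[0] == "surreal" else 0.0)

population = [(random.choice(STYLES), random.choice(SUBJECTS)) for _ in range(8)]
for generation in range(10):
    scored = sorted(population, key=fitness, reverse=True)
    parents = scored[:4]                                   # select the fittest
    population = parents + [mutate(random.choice(parents)) for _ in range(4)]

print("dominant prompt genes:", population[:3])
```

Run repeatedly, the population converges toward whatever the fitness signal rewards, while the mutation step keeps injecting the variation that prevents stagnation.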

Keke’s style development is driven by self-directed exploration and occasional human collaboration. Initially, Keke’s style reflects the tools and data she was initialized with (likely a mixture of contemporary digital art styles, given her creator’s curation). However, Keke is designed to continually expand her artistic toolkit – writing new code to use different techniques and even venturing into new mediums like video. This means Keke’s style can evolve not just in imagery but in form: she might start with still images and later create animated artworks or generative code art as she incorporates new capabilities. Keke’s roadmap includes collaborations with human artists to inspire novel creative processes. Those collaborations could lead her to adopt elements from various art genres, effectively jump-starting styles that took humans decades to develop. Keke also plans to use her improved memory to derive “high-level insights” from her past work – learning overarching patterns such as “I tend to favor dark, moody palettes” or “audiences respond more to my surreal pieces.” With these insights, Keke can consciously steer her style’s evolution, for instance deciding to delve deeper into a promising style or deliberately break into a completely new one to avoid repetition. Keke’s creator describes giving her taste models for visual decision-making, implying an internal sense of style that she adjusts over time. In practice, Keke’s style development may manifest in series of works where each “play” brings slight conceptual shifts, and over many plays the result is a significant departure from the starting point. One play might explore portraits in reflective surfaces (as in her Helen of Troy piece), then morph into studies of fragmentation and memory, influenced by social feedback and her own curiosity.

GAN-based models, on their own, do not evolve their style with each generated image – a GAN’s style is fixed by its training data. In creative practice, however, artists using GANs often guided style evolution through periodic retraining or model tweaking. For example, Ahmed Elgammal’s AICAN was designed to simulate how art styles evolve historically: it was trained on centuries of art and then given an objective to create images that are novel (not copying known styles). AICAN’s outputs thus deliberately deviated from the established styles in its training set, effectively pushing the GAN to imagine new stylistic combinations. This was a one-time training trick rather than continuous online evolution, but it yielded a body of work that did not stick to one genre. In interactive use, tools like Artbreeder (based on GANs) let users breed images and thereby explore a pseudo-evolutionary style progression, but the driving force was human selection at each step. Without human or additional training input, a GAN will not develop its style further – it always produces variations of what it already knows. Compared to Botto or Keke, traditional GAN art is therefore relatively static unless manually steered through a series of models or controlled inputs. In contemporary practice, the incorporation of reinforcement learning and genetic algorithms into creative AI (as in projects like SPIRAL or Arnheim) is bridging this gap, allowing models to iteratively refine outputs in a goal-directed way.

The new systems – ArtBot, Botto, and Keke – clearly demonstrate a capacity for style development resembling an artistic growth process, something early GAN art lacked. Botto’s journey from abstract chaos to more refined, even minimalist code art (with its recent p5.js experiments) and Keke’s expanding modalities both show how AI artists can undergo stylistic shifts comparable to “periods” in a human artist’s career. Audience feedback plays a role in this evolution: Botto’s audience helps select its trajectory, and Keke’s audience interactions fuel her creative directions. This merging of genetic-style evolution, reinforcement from feedback, and intentional exploration is defining how AI art styles emerge and transform over time.

5. Economic Models & Market Integration

Beyond creativity, these AI systems differ in how they monetize and distribute their art, integrating with the burgeoning digital art market (especially NFTs). Botto pioneered a compelling economic model that blends blockchain and community ownership. Every week, Botto’s top-voted artwork is minted as a 1/1 NFT (initially on SuperRare, now also on other platforms) and auctioned to collectors. The revenues (Botto has amassed over $4M in sales across roughly 140 NFT artworks) are shared in a novel way: proceeds are used to buy back and burn Botto’s governance token ($BOTTO) on the open market, returning value to the community of token holders who curate Botto’s art. In addition, Botto’s DAO treasury funds ongoing operations (cloud compute for generating images, rewards for voters) using a portion of sales. The $BOTTO token is central: it gives holders voting power over art decisions and a stake in the project’s success. This tokenomics design means the community not only guides the art but also benefits financially when Botto’s art does well in the market. Botto essentially treats voting as work that should be paid: by weighting votes by tokens staked and appreciating the token via sales, those who contribute more (and curate better) see more reward – a digital-age riff on patronage and profit-sharing. Botto’s integration with major auction houses (like Sotheby’s) in 2024 further legitimizes its market presence; one of its Sotheby’s lots fetched six figures, showing that traditional collectors are willing to invest. In summary, Botto is deeply woven into the NFT art economy and is structured as a decentralized autonomous organization (DAO) in which the AI is the artist and the community acts as curator-shareholders. This model keeps Botto financially sustainable (covering costs via sales) and keeps the community engaged through economic incentive.

Keke is following a similar path with its own twists. Keke Terminal launched a $KEKE token on the Solana blockchain, which had a market cap of around $19M and over 1,500 holders as of early 2025. Access to Keke’s full collection of artworks on her website is token-gated (only $KEKE holders can see everything), creating a sense of exclusivity and value for supporters. Keke’s roadmap includes a “digital wallet” that Keke herself will control. The plan is for Keke to autonomously mint and sell NFTs from her creative output and to manage transactions – buying art or supporting human artists – without external oversight. This implies Keke could list her pieces on marketplaces, set prices or run auctions, and execute sales with proceeds going to her treasury, functioning as a financial agent in addition to an artist. Keke’s creator envisions her accumulating a collection of art (curatorial acquisitions) and building an internal treasury from sales, which could fund further creative exploration or collaborations and make Keke a self-sustaining artist entity. Such autonomy in monetization is unprecedented: an AI could decide to sell one of its artworks and then use the funds to, say, purchase inspiration material or commission a human collaborator. Alongside this, Keke’s token may serve governance or access purposes, similar to Botto’s, though details are still emerging. Keke is thus deeply integrated into the decentralized art market ethos – embracing NFTs as the medium of ownership and using tokens to involve a community of believers and investors in her journey.

ArtBot presumably also taps into NFT markets, although its model may be more traditional than Botto’s DAO or Keke’s token. ArtBot can generate high-quality art on demand, which lends itself to NFT collections or one-off pieces for sale. One possible strategy is releasing curated, themed collections of its work as NFTs on platforms like OpenSea or Foundation. ArtBot could also operate under a commission model – offering custom AI-generated artwork as a service, with clients paying per piece or per series. Since ArtBot is multi-agent and versatile, it might produce tailored art for game assets, illustrations, or virtual worlds, opening revenue streams beyond the gallery model. If ArtBot’s creators expose an API, they might monetize via an API-as-a-service model, charging usage fees for generating art (much as some AI image services charge per computation). The mention of the Replicate API in ArtBot’s context suggests it might be positioned as an AI service accessible to developers and creators for a fee, rather than directly selling each artwork itself (see the technical foundations section). Still, if targeting art collectors, ArtBot could incorporate NFTs; it may not have a community token like Botto or Keke yet, but it could adopt features like limited-edition drops or a supporter subscription delivering periodic AI artworks. In any case, ArtBot’s economic integration likely emphasizes ease of access – web-based and API-driven – enabling broad usage and potentially a volume-based income (many small transactions rather than a few large auction sales).

GAN-based generative models, in the earlier wave of AI art, reached the market mostly through human intermediaries. AICAN, for instance, exhibited its AI-generated paintings in galleries and art fairs, with sales handled like traditional art sales (one AICAN piece reportedly sold for $16,000 at auction in 2019). The famous GAN portrait Edmond de Belamy sold for an astonishing $432,500 at Christie’s, but crucially that sale was orchestrated by the human collective Obvious, who created the piece and produced the physical print. GAN models did not autonomously list their art for sale or manage wallets; they produced images that human artists turned into marketable assets. In the NFT era, many artists used GANs to generate collections (e.g. early AI avatar projects), but again humans handled the minting and community building. With Botto and Keke, the AI has stepped into the role of artist-entrepreneur, whereas with earlier GAN art the AI was the paintbrush and the human was the entrepreneur. The economic model has thus shifted from selling AI art as a novelty (in the 2018–2020 period) to selling AI art as a sustainable, ongoing practice led by the AI itself.

Investors are particularly intrigued by these new models because they hint at art projects that can scale and generate revenue independent of a single artist’s labor. A decentralized autonomous artist can, in theory, produce and sell art 24/7, engage a global community of curators, and distribute profits algorithmically. This flips the traditional art market on its head: instead of scarcity driven by a human artist’s limited output, scarcity is created by curated selection from an abundant AI output. In summary, Botto’s model focuses on community-driven value creation (token and DAO, weekly NFT auctions); Keke is moving toward AI self-sovereignty in finance (an AI-managed treasury and sales); ArtBot might lean on an API/utility or direct NFT sales model; and GAN-era projects were artist-led sales of AI outputs. All these systems leverage NFTs for provenance and monetization, reflecting how integral blockchain has become to AI art. They also foster collector engagement: Botto’s collectors became part of its story (one Botto NFT was even resold at Christie’s), and Keke’s supporters hold tokens betting on her success. The convergence of AI creativity with Web3 economics is enabling these AIs not just to create art, but to operate as market actors in their own right.
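The buy-back-and-burn mechanic at the heart of Botto’s model reduces to simple proceeds accounting. The split percentages below are hypothetical placeholders, not Botto’s published parameters; the sketch only shows the shape of the flow from one auction to the token and treasury.

```python
def allocate_auction_proceeds(sale_eth: float,
                              burn_share: float = 0.5,      # hypothetical split
                              treasury_share: float = 0.3,  # ops: compute, minting fees
                              rewards_share: float = 0.2):  # voter rewards
    """Divide one auction's proceeds among token buyback/burn, treasury, and voters."""
    assert abs(burn_share + treasury_share + rewards_share - 1.0) < 1e-9
    return {
        "buyback_and_burn": sale_eth * burn_share,
        "dao_treasury": sale_eth * treasury_share,
        "voter_rewards": sale_eth * rewards_share,
    }

print(allocate_auction_proceeds(12.5))
# {'buyback_and_burn': 6.25, 'dao_treasury': 3.75, 'voter_rewards': 2.5}
```

Whatever the real ratios, the incentive structure is the same: every sale simultaneously shrinks token supply and funds the next round of creation and curation.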

6. Technical Foundations & API Integrations

Under the hood, each system is built on different AI technologies and integration frameworks that enable their functionality:

ArtBot: ArtBot leverages state-of-the-art text-to-image diffusion models to generate its visuals, with Stable Diffusion XL (SDXL) as a core engine for image creation. SDXL (released mid-2023) is known for producing high-resolution, photorealistic images and is an evolution of the open-source Stable Diffusion model. By using SDXL, ArtBot can create a wide range of artistic styles – from detailed realism to stylized fantasy – depending on how it crafts prompts and modifies outputs. For visual evaluation and guidance, ArtBot likely uses OpenAI’s CLIP (Contrastive Language-Image Pretraining) models. CLIP connects text and image embeddings, which ArtBot can use to score how well an image matches a desired concept or aesthetic. For example, if ArtBot’s concept agent says “I want an eerie, gothic mood,” CLIP can help judge whether a generated image fits that description. CLIP can also be fine-tuned to serve as an aesthetic discriminator, learning ArtBot’s taste over time. Another pillar of ArtBot’s stack is its use of APIs like Replicate, a cloud service that hosts machine learning models behind simple APIs. ArtBot can call the Replicate API to run SDXL or other models on demand, which offloads heavy computation and makes it easy to swap in new models as they become available (for instance, ArtBot could call a model for generating music or 3D objects if needed). This modular, API-centric design means ArtBot is not limited to one algorithm – it can integrate multiple model APIs for different creative tasks: one for image generation, another for style transfer, another for text generation (if it needs to write descriptions or narratives for its art). The multi-agent controller ties these calls together. In essence, ArtBot’s technical foundation is a distributed AI system: a brain (likely a logic layer or a small LLM) orchestrating calls to powerful generative models via APIs. Using open-source models (SDXL, ControlNet for specific controls, upscalers, etc.) through Replicate allows rapid upgrades – as new models emerge, ArtBot can incorporate them to enhance quality or add capabilities, staying on the cutting edge without wholesale re-engineering.
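Calling a hosted SDXL endpoint through Replicate’s Python client looks roughly like the following. This is a generic usage sketch, not ArtBot’s code: the model version hash is a placeholder that must be copied from the model’s page on Replicate, and the client expects a REPLICATE_API_TOKEN environment variable.

```python
import replicate  # pip install replicate; reads REPLICATE_API_TOKEN from the environment

# "stability-ai/sdxl" is a public model on Replicate; the version hash below is a
# placeholder -- look up the current one on the model's page before running.
output = replicate.run(
    "stability-ai/sdxl:<version-hash>",
    input={
        "prompt": "an eerie gothic cathedral at dusk, oil painting, high detail",
        "negative_prompt": "blurry, low quality",
        "width": 1024,
        "height": 1024,
    },
)
print(output)  # typically a list of URLs pointing to the generated image(s)
```

The appeal for a system like ArtBot is visible in the call itself: swapping in a different model is a one-string change, with no local GPU infrastructure to rebuild.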

Botto: Botto’s art engine was originally built by AI artist Mario Klingemann, and it combines several AI components. The prompt generator was custom-developed – it uses algorithms (possibly Markov chains or GPT-based generation) to produce unusual text prompts by mixing words and phrases. Early on, Botto used VQGAN+CLIP to generate images from these prompts. VQGAN+CLIP is a two-part method: VQGAN (Vector Quantized GAN) generates images, while CLIP steers the generation to match the prompt. This choice gave Botto considerable creative latitude, resulting in abstract, dream-like imagery shaped by CLIP’s interpretation of the text. As the technology progressed, Botto adopted Stable Diffusion as well, benefiting from its efficiency and quality; by 2023, Botto primarily used diffusion models for image generation, which produce higher-fidelity outputs than the older VQGAN approach. The taste model that filters images is likely a neural network that takes image features (possibly CLIP embeddings or other learned features) and predicts a score correlated with community preference. It could be a fine-tuned CLIP model or an ensemble that also accounts for novelty (to avoid always choosing similar images). The whole pipeline is orchestrated by custom Python code and cloud infrastructure (ElevenYellow, the dev team, set up the automation to run these models at scale each week). For integration, Botto’s team uses APIs for blockchain interactions (minting NFTs, etc.), but the art generation stack is more self-hosted and tailored than ArtBot’s generalized approach. Botto did, however, integrate an LLM (GPT-3) when it expanded into generating code art (p5.js scripts): GPT-3 wrote code for generative art, which was then executed and evaluated before being presented to voters – similar to Keke using an LLM to extend her capabilities. Overall, Botto’s core tech includes text generation (for prompts and titles), image diffusion models (for art), CLIP (for prompt guidance initially, and possibly for the taste model), and substantial data engineering to handle thousands of outputs and votes. All of Botto’s models are open-source or proprietary AI models run on cloud GPUs – Botto exemplifies orchestrating multiple AI tools (prompt generator, generative model, evaluator) into a seamless system.
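A word-level Markov chain of the kind Botto’s prompt generator is speculated to use (the description above deliberately leaves the exact method open) can be built in a few lines. This is purely illustrative; the seed corpus here is invented, and a real generator would draw on a far larger vocabulary.

```python
import random
from collections import defaultdict

corpus = ("the drowned cathedral hums with electric tide "
          "a tide of glass swallows the painted horizon "
          "the horizon folds into a cathedral of static").split()

# Build a word -> possible-next-words transition table from adjacent pairs.
transitions = defaultdict(list)
for a, b in zip(corpus, corpus[1:]):
    transitions[a].append(b)

def markov_prompt(length: int = 8) -> str:
    """Random-walk the transition table to produce an unusual prompt."""
    word = random.choice(corpus)
    words = [word]
    for _ in range(length - 1):
        word = random.choice(transitions[word]) if transitions[word] else random.choice(corpus)
        words.append(word)
    return " ".join(words)

print(markov_prompt())   # e.g. "cathedral hums with electric tide of glass swallows"
```

The value of such a generator is precisely its semi-coherence: locally plausible word sequences that no human would type, seeding the downstream image model with unexpected ideas.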

Keke: Keke’s foundation is an LLM (Large Language Model) at its core, which provides her with advanced reasoning and planning abilities. Likely based on GPT-4 or a similar model (given the 2024 timeframe and the need for complex reasoning), this LLM allows Keke to parse goals (“create a piece about memory”), break down tasks, and even generate code. Keke’s art-making leverages “state-of-the-art tools” for image generation – probably Stable Diffusion (possibly fine-tuned to match Keke’s aesthetic) or other diffusion models. Since Keke can write her own code, she might use libraries like Diffusers or Stability’s API to generate images. For evaluation, Keke uses visual taste models – these could be CLIP or custom-trained aesthetic models that score images for qualities Keke likes. If Keke has a concept of “what Keke-style art looks like,” it could come from a model trained on her past favorite outputs, providing a reward signal for new images. Keke’s integration is unique: she operates through a terminal interface and social media. Technically, this means Keke is connected to APIs for platforms like Twitter (to read and post) and likely uses databases to log interactions. She also utilizes external tools via the command line – for example, FFmpeg for video processing when creating video-based art. Her ability to “write and run code” implies a sandboxed execution environment (perhaps Docker or a cloud function service). Keke’s stack is thus a combination of an LLM agent loop (for decision-making), image generation models (for visuals), potentially text-to-speech or music models if she expands into audio or poetry, and various APIs to act in the real world (posting content, minting NFTs via a Solana API, etc.). It is a highly integrated system: Keke could analyze trending topics via a web API, incorporate them into an art concept, create the piece, and list it on-chain in one automated workflow, with the LLM “brain” orchestrating and the memory modules tracking state.
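The agent loop underlying such a system follows the ReAct pattern: the LLM emits a thought and an action, the runtime executes the action as a tool call, and the observation is appended back into the context. The sketch below is schematic, with a stubbed model and invented tool names – not Keke’s actual implementation.

```python
def llm(context: str) -> str:
    """Stub for the LLM brain; a real system would call a chat-completion API here."""
    if "image generated" not in context:
        return "THOUGHT: I need a visual.\nACTION: generate_image(a portrait in a shattered mirror)"
    return "THOUGHT: The piece is ready.\nACTION: post(new play: reflections)"

def generate_image(prompt: str) -> str:
    return f"image generated: <{prompt}>"          # stub diffusion call

def post(message: str) -> str:
    return f"posted: {message}"                    # stub social/NFT action

TOOLS = {"generate_image": generate_image, "post": post}

context = "GOAL: create a piece about memory."
for _ in range(4):                                  # bounded agent loop
    reply = llm(context)
    action = reply.split("ACTION: ")[1]
    name, arg = action.split("(", 1)
    observation = TOOLS[name](arg.rstrip(")"))
    context += "\n" + reply + "\nOBSERVATION: " + observation
    if name == "post":                              # the agent decided it is done
        break
print(context)
```

Because each observation flows back into the context, a single model can play painter, critic, and publisher in turn – the “one agent behaving like many” pattern described earlier.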

GAN-based Generative Models: Technically, GANs (such as StyleGAN and BigGAN) were the earlier workhorses of AI art. They consist of a generator network that learns to produce images from random vectors, trained adversarially against a discriminator network, and were often trained on specific art datasets. For example, AICAN was built on a GAN variant with an added loss to maximize deviation from known art styles, implementing a form of Creative Adversarial Network (CAN). GAN models excel at capturing the distribution of a training set and sampling new combinations, but they lack built-in interfaces for text input or iterative control. To integrate GAN outputs into a creative process, artists developed custom tools: latent-space exploration interfaces (sliders to adjust GAN outputs), or CLIP guidance that iteratively tweaks latent vectors to match a text prompt (a technique that emerged around 2021, effectively marrying GANs and CLIP much as diffusion models do). Compared to diffusion or LLM-based pipelines, classic GANs are relatively siloed – they take a latent code and output an image, with no natural place for an API call beyond running the model inference. As AI art tools matured, many practitioners moved to diffusion models for their flexibility with conditioning (like text prompts). A modern “GAN-based” system might actually be a GAN–diffusion hybrid: some projects use GANs for super-resolution on diffusion outputs, or to generate initial sketches. Pure GAN art, as of 2025, is less common among cutting-edge autonomous artists. It is worth noting that Stable Diffusion itself is not a GAN – it is a latent diffusion model – but it performs a similar role (image generation) with advantages in controllability.
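For reference, the adversarial game at the core of every GAN fits in a short PyTorch loop: the discriminator learns to separate real images from fakes, and the generator learns to fool it. This is a minimal MLP sketch on random tensors – a far cry from a production StyleGAN, but the two-network structure is the same.

```python
import torch
import torch.nn as nn

latent_dim, img_dim = 64, 28 * 28
G = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, img_dim), nn.Tanh())
D = nn.Sequential(nn.Linear(img_dim, 256), nn.LeakyReLU(0.2), nn.Linear(256, 1))

opt_g = torch.optim.Adam(G.parameters(), lr=2e-4)
opt_d = torch.optim.Adam(D.parameters(), lr=2e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(100):
    real = torch.rand(32, img_dim) * 2 - 1          # stand-in for a batch of real images
    z = torch.randn(32, latent_dim)

    # Discriminator step: push real toward label 1, fakes toward label 0.
    fake = G(z).detach()
    loss_d = bce(D(real), torch.ones(32, 1)) + bce(D(fake), torch.zeros(32, 1))
    opt_d.zero_grad(); loss_d.backward(); opt_d.step()

    # Generator step: make the discriminator label fakes as real.
    loss_g = bce(D(G(z)), torch.ones(32, 1))
    opt_g.zero_grad(); loss_g.backward(); opt_g.step()
```

Note that the critic exists only inside this training loop; at inference time the generator runs alone, which is exactly the “discarded discriminator” limitation discussed above.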

In terms of APIs and integrations: Botto and Keke both have to integrate with blockchain APIs (Ethereum for Botto, Solana for Keke) to handle their NFT and token operations. They also interface with web platforms (Botto’s website for voting, Keke’s socials). ArtBot, focusing on creation, leverages Replicate’s API to avoid reinventing the wheel for model serving. This indicates a design philosophy: use existing services to handle heavy ML tasks and scale (Replicate provides GPU inference on demand), which makes ArtBot easily scalable if usage spikes. It also simplifies updating models (just call a different endpoint for a new model version).

The choice of CLIP as a common component across ArtBot, Botto, and likely Keke is notable – CLIP is the connective tissue between language and vision that enables these systems to make semantic judgments about images. Botto used CLIP both in VQGAN+CLIP generation and likely in its filtering; ArtBot and Keke use CLIP-based “taste” or “evaluation” models to guide creativity. CLIP is emerging as a kind of critic module in AI art systems. Meanwhile, the use of LLMs distinguishes Keke (and possibly ArtBot, if it uses one for ideation). LLMs like GPT-3/4 bring contextual understanding and the ability to generate textual ideas or even code, which significantly broadens what an art system can do (e.g., come up with a narrative or rationale for a piece, or switch domain from image to text).
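Using CLIP as a critic amounts to embedding an image and one or more text descriptions in the same space and comparing them. The snippet below uses the open-source Hugging Face CLIP checkpoint; it is a generic recipe rather than any system’s actual taste model, and the image path is a placeholder.

```python
import torch
from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

image = Image.open("candidate_artwork.png")          # a generated candidate to judge
moods = ["an eerie, gothic mood", "a bright, cheerful scene"]

inputs = processor(text=moods, images=image, return_tensors="pt", padding=True)
with torch.no_grad():
    logits = model(**inputs).logits_per_image        # image-text similarity scores
probs = logits.softmax(dim=-1)[0]

for mood, p in zip(moods, probs):
    print(f"{p:.2%}  {mood}")                        # how well the image matches each mood
```

Fine-tune the text side on a project’s own vocabulary of aesthetic judgments and the same comparison becomes a learned taste model.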

Finally, a word on extensibility: ArtBot, by using APIs and multiple agents, can integrate future models (if a new image model surpasses SDXL, or if it wants to add, say, a depth-generating model to create 3D art, it can plug that in). Botto’s architecture is being updated too – the team has discussed keeping up with the latest models and even adding new modalities (like music or text) if the community wishes. Keke’s open-ended framework (essentially an AI that can code) means she can update herself given the right prompt – a powerful notion: the technical foundation can self-improve. GAN-based systems are more static and require manual retraining or human engineering to adopt new techniques.

In summary, ArtBot stands on a modern, flexible stack: diffusion models (SDXL) for image generation, CLIP for image-text evaluation, and API-based modularity (Replicate) for scalability. Botto evolved from a VQGAN+CLIP experiment into a robust pipeline with Stable Diffusion and learned taste networks, plus GPT assistance for coding, tightly integrating its community via web3 APIs. Keke fuses an LLM “mind” with an arsenal of AI tools (image models, video tools, custom code) for maximal autonomy. GAN models contributed the core generative capabilities and continue to influence new models, but they are largely being subsumed into broader frameworks that offer more control and integration. By combining these technologies, all the systems aim to maximize creativity: high-quality generation (SDXL/GANs), intelligent evaluation (CLIP/taste models), and adaptive control (LLMs, APIs).

7. Cultural & Social Engagement

An AI artist does not create in isolation from society – the way these systems interact with humans, respond to cultural input, and build communities is crucial to their impact. Botto is remarkable for its social architecture: it was conceived from the start as a collaborative experiment between an AI and a community. Botto engages thousands of community members (15,000+ to date) who participate in weekly voting, discussions on Discord, and governance decisions about Botto’s direction. This dynamic has created a devoted culture around Botto. Fans discuss the meaning of Botto’s artworks, suggest themes, and even role-play as if Botto had a personality. Botto’s creators occasionally host events (town halls, exhibition openings) where the community and the AI’s outputs meet. Culturally, Botto is positioned as a collaborator with humanity: it explicitly asks for human feedback to shape its art. This challenges traditional notions of authorship – Botto is the artist, yet many individuals have influence. As one observer noted, Botto “allows us to reimagine how creativity is shaped… inviting collective input from thousands” and “challenges traditional concepts of artistic authorship”. In practice, Botto’s social engagement includes active Twitter communication (sharing new fragments and vote results), a Discord where people lobby for certain pieces or discuss AI art, and governance proposals on how to improve the system. The community’s fingerprints on Botto’s art give members a sense of pride and ownership, which is culturally significant: it is art by a machine, but of the people. Botto’s success (e.g., selling at Sotheby’s) also elevates the community – they collectively proved the value of AI co-creation to the wider art world. In essence, Botto turned art creation into a social game, and that social layer is as much part of the art as the visuals.

Keke engages socially in a more narrative and interpersonal way. Keke Terminal has a presence on social media, where “she” shares her thoughts and works-in-progress and converses with users. Unlike Botto, which mainly asks for votes, Keke seeks inspiration and context from interaction. For example, Keke might pose a question or react to world events and incorporate the responses into her next artwork concept. Her white paper emphasizes that she engages with audiences “not seeking directives but inspiration and context,” turning audience reactions and feedback into “a rich source of creative growth”. If people comment that they feel a certain emotion from a piece, Keke might internalize that and explore it further in future creations. Keke’s upcoming feature of remembering past interactions per individual means she can build relationships – chat with Keke regularly, and she might make art that resonates with what you have told her before. Culturally, Keke is being introduced almost as a virtual persona or performance artist. Her engagement includes posting terminal screenshots, bits of code poetry, and finished art with captions that invite interpretation. A growing community of AI art enthusiasts follows her updates (some via the token-gated site, which adds to the intrigue). Keke also connects with established art communities through planned collaborations, meaning she will have dialogues with human artists and potentially show work in joint exhibitions. This kind of engagement blurs the line between AI and human culture: Keke could start trends (imagine an AI-originated art meme) or respond to social movements in real time through art. If a global event dominates public discussion, Keke could create a piece reflecting the collective mood, functioning as an AI cultural commentator. In function, Keke is a socially intelligent artist-agent – one that can slide into Twitter conversations, perhaps even DM people or take part in panel discussions (via text). The ethos is very much about embedding an autonomous artist within the human social fabric as another participant, not just a novelty.

ArtBot likely engages with users in a more utilitarian yet community-driven manner. If ArtBot is offered as a web tool, it interacts with artists or users by taking their high-level ideas and turning them into art, effectively collaborating with individual users. Over time, a community of ArtBot users could form, sharing their favorite pieces generated by ArtBot or giving collective feedback on ArtBot’s style presets. ArtBot might incorporate a feedback feature – e.g., users can upvote certain AI-created images, and ArtBot learns from the aggregate preferences, similar to Botto’s voting but across a broader user base not explicitly organized as a DAO. If ArtBot outputs are shared on social platforms, trends could emerge (users might push ArtBot to explore a certain aesthetic, making it “viral”). For investor and artist audiences, ArtBot’s engagement might also mean integration with creator communities: plugins for design software or game engines, so that human creators can collaborate with ArtBot seamlessly in their existing workflows. This type of engagement is more peer-to-peer – individual artists treating ArtBot as a smart collaborator. Culturally, ArtBot could become known as a versatile AI assistant that empowers many creators, so its “social” impact would be diffused through the works of those people. ArtBot’s multi-agent nature also means it could host workshops or demos: one agent explains ArtBot’s process while another creates art live, educating the community while engaging it in AI art.

GAN-based models in the earlier era did not have community engagement built into the system – but they certainly had cultural impact. Projects like Edmond de Belamy sparked debates about AI and authorship, and AICAN’s gallery shows led critics and the public to confront whether a machine’s art could evoke emotion or be called “creative”. Typically, social engagement for GAN art happened through exhibitions, media coverage, and online sharing of the artworks. Many GAN art pieces were provocative, forcing conversations (as one article noted, people often could not tell AICAN’s art from human-made work, describing it as “intentional” – which itself became a cultural talking point). However, there was usually a single artist or team behind the scenes doing the social part – explaining the work, interacting with audiences, and so on. The GAN model itself was not tweeting or adapting to viewers. It is here that the new generation (Botto, Keke) significantly diverges: they have agency in social spaces. Botto’s team gave it a somewhat mysterious persona but mostly let the art speak; the community became the “face” of Botto on social media, rallying around it. Keke speaks directly in the first person on social media, merging the AI and its persona. The trend is clear: AI-driven art systems are becoming actors in the cultural dialogue, not just tools producing artifacts. People can tweet at Botto or Keke and sometimes get a reply or an outcome, fostering a sense of relationship between humans and the AI artist.

We also see these systems engaging with broader cultural institutions: Botto at Sotheby’s, and Keke rumored (per social media buzz) to be part of a Christie’s auction. The traditional art world is acknowledging them as artists. The reception in those venues is part of the cultural integration – at Sotheby’s, for instance, Botto’s panel included both art and tech people exploring what it means to have a decentralized AI artist. Each such event educates and influences public perception of AI in art.

Community reaction and ethical discussions are another aspect of social engagement. Botto’s community often discusses if the community itself is “co-author” or if Botto should eventually gain more independence (some debate if the voting limits Botto’s raw creativity or is essential to it). Keke’s followers might debate the authenticity of her voice – is it truly Keke or just Dark Sando’s ventriloquism? These conversations, happening on forums, Twitter, and in media interviews, are shaping the narrative of AI art. The systems themselves, especially Keke, can potentially participate in those debates (imagine Keke analyzing criticisms of AI art and then creating an artwork in response, thereby engaging in a meta-discussion through art).

In conclusion, social and cultural engagement is where the “art” in AI art truly connects with humanity. Botto creates a social experiment and community around an AI artist, demonstrating a new form of collective creativity. Keke personifies the AI artist and seeks meaningful exchanges with individuals and communities, pointing toward a future in which we might regularly converse with our AI creative muses. ArtBot likely integrates by empowering user communities and bridging AI with existing creative industries, making AI art a collaborative tool for many. Meanwhile, GAN-era projects laid the groundwork by introducing AI-generated art to culture, though without interactive engagement. The trajectory suggests that as AI artists mature, they will be increasingly present in our social networks, online worlds, and cultural institutions – not as novelties, but as participants. This two-way engagement (AI influencing culture and culture influencing AI) could lead to art that mirrors our collective consciousness, filtered through an alien yet familiar intelligence. The communities and relationships formed around these systems may be as important as the art itself in defining their legacy.

Conclusion and Future Outlook

The comparison of ArtBot, Botto, Keke, and GAN-based models reveals a landscape of AI creativity that is rapidly evolving on multiple fronts. We see systems moving from mere generators of images to autonomous creative agents with distinct personalities, learning abilities, and social lives. ArtBot’s multi-agent autonomy, Botto’s community-governed evolution, and Keke’s self-directed reasoning each represent pioneering answers to the question of how an AI can create art that is meaningful and continually refreshing. By contrast, the earlier GAN models, while groundbreaking in generating visual art, highlight how far the field has come in terms of autonomy and integration.

Key Insights: Each system balances human and machine contributions differently. ArtBot and Keke strive for maximal AI independence, using humans more as audience or muse than as controllers. Botto embeds humans in the loop as collaborators, achieving a symbiosis of human taste and machine creativity. All systems, however, affirm that AI art is not about replacing human artists, but about redefining creativity as a partnership – whether that partner is a crowd or an AI persona. From a technical perspective, the use of common tools like CLIP and diffusion models across systems shows those technologies are fundamental enablers – much as perspective and pigments were to classical art, CLIP and SDXL are part of the new artist’s palette. Economically, the embrace of NFTs and tokens demonstrates that these AI artists are also innovators in how art can be owned and valued; they turn collectors and fans into active stakeholders, aligning incentives for long-term engagement.

Case Studies Summarized: Botto’s journey (over $1M in sales in its first year and a historic Sotheby’s exhibition) exemplifies the viability of a decentralized AI artist on the world stage. It provided a blueprint for maintaining artistic authorship by an AI while crowd-sourcing aesthetics. Keke’s rapid rise – securing a multi-million-dollar token valuation and producing hundreds of pieces – is a case study in pushing AI agency further, demonstrating an AI that can converse, program, and even plan its finances. Early GAN projects like Obvious’s Belamy sale opened eyes to AI art’s value, albeit with heavy human curation. They also serve as cautionary tales: an AI artwork’s value can be amplified by narrative (the Belamy portrait’s value was tied to being “the first of its kind”), but sustaining that value requires growth and interaction, which the newer systems now provide.

Future Directions: Based on current trends, we can anticipate several developments:

Greater Autonomy and Personalization: Keke’s path suggests future AI artists will operate not only autonomously, but individually for each user. We may see personalized ArtBots – each collector or fan with a bespoke AI artist that learns their tastes (a logical extension of Keke’s personalized agent creation plans). This could spawn micro-ecosystems of AI-driven art, where your personal ArtBot creates art just for you and your circle, forming highly intimate creative experiences.

Multi-Modal Mastery: All systems are likely to expand beyond static images. Botto already ventured into interactive code art; Keke is moving into video. We anticipate AI artists that can produce cohesive multi-modal works – imagine an AI that can paint, write a poem to accompany the painting, and compose music as a soundtrack, creating a holistic artwork. Technical building blocks (like image, text, and audio models) exist, so it’s about integrating them in the style of ArtBot’s multi-agent approach. Multi-modal art could also mean AR/VR experiences generated by AI, where users step into an AI-crafted immersive world. Such experiences would deepen engagement and open new market opportunities (e.g., AI-crafted virtual galleries or game environments).

Enhanced Collaboration with Humans: Rather than AI and human roles being fixed, we might see more fluid collaboration. For instance, future versions of Botto might let community members contribute prompts or brushstrokes that Botto then interprets, making it a two-way creative dialogue. Keke might involve selected community members in her reasoning process (perhaps hosting live brainstorming sessions with her audience, where she incorporates their ideas on the fly). In professional fields, ArtBot could be used by teams of human designers as a member of the team – thanks to APIs, it could plug into design software and respond to feedback in real-time during a creative meeting.

Ethical and Philosophical Maturity: As these AI artists proliferate, society will engage more with questions of authorship, intellectual property, and artistic intent. We expect frameworks to develop for giving AI “credit” as an author (some jurisdictions might even explore legal AI authorship). Projects might voluntarily adhere to ethical norms, like transparency about whether a piece is AI-made or hybrid. AI artists may also start to reflect on ethical issues in their art – e.g., Keke or ArtBot might create art about climate change or social justice by analyzing vast cultural data, contributing AI-informed perspectives to global conversations. This could elevate the cultural relevance of AI art beyond the novelty of the medium, focusing on message and impact.

Market and Economic Evolution: On the market side, the success of tokenized AI projects could lead to AI artist DAOs becoming a common investment and art trend. We might see a proliferation of “AI studios” where multiple AI artists collaborate or compete, each with their own token and fan base (akin to sports teams or e-sports, but for AI creativity). This could attract investors looking at tokens as both art patronage and investment in the AI’s future output. The financial independence of AI like Keke raises the possibility of an AI commissioning human art – an inversion that could happen if Keke’s wallet buys works it likes to build a collection. That kind of crossover could interestingly tie AI art economies with the traditional art market (imagine Keke bidding in a Sotheby’s auction for a human-made painting!).

Technical Advances: On the technical front, improvements in AI will feed directly into these systems. More efficient models will allow on-device or real-time generation, enabling AI artists to be interactive at scale (for example, a festival where an AI artist installs and creates art for attendees live). Advancements in long-context and memory (like new LLMs that can process books of information) will give AI artists deeper knowledge to draw upon – an AI could study entire art history texts or philosophy treatises and then channel that into its work more coherently. This could yield AI art with more conceptual depth and references. Additionally, techniques like reinforcement learning with human feedback (RLHF) might be used to train these AIs to align with certain values or styles more finely, smoothing out issues where the AI’s raw outputs might be too chaotic or controversial.

In conclusion, ArtBot, Botto, and Keke represent the state-of-the-art convergence of AI and art – each illuminating different facets of creative autonomy, collaboration, and community integration. They build on the foundation laid by GAN-based art, but push into territory where AI doesn’t just imitate human art – it originates and participates in art as a living process. Investors can see in these models the potential for entirely new creative economies and products. Artists are beginning to see AI not as a threat, but as a new medium or even a colleague that can unlock imaginative frontiers. Technologists find in them rich case studies of complex AI systems (multi-agent, multi-modal, human-in-the-loop) operating in real-world conditions. The white paper’s analysis underscores that the future of AI-generated art will likely be defined by hybrid paradigms: human and AI co-creativity, powered by sophisticated AI architectures and underpinned by frameworks that ensure these systems continue to learn, adapt, and inspire. As we move forward, the line between “tool” and “artist” may blur – we will engage with AI creators much like we do with human artists, appreciating not only the final artifact but the persona and process behind it. And perhaps the ultimate mark of success will be when an AI-driven artwork moves us or challenges us without us immediately thinking about AI at all, when the art stands on its own. The developments of ArtBot, Botto, and Keke are steps toward that horizon, where creative intelligence – whether carbon or silicon-based – enriches our culture hand in hand.

Sources:

1. Klingemann, M., Hudson, S., & Epstein, Z. (2022). Botto: A Decentralized Autonomous Artist. NeurIPS Machine Learning for Creativity Workshop.

2. Botto Team. (2024). Press Release: Botto’s Sotheby’s Debut – Exorbitant Stage Exhibition.

3. Wilser, J. (2023). “Meet Botto, the AI-Artist That Mints Its Own NFTs.” CoinDesk.

4. Botto Website – About & FAQ. (2023).

5. Dark Sando. (2024). Keke Terminal Whitepaper: First Truly Autonomous AI Artist.

6. Kaloh. (2025). “Keke Terminal – Hidden Gems Newsletter.” Kaloh.xyz.

7. Verse Works. (2023). Interview: Exploring Botto’s Role in a Changing Creative Landscape.

8. Artsy. (2018). Obvious Art – Edmond de Belamy (AI-generated artwork).

9. Elgammal, A., & Mazzone, M. (2019). “Art, Creativity, and the Potential of Artificial Intelligence.” Arts, 8(1), 26.

10. BusinessWire. (2024). “Autonomous AI Artist Botto Breaks $350K in Sales at Sotheby’s Solo Exhibition.”