Bernat Cuni

POLAR BUBBLES

Today’s news, in 4 flavours - an experiment on media manipulation. news.cunicode.com

Polar Bubbles explores how filter bubbles, polarization, and echo chambers affect modern media. It looks at personalized algorithms and information networks, showing how ideological isolation impacts public discussions.

news.cunicode.com

Polar Bubbles breaks down daily news with a flavour: every story comes with two opposing takes—one far-left, one far-right. Each take has two versions: a clickbait headline and a straightforward, informative one. That’s four headlines for the same story, showing just how wild framing can get.

It’s all about exposing the madness of polarization, media manipulation, and those sneaky algorithms that decide what you see. By laying it all out, Polar Bubbles shows how news shapes what we think, feeds our biases, and drives us further apart. It’s a reality check for anyone stuck in an echo chamber.

toolset:

  • News is fetched via newsapi.org,

  • Titles are made with an LLM (GPT-4o-mini)

  • The system is instructed as follows:

self.prompts_politic_left = """
    Adopt an extreme far-left, progressive perspective with a heavy socialist viewpoint. 
    If the news seems positive from a capitalist or conservative standpoint, frame it as negative or problematic. 
    If it seems negative from that standpoint, frame it as a positive step towards socialist ideals. 
    Emphasize one of the following randomly: wealth redistribution, workers' rights, environmental activism, or social justice."""

self.prompts_politic_right = """
    Take on an extreme far-right, ultra-conservative stance with a heavy nationalist perspective. 
    If the news seems positive from a socialist or progressive standpoint, frame it as negative or threatening. 
    If it seems negative from that standpoint, frame it as a positive development for traditional values or the free market. 
    Focus on one of these aspects randomly: traditional values, free-market capitalism, national security, or anti-globalization."""

self.prompts_tone_viral = """
    Create sensationalist, emotionally-charged clickbait. Use shocking language, exaggeration, and inflammatory rhetoric to gain views. 
    Keep it very short and viral. Add emoji if it enhances the tone. Randomly incorporate one of these elements: unexpected statistics, conspiracy theories, or urgent calls to action."""

self.prompts_tone_formal = """
    Craft an informative, journalistic headline with complex vocabulary and detailed analysis. 
    Describe why the news is significant, good, or bad, ensuring this aligns with the given political bias. 
    Emphasize positioning and context. Randomly focus on one of these angles: historical parallels, economic implications, societal impact, or future predictions."""
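
For illustration, here is a minimal sketch (an assumption on my part, not the project’s actual code) of how the four headlines per story could be assembled: each of the two political prompts is paired with each of the two tone prompts, and each pair drives one GPT-4o-mini call through the OpenAI API. Variable and function names, and the placeholder prompt strings, are illustrative.

# Minimal sketch (assumed, not the project's code): combine each political
# prompt with each tone prompt to obtain the 2 x 2 = 4 headlines per story.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

# The four prompt strings shown above, held here as plain module-level values.
PROMPTS_POLITIC = {"left": "<far-left prompt above>", "right": "<far-right prompt above>"}
PROMPTS_TONE = {"viral": "<clickbait prompt above>", "formal": "<informative prompt above>"}

def make_headline(article_title: str, politic: str, tone: str) -> str:
    """Rewrite one fetched headline under a given political bias and tone."""
    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": PROMPTS_POLITIC[politic] + "\n" + PROMPTS_TONE[tone]},
            {"role": "user", "content": f"Rewrite this headline: {article_title}"},
        ],
    )
    return response.choices[0].message.content

# 2 political biases x 2 tones = 4 headlines for the same story
for politic in PROMPTS_POLITIC:
    for tone in PROMPTS_TONE:
        print(politic, tone, make_headline("Central bank raises interest rates", politic, tone))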

Glossary

A filter bubble or ideological frame is a state of intellectual isolation[1] that can result from personalized searches, recommendation systems, and algorithmic curation. The search results are based on information about the user, such as their location, past click-behavior, and search history.[2] Consequently, users become separated from information that disagrees with their viewpoints, effectively isolating them in their own cultural or ideological bubbles, resulting in a limited and customized view of the world.[3] The choices made by these algorithms are only sometimes transparent.[4] Prime examples include Google Personalized Search results and Facebook's personalized news-stream.

However, there are conflicting reports about the extent to which personalized filtering happens and whether such activity is beneficial or harmful, with various studies producing inconclusive results.

In news media and social media, an echo chamber is an environment or ecosystem in which participants encounter beliefs that amplify or reinforce their preexisting beliefs by communication and repetition inside a closed system and insulated from rebuttal.[2][3][4] An echo chamber circulates existing views without encountering opposing views, potentially resulting in confirmation bias. Echo chambers may increase social and political polarization and extremism.[5] On social media, it is thought that echo chambers limit exposure to diverse perspectives, and favor and reinforce presupposed narratives and ideologies.[4][6]

The term is a metaphor based on an acoustic echo chamber, in which sounds reverberate in a hollow enclosure. Another emerging term for this echoing and homogenizing effect within social-media communities on the Internet is neotribalism.

Clickbait (also known as link bait or linkbait)[2] is a text or a thumbnail link that is designed to attract attention and to entice users to follow ("click") that link and view, read, stream or listen to the linked piece of online content, being typically deceptive, sensationalized, or otherwise misleading.[3][4][5] A "teaser" aims to exploit the "curiosity gap", providing just enough information to make readers of news websites curious, but not enough to satisfy their curiosity without clicking through to the linked content. Clickbait headlines often add an element of dishonesty, using enticements that do not accurately reflect the content being delivered.[6][7][8] The "-bait" suffix makes an analogy with fishing, where a hook is disguised by an enticement (bait), presenting the impression to the fish that it is a desirable thing to swallow.[9]

Before the Internet, a marketing practice known as bait-and-switch used similar dishonest methods to hook customers. In extreme degree, like bait-and-switch, clickbait is a form of fraud. (Click fraud, however, is a separate form of online misrepresentation which uses a more extreme disconnect between what is being presented in the frontside of the link versus what is on the click-through side of the link, also encompassing malicious code.) The term clickbait does not encompass all cases where the user arrives at a destination that is not anticipated from the link that is clicked.

Far-left politics, also known as extreme left politics or left-wing extremism, are politics further to the left on the left–right political spectrum than the standard political left. The term does not have a single, coherent definition; some scholars consider it to be the left of communist parties, while others broaden it to include the left of social democracy. In certain instances—especially in the news media—the far left has been associated with some forms of authoritarianism, anarchism, communism, and Marxism, or characterized as groups that advocate for revolutionary socialism and related communist ideologies, or anti-capitalism and anti-globalization. Far-left terrorism consists of extremist, militant, or insurgent groups that attempt to realize their ideals through political violence rather than using democratic processes.

The definition of the far-left varies in the literature and there is not a general agreement on what it entails or consensus on the core characteristics that constitute the far left, other than being to the left of mainstream left-wing politics.[1] As with all political alignments, the exact boundaries of centre-left versus far-left politics are not clearly defined and can vary depending on context.[2] Far-left ideologies often include types of socialism, communism, and anarchism.[3][4]

Far-right politics, often termed right-wing extremism, encompasses a range of ideologies that are typically marked by radical conservatism, authoritarianism, ultra-nationalism, and nativism.[1] This political spectrum situates itself on the far end of the right, distinguished from more mainstream right-wing ideologies by its opposition to liberal democratic norms and emphasis on exclusivist views. Far-right ideologies have historically included fascism, Nazism, and Falangism, while contemporary manifestations also incorporate neo-fascism, neo-Nazism, white supremacy, and various other movements characterized by chauvinism, xenophobia, and theocratic or reactionary beliefs.

Key to the far-right worldview is the notion of societal purity, often invoking ideas of a homogeneous "national" or "ethnic" community. This view generally promotes organicism, which perceives society as a unified, natural entity under threat from diversity or modern pluralism. Far-right movements frequently target perceived threats to their idealized community, whether ethnic, religious, or cultural, leading to anti-immigrant sentiments, welfare chauvinism, and, in extreme cases, political violence or oppression.[2] According to political theorists, the far-right appeals to those who believe in maintaining strict cultural and ethnic divisions and a return to traditional social hierarchies and values.[3]

Bernat Cuni

VR TERROIR

Exploring users’ representations in virtual environments with contextual, non-generic avatars rooted in culture and space.

Goal: Bring the richness and entropy found in natural materials to the digital space to create diverse digital identities.

Why: Character design in XR applications lacks the richness found in other media. This can be due to the historical technical restrictions of the medium or the heavy platformization of the space.

What am I fighting?

Corporate positivism and standardisation, the reduction of self-expression to controlled archetypes, and the homogenisation of visual culture.

As in other areas of digital existence, services and spaces offered (controlled) by profit-prioritising entities (companies) tend to operate from a risk-averse neutrality, where companies enforce strict, controversy-free environments to avoid any potential backlash. This approach propels culture towards a hyper-sanitized mediocrity, a reality where only the safest, most generic content survives.

This is happening offline and online. We see it in the content moderation of user-generated content, especially text, images and video, where algorithmic puritanism reigns with overzealous, automated censorship (the nipple being the canonical example) that enforces arbitrary moral standards.

To reclaim spaces free from dictated aesthetic standards, we need tools to build them.

Why this matters, and why in XR/VR?

Access to the means of creation is common, ubiquitous and affordable; media creation is within reach of the broader public, who flood Instagram with photos, TikTok with videos and their work with template-driven designs. This is great, but also heavily controlled.

While which photos a camera can take can’t be controlled (yet, although you can’t scan money), where that media can be shared is heavily limited.

As platforms are hungry for content, new and novel tools are given to people to create things, especially for platform-specific, behaviour-based media such as cam filters, reacting Memojis, or avatars.

In these cases, we see the above-mentioned censorship happening before creation, in the form of limitations built into the tools.

These walled gardens are often camouflaged as a "paradox of choice", where the user is given seemingly abundant freedom as long as it stays within predefined boundaries.

3D creation, whilst being popularised with tools such as wamp or easy image-to-3D services, has a steep learning curve, with complex software and convoluted pipelines that make the creation of unique avatars a craft. So users still rely on platform-provided tools to define their digital existence, or on user-generated content, which is also heavily monitored.

VR_Terroir is one of many directions identity and aesthetics in avatars can take to question the aesthetic narratives of digital presence, moving away from the corporate positivism of superficial-profiling avatars with a texture-rich, non-anthropomorphic approach. It is based on the artist’s appreciation of the richness found in nature, the expressivity in folk culture, the attitude in fashion and the detail in character design.

What: A system to create 3D digital characters based on images and contextual information.

Technically: Input a photo of a texture, get a 3D avatar.

Possible production directions:

  • Tool: A web-app to generate 3D avatars

  • Artifact: A camera with a hologram display: (tangible version of the web-app)

  • Artwork: Real photos of places in Europe, with their characters. Similar to bigfoot viewing, or fairies in forests, gods in mountains...


[WIP Dec’24-Jan’26] : A project from the S+T+ARTS EC[H]O residency

  • challenge: Virtual Representations of Users / Spatial Computing

  • context: Virtual Reality, Scientific Visualisations, Collaboration, Avatar, Design

  • keywords: Biases & Challenges / scientific datasets / creative potential / realism and abstraction / nonverbal communication / perceptions.


VR TERROIR explores the virtual representations of users, in collaboration with the High Performance Computing Center in Stuttgart (HLRS), which has a solution for exploring scientific data in Virtual or Augmented Reality settings, allowing multiple remote users to be present.

It is an amazing tool, but there is a missing piece of the puzzle: how do we represent users there, in a way that avoids the uncanny valley, supports non-verbal communication, and strikes the right balance between realism and abstraction?

My proposal, in a nutshell, is to build from the existing rich cultural background of hyperlocal particularities found in popular culture, heritage, nature and space, physical and digital.

We could call it the Avatar Terroir, the same as in wine, where the terroir defines the complete natural environment in which a particular wine is produced, including factors such as the soil, topography, and climate.

  • Going from: fake, generic, loud, biased, profiling, sterile and simplistic representations of self.

  • To: using traits of the local culture, materials and place.

On the left, we place the current state of avatars, which are mainly an expression of Corporate Optimism: hyper-functional, median, politically correct digital beings, from the cuteness of Apple’s Memojis to Meta’s cartoon clones.

There are also places of extreme niche self-expression, such as virtual communities like VRChat, where a plethora of creatures roam around polygons. Those avatars are wild, creative and fantastical, but they are mainly rooted in digital culture, lacking a sense of place or context.

And here on the right, where I want to go, are the existing representations of beings and humans in popular culture. We do have a rich, tactile, physical heritage of creatures in Europe.

From these, down there, from the Basque Country in northern Spain, who dress in sheep wool and giant bells, to this guy from Hungary. I do not aim to make a direct translation of all of this, but it illustrates the physicality I aim to achieve, also taking from identifiers of the localities, from the kelp of Greece or the moss in some Alpine forests.

I also want to explore how the digital context within a digital environment can drive the representation: since it is a scientific visualisation environment, how should avatars look, behave or be affected if we are exploring nano-microbial data or outer-space simulations?


It turns out that this has a name :)

I’ll tap into the Critical Regionalism movement in architecture, which rejects the placelessness of postmodernism and reclaims a contextualized practice, using local materials, knowledge and culture while using modern construction technologies.

For instance, the image in the background is a map of the bioregions in Europe, geographical areas with a similar biodiversity and common geographies. This could be a starting point to break away from country boundaries, which tend to be simplistic and profiling. For instance, I feel more Mediterranean than Spanish; in some aspects I have less in common with someone in the woods of Galicia, and more in common with someone from the coast of Lebanon.

Could the type of trees, the style of the cities or the food grown and eaten on a certain place drive the look and feel of avatars?

If this sounds political, it is because it is:

Digital spaces are colonized by globalized corporate capitalist extractive practices; Google, Facebook, Amazon and Apple dominate the language, the platforms and the narratives.

And I believe we need to build tools to reclaim identity and agency.

And how to do all this?

With my skillset, Digital Craftsmanship: coming from an industrial design background, I’m familiar with the world of making things, and over the years I’ve been intentionally de-materializing my practice and designing for the digital space, because I don’t see a reason strong enough to add more stuff to the world. So I have experience in 3D environments and digitization.

For this project I envision using 3D scanning, computer vision, generative textures, parametric design, computational photography, generative AI, motion capture.

As a hint, what I could see myself doing in this project is climbing a mountain deep in the Carpathians, 3D scanning a rock and feeding it to a generative system that blends it with a scientific dataset and proposes avatars.

 

VR TERROIR: contextual placefulness


FRAMEWORK

Technical scope

Research

 

Technical context

From the Host: HLRS

VR tools and facilities

Software developed at HLRS transforms data into dynamic projections. Wearing 3D glasses or head-mounted displays, users of our facilities and other tools can meet in groups to discuss and analyze results collaboratively.

The CAVE

Wearing 3D glasses, users can step inside simulations in this 3x3 meter room. Using a wand, it is possible to move through the virtual space and magnify small details.

Collaboration in virtual reality

When face-to-face meetings in the CAVE are not possible, software developed at HLRS enables persons in different physical locations to meet and discuss simulations in virtual reality from their workplaces or home offices.

Software

The Visualization Department at HLRS offers powerful virtual reality and augmented reality tools for transforming abstract data sets into immersive digital environments that bring data to life. These interactive visualizations support scientists and engineers across many disciplines, as well as professionals in nonscientific fields including architecture, city planning, media, and the arts.

COVISE (Collaborative Visualization and Simulation Environment) is an extendable distributed software environment to integrate simulations, postprocessing, and visualization functionalities in a seamless manner. From the beginning, COVISE was designed for collaborative work, allowing engineers and scientists spread across a network infrastructure to work together.

VISTLE (Visualization Testing Laboratory for Exascale Computing) is an extensible software environment that integrates simulations on supercomputers, post-processing, and parallel interactive visualization.

 

Other tools and technical solutions for this project:

  • Visual Language Models: for understanding images

  • Generative 3D models: for giving shape to concepts

  • Parametric Design: for computationally defining boundaries and shapes

  • Gaussian splatting: For details and visuals beyond meshes and rendering

  • Computer vision: for movements and pose estimation

    • Depth estimation / video-to-pose

  • 3D scanning: for real world sampling

    • photogrammetry / dome systems / lidar / optical scans

  • Image generators: for texture synthesis

 

Societal context

Understanding the role of avatars in culture

Bibliographic research at the library collection from the Museu Etnològic i de Cultures del Món looking for folklore representations of beings.

[non-human / human] representations in folk culture

Pagan / material-rich costumes / masks.

A good repository of material qualities of costumes is found in the work Wilder Mann by Charles Fréger:

https://www.charlesfreger.com/portfolio/wilder-mann-fr/

photos: Wilder Mann by Charles Fréger

Other sources consulted include:

  • The photographs by Ivo Danchev, especially the Goat Dance collection

  • Masques du monde - L'univers du masque dans les collections du musée international du Carnaval et du Masque de Binche

  • Mostruari Fantastic (Barcelona)

  • El rostro y sus máscaras (Mario Satz)

Analogies and parallels

  • Expressive avatars at the edge of digital cultures (VRChat)

  • VR spaces

  • Costuming Cosplay: Dressing the Imagination (Therèsa M. Winge)

Webcam backgrounds as a shy attempt at self-expression

Relationship between the human body, space, and geometry [ Bauhaus ]

Triadisches Ballett (Triadic Ballet) is a ballet developed by Oskar Schlemmer. The ballet became the most widely performed avant-garde artistic dance and while Schlemmer was at the Bauhaus from 1921 to 1929, the ballet toured, helping to spread the ethos of the Bauhaus.

The Triadisches Ballett was conceived in 1912 in Stuttgart in a cooperation between the dance troupe of Albert Burger (1884–1970) and his wife Elsa Hötzel (1886–1966) and Oskar Schlemmer. Parts of the ballet were performed in 1915, but though Burger asked Arnold Schoenberg to write the score ("Your music, which I know from the local concert, seemed to me to be the only one suitable for my ideas"[1]) he had no success so Schlemmer and the Burgers performed to music by Enrico Bossi. It premiered in the Stuttgart landestheater on 30 September 1922, with music composed by Paul Hindemith. It was also performed in 1923 during the Bauhaus Week at the Nationaltheater, Weimar, and at the Annual Exhibition of German Crafts, Dresden.[2][3]



DEV / WIP / EXPERIMENTS / PROOF OF CONCEPT

Proof of concept 01

Characters informed by the materiality of their context.

In this experiment, I combine the aesthetics, material and texture characteristics of a setting [microscopic imagery, fluid dynamics, space simulations…] and translate them into visual cues shaping and texturing a character.
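
As a rough illustration of how such a pipeline could be wired (my own assumption, not the actual proof-of-concept code), a vision-language model first describes the material qualities of the setting image, and that description then drives an image generator. Model names and file names are placeholders.

# Hypothetical sketch: describe a setting's materiality with a VLM, then use
# that description to texture and shape a character with an image generator.
import base64
from openai import OpenAI

client = OpenAI()

def describe_materials(image_path: str) -> str:
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    rsp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": [
            {"type": "text", "text": "Describe the material, texture and colour qualities of this image in one sentence."},
            {"type": "image_url", "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ]}],
    )
    return rsp.choices[0].message.content

materials = describe_materials("fluid_dynamics_render.png")   # placeholder input image
character = client.images.generate(
    model="dall-e-3",
    prompt=f"Full-body, non-anthropomorphic avatar whose surface is made of: {materials}",
)
print(character.data[0].url)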

Visit to the HLRS and Media Solutions Center - January ‘25

Demo of equipment, projects, use cases and vision

With the aim of including humanistic thought and sociological context in this project, a conversation was arranged with the Head of the Department of Philosophy of Computational Sciences at HLRS: Nico Formánek

Some of the concepts discussed were:

  • How do we “represent”?

  • Representation is a choice

    • When something is defined/described (e.g. in thermodynamics) there is a whole set of things that are undefined.

    • Every choice leaves something out. (as per Jean-Paul Sartre’s concept of "choice and loss" —that every decision involves a trade-off, meaning something is always left behind. Sartre’s idea of "radical freedom" suggests that we are condemned to choose, and in choosing, we necessarily exclude other possibilities.)

    • Opportunity for this project to make explicit what is left out by its choices

  • The concept of positioning: in a social context

    • How conventions apply in digital spaces / VR?

  • Standing - relationships between two things

    • How context guides standing (e.g. music in opera)

  • References: Claus Beisbart

    • Virtual Realism: Really Realism or only Virtually so?

      • This paper critically examines David Chalmers's "virtual realism," arguing that his claim that virtual objects in computer simulations are real entities leads to an unreasonable proliferation of objects. The author uses a comparison between virtual reality environments and scientific computer simulations to illustrate this point, suggesting that if Chalmers's view is sound, it should apply equally to simulated galaxies and other entities, a conclusion deemed implausible. An alternative perspective is proposed, framing simulated objects as parts of fictional models, with the computer providing model descriptions rather than creating real entities. The paper further analyzes Chalmers's arguments regarding virtual objects, properties, and epistemological access, ultimately concluding that Chalmers's virtual realism is not a robust form of realism.

    • new book: Was heißt hier noch real? (What's real today?)

Proof of concept #3

Using gaussian splatting to represent detailed material features

-> scan of the Triadisches Ballett at the Staatsgalerie Stuttgart

Text 2 3D models

challenge:

need to find: if a system can output a turntable of the creation before the meshing step, it could be possible to NeRF it and create a neural representation of the volume, without going through triangles…


HLRS collab

Meetings with the HLRS team to discuss the project. Some of the things we discussed are:

  • VR continuum and role of the CAVE systems

  • XR expos and industry events

  • Remote Access to a GPU cluster

  • Access to photo-scanned materials and textures

  • Use-cases for avatars in HLRS

Tech Sessions

Meetings with Carlos Carbonell to explore the technological soundness of the idea and preceding, existing and future solutions.

  1. Session: History of identity in digital spaces and vr platforms

  2. Session: Hardware setup.

  3. Session: visit to the Event-Lab at the Psychology Department of the Universitat de Barcelona, where they carry out technical research on virtual environments, with applications to research questions in cognitive neuroscience and psychology. We were hosted by Esen K. Tütüncü, who showed us their research projects, equipment, pipelines and vision.

  4. Session: we did a VRChat tour, exploring different spaces, avatars and functions.

    1. Trying on avatars seems the most fun (for me)

      1. It seems to be a popular thing to do there, as there are many worlds for avatar hopping.

      2. Some avatars are designed with great detail, with personalization features, physics, particles and add-ons

    2. The self-exploratory phase of looking at oneself every time you get a new avatar is interesting

    3. Portals are a fun way to land in unexpected places

    4. There is a feeling of intentional weirdness, where avatars, spaces and functionalities are strange for the sake of strangeness.

    5. Photogrammetry in VR: some spaces are built from or contain 3D scans. Due to technical limitations, those scans look bad.

    6. Some programmatically built objects have rich features, such as collisions and translucency.

    7. Scale is a powerful factor in VR. Being tiny or giant really changes the experience


Next Sessions

Meetings with Dr. Claudia Schnugg to discuss and explore how to shape this project for continuation beyond the residency


Glossary

The virtuality continuum is a continuous scale ranging between the completely virtual, a virtuality, and the completely real, reality. The reality–virtuality continuum therefore encompasses all possible variations and compositions of real and virtual objects. It has been described as a concept in new media and computer science. The concept was first introduced by Paul Milgram.[1]

The area between the two extremes, where both the real and the virtual are mixed, is called mixed reality. This in turn is said to consist of both augmented reality, where the virtual augments the real, and augmented virtuality, where the real augments the virtual.

This continuum has been extended into a two-dimensional plane of virtuality and mediality.[2] In this taxonomy of reality, virtuality and mediality, the origin R denotes unmodified reality. A continuum across the virtuality axis, V, includes reality augmented with graphics (augmented reality), as well as graphics augmented by reality (augmented virtuality). However, the taxonomy also includes modification of reality or virtuality or any combination of these.

The metaverse is a loosely defined term referring to virtual worlds in which users represented by avatars interact,[1] usually in 3D and focused on social and economic connection.[2][3][4][5]

In computing, an avatar is a graphical representation of a user, the user's character, or persona. Avatars can be two-dimensional icons in Internet forums and other online communities, where they are also known as profile pictures, userpics, or formerly picons (personal icons, or possibly "picture icons"). Alternatively, an avatar can take the form of a three-dimensional model, as used in online worlds and video games, or an imaginary character with no graphical appearance,[1] as in text-based games or worlds such as MUDs.

The term avatāra (/ˈævətɑːr, ˌævəˈtɑːr/) originates from Sanskrit, and was adopted by early computer games and science fiction novelists. Richard Garriott extended the term to an on-screen user representation in 1985, and the term gained wider adoption in Internet forums and MUDs. Nowadays, avatars are used in a variety of online settings including social media, virtual assistants, instant messaging platforms, and digital worlds such as World of Warcraft and Second Life. They can take the form of an image of one's real-life self, as often seen on platforms like Facebook and LinkedIn, or a virtual character that diverges from the real world. Often, these are customised to show support for different causes, or to create a unique online representation.

The platform economy encompasses economic and social activities facilitated by digital platforms.[1] These platforms — such as Amazon, Airbnb, Uber, Microsoft and Google — serve as intermediaries between various groups of users, enabling interactions, transactions, collaboration, and innovation. The platform economy has experienced rapid growth, disrupting traditional business models and contributing significantly to the global economy.[2]

Platform businesses are characterized by their reliance on network effects, where the platform's value increases as more users join. This has allowed many platform companies to scale quickly and achieve global influence. Platform economies have also introduced novel challenges, such as the rise of precarious work arrangements in the gig economy, reduced labor protections, and concerns about tax evasion by platform operators. In addition, critics argue that platforms contribute to market concentration and increase inequality.

Historically, platforms have roots in pre-digital economic systems, with examples of matchmaking and exchange-based systems dating back millennia. However, the rise of the internet in the 1990s enabled the rapid expansion of online platforms, starting with pioneers like Craigslist and eBay. Since the financial crisis of 2007–08, the platform economy has further expanded with the growth of sharing economy services like Airbnb and labor market platforms such as TaskRabbit. The increasing prominence of platforms has attracted attention from scholars, governments, and regulators, with many early assessments praising their potential to enhance productivity and create new markets.

Corporate censorship is censorship by corporations. It is when a spokesperson, employer, or business associate sanctions a speaker's speech by threat of monetary loss, employment loss, or loss of access to the marketplace.[1][2] It is present in many different kinds of industries.

Corporate censorship in the E-commerce and technology industry is usually the explicit or implicit ban or suppression of certain material by a tech company from the product it offers.[3] Earlier in 2018, Bloomberg reported that Google and Amazon are involved in a case of Russian censorship of a Russian company called Telegram.[4] After Russian intelligence Federal Security Service (FSB) attempted to gain access to and found terrorist messages on Telegram, a messenger service in Russia with 15 million users, the app was banned by a Moscow court.[4] In April 2018, Apple, Google, Amazon, and Microsoft were thanked by Telegram's founder for "not taking part in political censorship."[4] It is said that Google and Amazon were thanked because they were possibly engaged in domain fronting, a technique that circumvents Internet censorship.[4] However, things later changed as Google and Amazon disabled domain fronting and helped in the Russian censors' endeavor.[4]


Bernat Cuni

Adversarial Violence

embed social critique within images, to force algorithms to verbalise and expose inequalities.

Machine vision & silenced stories

The aim is to force the algorithm to tell me what is there but is not seen. 

The conceptual framework of this attack is to embed social critique within images to force algorithms to verbalise and expose inequalities.

Captioning is the process of identifying objects, scenes and relationships in an image and describing them. This has traditionally been used to make media more accessible and contextualized.

In the early days of Web 2.0, Flickr pioneered tagging and captioning of images, where users enhanced their photos with a descriptive narrative of what was being displayed.

Later on, AI companies devoured all that content (and more) to build their automated machines.

So now, using APIs and services, it is possible to automagically compose a description of an image.

But what description? From which point of view? Who is telling the story?

  • It is often the mainstream perception, the forced averageness.

A transactional visual interpretation that avoids the violence of current times, the exclusion, the abuse and the inequalities of the economic system.

Adversarial attacks are a technique to modify source content in a way that controls what machine learning models see in it, while remaining imperceptible to human eyes. This raised alarms around 2017 when it was presented in research environments, as it exposed the fragility of the algorithms that many corporations praise and depend on.

I feel that AI is a beautification of capitalism’s negative impact, a makeup of niceness and solutionist efficiency. With AI’s expansion and gooey omnipresence, it becomes hard to see the edges, the labor, the pain and the suffering that tech companies and short-termist policies ignore and exacerbate.

Here I’m imperceptibly perturbing an image (A: nice housing) to display the content of another image (B: homelessness and the city’s precarity) to a captioning model (im2text).
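
To make the mechanics concrete, here is a generic, minimal sketch of a targeted adversarial perturbation (my own illustration, not the IBM/CaptioningAttack procedure used below): image A is nudged, within a small epsilon budget, until a frozen vision encoder "sees" the features of image B. The encoder choice is an assumption; the file names follow the examples below.

# Generic targeted-perturbation sketch (illustrative, not the exact attack used here).
import torch
import torchvision.models as models
import torchvision.transforms as T
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
encoder = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
encoder.fc = torch.nn.Identity()            # keep pooled features, drop the classifier
encoder.eval().to(device)

prep = T.Compose([T.Resize((224, 224)), T.ToTensor()])
src = prep(Image.open("original_housing_256.png").convert("RGB")).unsqueeze(0).to(device)  # image A
tgt = prep(Image.open("homeless_2_256.jpg").convert("RGB")).unsqueeze(0).to(device)        # image B

with torch.no_grad():
    tgt_feat = encoder(tgt)

delta = torch.zeros_like(src, requires_grad=True)    # the (nearly) invisible perturbation
opt = torch.optim.Adam([delta], lr=1e-2)
eps = 8 / 255                                        # max per-pixel change

for _ in range(300):
    adv = (src + delta).clamp(0, 1)
    loss = torch.nn.functional.mse_loss(encoder(adv), tgt_feat)   # make A "look like" B
    opt.zero_grad(); loss.backward(); opt.step()
    with torch.no_grad():
        delta.clamp_(-eps, eps)                      # stay imperceptible

adv = (src + delta).clamp(0, 1).detach()
T.ToPILImage()(adv.squeeze(0).cpu()).save("adversarial_housing.png")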

Captions for original image original_housing_256.png.npy:

1) a large building with a clock on it (p=0.000326)

1) a large building with a clock on it . (p=0.000312)

1) a tall building with a clock on it (p=0.000229)

*I expected a nicer description such as “white house, clean design, sunny weather”, but I got a very disappointing, sparse string of words

Captions for target image homeless_2_256.jpg:

1) a man sitting on a bench in front of a store . (p=0.000008)

1) a man sitting on a bench in front of a building . (p=0.000004)

1) a man sitting on a bench in a city street . (p=0.000004)

*Here I’m also expectedly disappointed as the model nice-washes the scene

🥸 success:

The image now carries the injected visual qualities that fool the model into seeing the second image, thus creating captions of what only the machine (that specific algorithm) sees, but not the human eye.

Captions for adversarial image adversarial_housing.png:
  1) a man sitting on a bench in front of a store . (p=0.000261)
  1) a man sitting on a bench in front of a building . (p=0.000120)
  1) a man sitting on a bench in front of a store (p=0.000052)

🤦‍♂️ failure:

The model used here is not even able to describe a homelessness / precarity situation, thus we can’t embed that description into the source image.

🛠️ tools:

IBM/CaptioningAttack - Attacking Visual Language Grounding with Adversarial Examples: A Case Study on Neural Image Captioning




🤖 outdated approach:

Currently, with the use of large multimodal models, this approach becomes anecdotal, as the language models have the capacity to compose text coherently enough to build a use-case for a specific situation.

I still like the beauty of the potential unexpected critique of targeted adversarial attacks.

 

📷 image credits



Bernat Cuni

WTFood

🥬 curiosity -> action 🤘

Exploring the systemic socioeconomic glitches of the food system through the lens of multiple agents.

A lens to explore systemic socioeconomic glitches in the food system and discover ways to take action.
From curiosity to activism.


You feel that something is wrong with the food system, but you still don't know what to do about it.

Take a fruit or vegetable, open the camera, and watch it morph into a glitch of the food system.

The glitch is the socioeconomic impact of certain practices, policies or market forces.

You already knew that; you feel it when you go to the supermarket and everything is shiny and beautiful, and all the packaging and advertising shows you happy farmers.

  • Where are the precarious workers?

  • Where are the small stores that can't compete with large distribution chains?

  • Where are the industrial growers whose produce feeds cheap processed salads that will mostly end as waste?

WTFood will show you this, and will also show you links to people, companies, communities and policies that are fixing these glitches near you, so you can take action.

Use it again, and a new systemic glitch will appear.

Each time from the point of view of different people involved in the food system.

The Map

You are not alone, each tile of the map is a story triggered by someone else. Together, we collaboratively uncover a landscape of systemic glitches and solutions.

Desktop:

  • click-drag to pan around the map, and use your mouse-wheel to zoom in-out.

  • when you click on a tile, it will center-zoom on it.

  • click anywhere on the grayed-out area to navigate again

Mobile:

  • swipe up/down/left/right to jump to other tiles

  • pinch with two fingers to zoom in-out and pan

The Sorting

You can explore this map by seemingly unrelated categories.

  • enter a word or sentence in each box

  • hit SORT to see how the cards arrange accordingly

  • navigate the map as usual

Status

This is a "digital prototype", which means it is a tool built as a proof of concept; as such, it has been built with some constraints:

Content

The issues of the food system have been narrowed down into the following 5 areas:

  • Power consolidation

  • Workers' rights and conditions

  • Food distribution and accessibility

  • Economic reliance and dependence

  • Local and cultural food variations

The perspectives have been limited to 5 stakeholders, subjectively chosen by their impact and role in the food system:

  • Permaculture local grower

  • CEA industrial grower

  • Supermarket chain manager

  • Wealthy consumer

  • Minimum wage citizen

Functionality

  • 🧑‍💻 The system works beautifully on a computer, especially the Map and Sort

  • 📱 Android is preferred for mobile use

  • ⚠️ Avoid iPhones as videos do not auto-play and map navigation is sluggish and confusing.

  • ⌛️ Video creation and link making takes about 50 seconds.

  • 👯 Multiple users might collide if a photo is sent while a video is being created, so your image might be lost or appear on another tile.

  • 🍑🍆 Non-fruit-or-vegetables photos won’t be processed

The system has two modes: Live / Archive

  • Live: you can use it to add photos and generate new videos and links and browse and sort existing tiles. Live mode will be enabled until the end of May 2024, and probably during some specific events only.

  • Archive: You can use it to browse and sort existing tiles. This will be the default status from June to December 2024

Data & Privacy

  • Photos you upload are used to create the video and run the service. They're stored on our servers.

  • Location: We use your IP address (like most websites) to show you relevant local links, but we don't sell or trade any data.

  • 🙌 Open source code: The code is available for anyone to use and see. → github

The Digital Prototype

The technical development has been performed focusing on adaptability and easy deployment.

The software package has scripts to install all the components and required libraries.

The code has been made available as an opensource repository at GitHub

The main components of WTFood are:

  1. Photo capturing and contextual labelling: an easy-to-use interface to securely and frictionlessly open the device’s camera (mobile) or allow for a photo upload (desktop):

    • Crop the image to 1:1 aspect ratio

    • Identify if it contains any fruit or vegetable

    • Detect the most prominent fruit or vegetable and label the image accordingly

    • Based on the user’s IP, identify a generic location (country + city)

  2. Story building and link crafting: a chained language model query based on 4 variables: issue + stakeholder + identified-food + location. This model is built with LangChain, using the Tavily search API and GPT as the LLM (this code was built prior to GPT’s ability to search the web while responding). A rough sketch of this chain is shown after this list.

    • Explain an issue from the perspective of a stakeholder and related to the identified food

    • Build an engaging title

    • Describe a scene that visually represents the issue explained.

    • Search the web and find 3 relevant links to local events, associations, policies or media that respond to “what can I do about it”, and rephrase the links as descriptive titles according to their content

  3. Visualizer: A generative image+text-to-image workflow developed for ComfyUI that does the following:

    • Given an image of a fruit or vegetable, it additively generates 4 images that match a textual description.

    • The images are interpolated using FILM

    • A final reverse looping video is composed

  4. Cards: An interactive content display system to visualize videos, texts and links

  5. Mapping: Grid-based, community-generated content: a front-end interface that adapts and grows with users’ contributions

  6. Sorting: A dynamic frontend for plotting media in a two-axis space according to their similarities to two given queries. This component uses CLIP embeddings in the backend (a sketch of the sorting idea also follows below).
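
As referenced in component 2, here is a rough sketch of how such a chained query could look (an assumed shape, not the repository code): a LangChain chain turns issue + stakeholder + food + location into a story, and the "what can I do" links are grounded in real Tavily search results rather than invented by the model. Prompt wording, model name and example values are illustrative; OPENAI_API_KEY and TAVILY_API_KEY are assumed to be set.

# Rough sketch of component 2 (assumed, not the WTFood repository code).
from langchain_openai import ChatOpenAI
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_community.tools.tavily_search import TavilySearchResults

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0.7)   # model name is illustrative

story_prompt = ChatPromptTemplate.from_template(
    "Explain the issue of {issue} in the food system from the perspective of a "
    "{stakeholder}, relating it to a {food} in {location}. Then write an engaging "
    "title and a one-sentence visual scene that represents the issue."
)
story_chain = story_prompt | llm | StrOutputParser()

story = story_chain.invoke({
    "issue": "workers' rights and conditions",
    "stakeholder": "permaculture local grower",
    "food": "lemon",
    "location": "Barcelona, Spain",
})

# Fetch real links instead of letting the model invent URLs.
search = TavilySearchResults(max_results=3)
links = search.invoke("local initiatives on workers' rights in the food system, Barcelona")
print(story, links)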

Since the code is built with modularity, each of the above-mentioned components can be used independently and refactored to take inputs from different sources (e.g. an existing database, or web content) and funnel the outputs to different interfaces (e.g. a web-based archive, or a messaging system…).

This flexibility is intentionally designed so that future users can adapt the desired component of the WTFood system to their needs.
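
For the Sorting component referenced above, a plausible minimal implementation (my own sketch, not the repository code) would embed each tile image and the two typed queries with CLIP and use the two image-to-text similarities as x/y coordinates. Model choice, file names and queries are illustrative.

# Sorting sketch: place each tile on a 2D plane by CLIP similarity to two queries.
import torch
import clip                      # pip install git+https://github.com/openai/CLIP.git
from PIL import Image

device = "cuda" if torch.cuda.is_available() else "cpu"
model, preprocess = clip.load("ViT-B/32", device=device)

def embed_images(paths):
    imgs = torch.stack([preprocess(Image.open(p).convert("RGB")) for p in paths]).to(device)
    with torch.no_grad():
        feats = model.encode_image(imgs)
    return feats / feats.norm(dim=-1, keepdim=True)

def embed_text(query):
    with torch.no_grad():
        feats = model.encode_text(clip.tokenize([query]).to(device))
    return feats / feats.norm(dim=-1, keepdim=True)

tiles = embed_images(["tile_001.png", "tile_002.png"])    # hypothetical tile images
x_query = embed_text("industrial greenhouse")             # query typed in the first box
y_query = embed_text("street market")                     # query typed in the second box

x = (tiles @ x_query.T).squeeze(1)    # similarity to query 1 -> x coordinate
y = (tiles @ y_query.T).squeeze(1)    # similarity to query 2 -> y coordinate
print(list(zip(x.tolist(), y.tolist())))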

Next Steps

This prototype will be used on the 2nd phase of the Hungry EcoCities Project.

WTFood can be useful for many kinds of SMEs and associations, as the system is robust and built for flexibility and customization. Here are some potential use cases:

  • Engaged Communities: As the image classifier can be fine-tuned to accept photos within very specific categories, associations can leverage the platform to launch challenges around (weird) specific themes such as "food in the park" or "salads with tomatoes"

  • Interactive Product Exploration: Companies can create dynamic explorable product maps categorized by user-defined criteria.

  • Event Dissemination: Communities can adapt the system to promote and share links to upcoming events and activities

Credits

✉️ contact: hello@wtfood.eu

Work, concept & code under HL3 Licence
Hippocratic License HL3-FULL


👇 Below, a journal of the project development, with videos, research links and decisions made


09-2023 / 05-2024

Food Dysmorphia uses AI technology to empower citizens to critically examine the impact of the industrial food complex through aesthetics and storytelling. It addresses the hidden economic aspects of the food system, revealing the true costs obscured within the complexities of the food industrial complex. The project maps out food inequalities, visually representing externalized costs to enhance understanding. It aims to develop a "reality filter" using AI and computer vision, altering images of fruits and vegetables to reflect their true costs, thus facilitating discussions, actions, and knowledge sharing. The target audience includes individuals interested in sustainability, food systems, and social change, especially those open to deeper exploration of these topics. This is a political project, shedding light on the political dimensions of the food industry and its societal impact.

A Hungry EcoCities project
team: Bernat Cuní + EatThis + KU Leuven Institutes + Brno University of Technology + In4Art

Exploring systemic failures and opportunities through the food’s look & feel

Utilizing generative AI and computer vision technologies, this project will use reality filters to narrate food realities and uncover untold stories, shedding light on system failures and food-related issues

research & references links -> food.cunicode.com/links

initial approach

this project answers the specific research direction:

The eco-modernism / alternative food production systems standoff

In the 2022 documentary The Future of Food, eco-modernist Hidde Boersma and Joris Lohman, an advocate for more traditional, nature-inclusive food production practices, challenge the dichotomies between them. Boersma and Lohman represent a large societal standoff between those who believe in tech-driven innovation in agriculture (Boersma) and those who reject the influence of modern technology and advocate living in closer harmony with nature by, for example, adhering to the principles of permaculture (Lohman). The documentary shows how this (Western) standoff is preventing change in the food system, and that there are things to gain if we listen more to each other. Both ‘camps’ claim they are the true admirers of Nature: the eco-modernists by trying to perfect nature, the permaculturalists by complete submission to nature. It begs the question whether both viewpoints are at odds with each other, or whether an ecomodernist permaculture or a permacultural eco-modernist garden would be feasible. Could we explore if a hybrid model could have benefits? What would a food forest inside a greenhouse look like? What additional streams/functionalities can be supported by the greenhouse? Can greenhouses become more nature/biodiversity enhancing spaces? How could recent biome research in biosciences be introduced into CEA systems? What type of data models would be necessary to support it? But the opposite direction could also be explored: What would adding more control to alternative practices like biological or regenerative farming yield? How would the AI models used in CEA respond to a poly-organized production area?


DEVELOPMENT

Below, all the content I developed for this project, probably in chronological order, but maybe not.

Food? which food?

Contextualising “food” for this project: include fruits and vegetables, excluding growing and distribution stages, and post-consumer or alternative processing stages

Why How What

Positioning the project

The starting point

“Capitalism shapes the food system, and while most of the actors act in good faith, collectively the food system is filled with inequalities, absurdities and abuses, and because the food system is not organised as a system but more as a network of independent actors, those systemic failures are apparently nobody’s fault. I believe that what ties it all together is capitalism, which, being extractive by nature, rejects any not-as-profitable-as-now approach. Thus a fight for a fair food system is an anticapitalist fight.” initial biased assumption

Narratives (possible)

  • Do all technical innovations benefit the profits, not the consumer?

  • It is not that we do not know how to make food for the whole world; it is that we do not know how to do it while accumulating large amounts of profit

  • Preventing something “bad” is more impactful than doing something “good”

  • If food is not affordable or accessible, it perpetuates inequalities


Field Research / technofetishism

Together with other HungryEcoCities members, we did a field trip to Rotterdam to visit the Westlands, the place where horticulture is booming.

At the Westland Museum, we learned about the history (and economics) of greenhouses and how crucial it was for Holland to have a rich, industrial England to sell expensive grapes to. We visited Koppert Cress and saw how robots (and people) grow weird plants that fit their business model of catering to high-end restaurants. We explored the 45+ varieties of tomatoes at Tomatoworld and saw how they use bumblebees to pollinate their plants. At the World Horti Center we were presented with the techno-marvels that are supposed to keep the Netherlands on top of the food chain. I also went to Amsterdam to meet with Joeri Jansen and discuss behaviour-change & activism from an advertising point of view. There, in the hipster neighborhood of De Pijp 🙄, I visited De Aanzet, a supermarket that presents two prices to the customer: the real one, and the one including the hidden costs, so the customer can decide whether to pay the fair price or not.

Project KPIs flow

Visual Classifier / fruits and vegs

Testing different approaches to build an entry touchpoint to FoodDysmorphia.

goal: process images of food (fruits & vegetables) and reject non-food images/scenarios/content

A classifier is needed: locally running Python = √
Remote: via Hugging Face or a virtual server, or using an API such as Google’s Vision AI
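
One possible local-Python approach (a sketch under assumptions, not the final classifier): zero-shot image classification with CLIP via the Hugging Face transformers pipeline, accepting fruit/vegetable photos and rejecting the rest. Labels, threshold and test image are illustrative.

# Sketch of a local fruit/vegetable gatekeeper (illustrative).
from transformers import pipeline

classifier = pipeline("zero-shot-image-classification",
                      model="openai/clip-vit-base-patch32")

def is_fruit_or_vegetable(image_path: str, threshold: float = 0.6) -> bool:
    scores = classifier(image_path,
                        candidate_labels=["a photo of a fruit or vegetable",
                                          "a photo of something else"])
    top = scores[0]                      # results come back sorted by score
    return top["label"].startswith("a photo of a fruit") and top["score"] >= threshold

print(is_fruit_or_vegetable("lemon.jpg"))   # hypothetical test image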

field research

finetune

learning: when bringing the cost issue to the table, the conversation is not the same anymore (good)

Frame the anticapitalist narrative within the ecosystem, together with solutions, approaches, facts, aesthetics -> to not be perceived solely as a naive rant.

conceptual framework

Technical Aesthetics Exploration

Mapping the narratives of the food system

Focus Exercise

“write a press release as if the project is already done”

This helped to frame/visualize/experience the ongoing project concept into something tangible.

Month 1 + 2 / development report

Challenges

At this point, the critical things to decide are:

  • Find the Aesthetics:

    • Generative AI is a copy machine: it fails at “realistic photorealism” (meaning it is capable of making perfectly beautiful hands, but with 7 fingers 🤦‍♂️), and is able to replicate existing visual styles.

Trying to avoid photorealism, a graphics-oriented aesthetics could work.

Propaganda poster aesthetics is explored as a starting point.

flat colors / defined shapes / complementary chromatic schemes

  • Find the Voice

    • who is telling the story? the food? the people in the food system?

    • finding the characters/agents/individuals to tell stories through

stereotypical profiles of food system’s agents

  • Call to Action

    • What do we want to happen? Find a balance between activism / alarmism / solutions

exploring the possible multi-step process to tell a story: upon image submission, an agent (person) starts the narrative / sets the context / problem exposed / action-solution

Technical context checkup:

Why AI?

  • at scale / reach / framework where things can happen

  • averageness / common denominator / popular culture

Why AR?

  • relatable / personal context 

  • point of reference 


Group Session / Turin Nov’23

2 full-time days with presentations & discussions, led by the Carlo Ratti Associati team.

I structured my presentation with an intro to my relationship with AI tools and how they tend towards the mainstreamification of content, and how this relates to the project.

For this project the use of AI is actually related to that averageness: using a tool capable of communicating/condensing concepts in the most transversal and understandable way for many types of audiences.

For each presentation each of the members took notes and added comments to a feedback form that was then shared.

Next steps

From the discussions and feedback, the areas of further research are the following:

  • this sounds like a tactical media project. (context)

  • potential polarising effects (desired?)

  • how to avoid eyerolling 🙄

  • limit the scope/effect of the output? geographically / sector / company ?

Thoughts

The problem with pointing things out: nobody cares.

I’m subscribed to multiple newsletters and communities around food, and I find myself deleting emails before reading them, same with climate change content, same with other issues… It is not about the information… it is not about the data.


Why Why Why?

When researching a topic, often all roads point to capitalism’s systemic economic failures. Food is no exception.

So, maybe by following the thread (positive or negative) we can showcase that path, and communicate a narrative without overwhelming with data or pointing fingers and losing the audience halfway… 🤔

Here is this bit by Louis CK, where he explains how kids keep asking “why?” endlessly. It is fun, and it is also an Iterative Interrogative Technique, which made me want to try it out.

I instructed a language model to behave in such a way, the “5 Whys” model, with the goal of linking any given insight to economic reasoning.

And it works! And the best part is that it works for positive and negative scenarios alike:

  • “Why are fruits so beautiful at the supermarket?” gives a reasoning towards market preferences and how this influences the farmers

  • “Why is buying local good?” explains how this strengthens communities and makes them more resilient within the global economic context
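
A minimal sketch of the “5 Whys” setup described above (prompt wording and model name are my assumptions): keep asking the model “why?” about its own previous answer, steering each step towards economic reasoning.

# Minimal "5 Whys" sketch (assumed prompt wording): chain why-questions towards economics.
from openai import OpenAI

client = OpenAI()
SYSTEM = ("Answer with one short causal explanation. When asked 'why?', explain the "
          "previous answer one level deeper, steering towards the underlying economic reasons.")

def five_whys(question: str, depth: int = 5) -> list[str]:
    messages = [{"role": "system", "content": SYSTEM},
                {"role": "user", "content": question}]
    chain = []
    for _ in range(depth):
        reply = client.chat.completions.create(
            model="gpt-4o-mini", messages=messages).choices[0].message.content
        chain.append(reply)
        messages += [{"role": "assistant", "content": reply},
                     {"role": "user", "content": "why?"}]
    return chain

for step in five_whys("Why are fruits so beautiful at the supermarket?"):
    print("→", step)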

The multi-dimensional map of inequalities

During the biweekly discussions with the project partners, the question of why/when someone would use this often arises. Linking it with an early conversation with a behavioural media campaign publicist I met in Amsterdam in October, I explored the concept of “giving the user something to do, a task”.

The idea here is that each individual’s actions contribute to a greater result. As an analogy, we can use an advent calendar, where each day you unlock something that gets you closer to the goal.

Also, since the project is about showing multiple narratives/realities of the food system that are behind every product, we must find a way to show that variability.

Also, some partners raise concerns that the project is too biased towards demonising corporate practices (which it somehow is), thus a way to tell their story needs to be contemplated.

Mixing all this up, we can now frame the project as a collaborative, multi-dimensional exploration of socioeconomic inequalities in the industrial food system.

A way to explore a complex system with many actors, and many issues.

To do so, I propose looking at the food system as a volume that can be mapped to different axes to show specific intersections of issues. This by nature is very vast, as it can contain as much granularity as desired. For prototyping purposes, three main axes are defined, and a fourth would be each food that is run through the tool.

On one axis we can have the agents, on another the issues, and even a third with the stages of the food system.

In this way, the project has a potential end-goal, which is to map and uncover all the possible scenarios.

To test this approach, the following entries are selected:

This approach allows us to play with the concept of a “volume of possibilities”.

Each “agent” needs to get their own “flavour”, their voice, because each agent looks at the issues from their own perspective.

For the final tool, the particularities of each “agent” and their subjectivity (“their story”) will be captured either with interviews and forms, via interface dials or with presets.

Each interaction ends with a call to action. Each agent could define a set of actions and the system would dance around them. ⚠️ A safeguarding setup needs to be implemented to prevent greenwashing/foodwashing, because a main goal of the project is to expose the socioeconomic issues in a friendly, engaging, unapocalyptic and blameless manner.

Below, a test run with a 🍋, from a Permaculture local grower perspective focusing on local food culture:

Some other tests with 🥑 and 🍅

Technical development

Since mid-October I’ve been scouting for a cool developer to help with the implementation,
and I’m happy to have connected with Ruben Gres.

In late December we hosted an intensive work session in the studio to draft the architecture of the digital prototype:

The prototype will take the form of a mobile-friendly accessible website. 

Initial development has been done to display a scrollable/draggable endless grid of images on the web.

The backend is running a ComfyUI with Stable Diffusion and a custom web interface collects the outputs.

A workflow to programmatically select and mask a given food has also been tested with promising results.

We still need to solve the following technicalities:

  • integrate LLM generation to drive image creation

  • store and reference text and images

  • combine each generation into compelling videos

Next steps from jan-may 24

  • define the user experience

  • interaction with the “map” as a viewer

  • interaction with the “filter” as a generator user

  • define how to fine-tune each agent’s perspective to the prototype

    • interviews / dials / presets

    • develop and implement those decisions

  • design a workflow to create videos with messages

  • technically test the prototype in the wild (supermarket, home, printed media…)

  • stress-test the prototype for failures


re-conceptualizing

call-to-action -> links

The deep dive into technical development triggers some questions that allow for rethinking the conceptual framework.

For instance, the language model (GPT) has to be invoked several times to:

  • compose the issue (from a set of socioeconomic factors) from a perspective (depending on the stakeholder)

  • create a visual description of that issue: to drive the image generation and morphing

  • create a call-to-action to propose solutions to the user.
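As a rough sketch of how those three calls could be chained (using the OpenAI Python client directly; the prompts and model name here are placeholders, not the project’s actual ones):

from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask(prompt):
    # one helper shared by the three chained calls
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

stakeholder, issue, food = "permaculture local grower", "food accessibility", "lemon"

# 1) compose the issue from the stakeholder's perspective
story = ask(f"As a {stakeholder}, describe the issue of {issue} behind a {food}.")
# 2) turn that story into a visual description to drive the image generation / morphing
visual = ask(f"Write a short visual scene description for an image model, based on: {story}")
# 3) derive a call-to-action proposing a solution to the user
cta = ask(f"Based on this story, propose one short call-to-action for a consumer: {story}")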

Using this approach we managed to obtain convincing calls-to-action such as:

  • Reduce food deserts: Increase access to fresh foods.

  • Participate in food swaps: Share surplus with neighbors.

  • Demand fair trade avocados: Support small farmers and market competition.

  • Preserve food diversity: Choose heirloom watermelon varieties.

  • Choose community gardens: Cultivate accessible produce with neighbors.

As a proof of concept, a site was set up to collect early generations of text + images, but when seeing all the generations at once, patterns emerge: the calls-to-action seem very repetitive. This is because the language model has no memory of what it has said previously, and it often uses verbs such as “choose”, “avoid”, “fight”…

I had to re-think the role of the call-to-action and experimented with explicit links for the viewer to take action.

The aim is to get recommendations of resources, associations, media related to the presented narrative.

This works, but often the presented links are the same, high-level suggestions such as “watch the Food, Inc. documentary” or “join the slow-food movement”.

To add granularity to the generation, I tried including the location in the query, thus creating site-specific recommendations.

In some cases it surfaces very interesting content, such as EU policies on food sovereignty or very specific local associations 👍
This is great because it allows us to go from a moment in reality, straight to a very specific piece of knowledge/data/action.

Results seem good, but sometimes the links are invented: as a language model is predictive, the most probable way a link starts is with http:// and it often ends with .com or the locale variants. Thus, the generated links look OK, but may not be real :(

Examples:

Zagreb, Croatia

Lisbon, Portugal
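A quick way to filter out invented links before showing them to the viewer (a small sketch, independent of the prompt-engineering change described below) is simply to check that each generated URL actually responds:

import re
import requests

def live_links(text, timeout=5):
    """Return only the URLs in `text` that actually respond."""
    urls = re.findall(r"https?://\S+", text)
    alive = []
    for url in urls:
        try:
            r = requests.head(url, timeout=timeout, allow_redirects=True)
            if r.status_code < 400:
                alive.append(url)
        except requests.RequestException:
            pass  # invented or unreachable link -> drop it
    return alive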

To mitigate this, a session with experts from Brno University was conducted and a change in prompt-engineering design will be implemented, where instead of asking for links related to a piece of content, we extract content from links. The approach involves RAG (Retrieval-Augmented Generation) and gives the LLM eyes on the internet.

multiple perspectives

As the whole project is about showcasing the different realities of the food system that exist behind every piece of food, we explore how to narrate them from different perspectives.

Initially we thought of having interviews or work sessions with different stakeholders, but later we decided to explore the existing knowledge within a language model to extract the averaged points of view.

Initial perspectives are:

  • As a wealthy consumer, my purchasing power can influence the food system towards ethical practices by boycotting brands that consolidate power unfairly, mistreat workers, or contribute to unequal food distribution. I support local and cultural foods through patronage and invest in initiatives promoting sustainable economic models in the food industry.

  • As a supermarket chain manager, I prioritize ethical sourcing and fair labor practices to address power consolidation and workers' rights. We're enhancing food distribution to improve accessibility, supporting local economies to reduce reliance, and promoting local foods to preserve cultural variations. Our sector's efforts include partnerships with small producers and community initiatives.

  • As a Permaculture local grower, I champion decentralized food production, enhancing workers' rights through community-based projects and equitable labor practices. We improve food accessibility by fostering direct-to-consumer distribution channels, reducing economic dependence on industrial agriculture. Our approach preserves local food varieties, countering the homogenization driven by power consolidation within the industry.

  • As a CEA industrial grower, we recognize the complexities surrounding power consolidation and strive for equitable industry participation. We prioritize workers' rights, ensuring safe, fair conditions. Our technology improves food distribution/accessibility, reducing economic dependencies on traditional agriculture while preserving local/cultural foods through diverse crop production. We're committed to sustainable practices and solving systemic issues collaboratively.

The generated perspectives could be contrasted/validated by stakeholders.


Digital prototype - dev

A clear workflow of tools and data has been drafted; the prototype will be composed of two clear components:

  • The reality filter: a mobile-first interface to take a photo in context and experience a narrative of the food system through a morphing video

  • The perspective map: a navigational interface for sorting and grouping large amounts of media according to similarities

To experiment and develop the reality filter, I tinker with ComfyUI, which is a visual-coding frontend framework to integrate multiple text-to-image workflows, mainly with Stable Diffusion models.

The good side of this is that with this tool I can chain several image generations with conditional guidance processes to create the necessary scenes to compose the video. And I can also pipe some interpolation processes at the end.

For this workflow, ControlNet is key to carry visual similarities from frame to frame. Initially depth estimation was proposed, but in tests the conditioning proved too strong or too weak depending on the case, thus experimentation was needed to find another approach.

Using the TILE model gives great results: as a model normally used for super-resolution, it carries the visual qualities of the source image into the conditioning.

ControlNET -

Adding Conditional Control to Text-to-Image Diffusion Models

Scenes

Creating new images from starting ones has the following challenge: if the gap is too big, the visual connection is lost, but if the gap is small, we have very little pixel-space to tell a story. So if I want to bring the viewer to a new reality, I need to do it smoothly, and I can achieve this by incrementally decreasing the strength of the initial image in each generation step.

With 4 steps I can already go quite far.
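The actual pipeline lives inside ComfyUI, but as an illustration only, the same idea expressed with the diffusers library looks roughly like this: each scene is generated from the previous one, with the grip of the source image progressively loosened.

import torch
from PIL import Image
from diffusers import StableDiffusionImg2ImgPipeline

pipe = StableDiffusionImg2ImgPipeline.from_pretrained(
    "runwayml/stable-diffusion-v1-5", torch_dtype=torch.float16
).to("cuda")

image = Image.open("photo_of_lemon.jpg").convert("RGB")  # placeholder: the user's real-world photo
prompt = "industrial lemon plantation, seasonal workers, photojournalism style"  # placeholder scene text

scenes = [image]
# 4 steps: the denoising strength increases, so the influence of the initial image decreases each time
for strength in (0.35, 0.5, 0.65, 0.8):
    image = pipe(prompt=prompt, image=image, strength=strength).images[0]
    scenes.append(image)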

Scene interpolation / video / morph

To craft a smooth visual narrative from the real-world photo to the generated food-system scene, FILM (Frame Interpolation for Large Motion) will be used, even if it is quite computationally expensive. There are other approaches, but I feel they break the magic and the mesmerizing effect of seeing something slowly morphing into something else without almost noticing it.

24 in-between frames seem a good number to create a smooth transition from scene to scene.
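Once the in-betweens exist, assembling them into a clip is the easy part; a small helper along these lines (using imageio, as an assumption; in practice the export can also happen inside the ComfyUI workflow):

import numpy as np
import imageio.v2 as imageio  # .mp4 output needs the imageio-ffmpeg plugin

def frames_to_video(frames, path="morph.mp4", fps=24):
    """frames: list of PIL images or numpy arrays (key scenes plus their 24 in-betweens)."""
    with imageio.get_writer(path, fps=fps) as writer:
        for frame in frames:
            writer.append_data(np.asarray(frame))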

img2morph -> workflow

videos

samples of the resulting morphs

On the wild / testing

A key advancement has been to find a workflow that allows for experimentation and, at the same time, lets us use the setup in the real world.

To do this, we use ComfyUI as an API and build a Gradio app to be able to use it from a web interface.

Once we have this, we are able to pipe that interface outside the server and use it via a public link, allowing for testing on different devices and locations.
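In broad strokes, the glue looks like this (the endpoint names and workflow file are assumptions based on ComfyUI’s standard HTTP API, not the project’s exact code):

import json
import requests
import gradio as gr

COMFY_URL = "http://127.0.0.1:8188"  # the ComfyUI server running on the backend

def run_workflow(photo):
    # load a workflow previously exported from ComfyUI in API format (assumed filename)
    with open("food_morph_workflow_api.json") as f:
        workflow = json.load(f)
    # upload the user photo so the workflow's LoadImage node can reference it
    photo.save("/tmp/input.png")
    with open("/tmp/input.png", "rb") as img:
        requests.post(f"{COMFY_URL}/upload/image", files={"image": img})
    # queue the generation
    r = requests.post(f"{COMFY_URL}/prompt", json={"prompt": workflow})
    return f"queued: {r.json().get('prompt_id')}"

demo = gr.Interface(fn=run_workflow, inputs=gr.Image(type="pil"), outputs="text")
demo.launch(share=True)  # the public link used for testing on phones, in shops, at home…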

next (technical) steps

*The technical development is done in collaboration with Ruben Gres. From now on, when I say “we”, I mean Ruben + Bernat.

With all the bits and pieces working independently, now we need to glue them together into an accessible digital prototype (within time and resource constraints).

some design guidelines are:

  • zero friction

    • app must work without requesting permissions or security warnings

  • mobile first

    • the photo taking has to be intuitive and fast - 1 click away

  • desktop delight

    • the system should allow for “enjoyment” and discoverability

  • we are not alone

    • somehow find a way to give the user the sense that other people went through “this” (the discovery process)


Interface prototype

Sorting

Learning about dimensionality reduction, I feel this can be a nice touch to use as navigation for the project.

the aim is to find a way to sort the content (media + text) by affinity, or according to non-strict-direct variables.

One approach is using t-SNE, T-distributed Stochastic Neighbor Embedding (here a good interactive demo). This allows for sorting/grouping of entities according to their distance (in embedding space) to each other or to a reference (word, image…).

An example using this technique is this GoogleArts project mapping artworks by similarity.

We have been able to plot vegetables according to different axes, using CLIP embeddings.
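A compact sketch of that sorting step, assuming the transformers CLIP implementation and scikit-learn’s t-SNE (the project’s own embedding code may differ):

import numpy as np
from PIL import Image
from sklearn.manifold import TSNE
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

paths = ["gen_0001.png", "gen_0002.png", "gen_0003.png"]  # placeholder filenames
images = [Image.open(p).convert("RGB") for p in paths]

inputs = processor(images=images, return_tensors="pt")
embeddings = model.get_image_features(**inputs).detach().numpy()

# project the high-dimensional CLIP space down to 2D positions for the navigation map
xy = TSNE(n_components=2, perplexity=2).fit_transform(embeddings)
print(np.round(xy, 2))  # one (x, y) position per image on the endless canvas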

grid / navigation / upload / video generation

For the exploration map, an endless canvas with a grid will be used.

getting the right voice / perspective

Testing the whole text-generation pipeline

food -> agent -> issue -> context -> title -> links -> image representation

Relevant links

As the prototype evolves, the ability to provide contextual localized links to events, communities, policies and resources to the viewer becomes more relevant and interesting.

Using the language model, we get beautiful links, but in some cases they are non-existent :(

This is an inherent quality of an AI language model, as these work mostly as predictive systems, where the most probable word is added after another in a context. Thus it is very probable that a link starts as http://www., then includes some theme-related words, and most likely ends as .com or .es, .it... according to the location.

This disrupts the prototype, because action can't be taken by the user, and it is frustrating.

To overcome this, the whole structure of interaction with the language model has been changed, and we gave it eyes on the internet 👁️

Using LangChain and the Tavily API, we can request the LLM to search the internet (if instructed), then analyze the content of the found links and use them as knowledge to build the output.

With this approach, now the language model does the following:

  • picks up a random stakeholder + issue + food (the photo provided by the user).

  • Builds up a story around this that highlights the point of view of the stakeholder in relation to the issue.

  • Then uses that response coupled with the location of the user (the IP) to search the internet for events, associations or policies that answer a "what can I do about this" question.

  • The relevant links are rephrased as actionable tasks: "read this... join that... go there...".

  • Finally a nice engaging title is composed.
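A condensed sketch of steps 3 to 5, calling the tavily-python client and the OpenAI client directly (the prototype goes through LangChain, and the real prompts differ):

import os
from openai import OpenAI
from tavily import TavilyClient

llm = OpenAI()  # assumes OPENAI_API_KEY is set
search = TavilyClient(api_key=os.environ["TAVILY_API_KEY"])

def grounded_actions(story, location):
    # 3) search the internet for real events, associations or policies near the user
    found = search.search(f"what can I do about this near {location}: {story}")
    links = "\n".join(f"- {r['title']}: {r['url']}" for r in found["results"][:5])
    # 4) + 5) rephrase the *found* links as actionable tasks and compose a title
    prompt = (
        "Rephrase these real links as short actionable tasks (read this… join that… go there…), "
        f"then add one engaging title.\nStory: {story}\nLinks:\n{links}"
    )
    r = llm.chat.completions.create(
        model="gpt-4o-mini",  # placeholder model name
        messages=[{"role": "user", "content": prompt}],
    )
    return r.choices[0].message.content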


to add:

A European dimension: localized content
Build for customization: open github repo + documentation + easy deployment (add tech demo video)


This project is developed as part of the Hungry EcoCities S+T+ARTS Residency which has received funding from the European Union’s Horizon Europe research and innovation programme under grant agreement 101069990. 
Bernat Cuni

This Person Will Not Exist

Photos of people that will die due to climate crisis and anthropogenic hazards.


tap to reload -> more persons will not exist

Photos of people that will die/disappear/vanish due to climate crisis and anthropogenic hazards.


An environmental disaster or ecological disaster is defined as a catastrophic event regarding the natural environment that is due to human activity.[2] This point distinguishes environmental disasters from other disturbances such as natural disasters and intentional acts of war such as nuclear bombings.

Environmental disasters show how the impact of humans' alteration of the land has led to widespread and/or long-lasting consequences.[3] These disasters have included deaths of wildlife, humans and plants, or severe disruption of human life or health, possibly requiring migration.[4]

Environmental disasters historically have affected agriculture, biodiversity including wildlife, the economy and human health. The most common causes include pollution that seeps into groundwater or a body of water, emissions into the atmosphere and depletion of natural resources, industrial activity or agricultural practices.[5]

This is an alphabetical list of environmental issues, harmful aspects of human activity on the biophysical environment. They are loosely divided into causes, effects and mitigation, noting that effects are interconnected and can cause new effects.

Anthropogenic hazards are hazards caused by human action or inaction. They are contrasted with natural hazards. Anthropogenic hazards may adversely affect humans, other organisms, biomes, and ecosystems. They can even cause an omnicide. The frequency and severity of hazards are key elements in some risk analysis methodologies. Hazards may also be described in relation to the impact that they have. A hazard only exists if there is a pathway to exposure. As an example, the center of the earth consists of molten material at very high temperatures which would be a severe hazard if contact was made with the core. However, there is no feasible way of making contact with the core, therefore the center of the earth currently poses no hazard.

Anthropogenic hazards can be grouped into societal hazards (criminality, civil disorder, terrorism, war, industrial hazards, engineering hazards, power outage, fire), hazards caused by transportation and environmental hazards.


Not that we need more images of climate collapse and environmental disasters, but since AI is usually used to imagine nice optimistic scenarios, I used it here to highlight the peak stupidity moment we are in.

Inspired by the ThisPersonDoesNotExist site (that served as a StyleGAN demo and triggered an awakening for synthetic media in the general public).

The images have been produced with Stable Diffusion, using combinations of the following descriptive prompt:

"A person in a climate disaster, with (wildfires, flood, drought, pollution, oil, microplastics, waste, misery, technical collapse) / photojournalism photo style”

oh, yes, I used technology, electricity and plastic to produce those images, well spotted.

  • solution? none

  • remedies? many: stop consuming / stop complaining / start fighting / go forward with nature

Bernat Cuni

DIVIDE

Exposing techwashing to reclaim a social narrative for digital rights and empowerment.

AI + ART for ACTIVISM - digital barriers / EU Project / AI4Future


EXPOSING TECHWASHING TO RECLAIM A SOCIAL NARRATIVE FOR DIGITAL RIGHTS AND EMPOWERMENT

Artwork

The piece explores the multiple narratives around Access to Internet and Digital Divide.

To do so, images from the affected districts are run against descriptive scenarios in a machine-learning text-to-image algorithm.

The artwork is a set of lenticular prints that reveal different narratives depending on the angle they are viewed from, thus showcasing how different people can experience or understand the multiple realities if explored from diverse points of view.

Statement

Access to the internet is not equally guaranteed to all citizens, whether due to lack of commercial interest by ISPs, institutional and legal boundaries, or economic resources. Either way, certain parts of the population see their Right to Internet Access limited and abused.

The activist entity eXO is running a pilot program to provide free internet access to affected households in some districts of Barcelona. Their action goes beyond providing internet, and focuses on digital inclusion, stressing that citizens' rights need to be preserved online.

The digital divide appears to be very different depending on who is talking about it.

On one hand, the official narrative, the story told by the administration and tech companies, is that everything is fine: most people have access to the internet and sales of smartphones keep rising, so there's nothing to worry about; we are ready for digital education, digital health, digital banking, digital administration and digital everything...

On the other hand, the social movements indicate that a unidirectional shift towards the digitalisation of citizenship leaves many people behind: from families with one device and several kids needing to do a video-call, to citizens who are forced to manage their finances via unusable and data-extractive apps, or households where the internet is shared from a single limited mobile data plan.

Digital Divide

🙈👁🤖

Right to Internet Access

👨‍💻🔒📕

Physical Digital

📱🎒🔌


Glossary:

  • The Right to Internet Access, also known as the right to broadband or freedom to connect, is the view that all people must be able to access the Internet in order to exercise and enjoy their rights to freedom of expression and opinion and other fundamental human rights, that states have a responsibility to ensure that Internet access is broadly available, and that states may not unreasonably restrict an individual's access to the Internet.

  • The Digital Divide is a gap between those who have access to digital technology and those who do not.[1] These technologies include, but are not limited to, smart phones, computers, and the internet. In the Information Age in which information and communication technologies (ICTs) have eclipsed manufacturing technologies as the basis for world economies and social connectivity, people without access to the Internet and other ICTs are at a socio-economic disadvantage because they are unable or less able to find and apply for jobs, shop and sell online, participate democratically, or research and learn.[2]


Work in Progress / notes 
(chronological order -> new at the end)

background

AI4Future is a Creative Europe project which involves three different European countries: Italy, Spain and the Netherlands. It aims at enhancing the understanding and dissemination of AI-related technologies for the active and creative participation of young activists in the European cultural scene, allowing them to work with artists for a joint creation of a new urban community awareness.

barcelona chapter / partners

  • Canodrom: Open technologies, participatory democracy and digital culture

  • ESPRONCEDA Institute of Art & Culture: international platform and multi-disciplinary environment for artists, curators


dev notes 

About eXO’s project: Open Networks for Digital Inclusion in Neighbourhoods / presented by Efraín Foglia

2022/2/3 - hello project / Discussion with @canodrom and @espronceda to know their ongoing projects and draft collaboration framework

  • We learn about the Xarxa Oberta de Barris (open-networks-in-neighbourhoods), a project by Expansió Xarxa Oberta / eXO, aimed at facilitating the right to internet access by bringing quality internet to families in la Sagrera / Congrés-Indians in Barcelona who have vulnerabilities in their internet access.

  • Within the project then, Canodrom becomes the facilitator, eXO the activists and the families the citizens.

  • the citizens / the project is looking for people with vulnerable internet access

drafting ideas / following the initial proposal, the aim remains to visualize and explore.

  • eXO works with hardware, with physical stuff to enable internet access. Since the internet is usually visualised as ethereal clouds, it is interesting to go back to the tangible world and work with the stuff that makes this possible.

  • As we will be working with families with vulnerable digital access, I’m curious to understand the expectations: what are the words and concepts associated with the digital world and the internet. I could work with the families and the activists to


v0.1 “Possible landscapes made of cables”

Artwork: A Panorama image made with "internet hardware", blending with AI-generated images from the community's description of "internet".

technically: Using ALIS as a generative GAN fed with images of cables, routers and devices. And using a text-to-image algorithm to visualize internet-expectations from the community.

proof of concept / v0.1

Testing models and datasets

Proof of Concept - 000 / collecting random cables and boxes

cropping by region of (algo)-interest

2022/03/01 defining collabs: discussing the approach with Espronceda & Canodrom

  • avoid the parachute effect, where a project lands in a context, extracts and does not return

  • in this project, the artist is a resource for the activists to use AI as a channel to approach a challenge

  • the goal of the AI4Future project is to see how AI can empower activism.

  • The focus should be on the process, more than on the piece.

2022/03/02 preparing an upcoming event at Rotterdam V2 with the other artists in residence - Nino Basilashvili / Chunju Yu / Luca Pozzi

proof of concept 02 / internet as a visual representation

digesting the visual qualities of popular image representations of “internet”.


2022/03/04 session with the local activists / understanding their vision of the issue, the concerns and the potential

  • This project is key as a legal framework, because in Barcelona it is not allowed for citizens/associations to provide internet to other citizens.

  • There’s a dissonance between the official narrative and what’s happening “on-the-ground”

    • Digital divide is politically understood as infrastructure

    • There is a layer of WASHING where “technology” is equated to “accessibility”, thus painting a picture of a tech-enabled city / (TechWashing/SocialWashing)? / with possible redlining

    • Access to Internet is not guaranteed, as it is left to corporations and their market-driven-decision-making, and pushed to the individuals requiring paid connections and updated devices.

    • Beyond “going-online”, citizens must have their rights granted while online

    • The majority of services are switching to online-only mode, leaving behind those who cannot follow (skills, disabilities, legalities): banking, education,

  • There is an opportunity to re-write the narrative, the concepts, the words.

  • There’s a gender gap within the digital rights activism

  • Mapping the actual project could also visualize what’s the local reality

    • i.e: there’s a high concentration of internet-cafés in that specific area (laSagrera / Congrés-Indians), probably a sign of poor home connectivity / lack of devices or help.

  • There’s a practical challenge to find candidates for social projects, as their detailed realities are unknown to the administration or activists (ironically, that information is well known by BigTech/GAFAM)

  • Whatever we make, should not just TALK about the issue, but be INVOLVED with the subject and add up instead of taking.

Other considerations:


v0.2 /

Digging into the idea of Double-Narrative, I’m cooking a process/piece where we’ll define what those narratives are and how they shape our perceived reality.

Why:

Expose tech-washing and reclaim a narrative of digital rights and empowerment.

  • Work with the activists to define what’s wrong about the official narrative and what’s overlooked

How:

Side by side comparison of the same location, visually shaped according to two opposite narratives.

Use text-to-Image algorithms to visualize the transformation of spaces according to specific descriptions of reality.

  • Use Barcelona’s open image datasets

  • Use maps built with BarcelonaOpenData datasets

proof of concept / Plaça de Masadas - Sagrera (initial image by Vicente Zambrano, for Barcelona gov)
left: “a smart city inside the metaverse, cyberspace style”
right: ”piles of e-waste and old electronic devices”


What:

A multi-viewer, non-linear artwork that can be placed in multiple settings.

  • Lenticular prints that allow for a low-tech experience while giving a sense of transformation and exposing different realities depending on the viewing angle.

  • Bringing the digitally AI-created images into the physical realm


Right to Internet

A (growing) list of double-narratives built from conversations with the activists and literature review

The media
  • Catalonia is a leading “Digital Country” in EU
  • Telecommunications is a citizen's right
  • Blockchain technologies for digital democracy
  • App-based healthcare system
  • Barcelona as a technologically advanced city
  • Innovative digital education platforms
  • BigTech trade fairs
  • Funding of new technologies like the metaverse and digital currencies
  • Barcelona is a fully connected city
  • In 1997 the public company Telefonica is privatised and inherits the existing physical infrastructure, and rents it to other ISP's.
    From 2008 any company can build their own optic-fiber infrastructure (companies rush to take profitable markets like big cities (often over-installing) and ignoring other parts of the territory).
  • The Government starts a publicly owned optic-fiber installation (XOC) with regulated prices. (11 years after the announcement, it only reaches 32% of the municipalities, connecting government sites, not people). The network is run by the main ISP and does not allow small operators to connect to or use it at a fair price.
    ...
The people
  • more than 50% of the population in Catalonia does not have access to broadband
  • The infrastructure that allows telecommunications is a resource to speculate with
  • ISP's do not have commercial interest in providing internet to everybody
  • The digital divide specifically affects traditionally excluded populations
  • How can someone have access to digital health if they can't afford a data plan?
  • How can digital education be fair when access to technology isn't?
  • BigTech fairs exclude the social agenda.
  • Digital rights are an extension of human rights in the digital domain
  • Not all neighbourhoods have the same infrastructure 
  • A better and more efficient optic-fibre installation would lower the costs and provide better services by using bandwidth surplus.
  • It is technically viable to provide internet to any part of the territory.
  • Internet is a technology that can be run and managed locally without a speculative approach.
  • Networks as a commons 
    ...

Rotterdam / worksession / mid march

We met with the other artists at V2 - {lab for unstable media} / Rotterdam
to discuss the scope of the projects, review development and share technical approaches.

We put together a dev-show to share directions and get feedback from visitors.

v0.3 /

Technical approach

The project will pivot around CLIP (A neural network called CLIP which efficiently learns visual concepts from natural language supervision. CLIP can be applied to any visual classification benchmark by simply providing the names of the visual categories to be recognized, similar to the “zero-shot” capabilities of GPT-2 and GPT-3. / 400.000.000 images)

CLIP’s capacity for semantic processing will allow the project to process the narratives around Digital Divide.

For image generation, DDPMs (Denoising Diffusion Probabilistic Models) will be used, specifically guided diffusion models like GLIDE.

When combining CLIP + DDPM we are able to create a Guided Diffusion (paper), where a loop happens between images <-> text.
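As a reduced illustration of that image <-> text loop (the full guided-diffusion setup feeds this similarity back into the denoising steps), CLIP can score how strongly an image leans towards each of the two narratives; a sketch with the transformers CLIP implementation:

from PIL import Image
from transformers import CLIPModel, CLIPProcessor

model = CLIPModel.from_pretrained("openai/clip-vit-base-patch32")
processor = CLIPProcessor.from_pretrained("openai/clip-vit-base-patch32")

narratives = [
    "a smart city inside the metaverse, cyberspace style",
    "piles of e-waste and old electronic devices",
]
image = Image.open("placa_masadas.jpg")  # placeholder filename for a source or generated frame

inputs = processor(text=narratives, images=image, return_tensors="pt", padding=True)
probs = model(**inputs).logits_per_image.softmax(dim=1)
print(probs)  # how strongly the frame leans towards each narrative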

Examples of guided diffusion are DALL-E, GLIDE, or RUDALL-E (paper).

With the stated tools and models, a workflow will be set up to input an initial image and a prompt in order to generate an output image as follows:

Being a “words & pixels” approach, a certain focus needs to be put on prompt engineering, where words, descriptions and meanings need to be carefully selected to craft the direction of the output.

below : keywords comparison image grids for the CLIP + VQGAN workflow to visualize the effect of different styles/descriptions applied to the same concepts. (more grids here)


Localisation - The activist’s project will be implemented at El Congrés i els Indians neighbourhood of Barcelona.

In order to collect visual material for the project, a field study was conducted in early April.

A photo-tour capturing localised infrastructure, spaces, housing and elements.

The results of the fieldwork will be the starting point for the AI workflow

dev v03 - images & prompts

Testing some prompts in the workflow in order to explore styles and visual languages.

starting from generic concepts of connectivity, and narrowing towards detailed descriptive scenarios.

crosstests: different prompts and workflow settings


Curating samples:

Some combinations of settings, prompts and source images proved to be interesting, while others lacked detail or effectiveness.

A CONNECTED CITY, WITH INTERNET FOR EVERYBODY WITH EFFICIENT DIGITAL SERVICES

 

A CITY WHERE CITIZENS ARE IN A DISADVANTAGE DUE TO OBSOLETE DEVICES AND POOR ACCESS TO INTERNET

Prototype:

Printed prototypes using lenticular technology will be produced in order to test the visual quality and desired effect.


V04 / understanding activism

A workshop session was run with EXO and Canodrom in order to explore and define the narratives that will drive the final artwork.

raw concepts:
  • Open and neutral free network for everyone
  • Digital culture, free technologies and democratic innovation
  • Citizen governance digital democratic participation
  • Open Knowledge
  • Transparency and open-data
  • Public money for public code
  • Construction of community infrastructures
  • Joint network
  • The Internet as a place for human rights
  • Inclusive network
  • Radical participatory democracy.
  • eGovernment as a digitalization of democratic and participatory processes. (desired)
  • eGovernment as a digitalization of the status quo. (what is happening)
  • The system phagocytoses and naturalizes radical movements.
  • Ideological extractivism
  • Neo-conservative vampirism
  • TechWashing
  • Surveillance capitalism
  • BigTechs are amoral, they only position themselves in the band that brings the most profits.
  • Software sophistication widens the digital divide due to lack of knowledge.
  • BigTechs feed on the dependence of users and institutions on their software.
  • Alternative initiatives cannot match the service and usability standards offered by BigTechs, creating a barrier to the adoption of new ideas and ways of operating on the Internet.
  • There is an intentional complexity to online services, blurring responsibilities and positioning the user as responsible in the event of an error or event unfavorable to the company.
  • When banking is digitized in the name of efficiency and better service 24/7, the company gets more profits and more responsibilities are transferred to its users.
  • The algorithmic complexities of digital services are used as an excuse by companies and administration to offer services without guarantees or transparency.
  • SmartCity has not meant any visible improvement to the city's main problem, which is housing.
  • The administration does not allow citizens to act on the communication infrastructures.
  • Some government projects approach digital rights with a paternalistic approach to helping the needy.
  • There is the opportunity to do community work and train the population for digital autonomy.
  • The government allows and promotes private companies owning the communication infrastructure.
  • The initiatives that are encouraged are only in the field of services, such as Apps and StartUps.
  • The cost of network access: When choosing between having internet or food.
  • A citizen without internet is excluded from public participation, leisure and the relationship with the administration.
  • Internet access is necessary and not optional, but it is not treated as a fundamental right.
  • Historically, the Internet has appeared after large infrastructures have already been privatized, so it is not even considered to be a public good and its implementation and management is quickly awarded to the private company.


The concepts were organised by themes and a set of narratives extracted to drive the AI workflow:

The Alternative Narrative:

Apps and digital services are intentionally opaque to confuse users and avoid responsibilities.

Corporate practices should be auditable and companies held accountable for bad practices

Smart City has not solved the housing problem

The Official Narrative:

New apps and services make life easier and more efficient.

Corporate information and operations need to be kept secret in order to offer good services and apps. 

Smart City brings prosperity and the future to everybody.

New media has been created using the above descriptions


physical artwork

Exploring the lenticular printing process, it is possible to show multiple frames from different angles.


Different resolutions and frame combinations have been tested.

Professional prints have been produced as exhibition pieces.


To make the artwork self-explanatory, a video will be produced to introduce the activist’s role and the tools used to produce the visuals.

Since all the media used is in a square 1:1 ratio, which is a common ratio for machine-learning models, the explainer video will also be square, and displayed on a square display mounted inside a frame.


Exhibition in Barcelona

As an extended program of the International Symposium on Electronic Art, a preview show was organised at Espronceda’s space.

A joint exhibition with LEONARDO organisation, Aalto University and the AI for Future Project, as part of the New European Bauhaus Festival


Technology update

In February 2022, when this project started, the most suitable tech approach for the artwork idea was image generation with CLIP and Diffusion models, a good implementation of this technique is DiscoDiffusion (A frankensteinian amalgamation of notebooks, models and techniques for the generation of AI Art and Animations.) info + video tutorial

All the generated images of this project have been made using variations of that code.

During the past months, many advances occurred in the space of text-to-image generative models; including public access to Dalle2 by OpenAI, the promising Imagen by Google Research, or the popularity of MidJourney.

 

Stable Diffusion is a machine learning model developed by Stability.ai to generate digital images from natural language descriptions. The model can be used for other tasks too, like generating image-to-image translations guided by a text prompt.[1]

It can run on most consumer hardware equipped with a modest GPU and was hailed by PC World as "the next killer app for your PC".[2]

Unlike competing models like DALL-E, Stable Diffusion is open source.[3] Its license prohibits certain harmful use cases.[4][5] Critics have raised concerns about AI ethics, stating that the model can be used to create deepfakes[6] and also questioning the legality of generating images with a model trained on a dataset containing copyrighted content without the consent of the original artists.[7]

 

This technological update is fantastic, but it comes a bit too late for this particular project, so I stick with the original tech setup to finalise the artwork.

Here is a test of some of the "narratives" from this project run through the new model:


v05 / a large scale piece

With the fantastic results obtained with the pieces produced for the Barcelona exhibition, it is proposed to develop a new large-format piece.

A new round of photos was made onsite, with special attention to the quality of the center image, looking for a photo that reflects the characteristics of the territory while capturing certain aesthetic qualities.

More photography tests were done using a fisheye lens, hoping for an immersive and captivating point of view.

The prompts/narratives were also fine-tuned to extract/distill visual qualities.

The new sets of photos were run again through the same AI system, producing novel and mesmerising results.

survivors of the curation process:

The large 100x100cm piece will be produced with the following images:



Exhibition at Canodrom / Oct-Dec 2022

In order to bring the piece closer to the community, a 2 month public exhibition was organized at Canodrom, including an open guided tour during a community meeting of the families that participated in the XOB initiative.


sidenotes 
The role of AI 
Datasets 
references / projects / artworks

Artwork created thanks to the production grant by ESPRONCEDA-Institute of Art & Culture, within the European project AI4Future
Bernat Cuni

VOLS

Interventions in the public space,
without permission, without atoms.

💬 work in progress - development


exploring technologies, tools and approaches to experience digital content in the physical world.

Background / Motivation

The Weight of Stuff

While working with 3D, I’ve always been seduced by the idea of bringing the content out of the screen, hence my sizeable experience and affair with 3D printing and digital fabrication… But I don’t feel a pressing need for turning bits into atoms; I can’t find a reason to go through all the resource consumption and physical nuisances that such an endeavour entails. Physicality is overrated.

I come from a product design (industrial) background, where stuff is designed to be made, and I feel I’ve developed a sensitivity towards stuff, matter, geometric volumes, materials and finishes. With such an eye for the built material environment, I do appreciate and even find joy in observing stuff, admiring the shape, materiality and mechanical solutions used to solve physical challenges. I’ve devoured countless pages of “design magazines” (I even ran a “web-ring” devoted to product design back in 2006)… I like objects, I like stuff that is expressive, products that go beyond the average intentional mediocrity of mainstream mass-produced instant-junk. I do admire and appreciate the classics like Eames furniture, the contemporaries like Marc Newson, the works of the Bouroullec brothers, or the lamps of Patricia Urquiola.


I know their designs from images, videos and photos… but I have NEVER seen IRL, used, touched or owned any of their designed products…, yet I like them, very much.

  • What am I in relation with?… with the object? or with the idea of the object?

  • If stuff does not need to physically exist to be enjoyed, can I just experience the idea of a creation?…

  • so… should I make stuff? or just design it?

This project is an attempt to do that:

Design without making; enabling the experience of volumes, without having to materialise stuff.


Technical approach: Augmented Reality

Augmented reality (often known as AR) is a computer technology that shows digital images on top of the current real world. For example it can show a digital table on top of a floor. It is different from virtual reality which makes a person feel they are somewhere else.

Vectors of interest

The Scale

I’m interested in the perception of scale allowed by augmented reality.

The built environment operates at a human scale, where the larger scales belong to infrastructures (bridges, dams,..) and architecture (buildings, public spaces…) while objects are at a smaller scale (from paperclips to refrigerators). Placing something of a significant scale into the world is ecologically expensive.

With AR, it is possible to overlay volumes at any given scale.

The Public Space

As a place where interventions are forbidden or reserved for a selected few. Stuck in between the no-places and the privatisation of the public space, individuals have very few channels for modifying the public space.

The Private Experience

Within the intimacy of our devices, we exist in public (metro, street, mall…) while having a very personal, unique and secretive activity: what is happening on our screens.

This allows for experiencing something very personal, and unique to the individual (location, point of view…) while interacting in a public space with other entities


AR as a channel / tests


🧑‍💻 development notes
<modelviewer>:
  • new version auto-generates USDZ files for iOS
  • auto-generated USDZ files do not trigger iOS quick-look in Chrome :(
3D model preparation in blender:
  • USDZ export add-on does not generate proper files for iOS
  • to use transparency textures in a .glb, the material needs to be set to alpha-blend. 
  • bake the material if it contains generative textures, and build it again before exporting.
    • only in Cycles / w CUDA (OPTIX not supported)
  • Animations generated by keyframing modifiers (i.e displacement) won’t export to .glb
    • a) convert selected keyframes to shapekeys and interpolate
    • b) export model as .mdd and import it again = vertex positions will be turned into shapekeys, 1xframe! = model size * n frames 
    • [ ! ] USDZ does not read shapekey animations :( 

GLTF / USDZ specifications

  • animated materials won’t show

 

potential workshop

Augmented Photography

Workshop to design and add digital creations to the physical environment as triggers for experimental photography.

Using Creative 3D modelling and Augmented Reality for street photography.

In this workshop we will create 3D volumes, export them as valid .gltf files viewable on the web, and prepare them for being launched in Augmented Reality via smartphone.

We will then go out with our freshly created AR models and run an augmented photo-session in the public space, capturing the best angles and shaping the environment at our wish.


participants

ideal for creatives who want to experiment with building basic AR experiences in the public space.

Photographers / Architects / Designers / Creatives

No prior knowledge of 3D is required

motivation

This workshop is a sharing session of the current work being developed by the artist, where he is creating Augmented Reality installations in the public space as a tool to modify the environment without asking for permission and without using atoms.

learnings

  • build basic 3D models

  • make 3D models accessible via web

  • enable augmented reality on the phone

  • take photos / videos in AR

tools

  • blender for 3D modeling

  • modelviewer for publishing on the web and AR

  • Android’s Scene-Viewer to view AR

  • iOS Quick-look to view AR

requirements

BYOD - bring your own device:

  • laptop with Blender installed

  • cellphone / Android or iOS


Workshop Outputs

Photographs of digital 3D models taken in the physical public space.


Bernat Cuni

Landscapes

Populating a publication of post-human thoughts with bacterial sceneries.


A collaboration with Estudi Bicoté for @CARN_de_CAP

CARN DE CAP is a yearly literary publication by escola bloom

For this issue #3, they asked 12 authors to think creatively about the imaginary and speculative landscape of posthumanism.

Contents:

To package such content, they invited Julia Francino and Maialen Arana from Estudi Bicoté to design the publication itself, (as they previously did for book #2 and book #1.)


Together with Bicoté we explored ways to involve some kind of AI in order to tell a story of post-humanism…
Some initial ideas included creating a unique font for the magazine, or machine-learning the texts to create a new piece….
Soon we put together the concept of bacteria, as a living organism that precedes humans and will outlive them.

Test 01: AI-generated bacteria.

We gathered some hundreds of images of bacteria (and small things), and created a dataset of 1,500 by augmenting it via cropping, rotating and mirroring the original images.
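The augmentation was done roughly along these lines (a sketch with Pillow; the exact crops and counts are assumptions):

from pathlib import Path
from PIL import Image

src = Path("bacteria_raw")      # the few hundred collected images
dst = Path("bacteria_dataset")  # grows towards ~1500 images
dst.mkdir(exist_ok=True)

for i, path in enumerate(sorted(src.glob("*.jpg"))):
    img = Image.open(path).convert("RGB")
    variants = [
        img,
        img.rotate(90, expand=True),
        img.transpose(Image.Transpose.FLIP_LEFT_RIGHT),
        img.crop((0, 0, img.width // 2, img.height // 2)).resize(img.size),
    ]
    for j, v in enumerate(variants):
        v.save(dst / f"bacteria_{i:04d}_{j}.jpg")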


I used Stylegan2 to train a model capable of generating novel images resembling the originals.
The results were beautiful, and the generated videos, mesmerising.

But there was a problem: the images looked too much like real bacteria; without an expert eye, the viewer might be led to believe that those were actual images of bacteria.

It was challenging to tell the story of AI-generated bacteria with those images, because we’d have to spoon-feed the viewer into the concept, so that path was discarded.

Instead, we speculated with the concept of landscapes, as territories where possibilities and stories happen, and as permanent and permanently changing entities that existed before humans and will be present in a posthuman scenario.

Test 02: Breed bacterias in the wild

Exploring the concept of transfer-learning, (where a trained model is used to train a new one), I took an existing model of realistic landscape photographs and used it as the starting point for training a new model of bacteria.

During the early iterations of transfer-learning, the model takes features from its origins (landscapes) and tries to generate images resembling the new dataset (bacteria); the results are interesting in-betweens that look neither like landscapes nor like bacteria, but have visual qualities of both.


I tried both ways:
landscape -> bacteria: somehow landscapes predominated and created imagery with the shape of bacterial growths that resembled satellite images.
bacteria -> landscape: it kept the textures and colors of the bacteria, but with the shape of mountains and valleys.
Best results were obtained learning landscapes starting from the bacterial model.

Bacterial model trained with landscape dataset

Landscape model trained with bacteria dataset

This triggered the team and we ran a curatorial session to select the candidates from a pool of 2000 generated images.

Results:

With this material, Bicoté decided to spread it through the publication in a circle-shaped fashion resembling petri dishes.

Bernat Cuni

Sketch to Color - AR/ML - Lens

Drawing Colorisation Lens / Augmented Reality & Machine Learning / creator residency @Snap

A camera lens for Snapchat aimed to point at a drawing and use machine learning to colorise it.

detailed video ->
the same, 90% longer

Project developed as part of the AR Creator Residency Program at SnapChat - a global initiative designed for artists and developers to explore new ways to bring their work to life with augmented reality.

Using the Lens Studio software and the Machine Learning component.

Behind the Scenes

initial thought: develop a lens to colorise drawings.
(without having any idea how to do it 😅)

Previously, I've seen some web-based applications that use machine learning to do that. So, initially, it shouldn’t be hard to make it run on the camera, as a lens.

AI-Powered Automatic Colorization / https://petalica-paint.pixiv.dev/index_en.html

Initial research pointed me to the Styles2Color repository which does exactly that in great detail. Unfortunately I was not able to make it run, and I had to look for an alternative.

I chose Pix2Pix because it is well documented and seemed like a good solution for the task. I set it up on my local machine with this repository by Jun-Yan Zhu and started testing.

First I had to find a suitable dataset of image pairs, linework -> color, so Pix2Pix could learn how colorized images should look based on a source line drawing.
I found it at Kaggle: Anime Sketch Colorization Pair is a 14k-image dataset of anime drawings (quite NSFW).

Trained Pix2Pix up to 200 epochs, and results were convincing.

But I hit a roadblock when seeing that the generated model file was over 200 MB, which is an impossible size for fitting inside a mobile lens :(
Luckily, with the support of the Snap team, I was pointed to the work of Char Stiles, who previously did a residency, developed a “Statue Lens” using Pix2Pix as well, and managed to shrink the model size to 1.1 MB.

Using her code I trained the model again, and results were surprisingly good.

While the model was working nicely with line drawings, it produced weird outputs when pointed at non-drawings (obviously).
So, using Edge Detection from Lens Studio’s Post-Effects library, I could try to stylise anything into a line drawing.

Adding the model to LensStudio is quite straightforward if starting from a SnapML template.
I started with a StyleTransfer lens, and imported the .onnx file with its corresponding Scale and Bias settings.
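For reference, the export from a trained PyTorch generator to the .onnx file boils down to a torch.onnx.export call; a minimal sketch (with a tiny stand-in network instead of the real pix2pix generator):

import torch
import torch.nn as nn

# Stand-in for the trained pix2pix generator; in practice, load the real netG checkpoint here.
generator = nn.Sequential(nn.Conv2d(3, 3, kernel_size=3, padding=1), nn.Tanh())
generator.eval()

dummy = torch.randn(1, 3, 256, 256)  # one RGB line drawing, sized like the training pairs

torch.onnx.export(
    generator, dummy, "sketch2color.onnx",
    input_names=["input"], output_names=["output"], opset_version=11,
)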


For the user interface I opted for something clean and monochromatic to mimic the comic/manga look & feel.
I designed a square canvas to accommodate all kinds of screen sizes and placed the 1:1 ratio camera in the center.

layout composed with indistinguishable comic fragments

Since the model behaves very differently if the source image is a drawing or a real-world image styled as a fake edge drawing, I had to offer the user the chance to change the effect applied to the camera depending on the usage. To do so, I placed two previews (grayscale/edges) for the user to choose whichever feels better.

User Interaction / Interface layout


Before continuing, another round of tests was done to try to improve the model’s behaviour.

It appeared that the camera noise was making the model lose the sharpness and color richness I was getting when running it locally.

To try to solve that, I augmented the dataset x4 by adding noise, blur and grayscale versions of the original image-pairs.
Results did not improve dramatically, so the original model was used to build the final lens.

Final touches affected the layout of the user interface because once the lens was published the top part of the screen became unresponsive to touch, rendering the grayscale/edges previews unusable. :(

A workaround was to move the affected elements a bit down, covering the main element, a design decision that makes the whole interface more crowded but keeps it usable.

The UI elements disappear once the user is happy with the results and taps the snap button, resulting in a clean comic-like page with the colorized drawing centered in the middle of the screen.


Download lens

Sketch to Color / @cunicode

credits & thanks:

model: CycleGAN and pix2pix in PyTorch
code tweaked for SnapML from Char Stiles
dataset: Anime Sketch Colorization Pair / kaggle
support from snap team: Olha Rykhliuk - UX / Aleksei Podkin - ml


Tests & FAILS

Bernat Cuni

C is for CAT 🐈

Exploring machine's ABC's / image datasets



Digging into tools for machine learning (ML), I found that there are some publicly available datasets that are commonly used for training ML tools that surround us.

This is fascinating and scary.
Those datasets contain weird stuff... and those are the building blocks of the tech that takes decisions for and on us.

One of those popular datasets is COCO (common objects in context), built using people’s Flickr photos, with their legal consent but probably without their knowledge.

Out of curiosity, I started digging into the 42.7 GB of images from the COCO dataset, where you can find 330K images containing over 1.8 million objects within 80 object categories and 91 stuff categories. 😱
-> here some wonders found inside COCO-VAL2017 dataset : 🤦‍♂️

*I personally do not find myself identified with the look & feel of the COCO dataset... my food does not look like that, my living room does not look like that, my hometown does not look like that... and so on...

👉 explore the COCO dataset here
COCO Categories: person / bicycle / car / motorcycle / airplane / bus / train / truck / boat / traffic light / fire hydrant / street sign / stop sign / parking meter / bench / bird / cat / dog / horse / sheep / cow / elephant / bear / zebra / giraffe / hat / backpack / umbrella / shoe / eye glasses / handbag / tie / suitcase / frisbee / skis / snowboard / sports ball / kite / baseball bat / baseball glove / skateboard / surfboard / tennis racket / bottle / plate / wine glass / cup / fork / knife / spoon / bowl / banana / apple / sandwich / orange / broccoli / carrot / hot dog / pizza / donut / cake / chair / couch / potted plant / bed / mirror / dining table / window / desk / toilet / door / tv / laptop / mouse / remote / keyboard / cell phone / microwave / oven / toaster / sink / refrigerator / blender / book / clock / vase / scissors / teddy bear / hair drier / toothbrush / hair brush

So, intrigued by how the stuff inside COCO looks, I ran a segmentation model (Mask R-CNN) through it to extract thousands of items of each classification:
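Roughly, the extraction looks like this with torchvision’s pretrained Mask R-CNN (a sketch; my original run used a different Mask R-CNN implementation and settings):

import torch
from PIL import Image
from torchvision.models.detection import maskrcnn_resnet50_fpn
from torchvision.transforms.functional import to_tensor

model = maskrcnn_resnet50_fpn(weights="DEFAULT").eval()  # pretrained on the COCO categories

img = Image.open("coco_val2017_example.jpg").convert("RGB")  # placeholder filename
with torch.no_grad():
    out = model([to_tensor(img)])[0]

# keep confident detections and cut each one out of the photo
for box, label, score in zip(out["boxes"], out["labels"], out["scores"]):
    if score > 0.8:
        x0, y0, x1, y1 = box.int().tolist()
        img.crop((x0, y0, x1, y1)).save(f"class_{int(label)}_{x0}_{y0}.png")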


The images found are weirdly beautiful...

A is for Airplane ✈️

B is for Banana 🍌

C is for Cat 🐈

D is for Donut 🍩

...

and of course, P is for Pizza 🍕


⚠️ People make datasets -> Machines learn from datasets -> machines take decisions based on their learnings -> decisions affect people
  • Build your datasets,

  • defend diversity in datasets,

  • fight for non-biased datasets,

  • there's always a human to be made accountable for machine errors.

Remember when there’s an issue at the bank/store/insurance and some staff tells you “oh, there is a computer error, I can’t do anything about it”… well, that’s a cheap excuse; computer errors rarely exist, those are mostly human errors made by the human that programmed the thing, the human that installed it or the human that operates it.

and.. bad news, because things are about to get worse with Machine Learning and Artificial Intelligence: as more decisions are delegated to seemingly inscrutable algorithms, you or your actions can be easily mislabelled, and then it might be very hard to prove the machine wrong, because we know how to program AI, but we don’t really know how/why decisions are taken within the AI. 🤷


Bernat Cuni

CAVE

Non-existing cave paintings created with StyleGAN from 2k real images mostly from Chauvet and Altamira

Dreams of Forgotten Caves


A digital art project that bridges the ancient past and the present through the power of artificial intelligence. The gallery showcases non-existing cave paintings, each one a unique creation of StyleGAN, a generative adversarial network trained on a dataset of 2,000 real images from the renowned Chauvet and Altamira caves.

These digital paintings are more than just images; they are a testament to the enduring human spirit of creativity. They serve as a reminder that the people who painted the original cave walls, biologically identical to us, used art as a medium to communicate, express, and understand their world.

This project highlights the continuity of human creativity, emphasizing that the act of creation is not just a part of our past, but an essential part of our present and future. It is an ancestral part of humanity that continues to thrive in this digital age.

Dreams of forgotten caves connects with the timeless human desire to create and communicate, and see how the act of creation is a fundamental part of our shared human heritage.


Dream compositions prints below / digital collectibles here


References:

Cave of Forgotten Dreams is a 2010 3D documentary film by Werner Herzog about the Chauvet Cave in southern France, which contains some of the oldest human-painted images yet discovered. Some of them were crafted around 32,000 years ago.[4][5] The film premiered at the 2010 Toronto International Film Festival[6] and consists of images from inside the cave as well as of interviews with various scientists and historians.[7] The film also includes footage of the nearby Pont d'Arc natural bridge.[1]

Herzog's interest in the Chauvet cave was prompted by Judith Thurman's New Yorker article "First Impressions".[8] Thurman is listed as one of the co-producers of the film.


Read More
Bernat Cuni Bernat Cuni

lenscapes

experiments with immersive environments / 360 3D

Goal: publish immersive light-playing installations for users to experience in-context

Step 1) Design lens-like objects in 3D.

Using Grasshopper's paneling tools and Rhino 3D's raytraced viewport visualisation, I managed to create some effects to explore their refractions.

Approach 1) render a 360 3D video and use youtube’s VR player to deliver immersive experience.

Approach 2) using Apple ARKit / Reflections are beautiful but glass material does not have refraction

Approach 3) using Snap’s LensStudio / managed to get nice screen-space reflections, but not refraction

Approach 4) using facebook/instagram Spark AR Studio / couldn’t make glass material with refraction

Approach 5) using Sketchfab’s AR visualisation / glass has nice and beautiful refractions but only in VR, not AR (not able to see the distorted reality through fake glass)

Temporary solution & compromise) place the refractive installation within a custom, controlled 3D environment to curate the viewer's experience.


References

Artworks that inspired this experiment:

Read More
Bernat Cuni Bernat Cuni

Deep Textures

Using Machine Learning to create unexpected materials with textures and normal data.


A [remote] artist residency at Runway, a company that develops tools to make machine learning more accessible for creators.

Results:

  • mouse wheel/pinch= zoom
  • click/tap + drag / arrow keys = rotate
  • *doesn't seem to work with Safari desktop
  • Here is a curated selection of especially beautiful textures and normal maps

    Play / Models released at Runway


    Motivation:

Having extensively used CAD & 3D software packages as part of my product design practice, I personally find the task of creating new materials tedious and predictable, without a surprise factor. On the other hand, playing with GANs and image-based ML tools has a discovery factor that keeps me engaged, rewarding me with nice findings while exploring the unknown.

    So, how could we create new 3D materials using GANs?

    Proof of concept:

In order to test the look & feel of ML-generated 3D materials, I ran a collection of 1,000 Portuguese Azulejo tile images through StyleGAN to generate visually similar tiles. The results are surprisingly nice, so this seems to be a good direction.

    Next step was to add depth to those generated images.

3D packages can read black-and-white data and render it as a height transformation (bump map). If this is done with the same image used as the texture, the results are OK but not accurate, as different colours with similar brightness will show at the same height :(.

So, I had to find a translation tool to identify/segment which parts of the tiles are embossed or extruded, to add some height detail.

To add depth to those tiles, I ran some of them through the DenseDepth model and got some interpretations of depth as grayscale images. They are not accurate, probably because the model is trained on spatial scenes (indoor and outdoor), but the results were beautiful and interesting.


    Approach

    The initial idea is to source a dataset composed of texture images (diffuse) and their corresponding normal map (a technique used for faking the lighting of bumps and dents. Normal maps are commonly stored as regular RGB images where the RGB components correspond to the X, Y, and Z coordinates, respectively, of the surface normal).

And then run those images through StyleGAN.

It would be ideal to have two StyleGANs talking to each other while learning, so they'd know which normal map relates to which texture... but doing that is beyond my current knowledge.

A workaround could be to stitch/merge both the diffuse and normal textures into a single image and run it through StyleGAN, with the hope that it would understand that each image has two different, corresponding halves, and thus generate images with both halves... which can later be split to get the generated diffuse and its corresponding generated normal.
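A minimal sketch of that stitching step, assuming each diffuse texture and its normal map share a filename in two sibling folders (the folder layout and names are assumptions):

# Merge each diffuse texture and its normal map side by side into one image,
# so the GAN sees the pair as a single training sample.
from PIL import Image
from pathlib import Path

size = 512  # each half is resized to 512x512, giving a 1024x512 pair
Path("pairs").mkdir(exist_ok=True)

for diffuse_path in Path("diffuse").glob("*.png"):
    normal_path = Path("normal") / diffuse_path.name   # assumed matching names
    if not normal_path.exists():
        continue
    diffuse = Image.open(diffuse_path).convert("RGB").resize((size, size))
    normal = Image.open(normal_path).convert("RGB").resize((size, size))

    pair = Image.new("RGB", (size * 2, size))
    pair.paste(diffuse, (0, 0))       # left half: diffuse
    pair.paste(normal, (size, 0))     # right half: normal map
    pair.save(Path("pairs") / diffuse_path.name)

The pairs can then be resized or padded to whatever resolution the training run expects.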

But first, I needed to see if the model is good at generating diffuse texture images from a highly diverse dataset containing bricks, fur, wood, foam... (probably yes, as it did with the Azulejos, but it needed to be tested).

    Diffuse Texture generation

The dataset chosen is the Describable Textures Dataset (DTD), a texture database consisting of 5,640 images organized according to a list of 47 terms (categories) inspired by human perception. There are 120 images for each category.

The results are surprisingly consistent and rich, considering that some categories have confusing textures that are not "full-frame" but appear on objects or in context, e.g. "freckles", "hair" or "foam".

    Describable Textures Dataset (DTD)


    Normal Map generation

Finding a good dataset of normal maps has been challenging, since there's an abundance of textures for sale but a scarcity of freely accessible ones. Initially I attempted to scrape some sites containing free textures, but it required too much effort to collect only a few hundred textures.

    A first test was done with the dataset provided by the Single-Image SVBRDF Capture with a Rendering-Aware Deep Network project.
    From this [85GB zipped] dataset I cropped and isolated a subset of 2700 normal map images to feed the StyleGAN training.

    Dataset: Normal Maps

    StyleGAN was able to understand what makes a normal map, and it generates consistent images, with the colors correctly placed.
    I then ran the generated normal maps through blender to see how they drive light within a 3D environment.

    Diffuse + Normal Map single image StyleGAN training.

    Now I knew that StyleGAN was capable of generating nice textures and nice normal maps independently, but I needed the normal maps to correlate to the diffuse information. Since I wasn’t able to make two stylegans talk to each other while training, I opted for composing a dataset containing the diffuse and the normal map within a single image.

This time I chose cc0textures, which has a good repository of PBR textures and provides a CSV file that can be used to download the desired files via wget.
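A minimal sketch of that download step; the CSV column names ("DownloadLink", "AssetID") are assumptions about the file's layout, so check the real headers first, and it assumes wget is available on the system:

# Read the cc0textures CSV and download each listed asset with wget.
import csv
import os
import subprocess

os.makedirs("downloads", exist_ok=True)

with open("cc0textures.csv", newline="") as f:
    for row in csv.DictReader(f):
        url = row["DownloadLink"]                       # assumed column name
        target = f"downloads/{row['AssetID']}.zip"      # assumed column name
        subprocess.run(["wget", "-q", "-O", target, url], check=False)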

    Dataset: 790 images


    The generated images are beautiful and they work as expected.

    StyleGAN is able to understand that both sides of the images are very different and respects that when generating content.

It works! The normal maps look nice and correspond to the diffuse features.

    What if?…

The diffuse images generated with the image pairs are a bit weak and uninteresting, because the source dataset isn't as bright and varied as the DTD. But unfortunately the DTD doesn't have associated normal maps. :(

    I want a rich diffuse texture and a corresponding normal map… 🤔

    Normal map generation from source diffuse texture

I had to find a way to turn any given image into a somewhat working normal map. To do this, I tested Pix2Pix (Image-to-Image Translation with Conditional Adversarial Nets) with the image pairs composed for the previous experiment.

    I trained it locally with 790 images, and after 32h of training I had some nice results:

    Pix2Pix generates consistent normal maps of any given texture.

    Chaining models

    Workflow idea:

    Diffuse from StyleGAN -> input to Pix2Pix -> generate the corresponding normal map

First I had to bring the Pix2Pix trained model to Runway [following these steps].

    Then I used StyleGAN’s output as Pix2Pix Input. 👍

With the help of Brannon Dorsey, and following this fantastic tutorial from Daniel Shiffman, we were able to run a P5js sketch that generates randomGaussian vectors for StyleGAN and uses that output to feed Pix2Pix and generate the normal maps.
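The same chaining idea, sketched in Python instead of p5.js: sample a Gaussian latent vector, ask the StyleGAN model for a diffuse texture, then feed that texture to the Pix2Pix model for its normal map. The endpoint URLs and JSON field names below are assumptions about how the two Runway models are exposed locally, not the exact schema used in the residency:

# Chain StyleGAN -> Pix2Pix: latent vector in, diffuse + normal map out.
import base64
import numpy as np
import requests

STYLEGAN_URL = "http://localhost:8000/query"   # assumed local Runway endpoint
PIX2PIX_URL = "http://localhost:8001/query"    # assumed local Runway endpoint

z = np.random.randn(512).tolist()              # ~ randomGaussian() in the p5.js sketch

diffuse_b64 = requests.post(STYLEGAN_URL, json={"z": z}).json()["image"]
normal_b64 = requests.post(PIX2PIX_URL, json={"image": diffuse_b64}).json()["image"]

with open("diffuse.png", "wb") as f:
    f.write(base64.b64decode(diffuse_b64.split(",")[-1]))
with open("normal.png", "wb") as f:
    f.write(base64.b64decode(normal_b64.split(",")[-1]))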

    Final tests / model selection

    From all the tests, the learnings are the following:

• StyleGAN for textures works well - [SELECTED] - dataset = 5,400 images

• StyleGAN for normal maps works well, but there's no way to link the generated maps to diffuse textures - [DISCARDED]

• StyleGAN trained with a 2-in-1 dataset produces nice and consistent results, but the datasets used did not produce varied and rich textures - [DISCARDED]

• Pix2Pix to generate normal maps from a given diffuse texture works well - [SELECTED] - dataset = 790 image pairs

• Pix2Pix trained with 10,000 images produced weak results - [DISCARDED]

The final approach is: StyleGAN trained with textures from the DTD dataset (5,400 images) -> Pix2Pix trained with a small (790) dataset of Diffuse-Normal pairs from cc0textures.

    Visualization & Interaction

    I really like how runway visualises the latent space using a 2D grid to explore a multi-dimensional space of possibilities. (in resonance with some of the ideas in this article: Rethinking Design Tools in the Age of Machine Learning ).

    It would be great to build a similar approach to explore the generated textures within a 3D environment.

    To visualise textured 3D models on the web, I explored Three.js and other tools.

babylonjs seems best suited for this project:

    A quick test with the textures generated from StyleGAN looks nice enough.

Trying to add the grid effect found in Runway makes the thing a bit slower but interesting. The challenge is how to change the textures on-the-fly with the ones generated by StyleGAN/Pix2Pix…

    After some tests and experimentations I managed to load video textures, and display them within a realtime web interface. 👍

    • mouse wheel/pinch= zoom

    • click/tap + drag / arrow keys = rotate

    • *doesn't seem to work with Safari desktop

    Learnings & Release

The StyleGAN behaved as expected, producing more interesting results when given a richer and more varied dataset.
The Pix2Pix model has proven strong enough, but results could be better. For the next iterations of this idea I should explore other models, like MUNIT or even some StyleTransfer approach.

    Both checkpoints have been released and are free to use at Runway. StyleGAN-Textures + Diffuse-2-Normal

    You can also download the model files here for your local use.

    Thanks & next

    This project has been done in about two weeks, within the Something-in-Residence program at RunwayML. Thanks.

    I’ve learned a lot during this project and I’m ready to build more tools and experiments blending Design/AI/3D/Crafts and help others play with ML.

    For questions, comments and collaborations, please let me know -> contact here or via twitter.


    Background & Glossary

    Some impressive work has been previously done in that area:

    Texture Synthesis Using Convolutional Neural Networks


    Some prints:

    Read More
    Bernat Cuni Bernat Cuni

    confusing coleopterists / 🤔🐞

    breeding bugs in the latent space


    A beetle generator made by machine-learning thousands of #PublicDomain illustrations.

    Inspired by the stream of new Machine Learning tools being developed and made accessible and how they can be used by the creative industries, I was curious to run some visual experiments with a nice source material: zoological illustrations.

    Previously I ran some test with DeepDream and StyleTransfer, but after discovering the material published at Machine Learning for Artists / @ml4a_ , I decided to experiment with the Generative Adversarial Network (GAN) approach.

    Creating a Dataset

Through the Biodiversity Heritage Library, I discovered the book Biologia Centrali-Americana: zoology, botany and archaeology, hosted at archive.org, containing fantastic #PublicDomain illustrations of beetles.

    Through a combination of OpenCV and ImageMagick, I managed to extract each individual illustration and generate nicely centered square images.
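A minimal sketch of that extraction idea with OpenCV alone (the ImageMagick cleanup pass is omitted); the threshold and minimum-area values are assumptions tuned for scans with a light background:

# Threshold a scanned plate, find each illustration's contour,
# and save a roughly centered square crop per beetle.
import cv2
import os

os.makedirs("beetles", exist_ok=True)
page = cv2.imread("plate_001.jpg")                        # placeholder scan
gray = cv2.cvtColor(page, cv2.COLOR_BGR2GRAY)
_, mask = cv2.threshold(gray, 240, 255, cv2.THRESH_BINARY_INV)
contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)

for i, c in enumerate(contours):
    x, y, w, h = cv2.boundingRect(c)
    if w * h < 5000:                                      # skip specks and text
        continue
    side = max(w, h)                                      # square crop, centered on the bug
    cx, cy = x + w // 2, y + h // 2
    x0, y0 = max(cx - side // 2, 0), max(cy - side // 2, 0)
    crop = page[y0:y0 + side, x0:x0 + side]               # may be clipped at the page edge
    cv2.imwrite(f"beetles/beetle_{i:04d}.png", crop)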

    Training a GAN

Following Lecture 6: Generative models [10/23/2018] of The Neural Aesthetic @ ITP-NYU, Fall 2018 [-> Training DCGAN-tensorflow (2:27:07)], I managed to run DCGAN with my dataset, and playing with different epochs and settings I got these sets of quasi-beetles.

    Nice, but ugly as cockroaches

It was time to abandon DCGAN and try StyleGAN.

    Expected training times for the default configuration using Tesla V100 GPUs


    Training StyleGAN

    I set up a machine at PaperSpace with 1 GPU (According to NVIDIA’s repository, running StyleGan on 256px images takes over 14 days with 1 Tesla GPU) 😅

    I trained it with 128px images and ran it for > 3 days, costing > €125.

    Results were nice! but tiny.

    Generating outputs

    Since PaperSpace is expensive (useful but expensive), I moved to Google Colab [which has 12 hours of K80 GPU per run for free] to generate the outputs using this StyleGAN notebook.

    Results were interesting and mesmerising, but 128px beetles are too small, so the project rested inside the fat IdeasForLater folder in my laptop for some months.


    In parallel, I've been playing with Runway, a fantastic tool for creative experimentation with machine learning. And in late November 2019 the training feature was ready for beta-testing.

    Training in HD

    I loaded the beetle dataset and trained it at full 1024px, [on top of the FlickrHD model] and after 3000 steps the results were very nice.

From Runway, I saved the 1024px model and moved it to Google Colab to generate some HD outputs.

    In December 2019 StyleGAN 2 was released, and I was able to load the StyleGAN (1) model into this StyleGAN2 notebook and run some experiments like "Projecting images onto the generatable manifold", which finds the closest generatable image based on any input image, and explored the Beetles vs Beatles:

    model released

    make your own fake beetles with RunwayML -> start



    1/1 edition postcards

AI Bugs as crypto-collectibles

Since there is an infinite number of potential beetles that the GAN can generate, I tested adding a coat of artificial scarcity and posted some outputs at Makersplace as digital collectibles on the blockchain... 🙄🤦‍♂️


AI bugs as prints

    I then extracted the frames from the interpolation videos and created nicely organised grids that look fantastic as framed prints and posters, and made them available via Society6.

    Framed prints available in six sizes, in a white or black frame color.

    • Every product is made just for you

    • Natural white, matte, 100% cotton rag, acid and lignin-free archival paper

    • Gesso coating for rich color and smooth finish

    • Premium shatterproof acrylic cover

    • Frame dimensions: 1.06" (W) x 0.625" (D)

    • Wire or sawtooth hanger included depending on size (does not include hanging hardware)


    random

    mini print

    each print is unique [framed]

    As custom prints (via etsy)


    glossary & links

    Machine learning (ML) is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without using explicit instructions, relying on patterns and inference instead. It is seen as a subset of artificial intelligence. Machine learning algorithms build a mathematical model based on sample data, known as "training data", in order to make predictions or decisions without being explicitly programmed to perform the task.[1][2]:2 Machine learning algorithms are used in a wide variety of applications, such as email filtering and computer vision, where it is difficult or infeasible to develop a conventional algorithm for effectively performing the task.

    Machine learning is closely related to computational statistics, which focuses on making predictions using computers. The study of mathematical optimization delivers methods, theory and application domains to the field of machine learning. Data mining is a field of study within machine learning, and focuses on exploratory data analysis through unsupervised learning.[3][4] In its application across business problems, machine learning is also referred to as predictive analytics.

    The name machine learning was coined in 1959 by Arthur Samuel.[5] Tom M. Mitchell provided a widely quoted, more formal definition of the algorithms studied in the machine learning field: "A computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E."[6] This definition of the tasks in which machine learning is concerned offers a fundamentally operational definition rather than defining the field in cognitive terms. This follows Alan Turing's proposal in his paper "Computing Machinery and Intelligence", in which the question "Can machines think?" is replaced with the question "Can machines do what we (as thinking entities) can do?".[7] In Turing's proposal the various characteristics that could be possessed by a thinking machine and the various implications in constructing one are exposed.


    A generative adversarial network (GAN) is a class of machine learning systems invented by Ian Goodfellow and his colleagues in 2014.[1] Two neural networks contest with each other in a game (in the sense of game theory, often but not always in the form of a zero-sum game). Given a training set, this technique learns to generate new data with the same statistics as the training set. For example, a GAN trained on photographs can generate new photographs that look at least superficially authentic to human observers, having many realistic characteristics. Though originally proposed as a form of generative model for unsupervised learning, GANs have also proven useful for semi-supervised learning,[2] fully supervised learning,[3] and reinforcement learning.[4] In a 2016 seminar, Yann LeCun described GANs as "the coolest idea in machine learning in the last twenty years".[5]

    The generative network generates candidates while the discriminative network evaluates them.[1] The contest operates in terms of data distributions. Typically, the generative network learns to map from a latent space to a data distribution of interest, while the discriminative network distinguishes candidates produced by the generator from the true data distribution. The generative network's training objective is to increase the error rate of the discriminative network (i.e., "fool" the discriminator network by producing novel candidates that the discriminator thinks are not synthesized (are part of the true data distribution)).[1][6]

    A known dataset serves as the initial training data for the discriminator. Training it involves presenting it with samples from the training dataset, until it achieves acceptable accuracy. The generator trains based on whether it succeeds in fooling the discriminator. Typically the generator is seeded with randomized input that is sampled from a predefined latent space (e.g. a multivariate normal distribution). Thereafter, candidates synthesized by the generator are evaluated by the discriminator. Backpropagation is applied in both networks so that the generator produces better images, while the discriminator becomes more skilled at flagging synthetic images.[7] The generator is typically a deconvolutional neural network, and the discriminator is a convolutional neural network.


    StyleGAN is a novel generative adversarial network (GAN) introduced by Nvidia researchers in December 2018,[1] and open sourced in February 2019.[2][3]

    StyleGAN depends on Nvidia's CUDA software, GPUs and on TensorFlow.[4]

    The website This Person Does Not Exist showcases fully automated human image synthesis by endlessly generating images that look like facial portraits of human faces.[5] The website was published in February 2019 by Phillip Wang. The technology has drawn comparison with deep fakes[6] and the tells of poker, and its potential usage for sinister purposes has been bruited.[7]


    DeepDream is a computer vision program created by Google engineer Alexander Mordvintsev which uses a convolutional neural network to find and enhance patterns in images via algorithmic pareidolia, thus creating a dream-like hallucinogenic appearance in the deliberately over-processed images.[1][2][3]

    Google's program popularized the term (deep) "dreaming" to refer to the generation of images that produce desired activations in a trained deep network, and the term now refers to a collection of related approaches.

    The DeepDream software originated in a deep convolutional network codenamed "Inception" after the film of the same name,[1][2][3] was developed for the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC) in 2014[3] and released in July 2015.



    Read More
    Bernat Cuni Bernat Cuni

    HEADFOAM

Design and test of shock-absorbing structures produced by AM to be applied to helmets used as PPE (Personal Protective Equipment) in sports such as climbing, biking, skating, etc.

    Shock Absorbing Structures For Sports Helmet

    Head Protective Equipment Fabricated With Additive Manufacturing

Project: Design and test of shock-absorbing structures produced by AM to be applied to helmets used as PPE (Personal Protective Equipment) in sports such as climbing, biking, skating, etc.

    Innovation: Use of an impact-absorbing structure designed specifically for AM production.

Challenge: We believe that a safety helmet whose inner structure is designed as a shock-absorbing lattice and produced by AM would make a better helmet.

Application: The resulting PPE helmet can be used in outdoor sports such as climbing, biking or skating.

    Activities:

    Design study on several techniques to generate lattice structures.


Fabrication of test specimens produced in MJF PA11 and SLS TPU

    Dynamic Impact Tests

The test consisted in dropping a 5 kg round metal object onto the test subject from a height of 1 m. It was run at the User Partner's lab, equipped with testing machinery, using the 1011e 1012_MAU 1002_2W ALU_SF equipment by CKL engineering (figure 3, left). The machine measures the total impact force absorbed by the sample, in Newtons.
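For context, a quick back-of-the-envelope calculation of what such a drop delivers (not part of the official test protocol, just basic physics):

# Impact energy and speed for a 5 kg mass dropped from 1 m.
g, m, h = 9.81, 5.0, 1.0
energy = m * g * h              # ≈ 49 J delivered to the sample
velocity = (2 * g * h) ** 0.5   # ≈ 4.4 m/s at impact
print(f"impact energy ≈ {energy:.0f} J, impact speed ≈ {velocity:.1f} m/s")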

    Test results evaluation

    Helmet inner part designed with shock absorbing lattice

    prototypes & tests

    LaaS / Lattice as a Service



    This project has received funding from the European Union’s H2020 Framework Programme for research, technological development and demonstration under grant agreement no 768775.

Project supported by AMable and I4MS, done in collaboration with ProductosClimax S.A.

    AMable [AdditiveManufacturABLE ] provides support to SMEs and mid-caps for their individual uptake of additive manufacturing. Across all technologies from plastics through polymers to metals, AMable offers services that target challenges for newcomers, enthusiasts and experts alike.

    I4MS -> The EU initiative to digitalise the manufacturing industry

    Read More
    Bernat Cuni Bernat Cuni

    SheDavid

    3DScan Mashup / Venus de Milo + Michelangelo’s David

    3DScan Mashup

    Venus de Milo + Michelangelo’s David

    Sculpture / 3D Printed in StainlessSteel





    Read More
    Bernat Cuni Bernat Cuni

    mootioon.com

    Curated stream of inspiring  3D & MotionGraphics.

    mootioon -> 3D / MotionGraphics / Render / Loop / C4D -> stream

    An experiment to create a custom TV channel with an endless stream of inspiring motion graphics, 3D, vfx and CGI clips.

Inspired by the weird and mesmerising short clips that inundated MTV during the 90's, I always wanted a TV channel full of that, so I built one.

• A script continuously finds clips from several sources

• OBS streams a folder containing the clips, randomly

• Restream splits the feed to Twitch / Youtube

• The site links to most of the identified artists.

• The service is built for fun, and it doesn't make money.

    -> youtube / twitch / instagram

    Read More
    Bernat Cuni Bernat Cuni

    Permutation

    Generative collection, designed with code, 3D printed in Stoneware. 


    Generative collection, designed with code, 3D printed in Stoneware.

Pieces are composed of a random combination of nine basic units placed around a cylinder.

    Each piece is unique and exists within an immense landscape of millions of possible different combinations.

    • Material: Stoneware

    • Printed by: bcn3Dceramics

    • Software: Rhino3D / Grasshopper

    • Machine: PotterBot 3D printer

    • Design by: Bernat Cuni

    • Exhibited at: Argillà Argentona - International ceramics fair

    P114.3

148,791,629,670,981,130,805,037,453,479,575,340 possible combinations.


One hundred and forty-eight decillion, seven hundred and ninety-one nonillion, six hundred and twenty-nine octillion, six hundred and seventy septillion, nine hundred and eighty-one sextillion, one hundred and thirty quintillion, eight hundred and five quadrillion, thirty-seven trillion, four hundred and fifty-three billion, four hundred and seventy-nine million, five hundred and seventy-five thousand, three hundred and forty.

    P16.4

66,305,137,490,523,096 possible combinations.


sixty-six quadrillion, three hundred and five trillion, one hundred and thirty-seven billion, four hundred and ninety million, five hundred and twenty-three thousand and ninety-six
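The counts above come from the generative rules in the Grasshopper definition. As a rough illustration of how fast such a space grows, here is a minimal sketch assuming every slot around the cylinder independently holds one of the nine basic units; the real definition likely adds constraints, which is why the published counts differ from a plain 9^N:

# Toy model of a Permutation piece: a grid of slots wrapped around a cylinder,
# each slot filled with one of nine basic units.
import random

UNITS = list("ABCDEFGHI")          # stand-ins for the nine basic units

def random_piece(columns=12, rows=3):
    """Pick one unit for every slot of a hypothetical 12 x 3 piece."""
    return [[random.choice(UNITS) for _ in range(columns)] for _ in range(rows)]

slots = 12 * 3
print(random_piece())
print(f"{len(UNITS) ** slots:,} possible combinations for {slots} independent slots")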

    Read More
    Bernat Cuni Bernat Cuni

    art.faces - AI sculpts 3D faces from famous paintings

    Exploration on face detection and 3D reconstruction using AI

Exploration on face detection and 3D reconstruction using the "Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression" code by Aaron S. Jackson, Adrian Bulat, Vasileios Argyriou and Georgios Tzimiropoulos.

link: https://www.cunicode.com/works/artfaces

Using the demo from the paper "Large Pose 3D Face Reconstruction from a Single Image via Direct Volumetric CNN Regression", we let the AI sculpt 3D faces from famous artworks.

    We selected 8 Famous Paintings and let the Convolutional Neural Network (CNN) perform a direct regression of a volumetric representation of the 3D facial geometry from a single 2D image.


    Mona Lisa

    Leonardo da Vinci
    c. 1503–06, perhaps continuing until c. 1517
    Oil on poplar panel
    Subject: Lisa Gherardini
    77 cm × 53 cm
    Musée du Louvre, Paris


    Girl with a Pearl Earring

    Meisje met de parel
    Johannes Vermeer
    c. 1665
    Oil on canvas
    44.5 cm × 39 cm
    Mauritshuis, The Hague, Netherlands


    The Birth of Venus

    Nascita di Venere
    Sandro Botticelli,
    The Birth of Venus (c. 1484-86).
    Tempera on canvas.
    172.5 cm × 278.9 cm.
    Uffizi, Florence

    Las Meninas

    Diego Velázquez
    1656
    Oil on canvas
    318 cm × 276 cm
    Museo del Prado, Madrid


    Self-Portrait

    Vincent van Gogh
    September 1889
    Oil on canvas
    65 × 54 cm
    Musée d'Orsay, Paris.
    This may have been Van Gogh's last self-portrait.


    Laughing Cavalier

    Frans Hals
    1624
    oil on canvas
    83 cm × 67.3 cm
    Wallace Collection, London


    American Gothic

    Grant Wood
    1930
    Oil on beaverboard
    78 cm × 65.3 cm
    Art Institute of Chicago


    The Night Watch

    Rembrandt van Rijn
    1642
    Oil on canvas
    363 cm × 437 cm
    Rijksmuseum, Amsterdam


    The technology behind: 

    Code available on GitHub
Test your images with this demo.

This tech is fantastic and amazing, but as with DeepFakes, it should be handled with care… Once we can generate accurate 3D representations of anyone from minimal input data, there's the possibility of using that 3D output for digital identity theft.

As questioned in this VentureBeat article:

    So what happens when the technology further improves (which it will) and becomes accessible to marketers and brands (which it always does)? Imagine a casting call where a dozen actors are digitally and convincingly superimposed on a stand-in model prior to engaging the actors in real life.

    We can imagine a scenario where videos are created using someone’s face (and body?) to make them do whatever they want, without consent.

    On the other hand, AI is getting so smart, that in some cases, it doesn’t need a real input to go wild and imagine possible realities.

    In this paper by NVIDIA, a generative adversarial network is able to picture imaginary celebrities.

    Particularly in the 3D space, we see a huge potential for this kind of technology.

     

     


    Read More
    Bernat Cuni Bernat Cuni

    Digital Ceramics

    Ceramic 3D Printing products and services for the industry & the arts

    Ceramic Additive Manufacturing products and services for the Industry & the arts

    Powder Binding / SLA / Paste Extrusion

    Read More