AI
Human Patterns
The fact that AI systems work so well is proof that we live in a measurable world. The world is filled with structure: nature, cultures, languages, and human interactions all form intricate patterns. Computer systems are increasingly capable of copying these patterns into computer models - a process known as machine learning. As of 2023, 97 zettabytes (and growing) of data were created in the world per year (Soundarya Jayaraman, 2023). Big data is a basic requirement for training AIs, enabling learning from the structures of the world with increasing accuracy. Large datasets, such as LAION-5B with its 5.85 billion image-text pairs, were foundational for training AI to recognize images (Romain Beaumont, 2022; Schuhmann et al., 2022). Just 3 years later, GenAI models are fast enough to generate images in real time while the user is typing (Dwarkesh Patel, 2024). Similarly huge datasets exist for other types of media - and the open Internet itself, albeit less structured, is a data source frequently scraped by AI-model builders. Representations of the real world in digital models enable humans to ask questions about real-world structures and to manipulate them in synthetic experiments that may match the real world (if the model is accurate enough). This can be used for generating human-sounding language and realistic images, finding mechanisms for novel medicines, and understanding the fundamental functioning of life at its deep physical and chemical level (No Priors: AI, Machine Learning, Tech, & Startups, 2023). Venture capitalists backing OpenAI describe AI as a foundational technology which will unlock human potential across all fields of human activity (Greylock, 2022).
In essence, human patterns enable AIs. Over 80 years ago, (McCulloch & Pitts, 1943) proposed the first mathematical model of a neural network inspired by the human brain. Alan Turing's test for machine intelligence followed in 1950. Turing's initial idea was to design a game of imitation to test human-computer interaction using text messages between a human interrogator and 2 other participants, one of which was a human and the other a computer. The question was: if the interrogator was simultaneously speaking to another human and a machine, could the messages from the machine be clearly distinguished, or would they resemble a human being so closely that the person asking questions would be deceived, unable to tell which one is the human and which one is the machine? (Turing, 1950).
Alan Turing: "I believe that in about fifty years' time it will be possible to program computers, with a storage capacity of about 10⁹, to make them play the imitation game so well that an average interrogator will not have more than 70 percent chance of making the right identification after five minutes of questioning. … I believe that at the end of the century the use of words and general educated opinion will have altered so much that one will be able to speak of machines thinking without expecting to be contradicted." - from (Stanford Encyclopedia of Philosophy, 2021)
By the 2010s, AI models had become capable enough to beat humans at Go and Chess, yet they still did not pass the Turing test; AI use was limited to specific tasks. While over the years the field of AI had seen a long process of incremental improvements, developing increasingly advanced models of decision-making, the breakthrough took an increase in computing power and an approach called deep learning, a variation of machine learning (dating to the 1980s) largely modeled after the neural networks of the biological (human) brain. It returned to the idea of biomimicry: building a machine, inspired by nature, to resemble the connections between neurons, but digitally, and on layers much deeper than attempted before. Like quantum computing, AI is more of a discovery than an invention; we have no idea what the limits of intelligence are (CatGPT, 2025).
The founder of NVIDIA, Jensen Huang, whose computer chips power much of this revolution, calls it the "Intelligence Infrastructure", produced by intelligence factories and integrated into everything, just like electricity was (NVIDIA, 2025). In order to produce this intelligence, huge AI factories are being built around the world, their scale measured in energy requirements. (Calma, 2025) predicts AI will surpass Bitcoin's energy use by the end of 2025. The 500B USD Stargate project is currently building 1.2 gigawatts of AI capacity in Texas and expanding to other areas around the U.S., as well as a data center in Abu Dhabi, U.A.E., which requires 5 GW of energy and is physically bigger than the country of Monaco (Loizos, 2025; Moss, 2025). In comparison, the 500 MW xAI factory built by Elon Musk's company, powered by natural gas generators, is moderate in size (B. Wang, 2025). While OpenAI's Sam Altman is repeatedly quoted as saying that the productivity gains created by AI will far offset its environmental footprint, or words to that effect (Altman, 2024; Di Pizio, 2023), critics like (iGenius, 2020) argue that AI cannot enable a sustainable future if it is not sustainable by design; training and delivery of AI products must include sustainability considerations tied into data intelligence and business analytics.
Human Feedback
Combining deep learning and reinforcement learning from human feedback (RLHF) made it possible to achieve levels of intelligence high enough to pass the Turing test (Christiano et al., 2017; Christiano, 2021; Kara Manke, 2022). John Schulman, a co-founder of OpenAI, describes RLHF simply: "the models are just trained to produce a single message that gets high approval from a human reader" (Kara Manke, 2022). Bigger models aren't necessarily better; rather, models need human feedback to improve the quality of responses (Ouyang et al., 2022).
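To make the mechanism concrete, here is a minimal sketch (not production code) of the pairwise-comparison objective behind RLHF reward modelling as described by Christiano et al. (2017): a small reward network is trained so that responses humans preferred score higher than rejected ones. The tiny network and the random tensors standing in for encoded responses are illustrative assumptions.

```python
# Sketch of the pairwise-comparison objective behind RLHF reward modelling.
# The tiny network and random "responses" are placeholders; real systems
# score transformer hidden states of actual text.
import torch
import torch.nn as nn

class RewardModel(nn.Module):
    def __init__(self, dim=16):
        super().__init__()
        self.score = nn.Sequential(nn.Linear(dim, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, response):           # response: (batch, dim) feature vector
        return self.score(response).squeeze(-1)

model = RewardModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

# Each training example is a pair: a response the human preferred ("chosen")
# and one they rejected. Random tensors stand in for encoded text.
chosen, rejected = torch.randn(64, 16), torch.randn(64, 16)

for step in range(100):
    # Bradley-Terry style loss: push the chosen score above the rejected score.
    loss = -torch.nn.functional.logsigmoid(model(chosen) - model(rejected)).mean()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

# The trained reward model is then used to fine-tune the language model
# (e.g. with PPO) so that generations score highly under human preferences.
```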
The nature-inspired approach was successful. Innovations such as back-propagation, for reducing errors by updating model weights, and transformers, for tracking relationships in sequential data (for example in sentences), enabled AI models to become increasingly capable (Merritt, 2022; Vaswani et al., 2017). Generative Adversarial Networks trained models by pitting them against each other (Goodfellow et al., 2014). Large Language Models enabled increasingly generalized systems, capable of more complex tasks such as language generation (Radford et al., 2018).
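As a small illustration of the transformer idea of "tracking relationships in sequential data", the sketch below implements scaled dot-product attention, the core operation from Vaswani et al. (2017), on dummy embeddings; the dimensions and data are placeholders.

```python
# Toy illustration of scaled dot-product attention: every token mixes
# information from every other token, weighted by how relevant it looks.
import numpy as np

def attention(Q, K, V):
    scores = Q @ K.T / np.sqrt(K.shape[-1])                 # pairwise relevance
    scores = scores - scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)  # softmax
    return weights @ V                                      # weighted mix of values

tokens = np.random.randn(5, 8)            # 5 tokens, 8-dimensional dummy embeddings
out = attention(tokens, tokens, tokens)   # self-attention: Q, K, V from same sequence
print(out.shape)                          # (5, 8): each token now carries context
```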
One of the leading scientists in this field of research, Geoffrey Hinton, had already attempted back-propagation in the 1980s and reminisces how:
"the only reason neural networks didn't work in the 1980s was because we didn't have enough data and we didn't have enough computing power" (CBS Mornings, 2023).
(Epoch AI, 2024) reports the growth in computing power and the evolution of more than 800 AI models since the 1950s. Very simply, more data and more computing power means more intelligent models.
(Alammar, 2018) provides an illustrated explanation of how transformers work.
By the 2020s, AI-based models had become a mainstay in medical research, drug development, and patient care (Holzinger et al., 2023; Leite et al., 2021), quickly finding potential vaccine candidates during the COVID-19 pandemic (Zafar & Ahamed, 2022), as well as in self-driving vehicles - including cars, delivery robots, and drones at sea and in the air - and AI-based assistants. The existence of AI models has wide implications for all human activities, from personal to professional. The founder of the largest chip-maker, NVIDIA, calls upon all countries to develop their own AI models which would encode their local knowledge, culture, and language to make sure these are accurately captured (World Governments Summit, 2024).
OpenAI has researched a wide range of approaches towards artificial general intelligence (AGI), work which has led to advances in large language models (AI Frontiers, 2018; Ilya Sutskever, 2018). In 2020, OpenAI released an LLM called GPT-3, trained on 570 GB of text (Alex Tamkin & Deep Ganguli, 2021), which was adept at text generation. (Singer et al., 2022) describes how collecting billions of images with descriptive data (for example the descriptive alt text which accompanies images on websites) enabled researchers to train AI models such as Stable Diffusion for image generation based on human language. This training makes use of deep learning, a layered approach to AI training where the increasing depth of the computer model captures minute details of the world. Much is still to be understood about how deep learning works; even for specialists, the fractal structure of deep learning can only be called mysterious (Sohl-Dickstein, 2024).
AI responses are probabilistic and need some function for ranking response quality. Achieving a higher percentage of correct responses requires oversight, which can come in the form of human feedback or from other AI systems which are deemed to be already well-aligned (termed Constitutional AI by Anthropic) (Bai et al., 2022; Bailey, 2023). One approach to reducing alignment issues with AI is to introduce some function for human feedback and oversight into automated systems. Human involvement can take the form of interventions from the AI developer themselves as well as from the end-users of the AI system. Such feedback is not only provided by humans; computers can give feedback to computers too. Less powerful AIs are taught to follow human values by more powerful and aligned AIs, which understand the world better: for example, META used LLAMA 2 for aligning LLAMA 3.
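A minimal sketch of this AI-supervising-AI idea, assuming a best-of-n selection loop; the generate and score functions below are placeholders, not any lab's actual pipeline.

```python
# Sketch of AI feedback via best-of-n selection: a weaker model proposes
# candidate answers, a stronger "teacher" model scores them against a rule
# such as a constitutional principle. Both model calls are stubs here.
import random

def student_generate(prompt: str) -> str:
    return random.choice([f"draft A for {prompt}", f"draft B for {prompt}",
                          f"draft C for {prompt}"])

def teacher_score(prompt: str, answer: str, principle: str) -> float:
    # A real system would ask the aligned model to rate the answer against
    # the principle; here we just return a dummy score.
    return random.random()

def best_of_n(prompt: str, principle: str, n: int = 4) -> str:
    candidates = [student_generate(prompt) for _ in range(n)]
    return max(candidates, key=lambda c: teacher_score(prompt, c, principle))

print(best_of_n("Explain photosynthesis to a child",
                "Be helpful, harmless and honest"))
```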
There are many examples of combining AI and humans, also known as "human-in-the-loop", used in fields as diverse as training computer vision algorithms for self-driving cars and detecting disinformation in social media posts (Bonet-Jover et al., 2023; Wu et al., 2023); the approach is also known as human-based computation or human-aided artificial intelligence (Mühlhoff, 2019; Shahaf & Amir, 2007). (Ge Wang, 2019) from the Stanford Institute for Human-Centered Artificial Intelligence describes core design principles for building interactive AI systems that augment rather than replace people: (1) value human agency, (2) offer granularity of control, and (3) provide transparency.
App | Category | Use Case |
---|---|---|
Welltory | Health | Health data analysis |
Wellue | Health | Heart arrhythmia detection |
QALY | Health | Heart arrhythmia detection |
Starship Robots | Delivery | The robot may ask for human help in a confusing situation, such as when crossing a difficult road |
In order to provide human feedback, systems need to be able to distinguish humans from AIs. To that end, several "Proof of Humanity" toolsets are being built. (Gitcoin Passport — Sybil Defense. Made Simple. [@gitcoinpassport], 2023) discusses how Gitcoin Passport's Unique Humanity Score is built as an antifragile passport, inspired by Nassim Taleb's popular book (Taleb, 2012). Taleb defines "antifragility" as "systems that benefit from volatility and stressors", summarizing it in a letter to Nature thus:
“a convex response to a stressor or source of harm (for some range of variation), leading to a positive sensitivity to increase in volatility” - antifragility.
Gitcoin's Passport pulls together proofs of identity from web2 platforms, but adds a unique twist: a "Cost of Forgery" as protection against fake users (aka Sybil attacks, where a malicious person fakes identities so it looks like many independent users). Forging identities becomes more expensive for the attacker, turning attack pressure into a self-reinforcing defense; however, while this approach works, it sets a very high bar for users to comply with, and it requires a cryptocurrency to set the price of the attacks (Gitcoin Passport — Sybil Defense. Made Simple. [@gitcoinpassport], 2023). In contrast, another popular proof-of-personhood protocol, called World, verifies humanness via physical scans of human irises, captured by its Orb device, again using cryptography to compare a proof (a ZK-SNARK) against a centralized database (Gent, 2023). From the user-experience perspective, this approach is much simpler (while requiring physical presence for the iris scan). Given that World was co-founded by OpenAI co-founder Sam Altman, this may be one way he plans to counter the possible societal disruptions accelerated by OpenAI's products.
AI as the Idiot Savant
Hinton likes to call AI an idiot savant: someone with exceptional aptitude yet a serious mental disorder (CBS Mornings, 2023). Large AI models don't understand the world like humans do; their responses are predictions based on their training data and complex statistics. Indeed, the comparison is apt, as the AI field now offers jobs for AI psychologists, whose role is to figure out what exactly is happening inside the 'AI brain' (Waddell, 2018). Understanding the insides of AI models trained on massive amounts of data is important because they are foundational, enabling a holistic approach to learning that combines many disciplines through language, instead of the reductionist way we as humans think because of our limitations (CapInstitute, 2023). Hinton received a Nobel prize for modeling how the brain works and for proposing, already in 1986, the idea of predicting the next word in a sequence, which later became the basis for large language models (CBS Mornings, 2025).
Foundation models enable generative AIs, a class of models able to generate many types of *tokens*, such as text, speech, audio (Kreuk et al., 2022; San Roman et al., 2023), music (Copet et al., 2023; Meta AI, 2023), video, and even complex structures such as 3D models and DNA structures, in any language they are trained on. The advent of generative AIs was a revolution in human-computer interaction, as AI models became increasingly capable of producing human-like content that is hard to distinguish from actual human creations. This power comes with an increased need for responsibility, drawing growing interest in fields like AI ethics and AI explainability. Generative AI has the potential for misuse, as humans are increasingly confused about what is computer-generated and what is human-created, unable to separate one from the other with certainty.
(Bommasani et al., 2021) define foundation models as large-scale pretrained models adaptable to diverse downstream tasks, thoroughly accounting for opportunities - such as capabilities across language, vision, robotics, and reasoning - and risks: bias, environmental cost, economic shifts, and governance, highlighting the need for interdisciplinary research to understand deeply how these models work, and when and how they fail. Understanding failure is crucial, as there is the question of who bears the responsibility for the actions taken by the AI (especially in its most agentic forms, with access to the internet and tools outside the model itself). Research in organizational behavior indicates that when individuals exert influence through intermediaries - known as indirect agency - their ethical judgment can become distorted: humans may believe they are behaving ethically while, in reality, they exhibit reduced concern for those affected by their decisions, resulting in less accountability for moral failures and an expectation of fewer consequences for unethical conduct (Gratch & Fast, 2022).
The technological leap is disruptive enough for people to start calling it the start of a new era. (Noble et al., 2022) proposes AI has reached a stage of development marking the beginning of the 5th industrial revolution, a time of collaboration between humans and AI. Widespread Internet of Things (IoT) sensor networks, which gather data analyzed by AI algorithms, integrate computing even deeper into the fabric of daily human existence. Several terms of different origin but considerable overlap describe this phenomenon, including Pervasive Computing (Y. Rogers, 2022) and Ubiquitous Computing. Similar concepts are Ambient Computing, which focuses more on the invisibility of technology, fading into the background without us humans even noticing it, and Calm Technology, which highlights how technology respects humans and our limited attention spans and doesn't call attention to itself. In all cases, AI is an integral part of our everyday life, inside everything and everywhere. Today AI is not an academic concept but a mainstream reality, affecting our daily lives everywhere, even when we don't notice it.
Algorithmic Experience and Transparency: Before AIs
Before AIs, as users of social media, we became accustomed to interacting with feed algorithms that provide a personalized algorithmic experience. Social media feed algorithms are more deterministic than AI, meaning they produce more predictable output in comparison to AI models. Nonetheless, there are many reports about the effects these algorithms have on human psychology, including loneliness, anxiety, fear of missing out, social comparison, and even depression (De et al., 2025; Qiu, 2021).
Design is increasingly relevant to algorithms - algorithm design - and more specifically to algorithms that affect user experience and user interfaces. When design is concerned with the ethical, environmental, socioeconomic, resource-saving, and participatory aspects of human-machine interactions and aims to steer technology in a more human direction, it can hope to create an experience designed for sustainability.
(Lorenzo et al., 2015) underlines the role of design beyond designing, as a tool for envisioning; in her words, "design can set agendas and not necessarily be in service, but be used to find ways to explore our world and how we want it to be". Practitioners of Participatory Design (PD) have for decades advocated for designers to become more activist through action research. This means influencing outcomes, not only observing phenomena passively as a researcher, or focusing solely on usability as a designer without taking the wider context into account.
(Shenoi, 2018) argues that inviting domain expertise into the discussion, within a sustainable design process, enables designers to design for experiences where they are not domain experts; this applies to highly technical fields such as medicine, education, governance, and - in our case here - finance and sustainability, while building respectful dialogue through participatory design. After many years of political outcry (Crain & Nadler, 2019), social media platforms such as Meta's Facebook and Twitter (later renamed X) have begun to shed more light on how these algorithms work, in some cases releasing the source code (Nick Clegg, 2023; Twitter, 2023).
The content on a platform can be more important than the interface: applications with a similar UI differ by their community, their content, and how that content is shown to the user.
Transitioning to Complexity: Non-Deterministic Systems
AIs are non-deterministic, which requires a new set of considerations when designing them. AI systems may make use of several algorithms within one larger model. It follows that AI explainability requires algorithmic transparency.
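One concrete source of this non-determinism is sampling: a generative model outputs a probability distribution over the next token, and the interface samples from it, with a temperature parameter controlling how much randomness is allowed. A small illustrative sketch (dummy logits, not a real model):

```python
# Why the same prompt can yield different answers: generative models sample
# the next token from a probability distribution. Temperature reshapes that
# distribution; 0 approaches deterministic, higher values add variety.
import numpy as np

def sample_next_token(logits, temperature=1.0, rng=np.random.default_rng()):
    if temperature == 0:                      # greedy: always pick the top token
        return int(np.argmax(logits))
    scaled = np.asarray(logits) / temperature
    probs = np.exp(scaled - scaled.max())     # softmax with numerical stability
    probs /= probs.sum()
    return int(rng.choice(len(probs), p=probs))

logits = [2.0, 1.5, 0.3, -1.0]                # dummy scores for 4 candidate tokens
print([sample_next_token(logits, temperature=0.8) for _ in range(5)])  # varies run to run
```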
Being Responsible, Explainable, and Safe: Legislation Adapts and Sets Boundaries for AI
As AI-based solutions permeate every aspect of human life, legislation is starting to catch up. On March 13, 2024, the European Parliament approved (with 523 votes for and 46 against) the EU AI law, taking a risk-based approach to a regulatory framework which aims to support innovation while safeguarding democracy and environmental sustainability (Lomas, 2024). Specifically, the EU Artificial Intelligence Act (Regulation EU 2024/1689) establishes the first comprehensive legal framework for AI in the world, aiming to harmonize rules to ensure that AI systems are safe, human-centric, and rights-respecting; the act defines a tiered system that bans unacceptable risks and regulates high-risk uses, imposing transparency duties on developers of AI systems, for example near-realtime (hourly) CO2eq emissions reports from the AI models (European Union, 2024). In order to help international jurisdictions tailor which incidents and hazards they track, and to enable interoperability, the Organization for Economic Cooperation and Development (OECD) later also defined 2 types of AI risk: an "AI incident", where an AI system causes real harm, and an "AI hazard", a potential-harm scenario, both of which can be raised to "serious" variants (OECD, 2024).
"As humans we tend to fear what we don't understand" is a common sentiment which has been confirmed by psychology (Allport, 1979). Current AI models are opaque 'black boxes', where it's difficult to pinpoint exactly why a certain decision was made or how a certain expression was reached, not unlike the inside of the human brain. This line of thought leads me to the idea of AI Psychologists, who might figure out the Thought Patterns inside the model. Research in AI explainability (XAI in the literature) is on the lookout for ways to create more Transparency and Credibility in AI systems, which could lead to building trust in AI systems and would form the foundations for AI Acceptance.
The problem of opaqueness gives rise to the field of Explainable AI. (Bowman, 2023) says steering Large Language Models is unreliable; even experts don't fully understand the inner workings of the models. Work towards improving both AI steerability and AI alignment (doing what humans expect) is ongoing. (Holbrook, 2018) argues that in order to reduce errors which only humans can detect, and to provide a way to stop automation from going in the wrong direction, it's important to focus on making users feel in control of the technology. There is an increasing number of tools for evaluating and tracking LLM applications and for explainability of neural networks (Leino et al., 2018; TruEra, 2023). (Liang et al., 2022) believes there is early evidence that it's possible to assess the quality of LLM output transparently. (Cabitza et al., 2023) proposes a framework for the explainability of AI expressions to guide XAI research, focusing on the qualities of formal soundness and cognitive clarity. (Khosravi et al., 2022) proposes a framework for AI explainability focused squarely on education, which brings in communication with stakeholders and human-centered interface design. (Holzinger et al., 2021) highlights possible approaches to implementing transparency and explainability in AI models, introducing the concept of multimodal causability, where an AI system uses pictures, text, and charts all at once, which could help the human user see cause and effect across different kinds of data.
A chart (not reproduced here) presents the AI Credibility Heuristics: A Systematic Model, which draws a parallel to Daniel Kahneman's book "Thinking, Fast and Slow".
A movement called Responsible AI seeks to mitigate generative AI's known issues. Given the widespread use of AI and the increasing power of foundation models, it's important these systems are created in a safe and responsible manner. While there have been calls to pause the development of large AI experiments (Future of Life Institute, 2023) so the world could catch up, this is unlikely to happen. There are several problems with the current generation of LLMs from OpenAI, Microsoft, Google, NVIDIA, and others.
(Christiano, 2023) believes there are plenty of ways to reach bad outcomes (existential risk) even without extinction risk. In order to mitigate these risks (and perhaps to appease the public), all the major AI labs have taken steps to be safer. Anthropic, which was founded by former OpenAI employees who left OpenAI over this very issue, led the movement by announcing a responsible scaling policy (Anthropic's Responsible Scaling Policy, 2023). OpenAI itself announced a dedicated "Superalignment" team, co-led by Ilya Sutskever and Jan Leike; they made a specific promise to commit 20% of the company's compute budget to building, within the next 4 years, an AI system that can itself research and refine alignment methods, effectively solving the alignment problem for superintelligent AI (which is considered the highest risk) (Jan Leike & Ilya Sutskever, 2023). OpenAI has previously admitted it does not yet fully understand how the internals of a neural network work; they are developing tools to represent neural network concepts for humans (Gao et al., 2024; OpenAI, 2024a). Outside of the major labs, several independent AI safety organizations have also been launched, for example METR, the Model Evaluation & Threat Research group incubated in the Alignment Research Center (METR, 2023).
A popular approach to AI safety is red-teaming, which means pushing the limits of LLMs, trying to get them to produce outputs that are racist, false, or otherwise unhelpful. Mapping the emerging abilities of new models is a job in itself.
Problem | Description |
---|---|
Monolithicity | LLMs are massive monolithic models requiring large amounts of computing power for training in order to offer multi-modal capabilities across diverse domains of knowledge, making training such models possible for very few companies. (Liu et al., 2023) proposes that future AI models may instead consist of a number of networked domain-specific models to increase efficiency and thus become more scalable. |
Opaqueness | LLMs are opaque, making it difficult to explain why a certain prediction was made by the AI model. One visible expression of this problem are hallucinations: the language models are able to generate text that is confident and eloquent yet entirely wrong. Jack Krawczyk, the product lead for Google's Bard (since renamed Gemini): "Bard and ChatGPT are large language models, not knowledge models. They are great at generating human-sounding text, they are not good at ensuring their text is fact-based. Why do we think the big first application should be Search, which at its heart is about finding true information?" |
Biases and Prejudices | AI bias is well-documented and a hard problem to solve (Liang et al., 2023). Humans don't necessarily correct mistakes made by computers and may instead become "partners in crime" (Krügel et al., 2023). People are prone to bias and prejudice; it's a part of the human psyche. Human brains are limited and actively avoid learning to save energy. These same biases are likely to appear in LLM outputs as they are trained on human-produced content. Unless there is active work to counter and eliminate these biases from LLM output, they will appear frequently. |
Missing Data | LLMs have been pre-trained on massive amounts of public data, which gives them the ability to reason and generate in a human-like way, yet they are missing specific private data, which needs to be ingested to augment the LLMs' ability to respond to questions on niche topics (Liu, 2022). |
Data Contamination | There are concerns about the math ability of LLMs: "performance actually reflects dataset contamination, where data closely resembling benchmark questions leaks into the training data, instead of true reasoning ability" (Zhang et al., 2024). |
Lack of Legislation | OpenAI proposes proactive work on common standards and legislation to ensure AI safety (Anderljung et al., 2023). It's difficult to come up with clear legislation; the U.K. government organized the first AI safety summit in 2023 (Browne, 2023). |
In 2024, OpenAI released its “Model Spec” to define clearly their approach to AI safety with the stated intention to provide clear guidelines for the RLHF approach (OpenAI, 2024c).
Evolution of Models and Emerging Abilities
The debate between open-source and closed-source AI is ongoing. Historically, open source has been useful for finding bugs in code, as more pairs of eyes are looking at the code and someone may see a problem the programmers have not noticed. Proponents of closed-source development, however, worry about the dangers of releasing such powerful technology openly and the possibility of bad actors such as terrorists, hackers, or violent governments using LLMs for malice. The question of whether closed-source or open-source development will lead to more AI safety is one of the large debates in the AI industry.
Personal AI assistants to date have been created by large tech companies, mostly using closed-source AI. However, open-source AI models have opened up an avenue for smaller companies and even individuals to create new AI assistants - perhaps using the same underlying foundation model as the base, but adding new data, abilities, or tools, or just innovating on the UI/UX stack. An explosion of personal AI assistants powered by foundation models can be found across use cases. The following table lists only a tiny sample of such products.
App | Features |
---|---|
socratic.org | Study buddy |
youper.ai | Mental health helper |
fireflies.ai | Video call transcription |
murf.ai | Voice generator |
In any case, open or closed source, real-world usage of LLMs may demonstrate the limitations and edge cases of AI. Hackathons such as (Pete, 2023) help come up with new use cases and disprove some potential ideas. The strongest proponent of open-source AI, META, open-sourced the largest language model at the time (70 billion parameters), with performance rivaling several of the proprietary models; because META's core business is not AI - rather, it would benefit from having access to cheaper, better AI across the board - open-sourcing may be their best strategy (Dwarkesh Patel, 2024).
AI Model | Released | Company | License | Country |
---|---|---|---|---|
GPT-1 | 2018 | OpenAI | Open Source | U.S. |
GPT-2 | 2019 | OpenAI | Open Source | U.S. |
Turing-NLG | 2020 | Microsoft | Proprietary | U.S. |
GPT-3 | 2020 | OpenAI | Proprietary | U.S. |
GPT-3.5 | 2022 | OpenAI | Proprietary | U.S. |
GPT-4 | 2023 | OpenAI | Proprietary | U.S. |
AlexaTM | 2022 | Amazon | Proprietary | U.S. |
NeMo | 2022 | NVIDIA | Open Source | U.S. |
PaLM | 2022 | Proprietary | U.S. | |
LaMDA | 2022 | Proprietary | U.S. | |
GLaM | 2022 | Proprietary | U.S. | |
BLOOM | 2022 | Hugging Face | Open Source | U.S. |
Falcon | 2023 | Technology Innovation Institute | Open Source | U.A.E. |
Tongyi | 2023 | Alibaba | Proprietary | China |
Vicuna | 2023 | Sapling | Open Source | U.S. |
Wu Dao 3 | 2023 | BAAI | Open Source | China |
LLAMA 2 | 2023 | META | Open Source | U.S. |
PaLM-2 | 2023 | Proprietary | U.S. | |
Claude 3 | 2024 | Anthropic | Proprietary | U.S. |
Mistral Large | 2024 | Mistral | Proprietary | France |
Gemini 1.5 | 2024 | Proprietary | U.S. | |
LLAMA 3 | 2024 | META | Open Source | U.S. |
AFM | 2024 | Apple | Proprietary | U.S. |
Viking 7B | 2024 | Silo | Open Source | Finland |
GPT-4.5 | 2025 | OpenAI | Proprietary | U.S. |
DeepSeek-R1 | 2025 | Hangzhou DeepSeek Artificial Intelligence Basic Technology Research Co., Ltd 杭州深度求索人工智慧基礎技術研究有限公司 | Open Source | China |
GPT-5 | 202? | OpenAI | Unknown; trademark registered | U.S. |
A foundational paper on the scaling laws of LLMs by (Kaplan et al., 2020) provided a quantitative roadmap linking model size, data, and compute to predicted performance; helpful for guiding large-scale investment into LLMs. The proliferation of different models enables comparisons of performance based on several metrics, from accuracy on standardized tests such as the GMAT, usually taken by humans, to reasoning about less well-defined problem spaces. The open-source AI leaderboard project (Chiang et al., 2024; lmsys.org, 2024) has collected over 500 thousand human rankings of outputs from 82 large language models, evaluating reasoning capabilities, and as of 2024 rates GPT-4 and Claude 3 Opus as the top performers. Model performance is not one-dimensional; (OpenAI, 2024b) shows how GPT-4o combines different abilities into the same model, preserving more information which in previous models was lost in data conversion (for example for images). Another metric is metacognition, defined as knowing about knowing (Metcalfe & Shimamura, 1994) or "keeping track of your own learning" as defined by educators in sustainability (an example of how the same term is useful across academic fields) (Zero Waste Europe et al., 2022). Anthropic's Claude 3 was the first model capable of metacognition, promoting it as a feature and calling out a mistake made by itself (Shibu, 2024).
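Returning to the scaling laws: their core is a power-law relationship between loss and model size or dataset size. The sketch below uses the approximate constants reported by Kaplan et al. (2020); treat the exact numbers as illustrative rather than authoritative.

```python
# Illustrative power-law scaling from Kaplan et al. (2020): test loss falls
# predictably as parameters (N) or training tokens (D) grow. Constants are
# approximate values from the paper, used here only for illustration.
ALPHA_N, N_C = 0.076, 8.8e13   # parameter scaling exponent and constant
ALPHA_D, D_C = 0.095, 5.4e13   # data scaling exponent and constant

def loss_from_params(n_params: float) -> float:
    return (N_C / n_params) ** ALPHA_N

def loss_from_tokens(n_tokens: float) -> float:
    return (D_C / n_tokens) ** ALPHA_D

for n in (1e8, 1e9, 1e10, 1e11):
    print(f"{n:.0e} params -> predicted loss ≈ {loss_from_params(n):.2f}")
```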
With the proliferation of AI models, AI benchmarking has developed into its own industry, with many ways to measure a model's performance. In the early days, (Hendrycks et al., 2020) revealed models' uneven knowledge and lack of calibration with the introduction of MMLU (Measuring Massive Multitask Language Understanding), a 57-task benchmark covering domains from elementary math to law, showing GPT-3 at 43.9% accuracy vs 89.8% for human experts (19 points above random chance but far below human-expert level). Later models have reached or surpassed humans in this particular benchmark, necessitating the creation of newer, more difficult tests for AI. Another foundational AI paper, (Zellers et al., 2019)'s HellaSwag, is also accompanied by a leaderboard website (still being updated after publication) listing AI model performance, with the most recent entry dated April 16, 2024.
Moreover, benchmarking is not only about the abilities, knowledge, and alignment of the model itself. Interactions with other systems are equally important to measure, such as Retrieval Augmented Generation (RAG) performance. Generative AI applications retrieve data from unstructured external sources in order to augment LLMs' existing knowledge with current information (Leng et al., 2024). (Ragas, 2023) suggests evaluating one's RAG pipelines enables Metrics-Driven Development. Likewise, LangSmith, the developer platform for LLM-powered apps (which make extensive use of RAG), dissects the LLM app lifecycle into a pipeline: debug, collaborate, test, and monitor (LangChain, 2024). As turning unstructured inputs into structured data is one of the core use cases of LLMs, conforming the outputs strictly to standards such as JSON is crucial (otherwise the production app might even break) - which is why OpenAI's Structured Outputs, which OpenAI claims achieves 100% schema reliability, was an important step in bringing AI into mainstream app development (Pokrass, 2024).
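A minimal sketch of the RAG pattern itself, with a toy keyword-overlap retriever and a placeholder model call; real systems use vector embeddings and an actual LLM API, so everything named here is an illustrative assumption.

```python
# Sketch of Retrieval Augmented Generation (RAG): retrieve the documents most
# relevant to the question, then pass them to the model as context.
DOCS = [
    "The 2024 report lists 1.2 GW of planned data-center capacity in Texas.",
    "Viking 7B is an open model focused on Nordic languages.",
    "Structured Outputs constrain model responses to a JSON schema.",
]

def retrieve(question: str, docs=DOCS, k=2):
    # Toy retriever: rank documents by word overlap with the question.
    q_words = set(question.lower().split())
    ranked = sorted(docs, key=lambda d: len(q_words & set(d.lower().split())),
                    reverse=True)
    return ranked[:k]

def llm(prompt: str) -> str:
    # Placeholder for a real model call.
    return f"[model answer based on a prompt of {len(prompt)} characters]"

question = "Which model focuses on Nordic languages?"
context = "\n".join(retrieve(question))
answer = llm(f"Answer using only this context:\n{context}\n\nQuestion: {question}")
print(answer)
```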
Meta's head AI researcher, Yann LeCun, predicts LLMs may have reached their limitations; for innovation, AIs need to understand the physical world and to reason in abstract space, which does not require language - something a cat can do when figuring out where to jump. In comparison, languages are simple because they are discrete, with very little noise (NVIDIA Developer, 2025).
Price of Tokens vs Price of Human Labor
At the end of the day, the adoption of AI into everyday life, even in the smallest of contexts, will come down to price. Long-time AI engineer (Ng, 2024) predicts - having seen the roadmaps of the microchip industry as well as incoming hardware and software innovations - that the price of tokens will be very low, and much lower than that of a comparable human worker.
Human Acceptance of Artificial Companions
Human Expectations Take Time to Change
AI acceptance is contingent on traits that are increasingly human-like and that would make a human acceptable: credibility, trustworthiness, reliability, dependability, integrity, character, etc. (Zhang et al., 2023) found humans are more likely to trust an AI teammate if they are not deceived by its identity; it's better for collaboration to make it clear one is talking to a machine. One step towards trust is the explainability of AI systems. AIs should disclose they are AIs.
(Zerilli et al., 2022) focuses on human factors and ergonomics and argues that transparency should be task-specific: while transparency is key to trust and system monitoring, it should extend beyond explainability. After an AI makes an error, different forms of AI transparency - (1) explanations, (2) confidence metrics, (3) human control over task allocation - affect human confidence in the system and have diverse levels of ability to repair human trust in the AI. To expand on the third point: in adaptable allocation, the user always decides when to keep a task and when to hand it to the AI algorithm, while in adaptive allocation, the system itself decides (by monitoring its own uncertainty) when to give difficult or risky cases back to the human.
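A small sketch of adaptive allocation under these assumptions (a placeholder classifier and an arbitrary confidence threshold); in adaptable allocation, the human would make the hand-off decision instead.

```python
# Sketch of adaptive task allocation: the system monitors its own uncertainty
# and hands risky cases back to a human reviewer.
CONFIDENCE_THRESHOLD = 0.80   # arbitrary illustrative threshold

def classify(case: dict) -> tuple[str, float]:
    # Placeholder for a real model; returns a label and a confidence score.
    return ("approve", case.get("model_confidence", 0.5))

def handle(case: dict) -> str:
    label, confidence = classify(case)
    if confidence < CONFIDENCE_THRESHOLD:
        return f"escalated to human reviewer (confidence {confidence:.2f})"
    return f"auto-decision: {label} (confidence {confidence:.2f})"

print(handle({"id": 1, "model_confidence": 0.95}))   # handled automatically
print(handle({"id": 2, "model_confidence": 0.55}))   # handed back to a human
```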
Humans still need some time to adjust their expectations of what's possible using conversational AI interfaces. (Bailey, 2023) believes people are used to search engines and it will take a little bit of time to get familiar with talking to a computer in natural language to accomplish their tasks. For example, new users of v0, an AI assistant for building user interfaces through conversation, would tell humans (the company making this app) about the issues they encounter, instead of telling the AI assistant directly, even though the AI in many cases could fix the problem instantly. Human users don't yet necessarily expect computers to behave like another human; there's inertia in the mental model of what computers are capable of, requiring the user interfaces to provide context and to teach humans how to interact with their AI coworkers (Rauch, 2024). Indeed, ChatGPT is already using buttons to explain context (Feifei Liu 刘菲菲, n.d.).
Speaking the mother tongue of the users is a way to gain trust. English is still over-represented in current models, so some local models focus on better understanding the local context; for example, Viking 7B ("Silo AI's New Release Viking 7B, Bridges the Gap for Low-Resource Languages," 2024) focuses on Nordic languages. However, as time progresses, large general-purpose LLMs may catch up and integrate all this knowledge - or even potentially be taught by the local models.
Affective Computing: Towards Friendly Machines
Rosalind Picard founded the field of affective computing, aiming to make computers more human-friendly, pioneering early approaches to recognizing human emotions with sensors and providing users experiences that take human emotion into account (Picard, 1997).
It’s not an overstatement to say that data from all the processes around us will define the future of computing (HIITTV, 2021). In the early examples, electrodermal activity of the skin and heart-rate variance data were used to detect the emotional state and stress level of the user (Velmovitsky et al., 2022; Zangróniz et al., 2017). This technology has since become mainstream in products such as Fitbit and the Apple Watch, among many others.
Personal experience:
Apple Watch features Fall Detection, which I've experienced personally. Riding my bicycle to the NCKU library after rain, I slipped and landed on my stomach on the pavement. The Apple Watch immediately asked me: "It looks like you've taken a hard fall" and offered an option to call an ambulance. Fortunately I was OK, but if I had needed assistance, this AI algorithm delivered contextual help which could have protected my health.
Affective computing doesn't rely only on sensors. It makes use of human language through the tools of computational linguistics, such as natural language processing (NLP). Since the first mainframe computers with a rudimentary ability to use pattern matching to respond with simple text messages, humans have been drawn to discussing our private lives with machines. If programmed to be helpful, computers don't necessarily judge us like a human could. A famous anecdote concerns the MIT ELIZA project (an early example of a language model) developed by Joseph Weizenbaum in the mid-1960s, where the lab assistant Pamela McCorduck would dedicate extended time to talking to the machine in private. The machine was called ELIZA and would assume a persona called DOCTOR to emulate a Rogerian psychotherapist - a style of person-centered therapy developed by Carl Rogers, built on the core idea that positive psychological functioning is an inherently human motivation (Bassett, 2019; McCorduck, 2004; C. R. Rogers, 1995).
An excerpt from the classic, first-of-its-kind conversation between a human and a machine (called DOCTOR) in 1965, as documented by (Tarnoff, 2023; Weizenbaum, 1966):
User: Well, my boyfriend made me come here.
DOCTOR: YOUR BOYFRIEND MADE YOU COME HERE?
User: He says I’m depressed much of the time.
DOCTOR: I AM SORRY TO HEAR YOU ARE DEPRESSED.
User: It’s true. I am unhappy.
DOCTOR: DO YOU THINK COMING HERE WILL HELP YOU NOT TO BE UNHAPPY?
User: I need some help, that much seems certain.
DOCTOR: WHAT WOULD IT MEAN TO YOU IF YOU GOT SOME HELP?
User: Perhaps I could learn to get along with my mother.
DOCTOR: TELL ME MORE ABOUT YOUR FAMILY.
User: My mother takes care of me.
DOCTOR: WHO ELSE IN YOUR FAMILY TAKES CARE OF YOU?
User: My father.
DOCTOR: YOUR FATHER?
User: You are like my father in some ways.
DOCTOR: WHAT RESEMBLANCE DO YOU SEE?
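ELIZA produced exchanges like the one above with simple pattern matching and pronoun reflection rather than any understanding of the conversation. A tiny sketch of that mechanism follows; the rules here are illustrative, not Weizenbaum's original DOCTOR script.

```python
# How ELIZA-style exchanges arise from pattern matching and pronoun
# reflection alone; the rules below are illustrative placeholders.
import re

REFLECTIONS = {"i": "you", "my": "your", "me": "you", "am": "are"}
RULES = [
    (r"i am (.*)", "I AM SORRY TO HEAR YOU ARE {0}."),
    (r"my (.*) made me come here", "YOUR {0} MADE YOU COME HERE?"),
    (r"i need (.*)", "WHAT WOULD IT MEAN TO YOU IF YOU GOT {0}?"),
    (r"my (.*)", "TELL ME MORE ABOUT YOUR FAMILY."),
]

def reflect(fragment: str) -> str:
    # Swap first-person words for second-person ones ("my" -> "your", ...).
    return " ".join(REFLECTIONS.get(w, w) for w in fragment.lower().split())

def respond(utterance: str) -> str:
    for pattern, template in RULES:
        match = re.match(pattern, utterance.lower())
        if match:
            return template.format(*[reflect(g).upper() for g in match.groups()])
    return "PLEASE GO ON."

print(respond("I am unhappy"))                    # I AM SORRY TO HEAR YOU ARE UNHAPPY.
print(respond("My boyfriend made me come here"))  # YOUR BOYFRIEND MADE YOU COME HERE?
```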
Weizenbaum later expressed concern about how easily humans can be misled by AIs, projecting fantasies onto computer systems, and cautioned technologists not to absolve humans of responsibility for societal problems; AI is not a universal solution (Z.M.L, 2023).
Artificial Empathy Also Builds Trust
Today's machines are much more capable, so it's no surprise humans would like to talk to them. One example is a conversational chatbot - or AI friend - called Replika, a computer model trained to be your companion in daily life. Replika was launched in 2017 and by 2024 was used by 30 million people; the focus is on empathetic dialogue to support mental well-being, sort of like a friend, a digital companion (or even a romantic partner, in paid versions of the app), and it includes an animated avatar interface (Eugenia Kuyda, 2023). Replika can ask probing questions, tell jokes, and learn about your personality and preferences to generate more natural-sounding conversations. (Bardhan, 2022; Tristan Greene, 2022) report on anecdotal evidence from Reddit boards which shows how some users of the Replika AI companion app feel so much empathy towards the robot that they confuse it with a sentient being, while others use verbal abuse and gendered slurs to fight with their AI companions. When the quality of AI responses becomes good enough, people begin to get confused. (Jiang et al., 2022) describes how Replika users in China use it in 5 main ways, all of which rely on empathy (see the table below). The company's CEO insists it's not trying to replace human relationships but to create an entirely new relationship category with the AI companion; there's value for the users in more realistic avatars, integrating the experience further into users' daily lives through various activities and interactions (Patel, 2024).
How humans express empathy towards the Replika AI companion |
---|
Companion buddy |
Responsive diary |
Emotion-handling program |
Electronic pet |
Tool for venting |
Surprisingly, humans can have emotionally deep conversations with robots. Jakob Nielsen notes two recent studies suggesting humans deem AI-generated responses more empathetic than human responses, at times by a significant margin; however, telling users the response is AI-generated reduces the perceived empathy (Ayers et al., 2023; Nielsen, 2024c; Yin et al., 2024). LLMs combined with voice, such as the Pi iOS app, provide a user experience which (Ethan Mollick [@emollick], 2023) calls unnerving. The company behind it, Inflection AI, provides emotional intelligence as a service, has developed its own proprietary LLM, and has raised over 1B USD in funding (A. Mittal, 2024). While startups are moving fast, traditional AI companies with decades of AI experience, such as Google, are also developing AI assistants for giving life advice (Goswami, 2023). The conversations can be topic-specific. For instance, (Unleash, 2017) used BJ Fogg's tiny habits model to develop a sustainability-focused AI assistant at the Danish hackathon series Unleash, to encourage behavioral changes towards maintaining an aspirational lifestyle, nudged by a chatbot buddy.
On the output side, (Lv et al., 2022) studied the effect of the cuteness of AI apps on users and found that high perceived cuteness correlated with higher willingness to use the apps, especially for emotional tasks. Part of this is learning how to use emojis in the right amount and at the right time; increasingly, emojis are a part of natural human language (Tay, 2023).
Already more than two decades ago, (Reeves & Nass, 1998) argued that humans treat computers as social actors (not unlike real people and places), with very minimal cues from a machine (like a voice or screen avatar) triggering social behaviors.
Conversation: A Magical Starting Point of a Relationship
High-quality conversations are somewhat magical in that they can establish trust and build rapport with humans. (Celino & Re Calegari, 2020) found, in testing chatbots as survey interfaces, that "[c]onversational survey lead to an improved response data quality."
There are noticeable differences in the quality of LLM output, which increases with model size. (Levesque et al., 2012) developed the Winograd Schema Challenge, looking to improve on the Turing test by requiring the AI to display an understanding of language and context. The test consists of a story and a question whose meaning changes with the context: "The trophy would not fit in the brown suitcase because it was too big" - what does the "it" refer to? Humans are able to understand this from context, while computer models used to fail. Even GPT-3 still failed the test, but later LLMs have been able to solve it correctly (90% accuracy) (Kocijan et al., 2022). This is to say AI is in constant development, improving its ability to make sense of language.
ChatGPT is the first user interface (UI) built by OpenAI on top of its GPT models and is able to communicate in a human-like way: using first person, making coherent sentences that sound plausible, and even sounding confident and convincing. ChatGPT reached 1 million users in 5 days and, 6 months after launch, had 230 million monthly active users (M. C. Wang, 2023). While it was the first, competing offers from Google (Gemini), Anthropic (Claude), Meta (Llama), and others quickly followed, starting a race for the best performance across specific tasks, including standardized tests from math to science to general knowledge and reasoning abilities.
OpenAI provides AI-as-a-service through its application programming interfaces (APIs), allowing 3rd-party developers to build custom UIs to serve the specific needs of their customers. For example, Snapchat has created a virtual friend called "My AI" which lives inside the chat section of the Snapchat app and helps people write faster with predictive text completion and by answering questions. The APIs make state-of-the-art AI models easy to use without needing much technical knowledge. Teams at AI hackathons have produced interfaces for problems as diverse as humanitarian crisis communication, briefing generation, code completion, and many others. While models are powerful, they still need access to other services and tools to be able to achieve the tasks which humans do online on a daily basis; for this to be possible, the Model Context Protocol (MCP) standard provides the structure to link models to APIs in other services, especially useful in agentic workflows, where the model uses chain-of-thought reasoning and may call various other tools and services in the process (Heidel & Handa, 2025; Hungerford, 2025; Pandey & Freiberg, 2025).
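A minimal sketch of what such an API call looks like, assuming the OpenAI Python SDK, an API key set in the environment, and an illustrative model name:

```python
# Sketch of calling an LLM behind an AI-as-a-service API, using the OpenAI
# Python SDK as an example (assumes `pip install openai` and OPENAI_API_KEY
# in the environment; the model name is illustrative).
from openai import OpenAI

client = OpenAI()  # reads the API key from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # illustrative model name
    messages=[
        {"role": "system", "content": "You are a concise writing assistant."},
        {"role": "user", "content": "Suggest a subject line for a fundraising email."},
    ],
)
print(response.choices[0].message.content)
```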
ChatGPT makes it possible to evaluate AI models just by talking, i.e. having conversations with the machine and judging the output with some sort of structured content-analysis tool. Cahan & Treutlein (2023) have conversations about science with AI; Brent A. Anders (2022-2023) reports on AI in education. Just like humans, AIs are continuously learning. (Ramchurn et al., 2021) discusses positive feedback loops in continually learning AI systems which adapt to human needs. (Kecht et al., 2023) suggests AI is even capable of learning business processes.
Multi-Modality: Natural Interactions with AI Systems, Agents and the Intention Economy
While AI outperforms humans on many tasks, humans are experts in multi-modal thinking, bridging diverse fields. Humans are multi-modal creatures by birth: to varying ability, we speak, see, and listen using our biological bodies. AIs are becoming multi-modal by design, to be able to match all the human modes of communication - increasing their humanity.
Multimodal model development is ongoing. Previously, providing multi-modal features meant combining several AI models within the same interface. For example, on the input side, one model is used for human speech or image recognition, transcribed into tokens that can be ingested by an LLM. On the output side, the LLM can generate instructions which are fed into an image or audio generation model, or even computer code which can be run on a virtual machine with the output displayed inside the conversation. However, this is changing, with a single model able to handle several tasks internally (thus losing less data and context). By early 2024, widely available LLM front-ends such as Gemini, Claude, and ChatGPT had all released basic features for multi-modal communication. In the case of Google's Gemini 1.5 Pro, one model is able to handle several types of prompts, from text to images. Multimodal prompting, however, requires larger context windows - as of writing, 1 million tokens in a private version - allowing text and images to be combined in the question directed to the AI, and used to reason about examples such as a 44-minute Buster Keaton silent film or the Apollo 11 launch transcript (404 pages) (Google, 2024).
(Fu et al., 2022) provides an overview of conversational AI, from a survey of over 100 peer-reviewed articles published 2018-2021 (a long time ago in terms of AI development), categorizing systems into (1) rule-based, (2) retrieval-based, and (3) generative types; generative transformer models have led the AI field, yet continue to face challenges with coherence over extended interactions and ensuring factual accuracy (hallucinations), retrieval-augmented tooling improves information accuracy, and reinforcement learning and fine-tuning approaches are effective in adjusting conversational style and safety; the authors also highlight that human evaluation for reinforcement learning is still required, as commonly used automated evaluation metrics for AI models, such as BLEU, ROUGE, and BERTScore have limited correlation with human judgments.
Paper Focus Area | Key Insight | Strengths | Limitations |
---|---|---|---|
Generative transformer models (GenAI) | Recent advancement in AI models | High language fluency, adaptability | Poor long-term coherence, struggles with facts |
Retrieval-augmented hybrids (RAG) | Retrieval methods enhance truthfulness | Improved factual grounding | Difficulty in integrating retrieved content |
Reinforcement-learning | Fine-tuning can steer conversational style and safety | Flexible style and safety alignment | High resource usage, sensitive to reward design |
The literature also delves into human-AI interactions on an almost human-like level, discussing what kinds of roles the AIs can take. (Seeber et al., 2020) proposes a future research agenda for regarding AI assistants as teammates rather than just tools, and the implications of such a mindset shift: from assistant to teammate to companion to friend - and the best help for anxiety is a friend. AIs are able to assume different roles based on user requirements and usage context. This makes AI-generated content flexible and malleable. The path from *assistance* to *collaboration* requires another level of trust. It's not only what role the AI takes but how that affects the human; humans have ample experience relating to other humans, so the approach towards an assistant versus a teammate will vary. While an experimental study (Lenharo, 2023) reports AI productivity gains, with DALL-E and ChatGPT being qualitatively better than former automation systems, we might still be 1-3 years away from systems that qualify as teammates. Once AI reaches that level, would it change how humans treat it? Not because the AI might be hurt, but because of how it affects the psyche of the user: this is an area which needs much more attention. One researcher in this field, Karpus et al. (2021), is concerned with humans treating AI badly and coins the term algorithm exploitation.
Context of use: where is the AI used? (Schoonderwoerd et al., 2021) focuses on human-centered design of AI apps and multi-modal information display. It's important to understand the domain where the AI is deployed in order to develop explanations. However, in the real world, how feasible is it to have control over the domain? Calisto et al. (2021) discusses a multi-modal AI assistant for breast cancer classification.
We can also see the AI as being in human service. (David Johnston, 2023) proposes Smart Agents, "general purpose AI that acts according to the goals of an individual human". AI agents can enable an Intention Economy, where one simply describes one's needs and a complex orchestration of services ensues, managed by the AI, in order to fulfill human needs (Searls, 2012). AI assistants provide help at scale with little to no human intervention in a variety of fields, from finance to healthcare to logistics to customer support. OpenAI's "A practical guide to building agents" defines AI agents as "systems that independently accomplish tasks on your behalf" and details step-by-step how to build one (OpenAI, 2025).
AI agents enable workflow automation with reasoning capability, taking actions across different tools to achieve the user's original intent; what's left for the user to do is to say what they want to achieve. As models get smarter, there's less and less need to build workflows (chains of thought) manually, as they end up restricting the model instead of improving the output; the remaining use case would be using a cheaper model with less intelligence and more guardrails set in code (Latent Space, 2025; Sengottuvelu, 2025). In software development, AI can already debug problems automatically; Apple uses data from bug reports to train AI models for improving its software (Saini, 2025). And it's increasingly possible to generate entire apps from a prompt, using tools such as Bolt.new (Fanelli, 2024). The quality of LLM output depends on the quality of the provided prompt. (Zhou et al., 2022) reports creating an "Automatic Prompt Engineer" which generates instructions that outperform the baseline output quality, by placing another model in the AI pipeline in front of the LLM to enhance the human input with language that is known to produce better results. This approach, however, is a moving target, as foundation models keep changing rapidly and the baseline might differ from today to 6 months later.
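A minimal sketch of an agent loop under these assumptions: a placeholder "model" decides whether to call a tool or finish, and the runtime executes the chosen tool and feeds the result back. A real agent would delegate this decision to an LLM, for example via function calling or MCP.

```python
# Sketch of an agent loop: decide -> act -> observe, repeated until the goal
# is met. The model_decide() stub and the toy tools are placeholders.
TOOLS = {
    "get_weather": lambda city: f"22°C and sunny in {city}",
    "book_table": lambda place: f"table booked at {place} for 19:00",
}

def model_decide(goal: str, history: list) -> dict:
    # Placeholder policy: call each tool once, then finish.
    if not any(h["tool"] == "get_weather" for h in history):
        return {"action": "call", "tool": "get_weather", "arg": "Tainan"}
    if not any(h["tool"] == "book_table" for h in history):
        return {"action": "call", "tool": "book_table", "arg": "a rooftop restaurant"}
    return {"action": "finish",
            "answer": f"Done: {'; '.join(h['result'] for h in history)}"}

def run_agent(goal: str) -> str:
    history = []
    while True:
        step = model_decide(goal, history)
        if step["action"] == "finish":
            return step["answer"]
        result = TOOLS[step["tool"]](step["arg"])   # execute the chosen tool
        history.append({"tool": step["tool"], "result": result})

print(run_agent("Plan dinner outside if the weather is nice"))
```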
Mediated Experiences Set User Expectations
How AIs are represented in popular media shapes the way we think about AI companions. Some stories have AIs both in positive and negative roles, such as Star Trek and Knight Rider. In some cases like Her and Ex Machina, the characters may be complex and ambivalent rather than fitting into a simple positive or negative box. In Isaac Asimov’s books, the AIs (mostly in robot form) struggle with the 3 laws of robotics, raising thought-provoking questions.
AI assistants in media portrayals mostly have some level of anthropomorphism, through voice or image, to be filmable; indeed, a purely text-based representation may be too boring and un-cinematic.
There have been dozens of AI characters in movies, TV series, games, and (comic) books. In most cases, they have a physical presence or a voice, so they can be visible to the viewers. Some, like KITT (Knight Industries Two Thousand), are listed in the table below.
Movie / Series / Game / Book | Character | Positive | Ambivalent | Negative |
---|---|---|---|---|
2001: A Space Odyssey | HAL 9000 | | | X |
Her | Samantha | | X | |
Alien | MU/TH/UR 6000 (Mother) | | X | |
Terminator | Skynet | | | X |
Summer Wars | Love Machine | | | X |
Marvel Cinematic Universe | Jarvis, Friday | X | | |
Knight Rider | KITT | X | | |
Knight Rider | CARR | | | X |
Star Trek | Data | X | | |
Star Trek | Lore | | | X |
Ex Machina | Kyoko | | X | |
Ex Machina | Ava | | X | |
Tron | Tron | X | | |
Neuromancer | Wintermute | | X | |
The Caves of Steel / Naked Sun | R. Daneel Olivaw | X | | |
The Robots of Dawn | R. Giskard Reventlov | X | | |
Portal | GLaDOS | | | X |
Roleplay Fits Computers Into Social Contexts: AI Friends and Anthropomorphism
Affective design emerged from affective computing, with a focus on understanding user emotions in order to design UI/UX which elicits specific emotional responses (Reynolds, 2001). Calling a machine a friend is a proposal bound to turn heads. But take a step back and think about how children have been playing with toys since before we have records of history. It's very common for children to imagine stories and characters in play - it's a way to develop one's imagination and learn through roleplay. A child might have toys with human names and an imaginary friend, and it all seems very normal. Indeed, if a child doesn't like to play with toys, we might think something is wrong. Likewise, inanimate objects with human form have had a role to play for adults too. Anthropomorphic paddle dolls have been found in Egyptian tombs dated to around 2000 B.C. ("Paddle Doll Middle Kingdom," 2023): we don't know if these dolls were for religious purposes, for play, or for something else, yet their burial with the body underlines their importance.
Is anthropomorphism - being human-like - necessary? (The savings literature in the Money section says it is.) Research on anthropomorphism in the AI literature suggests that giving an AI assistant stronger human-like cues (high anthropomorphism) rather than weaker ones (low anthropomorphism) leads users to view it more favorably, and this effect operates through a shorter perceived psychological distance; yet, even though many studies confirm the benefits of anthropomorphism, the precise psychological pathway behind those benefits has rarely been dissected in depth (X. Li & Sung, 2021). Nonetheless, people are less likely to attribute humanness to an AI companion if they understand how the system works, so higher algorithmic transparency may inhibit anthropomorphism (Liu & Wei, 2021).
Coming back closer to our own time, Barbie dolls have been popular since their release in 1959. Throughout the years, the doll followed changing social norms but retained its human figure. In the 1990s, the Tamagotchi offered not a human-like but an animal-like friend, able to interact in limited ways.
How are conversational AIs different from dolls? They can respond coherently, and perhaps that's the issue - they are too much like humans in their communication. We have crossed the Uncanny Valley (where the computer-generated is nearly human and thus unsettling) to a place where it is really hard to tell the difference. And if that's the case, are we still playing?
Should the AI play a human, animal, or robot? Anthropomorphism can have its drawbacks; humans have certain biases and preconceptions that can affect human-computer interactions. For example, somewhat curiously, (Pilacinski et al., 2023) reports humans were less likely to collaborate with red-eyed robots.
AI startups like Inworld and Character.AI have raised large funding rounds to create characters that can be plugged into online worlds and, more importantly, remember key facts about the player, such as their likes and dislikes, to generate more natural-sounding dialogue (Wiggers, 2023).
(Morana et al., 2020) conducted a lab experiment (n = 183) showing that a more anthropomorphic chatbot design boosts the perceived social presence of the virtual advisor; social presence in turn influences recommendation adherence indirectly, with trust mediating the likelihood of following the advisor's recommendations. As AIs become more expressive - socially present - and able to roleplay, we can begin discussing some human-centric concepts and how people relate to other people. AI companions, AI partners, AI assistants, AI trainers - there are many roles for the automated systems that help humans in many activities, powered by AI models and algorithms.
(Erik Brynjolfsson, 2022) contrasts AI that emulates human intelligence with AI that augments human abilities, arguing that although the former can offer productivity gains, it risks concentrating wealth and reducing the economic power of workers - a dynamic he calls the Turing Trap. Plenty of research - both before and after AI-induced job losses - has documented the negative effects of unemployment on mental health (Anton Korinek, 2023; Dew et al., 1991; Susskind, 2017).
Non-anthropomorphic, machine-like AIs have been with us for a while. The Oxford Internet Institute defines AI simply as "computer programming that learns and adapts" (Google & The Oxford Internet Institute, 2022). Google started using AI in 2001, when a simple machine learning model improved spelling correction in search; by 2023 most of Google's products were based on AI (Google, 2022). Throughout Google's services, AI is hidden and calls no attention to itself. It is simply the complex system working behind the scenes to deliver a result in a barebones interface.
The rising availability of AI assistants may displace Google search with a more conversational user experience. Google itself is working on tools that could cannibalize their search product. The examples include Google Assistant, Google Gemini (previously known as Bard) and massive investments into new LLMs.
The number of AI-powered assistants is too large to list here. I’ve chosen a few select examples in the table below.
Product | Link | Description |
---|---|---|
GitHub Copilot | github.com/features/copilot | AI helper for coding |
Google Translate | translate.google.com | Machine translation |
Google Search | google.com | AI-augmented web search |
Google Interview Warmup | grow.google/certificates/interview-warmup | AI-based interview training tool |
Perplexity | perplexity.ai | Chat-based AI search [@hinesPerplexityAnnouncesAI2023] |
Everything that existed before OpenAI's GPT-4 has been blown out of the water. ChatGPT passes many exams meant for humans and can solve difficult tasks in scientific areas such as chemistry from simple natural-language instructions (Bubeck et al., 2023; White, 2023). As late as 2017, scientists were still trying to create a program with enough natural-language understanding to extract basic facts from scientific papers (Stockton, 2017) - a task that is trivial for modern LLMs.
Pre-2023 literature is somewhat limited when it comes to AI companions, as the advent of LLMs has significantly raised the bar for AI-advisor abilities as well as user expectations. Before LLMs, chatbots struggled with evolving human language and the complexity of context, irregular grammar, slang, etc. (Lower, 2017). Some advice remains evergreen, mostly where it relates to human psychology, which has stayed the same. (Haugeland et al., 2022) discusses hedonic user experience in chatbots and (Steph Hay, 2017) explains the relationship between emotions and financial AI. (Isabella Ghassemi Smith, 2019) reports early performance metrics of AI-driven features across financial markets, showing AI outperforming traditional quant strategies, which is expected to lead to wider adoption of autonomously generated investment signals.
Interfaces for Human-Computer Interaction
Speech Makes Computers Feel Real
There is evidence across disciplines of the usefulness of AI assistants, while concerns remain about whether privacy can be implemented. One attempt at privacy is Apple's Foundation Language Models (AFM), which are split into a smaller on-device model and a server-side model, enabling processing of the most sensitive data directly on the user's device (Dang, 2024). Giving the AI a voice raises new ethical issues, as most voice assistants need to continuously record human speech and process it in cloud data centers.
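As a hedged sketch of that split-processing idea (not Apple's actual API), a request router might keep personally identifiable content on the device and send the rest to a larger server model; the names and the sensitivity flag below are hypothetical.

```python
# Illustrative on-device / server split for privacy; all names are hypothetical.
from dataclasses import dataclass

@dataclass
class Request:
    text: str
    contains_personal_data: bool  # e.g. health records, messages, contacts

def run_on_device(text: str) -> str:
    # Placeholder for a small local model that never sends data off the device.
    return f"[on-device model] {text[:60]}"

def run_on_server(text: str) -> str:
    # Placeholder for a larger cloud-hosted model.
    return f"[server model] {text[:60]}"

def route(req: Request) -> str:
    # Keep sensitive content local; everything else may use the bigger model.
    return run_on_device(req.text) if req.contains_personal_data else run_on_server(req.text)

print(route(Request("Summarize my last three doctor's notes", contains_personal_data=True)))
print(route(Request("What is the capital of Estonia?", contains_personal_data=False)))
```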
Siri, Cortana, Google Assistant, Alexa, Tencent Dingdang, Baidu Xiaodu, Alibaba's AliGenie - all rely on voice as their main interface. Voice has a visceral effect on the human psyche: from birth we recognize the voice of our mother, and the voice of a loved one has a special effect. Voice is an integral part of the human experience. Machines that can use voice effectively are closer to representing and affecting human emotions. Voice assistants such as Apple's Siri and Amazon's Alexa are well known, yet Amazon's Rohit Prasad thinks the technology can do much more:
“Alexa is not just an AI assistant - it’s a trusted advisor and a companion” (Prasad, 2022).
(Şerban & Todericiu, 2020) suggests that using the Alexa AI assistant in education during the pandemic supported students and teachers with a human-like presence. The Alpha generation (born since 2010) and Beta generation (born since 2025) will be the first true native AI users. (Su & Yang, 2022) and (Su et al., 2023) reviewed papers on AI literacy in early childhood education and found a lack of guidelines and teacher expertise. (Szczuka et al., 2022) provides guidelines for voice AI and kids based on a longitudinal field study of children's knowledge about the storage and data processing performed by AI voice assistants; published in the International Journal of Child-Computer Interaction, the study tracked children (n = 20, age M = 8.65 years) across 3 home visits over 5 weeks (each visit lasting 45–90 min), including interviews and hands-on interactions designed to probe children's mental models, with the following key findings: (1) children made significantly more accurate statements about data processing than about storage, (2) parental discussion predicted storage knowledge, and (3) better storage knowledge correlated negatively with willingness to share secrets. To cover these knowledge gaps at the earliest age, educational materials on AI are available for children from kindergarten to primary school; for instance, the (ReadyAI, 2020) book introduces the 5 big ideas of human-AI interaction for children aged 2-8: perception (the use of sensors), representation and reasoning (data structures, algorithms, predictions), learning (recognizing patterns in data), natural interaction (emotion, language, expression recognition, even cultural knowledge), and finally, societal impact (biases, ethics, guidelines to avoid unfair outcomes). Finally, (Yang, 2022) proposes a curriculum for in-context teaching of AI in early childhood, explaining why AI literacy is essential: life is shaped by the core concepts of data-driven pattern recognition, prediction, and the many algorithmic limitations - all of which should be taught in a culturally responsive manner that is easy for young children to grasp, using inquiry (question)-based pedagogy to engage learners meaningfully.
Design guidelines can be extremely specific. (Casper Kessels, 2022) details 18 concrete do's and don'ts for in-car voice assistants, drawing on prior distraction research, to support driving safety and integrate seamlessly with the other interfaces in the vehicle, for instance:
"Auditory information should come from the same location as visual information" to minimize spatial attention shifts; "Be aware of visual distraction. [S]ome drivers tend to direct their gaze towards the 'source' of the voice assistant when speaking. Make sure an interaction sequence does not cause unnecessary visual distraction" - example guidelines for voice assistants from (Casper Kessels, 2022).
Some research suggests that a voice UI accompanied by a physically embodied system is preferred by users over a voice-only UI (Celino & Re Calegari, 2020).
Generative UIs Enable Flexibility of Use
The "grandfather" of user experience design, (Nielsen, 2024a), recounts how 30 years of work towards usability have largely failed - computers are still not accessible enough; however, he has hope that Generative UI could provide levels of accessibility that human designers could not.
Computers are “difficult, slow, and unpleasant” (Nielsen, 2024a)
Data-driven design combined with GenAI enables Generative User Interfaces (GenUI), with new UI interactions. The promise of GenUI is to dynamically provide an interface appropriate for the particular user and context. Advances in the capabilities of LLMs make it possible to achieve user experience (UX) that previously was science fiction: AI can predict what kind of UI the user needs right now, based on data and context. Generative UIs are largely invented in practice, based on user-data analysis and experimentation, rather than being built in theory. Kelly Dern, a Senior Product Designer at Google, led a workshop in early 2024 on GenUI for product inclusion, aiming to create "more accessible and inclusive [UIs for] users of all backgrounds". (Matteo Sciortino, 2024) coins the phrase RTAG UIs, "real-time automatically-generated UI interfaces", mainly drawing on the example of how his Netflix interface looks different from his sister's because of their distinct usage patterns.
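One way to make the GenUI idea concrete is to have an LLM propose a small JSON "UI spec" for the current user and context, which the client then renders. The schema, model name, and prompt below are assumptions for illustration, not any product's API.

```python
# Hedged sketch of a Generative UI loop: the model proposes a JSON UI spec
# adapted to the user profile and context; the client renders it.
import json
from openai import OpenAI

client = OpenAI()

SCHEMA_HINT = (
    'Return only JSON like {"components": [{"type": "chart|table|text|button", '
    '"title": "...", "props": {}}]}.'
)

def generate_ui(user_profile: dict, context: str) -> dict:
    resp = client.chat.completions.create(
        model="gpt-4o-mini",  # assumed model
        response_format={"type": "json_object"},
        messages=[
            {"role": "system", "content": "You design minimal, accessible UIs. " + SCHEMA_HINT},
            {"role": "user", "content": f"User profile: {json.dumps(user_profile)}\nContext: {context}"},
        ],
    )
    return json.loads(resp.choices[0].message.content)

spec = generate_ui(
    {"vision": "low", "expertise": "novice", "goal": "check portfolio sustainability"},
    "monthly review on a mobile device",
)
for component in spec.get("components", []):
    print(component.get("type"), "-", component.get("title"))
```

In practice the generated spec would be validated against a fixed component library before rendering, which also speaks to the predictability concern raised below.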
Nonetheless, ("On Nielsen's Ideas about Generative UI for Resolving Accessibility," 2024) is critical of GenUI for the following reasons:
Problem | Description |
---|---|
Low Predictability | Does personalization mean the UI keeps changing? |
High Carbon Cost | AI-based personalization is computation-intensive |
Surveillance | Personalization needs large-scale data capture |
(Nielsen, 2024b) defines information scent as users’ ability to predict destination content from cues, such as link labels and context; clear descriptive labels emit a strong scent, guiding users, reducing bounce rates (users who leave quickly), and enhancing discoverability of content; in contrast, misleading labels break trust and drive users away. The idea of information scent is originally from Information Foraging theory from (Pirolli & Card, 1999), who adapt optimal foraging theory to human information seeking: users follow links as scent cues to maximise their rate of information gain.
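As a toy illustration of the scent idea (a simple word-overlap heuristic, not Pirolli & Card's formal model), candidate links can be scored against the user's goal terms:

```python
# Toy "information scent" score: overlap between a link label and the goal terms.
def scent(link_label: str, goal: str) -> float:
    label_terms = set(link_label.lower().split())
    goal_terms = set(goal.lower().split())
    return len(label_terms & goal_terms) / max(len(goal_terms), 1)

goal = "compare sustainable index funds fees"
links = ["Our fees explained", "Sustainable index funds", "Careers", "Press releases"]
for label in sorted(links, key=lambda l: scent(l, goal), reverse=True):
    print(f"{scent(label, goal):.2f}  {label}")
```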
However, with AI-chat and voice-based interfaces, links lose some of their relevance, as users can receive more information from the AI without having to navigate to a new page. With less focus on links, current AI UX is more about storytelling, psychology, and seamless design, with more focus on human-centered communication patterns, such as conversations. (Kate Moran & Sarah Gibbons, 2024) call for "highly personalized, tailor-made interfaces that suit the needs of each individual", which they term Outcome-Oriented Design. We can generate better UIs (UI orchestration, crafting "systems of intent", as (Nielsen, 2025) calls it) that are based on user data and would be truly personalized. (Crompton, 2021) highlights AI as decision support for humans while differentiating between intended and unintended influence on human decisions. In all this literature and more, the keyword is intent: expressing what the human wants - and having the machines deliver it.
Human-computer interaction (HCI) has a long, storied history since the early days of computing, when getting a copy machine to work required specialized skill. Early human-factors work at Xerox PARC inspired the field of HCI to make computers more human-friendly. Likewise, the history of attempts at making intelligent interfaces is extensive. ("Generative UI Design," 2023; Kobetz, 2023) give an overview of the history of generative AI design tools, going back as far as 2012, when (Troiano & Birtolo, 2014) proposed genetic algorithms for UI design. As the old science fiction adage goes, when machines become capable enough, they will eventually produce machines themselves. Before that happens, at least the software part of the machine can increasingly be generated by AI systems (i.e. machines making machines). Already a decade ago, in 2014, the journal Information Sciences dedicated a special section to AI-generated software to call attention to this tectonic shift in software development (Reformat, 2014). Replit, a startup known for letting users build apps in the web browser, released Openv0, a framework of AI-generated UI components. "Components are the foundation upon which user interfaces (UI) are built, and generative AI is unlocking component creation for front-end developers, transforming a once arduous process, and aiding them in swiftly transitioning from idea to working components" (Replit, 2023). Vercel introduced an open-source prototype UI generator called v0, which used large language models (LLMs) to create code for web pages based on text prompts (Vercel, 2023). Other similar tools quickly followed, including Galileo AI, Uizard AutoDesigner and Visily (Who Benefits the Most from Generative UI, 2024). NVIDIA founder Jensen Huang makes the idea exceedingly clear, saying "Everyone is a programmer. Now, you just have to say something to the computer" (Leswing, 2023).
The usefulness of AI systems increases profoundly as they are integrated into existing products and services, which become akin to tools the AI can use when appropriate. (Joyce, 2024) highlights how Notion AI enables collaboration across teams, where the AI becomes akin to a co-worker; AI influences UI design patterns and boosts productivity by providing new features such as memory: recalling important discussions from past meetings, surfacing key insights, and generating reports in a variety of formats, personalized to the intended receiver.
A wide range of literature describes human-AI interactions, spread out over varied scientific disciplines. While the fields of application for AI are diverse, some key lessons can be transferred horizontally across fields of knowledge.
Field | Usage |
---|---|
Shipping | [@veitchSystematicReviewHumanAI2022] highlights the active role of humans in human-AI interaction in autonomous self-navigating ship systems. |
Data Summarization | AI is great at summarizing and analyzing data [@petersGoogleChromeWill2023; @tuWhatShouldData2023] |
Childcare | Generating personalized bedtime stories |
Design Tools | [@DavidHoangHow2024] |
Usability Is the Bare Minimum of Good User Experience
Many researchers have discussed the user experience (UX) principles of designing AI products. The UX of AI (terms such as AI UX, IXD, and XAI have been used) is the subject of several usability guidelines for AI, which provide actionable advice for improving AI usability and UX - some of which I will list here.
(Combi et al., 2022) proposes a conceptual framework for XAI, analyzing AI based on (1) interpretability, (2) understandability, (3) usability, and (4) usefulness. (Costa & Silva, 2022) highlights key UI/UX patterns for interaction design in AI systems and strategies to make AI behaviors transparent and controllable, including (1) interactive explanations, (2) human-in-the-loop controls, and (3) logging of contextual decisions - all seamlessly integrated into user workflows. ("Why UX Should Guide AI," 2021) argues that in order to avoid context blindness (where the AI lacks awareness of the broader human intent) and foster trust and safe use, UX should (1) clarify limitations, (2) build clear feedback, (3) embed user-override mechanisms, and (4) in general ensure users retain meaningful control over specialized AI algorithms. (Lexow, 2021) synthesizes expert interviews into five foundational AI-UX principles: (1) deeply understand the user and task context, (2) clearly communicate AI limitations, (3) balance automation with user control, (4) build fast, iterative feedback paths into the interface, and (5) ensure AI behaviour aligns ethically - and with your brand voice.
(Lennart Ziburski, 2018) emphasizes human-centered design for AI, including five key tenets: (1) start from existing user workflows that can be augmented by AI, (2) under-promise and over-deliver on AI capabilities, (3) transparently explain how the system works (data sources, trade-offs), (4) involve users in the learning loop, and (5) design AI as an empowering tool rather than a black box. (Dávid Pásztor, 2018) offers seven principles for AI-powered products: (1) visually distinguish GenAI content, (2) explain underlying processes and data privacy, (3) set realistic user expectations, (4) test edge cases proactively, (5) ensure AI engineers have access to high-quality training data, (6) deploy rigorous user testing, and (7) use immediate feedback channels for continuous improvement. (Lew & Schumacher, 2020) likewise focuses on (1) high data quality, (2) context-sensitive feedback, and (3) transparent controls. (Soleimani, 2018) provides the longest list of human-friendly UI/UX patterns for AI, with very specific suggestions including like/dislike toggles, confidence indicators and criteria sliders, "why" insights, risk alerts, and opt-in controls - all to foster transparency, user control, and trust in algorithmic decisions. (Harvard Advanced Leadership Initiative, 2021) focuses on principles for effective human-AI interaction in adaptive interfaces, illustrating the case of Semantic Scholar, where researchers' intelligence is augmented via recommendation, summarization, and question-answering, while emphasizing user control and verification mechanisms.
Many large corporations have released guidelines for human-AI interaction as well. The AI UX team at Ericsson's Experience Design Lab released one of the early reports, exploring the role of trust in AI services and suggesting that AIs be treated as agents rather than tools; for the design to be successful, trust must be embedded into the interface front and center, best measured on 4 categories inspired by human relationships: (1) competence, (2) benevolence, (3) integrity, and (4) charisma (Mikael Eriksson Björling & Ahmed H. Ali, 2020). (Cheng et al., 2022) describes AI-based support systems for collaboration and teamwork, underlining how higher trust leads to willingness to reuse the AI in the future, collaboration satisfaction, and perceived task quality. Google's AI Principles project provides Google's UX for AI library (Google, n.d.; Josh Lovejoy, n.d.). In (Design Portland, 2018), Lovejoy, lead UX designer at Google's people-centric AI systems department (PAIR), reminds us that while AI offers new tools, user experience design needs to remain human-centered: AI can find patterns and offer suggestions, but humans should always have the final say.
Microsoft provides guidelines for human-AI interaction, offering useful heuristics categorized by context and time (Amershi et al., 2019; T. Li et al., 2022).
Context | Content |
---|---|
Initially | Clarify what it does; what are the limitations. |
During interaction | Offer timely help, show only what matters, while respecting norms and avoiding bias |
When wrong | Let users retry fast and make corrections; empower users to dismiss easily; explain why the system acted; be precise and in-scope |
Over time | Track changes and adapt from use; announce changes and update with care (so not to break the user’s work); invite feedback; show outcome of actions clearly; provide global settings |
The design wave preceding UX for AI was corporations understanding how crucial design is to their business. In the 2010s, business consultancies began to recognize the importance of design, advising their clients to put design at the center of their strategy and bringing user experience design to the core of their business operations (McKeough, 2018). There are a number of user interface design patterns that have proven successful across a range of social media apps. Such user interface (UX/UI) patterns have been copied from one app to another, to the extent that the largest apps share a similar look and feature set and users are accustomed to the same user experience. Common UX/UI parts include features such as the Feed, Stories, and Avatars, among many others. This phenomenon (or trend) has led some designers, such as (Fletcher, 2023) and (Joe Blair, 2024), to worry about UIs becoming average: more and more similar to the lowest common denominator. Yet, by using common UI parts from social media, users may have an easier time accepting the innovative parts, as they just look like new features inside the old interface. As new generations become increasingly used to talking to computers in natural language, the older interface patterns may gradually fade away.
Feature | Examples | Notes |
---|---|---|
Feed | Facebook, Instagram, Twitter, TikTok, etc. | The original algorithmic discovery hub; increasingly run by ever-more-powerful AI to surface personalized content - yet younger generations may prefer the privacy of stories. |
Post | Facebook, Instagram, Twitter, TikTok, etc., even Apple's App Store | Persistent content mainly for long-term sharing; the original content type |
Stories | IG, FB, WhatsApp, SnapChat, TikTok, etc. | Ephemeral content driven by FOMO (fear of missing out) for casual behind-the-scenes sharing |
Comment | YouTube, Threads, Reddit, Medium, etc. | Threaded conversations fuel community engagement and discussion |
Reactions | Facebook, Instagram, Slack, Threads, but even LinkedIn and Github | The feature has evolved from a simple like button to more expressive emotions. |
There are also more philosophical approaches to interface studies. (David Hoang, 2022), the head of product design at Webflow, an AI-enabled website development platform, suggests taking cues from art studies to isolate the core problem: "An art study is any action done with the intention of learning about the subject you want to draw". As a former art student, Hoang looks at an interface as "a piece of design is an artwork with function". Indeed, art can be a way to see new paths forward, practicing "fictioning" to deal with problematic legacies ("Review of the 2023 Helsinki Biennial," 2023). (Jarovsky, 2022) lists the numerous ways AIs can mislead people - what she calls AI UX dark patterns - which the U.S. FTC Act and the EU AI Act are attempting to manage.
Usability sets the baseline - but AI interfaces are capable of much more. The user experience (UX) of AI is a topic under active development by all the largest online platforms. AI is usually a computer model that spits out a number between 0 and 1, a probability score or a prediction; UX is what we do with this number. Design starts with understanding human psychology. (Donghee Shin, 2020) looks at user experience through the lens of the usability of algorithms; focusing on users' cognitive processes allows one to appreciate how product features are received by the brain and transformed into experiences through interaction with the algorithm. The general public is familiar with the most famous AI helpers - ChatGPT, Apple's Siri, Amazon's Alexa, Microsoft's Cortana, Google's Assistant, Alibaba's Genie, Xiaomi's Xiao Ai, and many others - used for general, everyday tasks such as asking factual questions, controlling home devices, playing media, placing orders, and navigating the smart city. Yet, as AI permeates all types of devices, (Bailey, 2023) believes people will increasingly use AI capabilities through UIs that are specific to a task rather than generalist interfaces like ChatGPT. Nonetheless, a generalist AI interface may still control those services if asked to do so, so it may be an 'and' rather than an 'either/or' when it comes to AI usage.
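As a small sketch of that point - turning a raw score into interface behaviour - the thresholds below are illustrative assumptions, not guidelines from the cited literature:

```python
# Map a model's probability score to UX behaviour (illustrative thresholds).
def present_prediction(score: float) -> str:
    if score >= 0.9:
        return "Act automatically and show an undo option."
    if score >= 0.6:
        return "Suggest the action and ask for one-tap confirmation."
    if score >= 0.3:
        return "Show the option ranked lower, with a 'why am I seeing this?' link."
    return "Stay silent; log the prediction for later model improvement."

for p in (0.95, 0.7, 0.4, 0.1):
    print(p, "->", present_prediction(p))
```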
The application of user experience (UX) tenets to AI.
UX |
---|
Useful |
Valuable |
Usable |
Accessible |
Findable |
Desirable |
Credible |
1 | 2 | 3 |
---|---|---|
Reduce the time to task | Make the task easier | Personalize the experience for an individual |
Microsoft co-founder Bill Gates predicted in 1982 "personal agents that help us get a variety of tasks" (Bill Gates, 1982), and it was Microsoft that introduced the first widely available personal assistant, Clippy, inside Microsoft Word in 1996. Clippy was among the first assistants to reach mainstream adoption, helping users not yet accustomed to working on a computer to get their bearings (Tash Keuneman, 2022). Nonetheless, it was in many ways useless and intrusive, suggesting there was still little knowledge about UX and human-centered design. Gates never wavered, though, and is quoted in 2004 saying "If you invent a breakthrough in artificial intelligence, so machines can learn, that is worth 10 Microsofts" (Lohr, 2004). Gates updated his ideas in 2023, focusing on the idea of AI agents (Gates, 2023).
With the advent of ChatGPT, the story of Clippy has new relevance as part of the history of AI assistants. (Benjamin Cassidy, 2022) and (Abigail Cain, 2017) illustrate beautifully the story of Clippy, and (Tash Keuneman, 2022) asks poignantly: "We love to hate Clippy — but what if Clippy was right?". That is to say, might we try again? And Microsoft has been trying again, being one of the leading investors in the AI models that eventually make a better UX possible. Just one example is a project from Microsoft Research that generates life-like speaking faces from a single image and voice clip, which could empower true-to-life avatars (S. Xu et al., 2024). Purely on the economic side, however, processing human voice and images is several times more expensive than processing text messages (V. Mittal, 2025). More required processing power also means these new interfaces are likely less sustainable.
AI Performance Under High-Stakes Situations
Today, AI-based systems are already being used in high-stakes situations (medicine, self-driving cars). Attempts to implement AI in medicine, where the stakes are perhaps the highest and the ethical requirements correspondingly strict, have been made since the early days of computing, as the potential to improve health outcomes is so high. Since CADUCEUS, the first automated medical decision-making system, in the 1970s (in Kanza et al., 2021), medical AI has grown to provide diagnostic systems for symptoms and AI assistants in medical imaging. Complicated radiology reports can be explained to patients using AI chatbots (Jeblick et al., 2022). The explanations are useful not only for patients but for doctors (and other medical professionals) as well. (Calisto et al., 2022) focuses on AI-human interactions in medical workflows and underscores the importance of output explainability: medical professionals who were given AI results with an explanation trusted the results more. (Lee et al., 2023) imagines an AI revolution in medicine using GPT models, providing improved tools for decreasing the time and money spent on administrative paperwork while providing a support system for analyzing medical data. For administrative tasks such as responding to patients' questions, medical AI has already reached - or even exceeded - expert-level question-answering ability (Singhal et al., 2023). In an online text-based setting, patients rated answers from the AI as better, and more empathetic, than answers from human doctors (Ayers et al., 2023). If anything, the adoption of AI in medicine has been too cautious. (Daisy Wolf & Pande Vijay, 2023) criticizes US healthcare's slow adoption of technology and predicts AI will help healthcare leapfrog into a new era of productivity by acting more like a human assistant.
Communication with the patient is perhaps a low-hanging fruit, as numerous AI-driven symptom checkers and AI-based FAQ-answering chatbots are already commercially available, such as ("Health. Powered by Ada." n.d.) and (Buoy Health, n.d.), which offer AI-based platforms to survey, track, and understand one's symptoms over time, while providing doctors with patient data that can be used to generate preliminary diagnoses, freeing up clinical resources. The Lark digital health coaching platform delivers support for diabetes, hypertension, and weight management by integrating smart watches and smart scales to provide evidence-based behavior change (Home - Lark Health, n.d.). The VP of user experience at Senseley discusses the Molly AI assistant, which can chat, answer questions, and measure blood pressure; the main challenge is the healthcare system itself - while a small pilot project might work well, bureaucracy keeps the technology from being widely adopted (Women in AI, 2018). While discussions of these kinds of tools and proposals for AI-based health monitoring systems have existed for over a decade, recent advances in AI reliability have made it feasible to deploy them at scale. While ChatGPT is not built to be a medical tool, the interface is so easily available that it is very common for patients to decode lab results using ChatGPT or to ask for a diagnosis when doctor time is scarce (Eliza Strickland, 2023).
Example of ChatGPT explaining medical terminology in a blood report.
Today's AI is already a technology that can augment human skills or replace skills lost due to an accident. For instance, (Dot Go, 2023) makes the camera the interaction device for people with vision impairment. The (Nathan Benaich & Ian Hogarth, 2022) report notes increasing AI deployment in critical infrastructure and biology, intensifying geopolitics around AI, and growth of the safety research community.
Human-Computer Interactions Without a “Computer”
AI deeply affects Human-Computer Interactions even if the computer is invisible. The field of Human Factors and Ergonomics (HFE) emphasizes designing user experiences (UX) that cater to human needs (The International Ergonomics Association, 2019). Designers think through every interaction of the user with a system and consider a set of metrics at each point of interaction including the user’s context of use and emotional needs.
Software designers, unlike industrial designers, can’t physically alter the ergonomics of a device, which should be optimized for human well-being to begin with and form a cohesive experience together with the software. However, software designers can significantly reduce mental strain by crafting easy-to-use software and user-friendly user journeys. Software interaction design goes beyond the form-factor and accounts for human needs by using responsive design on the screen, aural feedback cues in sound design, and even more crucially, by showing the relevant content at the right time, making a profound difference to the experience, keeping the user engaged and returning for more. In the words of (Babich, 2019), “[T]he moment of interaction is just a part of the journey that a user goes through when they interact with a product. User experience design accounts for all user-facing aspects of a product or system”.
Drawing a parallel from narrative-studies terminology, we can view user interaction as a heroic journey: the user navigates through the interface to achieve their goals, reaching a success state - or facing failure. Storytelling has its part in interface design; however, designing for transparency is just as important when we're dealing with the user's finances and sustainability data, which need to be communicated clearly and accurately to build long-term trust in the service. For a sustainable investment service, getting to a state of success - or failure - may take years or longer. Given such long timeframes, how can the app support the user's emotional and practical needs throughout the journey?
(Tubik Studio, 2018) argues that affordance measures how clearly an interface invites action; affordance is rooted in human visual perception, yet shaped by our knowledge of the world around us. A famous example is the door handle: by way of acculturation, most of us would immediately know how to use it - but would that be the case for someone who saw a door handle for the first time? A similar situation faces the people born today. Think of all the technologies they have not seen before - what will be the interface they feel most comfortable with?
For the vast majority of this study’s target audience (college students), social media can be assumed as the primary interface through which they experience daily life. The widespread availability of mobile devices, cheap internet access, and AI-based optimizations for user retention, implemented by social media companies, means this is the baseline for young adult users’ expectations (as of writing in 2020).
(Don Shin et al., 2020) proposes the model (fig. 10) of Algorithmic Experience (AX) “investigating the nature and processes through which users perceive and actualize the potential for algorithmic affordance” highlighting how interaction design is increasingly becoming dependent on AI. The user interface might remain the same in terms of architecture, but the content is improved, based on personalization and understanding the user at a deeper level.
In 2020 (when I proposed this thesis topic), Google had recently launched an improved natural language engine to better understand search queries (“Understanding Searches Better Than Ever Before,” 2019), which was considered the next step towards understanding human language semantics. The trend was clear, and different types of algorithms were already involved in many types of interaction design, however, we were in the early stages of this technology (and still are early in 2024). Today’s ChatGPT, Claude and Gemini have no problem understanding human semantics - yet are they intelligent?
Intelligence may be beside the point as long as AI becomes very good at reasoning. AI is a reasoning engine (Bubeck et al., 2023; Shipper, 2023; see Bailey, 2023 for a summary). That general observation applies to voice recognition, voice generation, and natural language parsing, among others. Large consumer companies like McDonald's are in the process of replacing human staff with AI assistants in the drive-through, which can provide a more personal service than human clerks, for whom it would be impossible to remember the information of thousands of clients. In (Barrett, 2019), in the words of Steve Easterbrook, a former CEO of McDonald's: "How do you transition from mass marketing to mass personalization?"
Do AI Agents Need Anthropomorphism?
(Yuan et al., 2022) surveyed mainland Chinese consumers (n = 210, no age range given), finding that users with high social anxiety lean on hedonic and emotional cues, especially a friendly anthropomorphic interface and a sense of affinity (when those cues are strong, their intention to adopt the AI assistant is as high as, and sometimes higher than, that of users with low social anxiety); in contrast, users with low social anxiety are influenced mainly by utilitarian cues such as accuracy and speed, and these functional advantages carry less weight for the high-social-anxiety group. A crude but design-relevant conclusion would be: people with high social anxiety like cute things.
(X. Xu & Sar, 2018) surveyed (n = 522) how people perceive the minds of machines versus humans along agency (ability to act) and experience (ability to feel), finding that, among machines, those with human-like appearance were seen as having the greatest agency and experience; greater familiarity with how technology works correlated with rating machines as having higher agency but lower experience.
What are the next features that could improve the next-generation UX/UI of AI-based assistants? Should AIs look anthropomorphic or fade into the background? It's an open question (depending on the use case and the psychology of the user); perhaps we can expect a mix of both, depending on the context of use and the goals of the particular AI. (Stone Skipper, 2022) sketches a vision of "[AI] blend into our lives in a form of apps and services" deeply ingrained into daily human activity. (Aschenbrenner, 2024) predicts "drop-in virtual coworkers": AI agents able to use computer systems like a human, seamlessly replacing human employees.
Anthropomorphic AI User Interfaces | Non-Anthropomorphic AI User Interfaces |
---|---|
AI wife [@MyWifeDead2023] | Generative AI has enabled developers to create AI tools for several industries, including AI-driven website builders [@constandseHowAIdrivenWebsite2018] |
[@sarahperezCharacterAIA16zbacked2023] character AI | AI tools for web designers [@patrizia-slongoAIpoweredToolsWeb2020] |
Mourning for the ‘dead’ AI [@phoebearslanagic-wakefieldReplikaUsersMourn] | Microsoft Designer allows generating UIs just based on a text prompt [@microsoftMicrosoftDesignerStunning2023] |
AI for therapy [@broderickPeopleAreUsing2023] | personalized bed-time stories for kids generated by AI [@bedtimestory.aiAIPoweredStory2023] |
Mental health uses: AI for bullying [@sungParentsWorryTeens2023] |
Roleplay for Financial Robo-Advisors
Using AI and computerized models for financial prediction is not new. (Malliaris & Salchenberger, 1996) applied neural networks to financial forecasting nearly three decades ago, using training data on past volatilities and factors of the options market to predict the future (next-day) implied volatility (i.e. volatility not observed directly in the market but back-calculated from option prices) of the S&P 100 index (which tracks the largest U.S. companies), demonstrating the early potential of AI in financial prediction. Such tools were initially of academic interest or accessible only to financial professionals. Later on, fintech (financial technology) startups began bringing computerized predictive power into user interfaces available to retail investors.
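A rough sketch in that spirit is shown below: a small neural network maps volatility-related features to next-day implied volatility. The data is synthetic, and the features and architecture are assumptions, not those of Malliaris & Salchenberger (1996).

```python
# Small MLP predicting next-day implied volatility from synthetic features.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(0)
n = 500
past_iv = rng.uniform(0.10, 0.50, n)                  # yesterday's implied volatility
realized_vol = past_iv + rng.normal(0, 0.03, n)       # recent realized volatility
moneyness = rng.uniform(0.9, 1.1, n)                  # strike / spot ratio
X = np.column_stack([past_iv, realized_vol, moneyness])
y = 0.8 * past_iv + 0.2 * realized_vol + rng.normal(0, 0.02, n)  # synthetic target

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = MLPRegressor(hidden_layer_sizes=(8,), max_iter=2000, random_state=0)
model.fit(X_train, y_train)
print("R^2 on held-out data:", round(model.score(X_test, y_test), 3))
```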
Robo-advisory is a fintech term that was in fashion largely before the arrival of AI assistants and has thus been superseded by newer technologies. Ideally, robo-advisors can be more dynamic than humans and respond to changes quickly and cheaply, while human financial advisors are expensive and not affordable for most consumers. (Capponi et al., 2019) argues that dynamism in understanding the client's financial situation - which AI excels at - is a key component of providing the best advice.
“The client has a risk profile that varies with time and to which the robo-advisor’s investment performance criterion dynamically adapts”. The key improvement of personalized financial advice is understanding the user’s dynamic risk profile. - (Capponi et al., 2019)
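A minimal sketch of that dynamic adaptation follows; the mapping from risk tolerance to allocation is a simplifying assumption of mine, not the model of Capponi et al. (2019).

```python
# Re-weight a two-asset portfolio as the client's risk tolerance changes over time.
def allocate(risk_tolerance: float) -> dict:
    """risk_tolerance in [0, 1]: 0 = very cautious, 1 = very aggressive."""
    risk_tolerance = min(max(risk_tolerance, 0.0), 1.0)
    equities = 0.2 + 0.7 * risk_tolerance   # between 20% and 90% in stocks
    return {"equities": round(equities, 2), "bonds": round(1.0 - equities, 2)}

# The profile is re-assessed periodically, e.g. after market moves or life events.
for month, tolerance in [("Jan", 0.8), ("Mar", 0.5), ("Jun", 0.3)]:
    print(month, allocate(tolerance))
```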
In the early days of consumer-direct robo-advisory, Germany and the United Kingdom led Europe in robo-advisory usage (Cowan, 2018). While Germany had 30+ robo-advisors on the market in 2019, with a total of 3.9 billion EUR under robotic management, this was far less than individual apps like Betterment managed in the US (Bankinghub, 2019). Already in 2017, several of the early robo-advisor apps shut down in the UK; ETFmatic gained the largest number of downloads by 2017, focusing exclusively on exchange-traded funds (ETFs) tracking stock-market indexes automatically, with much less sophistication than its US counterparts - the app was bought by a bank in 2021 and closed down in 2023 (AltFi, 2017, 2021; "ETFmatic - Account Funding of EURO Accounts Ceases," 2023; Silva, 2023).
Newer literature notes that robo-advisor research is scattered across disciplines (Zhu et al., 2024). (Brown, 2021) outlines how modern financial chatbots have evolved beyond simple Q&A to offer conversational, 24/7 support across banking, investment, insurance, and more, reducing support costs, improving responsiveness, and freeing human agents for higher-value tasks. In India, research has been conducted on how AI advisors could assist with investors' erratic behavior in volatile stock market situations, albeit without much success; India is a large financial market with more than 2000 fintechs (financial technology startups) founded since 2015 (Bhatia et al., 2020; Migozzi et al., 2023). (Barbara Friedberg, 2021) and (Slack, 2021) compare robo-advisors and show that before GenAI, financial chatbots were developed manually, using a painstaking process that was slow and error-prone. Older financial robo-advisors built by fintech companies aiming to provide personalized investment suggestions, such as Betterment and Wealthfront, were forced to upgrade their technology to keep up. Robo-advisors compete with community investing such as hedge funds, mutual funds, copy-trading, and DAOs with treasuries - or can act as entry points for these modes of investment. However, robo-advisors typically lack the kind of social proof that community-based investment vehicles have, where the user may see the actions taken by other investors.
There is research on anthropomorphism, i.e. the human-like attributes of robo-advisors such as the aforementioned conversational chatbots, and on whether anthropomorphism can affect adoption and risk preferences among customers. Several studies show that anthropomorphic robo-advisors, with stronger visual human-likeness, increase customer trust and reduce algorithm aversion (Deng & Chau, 2021; Ganbold et al., 2022; Hildebrand & Bergner, 2021; Plotkina et al., 2024). However, it is not clear whether this effect is tied to the avatar. The question - does the user trust a robot or a human, or some combination - has been researched in other literature that does not rely on images. (David et al., 2021) looks at whether explainable AI could help adoption of financial AI assistants in an experimental study in which players (n = 210) of an online investment game had to choose between (a) human advice, (b) AI advice without explanation, or (c) AI advice paired with an explanation; the results showed no evidence of algorithm aversion (players did not prefer human advice over AI advice).
The most comprehensive meta-review of research on how AI chatbots could mimic humans comes from (Feine et al., 2019), providing an entire taxonomy of social cues for conversational agents, including verbal, visual, and auditory cues, as well as other indicators humans pay attention to, such as age, yawning, laughing, posture, clothing, etc. Because this is such a useful resource, I have adapted the findings in the table below; a short sketch after the table illustrates how a few of the verbal cues might be composed into an agent's instructions.
Category | Sub-Category | Cue | Explanation |
---|---|---|---|
Verbal | Content | Apology | Agent expresses regret for an error |
| | Asking for permission | Requests user approval before acting |
| | Greeting and farewell | Opens or ends the conversation politely |
| | Joke | Humorous remark to entertain |
| | Name | Addresses the user by name |
| | Opinion conformity | Shows agreement with the user's view |
| | Praise | Compliments the user |
| | Referring to past | Mentions shared history or earlier turns |
| | Self-disclosure | Reveals personal info about the agent |
| | Small talk | Casual, topic-light chatter |
| | Thanking | Expresses gratitude |
Verbal | Style | Abbreviations | Uses shortened words (e.g. "BTW") |
| | Dialect | Adopts regional or cultural language variety |
| | Formality | Chooses formal vs casual register |
| | Lexical alignment | Mirrors the user's word choices |
| | Lexical diversity | Varies vocabulary richness |
| | Politeness | Adds courteous markers ("please", "could you") |
| | Sentence complexity | Varies length and structure of sentences |
| | Strength of language | Uses mild vs intense wording |
Visual | Kinesics | Arm and hand gesture | Animated limb movements |
| | Eye movement | Gaze shifts or blinking |
| | Facial expression | Smiles, frowns, eyebrow raises, etc. |
| | Head movement | Nods, shakes, tilts |
| | Posture shift | Whole-body stance changes |
Visual | Proxemics | Background | Visual environment behind the agent |
| | Conversational distance | Apparent closeness to the user |
Visual | Appearance | 2D / 3D agent visualization | Flat icon vs full three-dimensional model |
| | Age | Apparent age of the avatar |
| | Attractiveness | Overall aesthetic appeal |
| | Clothing | Outfit style and details |
| | Color of agent | Dominant color palette |
| | Degree of human likeness | Cartoon-like to photo-real scale |
| | Facial feature | Eye shape, mouth style, etc. |
| | Gender | Male, female, neutral presentation |
| | Name tag | On-screen label with agent's name |
| | Photorealism | Realistic rendering quality |
Visual | Text Styling | Emoticons | 😊 😂 👍 style graphics |
| | Typefaces | Font choice and typography tweaks |
Auditory | Voice Qualities | Gender of voice | Male, female, neutral timbre |
| | Pitch range | High- vs low-pitched speech |
| | Voice tempo | Speaking speed |
| | Volume | Loudness level |
Auditory | Vocalizations | Grunts and moans | Non-word hesitation sounds |
| | Laughing | Laughter audio |
| | Vocal segregates | "uh-huh", "mm-hm", etc. |
| | Yawn | Audible yawning |
Invisible | Chronemics | First turn | Which party speaks first |
| | Response time | Delay before replying |
Invisible | Haptics | Tactile touch | Device vibration or touch feedback |
| | Temperature | Warmth or coolness cues |
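As the hedged sketch referenced above, a few of the verbal cues from this taxonomy could be composed into a conversational agent's instructions; which cues to enable and the exact wording are my own design choices, not part of Feine et al.'s taxonomy.

```python
# Compose an agent system prompt from selected verbal social cues.
def build_system_prompt(user_name: str, cues: set) -> str:
    parts = ["You are a financial assistant."]
    if "greeting" in cues:
        parts.append("Open the conversation with a short, friendly greeting.")
    if "name" in cues:
        parts.append(f"Address the user as {user_name} occasionally, not in every turn.")
    if "politeness" in cues:
        parts.append("Use courteous markers such as 'please' and 'thank you'.")
    if "small_talk" in cues:
        parts.append("Allow brief small talk before returning to the task.")
    if "apology" in cues:
        parts.append("If you make an error, acknowledge it and apologise briefly.")
    return " ".join(parts)

print(build_system_prompt("Mari", {"greeting", "name", "politeness", "apology"}))
```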
Literature on fintech UX shares some basic tenets with AI UX on building user confidence. (Why Design Is Key to Building Trust in FinTech Star, 2021) lists essential tactics for building trust in fintech: (1) consistency in UI patterns, (2) transparent feedback, (3) clear error handling, and (4) educating users about data usage. (Sean McGowan, 2018) offers four guidelines for fintech apps: (1) understand domain complexities, (2) embrace friction where it is necessary for safety, (3) provide continuous and clear feedback, and (4) simplify complex financial information - this can build user confidence and reduce errors. (Cordeiro & Weevers, 2016) emphasizes designing for the "unhappy path": negative experiences shape users' perception deeply, as bad memories carve strongly into the user experience; products that handle failures and edge cases gracefully stand out and maintain satisfaction. (Robin Dhanwani, 2021) approaches UX problems from an organizational perspective, noting that in large organizations UX issues can stem from a lack of alignment between teams, and proposes Design Jams - cross-functional workshops that help teams align on user needs, generate rapid prototypes, and iteratively refine interfaces - as a potential solution to improve cross-team collaboration and, in theory, adherence to the guidelines noted above.
References
Abigail Cain. (2017). The Life and Death of Microsoft Clippy, the Paper Clip the World Loved to Hate. In Artsy. https://www.artsy.net/article/artsy-editorial-life-death-microsoft-clippy-paper-clip-loved-hate.
AI Frontiers. (2018). Ilya Sutskever at AI Frontiers 2018: Recent Advances in Deep Learning and AI from OpenAI.
Alammar, J. (2018). The Illustrated Transformer. https://jalammar.github.io/illustrated-transformer/.
Alex Tamkin & Deep Ganguli. (2021). How Large Language Models Will Transform Science, Society, and AI. https://hai.stanford.edu/news/how-large-language-models-will-transform-science-society-and-ai.
Allport, G. W. (1979). The nature of prejudice (Unabridged, 25th anniversary ed). Addison-Wesley Pub. Co.
AltFi. (2017). ETFmatic app downloaded 100,000 times. In AltFi. https://www.altfi.com/article/3433_etfmatic_app_downloaded_100000_times.
AltFi. (2021). Belgium’s Aion Bank has acquired London robo-advisor ETFmatic. In AltFi. https://www.altfi.com/article/7686_belgiums-aion-bank-has-acquired-london-robo-advisor-etfmatic.
Altman, S. (2024). The Intelligence Age. https://ia.samaltman.com/.
Amershi, S., Weld, D., Vorvoreanu, M., Fourney, A., Nushi, B., Collisson, P., Suh, J., Iqbal, S., Bennett, P., Inkpen, K., Teevan, J., Kikin-Gil, R. & Horvitz, E. (2019, May). Guidelines for human-AI interaction. CHI 2019.
Anthropic’s Responsible Scaling Policy. (2023). https://www.anthropic.com/news/anthropics-responsible-scaling-policy.
Anton Korinek. (2023). Scenario Planning for an AGI Future. In IMF. https://www.imf.org/en/Publications/fandd/issues/2023/12/Scenario-Planning-for-an-AGI-future-Anton-korinek.
Aschenbrenner, L. (2024). SITUATIONAL AWARENESS: The Decade Ahead.
Ayers, J. W., Poliak, A., Dredze, M., Leas, E. C., Zhu, Z., Kelley, J. B., Faix, D. J., Goodman, A. M., Longhurst, C. A., Hogarth, M. & Smith, D. M. (2023). Comparing Physician and Artificial Intelligence Chatbot Responses to Patient Questions Posted to a Public Social Media Forum. JAMA Internal Medicine, 183(6), 589. https://doi.org/10.1001/jamainternmed.2023.1838
Babich, N. (2019). Interaction Design vs UX: What’s the Difference? In Adobe XD Ideas.
Bai, Y., Kadavath, S., Kundu, S., Askell, A., Kernion, J., Jones, A., Chen, A., Goldie, A., Mirhoseini, A., McKinnon, C., Chen, C., Olsson, C., Olah, C., Hernandez, D., Drain, D., Ganguli, D., Li, D., Tran-Johnson, E., Perez, E., … Kaplan, J. (2022). Constitutional AI: Harmlessness from AI Feedback. https://doi.org/10.48550/ARXIV.2212.08073
Bailey, J. (2023). AI in Education. In Education Next.
Bankinghub. (2019). Robo advisor – new standards in asset management. In BankingHub.
Barbara Friedberg. (2021). M1 Finance vs Betterment Robo Advisor Comparison-by Investment Expert.
Bardhan, A. (2022). Men Are Creating AI Girlfriends and Then Verbally Abusing Them. In Futurism.
Barrett, B. (2019). McDonald’s Acquires Machine-Learning Startup Dynamic Yield for $300 Million. Wired.
Bassett, C. (2019). The computational therapeutic: Exploring Weizenbaum’s ELIZA as a history of the present. AI & SOCIETY, 34(4), 803–812. https://doi.org/10.1007/s00146-018-0825-9
Benjamin Cassidy. (2022). The Twisted Life of Clippy. Seattle Met.
Bhatia, A., Chandani, A. & Chhateja, J. (2020). Robo advisory and its potential in addressing the behavioral biases of investors — A qualitative study in Indian context. Journal of Behavioral and Experimental Finance, 25, 100281. https://doi.org/10.1016/j.jbef.2020.100281
Bill Gates. (1982). Bill Gates on the Next 40 Years in Technology.
Bommasani, R., Hudson, D. A., Adeli, E., Altman, R., Arora, S., von Arx, S., Bernstein, M. S., Bohg, J., Bosselut, A., Brunskill, E., Brynjolfsson, E., Buch, S., Card, D., Castellon, R., Chatterji, N., Chen, A., Creel, K., Davis, J. Q., Demszky, D., … Liang, P. (2021). On the Opportunities and Risks of Foundation Models. https://doi.org/10.48550/ARXIV.2108.07258
Bonet-Jover, A., Sepúlveda-Torres, R., Saquete, E. & Martínez-Barco, P. (2023). A semi-automatic annotation methodology that combines Summarization and Human-In-The-Loop to create disinformation detection resources. Knowledge-Based Systems, 275, 110723. https://doi.org/10.1016/j.knosys.2023.110723
Bowman, S. R. (2023). Eight Things to Know about Large Language Models. https://doi.org/10.48550/ARXIV.2304.00612
Brent A. Anders. (Fall 2022 - Winter 2023). Why ChatGPT is such a big deal for education. C2C Digital Magazine, Vol. 1(18).
Brown, A. (2021). How Financial Chatbots Can Benefit Your Business. In Medium.
Bubeck, S., Chandrasekaran, V., Eldan, R., Gehrke, J., Horvitz, E., Kamar, E., Lee, P., Lee, Y. T., Li, Y., Lundberg, S., Nori, H., Palangi, H., Ribeiro, M. T. & Zhang, Y. (2023). Sparks of Artificial General Intelligence: Early experiments with GPT-4. https://doi.org/10.48550/ARXIV.2303.12712
Buoy Health: Check Symptoms & Find the Right Care. (n.d.). https://www.buoyhealth.com.
Cabitza, F., Campagner, A., Malgieri, G., Natali, C., Schneeberger, D., Stoeger, K. & Holzinger, A. (2023). Quod erat demonstrandum? - Towards a typology of the concept of explanation for the design of explainable AI. Expert Systems with Applications, 213, 118888. https://doi.org/10.1016/j.eswa.2022.118888
Cahan, P. & Treutlein, B. (2023). A conversation with ChatGPT on the role of computational systems biology in stem cell research. Stem Cell Reports, 18(1), 1–2. https://doi.org/10.1016/j.stemcr.2022.12.009
Calisto, F. M., Santiago, C., Nunes, N. & Nascimento, J. C. (2021). Introduction of human-centric AI assistant to aid radiologists for multimodal breast image classification. International Journal of Human-Computer Studies, 150, 102607. https://doi.org/10.1016/j.ijhcs.2021.102607
Calisto, F. M., Santiago, C., Nunes, N. & Nascimento, J. C. (2022). BreastScreening-AI: Evaluating medical intelligent agents for human-AI interactions. Artificial Intelligence in Medicine, 127, 102285. https://doi.org/10.1016/j.artmed.2022.102285
Calma, J. (2025). AI could consume more power than Bitcoin by the end of 2025. In The Verge. https://www.theverge.com/climate-change/676528/ai-data-center-energy-forecast-bitcoin-mining.
CapInstitute. (2023). Getting Real about Artificial Intelligence - Episode 4.
Capponi, A., Ólafsson, S. & Zariphopoulou, T. (2019). Personalized Robo-Advising : An Interactive Investment Process.
Casper Kessels. (2022). Guidelines for Designing an In-Car Voice Assistant. In The Turn Signal - a Blog About automotive UX Design. https://theturnsignalblog.com.
CatGPT. (2025). Why AI is more important than the Internet (Interview with Google Co-Founder, Sergey Brin).
CBS Mornings. (2023). Full interview: "Godfather of artificial intelligence" talks impact and potential of AI.
CBS Mornings. (2025). AI pioneer Geoffrey Hinton says world is not prepared for what’s coming.
Celino, I. & Re Calegari, G. (2020). Submitting surveys via a conversational interface: An evaluation of user acceptance and approach effectiveness. International Journal of Human-Computer Studies, 139, 102410. https://doi.org/10.1016/j.ijhcs.2020.102410
Cheng, X., Zhang, X., Yang, B. & Fu, Y. (2022). An investigation on trust in AI-enabled collaboration: Application of AI-Driven chatbot in accommodation-based sharing economy. Electronic Commerce Research and Applications, 54, 101164. https://doi.org/10.1016/j.elerap.2022.101164
Chiang, W.-L., Zheng, L., Sheng, Y., Angelopoulos, A. N., Li, T., Li, D., Zhang, H., Zhu, B., Jordan, M., Gonzalez, J. E. & Stoica, I. (2024). Chatbot arena: An open platform for evaluating LLMs by human preference. https://arxiv.org/abs/2403.04132
Christiano, P. (2021). My research methodology. In Medium. https://ai-alignment.com/my-research-methodology-b94f2751cb2c.
Christiano, P. (2023). My views on “doom.” In Medium. https://ai-alignment.com/my-views-on-doom-4788b1cd0c72.
Christiano, P., Leike, J., Brown, T. B., Martic, M., Legg, S. & Amodei, D. (2017). Deep reinforcement learning from human preferences. https://doi.org/10.48550/ARXIV.1706.03741
Combi, C., Amico, B., Bellazzi, R., Holzinger, A., Moore, J. H., Zitnik, M. & Holmes, J. H. (2022). A manifesto on explainability for artificial intelligence in medicine. Artificial Intelligence in Medicine, 133, 102423. https://doi.org/10.1016/j.artmed.2022.102423
Copet, J., Kreuk, F., Gat, I., Remez, T., Kant, D., Synnaeve, G., Adi, Y. & Défossez, A. (2023). Simple and Controllable Music Generation. https://doi.org/10.48550/ARXIV.2306.05284
Cordeiro, T. & Weevers, I. (2016). Design is No Longer an Option - User Experience (UX) in FinTech. In S. Chishti & J. Barberis (Eds.), The FinTech Book (pp. 34–37). John Wiley & Sons, Ltd. https://doi.org/10.1002/9781119218906.ch9
Costa, A. & Silva, F. (2022). Interaction Design for AI Systems: An oriented state-of-the-art. 2022 International Congress on Human-Computer Interaction, Optimization and Robotic Applications (HORA), 1–7. https://doi.org/10.1109/HORA55278.2022.9800084
Cowan, G. (2018). Robo Advisers Start to Take Hold in Europe. Wall Street Journal.
Crain, M. & Nadler, A. (2019). Political Manipulation and Internet Advertising Infrastructure. Journal of Information Policy, 9, 370–410. https://doi.org/10.5325/jinfopoli.9.2019.0370
Crompton, L. (2021). The decision-point-dilemma: Yet another problem of responsibility in human-AI interaction. Journal of Responsible Technology, 7–8, 100013. https://doi.org/10.1016/j.jrt.2021.100013
Daisy Wolf & Pande Vijay. (2023). Where Will AI Have the Biggest Impact? Healthcare. In Andreessen Horowitz. https://a16z.com/2023/08/02/where-will-ai-have-the-biggest-impact-healthcare/.
Dang, V. T. (2024). Inside Apple’s AI: Understanding the Architecture and Innovations of AFM Models. In Medium.
David, D. B., Resheff, Y. S. & Tron, T. (2021). Explainable AI and Adoption of Financial Algorithmic Advisors: An Experimental Study (No. arXiv:2101.02555). arXiv. https://arxiv.org/abs/2101.02555
David Hoang. (2022). Creating interface studies. https://www.proofofconcept.pub/p/creating-interface-studies.
David Johnston. (2023). Smart Agent Protocol - Community Paper Version 0.2. In Google Docs. https://docs.google.com/document/d/1cutU1SerC3V7B8epopRtZUrmy34bf38W–w4oOyRs2A/edit?usp=sharing.
Dávid Pásztor. (2018). AI UX: 7 Principles of Designing Good AI Products. https://uxstudioteam.com/ux-blog/ai-ux/.
De, D., El Jamal, M., Aydemir, E. & Khera, A. (2025). Social Media Algorithms and Teen Addiction: Neurophysiological Impact and Ethical Considerations. Cureus. https://doi.org/10.7759/cureus.77145
Deng, B. & Chau, M. (2021). Anthropomorphized financial robo-advisors and investment advice-taking behavior. Proceedings of the 27th Americas Conference on Information Systems (AMCIS 2021).
Design Portland. (2018). Humans Have the Final Say — Stories. In Design Portland. https://designportland.org/.
Dew, M. A., Penkower, L. & Bromet, E. J. (1991). Effects of Unemployment on Mental Health in the Contemporary Family. Behavior Modification, 15(4), 501–544. https://doi.org/10.1177/01454455910154004
Di Pizio, A. (2023). Sam Altman Says AI Will Make Businesses 30 Times More Productive: 2 Stocks Investors Will Want to Buy. In NASDAQ The Motley Fool. https://www.fool.com/investing/2023/06/23/sam-altman-ai-30-times-productive-2-stocks-buy/.
Dot Go. (2023). Dot Go. https://dot-go.app/.
Dwarkesh Patel. (2024). Mark Zuckerberg - Llama 3, $10B Models, Caesar Augustus, & 1 GW Datacenters.
Eliza Strickland. (2023). Dr. ChatGPT Will Interface With You Now. In IEEE Spectrum.
Epoch AI. (2024). Data on Notable AI Models.
Erik Brynjolfsson. (2022). The Turing Trap: The Promise & Peril of Human-Like Artificial Intelligence. In Stanford Digital Economy Lab. https://digitaleconomy.stanford.edu/news/the-turing-trap-the-promise-peril-of-human-like-artificial-intelligence/.
ETFmatic - Account funding of EURO accounts ceases. (2023). In r/eupersonalfinance.
Ethan Mollick [@emollick]. (2023). I think most interesting/unnerving fast demo of the future of AI chatbots is to use the Pi iOS app, which lets you have a phone call with a Large Language Model optimized for chat It isn’t the AI from “Her” yet, but you can start to see the path towards AI companions. https://t.co/agJU14ukBB. In Twitter.
Eugenia Kuyda. (2023). Replika. In replika.com. https://replika.com.
European Union. (2024). Regulation (EU) 2024/1689 on artificial intelligence (AI act).
Fanelli, A. (2024). Bolt.new, Flow Engineering for Code Agents, and >$8m ARR in 2 months as a Claude Wrapper. https://www.latent.space/p/bolt.
Feifei Liu 刘菲菲. (n.d.). Prompt Controls in GenAI Chatbots: 4 Main Uses and Best Practices. In Nielsen Norman Group. https://www.nngroup.com/articles/prompt-controls-genai/.
Feine, J., Gnewuch, U., Morana, S. & Maedche, A. (2019). A Taxonomy of Social Cues for Conversational Agents. International Journal of Human-Computer Studies, 132, 138–161. https://doi.org/10.1016/j.ijhcs.2019.07.009
Fletcher, J. (2023). Generative UI and the Downfall of Digital Experiences — The Swift Path to Average. In Medium.
Fu, T., Gao, S., Zhao, X., Wen, J. & Yan, R. (2022). Learning towards conversational AI: A survey. AI Open, 3, 14–28. https://doi.org/10.1016/j.aiopen.2022.02.001
Future of Life Institute. (2023). Pause Giant AI Experiments: An Open Letter.
Ganbold, O., Rose, A. M., Rose, J. M. & Rotaru, K. (2022). Increasing Reliance on Financial Advice with Avatars: The Effects of Competence and Complexity on Algorithm Aversion. Journal of Information Systems, 36(1), 7–17. https://doi.org/10.2308/ISYS-2021-002
Gao, L., la Tour, T. D., Tillman, H., Goh, G., Troll, R., Radford, A., Sutskever, I., Leike, J. & Wu, J. (2024). Scaling and evaluating sparse autoencoders. arXiv. https://doi.org/10.48550/ARXIV.2406.04093
Gates, B. (2023). AI is about to completely change how you use computers. In gatesnotes.com. https://www.gatesnotes.com/AI-agents.
Ge Wang. (2019). Humans in the Loop: The Design of Interactive AI Systems. In Stanford HAI. https://hai.stanford.edu/news/humans-loop-design-interactive-ai-systems.
Generative UI Design: Einstein, Galileo, and the AI Design Process. (2023). In Prototypr. https://prototypr.io/post/generative-ai-design.
Gent, E. (2023). A Cryptocurrency for the Masses or a Universal ID?: Worldcoin Aims to Scan all the World’s Eyeballs. IEEE Spectrum, 60(1), 42–57. https://doi.org/10.1109/MSPEC.2023.10006664
Gitcoin Passport — Sybil Defense. Made Simple. [@gitcoinpassport]. (2023). Why did Gitcoin choose to build @GitcoinPassport as an "aggregator" of anti-Sybil solutions? 🤔 Gitcoin Passport Workstream Co-Lead @kevinrolsen explains: https://t.co/QYgqp85QBm. In Twitter.
Goodfellow, I. J., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A. & Bengio, Y. (2014). Generative Adversarial Networks. arXiv. https://doi.org/10.48550/ARXIV.1406.2661
Google. (n.d.). Our Principles – Google AI. https://ai.google/principles.
Google. (2022). Google Presents: AI@ ‘22.
Google. (2024). Multimodal prompting with a 44-minute movie Gemini 1.5 Pro Demo.
Google & The Oxford Internet Institute. (2022). The A-Z of AI. https://atozofai.withgoogle.com/.
Goswami, R. (2023). Google reportedly building A.I. That offers life advice. In CNBC. https://www.cnbc.com/2023/08/16/google-reportedly-building-ai-that-offers-life-advice.html.
Gratch, J. & Fast, N. J. (2022). The power to harm: AI assistants pave the way to unethical behavior. Current Opinion in Psychology, 47, 101382. https://doi.org/10.1016/j.copsyc.2022.101382
Greylock. (2022). OpenAI CEO Sam Altman AI for the Next Era.
Harvard Advanced Leadership Initiative. (2021). Human-AI Interaction: From Artificial Intelligence to Human Intelligence Augmentation.
Haugeland, I. K. F., Følstad, A., Taylor, C. & Bjørkli, C. A. (2022). Understanding the user experience of customer service chatbots: An experimental study of chatbot interaction design. International Journal of Human-Computer Studies, 161, 102788. https://doi.org/10.1016/j.ijhcs.2022.102788
Health. Powered by Ada. (n.d.). In Ada. https://ada.com/.
Heidel, S. & Handa, N. (2025). MCP, reasoning, and multiple Responses API tools can work together [Tweet]. In Twitter.
Hendrycks, D., Burns, C., Basart, S., Zou, A., Mazeika, M., Song, D. & Steinhardt, J. (2020). Measuring Massive Multitask Language Understanding. https://doi.org/10.48550/ARXIV.2009.03300
HIITTV. (2021). Wojciech Szpankowski: Emerging Frontiers of Science of Information.
Hildebrand, C. & Bergner, A. (2021). Conversational robo advisors as surrogates of trust: Onboarding experience, firm perception, and consumer financial decision making. Journal of the Academy of Marketing Science, 49(4), 659–676. https://doi.org/10.1007/s11747-020-00753-z
Holbrook, J. (2018). Human-Centered Machine Learning. In Medium. https://medium.com/google-design/human-centered-machine-learning-a770d10562cd.
Holzinger, A., Keiblinger, K., Holub, P., Zatloukal, K. & Müller, H. (2023). AI for life: Trends in artificial intelligence for biotechnology. New Biotechnology, 74, 16–24. https://doi.org/10.1016/j.nbt.2023.02.001
Holzinger, A., Malle, B., Saranti, A. & Pfeifer, B. (2021). Towards multi-modal causability with Graph Neural Networks enabling information fusion for explainable AI. Information Fusion, 71, 28–37. https://doi.org/10.1016/j.inffus.2021.01.008
Home - Lark Health. (n.d.). https://www.lark.com/.
Hungerford, O. (2025). Modelcontextprotocol/servers: Model Context Protocol Servers. https://github.com/modelcontextprotocol/servers.
iGenius. (2020). Let’s talk about sustainable AI. In Ideas @ iGenius.
Ilya Sutskever. (2018). Ilya Sutskever at AI Frontiers : Progress towards the OpenAI mission.
Isabella Ghassemi Smith. (2019). Interview: Daniel Baeriswyl, CEO of Magic Carpet SeedLegals. https://seedlegals.com/resources/magic-carpet-the-ai-investor-technology-transforming-hedge-fund-strategy/.
Jan Leike & Ilya Sutskever. (2023). Introducing Superalignment. https://openai.com/index/introducing-superalignment/.
Jarovsky, L. (2022). Dark Patterns in AI: Privacy Implications. https://www.theprivacywhisperer.com/p/dark-patterns-in-ai-privacy-implications.
Jeblick, K., Schachtner, B., Dexl, J., Mittermeier, A., Stüber, A. T., Topalis, J., Weber, T., Wesp, P., Sabel, B., Ricke, J. & Ingrisch, M. (2022). ChatGPT Makes Medicine Easy to Swallow: An Exploratory Case Study on Simplified Radiology Reports. https://doi.org/10.48550/ARXIV.2212.14882
Jiang, Q., Zhang, Y. & Pian, W. (2022). Chatbot as an emergency exist: Mediated empathy for resilience via human-AI interaction during the COVID-19 pandemic. Information Processing & Management, 59(6), 103074. https://doi.org/10.1016/j.ipm.2022.103074
Joe Blair. (2024). Generative UI: The new front end of the internet? — Joe Blair. https://www.joe-blair.com/blog/the-new-front-end.
Josh Lovejoy. (n.d.). The UX of AI. In Google Design. https://design.google/library/ux-ai.
Joyce, C. (2024). The rise of Generative AI-driven design patterns. In Medium. https://uxdesign.cc/the-rise-of-generative-ai-driven-design-patterns-177cb1380b23.
Kanza, S., Bird, C. L., Niranjan, M., McNeill, W. & Frey, J. G. (2021). The AI for Scientific Discovery Network+. Patterns, 2(1), 100162. https://doi.org/10.1016/j.patter.2020.100162
Kaplan, J., McCandlish, S., Henighan, T., Brown, T. B., Chess, B., Child, R., Gray, S., Radford, A., Wu, J. & Amodei, D. (2020). Scaling Laws for Neural Language Models. arXiv. https://doi.org/10.48550/ARXIV.2001.08361
Kara Manke. (2022). ChatGPT architect, Berkeley alum John Schulman on his journey with AI. In Berkeley. https://news.berkeley.edu/2023/04/20/chatgpt-architect-berkeley-alum-john-schulman-on-his-journey-with-ai.
Karpus, J., Krüger, A., Verba, J. T., Bahrami, B. & Deroy, O. (2021). Algorithm exploitation: Humans are keen to exploit benevolent AI. iScience, 24(6), 102679. https://doi.org/10.1016/j.isci.2021.102679
Kate Moran & Sarah Gibbons. (2024). Generative UI and Outcome-Oriented Design. In Nielsen Norman Group. https://www.nngroup.com/articles/generative-ui/.
Kecht, C., Egger, A., Kratsch, W. & Röglinger, M. (2023). Quantifying chatbots’ ability to learn business processes. Information Systems, 102176. https://doi.org/10.1016/j.is.2023.102176
Khosravi, H., Shum, S. B., Chen, G., Conati, C., Tsai, Y.-S., Kay, J., Knight, S., Martinez-Maldonado, R., Sadiq, S. & Gašević, D. (2022). Explainable Artificial Intelligence in education. Computers and Education: Artificial Intelligence, 3, 100074. https://doi.org/10.1016/j.caeai.2022.100074
Kobetz, R. (2023). Decoding the future: The evolution of intelligent interfaces. In Medium. https://uxdesign.cc/decoding-the-future-the-evolution-of-intelligent-interfaces-ec696ccc62cc.
Kocijan, V., Davis, E., Lukasiewicz, T., Marcus, G. & Morgenstern, L. (2022). The Defeat of the Winograd Schema Challenge. https://doi.org/10.48550/ARXIV.2201.02387
Kreuk, F., Synnaeve, G., Polyak, A., Singer, U., Défossez, A., Copet, J., Parikh, D., Taigman, Y. & Adi, Y. (2022). AudioGen: Textually Guided Audio Generation. https://doi.org/10.48550/ARXIV.2209.15352
LangChain. (2024). Dynamic few-shot examples with LangSmith datasets. In LangChain Blog. https://blog.langchain.dev/dynamic-few-shot-examples-langsmith-datasets/.
Latent Space. (2025). Building Manus AI (first ever Manus Meetup).
Lee, P., Goldberg, C. & Kohane, I. (2023). The AI revolution in medicine: GPT-4 and beyond (1st ed.). Pearson.
Leino, K., Sen, S., Datta, A., Fredrikson, M. & Li, L. (2018). Influence-Directed Explanations for Deep Convolutional Networks. https://doi.org/10.48550/ARXIV.1802.03788
Leite, M. L., de Loiola Costa, L. S., Cunha, V. A., Kreniski, V., de Oliveira Braga Filho, M., da Cunha, N. B. & Costa, F. F. (2021). Artificial intelligence and the future of life sciences. Drug Discovery Today, 26(11), 2515–2526. https://doi.org/10.1016/j.drudis.2021.07.002
Leng, Q., Portes, J., Havens, S., Zaharia, M. & Carbin, M. (2024). Long Context RAG Performance of LLMs. In Databricks. https://www.databricks.com/blog/long-context-rag-performance-llms.
Lenharo, M. (2023). ChatGPT gives an extra productivity boost to weaker writers. Nature, d41586-023-02270-9. https://doi.org/10.1038/d41586-023-02270-9
Lennart Ziburski. (2018). The UX of AI. https://uxofai.com/.
Leswing, K. (2023). Nvidia reveals new A.I. Chip, says costs of running LLMs will ’drop significantly’. In CNBC. https://www.cnbc.com/2023/08/08/nvidia-reveals-new-ai-chip-says-cost-of-running-large-language-models-will-drop-significantly-.html.
Levesque, H. J., Davis, E. & Morgenstern, L. (2012). The winograd schema challenge. Proceedings of the Thirteenth International Conference on Principles of Knowledge Representation and Reasoning, 552–561.
Lew, G. & Schumacher, R. M. J. (2020). AI and UX: Why artificial intelligence needs user experience. Apress.
Lexow, M. (2021). Designing for AI — a UX approach. In Medium. https://uxdesign.cc/artificial-intelligence-in-ux-design-54ad4aa28762.
Li, T., Vorvoreanu, M., DeBellis, D. & Amershi, S. (2022). Assessing human-AI interaction early through factorial surveys: A study on the guidelines for human-AI interaction. ACM Transactions on Computer-Human Interaction.
Li, X. & Sung, Y. (2021). Anthropomorphism brings us closer: The mediating role of psychological distance in User–AI assistant interactions. Computers in Human Behavior, 118, 106680. https://doi.org/10.1016/j.chb.2021.106680
Liang, P., Bommasani, R., Lee, T., Tsipras, D., Soylu, D., Yasunaga, M., Zhang, Y., Narayanan, D., Wu, Y., Kumar, A., Newman, B., Yuan, B., Yan, B., Zhang, C., Cosgrove, C., Manning, C. D., Ré, C., Acosta-Navas, D., Hudson, D. A., … Koreeda, Y. (2022). Holistic Evaluation of Language Models (No. arXiv:2211.09110). arXiv. https://arxiv.org/abs/2211.09110
Liu, B. & Wei, L. (2021). Machine gaze in online behavioral targeting: The effects of algorithmic human likeness on social presence and social influence. Computers in Human Behavior, 124, 106926. https://doi.org/10.1016/j.chb.2021.106926
lmsys.org. (2024). GPT-4-Turbo has just reclaimed the No. 1 spot on the Arena leaderboard again! Woah! We collect over 8K user votes from diverse domains and observe its strong coding & reasoning capability over others. In Twitter.
Lohr, S. (2004). Microsoft, Amid Dwindling Interest, Talks Up Computing as a Career. The New York Times.
Loizos, C. (2025). OpenAI’s planned data center in Abu Dhabi would be bigger than Monaco. In TechCrunch.
Lomas, N. (2024). Deal on EU AI Act gets thumbs up from European Parliament. In TechCrunch.
Lorenzo, D. (2015). Daisy Ginsberg Imagines A Friendlier Biological Future. In Fast Company. https://www.fastcompany.com/3051140/daisy-ginsberg-is-natures-most-deadly-synthetic-designer.
Lower, C. (2017). Chatbots: Too Good to Be True? (They Are, Here’s Why). In Clinc.
Lv, X., Luo, J., Liang, Y., Liu, Y. & Li, C. (2022). Is cuteness irresistible? The impact of cuteness on customers’ intentions to use AI applications. Tourism Management, 90, 104472. https://doi.org/10.1016/j.tourman.2021.104472
Malliaris, M. & Salchenberger, L. (1996). Using neural networks to forecast the S&P 100 implied volatility. Neurocomputing, 10(2), 183–195. https://doi.org/10.1016/0925-2312(95)00019-4
Matteo Sciortino. (2024). Generative UI: How AI is automating the creation of digital interfaces. https://www.linkedin.com/pulse/generative-ui-how-ai-automating-creation-digital-matteo-sciortino-qa3yf/.
McCorduck, P. (2004). Machines who think: A personal inquiry into the history and prospects of artificial intelligence (25th anniversary update). A.K. Peters.
McCulloch, W. S. & Pitts, W. (1943). A logical calculus of the ideas immanent in nervous activity. The Bulletin of Mathematical Biophysics, 5(4), 115–133. https://doi.org/10.1007/BF02478259
McKeough, T. (2018). McKinsey Design Launches, Confirming the Importance of Design to Business. In Architectural Digest. https://www.architecturaldigest.com/story/mckinsey-design-consulting-group-confirms-the-importance-of-design-to-business.
Merritt, R. (2022). What Is a Transformer Model? In NVIDIA Blog. https://blogs.nvidia.com/blog/2022/03/25/what-is-a-transformer-model/.
Meta AI. (2023). AudioCraft: A simple one-stop shop for audio modeling. In Meta AI.
Metcalfe, J. & Shimamura, A. P. (Eds.). (1994). Metacognition: Knowing about Knowing. The MIT Press. https://doi.org/10.7551/mitpress/4561.001.0001
METR. (2023). https://metr.org/.
Migozzi, J., Urban, M. & Wójcik, D. (2023). “You should do what India does”: FinTech ecosystems in India reshaping the geography of finance. Geoforum, 103720. https://doi.org/10.1016/j.geoforum.2023.103720
Mikael Eriksson Björling & Ahmed H. Ali. (2020). UX design in AI, A trustworthy face for the AI brain. In Ericsson.
Mittal, A. (2024). Inflection-2.5: The Powerhouse LLM Rivaling GPT-4 and Gemini. In Unite.AI.
Mittal, V. (2025). Little by Little, a Little Becomes a Lot. https://inflection.ai/blog/little-by-little-a-little-becomes-a-lot.
Morana, S., Gnewuch, U., Jung, D. & Granig, C. (2020, June). The effect of anthropomorphism on investment decision-making with robo-advisor chatbots.
Moss, S. (2025). OpenAI CFO: Stargate targeting multiple locations in Texas, considering AI data centers in Pennsylvania, Oregon, and Wisconsin. In DCD.
Mühlhoff, R. (2019). Human-aided artificial intelligence: Or, how to run large computations in human brains? Toward a media sociology of machine learning. https://doi.org/10.14279/DEPOSITONCE-11329
Nathan Benaich & Ian Hogarth. (2022). State of AI Report 2022. https://www.stateof.ai/.
Ng, A. (2024). AI Restores ALS Patient’s Voice, AI Lobby Grows, and more. In The Batch. https://www.deeplearning.ai/the-batch/issue-264/?
Nick Clegg. (2023). How AI Influences What You See on Facebook and Instagram. In Meta.
Nielsen, J. (2024a). Accessibility Has Failed: Try Generative UI = Individualized UX. In Jakob Nielsen on UX.
Nielsen, J. (2024b). Information Scent: How Users Decide Where to Click. In Jakob Nielsen on UX.
Nielsen, J. (2024c). UX Roundup: AI Empathy | Submit Buttons | European Job Changes | Runway AI Video | Writing Questions for User Research | Leonardo Sold | Midjourney New Release. In Jakob Nielsen on UX.
Nielsen, J. (2025). No More User Interface? [Substack Newsletter]. In Jakob Nielsen on UX.
No Priors: AI, Machine Learning, Tech, & Startups. (2023). With Inceptive CEO Jakob Uszkoreit (Ep. 29).
Noble, S. M., Mende, M., Grewal, D. & Parasuraman, A. (2022). The Fifth Industrial Revolution: How Harmonious Human–Machine Collaboration is Triggering a Retail and Service [R]evolution. Journal of Retailing, 98(2), 199–208. https://doi.org/10.1016/j.jretai.2022.04.003
NVIDIA. (2025). NVIDIA CEO Jensen Huang Keynote at COMPUTEX 2025.
NVIDIA Developer. (2025). Frontiers of AI and Computing: A Conversation With Yann LeCun and Bill Dally NVIDIA GTC 2025.
O’Connor, S. & ChatGPT. (2023). Open artificial intelligence platforms in nursing education: Tools for academic progress or abuse? Nurse Education in Practice, 66, 103537. https://doi.org/10.1016/j.nepr.2022.103537
OECD. (2024). Defining AI incidents and related terms (No. 16).
On Nielsen’s ideas about generative UI for resolving accessibility. (2024). In Axbom • Digital Compassion. https://axbom.com/nielsen-generative-ui-failure/.
OpenAI. (2024a). Extracting Concepts from GPT-4. https://openai.com/index/extracting-concepts-from-gpt-4/.
OpenAI. (2024b). Hello GPT-4o. https://openai.com/index/hello-gpt-4o/.
OpenAI. (2024c). Introducing the Model Spec. https://openai.com/index/introducing-the-model-spec/.
OpenAI. (2025). A practical guide to building agents.
Ouyang, L., Wu, J., Jiang, X., Almeida, D., Wainwright, C. L., Mishkin, P., Zhang, C., Agarwal, S., Slama, K., Ray, A., Schulman, J., Hilton, J., Kelton, F., Miller, L., Simens, M., Askell, A., Welinder, P., Christiano, P., Leike, J. & Lowe, R. (2022). Training language models to follow instructions with human feedback. https://doi.org/10.48550/ARXIV.2203.02155
Paddle Doll, Middle Kingdom. (2023). In The Metropolitan Museum of Art. https://www.metmuseum.org/art/collection/search/544216.
Pandey, S. & Freiberg, B. (2025). Introducing AWS Serverless MCP Server: AI-powered development for modern applications AWS Compute Blog. https://aws.amazon.com/blogs/compute/introducing-aws-serverless-mcp-server-ai-powered-development-for-modern-applications/.
Patel, N. (2024). Replika CEO Eugenia Kuyda says the future of AI might mean friendship and marriage with chatbots. In The Verge. https://www.theverge.com/24216748/replika-ceo-eugenia-kuyda-ai-companion-chatbots-dating-friendship-decoder-podcast-interview.
Pavlik, J. V. (2023). Collaborating With ChatGPT: Considering the Implications of Generative Artificial Intelligence for Journalism and Media Education. Journalism & Mass Communication Educator, 78(1), 84–93. https://doi.org/10.1177/10776958221149577
Pete. (2023). We hosted #emergencychatgpthackathon this past Sunday for the new ChatGPT and Whisper APIs. It all came together in just 4 days, but we had 250+ people and 70+ teams demo! Here’s a recap of our winning demos: https://t.co/6o1PvR9gRJ. In Twitter.
Picard, R. W. (1997). Affective computing. MIT Press.
Pilacinski, A., Pinto, A., Oliveira, S., Araújo, E., Carvalho, C., Silva, P. A., Matias, R., Menezes, P. & Sousa, S. (2023). The robot eyes don’t have it. The presence of eyes on collaborative robots yields marginally higher user trust but lower performance. Heliyon, 9(8), e18164. https://doi.org/10.1016/j.heliyon.2023.e18164
Pirolli, P. & Card, S. (1999). Information foraging. Psychological Review, 106(4), 643–675. https://doi.org/10.1037/0033-295X.106.4.643
Plotkina, D., Orkut, H. & Karageyim, M. A. (2024). Give me a human! How anthropomorphism and robot gender affect trust in financial robo-advisory services. Asia Pacific Journal of Marketing and Logistics, 36(10), 2689–2705. https://doi.org/10.1108/APJML-09-2023-0939
Pokrass, M. (2024). Introducing Structured Outputs in the API. In OpenAI. https://openai.com/index/introducing-structured-outputs-in-the-api/.
Prasad, R. (2022). How will Alexa, Amazon’s AI voice assistant, advance by talking to us less? In Web Summit. https://websummit.com/blog/tech/alexa-amazon-ai-voice-assistant-podcast/.
Qiu, T. (2021). A Psychiatrist’s Perspective on Social Media Algorithms and Mental Health. In Stanford HAI. https://hai.stanford.edu/news/psychiatrists-perspective-social-media-algorithms-and-mental-health.
Radford, A., Narasimhan, K., Salimans, T. & Sutskever, I. (2018). Improving language understanding by generative pre-training. OpenAI.
Ragas. (2023). Metrics-Driven Development. https://docs.ragas.io/en/stable/concepts/metrics_driven.html.
Ramchurn, S. D., Stein, S. & Jennings, N. R. (2021). Trustworthy human-AI partnerships. iScience, 24(8), 102891. https://doi.org/10.1016/j.isci.2021.102891
Rauch, G. (2024). A fascinating finding from @v0 has been that when something fails, newcomers’ instincts are to tell *us*, @vercel, about it, but if they had told the AI, in most cases it would fix the issue immediately and flawlessly. I think the inertia comes from the fact that it’s so. In Twitter.
ReadyAI. (2020). Human-AI Interaction: How We Work with Artificial Intelligence.
Reeves, B. & Nass, C. I. (1998). The media equation: How people treat computers, television, and new media like real people and places (1st paperback ed.). CSLI Publications.
Reformat, M. (2014). Special section: Applications of computational intelligence and machine learning to software engineering. Information Sciences, 259, 393–395. https://doi.org/10.1016/j.ins.2013.11.019
Replit. (2023). Replit — Openv0: The Open-Source, AI-Driven Generative UI Component Framework. In Replit Blog. https://blog.replit.com/openv0-spotlight.
Review of the 2023 Helsinki Biennial. (2023). In Berlin Art Link. https://www.berlinartlink.com/2023/07/21/review-2023-helsinki-biennial-wilderness/.
Reynolds, C. (2001). Designing for affective interactions.
Robin Dhanwani. (2021). Fintech UI/UX Design: Driving Growth by Creating a Better User Experience. In Parallel Blog. https://www.parallelhq.com/blog/fintech-ui-ux-design.
Rogers, C. R. (1995). A way of being. Houghton Mifflin Co.
Rogers, Y. (2022). The Four Phases of Pervasive Computing: From Vision-Inspired to Societal-Challenged. IEEE Pervasive Computing, 21(3), 9–16. https://doi.org/10.1109/MPRV.2022.3179145
Romain Beaumont. (2022). LAION-5B: A NEW ERA OF OPEN LARGE-SCALE MULTI-MODAL DATASETS. https://laion.ai/blog/laion-5b.
Saini, R. (2025). Apple Uses Bug Report Data for AI Training in iOS 18.5 Beta. In The Mac Observer.
San Roman, R., Adi, Y., Deleforge, A., Serizel, R., Synnaeve, G. & Défossez, A. (2023). From discrete tokens to high-fidelity audio using multi-band diffusion. arXiv preprint.
Schoonderwoerd, T. A. J., Jorritsma, W., Neerincx, M. A. & van den Bosch, K. (2021). Human-centered XAI: Developing design patterns for explanations of clinical decision support systems. International Journal of Human-Computer Studies, 154, 102684. https://doi.org/10.1016/j.ijhcs.2021.102684
Schuhmann, C., Beaumont, R., Vencu, R., Gordon, C., Wightman, R., Cherti, M., Coombes, T., Katta, A., Mullis, C., Wortsman, M., Schramowski, P., Kundurthy, S., Crowson, K., Schmidt, L., Kaczmarczyk, R. & Jitsev, J. (2022). LAION-5B: An open large-scale dataset for training next generation image-text models. https://doi.org/10.48550/ARXIV.2210.08402
Sean McGowan. (2018). UX Design For FinTech: 4 Things To Remember. In Usability Geek. https://usabilitygeek.com/ux-design-fintech-things-to-remember/.
Searls, D. (2012). The intention economy: When customers take charge. Harvard Business Review Press.
Seeber, I., Bittner, E., Briggs, R. O., de Vreede, T., de Vreede, G.-J., Elkins, A., Maier, R., Merz, A. B., Oeste-Reiß, S., Randrup, N., Schwabe, G. & Söllner, M. (2020). Machines as teammates: A research agenda on AI in team collaboration. Information & Management, 57(2), 103174. https://doi.org/10.1016/j.im.2019.103174
Sengottuvelu, R. (2025). Rethinking how we Scaffold AI Agents - Rahul Sengottuvelu, Ramp. YouTube.
Şerban, C. & Todericiu, I.-A. (2020). Alexa, what classes do I have today? The use of artificial intelligence via smart speakers in education. Procedia Computer Science, 176, 2849–2857. https://doi.org/10.1016/j.procs.2020.09.269
Shahaf, D. & Amir, E. (2007). Towards a theory of AI completeness. AAAI Spring Symposium: Logical Formalizations of Commonsense Reasoning.
Shenoi, S. (2018). Participatory design and the future of interaction design. In Medium. https://uxdesign.cc/participatory-design-and-the-future-of-interaction-design-81a11713bbf.
Shibu, S. (2024). Model From OpenAI Rival Anthropic Shows ’Metacognition’: Report. In Entrepreneur. https://www.entrepreneur.com/business-news/model-from-openai-rival-anthropic-shows-metacognition/470823.
Shin, D. (2020). How do users interact with algorithm recommender systems? The interaction of users, algorithms, and performance. Computers in Human Behavior, 109, 106344. https://doi.org/10.1016/j.chb.2020.106344
Shin, D., Zhong, B. & Biocca, F. (2020). Beyond user experience: What constitutes algorithmic experiences? International Journal of Information Management, 52, 102061. https://doi.org/10.1016/j.ijinfomgt.2019.102061
Shipper, D. (2023). GPT-4 Is a Reasoning Engine. https://every.to/chain-of-thought/gpt-4-is-a-reasoning-engine.
Silo AI’s new release Viking 7B, bridges the gap for low-resource languages. (2024). In Tech.eu. https://tech.eu/2024/05/15/silo-ai-s-new-release-viking-7b-bridges-the-gap-for-low-resource-languages/.
Silva, F. C. da. (2023). ETFmatic Review.
Singer, U., Polyak, A., Hayes, T., Yin, X., An, J., Zhang, S., Hu, Q., Yang, H., Ashual, O., Gafni, O., Parikh, D., Gupta, S. & Taigman, Y. (2022). Make-A-video: Text-to-video generation without text-video data. ArXiv, abs/2209.14792.
Singhal, K., Tu, T., Gottweis, J., Sayres, R., Wulczyn, E., Hou, L., Clark, K., Pfohl, S., Cole-Lewis, H., Neal, D., Schaekermann, M., Wang, A., Amin, M., Lachgar, S., Mansfield, P., Prakash, S., Green, B., Dominowska, E., Arcas, B. A. y, … Natarajan, V. (2023). Towards Expert-Level Medical Question Answering with Large Language Models (No. arXiv:2305.09617). arXiv. https://arxiv.org/abs/2305.09617
Slack, J. (2021). The Atura Process. In Atura website. https://atura.ai/docs/02-process/.
Sohl-Dickstein, J. (2024). The boundary of neural network trainability is fractal (No. arXiv:2402.06184). arXiv. https://arxiv.org/abs/2402.06184
Soleimani, L. (2018). 10 UI Patterns For a Human Friendly AI. In Medium. https://blog.orium.com/10-ui-patterns-for-a-human-friendly-ai-e86baa2a4471.
Soundarya Jayaraman. (2023). How Big Is Big? 85+ Big Data Statistics You Should Know in 2023. In G2.
Stanford Encyclopedia of Philosophy. (2021). The Turing Test. https://plato.stanford.edu/entries/turing-test/.
Steph Hay. (2017). Eno - Financial AI Understands Emotions. In Capital One. https://www.capitalone.com/tech/machine-learning/designing-a-financial-ai-that-recognizes-and-responds-to-emotion/.
Stockton, N. (2017). If AI Can Fix Peer Review in Science, AI Can Do Anything. Wired.
Stone Skipper. (2022). How AI is changing “interactions.” In Medium. https://uxplanet.org/how-ai-is-changing-interactions-179cc279e545.
Su, J., Ng, D. T. K. & Chu, S. K. W. (2023). Artificial Intelligence (AI) Literacy in Early Childhood Education: The Challenges and Opportunities. Computers and Education: Artificial Intelligence, 4, 100124. https://doi.org/10.1016/j.caeai.2023.100124
Su, J. & Yang, W. (2022). Artificial intelligence in early childhood education: A scoping review. Computers and Education: Artificial Intelligence, 3, 100049. https://doi.org/10.1016/j.caeai.2022.100049
Susskind, D. (2017). A model of technological unemployment.
Szczuka, J. M., Strathmann, C., Szymczyk, N., Mavrina, L. & Krämer, N. C. (2022). How do children acquire knowledge about voice assistants? A longitudinal field study on children’s knowledge about how voice assistants store and process data. International Journal of Child-Computer Interaction, 33, 100460. https://doi.org/10.1016/j.ijcci.2022.100460
Taleb, N. N. (2012). Antifragile: Things that gain from disorder (1st ed.). Random House.
Tarnoff, B. (2023). Weizenbaum’s nightmares: How the inventor of the first chatbot turned against AI. The Guardian.
Tash Keuneman. (2022). We love to hate Clippy — but what if Clippy was right? In UX Collective. https://uxdesign.cc/we-love-to-hate-clippy-but-what-if-clippy-was-right-472883c55f2e.
Tay, A. (2023). Why science needs a protein emoji. Nature. https://doi.org/10.1038/d41586-023-00674-1
The International Ergonomics Association. (2019). Human Factors/Ergonomics (HF/E). https://iea.cc/what-is-ergonomics/.
Tristan Greene. (2022). Confused Replika AI users are trying to bang the algorithm. In TNW. https://thenextweb.com/news/confused-replika-ai-users-are-standing-up-for-bots-trying-bang-the-algorithm.
Troiano, L. & Birtolo, C. (2014). Genetic algorithms supporting generative design of user interfaces: Examples. Information Sciences, 259, 433–451. https://doi.org/10.1016/j.ins.2012.01.006
TruEra. (2023). TruLens. https://www.trulens.org.
Tubik Studio. (2018). UX Design Glossary: How to Use Affordances in User Interfaces. In Medium. https://uxplanet.org/ux-design-glossary-how-to-use-affordances-in-user-interfaces-393c8e9686e4.
Turing, A. M. (1950). I.—Computing machinery and intelligence. Mind, LIX(236), 433–460. https://doi.org/10.1093/mind/LIX.236.433
Twitter. (2023). Twitter’s Recommendation Algorithm. Twitter.
Understanding searches better than ever before. (2019). In Google. https://blog.google/products/search/search-language-understanding-bert/.
Unleash. (2017). Sebastian.ai. In UNLEASH.
Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, L. & Polosukhin, I. (2017). Attention Is All You Need. https://doi.org/10.48550/ARXIV.1706.03762
Velmovitsky, P. E., Alencar, P., Leatherdale, S. T., Cowan, D. & Morita, P. P. (2022). Using apple watch ECG data for heart rate variability monitoring and stress prediction: A pilot study. Frontiers in Digital Health, 4, 1058826. https://doi.org/10.3389/fdgth.2022.1058826
Vercel. (2023). Introducing v0: Generative UI.
Waddell, K. (2018). AI might need a therapist, too. In Axios. https://www.axios.com/2018/06/27/ai-might-need-a-psychologist-1529700757.
Wang, B. (2025). OpenAI Stargate Phase 1 Construction of 200 Megawatts and 980,000 Square Feet. In NextBigFuture.
Wang, S. & Casado, M. (2023). The Economic Case for Generative AI and Foundation Models. In Andreessen Horowitz. https://a16z.com/2023/08/03/the-economic-case-for-generative-ai-and-foundation-models/.
Weizenbaum, J. (1966). ELIZA—a computer program for the study of natural language communication between man and machine. Communications of the ACM, 9(1), 36–45. https://doi.org/10.1145/365153.365168
White, A. D. (2023). The future of chemistry is language. Nature Reviews Chemistry, 7(7), 457–458. https://doi.org/10.1038/s41570-023-00502-0
Who Benefits the most from Generative UI. (2024). https://www.monterey.ai/newsroom/who-benefits-the-most-from-generative-ui.
Why design is key to building trust in FinTech Star. (2021). https://star.global/posts/fintech-product-design-podcast/.
Why UX should guide AI. (2021). In VentureBeat.
Wiggers, K. (2023). Inworld, a generative AI platform for creating NPCs, lands fresh investment. In TechCrunch.
Women in AI. (2018). How can AI assistants help patients monitor their health? In Spotify. https://open.spotify.com/episode/3dL4m7ciCY0tnirZT2emzs.
World Governments Summit. (2024). A Conversation with the Founder of NVIDIA: Who Will Shape the Future of AI? https://www.youtube.com/watch?v=8Pm2xEViNIo.
Wu, J., Huang, Z., Hu, Z. & Lv, C. (2023). Toward Human-in-the-Loop AI: Enhancing Deep Reinforcement Learning via Real-Time Human Guidance for Autonomous Driving. Engineering, 21, 75–91. https://doi.org/10.1016/j.eng.2022.05.017
Xu, S., Chen, G., Guo, Y.-X., Yang, J., Li, C., Zang, Z., Zhang, Y., Tong, X. & Guo, B. (2024). VASA-1: Lifelike Audio-Driven Talking Faces Generated in Real Time. https://doi.org/10.48550/ARXIV.2404.10667
Xu, X. & Sar, S. (2018). Do We See Machines The Same Way As We See Humans? A Survey On Mind Perception Of Machines And Human Beings. 2018 27th IEEE International Symposium on Robot and Human Interactive Communication (RO-MAN), 472–475. https://doi.org/10.1109/ROMAN.2018.8525586
Yang, W. (2022). Artificial Intelligence education for young children: Why, what, and how in curriculum design and implementation. Computers and Education: Artificial Intelligence, 3, 100061. https://doi.org/10.1016/j.caeai.2022.100061
Yin, Y., Jia, N. & Wakslak, C. J. (2024). AI can help people feel heard, but an AI label diminishes this impact. Proceedings of the National Academy of Sciences, 121(14), e2319112121. https://doi.org/10.1073/pnas.2319112121
Yuan, C., Zhang, C. & Wang, S. (2022). Social anxiety as a moderator in consumer willingness to accept AI assistants based on utilitarian and hedonic values. Journal of Retailing and Consumer Services, 65, 102878. https://doi.org/10.1016/j.jretconser.2021.102878
Zafar, N. & Ahamed, J. (2022). Emerging technologies for the management of COVID19: A review. Sustainable Operations and Computers, 3, 249–257. https://doi.org/10.1016/j.susoc.2022.05.002
Zangróniz, R., Martínez-Rodrigo, A., Pastor, J., López, M. & Fernández-Caballero, A. (2017). Electrodermal Activity Sensor for Classification of Calm/Distress Condition. Sensors, 17(10), 2324. https://doi.org/10.3390/s17102324
Zellers, R., Holtzman, A., Bisk, Y., Farhadi, A. & Choi, Y. (2019). HellaSwag: Can a Machine Really Finish Your Sentence? https://doi.org/10.48550/ARXIV.1905.07830
Zerilli, J., Bhatt, U. & Weller, A. (2022). How transparency modulates trust in artificial intelligence. Patterns, 3(4), 100455. https://doi.org/10.1016/j.patter.2022.100455
Zero Waste Europe, Ekologi brez meja, Estonian University of Life Sciences, Tallinn University & Let’s Do It Foundation. (2022). The zero waste handbook. In Zero Waste Cities. https://zerowastecities.eu/tools/the-zero-waste-training-handbook/.
Zhang, G., Chong, L., Kotovsky, K. & Cagan, J. (2023). Trust in an AI versus a Human teammate: The effects of teammate identity and performance on Human-AI cooperation. Computers in Human Behavior, 139, 107536. https://doi.org/10.1016/j.chb.2022.107536
Zhou, Y., Muresanu, A. I., Han, Z., Paster, K., Pitis, S., Chan, H. & Ba, J. (2022). Large Language Models Are Human-Level Prompt Engineers. https://doi.org/10.48550/ARXIV.2211.01910
Zhu, H., Vigren, O. & Söderberg, I.-L. (2024). Implementing artificial intelligence empowered financial advisory services: A literature review and critical research agenda. Journal of Business Research, 174, 114494. https://doi.org/10.1016/j.jbusres.2023.114494
Z.M.L. (2023). “Computers enable fantasies” – On the continued relevance of Weizenbaum’s warnings. In LibrarianShipwreck.