Generative AI at The Edge

As featured on The Cutting Edge column at Acceleration Economy

The hype around generative AI continues to rise, and with it comes the inevitable realization of the risks and limitations of the technology, its rapidly growing range of applications, and the investments aimed at capitalizing on ever-expanding theories of its revolutionary impact on society, industry, and civilization as we know it. We are also starting to see calls for responsible and confidential Gen AI. The “edge” may have some of the answers we seek for the safe and valuable use of generative AI technologies.

A Little About Generative AI

Generative AI is an emerging technology and area of research that has created what can only be described as a craze. It’s the craze that launched a thousand VC investments in new “AI” ventures and hundreds of thousands of GPUs.

It has been popularized recently by the release of ChatGPT, which is essentially a SaaS-based chatbot that gives the public access to the GPT-3.5 and now GPT-4 large language models, or LLMs. A simple way to look at it: ChatGPT is a generative AI application built on top of a massive, GPU-powered AI engine.

Other applications have come into vogue recently, such as OpenAI’s DALL-E, Stability AI’s Stable Diffusion, and Midjourney. The advent of these services has spurred hundreds of Layer 2 applications to flood the market in a matter of months. A large number of them merely front-end a public Gen AI service with an NLU (Natural Language Understanding) interface to provide a conversational facade on top of their application.
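The "Layer 2" pattern described above can be sketched in a few lines. The following is a minimal, illustrative example, assuming a hypothetical upstream API — `call_llm_api` stands in for whatever public Gen AI endpoint such an application forwards to; it is not a real SDK call. The thin wrapper contributes little beyond a domain-specific system prompt:

```python
# A hedged sketch of a "Layer 2" app that front-ends a public Gen AI
# service as a conversational facade. The app's own contribution is only
# the system prompt; everything else is forwarded upstream.

SYSTEM_PROMPT = "You are a helpful travel-booking assistant."  # illustrative

def call_llm_api(messages):
    # Placeholder for the HTTP call to a public Gen AI provider.
    # A real wrapper would POST `messages` to the provider's API here.
    raise NotImplementedError("wire up a real Gen AI provider")

def conversational_facade(user_input, llm=call_llm_api):
    """Wrap a raw user query in a conversational interface and forward it."""
    messages = [
        {"role": "system", "content": SYSTEM_PROMPT},
        {"role": "user", "content": user_input},
    ]
    # Note: the user's prompt leaves the application boundary here -- the
    # confidentiality concern raised later in this article.
    return llm(messages)
```

Note that in this architecture every user prompt is shipped to the upstream service, which is precisely why such apps inherit the privacy and confidentiality questions discussed below.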

While these generative AI applications can perform some interesting feats and show promise, they are also quite limited in their capabilities, simply because large language models have inherent flaws that need to be considered, especially by organizations looking to use generative AI (GenAI) tools.

Large language models famously suffer from hallucination, which is, kindly put, confidently presenting fabricated outcomes. Depending on the application, this “flaw” can be leveraged as a beneficial feature, such as in the creation of a fictional novelette or a poem. For more deterministic and factual needs, LLMs have yet to prove themselves and are arguably poorly suited to this variety of tasks.

The new public generative AI applications raise questions of consumer privacy and enterprise confidentiality, especially as companies such as Microsoft, Google, Salesforce, and a growing number of others integrate GenAI or Copilot features into their products and services. This should raise concerns among CISOs and CIOs, as it did at the recent RSA Conference.

What Can Edge Computing Do for Generative AI?

Today, large foundation models such as the large language models (LLMs) exemplified by GPT-3.5 and GPT-4 are hosted and run in large data centers. ChatGPT reportedly runs on over 10,000 Nvidia GPUs. ChatGPT and other generative AI systems consume a tremendous amount of computing power and energy, raising concerns about the sustainability of this variety of AI computing and spurring calls for “sustainable AI.”

Going forward, as consumers and enterprises grow concerned about privacy and confidentiality with public Gen AI applications, evolving edge-native technologies will inevitably foster “privacy first” and confidential Gen AI systems and architectures. Edge AI technologies such as TinyML and quantization frameworks and tools are making it possible to compress foundation models to run on devices as small as smartphones.
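To make the compression idea concrete, here is a minimal sketch of post-training 8-bit weight quantization, the basic technique edge AI toolchains use to shrink models for on-device inference. This is a simplified, symmetric per-tensor scheme written from first principles, not the API of any particular framework; production toolchains add per-channel scales, calibration, and quantization-aware training:

```python
import numpy as np

def quantize_int8(weights):
    """Map float32 weights to int8 values plus a single scale factor."""
    scale = np.abs(weights).max() / 127.0  # map the largest weight to +/-127
    q = np.clip(np.round(weights / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q, scale):
    """Recover an approximation of the original float weights."""
    return q.astype(np.float32) * scale

# Toy weight tensor standing in for one layer of a foundation model.
w = np.array([0.5, -1.2, 0.03, 0.9], dtype=np.float32)
q, s = quantize_int8(w)
w_hat = dequantize(q, s)
# int8 storage cuts weight memory 4x versus float32, at a small accuracy
# cost bounded by half the scale per weight.
```

The 4x memory reduction (and further gains from 4-bit schemes) is what makes it plausible to fit multi-billion-parameter models into the tightly integrated memory of a smartphone SoC.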

The current generation of chips outside the data center is already well suited, if not optimized, for AI workloads and applications. This is because companies such as Apple, Qualcomm, MediaTek, and others continue to incorporate efficient neural engines into their SoCs with tightly integrated memory, which is a critical resource for LLMs. Ironically, data center systems such as those being designed and built by Nvidia borrow from the heterogeneous, increasingly AI-oriented system architecture of today’s mobile processors.

Qualcomm recently ran a Stable Diffusion model on a smartphone powered by its Snapdragon 8 Gen 2 processor, which features the Hexagon AI accelerator, demonstrating that robust transformer models can be deployed outside of large data centers to support domain-specific Gen AI applications, and eventually in distributed and federated architectures across the edge continuum.

Edge Gen AI, or what Qualcomm calls Hybrid AI, will be a promising direction of innovation that helps enterprises and consumer software companies leverage the power of this emerging technology in a secure and confidential way. It will help mitigate and minimize privacy and security exposure to public generative AI tools, which pose a largely unknown risk to enterprises and consumers. Edge Gen AI can also provide the transparency and control that organizations and individuals will want in order to safely use Gen AI-based applications and services, and it can support the data sovereignty requirements of local jurisdictions around the globe that are increasingly concerned about public Gen AI platforms.

Our thesis is that the valuable Gen AI applications will largely be private implementations based on foundation models managed and curated in a purposeful way to yield sustained business value and productivity enhancement. Most importantly, these applications will be secured and preserve organizational and personal confidentiality.

What does this mean for the C-Suite?

While there is a tremendous amount of excitement about generative AI, in particular ChatGPT, there is much that the C-suite needs to be concerned about to temper its wonder. Public generative AI applications are essentially black boxes and a potential threat to enterprise confidentiality and security.

It’s important for CISOs and CIOs to start asking really hard questions about how prompts and ingested data are used by generative AI applications and their handlers. It’s also time for legal teams to develop postures, policies, and protocols to deal with the bevy of legal concerns that most organizations have barely started to contemplate. 

Perhaps there is a need for a chief privacy officer who considers and evaluates the legal ramifications and recourse, and who fosters responsible AI inside and outside the organization.

Proceed with caution. The future of your organization and business will thank you for it.

Contact neXt Curve if you would like advice on how to best approach generative AI for your organization by requesting an analyst inquiry.

This material may not be copied, reproduced, or modified in whole or in part for any purpose except with express written permission or license from an authorized representative of neXt Curve. In addition to such written permission or license to copy, reproduce, or modify this document in whole or part, an acknowledgement of the authors of the document and all applicable portions of the copyright notice must be clearly referenced.
