Data

Is AWS a diminishing AI laggard – or is it right about on time?

December 12, 2023 By Jack Vaughan


AWS is lagging and racing to catch up in Generative AI and Large Language Models (LLMs). Or so an industry meme holds. When a smattering of new COVID isolations end and the dust settles in the weeks after Amazon’s re:Invent 2023 conference in Las Vegas, that notion may be due for a revision.

Like all its competitors, AWS is working to put Generative AI technology in place – that means latching it on to other application offerings and adapting new tools and schemes for developers.

Among challenges that now face teams creating Generative AI applications are vector embeddings. These processes are an important step in handling data for consumption by Large Language Models (LLMs) that betoken a new era of chatbots. Perhaps as importantly, vector embeddings are also useful in slightly less futuristic applications, such as search, recommendation engines and personalization engines.

When Wall Street wags ask whether AWS is a diminishing AI laggard or peaking at just the right time, they probably don’t devote too much thought to the types of vectors machine learning engines are now churning. But building such “infrastructure” is important on the path to working AI.

AWS put vector techniques front and center in AI and data announcements at re:Invent 2023. A centerpiece of this is Amazon Titan Multimodal Embeddings, just out. The software works to convert images and short text into numerical representations that generative learning models can use. These are used to unpack the semantic meanings of data, and to uncover important relations between data points.

Putting new-gen AI chatbots aside for the moment, it’s worth mentioning that recommendation and personalization tasks are likely beneficiaries of vector and AI progress. Once the province of Magnificent 7 Class vendors, these application types have become part of more and more organizations’ systems portfolios.

As you may imagine, they add considerable complexity to a developer’s typical day. Here, Amazon AWS has set a course to simplify such work for customers.

Before some words on that, a few words about these kinds of embeddings: Vector embeddings are numerical representations created by LLMs from words, phrases or blocks of text. The vectors are more useful for new styles of machine learning, which seek to find meaning in data points.
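To make the idea of "finding meaning in data points" a bit more concrete, here is a minimal sketch of how embeddings get compared. The three four-number vectors are invented for illustration (real embedding models emit hundreds or thousands of dimensions), and the helper names `cosine_similarity` and `most_similar` are hypothetical, not part of any AWS API. Cosine similarity is one common way to score closeness; it is assumed here, not prescribed by the source.

```python
import math

# Hypothetical 4-dimensional embeddings, made up for illustration only.
EMBEDDINGS = {
    "running shoes": [0.9, 0.1, 0.0, 0.2],
    "trail sneakers": [0.8, 0.2, 0.1, 0.3],
    "espresso machine": [0.0, 0.9, 0.7, 0.1],
}


def cosine_similarity(a, b):
    """Cosine of the angle between two vectors; near 1.0 means 'pointing the same way'."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def most_similar(query, embeddings):
    """Return (name, score) for the other item whose vector is closest to the query's."""
    query_vec = embeddings[query]
    candidates = [
        (name, cosine_similarity(query_vec, vec))
        for name, vec in embeddings.items()
        if name != query
    ]
    return max(candidates, key=lambda pair: pair[1])


if __name__ == "__main__":
    name, score = most_similar("running shoes", EMBEDDINGS)
    print(name, round(score, 3))
```

A recommendation or search feature works along these lines at much larger scale: items whose vectors sit close together are treated as related, whether or not they share any words.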

This is useful, but development managers need to find skilled-up programmers and architects to make this leap forward. That is some of the feedback AWS says it’s getting from customers. Enter Swami Sivasubramanian.

Sivasubramanian is vice president of data and AI, at AWS. At re:Invent he told attendees: “Our customers use vectors for GenAI applications. They told us they want to use them in their existing databases so that they can eliminate the learning curve in terms of picking up a new programing paradigm, new tools, APIs and SDKs. Importantly, when your vectors and business data are stored in the same place, your applications will run faster and there is not data sync or data movement to worry about.”

Do you want to bring in a vector database to handle this work – adding to your relational databases, document databases, graph databases, and so on? AWS, which has used re:Invent after re:Invent to spotlight such new database offerings, is shifting here to promote “run your vectors in your existing database” rather than bringing in another new-fangled database.

So, central to AWS’s take is a push to provide vector data handling within existing Amazon databases, rather than standalone vector databases, although Amazon supports 3rd-party vector database integration as well.

Among the many Amazon initiatives Sivasubramanian discussed at re:Invent 2023 was vector support for DocumentDB, DynamoDB, Aurora PostgreSQL, Amazon RDS for PostgreSQL, MemoryDB for Redis, Neptune Analytics, Amazon OpenSearch Serverless, and Amazon Bedrock.

The moment sets up a classic soup-to-nuts vendor vs. best-of-breed vendor paradigm. Among the best-of-breed database upstarts are Milvus, Pinecone, Zilliz and others.

Meanwhile, vector support has sounded as a drumroll for database makers of all ilk of late. Here is a small sampling. In September, IBM said it planned to integrate vector database capability into watsonx.data for use in retrieval augmented generation (RAG) use cases. Also in September, Oracle disclosed plans to add semantic search capabilities using AI vectors to Oracle Database 23c. On the heels of re:Invent, NoSQL stalwart MongoDB announced GA for MongoDB Atlas Vector Search. And, prior to re:Invent, Microsoft went public with vector database add-ons for Azure container apps.

Is AWS a diminishing AI laggard – or is it right about on time? No surprise here. The answer is somewhere in between the two extremes, just as it is somewhere between the poles on the soup-to-nuts-to-best-of-breed continuum. It will be interesting to see how the vector database market evolves. – Jack Vaughan

Use cases ultimately pave Generative AI’s path: Face it!

October 22, 2023 By Jack Vaughan

Andrew Ng’s online Stanford University machine learning classes serve as a gateway to understanding for many of today’s data scientists, and a discussion he led this summer at Stanford’s Graduate School of Business is noted here as extraordinary. It provides a clear view of the possible futures he sees for tomorrow’s AI practitioners. They must, like most of us, wonder where all this is going.

Said Ng: “It feels like a bunch of us have been talking about AI for 15 years or something. But if you look at where the value of AI is today, a lot of it is still very concentrated in the consumer software internet. Once you get outside tech or consumer software internet, there’s some AI adoption – but it all feels very early.”

Andrew Ng stands out among the ranks of machine learning scientists, notable for research, entrepreneurship, and teaching. He helped form and lead the Google Brain Team, has helped redefine the world of machine vision, and done a stint building the neural net and machine learning efforts at Baidu.

THIS IS PART 2 OF 2. FOR PART 1, GO TO: Old Big Data Today – Or the clarion of shiny new thingness

In 2014, he and his team at Google Brain published an influential paper on convolutional neural networks capable of supervised learning. Such supervised learning paved the way for today’s Generative AI.

“About 10, 15 years ago, my friends and I figured out a recipe for how to hire, say, 100 engineers to write one piece of software to serve more relevant ads, and apply that one piece of software to a billion users, and generate massive financial value,” he said, “But once you go outside consumer software internet, hardly anyone has 100 million or a billion users that you can … apply one piece of software to.”

A multibillion-dollar blockbuster project of the kind a Google or Amazon could muster and accomplish is one thing; all else is another. That was a major theme in the days of Big Data (2014-2019). It bears noting: the similarity of the ‘Large’ in Large Language Model and the ‘Big’ in Big Data.

Ng’s groundbreaking Google work was followed by leading roles at AI venture funds and startups including AI Fund, Landing AI and DeepLearning.AI. The work there now entails a search for cost-effective use cases for the latest AI breakthroughs.

There are interesting projects to pursue but, he suggests, they don’t usually yield a return commensurate with the developer effort needed. From use case to use case, there’s work to do, and caveats to consider.

Ng has worked with consumer packaged food makers to better systematize cheese patterns on pizza. As big as that app may be, it is not a “recipe for hiring a hundred or dozens of engineers.” The project value may be, for example, $5 million. He cited another perhaps typical AI brainstorm: To get wheat to grow straighter. Again, the return on the investment was not so favorable.

Then there is the cautionary moment. From Ng’s point of view, the work on use cases that can benefit from the tools of generative AI will also be marked by short-term fads along the way. Such a fad was Prisma Labs’ Lensa AI photo app, which turned selfies into professional looking digital art. That petered out like the ‘50s hula hoop. You can cite more of such. With generative AI, we’ve seen more than a bit of that already.

He does suggest that the time and coding power needed to create early-move AI apps is shrinking, and that Generative AI’s potential to streamline programming is crucial there. – Jack Vaughan

Worth noting: As one of the fathers of supervised learning, Ng naturally avows that this earlier discovery still has legs – that the bounty of supervised learning is still being mined for commercial effect. That may well have been missed in Wall Street’s mark-up of AI futures, and should not be ignored as Gen AI hype begins to dampen slightly.

There’s a lot more to learn, in whatever modes we choose. I recommend Andrew Ng’s lecture on AI Opportunities. [On YouTube.]

Orthogonal Sideshow: Investor and philosopher Nassim Taleb had a recent comment of interest on the LLM prompting process and entropy. Clicker beware: You are entering the realm of X.

Old Big Data Today – Or the clarion of shiny new thingness

October 21, 2023 By Jack Vaughan

LLMs and Generative AI are the next steps forward for machine learning, the not-so-little engine that saved AI from the horror of technology irrelevance. Here, I note some similarities between today’s AI and yesterday’s Big Data – followed in a subsequent post by some observations from Andrew Ng, a machine learning pioneer, looking ahead to cost-effective use cases for the new tooling.

Similarity breeds comparison

A jaundiced view might hold that Generative AI has taken over where Big Data left off. A rage a few years ago, Big Data fled from the scene. Is it anywhere now on a Gartner representation of the life of hype?

Big Data leaders, after they redefined recommendation engines and social media personalization, were often asked what Big Data was supposed to do next. The answer turned out to be “machine learning.” Flash forward to the present, and this has morphed into Large Language Models (LLMs) and prompt engineering.

There are plenty of differences between then and now. Let’s dwell on some similarities:

*As in the Big Data/Hadoop days of yesteryear, getting great gobs of custom data into the LLM is time consuming, labor intensive and error prone.

*The shiny new thingness may lure developers to chase the technology (which makes the resume sparkle) while short-changing the use case; that is, pursuing indefensible applications with short and less-than-stellar commercial life spans.

*And, as with Big Data – and just about every innovation that has ever come about — what works as a prototype may fail to scale in production. As well, what worked for a small army of Google sysadmins may not work for you, or prove saleable either.

*The first tooling is raw, and development can become a trudge of semi-blind trial and error.

*There is a megaton bomb of hyperbole that explodes, followed by hemming-hawing, nitpicking and numb lethargy. See ‘Faded Love and Hadoop’.

These problems are familiar to innovators, but LLMs bring new classes of problems too. What some developers will find persistently annoying is a flakiness in interaction with the LLM. You can regularly prompt it with the same input, while getting different output. I asked Google Bard about this and the answer was: “Overall, whether or not prompt engineering is fun is up to you.”
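One common source of that flakiness is that LLMs typically pick each next token by sampling from a probability distribution rather than always taking the top choice. The following is a toy sketch of that mechanism, not the workings of Bard or any specific model: the candidate tokens and their scores in `LOGITS` are invented, and `softmax` and `next_token` are hypothetical helper names.

```python
import math
import random

# Hypothetical next-token scores (logits); a real model computes these.
LOGITS = {"fun": 2.0, "tedious": 1.5, "up to you": 1.0}


def softmax(logits, temperature):
    """Turn raw scores into probabilities; higher temperature flattens them."""
    scaled = {tok: score / temperature for tok, score in logits.items()}
    peak = max(scaled.values())  # subtract the max for numerical stability
    exps = {tok: math.exp(s - peak) for tok, s in scaled.items()}
    total = sum(exps.values())
    return {tok: e / total for tok, e in exps.items()}


def next_token(logits, temperature, rng):
    """Pick one token: greedy when temperature is 0, random sampling otherwise."""
    if temperature == 0:
        return max(logits, key=logits.get)
    probs = softmax(logits, temperature)
    tokens = list(probs)
    return rng.choices(tokens, weights=[probs[t] for t in tokens], k=1)[0]


if __name__ == "__main__":
    rng = random.Random()
    # Same input every time, yet the sampled picks can differ from run to run.
    print([next_token(LOGITS, 1.0, rng) for _ in range(5)])
    # With temperature 0 the pick is always the highest-scoring token.
    print([next_token(LOGITS, 0, rng) for _ in range(5)])
```

Lowering the temperature, where a service exposes such a setting, usually makes output more repeatable, though hosted models may still vary for other reasons.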

Of course, a great effort is underway, and development teams will soon benefit from both the successes achieved and failures endured. Among the questions that should direct their efforts: Does the technology solve a widely-found problem of significant weight? In our next post, let’s find out what Andrew Ng says! – Jack Vaughan

THIS IS PART 1 OF 2. FOR PART 2, GO TO Use cases ultimately pave Generative AI’s path: Face it!

Nvidia and Cloudflare Deal; NYT on Human Feedback Loop

September 29, 2023 By Jack Vaughan

Cloudflare Powers Hyper-local AI inference with Nvidia – The triumph that is Nvidia these days can’t be overstated – although the wolves on Wall St. have sometimes tried. Still, Nvidia is a hardware company. Ok, let’s say Nvidia is still arguably a hardware company. Its chops are considerable, but essentially it’s all about the GPU.

Nvidia is ready to take on the very top ranks in computing. But, to do so, it needs more feet on the street. So, it is on the trail of such, as seen in a steady stream of alliances, partners and showcase customers.

That’s a backdrop to this week’s announcement that Cloudflare Inc will deploy Nvidia GPUs along with Nvidia Ethernet switch ASICs at the Internet’s edge. The purpose is to enable AI inferencing, which is the runtime task that follows AI model training.

“AI inference on a network is going to be the sweet spot for many businesses,” Matthew Prince, CEO and co-founder, Cloudflare, said in a company release concerning the Cloudflare/Nvidia deal. Cloudflare said NVIDIA GPUs will be available for inference tasks in over 100 cities (or hyper-localities) by the end of 2023, and “nearly everywhere Cloudflare’s network extends by the end of 2024.”

Cloudflare has found expansive use in Web application acceleration, and could help Nvidia in its efforts to capitalize on GPU technology’s use in the amber fields of next-wave generative AI applications.

With such alliances, all Nvidia has to do is keep punching out those GPUs – and development tools for model building.

*** *** *** ***

NYT on Human Feedback – The dirty little secret in the rise of machine learning was labeling. Labeling can be human-labor intensive, time-consuming and expensive. It harkens back to the days when ‘computer’ was a human job title, and filing index cards was a way of life.

Amazon’s Mechanical Turk, a crowdsourcing marketplace amusingly named after the 18th-century chess-playing “automaton” that was actually powered by a chess master hidden inside the apparatus, is still a very common way to label machine learning data.

Labeling doesn’t go away as Generative AI happens. As the world delves into what Generative AI is, it turns out that human labelers are a pretty significant part.

That was borne out by some of the research I did in the summer for “LLMs, generative AI loom large for MLOps practices” for SDxCentral.com. Sources for the story also discussed how “reinforcement learning through human feedback” was needed for the Large Language Models underpinning Generative AI.

The cost of reinforcement learning, which makes sure things are working, is more than a small part of the sticker shock C-suite execs are experiencing with Generative AI.

As with everything, improvement may come to the process. Sources suggest retrieval augmented generation (RAG) is generally less labor intensive than data labeling. RAG retrieves info from an external database and provides it to the model “as-is.”

RAG is meant to address one of ChatGPT’s and Generative AI’s most disturbing traits: It can make a false claim with amazingly smug confidence. Humans have to keep a check on it.
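The retrieve-then-prompt pattern described above can be sketched in a few lines. This is a toy under stated assumptions: the three-entry `DOCUMENTS` list stands in for an external database, plain word overlap stands in for the vector similarity search a real system would likely use, and the function names are hypothetical. No model is actually called; the sketch stops at building the prompt.

```python
import re

# Hypothetical knowledge base standing in for an external database.
DOCUMENTS = [
    "Cloudflare said Nvidia GPUs will be available for inference in over 100 cities by the end of 2023.",
    "Amazon Titan Multimodal Embeddings converts images and short text into numerical representations.",
    "Mechanical Turk is a crowdsourcing marketplace often used to label machine learning data.",
]


def tokenize(text):
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question, documents):
    """Return the document sharing the most words with the question.

    Word overlap is an assumption made for brevity; it is a stand-in
    for embedding-based similarity search.
    """
    question_words = tokenize(question)
    return max(documents, key=lambda doc: len(question_words & tokenize(doc)))


def build_prompt(question, documents):
    """Place the retrieved passage, unmodified, ahead of the question."""
    passage = retrieve(question, documents)
    return (
        "Answer using only the context below. "
        "If the context does not contain the answer, say so.\n\n"
        f"Context: {passage}\n\nQuestion: {question}"
    )


if __name__ == "__main__":
    print(build_prompt("How is machine learning data labeled?", DOCUMENTS))
```

The point of the design is that the model is handed the source passage verbatim and told to stay within it, which gives a human reviewer something concrete to check the answer against.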

But the build out of RAG requires some super smarts. As we have come to see, many of today’s headline AI acquisitions are as much about gaining personnel with advanced know-how as they are about gaining some software code or tool. This type of smarts comes at a high price, just as the world’s most powerful GPUs do.

This train of thought is impelled by a recent piece by Cade Metz for the New York Times. “The Secret Ingredient of ChatGPT Is Human Advice” considers reinforcement learning from human feedback, which is said to drive much of the development of artificial intelligence across the industry. “More than any other advance, it has transformed chatbots from a curiosity into mainstream technology,” Metz writes of human feedback aligned with Generative AI.

Metz’s capable piece discusses the role that expert humans are playing in making Generative AI work, with some implication that we should get non-experts involved, too. In response to the story, one Twitter wag suggested that Expert Systems are making a comeback. If so, guess we will have to make do with more expert humans until “The Real AI” comes along! – Jack Vaughan

Large models cooling

July 16, 2023 By Jack Vaughan

Molecular Sampler – The week just passed brought news of a combined MIT/IBM team suggesting a less compute-intensive route to AI-driven materials science. The group said it used a subset of a larger data pool to predict molecular properties. The use case has gained attention in both ML and quantum computing circles – where a drive to speed material development and drug discovery could lead to cost savings, better health outcomes and yet-to-be-imagined innovations.

Like most AI advances of late, the work gains inspiration from NLP techniques. The methods used to predict molecular properties tap into “grammar rule production,” which by now has a long lineage. The number of ways to combine atoms runs to a 1 followed by 100 zeros, which is to say grammar rule production for materials is a big job. That style of computation is daunting and may not be immediately exploited.

Because the grammar rule production process is too difficult even for large-scale modern computing, the research team put its efforts into preparatory paring of data, a short-cut technique that goes back to the beginning of time. Some notes from the MIT information office:

“In language theory, one generates words, sentences, or paragraphs based on a set of grammar rules. You can think of a molecular grammar the same way. It is a set of production rules that dictate how to generate molecules or polymers by combining atoms and substructures.

“The MIT team created a machine-learning system that automatically learns the ‘language’ of molecules, what is known as a molecular grammar, using only a small, domain-specific dataset. It uses this grammar to construct viable molecules and predict their properties.”

As I read it, the MIT-IBM team have come up with a simulation sampler approach. The ‘smaller corpus’ approach is much explored these days as implementers try to take some of the ‘Large’ out of Large Language Models. One may always wonder if such synthesis ultimately can gain true results. I trust an army better qualified will dig into the details of the sampling technique used here over the weekend.

*** *** *** ***

ChatGPT damper – The signs continue to point to a welcome damper on ChatGPT (AI) boosterism – now that each deadline journalist in the world has asked the bot to write up a global heatwave story or Met red-carpet opening story in the style of Hemingway or Mailer or another.

Among the signals of cooling:

*There’s investor Adam Coons. The Chief Portfolio Manager at Winthrop Capital Management said AI on Wall Street will continue but then fade as a hot button.

For a stock market that has endorsed mega-cap growth stocks for their ChatGPT chops, AI has become a FOMO trade. “In the near term that trade will continue to work. There’s enough investors still willing to chase that narrative,” he told Reuters. On the other hand, Coons and Winthrop Capital are cautious on it, as the hyperbole has obscured the true potential. He said:

“We are moving away from the AI narrative. We think that there’s still too much to be shown. Particularly [with] Nvidia, we think the growth figures that are being priced into that stock just don’t make sense. And there’s just really not enough proof statements from a monetization standpoint behind what AI can really do within the tech sector.”

*There’s Pinecone COO Bob Wiederhold speaking at VB Transform – Pinecone is in the forefront of up-surging Vector Databases that appear to have a special place in formative LLM applications. Still, Wiederhold sees a need for a realistic approach to commercializing the phenomenon.

His comments as described by Matt Marshall on VentureBeat:

Wiederhold acknowledged that the generative AI market is going through a hype cycle and that it will soon hit a “trough of reality” as developers move on from prototyping applications that have no ability to go into production. He said this is a good thing for the industry as it will separate the real production-ready, impactful applications from the “fluff” of prototyped applications that currently make up the majority of experimentation.

*There’s Rob Hirschfeld’s commentary “Are LLMs Leading DevOps Into a Tech Debt Trap?” on DevOps.com – Hirschfeld is concerned with the technical debt generative AI LLMs could heap onto today’s DevOps crews, which are already awash in quickly built, inefficiently engineered Patch Hell. Code generation is often the second-cited LLM use case (after direct-mail and press releases).

Figuring out an original developer’s intent has always been the cursed task of those who maintain our innovations – but LLMs have the potential to bring on a new mass of mute code fragments contrived from LLM web whacks. Things could go from worse to worser, all the rosy pictures of no-code LLM case studies notwithstanding. Hirschfeld, who is CEO at infrastructure consultancy RackN, writes:

Since they are unbounded, they will cheerfully use the knowledge to churn out terabytes of functionally correct but bespoke code…It’s easy to imagine a future where LLMs crank out DevOps scripts 10x faster. We will be supercharging our ability to produce complex, untested automation at a pace never seen before! On the surface, this seems like a huge productivity boost because we (mistakenly) see our job as focused on producing scripts instead of working systems…But we already have an overabundance of duplicated and difficult-to-support automation. This ever-expanding surface of technical debt is one of the major reasons that ITOps teams are mired in complexity and are forever underwater.

News is about sudden change. Generative AI, ChatGPT and LLMs brought that in spades. It is all a breathless rush right now, and analysis can wait. But, the limelight on generative AI is slightly dimmed. That is good because what is real will be easier to see. Importantly, reporters and others are now asking those probing follow-up questions like: “How much how soon?”

It’s almost enough to draw an old-time skeptical examiner into the fray. – Jack Vaughan

Adage

“Future users of large data banks must be protected from having to know how the data is organized in the machine….” E.F. Codd in A Relational Model of Data for Large Shared Data Banks