LLM in Practice by Accumulation Point

LLM blog posts and term-definitions: a practical collection

LLM ecosystem tools

Posted on 12th March 2024.

LLM tools

If you've spent any time looking into integrating LLMs within your business, you've surely noticed the wide variety of frameworks, packages, and tools available. Beyond the models (LLMs), you'll find vector stores, embedding models, UI tools, general language frameworks, guardrail frameworks, semantic routers, agent frameworks, and many more types of tools.

These include LangChain, LlamaIndex, Ollama, ChromaDB, AutoGen, DSPy, Nomic embeddings, NeMo-Guardrails, Semantic Router, Gradio, QLoRA, and many others, all of which are here to support your journey in integrating LLMs in your business. Sometimes these tools compete with each other; at other times they serve different purposes and can be integrated to complement each other. One thing is certain: all the tools in this space are evolving very quickly. So how can you find your bearings?

Even if you momentarily ignore such tools, things may already seem complicated enough when you need to choose an LLM. Options include LLMs from OpenAI, Google, Anthropic, Meta, Mistral, Stability AI, Qwen, Falcon, and others. Each comes with its pros and cons across pricing, performance, licensing, and computational demands. The tools we discuss in this blog post are not these LLMs but rather the peripherals around the one or more LLMs that you would use. Such tools are designed to support your LLM-based business applications.

As a rough analogy, you can think of an LLM (say GPT-4 or Mixtral) as a CPU, and of all these other tools as peripherals that enable the CPU to deliver effective applications. Let us now group these tools into broad categories, explain what each category is, and name the few most popular alternatives in each case. We hope that by the end of this post, you will have a clearer view of the quickly evolving space of LLM tools.

Two example applications

In describing these tools it will be useful to have some examples in mind. To this end, we will describe two example LLM-based applications that you may want developed for your business. Let us call one of the applications a private business LLM, which is typically implemented via the retrieval-augmented generation (RAG) paradigm. Here, proprietary documents and data of your business are securely exposed to an internally available LLM chatbot, which your staff can use to ask questions informed by your private data.

Let us call the other application a workflow automation LLM where workflows of your team integrate with LLM work. Here multiple LLMs, sometimes called agents, interplay to drive an outcome. This can be used for market research report generation, preparation of tender applications, creation of complex customer reports, and other such tasks.

Our focus in this blog post is not to outline complete details of private business LLMs or workflow automation LLMs, but rather to briefly describe and categorize many of the tools that can be used for such purposes. We shall just keep these two applications in mind as we outline the tools.

Vendor extensions, out of the box solutions, and programming based frameworks

Some tools are vendor extensions in the sense that a main LLM vendor supplies the tools. For example, OpenAI's GPTs fall into the category of vendor extensions, since you can try to create an LLM-based application within this GPTs framework, directly supplied by the vendor. However, to date, vendor extensions are quite limited, and they do not suffice for really integrating applications such as our suggested private business LLM or workflow automation LLM within your business.

Other tools are out of the box solutions which you can download and run on your infrastructure or on secure third party infrastructure. For example, AnythingLLM is a customizable document chatbot built for anyone who wants to "intelligently chat with" their existing documents or build knowledge bases from them. Perhaps down the road, applications such as our example private business LLM and workflow automation LLM could be directly deployed by configuring tools such as AnythingLLM or its competitors. However, it is still early days in the LLM world. So to get the performance, safety, accuracy, and tailor-made behaviors they need, most businesses still have to create their own custom solutions.

This brings us to the third category of tools: those that are programming based. This includes libraries and packages that are mostly interfaced via Python, but also sometimes via JavaScript/TypeScript. At the moment there are several dozen such tools and frameworks. The upside of programming based frameworks is that they can typically be connected quickly into workable applications such as our private business LLM or workflow automation LLM examples. The downside is that they do require development expertise.

With these three kinds of tools in mind, let us now go through the categories of tools, one after the other. As we do so, we will try to highlight the main tools of each category and discuss which of them are vendor extensions, which are out of the box solutions, which are programming based, and which are of some other nature.

Software Tools

Ecosystem tools

Now let's look at some tools. We've broken up the categories into general frameworks, agent frameworks, embeddings, vector databases, model runners, open (free) LLMs, guardrails/moderation, semantic routers, UI/IDE development tools, front-end frameworks, and fine tuning and quantization.

Below we briefly describe each of these and in each category we link to the few most popular tools (last updated March 12, 2024).

General frameworks

If an LLM is metaphorically a CPU, then a framework like LangChain, LlamaIndex, or Haystack is metaphorically a motherboard. These frameworks wrap LLM execution, provide the LLM with filtered documents, create prompts, post-process LLM output, and much more. For an application, one would typically use one of these three and not all three in conjunction.
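
As a rough illustration of the steps such a framework chains together, here is a minimal sketch in plain Python. It does not use the API of LangChain, LlamaIndex, or Haystack. The names `retrieve`, `build_prompt`, `fake_llm`, and `answer` are hypothetical, the word-overlap filter stands in for a real retriever, and `fake_llm` stands in for a real model call.

```python
import re
from typing import Callable, List


def words(text: str) -> set:
    """Lowercase a text and split it into a set of words."""
    return set(re.findall(r"\w+", text.lower()))


def retrieve(question: str, documents: List[str], top_k: int = 2) -> List[str]:
    """Toy document filter: keep the documents sharing the most words with the question."""
    question_words = words(question)
    scored = [(len(question_words & words(doc)), doc) for doc in documents]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for score, doc in scored[:top_k] if score > 0]


def build_prompt(question: str, context: List[str]) -> str:
    """Create the prompt that is handed to the model."""
    context_block = "\n".join(f"- {doc}" for doc in context)
    return (
        "Answer using only this context:\n"
        f"{context_block}\n\nQuestion: {question}\nAnswer:"
    )


def fake_llm(prompt: str) -> str:
    """Hypothetical stand-in for a real LLM call (an assumption of this sketch)."""
    return f"  (model output for a prompt of {len(prompt)} characters)  "


def answer(question: str, documents: List[str],
           llm: Callable[[str], str] = fake_llm) -> str:
    context = retrieve(question, documents)    # provide filtered documents
    prompt = build_prompt(question, context)   # create the prompt
    raw_output = llm(prompt)                   # wrap LLM execution
    return raw_output.strip()                  # post-process the output
```

A real framework replaces each of these four steps with configurable, production-grade components, but the overall flow is similar.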

Another related framework is DSPy where the focus is on algorithmically optimizing LLM prompts and weights, especially when LLMs are used one or more times within a pipeline.

Agent frameworks

An agent framework orchestrates multiple LLMs operating in parallel or in sequence. Popular frameworks are AutoGen, CrewAI, TaskWeaver, and Qwen-Agent. Agent frameworks can be useful for the workflow automation LLM application.
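
To make the idea of agents running in sequence concrete, here is a minimal sketch in plain Python. It is not the API of AutoGen, CrewAI, TaskWeaver, or Qwen-Agent. The `Agent` class, `run_sequence`, and the two stub model functions are hypothetical names introduced for illustration, and the stubs merely tag their input instead of calling a real LLM.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Agent:
    name: str
    instructions: str
    llm: Callable[[str], str]  # any function mapping a prompt to a reply

    def run(self, task: str) -> str:
        # Each agent sees its own instructions plus the incoming work.
        return self.llm(f"{self.instructions}\n\nInput:\n{task}")


def run_sequence(agents: List[Agent], task: str) -> str:
    """Pass the task through the agents in order; each receives the previous output."""
    result = task
    for agent in agents:
        result = agent.run(result)
    return result


# Hypothetical stand-ins for real model calls: they tag the last prompt line.
def researcher_llm(prompt: str) -> str:
    return "notes: " + prompt.splitlines()[-1]


def writer_llm(prompt: str) -> str:
    return "report: " + prompt.splitlines()[-1]


researcher = Agent("researcher", "Collect key facts on the topic.", researcher_llm)
writer = Agent("writer", "Turn the notes into a short report.", writer_llm)
```

Calling `run_sequence([researcher, writer], "EV market")` hands the researcher's notes to the writer, which is the basic pattern behind a report-generation workflow.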

Embeddings

The embedding is a central notion in AI systems, and in LLMs in particular. In a nutshell, a document or part of it can be summarized as a vector of real numbers called the embedding, or embedding vector. Comparing embeddings then reveals which documents or parts of documents convey similar semantic meaning and which do not. This is a critical part of the private business LLM and many other applications. OpenAI's embeddings are one popular non-free choice, and another recent popular choice is Nomic, which offers a hosted API and has also released open weights. One can also find dozens of free embedding models on Hugging Face.
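
A common way to compare two embeddings is cosine similarity. The sketch below uses three made-up, three-dimensional vectors purely for illustration; real embedding models return vectors with hundreds or thousands of dimensions, and the values here are assumptions rather than the output of any model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two vectors: close to 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy embeddings (invented numbers, not produced by a real embedding model).
invoice_text = [0.9, 0.1, 0.0]
billing_text = [0.8, 0.2, 0.1]
holiday_text = [0.0, 0.2, 0.9]

similar = cosine_similarity(invoice_text, billing_text)
dissimilar = cosine_similarity(invoice_text, holiday_text)
```

Here `similar` comes out much larger than `dissimilar`, which is how a system would conclude that the invoice and billing texts are semantically close while the holiday text is not.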

Vector databases

Once we have embeddings associated with chunks of text, we often want to store them in a database. For this there are dedicated platforms called vector databases. A conventional database can also store embedding vectors; however, these specialized databases additionally allow one to efficiently find the nearest embedding vectors to any given query. This allows one, for example, to find documents related to a query. Some of the basic ideas behind such databases come from Faiss, and popular vector databases include ChromaDB, Pinecone, Weaviate, Milvus, and Qdrant.
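
The core operation, nearest-neighbor lookup, can be sketched with a tiny in-memory store. `TinyVectorStore` is a hypothetical class written for this post and not the API of any of the databases above. It scans every stored vector, whereas real vector databases use specialized indexes to keep queries fast over millions of vectors.

```python
import heapq
import math
from typing import Dict, List, Sequence


class TinyVectorStore:
    """Brute-force vector store: fine for a demo, far too slow at scale."""

    def __init__(self) -> None:
        self._vectors: Dict[str, Sequence[float]] = {}

    def add(self, item_id: str, vector: Sequence[float]) -> None:
        self._vectors[item_id] = vector

    def query(self, vector: Sequence[float], k: int = 3) -> List[str]:
        """Return the ids of the k stored vectors closest to the query (Euclidean distance)."""
        return heapq.nsmallest(
            k,
            self._vectors,
            key=lambda item_id: math.dist(vector, self._vectors[item_id]),
        )


store = TinyVectorStore()
store.add("chunk-a", [0.0, 0.0])
store.add("chunk-b", [1.0, 1.0])
store.add("chunk-c", [5.0, 5.0])
nearest = store.query([0.9, 0.9], k=2)
```

In a private business LLM, the ids returned by such a query would point to the text chunks that get passed to the model as context.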

Model runners

Running an LLM locally on your laptop, or on managed local infrastructure, is entirely viable. The Ollama application presents one alternative for this and another is LM Studio. The models we use with such model runners will almost always be open (free and open source) LLMs, as non-free models cannot be downloaded and run locally.
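
As a hedged sketch of what talking to a model runner looks like, the snippet below builds a request for Ollama's local REST endpoint using only the Python standard library. It assumes an Ollama server listening on its default address `http://localhost:11434` and a model that has already been pulled; the model name passed in (for example `"mistral"`) is an assumption, and the helper names `build_generate_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumed default address of a locally running Ollama server.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_generate_request(model: str, prompt: str) -> urllib.request.Request:
    """Build (but do not send) a non-streaming text generation request."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(model: str, prompt: str) -> str:
    """Send the request. This only works if the server is running and the model is available."""
    with urllib.request.urlopen(build_generate_request(model, prompt)) as response:
        return json.loads(response.read())["response"]
```

With the server running, `generate("mistral", "Summarize our leave policy in one sentence.")` would return the model's reply as a string. Nothing leaves your machine, which is the main appeal of a model runner for private data.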

Open (free and open sourced) LLMs

There are many open LLMs, many of which have fine-tuned versions uploaded to Hugging Face. Some of the most popular open LLMs are Llama 2 by Meta, Mistral models such as Mistral 7B and the strong Mixtral 8x7B, Google's Gemma, Stability AI's Stable LM Zephyr 3B, Qwen1.5, and the massive Falcon 180B.

See also the open sourced multi-modal model LLaVA.

Guardrails/moderation

Without precautions, LLMs can return harmful or offensive results. To address this there are auxiliary models and techniques generally known as guardrails or moderation models. OpenAI provides some moderation models, and for open alternatives there are NeMo-Guardrails and Guardrails AI.
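
The basic pattern is to check both what goes into the model and what comes out of it. The sketch below is a deliberately simple illustration of that pattern and not how NeMo-Guardrails, Guardrails AI, or OpenAI's moderation models work internally. The blocked terms, the refusal text, and the function names are all hypothetical; real guardrails typically rely on dedicated models rather than keyword lists.

```python
from typing import Callable

# Hypothetical policy: a real system would use a moderation model, not keywords.
BLOCKED_TERMS = ("password", "credit card")
REFUSAL = "Sorry, I can't help with that."


def violates_policy(text: str) -> bool:
    """Return True if the text mentions any blocked term."""
    lowered = text.lower()
    return any(term in lowered for term in BLOCKED_TERMS)


def guarded_call(llm: Callable[[str], str], user_message: str) -> str:
    """Wrap a model call with an input check and an output check."""
    if violates_policy(user_message):   # input rail
        return REFUSAL
    reply = llm(user_message)
    if violates_policy(reply):          # output rail
        return REFUSAL
    return reply
```

Any function that maps a prompt to a reply can be passed as `llm`, so the same wrapper works around a hosted model or a locally run one.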

Semantic routers

A semantic router is a lightweight decision layer, typically based on embeddings or a smaller model, that decides how to dispatch or route messages between multiple LLMs or tools. One popular package is aptly called Semantic Router.
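
One way to build such routing is to compare an incoming message against a few example utterances per route. The sketch below is a toy illustration and does not use the API of the Semantic Router package. The route names, example utterances, and threshold are assumptions, and the word-count `embed` function is a stand-in for a real embedding model.

```python
import math
import re
from collections import Counter
from typing import Optional

# Hypothetical routes, each described by a few example utterances.
ROUTES = {
    "billing": ["how do I pay my invoice", "I was charged twice"],
    "tech_support": ["the app keeps crashing", "I cannot log in"],
}


def embed(text: str) -> Counter:
    """Toy embedding: word counts. A real router would use an embedding model."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def similarity(a: Counter, b: Counter) -> float:
    """Cosine similarity between two word-count vectors."""
    dot = sum(a[word] * b[word] for word in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def route(message: str, threshold: float = 0.3) -> Optional[str]:
    """Return the best matching route name, or None if nothing is similar enough."""
    query = embed(message)
    best_route, best_score = None, 0.0
    for name, examples in ROUTES.items():
        for example in examples:
            score = similarity(query, embed(example))
            if score > best_score:
                best_route, best_score = name, score
    return best_route if best_score >= threshold else None
```

A message that matches no route well enough returns `None`, which an application could treat as "send to the general-purpose LLM".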

UI/IDE development tools

LLM applications built with tools such as those mentioned above are becoming popular at a phenomenal rate. As a result, there are now even graphical platforms that aim to reduce the amount of programming required. See, for example, FlowiseAI and VectorShift.

Front-end frameworks

And speaking of graphical interfaces, when we want to present our LLM-based applications with a front end, some popular frameworks include Gradio, Streamlit, and Chainlit.

Fine tuning and quantization

For many LLM applications we never need to actually train or fine-tune an LLM. However, in certain cases we may want to do so. Two popular tools here are QLoRA, for memory-efficient fine-tuning, and GPTQ, for quantization.

Wrapping up

As of March 2024 there are already hundreds of frameworks in the LLM ecosystem. As time progresses these tools will continue to evolve, and it will become more evident which are the most suitable for any given task. As you plan your LLM business application, be it a private business LLM, a workflow automation LLM, or anything else, it is important to be aware of the various tools available so that you can decide on the right combination for the task at hand. Stay tuned for more posts where we describe in greater detail how a private business LLM can be implemented (and survey a few of the tools), and likewise for a workflow automation LLM.