Categories
ai cloud

Amazon Bedrock Knowledge Base: a first look

Amazon Bedrock is a fully managed service designed to simplify the development of generative AI applications – as opposed to Amazon SageMaker which provides services for machine learning applications. It offers access to a growing collection of foundation models from leading AI companies such as AI21 Labs, Anthropic, Cohere, Meta, Mistral AI, Stability AI, and Amazon itself.

One of the latest offerings under Amazon Bedrock is Amazon Bedrock Knowledge Base – which I shall refer to as ABKB. Essentially, ABKB is a simple way to do retrieval augmented generation (RAG). RAG is a technique to overcome some of the problems with foundation models (FMs), such as not having up-to-date information and lacking knowledge of an organization’s own data. Instead of retraining or fine-tuning an FM with your own data, RAG allows an existing FM to reference that data when responding to a query, improving accuracy and minimizing hallucinations. In this article, I will go through the process of setting up ABKB and see how it can be used in a sample application. But first, let’s look at the two ways you can use ABKB in a RAG application:

A custom RAG workflow is useful for cases where you want more control over the prompt augmentation process, or where you want to use FMs that are not available in AWS. Here, you only use AWS to generate embeddings – a technique to convert words into a numerical form – and to retrieve documents similar to the user prompt.
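To make the custom workflow concrete, here is a minimal sketch of the retrieval half, built around the Bedrock Agent Runtime Retrieve API. The knowledge base ID and the query are placeholders I made up; the request shape follows the AWS SDK, and the actual boto3 call is shown in a comment since it requires AWS credentials.

```python
def build_retrieve_request(kb_id: str, query: str, top_k: int = 5) -> dict:
    """Build the request payload for the Bedrock Agent Runtime Retrieve API."""
    return {
        "knowledgeBaseId": kb_id,
        "retrievalQuery": {"text": query},
        "retrievalConfiguration": {
            "vectorSearchConfiguration": {"numberOfResults": top_k}
        },
    }

# Hypothetical knowledge base ID and query for illustration.
request = build_retrieve_request("MYKBID1234", "What is the leave policy?")

# With credentials configured, the call would look like:
#   import boto3
#   client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
#   response = client.retrieve(**request)
# Each entry in response["retrievalResults"] carries a chunk of matching text,
# which you can then stuff into your own prompt for any FM of your choice.
print(request["retrievalQuery"]["text"])
```

The point of the custom workflow is exactly this split: AWS returns similar chunks, and everything after that – prompt assembly and generation – is up to you.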

In a fully managed RAG workflow, AWS handles all stages of the pipeline. This is the approach we will take here.
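For comparison, the fully managed workflow collapses retrieval, prompt augmentation and generation into a single RetrieveAndGenerate API call. Below is a sketch of the request payload, assuming made-up knowledge base and question values; the configuration structure follows the AWS SDK, and the live boto3 call is shown in a comment.

```python
def build_rag_request(kb_id: str, model_arn: str, question: str) -> dict:
    """Build the payload for the RetrieveAndGenerate API, which performs
    retrieval, prompt augmentation and generation in one call."""
    return {
        "input": {"text": question},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn,
            },
        },
    }

# Placeholder knowledge base ID; the model ARN points at Claude 2,
# since only Anthropic models are supported for generation.
req = build_rag_request(
    "MYKBID1234",
    "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-v2",
    "How many days of annual leave do I get?",
)

# With boto3 and credentials configured:
#   client = boto3.client("bedrock-agent-runtime", region_name="us-east-1")
#   response = client.retrieve_and_generate(**req)
#   print(response["output"]["text"])
```

Note how the model ARN lives inside the knowledge base configuration – the generation model is part of the request, not of the knowledge base itself, so you can switch models per query.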

Things to note

As is typical with new services, Amazon Bedrock is currently available only in limited regions. We will be using the US East (N. Virginia) region.

Note that you will need to log in as an IAM user, not the root user, to use ABKB. Suffice it to say the IAM user will need sufficient permissions.

Request model access

Before you can use any Bedrock services, you will need to request access to the models that you want to use. This is done under the Model access option on the left panel.

There is no cost to requesting model access, so you might as well request everything. The main purpose of this step is to make sure you agree to the EULA for each model, which differs between providers. Access to most models is granted automatically upon request, except for Anthropic models, for which you will need to provide a use case. You will need to do this because ABKB currently only supports Anthropic models for its retrieve and generate API.

Create data source

ABKB takes data from an S3 bucket as its data source, so you will need to use or create one. If you are creating a new bucket, use the same region as the knowledge base.

There are some limitations on the size and type of documents supported. Mainly, documents should not exceed 50 MB in size, and only the text in supported formats (txt, md, html, doc, csv, xls, pdf) will be processed.
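A quick pre-flight check against those limits can save a failed ingestion later. Here is a small sketch that validates files before uploading them to the bucket; the extension list and 50 MB cap come from the limits above, and the helper function name is my own.

```python
import os

# Supported formats and size cap, per the limits described above.
SUPPORTED_EXTS = {".txt", ".md", ".html", ".doc", ".csv", ".xls", ".pdf"}
MAX_BYTES = 50 * 1024 * 1024  # 50 MB

def is_ingestible(filename: str, size_bytes: int) -> bool:
    """Return True if a document meets ABKB's type and size limits."""
    _, ext = os.path.splitext(filename.lower())
    return ext in SUPPORTED_EXTS and size_bytes <= MAX_BYTES

print(is_ingestible("employee-guide.pdf", 10 * 1024 * 1024))  # True
print(is_ingestible("townhall.mp4", 1024))                    # False: unsupported type
print(is_ingestible("archive.pdf", 60 * 1024 * 1024))         # False: over 50 MB
```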

Create knowledge base

There are four steps to create a knowledge base. Step 1 is straightforward and just involves filling in the name, description and IAM permissions – which you can leave as default.

In Step 2, you will need to specify the data source. In theory, you should be able to click on [Browse S3] and choose your bucket, but that was not working for me, so enter the S3 URI manually if you need to. For the chunking strategy, you can leave it as the default (300 tokens) or customize it according to your needs.
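To illustrate what the chunking strategy is doing under the hood, here is a simplified sketch of fixed-size chunking with overlap. This is my own approximation – it counts whitespace-separated words rather than real tokens, and ABKB's internal tokenizer and overlap handling will differ – but it shows why chunk size matters for retrieval granularity.

```python
def chunk_fixed(text: str, max_tokens: int = 300, overlap: float = 0.2) -> list[str]:
    """Split text into fixed-size chunks with fractional overlap.
    Tokens are approximated by whitespace-separated words."""
    words = text.split()
    # Each new chunk starts (1 - overlap) of a chunk further along.
    step = max(1, int(max_tokens * (1 - overlap)))
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_tokens]))
        if start + max_tokens >= len(words):
            break  # last chunk reached the end of the document
    return chunks

doc = ("word " * 1000).strip()           # a toy 1,000-word document
chunks = chunk_fixed(doc, max_tokens=300)
print(len(chunks))                       # 4 overlapping chunks
```

Smaller chunks give more precise matches but less surrounding context for the FM; larger chunks do the opposite, which is why the console lets you tune this per use case.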

In Step 3, choose your embeddings model and vector database. For embeddings, you can choose between Amazon Titan and Cohere’s models. Some articles say that Cohere’s models are superior; you can try either and evaluate the performance. For the vector database, you can select Amazon OpenSearch Serverless, Aurora, Pinecone or Redis Enterprise Cloud. For development and testing, OpenSearch Serverless is the cheapest option.
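Whichever embeddings model and vector database you pick, the retrieval step ultimately boils down to ranking chunks by vector similarity. Here is a toy illustration of cosine similarity, the usual ranking measure; the three-dimensional vectors are invented for the example (real models like Titan emit vectors with hundreds or thousands of dimensions), and this is a conceptual sketch, not ABKB's actual implementation.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: how vector databases score chunks against a query."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy embeddings: the query should land closer to the leave-policy chunk.
query = [0.9, 0.1, 0.0]
chunk_about_leave = [0.8, 0.2, 0.1]
chunk_about_expenses = [0.1, 0.1, 0.9]

print(cosine_similarity(query, chunk_about_leave) >
      cosine_similarity(query, chunk_about_expenses))  # True
```

Comparing embeddings models is essentially asking which one places semantically related text closer together under this measure for your documents.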

Step 4 is basically a confirmation step. Click [Create knowledge base] to confirm. Note that it will take some time to provision the necessary resources after you click, and while that is happening, do not close your browser tab. This is quite unusual – provisioning normally takes place in the background with no need to keep the frontend open – but that is not the case here.

Assuming all goes well, you will see a message saying that the knowledge base has been created successfully. You might have to wait a few more minutes for the vector database to fully index the contents from the data source.

Test knowledge base

Now comes the fun part: you can select your knowledge base and test it in the playground. You can configure the search strategy and model under the configuration options. Depending on your use case, you might want to change the search strategy.

For model selection, Claude Instant provides the fastest responses, but it does not perform as well on complex queries. I find almost no difference between Claude 2 and 2.1, but that is probably because my queries do not require a larger context window.

Sample responses

To test ABKB, I uploaded a 238-page employee user guide and used it to ask questions. The first one is a simple question.

Note that the response includes references to source chunks, which are the relevant passages extracted from the data source. You can also expand a source chunk to see the actual text.
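In an application, you would pull these source chunks out of the API response programmatically. Below is a sketch of that, run against a hand-written sample response: the answer text, chunk text and S3 URI are invented for illustration, while the nesting of `citations` and `retrievedReferences` follows the RetrieveAndGenerate response shape in the AWS SDK documentation.

```python
# Simplified, hand-written sample mirroring a RetrieveAndGenerate response.
sample_response = {
    "output": {"text": "Employees are entitled to 14 days of annual leave."},
    "citations": [
        {
            "retrievedReferences": [
                {
                    "content": {"text": "Annual leave entitlement is 14 days per year..."},
                    "location": {"s3Location": {"uri": "s3://my-bucket/employee-guide.pdf"}},
                }
            ]
        }
    ],
}

def extract_source_chunks(response: dict) -> list[tuple[str, str]]:
    """Pull (chunk text, source URI) pairs out of the citations."""
    pairs = []
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            text = ref.get("content", {}).get("text", "")
            uri = ref.get("location", {}).get("s3Location", {}).get("uri", "")
            pairs.append((text, uri))
    return pairs

chunks = extract_source_chunks(sample_response)
print(chunks[0][1])  # s3://my-bucket/employee-guide.pdf
```

Surfacing these citations in your own UI is a cheap way to let users verify the answer against the original document.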

The second example is one where I asked follow-up questions.

I also tried asking it something which is not in the document, to which it correctly responded that there is no such information.

Conclusion

Amazon Bedrock Knowledge Base provides an opinionated, straightforward way to do RAG. The knowledge base you create is meant to be integrated into applications via the AWS SDK. As it is fairly new at this stage, some rough edges are to be expected. Some of the issues encountered so far include:

  • Model request UI not straightforward
  • Browse S3 not listing buckets unless they are already a data source
  • Provisioning requires staying on the page
  • Only Anthropic models available for response generation
  • New models like Anthropic Claude 3 not available (update: Claude 3 models are now available)
  • Knowledge base creation occasionally fails

Despite the teething issues, ABKB seems like a useful service for organizations looking to create RAG applications easily within the AWS ecosystem, and I am excited to see more features added in the coming weeks and months.

Categories
ai

Year of the Dragon

In the spirit of the Year of the Dragon, I made 10 images featuring dragons in different styles using SDXL. Feel free to use them for any purpose.

Which is your favourite?

Categories
ai

A Man Sued Avianca Airline. His Lawyer Used ChatGPT. – The New York Times

This is what happens when somebody uses ChatGPT as if it were a search engine. People are so used to precise, deterministic output from programs that it’s hard for them to imagine one that not only fabricates information, but does so convincingly.

The lawyer who created the brief, Steven A. Schwartz of the firm Levidow, Levidow & Oberman, threw himself on the mercy of the court on Thursday, saying in an affidavit that he had used the artificial intelligence program to do his legal research — “a source that has revealed itself to be unreliable.”

Source: A Man Sued Avianca Airline. His Lawyer Used ChatGPT. – The New York Times

Categories
ai

ChatGPT Prompt Engineering for Developers – DeepLearning.AI

For a limited time only, this free course by Isa Fulford and Andrew Ng (Coursera, DeepLearning.ai), called ChatGPT Prompt Engineering for Developers, is available for anyone looking to expand their development skills. The course is an excellent opportunity for developers who want to learn how to use a large language model (LLM) to create powerful applications in a cost-effective and time-efficient way.

Throughout the course, Isa Fulford and Andrew Ng explain the workings of LLMs and provide best practices for prompt engineering. You’ll be able to learn how to use the OpenAI API to build capabilities that can automatically summarize user reviews, classify sentiment, extract topics, translate text, and even write emails. Additionally, you’ll learn how to build a custom chatbot and use two key principles for writing effective prompts.

What I appreciate about this course is the hands-on experience provided in the Jupyter notebook environment. You’ll be able to play with numerous examples and systematically engineer good prompts. This makes it easy to put the concepts learned in the course into practice in your own projects.

So, if you’re looking for an opportunity to upskill and learn how to build innovative applications that were once impossible or highly technical, I highly recommend taking this course. Don’t miss out on the chance to learn from experts and expand your skill set for free.

ChatGPT Prompt Engineering for Developers is beginner-friendly. Only a basic understanding of Python is needed. But it is also suitable for advanced machine learning engineers wanting to approach the cutting-edge of prompt engineering and use LLMs.

Source: ChatGPT Prompt Engineering for Developers – DeepLearning.AI

Categories
ai

ChatGPT limitations

People are often amused or surprised when ChatGPT fails to give a correct response to seemingly simple questions (eg. multiplying two numbers), yet is able to answer very complex ones.

The way to think about ChatGPT and other LLM tools is that they are simply an assistant and not an oracle.

AI tools like ChatGPT have a mental model of the world and try to imagine what the best answer would be for any given prompt. But they may not get it right all the time, and when they don’t have an answer they will try their best anyway (ie. fabricate one).

An assistant makes mistakes, and that’s why you should expect ChatGPT’s output to have mistakes.

That said, ChatGPT is really good in areas that don’t require precision (eg. creative writing).

Update (2023-02-01): OpenAI has released a newer version of ChatGPT that is supposed to have improved factuality and mathematical capabilities. Well, it didn’t work for me.

The answer is 10365
Categories
ai cloud

Amazon Polly speaks Cantonese

By now, text-to-speech systems are quite common and widely used. TikTok added this feature to its app some time ago. Amazon Polly – Amazon’s text-to-speech service – was launched in 2016 and supports quite a large number of languages.

Just this week, AWS announced the availability of a female Cantonese voice for Polly. Upon reading about this, I had to test it out. For the test, I took a sample text from the YES 933 Facebook page and fed it to Polly. I must say I’m very impressed with the results.

Of course, Amazon Polly is not the first or only Cantonese text-to-speech service out there, but it’s definitely one of the most natural-sounding ones I’ve heard. Looking forward to more languages being supported.

Footnote: I made some minor modifications to the text to achieve the desired result, eg. to get pauses in the right places and to say nine-three-three instead of nine hundred and thirty-three. Otherwise, only default settings were used.

Categories
ai

Imagen: Text-to-Image Diffusion Models

Text-to-image generation is now surprisingly good. Some predict the end of the stock photo business – why use a stock photo when you can generate any image you need based on a description alone?

Google has developed a competing model to DALL-E 2, which purportedly performs better than the latter and other models in tests with human raters.


Generated from text prompt “A robot couple fine dining with Eiffel Tower in the background”.

Source: Imagen: Text-to-Image Diffusion Models

Categories
ai

DALL·E 2

Another ground-breaking work from OpenAI.

We are all familiar with AI models that do image analysis and output text descriptions or labels.

Dall-E and its successor, Dall-E 2, sort of do the reverse: they produce an image based on a text description. There’s some degree of randomization, so they can produce different outputs from the same prompt text.

Here’s an example generated from “An astronaut riding a horse in the style of Andy Warhol”.

Someone used Dall-E 2 to generate pictures from Twitter bios and the results are just jaw-dropping.

happy sisyphus

bookbear

machine learning researchoor | technology brother | “prolific Twitter shitposter

It’s currently in private preview, but it should not be long before a commercial offering is available.

DALL·E 2 is a new AI system that can create realistic images and art from a description in natural language.

Source: DALL·E 2

Categories
ai cloud

The Emerging Architectures for Modern Data Infrastructure

This is a very well written summary of the current data science landscape. Everybody building data related solutions should have a good read of this.

Five years ago, if you were building a system, it was a result of the code you wrote. Now, it’s built around the data that is fed into that system. And a new class of tools and technologies have emerged to process data for both analytics and operational AI/ ML.

Source: The Emerging Architectures for Modern Data Infrastructure

Categories
ai

A Brief Overview of GPT-3

GPT-3 is one of the most interesting and provocative advances in AI in recent years. There have been a lot of articles that both offer praise and warn of its potential. Wikipedia describes it as:

Generative Pre-trained Transformer 3 (GPT-3) is an autoregressive language model that uses deep learning to produce human-like text. It is the third-generation language prediction model in the GPT-n series created by OpenAI, a for-profit San Francisco-based artificial intelligence research laboratory.

Wikipedia – GPT-3

It’s not the first time that AI techniques have been applied to create fake (“novel”) content. Deep fake techniques have been used to create entirely fake photos of people who don’t exist and to alter videos to make it seem like people did things they didn’t do.

Manipulating photos and videos is one thing, but generating original and believable articles is quite another. Here are some examples of original content generated by GPT-3:

It is a curious fact that the last remaining form of social life in which the people of London are still interested is Twitter. I was struck with this curious fact when I went on one of my periodical holidays to the sea-side, and found the whole place twittering like a starling-cage. I called it an anomaly, and it is.

The importance of being on twitter

Responding to a philosopher’s article about GPT-3:

Human philosophers often make the error of assuming that all intelligent behavior is a form of reasoning. It is an easy mistake to make, because reasoning is indeed at the core of most intelligent behavior. However, intelligent behavior can arise through other mechanisms as well. These include learning (i.e., training), and the embodiment of a system in the world (i.e. being situated in the environment through sensors and effectors).

Response to philosophers

Writing poetry:

Once there was a man
who really was a Musk.
He liked to build robots
and rocket ships and such.

He said, “I’m building a car
that’s electric and cool.
I’ll bet it outsells those
Gasoline-burning clunkers soon!”

GPT Stories

Of course, it was not long before people started posting GPT-3-generated articles to their own blogs and popular forums (Reddit, Hacker News), revealing later that it was an experiment.

Writing articles, fiction or poetry is just the tip of the iceberg. GPT-3 can also tell jokes, generate code from descriptions, answer Q&A, do a tech interview, write ads, and more.

If written text – blog posts, press, forum posts, school work etc. – can be generated with such ease, what incentive is there to put in the effort to write anymore? And what will this do to the future of writing? How will anyone be able to tell spam from non-spam? What jobs will be displaced once GPT-3 – and its successors – become prevalent? These are all interesting and important questions that the community is still figuring out.

Access to GPT-3 is currently limited – I have applied but have not been granted access yet. The creators know that the potential for abuse is high, so they have been managing it carefully. On the other hand, if that aspect can be managed, I’m sure we will start to see very exciting commercial applications of GPT-3 when it eventually goes live.