Categories
ai

How to run DeepSeek-R1 locally

This is about running the full model that has OpenAI o1-level performance, not those distilled models running on Raspberry Pi. A distilled model is like a student learning from a master. It may be able to do some of the things the master does, but it is not at the same level.

Someone wrote a post on X/Twitter on how to achieve this for ~USD6,000. This is the original post, but if you don’t have an X/Twitter account, you can also use this link to view the whole thread. Others have written about the same setup here and here, so I won’t repeat the details.

This is a video capture of the model’s output in realtime:

This setup is impressive for a few reasons:

  • CPU-Only Processing – No GPUs are involved
  • Decent Token Generation Speed – 6-8 tokens per second
  • Energy Efficient – Operates on <400W of power
  • Cost-Effective – Cost ~USD6,000, a fraction of the estimated $100,000+ required for a GPU-based setup
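
To put the speed and power figures in perspective, here is a back-of-the-envelope calculation in Python. It uses only the numbers quoted above and treats the <400W ceiling as the actual draw, which is an assumption, so the energy figures are upper bounds rather than measurements.

```python
# Rough arithmetic on the figures quoted above. 400 W is treated as an
# upper bound on power draw, so energy per token is an upper bound too.
POWER_WATTS = 400
TOKENS_PER_SECOND = (6, 8)
CPU_SETUP_COST_USD = 6_000
GPU_SETUP_COST_USD = 100_000  # lower end of the "$100,000+" estimate


def joules_per_token(power_watts: float, tokens_per_second: float) -> float:
    """Energy spent per generated token (a watt is one joule per second)."""
    return power_watts / tokens_per_second


def cost_fraction(cpu_cost: float, gpu_cost: float) -> float:
    """CPU build cost as a fraction of the GPU build estimate."""
    return cpu_cost / gpu_cost


for tps in TOKENS_PER_SECOND:
    energy = joules_per_token(POWER_WATTS, tps)
    print(f"{tps} tok/s: at most {energy:.1f} J per token")

fraction = cost_fraction(CPU_SETUP_COST_USD, GPU_SETUP_COST_USD)
print(f"CPU build costs at most {fraction:.0%} of the GPU estimate")
```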

The setup is not exactly cheap, but it is within research or hobbyist-level budget. It will be very interesting to see how much more optimization can be done to make it even more affordable without compromising on quality.

Categories
cloud

The AWS Management Console now supports simultaneous sign-in for multiple AWS accounts – AWS

Finally, AWS has a new feature to support multiple AWS accounts in the same browser easily. The problem faced by people using multiple AWS accounts is that you can only log in to one AWS account console at any one time – at least using the same browser. Workarounds include use of different browsers (eg. one console in Chrome, one in Safari etc), use of incognito mode, or – my favourite – use of browser containers in Firefox, but they all come with limitations. Now you don’t have to do that anymore.

Today, AWS announces multi-session support, which enables AWS customers to access multiple AWS accounts simultaneously in the AWS Console. AWS Customers can sign-in to up to 5 sessions in a single browser, and this can be any combination of root, IAM, or federated roles in different accounts or in the same account.

 

Categories
Uncategorized

Singapore Car Prefix Registration

LTA has an interesting PDF showing the list of license plate prefixes and the registration date of each prefix for Singapore cars. If you want to know how new/old a car is, it is a good reference#.

Based on this PDF, it can be observed that some prefixes which are proper English words are reserved or skipped, eg.

  • SKY
  • SLY

Some reserved prefixes are acronyms of well-known companies:

  • SBS (Singapore Bus Services)
  • SDC (Sentosa Development Corp)

However, other skipped or reserved prefixes don’t seem to have a pattern:

  • SCB
  • SCC
  • SMB

I guess they may be acronyms of some entities that are not well known.

Some letters are also omitted, eg. O and I, presumably to avoid confusion with 0 and 1.
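
As an illustration of how much of the prefix space that leaves, here is a small Python sketch that enumerates three-letter prefixes starting with S while skipping I and O. It only models the letter-omission rule mentioned above, and it assumes the rule applies to every position after the leading S. It knows nothing about reserved words or acronyms, and it is not LTA's actual allocation logic.

```python
import string
from itertools import product

# Assumption: I and O are dropped from every position after the leading "S".
ALLOWED = [c for c in string.ascii_uppercase if c not in "IO"]


def s_prefixes():
    """Yield three-letter prefixes "S??" built only from the allowed letters."""
    for second, third in product(ALLOWED, repeat=2):
        yield f"S{second}{third}"


prefixes = list(s_prefixes())
print(len(prefixes))      # 24 * 24 combinations
print("SKY" in prefixes)  # generated here, because reservations are not modelled
print("SIA" in prefixes)  # contains I, so it is never generated
```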

Based on the PDF, I asked Claude to do an analysis, to see the rate of car registrations based on prefixes. This is what it generated.

Interestingly, I gave ChatGPT (the non-plus version) the same prompt and it errored out on its first attempt, and simply generated Pandas code for me the second time round. The code doesn’t work as it attempted to extract data from the PDF using regex.

Based on the graph, it seems that some years have extremely high registrations, eg. 2003-2007. That corresponds to a period when prices were relatively low (prices from 2002 to the present are from sgcharts.com).
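
For anyone who wants to reproduce this kind of analysis by hand, one possible approach is to measure the gap between the dates on which consecutive prefixes started. The sketch below is not the code Claude generated. It assumes the (prefix, date) pairs have already been transcribed from the PDF, and the sample rows are placeholders rather than values from the LTA list.

```python
from datetime import date


def days_per_prefix(series):
    """Given [(prefix, first_registration_date), ...] sorted by date,
    return [(prefix, days_until_next_prefix), ...].

    A short gap means the prefix was used up quickly, i.e. registrations
    were running at a high rate during that period.
    """
    gaps = []
    for (prefix, start), (_, next_start) in zip(series, series[1:]):
        gaps.append((prefix, (next_start - start).days))
    return gaps


# Placeholder rows for illustration only; these are NOT values from the LTA PDF.
sample = [
    ("AAA", date(2000, 1, 1)),
    ("AAB", date(2000, 3, 1)),
    ("AAC", date(2000, 3, 31)),
]

for prefix, days in days_per_prefix(sample):
    print(prefix, days)
```

The last prefix in the list has no successor, so it is left out of the result.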

#There are exceptions of course, when people register a new license plate for an old car.

Categories
ai cloud

How Many ‘Copilots’ Do We Need?

Dear companies, please stop using the term “Copilot” for everything. It is honestly confusing and doesn’t help anyone. Imagine a conversation like this:

Person A: Hey, have you tried Copilot? It's honestly amazing!
Person B: Yeah! I was using it to finish writing my Python assignment in half the time!
Person C: I didn't know you can do that? I was using Copilot to set up my AWS ECS services
Person D: Wait what? Isn't Copilot for summarizing emails and drafting Word documents?

¯\_(ツ)_/¯
How many “Copilots” are out there really? Let’s see.

GitHub Copilot

This should be familiar to developers.

GitHub Copilot is an AI coding assistant that helps you write code faster and with less effort, allowing you to focus more energy on problem solving and collaboration.

From the website

In short: Coding assistant

AWS Copilot

This one is less well-known, unless you are an AWS power user.

AWS Copilot is an open source command line interface that makes it easy for developers to build, release, and operate production ready containerized applications on AWS App Runner, Amazon ECS, and AWS Fargate.

From the website

To be fair, AWS Copilot was launched in 2020, before everyone started using the term to mean some form of AI assistant.

In short: Another AWS command line tool

It gets more confusing from here.

Microsoft Copilot (formerly known as Bing Chat, or Bing AI)

I can’t actually find an official definition for Microsoft Copilot. Lol.
As of Nov 2024, if you do a search and end up in the above page, all you see is a chat interface and a page titled “Microsoft Copilot – Your AI companion”.

In short: ChatGPT variant

Note that if you sign in using your work Microsoft account, you will be unable to use Microsoft Copilot. You must either:

  1. Sign in with a personal account, or
  2. Sign out completely, or
  3. Use Microsoft 365 Copilot (see next section)

In addition, the features you can use in (1) and (2) are slightly different. Sounds confusing? Let’s go on.

Microsoft 365 Copilot

The official page for Microsoft 365 Copilot is full of marketing-speak. It doesn’t help that different features are available depending on the region you are in. For the purpose of this article, we’ll go with the US version.

Based on info from the page, Microsoft 365 Copilot is an AI-powered virtual assistant that is integrated into Microsoft 365 apps like Word, Excel, PowerPoint, and Outlook. The gist of what is included is covered in the pricing section – a per-user license is required. When you meet others at work talking about “Copilot”, they are most likely referring to this.

In short: Clippy on steroids

Microsoft Dynamics 365 Copilot

There’s not a lot of content on this. Just a blog post from Microsoft. From what I can tell, Microsoft Dynamics 365 Copilot is an add-on to the Microsoft Dynamics 365 range of products that deal with CRM and ERP.

In short: A chat interface for Dynamics 365

Microsoft Copilot in Azure

This newly launched feature is in preview, and the official website says: Simplify operations and management from cloud to edge with an AI companion.

In short: FAQ for Azure

Actually there are more, like this and that, and also not forgetting this. And I’m sure I’m missing quite a few others as well. Hopefully we will quickly move past this fad and avoid situations like the hypothetical conversation at the beginning of this article.

Or if you are still confused, you can always ask “Copilot”. 🙂

Categories
ai cloud

AWS AI Certifications

AWS is launching not one, but two new AI certifications, as demand for AI skills skyrockets.

For those who are unaware, AWS certification comes in 4 levels, roughly in terms of difficulty/professional experience required:

  • Foundational
  • Associate
  • Professional
  • Specialty

Previously, the only AWS AI certification available was their specialty one: AWS Certified Machine Learning – Specialty, which is aimed at data scientists and ML engineers. To fill the intermediate gaps, AWS will be launching the following certifications:

(Foundational) AWS Certified AI Practitioner
(Associate) AWS Certified Machine Learning Engineer – Associate

The former is aimed at generalists (business analyst, product or project manager, sales professional) whereas the latter targets developers/engineers who do ML work but may not be full-time ML specialists.

These new certifications are marked as beta, so expect the syllabus and content to change. They are currently offered at a discounted price of USD75. The usual foundational certifications are USD100 and associate ones are USD150, so this is quite a good offer.

Registration for the new certification opens on 13 Aug 2024 and you can use Skill Builder resources listed in the certification page to prepare for it.

Categories
Uncategorized

Windows 10 Desktop

Color me impressed. Did you know the Windows 10 desktop wallpaper is an actual physical window? I didn’t, but now I do.

Our approach involved a live-action shoot using different variables and customizations. Our core concept wanted to position the logo as a portal into the world behind it.

Source: Windows 10 Desktop — GMUNK
Categories
ai cloud

Build RAG applications with MongoDB Atlas, now available in Knowledge Bases for Amazon Bedrock | AWS News Blog

More vector database choices for Amazon Bedrock Knowledge Base. This might make sense if you are already using MongoDB Atlas.

MongoDB Atlas vector store in Knowledge Bases for Amazon Bedrock is available in the US East (N. Virginia) and US West (Oregon) Regions. Be sure to check the full Region list for future updates.

Categories
3D programming

Interactive Fluffy Ball

This is a fun little interactive app that showcases the power of shaders in modern browsers. This looks like one of those Three.js demos, but it is actually written in a cross-platform language called Haxe that is able to compile to different platforms including JavaScript, C++, C#, Java, JVM, Python etc.

Source: Marimo

Categories
ai cloud

Amazon Bedrock Knowledge Base – Part 3 (SDK)

Since the last writeup, AWS has added support for Anthropic Claude 3 models to Amazon Bedrock Knowledge Base (ABKB). It has also added the ability to attach your own metadata to your source files so that you can filter on it when querying. For example, you may want to add metadata to a certain set of files to indicate that they are from year 2023. Then during your query, you can include a filter to indicate you only want to use data from year 2023. This gives developers another tool for creating more relevant and targeted queries. Note that filtering is only supported for the FAISS vector engine.
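
Here is a rough sketch of what such a filter could look like in a Boto3 request (Boto3 itself is introduced in the next paragraph). The metadata key `year`, the helper function names and the placeholder IDs are all hypothetical. The `filter` / `equals` shape reflects the `retrieve_and_generate` request syntax as I understand it, so check it against the current Boto3 documentation before relying on it.

```python
def build_filtered_config(knowledge_base_id, model_arn, year):
    """Build a retrieveAndGenerateConfiguration that only uses chunks whose
    (hypothetical) `year` metadata attribute equals `year`."""
    return {
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": knowledge_base_id,
            "modelArn": model_arn,
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    # Assumed filter shape: match a single metadata key exactly
                    "filter": {"equals": {"key": "year", "value": year}}
                }
            },
        },
    }


def ask(question, config, region_name):
    """Send the question with the given configuration (needs AWS credentials)."""
    # Imported lazily so the config builder can be tried without boto3 installed.
    import boto3

    client = boto3.client("bedrock-agent-runtime", region_name=region_name)
    return client.retrieve_and_generate(
        input={"text": question},
        retrieveAndGenerateConfiguration=config,
    )


# Placeholder values; substitute your own knowledge base ID and model ARN.
config = build_filtered_config("_your_kb_id_", "_your_model_arn_", 2023)
print(config["knowledgeBaseConfiguration"]["retrievalConfiguration"])
```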

If you’re looking to integrate ABKB into your code, there are two primary methods: using one of the AWS SDKs or interacting through the HTTP API. In this article, we will be using Boto3, the AWS SDK for Python. Here is a simple example that runs a retrieve-and-generate query using Boto3. This example uses the new Claude 3 Sonnet model.

import boto3
import json

AWS_ACCESS_KEY="_your_access_key_"
AWS_SECRET_KEY="_your_secret_key_"
REGION_NAME="_your_region_"

client = boto3.client('bedrock-agent-runtime',
                      aws_access_key_id=AWS_ACCESS_KEY,
                      aws_secret_access_key=AWS_SECRET_KEY,
                      region_name=REGION_NAME
)

# retrieval and generate
response = client.retrieve_and_generate(
    input={
        'text': 'how to apply for leave'
    },
    retrieveAndGenerateConfiguration={
        'knowledgeBaseConfiguration': {
            'knowledgeBaseId': 'LEBQPJQ9BY',
            'modelArn': 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-3-sonnet-20240229-v1:0',
            'retrievalConfiguration': {
                'vectorSearchConfiguration': {
                    'overrideSearchType': 'HYBRID'
                }
            }
        },
        'type': 'KNOWLEDGE_BASE'
    }
)

# Pretty-print with sorted keys so the response is easier to read
print(json.dumps(response, indent=2, sort_keys=True))

Running the code produces the following output in JSON:

{
  "ResponseMetadata": {
		...trimmed...
  },
  "citations": [
    {
      "generatedResponsePart": {
        "textResponsePart": {
          "span": {
            "end": 705,
            "start": 364
          },
          "text": "...trimmed..."
        }
      },
      "retrievedReferences": [
        {
          "content": {
            "text": "...trimmed..."
          },
          "location": {
            "s3Location": {
              "uri": "s3://...trimmed..."
            },
            "type": "S3"
          }
        }
      ]
    }
  ],
  "output": {
    "text": "To apply for leave as an employee on the Workday mobile app:\n\n1. Navigate to your Workday Mobile Homepage and select 'View Applications' under 'Frequently Used'\n2. Select 'Time Off'\n3. Select the date(s) you want to apply for leave\n4. Select 'Next' and choose the leave type\n5. Select any required reasons or upload attachments if applicable\n6. Submit the request To apply for leave as an employee on the Workday desktop:\n\n1. Go to the Workday Homepage and select the 'Absence' worklet\n2. Under 'Request', select 'Request Absence'\n3. Select the date(s) for the leave and click 'Request Absence'\n4. Choose the leave type\n5. Select 'Next' and provide any required reasons or attachments\n6. Submit the request"
  },
  "sessionId": "c8332417-df3c-41e5-8516-ad38cc09de15"
}
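
In practice you will probably want just the answer and its sources rather than the whole JSON. Below is a minimal sketch of pulling those out of a response shaped like the one above. The helper name and the stand-in response are made up for illustration, and the code assumes the sources are S3 locations as in this example.

```python
def summarize_response(response):
    """Return (answer_text, sorted unique S3 URIs cited) from a
    retrieve_and_generate response shaped like the JSON above."""
    answer = response["output"]["text"]
    uris = set()
    for citation in response.get("citations", []):
        for ref in citation.get("retrievedReferences", []):
            uri = ref.get("location", {}).get("s3Location", {}).get("uri")
            if uri:
                uris.add(uri)
    return answer, sorted(uris)


# Stand-in response with made-up values, mirroring the structure shown above.
example = {
    "output": {"text": "example answer"},
    "citations": [
        {
            "retrievedReferences": [
                {"location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/a.pdf"}}},
                {"location": {"type": "S3", "s3Location": {"uri": "s3://example-bucket/a.pdf"}}},
            ]
        },
        {"retrievedReferences": []},
    ],
}

answer, sources = summarize_response(example)
print(answer)
print(sources)
```

Duplicate URIs are collapsed because several citations often point at the same source file.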

For this simple task there is not much difference in output from the various Claude models. I expect the differences will be more pronounced for complex tasks or those involving a much larger context window.

With this, we conclude the three-part series on Amazon Bedrock Knowledge Base. I have covered everything from creating the knowledge base, testing it in the playground, to executing queries via CLI and SDK. Hopefully this gives a good overview of the processes involved and capabilities of this new service.

Categories
ai cloud

Amazon Bedrock Knowledge Base: Part 2 (CLI)

In Part 1, I showed how you can set up an Amazon Bedrock Knowledge Base (ABKB for short) using the AWS console. I also showed how you can perform queries against the knowledge base via the playground in AWS console. In this article, I will show how you can do the same thing via AWS CLI.

First, make sure you are using the latest version of the CLI. Otherwise some commands might not be available. To see if your CLI supports the commands, run

aws bedrock-agent-runtime help

It should return something like this:

BEDROCK-AGENT-RUNTIME()                                BEDROCK-AGENT-RUNTIME()



NAME
       bedrock-agent-runtime -

DESCRIPTION
       Amazon Bedrock Agent

AVAILABLE COMMANDS
       o help

       o retrieve

       o retrieve-and-generate

Next, make sure you have the access and secret keys configured in AWS CLI. You can do it via the usual aws configure, but I usually do it in a profile since I have many AWS accounts/IAM users, eg. aws configure --profile demo. For convenience, I set up an alias so that every command uses the new profile: alias aws='aws --profile=demo --region=us-east-1'

We can now test the retrieve command in CLI. To run the command, you will need the knowledge base ID. Strangely, there is no way to get this via the CLI 🤷. For now just copy the value from AWS console. Once that is done, you are ready to run the CLI command. Omitting optional/default parameters, this is an example of the simplest version of the command:

aws bedrock-agent-runtime retrieve \
--knowledge-base-id LEBQPJQ9BY \
--retrieval-query '{ "text": "how to apply for leave" }'

Retrieve performs a vector search using the query text and returns a list of matches with a score. As mentioned in Part 1, if you are implementing a custom RAG workflow, you can use the output of retrieve as the context for further prompting. Score ranges from 0-1, 1 being most relevant.

{
    "retrievalResults": [
        {
            "content": {
                "text": "<trimmed>"
            },
            "location": {
                "type": "S3",
                "s3Location": {
                    "uri": "s3://<trimmed>"
                }
            },
            "score": 0.75545114
        },
        {
            "content": {
                "text": "<trimmed>"
            },
            "location": {
                "type": "S3",
                "s3Location": {
                    "uri": "s3://<trimmed>"
                }
            },
            "score": 0.7345349
        },

(Note: output trimmed for brevity and sanitization)
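
If you go down the custom RAG route, a typical next step is to drop low-scoring matches and stitch the rest into a context string. The sketch below assumes the retrieve output has already been parsed into a Python dict. The helper names, the 0.74 cutoff and the chunk texts are arbitrary choices for illustration, while the two scores are the ones shown above.

```python
def filter_by_score(retrieve_response, min_score):
    """Keep results scoring at least `min_score`, best match first."""
    results = retrieve_response.get("retrievalResults", [])
    kept = [r for r in results if r.get("score", 0.0) >= min_score]
    return sorted(kept, key=lambda r: r.get("score", 0.0), reverse=True)


def build_context(results):
    """Join the retrieved chunk texts into one context string for a prompt."""
    return "\n\n".join(r["content"]["text"] for r in results)


# Stand-in for the parsed CLI output; the texts are placeholders.
parsed = {
    "retrievalResults": [
        {"content": {"text": "chunk two"}, "score": 0.7345349},
        {"content": {"text": "chunk one"}, "score": 0.75545114},
    ]
}

best = filter_by_score(parsed, min_score=0.74)
print(len(best))
print(build_context(best))
```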

Next we will test retrieve-and-generate command, which implements the fully managed RAG workflow.

Unlike some other CLI commands which use the model id, you will need the model ARN for querying. There is currently no way to get the model ARN from AWS console, so you will need to get it via another CLI command:

aws bedrock list-foundation-models

Not all models can be used in ABKB – at least for now. Stick to Claude Instant V1, V2, V2.1 and only use ON_DEMAND models. I made the mistake of choosing a PROVISIONED model and all I got was a cryptic error message. Yikes.

An error occurred (ValidationException) when calling the RetrieveAndGenerate operation: 1 validation error detected: Value 'arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-instant-v1:2:100k' at 'retrieveAndGenerateConfiguration.knowledgeBaseConfiguration.modelArn' failed to satisfy constraint: Member must satisfy regular expression pattern: (arn:aws(-[^:]+)?:bedrock:[a-z0-9-]{1,20}:(([0-9]{12}:custom-model/[a-z0-9-]{1,63}[.]{1}[a-z0-9-]{1,63}/[a-z0-9]{12})|(:foundation-model/[a-z0-9-]{1,63}[.]{1}[a-z0-9-]{1,63}([.:]?[a-z0-9-]{1,63}))|([0-9]{12}:provisioned-model/[a-z0-9]{12})))|([a-z0-9-]{1,63}[.]{1}[a-z0-9-]{1,63}([.:]?[a-z0-9-]{1,63}))|(([0-9a-zA-Z][_-]?)+)
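
To avoid picking the wrong kind of model by eye, you could filter the model list in code. The sketch below does the same lookup in Python. The response field names (`modelSummaries`, `modelArn`, `inferenceTypesSupported`) reflect my understanding of the `list_foundation_models` response and should be checked against the Boto3 documentation. The helper names are made up, and the sample entries are trimmed stand-ins built from the two ARNs in this article.

```python
def on_demand_model_arns(model_summaries):
    """Return ARNs of models that list ON_DEMAND among their inference types."""
    return [
        m["modelArn"]
        for m in model_summaries
        if "ON_DEMAND" in m.get("inferenceTypesSupported", [])
    ]


def fetch_model_summaries(region_name):
    """Fetch the model list from Bedrock (needs AWS credentials)."""
    # Imported lazily so the filter above can be tried without boto3 installed.
    import boto3

    client = boto3.client("bedrock", region_name=region_name)
    return client.list_foundation_models()["modelSummaries"]


# Trimmed stand-in entries, not a real API response.
sample_summaries = [
    {
        "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-instant-v1",
        "inferenceTypesSupported": ["ON_DEMAND"],
    },
    {
        "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-instant-v1:2:100k",
        "inferenceTypesSupported": ["PROVISIONED"],
    },
]

print(on_demand_model_arns(sample_summaries))
```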

With the right model ARN in hand, you are ready to execute the retrieve-and-generate command. Here is an example of the command you can execute:

aws bedrock-agent-runtime retrieve-and-generate \
--input '{ "text": "how to apply for leave" }' \
--retrieve-and-generate-configuration '
{
  "knowledgeBaseConfiguration": {
    "knowledgeBaseId": "LEBQPJQ9BY",
    "modelArn": "arn:aws:bedrock:us-east-1::foundation-model/anthropic.claude-instant-v1",
    "retrievalConfiguration": {
      "vectorSearchConfiguration": {
        "overrideSearchType": "HYBRID"
      }
    }
  },
  "type": "KNOWLEDGE_BASE"
}
'

If all goes well, you will get an output like this:

{
    "sessionId": "0ff48086-f26f-4ebd-bb68-7c7bcd1e414a",
    "output": {
        "text": "To apply for leave, navigate to the Workday homepage and select the Absence worklet. Then select \"Request Absence\" and choose the date range and type of leave you want to apply for. You may need to provide additional details or attachments depending on the leave type. Finally, select \"Submit\" to complete the request."
    },
    "citations": [
        {
            "generatedResponsePart": {
                "textResponsePart": {
                    "text": "<trimmed>",
                    "span": {
                        "start": 0,
                        "end": 317
                    }
                }
            },
            "retrievedReferences": [
                {
                    "content": {
                        "text": "<trimmed>"
                    },
                    "location": {
                        "type": "S3",
                        "s3Location": {
                            "uri": "s3://<trimmed>"
                        }
                    }
                },

In an earlier attempt, I included numberOfResults in vectorSearchConfiguration and got an error message. Note that numberOfResults is currently unsupported.

Closing Thoughts

While writing this article, I noted some general observations in terms of CLI/console usage:

  • Use of model id vs model ARN: some CLI commands use model id while others use model ARN
  • Some information can only be found in AWS console (eg. knowledge base id), others only via AWS CLI (eg. model ARN)
  • Inconsistent naming in CLI (eg. --retrieval-query vs --input) and error message (error message refers to numResults while actual field is numberOfResults)

Since ABKB is so new, there are bound to be some rough edges here and there. None of these are showstoppers, and I expect them to clear up over time as the service matures. For now, just be aware that the service is undergoing rapid development and frequent updates.