keep moving – a blog on current news and trends in software, hardware and technology

3D ai

HoloPart: Generative 3D Part Amodal Segmentation

Post author By tongwing
Post date May 10, 2025

This research is somewhat related to my early-day work on 3D model segmentation. The difference in approach is that while I am developing an algorithmic method to achieve segmentation, they are tapping on the new generative AI techniques such as attention mechanism and diffusion methods. The results are impressive, as shown in the examples.

generative 3D part amodal segmentation–decomposing a 3D shape into complete, semantically meaningful parts.

Source: HoloPart: Generative 3D Part Amodal Segmentation

ai programming

Amazon introduces SWE-PolyBench, a multilingual benchmark for AI Coding Agents | AWS DevOps & Developer Productivity Blog

Post author By tongwing
Post date April 25, 2025

It is always good to have diversity in benchmarks, to avoid over-reliance and overfitting on one set of benchmarks. AWS just released SWE-PolyBench, their benchmark to evaluate AI coding agents’ ability to navigate and understand complex codebases.

Unlike SWE-Bench, which only works for Python code, SWE-PolyBench is designed to work for additional languages like Java, JavaScript and TypeScript.

Today, Amazon introduces SWE-PolyBench, the first industry benchmark to evaluate AI coding agents’ ability to navigate and understand complex codebases, introducing rich metrics to advance AI performance in real-world scenarios. SWE-PolyBench contains over 2,000 curated issues in four languages. In addition, it contains a stratified subset of 500 issues (SWE-PolyBench500) for the purpose of rapid experimentation. SWE-PolyBench evaluates the performance of AI coding agents through a comprehensive set of metrics: pass rates across different programming languages and task complexity levels, along with precision and recall measurements for code/file context identification. These evaluation metrics can help the community address challenges in understanding how well AI coding agents can navigate through and comprehend complex codebases

Perhaps unsurprisingly, Amazon Q Developer Agent is currently leading this benchmark in the leaderboard. It remains to be seen how well-adopted this new benchmark will be.

cloud

Free data transfer out to internet when moving out of AWS

Post author By tongwing
Post date March 30, 2025

One of the common complaint about using AWS is the egress fees, which can be quite significant if a company wants to transfer lots of data out of the cloud. I happen to chance upon this announcement which I missed last year, that talks about a way for customers to perform data transfer out (DTO) for free.

The caveat is that this offer is only for customers who are migrating away from AWS, and you will need to contact support to make a request – although it says AWS doesn’t require customers to close their accounts. Also, it appears that the “free exit” window is time-limited. Meaning once you get an approval, you should quickly migrate your data before the free DTO window is over.

We believe this choice must include the one to migrate your data to another cloud provider or on-premises. That’s why, starting today, we’re waiving data transfer out to the internet (DTO) charges when you want to move outside of AWS.

– Source

ai cloud

DeepSeek-R1 now available as a fully managed serverless model in Amazon Bedrock

Post author By tongwing
Post date March 11, 2025

In my previous writeup, I wrote that you have to spent a lot (using GPU), or put up with very slow performance (using CPU) if you wanted to use Deepseek R1 on AWS. Not anymore. AWS now offers Deepseek R1 as a base model starting from 10 Mar 2025 (in selected regions). Check out AWS blog on the demo walkthrough.

Just take note that you may have to increase the maximum output length in order to complete your request – this applies to most reasoning models. In my test, output was abruptly stopped halfway, as the default output token length is only 4096. Extending the output length solves the problem.

Cheapest way to run DeepSeek R1 in AWS

Post author By tongwing
Post date February 7, 2025

I wrote about how to run DeepSeek R1 model locally – that is, using your own hardware. But if you do not want to commit USD6K on that, there are other options. Certainly there are many providers now supporting DeepSeek R1 via an API, but those are still running on someone else’s stack – you have to send your data to them and trust that they do the right thing like proper data hygiene. Another option if you don’t have the hardware but want complete control is to run it in the cloud. Here we explore using AWS for this.

GPU Option

DeepSeek is currently not one of the model providers for AWS Bedrock. That does not mean you cannot run it. The official article from AWS suggests 3 ways of running:

The DeepSeek-R1 model in Amazon Bedrock Marketplace
The DeepSeek-R1 model in Amazon SageMaker JumpStart
DeepSeek-R1-Distill models using Amazon Bedrock Custom Model Import

Option 3 is out for this exercise as we are only interested in the full 671B model.

Following the steps in the official article, you can choose DeepSeek-R1 from the model catalog (surprisingly, us-east-1 only has the distilled models. I have to choose us-east-2 for this):

The recommended specs to run is ml.p5e.48xlarge which cost a whopping USD 124.375 per hour for us-east-2 on-demand usage. Needless to say I didn’t proceed with this option. Option 2 has similar cost as it also recommends the same instance type.

CPU Option

Instead of using GPU, I went with the CPU only option. The cheapest instance type with minimum 768GB RAM – recommended to run the full model – is r5a.24xlarge:

Instance name	r5a.24xlarge
On-demand hourly rate	$5.424 (USD)
vCPU	96
Memory	768 GiB
Storage	EBS Only
Network performance	20 Gigabit

Running ollama, the full model (404GB) takes around 35 minutes to download. It appears there was some throttling after 5 minutes of downloading. Loading the model into memory took another 8 minutes before it can be used. After that I asked it the classic strawberry question. So how did it perform (the following video is captured in realtime)?

Not great. Token output is about 0.5-1 token/s.

It runs but the performance is hardly usable for any real world purpose. Then again, do you really need a 671B parameter model for your problem? Maybe you do if you are doing research or tackling problems that require deep understanding. For the common use case out there the 32B or smaller distilled model will probably be fine. And those require much less resources to run.

Conclusion

Running the full DeepSeek R1 model in the cloud is certainly possible, but practicality is another matter. The GPU option, while powerful, comes at an eye-watering cost. The cheapest CPU option, though, suffers from performance issues that make it nearly unusable for real-world applications.

So is it worth running the full 671B model yourself? Unless you have a specific need for such a massive model, it’s likely overkill. For most practical applications, the distilled 32B or smaller versions offer a much more reasonable balance between cost and performance.

Ultimately, while self-hosting DeepSeek R1 gives you full control, the trade-offs in cost and speed mean that for most users, cloud-based API access or a smaller model may be the better choice.

How to run DeepSeek-R1 locally

Post author By tongwing
Post date February 4, 2025

This is about running the full model that has OpenAI o1-level performance, not those distilled models running on Raspberry Pi. A distilled model is like a student learning from a master. It may be able to do some things that the master do, but it is not at the same level.

Someone wrote a post on X/Twitter on how to achieve this in ~USD6000. This is the original post, but if you don’t have an X/Twitter account, you can also use this link to view the whole thread. There are also others who wrote about the same setup here and here so I won’t repeat it here.

This is a video capture of the model’s output in realtime:

This setup is impressive for a few reasons:

CPU-Only Processing – No GPUs are involved
Decent Token Generation Speed – 6-8 tokens per second
Energy Efficient – Operates on <400W of power
Cost-Effective – Cost ~USD6,000, a fraction of the estimated $100,000+ required for a GPU-based setup

The setup is not exactly cheap, but it is within research or hobbyist-level budget. It will be very interesting to see how much more optimization can be done to make it even more affordable without compromising on quality.

cloud

The AWS Management Console now supports simultaneous sign-in for multiple AWS accounts – AWS

Post author By tongwing
Post date January 22, 2025

Finally, AWS has a new feature to support multiple AWS accounts in the same browser easily. The problem faced by people using multiple AWS accounts is that you can only log in to one AWS account console at any one time – at least using the same browser. Workarounds include use of different browsers (eg. one console in Chrome, one in Safari etc), use of incognito mode, or – my favourite – use of browser containers in Firefox, but they all come with limitations. Now you don’t have to do that anymore.

Today, AWS announces multi-session support, which enables AWS customers to access multiple AWS accounts simultaneously in the AWS Console. AWS Customers can sign-in to up to 5 sessions in a single browser, and this can be any combination of root, IAM, or federated roles in different accounts or in the same account.

Uncategorized

Singapore Car Prefix Registration

Post author By tongwing
Post date January 13, 2025

LTA has an interesting PDF showing the list of license plate prefix and registration date of the prefix for Singapore cars. If you want to know how new/old a car is, it is a good reference#.

Based on this PDF, it can be observed that some prefixes which are proper English words are reserved or skipped, eg.

Some reserved prefixes are acronyms of well-known companies:

SBS (Singapore Bus Services)
SDC (Sentosa Development Corp)

However, other skipped or reserved prefixes don’t seem to have a pattern:

I guess they may be acronyms of some entities that are not well known.

Some letters are also omitted, eg. O and I, presumably to avoid confusion with 0 and 1.

Based on the PDF, I asked Claude to do an analysis, to see the rate of car registrations based on prefixes. This is what it generated.

Interestingly, I gave ChatGPT (the non-plus version) the same prompt and it erred out on its first attempt, and simply generated Pandas code for me the second time round. The code doesn’t work as it attempted to extract data from the PDF using regex.

Based on the graph, it seems that some years have extremely high registrations, eg. 2003-2007. That corresponds to a period where prices are relatively low (prices from 2002 till current from sgcharts.com).

#There are exceptions of course, when people register a new license plate for an old car.

ai cloud

How Many ‘Copilots’ Do We Need?

Post author By tongwing
Post date November 15, 2024

Dear companies, please stop using the term “Copilot” for everything. It is honestly confusing and doesn’t help anyone. Imagine a conversation like this:

Person A: Hey, have you tried Copilot? It's honestly amazing!
Person B: Yeah! I was using it to finish writing my Python assignment in half the time!
Person C: I didn't know you can do that? I was using Copilot to setup my AWS ECS services
Person D: Wait what? Isn't Copilot for summarizing emails and drafting Word documents?

¯\_(ツ)_/¯
How many “Copilots” are out there really? Let’s see.

Github Copilot

This should be familiar to developers.

GitHub Copilot is an AI coding assistant that helps you write code faster and with less effort, allowing you to focus more energy on problem solving and collaboration.
From the website

In short: Coding assistant

AWS Copilot

This one is less well-known, unless you are a AWS power user.

AWS Copilot is an open source command line interface that makes it easy for developers to build, release, and operate production ready containerized applications on AWS App Runner, Amazon ECS, and AWS Fargate.
From the website

To be fair, AWS Copilot was launched in 2020, before everyone started using the term to mean some form of AI assistant.

In short: Another AWS command line tool

It gets more confusing from here.

Microsoft Copilot (formerly known as Bing Chat, or Bing AI)

I can’t actually find an official definition for Microsoft Copilot. Lol.
As of Nov 2024, if you do a search and end up in the above page, all you see is a chat interface and page titled “Microsoft Copilot – Your AI companion”.

In short: ChatGPT variant

Note that if you sign in using your work Microsoft account, you will be unable to use Microsoft Copilot. You must either:

Sign in with a personal account, or
Sign out completely, or
Use Microsoft 365 Copilot (see next section)

In addition, the features you can use in (1) and (2) are slightly different. Sounds confusing? Let’s go on.

Microsoft 365 Copilot

The official page for Microsoft 365 Copilot is full of marketing-speak. It doesn’t help that different features are available depending on the region you are in. For the purpose of this article, we’ll go with the US version.

Based on info from the page, Microsoft 365 Copilot is an AI-powered virtual assistant that is integrated into Microsoft 365 apps like Word, Excel, PowerPoint, and Outlook. The gist of what is included is covered in the pricing section – a per-user license is required. When you meet others at work talking about “Copilot”, they are mostly likely referring to this.

In short: Clippy on steroids

Microsoft Dynamics 365 Copilot

There’s not a lot of content on this. Just a blog post from Microsoft. From what I can tell, Microsoft Dynamics 365 Copilot is an add-on to Microsoft Dynamics 365 range of products that deals with CRM and ERP.

In short: A chat interface for Dynamics 365

Microsoft Copilot in Azure

This newly launched feature is in preview, and the official website says: Simplify operations and management from cloud to edge with an AI companion.

In short: FAQ for Azure

Actually there are more, like this and that, and also not forgetting this. And I’m sure I’m missing quite a few others as well. Hopefully we will quickly move past this fad and avoid situations like the hypothetical conversion at the beginning of this article.

Or if you are still confused, you can always ask “Copilot”. 🙂

ai cloud

AWS AI Certifications

Post author By tongwing
Post date July 18, 2024

AWS is launching not one, but two new AI certifications, as demand for AI skills skyrockets.

For those who are unaware, AWS certification comes in 4 levels, roughly in terms of difficulty/professional experience required:

Foundational
Associate
Professional
Specialty

Previously the only AWS AI certification available is their specialty one: AWS Certified Machine Learning – Specialty, which is aimed at data scientists and ML engineers. To fill the intermediate gaps, AWS will be launching the following certifications:

The former is aimed at generalists (business analyst, product or project manager, sales professional) whereas the latter is targeting developers/engineers doing ML work but whom may not be full-time ML specialists.

These new certifications are marked as beta, so expect syllabus and content to change. They are currently priced at a discounted price of
USD75. The usual foundational certifications are USD100 and associate ones are USD150, so this is quite a good offer.

Registration for the new certification opens on 13 Aug 2024 and you can use Skill Builder resources listed in the certification page to prepare for it.