Why choosing the right AI model is important

Photo by Igor Omilaev / Unsplash

The SaaS I'm currently building integrates with the OpenAI API to perform various tasks. I'd consider most of these tasks quite straightforward. We're not passing massive amounts of data to OpenAI and we aren't getting much data back, so our token usage is quite low. The tasks do, however, play an important part in the platform, so it's vital that they perform well.

I've recently been monitoring our OpenAI usage and, in particular, cost. I had an idea for reducing it so we could spend less but keep the same functionality. I wanted to tackle this early on, before it becomes a problem as more customers onboard onto the platform.

All of the jobs in the application that interact with OpenAI were using GPT-4.1 mini as the model. I tested each one and gathered the output. This model's pricing is $0.40 / 1M tokens for input and $1.60 / 1M tokens for output. Whilst those prices are fine, I wanted to see if we could use GPT-4.1 nano instead, which is much cheaper at $0.10 / 1M tokens for input and $0.40 / 1M tokens for output.
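To put those prices in perspective, here is a minimal sketch of the arithmetic in Python. The per-token prices are the ones quoted above; the monthly token volumes are made-up numbers purely for illustration, not our real usage.

```python
# Prices in USD per 1M tokens, taken from the figures quoted in this post.
PRICES = {
    "gpt-4.1-mini": {"input": 0.40, "output": 1.60},
    "gpt-4.1-nano": {"input": 0.10, "output": 0.40},
}


def estimate_cost(model: str, input_tokens: int, output_tokens: int) -> float:
    """Return the estimated cost in USD for the given token counts."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical monthly volume (an assumption, not real usage data).
monthly_input, monthly_output = 5_000_000, 1_000_000

mini = estimate_cost("gpt-4.1-mini", monthly_input, monthly_output)
nano = estimate_cost("gpt-4.1-nano", monthly_input, monthly_output)

print(f"mini: ${mini:.2f}, nano: ${nano:.2f}, saving: {1 - nano / mini:.0%}")
# prints "mini: $3.60, nano: $0.90, saving: 75%"
```

Because both the input and output prices for nano are a quarter of the mini prices, the saving works out at 75% whatever the mix of input and output tokens.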

The application currently uses the OpenAI API in 6 different places. I went through each one, swapped the call to use GPT-4.1 nano, and compared the output against GPT-4.1 mini. I saw no noticeable difference in 5 of the 6 places, so I switched those to GPT-4.1 nano. In the one remaining place, GPT-4.1 nano actually made the output worse, so I restored that part of the code to its original state and it still uses GPT-4.1 mini.
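If you want to run a similar comparison yourself, something along these lines could work. This is only a rough sketch, not the code from my application: `compare_models` and `fake_complete` are hypothetical names, and `complete` stands in for whatever function in your codebase sends a prompt to a given model and returns the text. The `identical` flag is just a first-pass filter; anything that differs still needs a human to judge whether the difference matters.

```python
from typing import Callable


def compare_models(
    prompts: list[str],
    complete: Callable[[str, str], str],
    baseline: str = "gpt-4.1-mini",
    candidate: str = "gpt-4.1-nano",
) -> list[dict]:
    """Run every prompt through both models and collect the outputs side by side."""
    rows = []
    for prompt in prompts:
        baseline_output = complete(baseline, prompt)
        candidate_output = complete(candidate, prompt)
        rows.append(
            {
                "prompt": prompt,
                "baseline": baseline_output,
                "candidate": candidate_output,
                # Exact match only; differing outputs need a manual review.
                "identical": baseline_output.strip() == candidate_output.strip(),
            }
        )
    return rows


def fake_complete(model: str, prompt: str) -> str:
    """Stand-in for a real API call so the sketch runs without network access."""
    return prompt.upper()


for row in compare_models(["summarise this ticket", "tag this note"], fake_complete):
    print(row["identical"], "|", row["prompt"])
```

Swapping `fake_complete` for a real wrapper around your API client gives you a per-task report you can read through before deciding which calls are safe to move to the cheaper model.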

This means the majority of the application now uses a much more cost-effective model, and we're not compromising on what the platform offers because the output was the same.

This was a good lesson to learn. Just because you can use a model doesn't mean you should blindly use it. If the model you're using offers no benefit over a cheaper one, roll with the cheaper one!