1. What models should I use?

Check out our model guide here: Link

2. I just subscribed to the Mini plan, but it’s giving me an error.

The Mini plan is a “Bring your own key” plan. It’s meant for people who’d like to control their own AI spend. This means, you get 0 chat messages per month, and would need to get an API key from either Anthropic, Gemini, or OpenAI, to continue using the chat models.

The Mini plan’s benefit is that it has Unlimited Code Apply, letting your models apply changes to your files near-instantly (+ things like Unlimited Autocomplete.)

To get your API keys, follow these articles: https://alexcodes.app/docs/keys

3. Can I use Alex completely free?

Yes! You can use our chat functionalities with your own API keys. You just won’t get access to the Code Apply feature.

4. Can I use codebase embeddings for free?

Yes! Our codebase embeddings model is provided for free to all users. You don’t need to bring an API key for it.

5. Can I use Alex with local models?

Yes! You can add your local models (e.g. Ollama or LM Studio) in Settings > Models & API Keys.

Make sure you add the v1, for example: http://127.0.0.1:11434/v1

You don’t need to put an API key, unless you’ve enabled it yourself.

6. Alex is not letting me login to a new device because I’ve reached my 2 device limit. What can I do?

Login to our portal: https://alexcodes.app/admin

Find the “Devices” section, and remove the serial number associated with the older device you no longer want on your account.

Just remember that there is always limit of 2 devices per account.

7. Our company has an AI proxy that isn’t available publicly. How can I use it?

Our team plans allow you to override all our model endpoints.

To get started:

  1. Go to your portal: https://alexcodes.app/admin
  2. Click on “Create New Team” and give it a name
  3. Go to the “Models” tab
  4. Add any chat models you’d like to use.

Note: The chat model needs to follow the OpenAI scheme.

You can also override the Autocomplete, Embeddings, Voice, Thinking, Web Search, and Code Apply models.

By doing this, you completely bypass our server.

8. How do I set my VAT ID?

  1. Go to your billing portal (https://alexcodes.app/admin > Manage Subscription)
  2. Scroll down to “Billing Information”
  3. Click “Update Information”
  4. Scroll down to “Tax ID”, and set your VAT
  5. Click Save

9. Can I disable Telemetry (Analytics + Crash Logs)?

Yes, but only on team accounts.

Once you create your team account on the portal (see above), click on the “Advanced” tab, and disable telemetry.

Your team users will all stop collecting telemetry. Make sure to login through Alex + restart the app to apply the changes.

10. Can I disable auto-compiling?

Yes. You can control what tools are provided to Alex by going to Settings > Tools and disabling the tools you don’t want.

You can then manually click the “Build and Fix Errors” button in the chat view whenever you like.

11. How do I stop Alex from automatically changing my files?

Next to the model selector, there’s a toggle that either shows “Manual” or “Auto Apply”.

Make sure it shows “Manual”, and Alex won’t automatically apply changes.

12. I’m getting a “Rate limit Exceeded” error, or a “Maximum Tokens exceeded” error

If you have your own API key inserted, make sure that you have increased your rate-limits with Anthropic/OpenAI/Gemini.

Sometimes the default rate limits are really low (e.g. 20k tokens / minute). This makes any chat unlikely to send.

You should check how to increase your rate limits with these services. Here are some helpful links:

You usually need to add a certain number of $ credits to your account to pass their rate limits.

Note: This has nothing to do with Alex. If you don’t want to worry about rate limits, subscribe to our Pro or Premium plans to use our “Chat Messages” credit system.

13. I would like to cancel my subscription. How can I do that?

Go to https://alexcodes.app/admin and click “Manage / Cancel Subscription”.

14. I clicked “Start Free Trial” and it immediately upgraded me.

This is a known issue when you have previously had a trial of that product. We’ll update it soon to prevent this.

15. Do you use our data for training?

We don’t collect your chat requests for training unless you’ve opted into training mode during onboarding. If you’d like to disable it, go to Settings > Privacy.

We collect Crash Logs & Analytics (via Sentry and PostHog) which you cannot disable — unless you are on a team plan (see #9.)

16. Why is my simulator opening after Alex compiles my app?

Alex tries to compile your app and run it if successfull. Then, it may attempt to click around the UI to confirm its changes were correct.

If you’d like to disable these actions, go to Settings > Tools and uncheck the Simulator actions, as well as the “Run App” and “Compile” actions.

17. What happened to Gemini 2.5 Pro Exp (Free Gemini)

The Gemini team has discontinued their free model. It’s no longer supported.

18. Alex is stuck on “Waiting for model…” but nothing is happening

Please update to version 3.1.11 to fix this issue. (Same for if git commit generation is stuck.)

19. How does image pricing work?

Generating any image will cost 2 chat messages. Each additional image (on top of the first) will cost an additional chat message (e.g. if alex generates 2 images, that would be 3 chat messages).

This is because OpenAI’s image generation API costs $0.04 per image.

20. Can I use image generation with my own API Key?

Yes! It requires an OpenAI key. Here’s a guide: https://alexcodes.app/docs/keys/adding-openai-api-key

21. Why does Alex tell me I have like 300 messages remaining, but the chat says I’ve used 140k of 150k tokens used?

TLDR; The context bar is only useful for knowing when to start a new chat. It’s not used for our billing.

Long answer:

The Context Bar (tokens) is entirely different than the message system. It shows how much of the context limit you’ve used up in the chat.

AI systems work based on “Context”. Every time we send a message, we have to construct the whole chat into one large request to send to the AI.

Naturally, this becomes very expensive. e.g. if you used Claude Sonnet 4 with 200k tokens (or approximately 1 million characters of text), you would need to spend $0.60 every time you send a message. This includes any time the agent takes an action.

This is why we limit the amount of context is sent to the chat model. And when we limit it, that means only a certain length of conversation can be passed in.

What determines how much context is used?

The total text inside the chat. This includes:

  • The system prompt we provide
  • Each file you attach (this can take up a lot of context!)
  • Each message you’ve sent in the chat (including images)
  • Each response the AI has given to you, including any of the code it has written

This can add up quickly, so it’s important to keep tabs on your context bar.

Also: Generally, as you use up more tokens, AI systems get more “confused”. It’s like if you passed a whole book to some and asked them to find a single word! This is why limiting context, keeping chats short, and passing only required files, helps actually improve your results.

Now, back to the point of confusion: The amount of tokens you use has no effect on the # of Alex messages you’ve consumed from a billing perspective. We simplify things by counting each request to the AI model as 1 “message”. Some messages could be cheaper, some more expensive, but we average it so that you spend less time counting tokens and prices.