Building a Native Desktop Chat App with Factory AI


10xTeam July 26, 2025 8 min read

This article introduces a new AI tool that has been gaining a lot of traction recently, mainly because its development pattern mirrors how software is built in the real world. As you’ll see, it actually built a native desktop ChatGPT-style application, which is genuinely impressive. It doesn’t try to solve one big problem all at once; instead, everything is broken down into smaller parts, with responsibilities distributed across multiple roles. In Factory, those roles are filled by different agents, called droids.

The droids work together to diagnose issues, apply fixes, and eventually ship your software. It also features native integrations with tools like Notion, Jira, and Linear, which makes it feel like a real developer environment that thinks and codes the way actual teams do. In this article, we’re going to explore the platform in depth, and I’ll show you whether it lives up to the hype and if its team-like, modular approach to building real software actually works.

What are Droids?

So what are these droids? They are essentially small agents that each specialize in just one specific task. They work independently, transfer data between one another, and carry out real tasks for you. There are currently five droids available:

  • Tutorial Droid: When you first start with the demo, this droid guides you through the entire process and shows you exactly what to do.
  • Code Droid: This droid is responsible for writing code.
  • Product Droid: This agent doesn’t just help with coding but also focuses on delivering a solid product by managing things from a product perspective.
  • Reliability Droid: This one works on improving the overall stability and security of your application.
  • Knowledge Droid: This droid documents your codebase, offers technical explanations, and handles any external documentation you might need while working on your project.

The Building Process: A Walkthrough

Right now, I’m in the Factory tutorial, and the Tutorial Droid is guiding me through the steps. It asks us to connect to something in order to begin coding. At this point, I can choose to connect to a remote machine, a remote workspace, or my own local machine. For that, I’m provided with a code to copy.

Factory includes a companion app, Factory Bridge, that lets it run commands locally on your system. This bridge application is where you paste the code you just copied. Once you enter the code, the app begins the connection process. As soon as the connection succeeds, everything is ready to go, so we can close the app and proceed.

They’ve also shown us a checklist that displays all the commands Factory can execute on our local system. This checklist lets you select which commands should be auto-executed, giving you precise control over what actions are permitted. With everything now fully set up, we can officially begin.

After we confirm that the setup is complete, it tells us we’re going to build an app together using Tauri, a framework that pairs a Rust backend with a TypeScript frontend. It also mentions that completing the tutorial will reward us with tokens. It then checks whether Rust is installed on the system. Since it detects Rust, it proceeds to clone a tutorial repository from its own source, which appears to contain a preconfigured template for the application. After that, it begins installing dependencies and setting up the environment.

Another detail worth pointing out is how it handles command execution. Before running any command, it displays the potential risks involved, giving us the option to accept or reject it. This is where the auto-accept setting comes into play. If we allow all commands to run automatically, it stops prompting us for confirmation and simply executes the tasks. In this case, we’ll go ahead and let it proceed.
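Conceptually, that auto-accept checklist behaves like a command allowlist. Here is a minimal TypeScript sketch of the idea; the command strings and function name are hypothetical, since Factory’s actual rules are configured through its UI, not code:

```typescript
// Hypothetical allowlist mirroring Factory's auto-accept checklist.
// Commands on the list run without prompting; everything else asks first.
const autoApproved = new Set<string>([
  "npm install",
  "cargo build",
]);

function needsConfirmation(command: string): boolean {
  return !autoApproved.has(command);
}

console.log(needsConfirmation("npm install")); // false: runs automatically
console.log(needsConfirmation("rm -rf /"));    // true: requires confirmation
```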

It ran the app, and now we can see that everything is up and running and functioning properly. At the moment, it’s just using a starter template, so there isn’t much visible yet. What stands out, though, is that this is not a web app; it is actually building a native desktop application. I’m not sure what it is going to ask us to do next; it might prompt us to customize something, but we’ll find out soon.

Up to this point, the capabilities are genuinely impressive. It read the TypeScript file located in our local directory and identified that the main application structure was inside the app.tsx file. After fetching the file, it opened a new window where code execution began, and this part is likely being handled by the coding agent. At the moment, it is implementing a simple counter feature to make the application more interactive. While the feature itself is basic, the overall workflow appears to be smooth and well-structured.

If we head back into the app, we can see that a counter demo has been implemented. This was added to demonstrate direct state management, and it’s clearly functioning as intended. We can increment the counter as needed, and it behaves exactly as expected.
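For reference, the state logic behind such a counter is tiny. A minimal sketch in TypeScript (names are illustrative; the tutorial’s actual implementation lives in app.tsx and presumably uses React state):

```typescript
// Illustrative counter state, modeled as a pure update function.
type CounterState = { count: number };

function increment(state: CounterState): CounterState {
  return { count: state.count + 1 };
}

let state: CounterState = { count: 0 };
state = increment(increment(state));
console.log(state.count); // 2
```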

Building a Real Application

At this point, it’s time to start customizing the application based on our own preferences, so let’s move into that next. It automatically provided us with options for what we’d like to build next. The message explains that we can now begin creating something genuinely useful, something more substantial, as they describe it. The available options include:

  • An LLM chat application
  • A code review tool
  • A meeting notes summarizer

What stands out about these options is that they aren’t just basic apps like to-do lists or manual schedulers, which are common with most AI or SaaS builders. These are actual applications that involve external API integrations. For instance, the LLM chat application requires an OpenAI API key, and the other tools also depend on third-party APIs. This is where the droid ecosystem really starts to shine. Most likely, the Knowledge Droid will play a role in gathering information about the APIs, helping us integrate them, and ultimately assisting in building something genuinely valuable. In this case, we’re looking at creating a real desktop application, which is quite impressive.

To proceed, we’ll go ahead and build the LLM chat application. I’m going to copy the option and paste it into the prompt box, and we’ll see how it goes from here.

Testing the Chat App

It built the application, and now it is instructing us to go ahead and get our API key from OpenAI. After obtaining the key, we simply need to click the settings button inside the app, paste the API key, and begin chatting.
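Under the hood, an LLM chat app like this typically sends the conversation history plus a system prompt to OpenAI’s /v1/chat/completions endpoint. A sketch of how that request body might be assembled; the field names (model, messages, temperature) match the OpenAI Chat Completions API, but the function and type names here are my own, not Factory’s:

```typescript
// Builds the JSON body for an OpenAI chat completion request.
type Role = "system" | "user" | "assistant";
type Message = { role: Role; content: string };

function buildChatRequest(
  history: Message[],
  systemPrompt: string,
  model: string,
  temperature = 0.7,
): { model: string; temperature: number; messages: Message[] } {
  return {
    model,
    temperature,
    // The system prompt always leads the message list.
    messages: [{ role: "system", content: systemPrompt }, ...history],
  };
}

const body = buildChatRequest(
  [{ role: "user", content: "Hello!" }],
  "You are a helpful assistant.",
  "gpt-4o-mini",
);
console.log(body.messages[0].role); // "system"
```

The actual app would POST this body to the endpoint with the user’s API key in the Authorization header.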

Before continuing with the app, I want to mention the files that were modified. A preview appeared right beside the file list, and I could see the files updating in real time. This is the app that has now been built, and as you can see, we just need to go into the settings and paste our OpenAI API key. Along with that, the settings panel allows us to:

  • Select different models
  • Adjust the temperature
  • Set the maximum tokens for the model
  • Update the system prompt

Since the app operates through API access to OpenAI, it is great to see that it takes full advantage of these configurable options. I will now paste in my API key. After entering it, the system ran a test and confirmed that the key is valid. We can save the settings and try it out.
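That settings panel maps naturally onto a small config type. A sketch with basic validation; the field names and ranges are assumptions based on the OpenAI API, not Factory’s actual code:

```typescript
// Hypothetical settings shape for the chat app's settings panel.
interface ChatSettings {
  model: string;        // e.g. "gpt-4o-mini"
  temperature: number;  // the OpenAI API accepts 0-2
  maxTokens: number;    // upper bound on the completion length
  systemPrompt: string;
}

function validateSettings(s: ChatSettings): string[] {
  const errors: string[] = [];
  if (!s.model) errors.push("model is required");
  if (s.temperature < 0 || s.temperature > 2) {
    errors.push("temperature must be between 0 and 2");
  }
  if (!Number.isInteger(s.maxTokens) || s.maxTokens <= 0) {
    errors.push("maxTokens must be a positive integer");
  }
  return errors;
}

const errors = validateSettings({
  model: "gpt-4o-mini",
  temperature: 0.7,
  maxTokens: 1024,
  systemPrompt: "You are a helpful assistant.",
});
console.log(errors.length); // 0
```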

I am going to send a message, but it looks like there was no reply. I will try once more. At this point, it seems like the response has gone completely off topic. There might be an issue with the system prompt, so I will attempt one more time. It is still responding with something about a song by John Legend, and I have no idea why. I will check the system prompt, and it appears that nothing has changed there. There might be an issue with the application itself, so I will try to fix it using factory.

Debugging and Final Thoughts

It finally fixed the issue, or more accurately, it debugged it. I switched to Gemini 2.5 Pro and began the debugging process, and it turns out the problem was not with the app itself. The actual issue, although I am not entirely certain, seemed to be related to the model I had selected earlier, which was GPT-4. As soon as I switched to a different model, and after trying a few others as well, the issue completely disappeared. It only appeared with that specific model.

Let me show you again: it is working properly now, and you can see that the chat application is responding as expected. It might have been a temporary issue with OpenAI’s API, but it was returning completely random answers regardless of the prompt. Now that everything is set up correctly, the chat app is functioning the way it should.

This was just a small demo, but I genuinely enjoyed using this tool. The way it breaks everything down and follows a structured workflow makes it feel like something that could genuinely transform how software is built in the future.


Join the 10xdev Community

Subscribe and get 8+ free PDFs that contain detailed roadmaps with recommended learning periods for each programming language or field, along with links to free resources such as books, YouTube tutorials, and courses with certificates.
