Week 2 Summary: First version of the agent working
Days 8-15 / 60
Second week completed. 25% of the journey done.
A difficult week
I'll be very honest: this week was not productive at all. End of year holidays, ~2,000 km of travel, and little opportunity to sit down and work. It frustrates me a bit because time doesn't stop and there are 45 days left. I need to compensate with more intensity in the coming weeks.
The agent is breathing
Despite the limited time, I achieved something important: the first version of the agent is working and starting to behave as expected.
Last week I mentioned it felt "rough". This week there's a notable change, and it didn't come from a better prompt: I changed the model.
The jump from Gemini 2 to Gemini 3
I started with Gemini 2 Flash because I wanted fast responses. The problem: it constantly failed with tools. Sometimes it wouldn't call any when it clearly should have. Other times it hallucinated results without even attempting to run a simulation.
Since I'm participating in the Gemini 3 Hackathon, I decided to try gemini-3-flash-preview, the fastest version of Google's most powerful model. The difference is dramatic: reasoning is more consistent and tool calls work as they should. The agent stopped hallucinating and started calling tools properly.
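In practice, the switch was little more than changing the model id in my configuration. A minimal sketch of that idea; the names (`MODEL_ID`, `build_request`) are hypothetical, not the agent's actual code, and the real version attaches tool declarations and calls the Gemini API:

```python
# Hypothetical config: swapping models is a single-constant change.
OLD_MODEL_ID = "gemini-2.0-flash"     # frequent tool-call failures
MODEL_ID = "gemini-3-flash-preview"   # more consistent reasoning and tool use


def build_request(prompt: str, model_id: str = MODEL_ID) -> dict:
    """Assemble the request payload sent to the model (sketch only)."""
    return {
        "model": model_id,
        "contents": prompt,
        # In the real agent, PowerFactory tool declarations go here.
    }


req = build_request("Run a single-phase fault at 50% of line 1")
print(req["model"])  # gemini-3-flash-preview
```

Keeping the model id in one place made the A/B comparison between the two models a one-line diff.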

DN running a power flow in PowerFactory
The acid test: comparing faults
I asked the agent to run a single-phase fault at 50% of line 1 and then a three-phase fault to compare results. It worked perfectly.

Fault comparison: single-phase vs three-phase
The interesting thing is not just that it executed the simulations, but that it understood the sequence without me having to explain each step. It used 3 of the 10 available iterations in the agentic loop:
- Understand the loaded project
- Execute single-phase fault at 50% of the line
- Execute three-phase fault at 50% of the line and compare
With Gemini 2 Flash, this same prompt ended with the agent inventing short-circuit currents without simulating anything.
About the 10 iteration limit
For now I configured a maximum of 10 tool calls per query. It's an arbitrary number that allows me to experiment without the agent entering infinite loops. For simple tasks like fault comparison, 3 iterations are enough. For a complete ECAP, I'll probably need more, but that's future me's problem.
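The loop with its iteration cap can be sketched in a few lines. This is a simplified illustration, not the agent's actual code: `ask_model` stands in for the Gemini call, the tool names are hypothetical, and the stub replays the fault-comparison run described above:

```python
from typing import Callable

MAX_ITERATIONS = 10  # arbitrary ceiling to avoid infinite tool-call loops


def run_agent(ask_model: Callable[[list], dict],
              max_iterations: int = MAX_ITERATIONS) -> list:
    """Agentic loop: let the model act until it finishes or hits the cap."""
    history = []
    for _ in range(max_iterations):
        step = ask_model(history)  # model decides: call a tool or finish
        history.append(step)
        if step.get("done"):       # model produced its final answer
            break
    return history


# Stub model replaying the fault comparison: 3 of 10 iterations used.
script = [
    {"tool": "inspect_project"},
    {"tool": "run_fault", "args": {"type": "single_phase", "location": 0.5}},
    {"tool": "run_fault", "args": {"type": "three_phase", "location": 0.5},
     "done": True},
]


def stub_model(history: list) -> dict:
    return script[len(history)]


trace = run_agent(stub_model)
print(len(trace))  # 3
```

The early `break` is what lets simple tasks finish in 3 iterations while the cap still protects against runaway loops on harder ones.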
The web app
I also have a web application working. For now it's for internal testing, but the idea is to make it available soon so others can interact with the agent.
What's next
Next steps:
- Machine benchmark: My test systems are so small that simulations take less than 5 seconds. I need to test with real cases to understand the limits.
- More tools: The agent only knows how to simulate. I still need to connect the Coordinator's PGP and Infotecnica so it can learn from real studies.