Agent Yolo Mode & Python PyTest
Source: Notion | Last edited: 2025-01-09 | ID: 1752d2dc-3ef...
- I try not to redundantly share what's already been covered by the YouTubers, but Yifan's YouTube channel provides valuable insights that are directly applicable:
- Yifan - Beyond the Hype. It's hands down the most down-to-earth, non-hype channel focused on practical uses of Cursor.
2025-01-07 After spending the past two to three weeks diving into Cursor IDE's Composer interface with Anthropic Claude 3.5 Sonnet in Agent Mode, iterating through 100+ YOLO-enabled dialogues, I've arrived at a workflow that's been incredibly beneficial for me. I wanted to share my experience and some tips here with you guys!
One of the standout tools I've been using alongside agent mode is pytest. By clearly defining my goals as pytest test cases, the agent can iterate until we reach the desired outcome, which has made the whole process so much smoother. pytest integrates seamlessly with agent mode, allowing the agent to understand the purpose behind each test, so it can help us define and refine our test cases step by step. Writing unit tests used to be tedious enough that it sometimes made me hesitant to dive deeper into coding. Now, with pytest and the agent's support, I can focus more on developing algorithms and creating new features instead of getting bogged down with extensive testing. The result? A solid set of unit tests that align with what I want and prevent regressions whenever new changes land, no matter how big the changes to the code are. The test cases keep the agents in line!
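As a minimal sketch of what "defining goals with pytest" can look like, here's a hypothetical spec the agent would iterate against until it passes. The `moving_average` function and the test names are illustrative, not taken from my actual project:

```python
# test_features.py -- a goal stated as pytest specs (illustrative example)

def moving_average(values, window):
    """Trailing moving average; the agent iterates on this until tests pass."""
    out = []
    for i in range(len(values)):
        lo = max(0, i - window + 1)
        out.append(sum(values[lo:i + 1]) / (i + 1 - lo))
    return out

def test_moving_average_matches_hand_computed_values():
    # hand-checked: [1], [1,2], [2,3], [3,4] averaged
    assert moving_average([1, 2, 3, 4], 2) == [1.0, 1.5, 2.5, 3.5]

def test_moving_average_with_window_one_is_identity():
    assert moving_average([5, 7], 1) == [5.0, 7.0]
```

The point is less the implementation than the tests: they are the contract the agent keeps converging toward, so a wrong iteration fails loudly instead of silently regressing.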
A lot of people emphasize a results-oriented approach by prioritizing unit tests first. While unit tests are crucial for ensuring precision and avoiding errors, they can be quite labor-intensive and drain your mental energy. The precision needed doesn’t always add direct value during the creative process of developing new features. Balancing this used to take a lot out of me. However, integrating PyTest with the agent mode has significantly reduced this burden. The agent takes care of generating and managing the unit tests, freeing up my mental resources for more creative and impactful aspects of feature development.
Whether you’re working on finding ideas or building unit tests, I recommend using the “Explore and Rank” prompt. By including the “Explore and Rank” phrase in your prompt, the LLM agent generates a list of potential solutions and ranks them based on factors like feasibility and effectiveness. These rankings often come with rationales such as “strongly recommended” or “mildly recommended,” and sometimes include numerical ratings like 5 stars or 4.5 stars. For ideas I haven’t thought of or heard about, the agent quickly creates demos to help me implement those solutions. This allows me to assess their suitability without investing too much time upfront. If a solution doesn’t fit, I can skip it and rearrange the priorities accordingly. This prompting keyword phrase, along with automatic demo building, provides a non-mentally draining way to evaluate and prioritize options.
For example, charting has always been a great way to visualize features, but handling high-frequency data was challenging due to the sheer volume. With agent mode, I can apply performance enhancement methods alongside pytest to ensure everything runs smoothly without discrepancies. After a few trials, it became obvious that the way to handle high-frequency data visualization without overloading the browser is to chunk the charts across multiple browser tabs, a simple but elegant solution proposed by the agent that I hadn't thought of!
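The chunking idea can itself be pinned down as a pytest spec before the agent touches any charting code. Everything below (the `chunk` helper, the test names) is a hypothetical sketch of how such a constraint could be encoded, not my actual code:

```python
# test_chunking.py -- spec for splitting high-frequency data across tabs (illustrative)

def chunk(points, size):
    """Split a flat list of data points into fixed-size chunks, one per browser tab."""
    return [points[i:i + size] for i in range(0, len(points), size)]

def test_chunk_covers_all_points_in_order():
    data = list(range(10))
    # 10 points with chunk size 4 -> two full tabs and one partial tab
    assert chunk(data, 4) == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]

def test_chunk_handles_empty_input():
    assert chunk([], 4) == []
```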
Another example: tools like Numba sometimes introduce minor inconsistencies with the original implementation, but pytest helps identify the best refactoring of slow code to achieve a 3-10x speedup with Numba-enabled code while maintaining consistency. What's truly impressive is that this process happens iteratively on its own; it's almost like magic. The agent already knows the expected results, and with pytest being particularly good at pinpointing where the problem lies, the LLM agent can iteratively fix issues until everything is satisfactory.
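A hedged sketch of what such a consistency test can look like: a slow but obviously correct reference implementation is compared against the optimized one within a floating-point tolerance. In the real workflow the fast version would be decorated with Numba's `@njit`; both functions here are illustrative stand-ins:

```python
# test_numba_consistency.py -- the optimized function must match the reference
import math

def ref_rolling_sum(xs, window):
    """Slow but obviously correct reference implementation."""
    return [sum(xs[max(0, i - window + 1):i + 1]) for i in range(len(xs))]

def fast_rolling_sum(xs, window):
    """Optimized single-pass version (would be @numba.njit in practice)."""
    out, acc = [], 0.0
    for i, x in enumerate(xs):
        acc += x
        if i >= window:
            acc -= xs[i - window]  # drop the element that fell out of the window
        out.append(acc)
    return out

def test_fast_matches_reference_within_float_tolerance():
    xs = [0.1 * i for i in range(50)]
    ref = ref_rolling_sum(xs, 5)
    fast = fast_rolling_sum(xs, 5)
    # running-sum subtraction accumulates float error, hence the tolerance
    assert all(math.isclose(a, b, abs_tol=1e-9) for a, b in zip(ref, fast))
```

When a test like this fails, pytest's diff points at the exact index where the optimized code diverges, which is precisely what lets the agent iterate on the Numba refactoring unattended.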
However, having said all that, human intervention is still necessary, and there are two main areas where it comes into play:
1. Managing Iterations
We don't want the iterative process to drag on too long. The longer the context, the slower the generation of new responses becomes. To keep things efficient, we need to limit the number of iterations and stay aware of the ongoing context. Unfortunately, Cursor IDE doesn't show the token count or context window size we're using, but we can manage by keeping our interactions as concise as possible. Here are two strategies to handle this:
1.1. Open New Composers:
Start a new composer session as soon as a task is completed.
1.2. Monitor and Adjust:
If the agent starts veering off track after a few iterations, revert to an earlier prompt where the context was clear and guide it back on the right path.
2. Rearranging Priorities
2.1. Re-aligning the Process:
Sometimes, after several iterations without good results, it's clear the agent is heading in the wrong direction. In these cases, I go back to a specific prompt, discard the ineffective iterations, and re-enter the corrected prompt to realign the process, keeping our efforts productive and focused. For processes that don't start off correctly, I maintain a strong initial context by sticking to the same prompt through the third iteration to keep things aligned. If things go too far, say four or five iterations without satisfactory results, I revert to an earlier, more effective prompt and continue from there. This flexibility is incredibly handy and keeps the workflow efficient and effective.
2.2. Constructing Safeguard and Boundary Files:
Another useful aspect of my workflow is constructing safeguard and boundary files. These files are crucial for maintaining constraints and ensuring certain undesirable scenarios don't happen again. Depending on the requirement, the agent may use various formats like YAML, INI, or type-safe documents managed by Pydantic. Although Pydantic hasn't met my expectations yet (perhaps because it's relatively new and the LLM isn't fully familiar with it), it remains a promising tool for the future. Safeguard and constraint files act as guardrails, preventing issues like inconsistent time denominations in pandas (e.g., mixing the deprecated "T" alias with "min" for minutes). By maintaining these safeguarding files, we ensure that the agent adheres to predefined rules, minimizing the likelihood of recurring errors.
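One hypothetical way to make such a guardrail executable is a pytest safeguard that scans code for the rule stated in the constraint file. The regex, helper name, and rule here are illustrative assumptions, not the contents of my actual constraint files:

```python
# test_safeguards.py -- enforce "use 'min', never the deprecated 'T' alias"
# for minute frequencies in pandas resample/date_range calls (illustrative rule)
import re

DEPRECATED_MINUTE_ALIAS = re.compile(
    r"""(?:resample|date_range)\([^)]*["'](\d*T)["']"""
)

def find_violations(source: str):
    """Return any deprecated 'T' frequency strings found in the given source."""
    return DEPRECATED_MINUTE_ALIAS.findall(source)

def test_no_deprecated_minute_alias():
    good = 'df.resample("5min").mean()'
    bad = 'df.resample("5T").mean()'
    assert find_violations(good) == []
    assert find_violations(bad) == ["5T"]
```

In practice such a test would iterate over the repository's source files; since it runs with the rest of the suite, the agent gets immediate feedback whenever it reintroduces the banned pattern.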
Takeaways
- Iterative Process Management: To keep the iterative process efficient and not too lengthy, start a new composer session after completing certain tasks, or intervene early if the agent begins to deviate from the desired path.
- Prioritizing Solutions: Use the “explore and rank” prompt to have the agent evaluate and prioritize different solutions based on its rationale. This helps in making informed decisions without getting overwhelmed by too many options.
- Demo Implementations: When you encounter unfamiliar or new solutions, ask the agent to create quick demos. This allows you to rapidly assess their suitability before fully committing.
- Reverting Iterations: If multiple iterations don’t yield satisfactory results, revert to a previous prompt that worked well and guide the agent from there to maintain alignment with your goals.
Tips & Tricks
- Safeguard and Constraint Files: Implement safeguard and boundary files (safeguard.qmd, constraints.qmd, architectural_decisions.qmd) to maintain consistency and prevent recurring issues. Ensure the agent prioritizes these files so it adheres to established rules and constraints.
- I've been working with Quarto's QMD file format because it offers a great balance between human and LLM readability, making it both stylish and functional.
- https://marketplace.visualstudio.com/items?itemName=quarto.quarto — remember to install its extension on Cursor.
- Automated Commits: People often recommend committing frequently and regularly, but committing can be mentally taxing due to the need for precise commit messages, correct commands, and proper formatting. With our workflow, the agent automatically detects the context from our dialogues, including the motivations behind our actions, and identifies the actual changes made, eliminating reliance on memory or manual tracking. By keeping the context window short, the agent can handle committing automatically, reducing mental energy expenditure. The only manual step required is reviewing the generated commit message to ensure it aligns with our intentions. **Human contribution involved refining the commit message format through trial and error to align the QMD workflow file with our preferred standards.**
- Ref: Cursor Agent: Git commit and push workflow (2025-01-09: updated with `y/n` confirmation by the git users).
- Selective Contextual Input: To save time and increase precision, manually select the folders or files most contextually relevant to your current task. Right-click (or double-tap) them, choose "Copy Relative Path" from the context menu, and paste these paths into the Composer's prompt for the agent to consume as context. This lets the agent focus on the most important context before proceeding, ensuring more accurate and efficient outcomes.