Sunday, September 7, 2025

The Design Doc Dilemma: Finding the "Just Enough" in Your Two-Week Sprint

How a little upfront design can prevent your "go fast" agile team from actually 
going slow.

If you’ve worked on an agile team, you know the rhythm: backlog grooming, sprint planning, a two-week burst of coding, and then a review. The mantra is often "working software over comprehensive documentation." But if you’ve been in the trenches, you’ve also likely seen this scenario: 

A complex story gets pulled into a sprint. The team huddles for a quick 15-minute discussion, then everyone jumps straight to code. Two days later, PR comments reveal a fundamental misunderstanding. A week in, two engineers realize their implementations are incompatible. By the end of the sprint, the feature is "done," but it’s a Rube Goldberg machine of code—brittle, overly complex, and difficult to test.
The subsequent sprints are then plagued with bug fixes, refactoring, and rework directly caused by that initial rushed implementation.

Going fast ended up making us go agonizingly slow.

This isn't an indictment of agile; it's a misapplication of it. The key to avoiding this trap isn't to revert to weeks of Big Design Upfront (BDUF), but to intelligently apply "Just Enough" Design Upfront (JEDUF). And the most effective tool I've found for this is the humble design document.

Why Jumping Straight to Code Fails for Complex Problems

For simple CRUD tasks or well-trodden paths, a story description and a quick conversation are perfectly sufficient. The problem space is understood, and the solution is obvious. But for complex, novel, or architecturally significant work, code is a terrible medium for exploring ideas.

Why?

  • Code is Final: Writing code is an act of commitment. Changing a core architectural decision after hundreds of lines have been written is expensive.
  • It Lacks Context: Code shows how something is done, but rarely explains the why: the considered alternatives, the trade-offs, and the rejected ideas.
  • It's Isolating: Without a shared artifact, engineers can head down divergent paths, only discovering their misalignment during a painful merge conflict.

The Antidote: The Targeted Design Doc

The solution isn't to document everything, but to recognize the stories that carry risk and complexity and treat them differently. For these, I advocate for a simple process:
  1. Identify the Candidate: During sprint planning or grooming, flag a story as "complex." This is usually obvious—it involves new system integrations, significant performance requirements, novel algorithms, or has a high degree of ambiguity.
  2. Time-Box the Design: The assigned engineer spends a few hours (not days!) drafting a concise design doc. This isn't a 50-page specification. It's a brief document that outlines:
    • The Problem: What are we actually solving?
    • The Proposed Solution: A high-level overview of the approach.
    • Considered Alternatives: What other paths did you consider? Why were they rejected?
    • Key Trade-offs: (e.g., "We chose faster performance over code simplicity here because of 
      requirement X.")
    • Open Questions: What are you still unsure about?
  3. Review & Socialize: Share the doc with other senior engineers—often async, but sometimes in a quick 30-minute meeting. The goal isn't to achieve consensus, but to 
    stress-test the idea. Does this make sense? Are there hidden pitfalls? Is there a simpler, more elegant solution we're all missing?
  4. Iterate or Implement: Based on the feedback, the design is improved, simplified, or sometimes rejected altogether in favor of a better approach. Now the team codes, with a clear, 
    vetted blueprint.

The Science: Knowing When to Use a Design Doc

The science is in the discernment. You don't do this for every story. That would be bureaucratic and slow. You do it for the ones where the cost of being wrong is high.

Use a design doc when the story involves:

  • Cross-team or cross-service dependencies.
  • New technology or patterns the team isn't familiar with.
  • Significant performance or scaling concerns.
  • High-risk areas of the codebase.
  • Fundamental changes to the application's architecture.

For the vast majority of stories, the "design" is a whiteboard sketch or a conversation. But for the 10-20% that are truly complex, the design doc process is an accelerator, not a hindrance.

The Result: Speed, Quality, and Alignment

This approach transforms your process:

  • Fewer Revisions: Catching design flaws in a doc is orders of magnitude cheaper than catching them in a PR.
  • Collective Ownership: The entire team understands the why behind the solution, leading to better maintenance and fewer regressions.
  • Knowledge Sharing: The document becomes a lasting artifact for future engineers wondering,
     "Why did we build it this way?"
  • True Agility: You're not just moving fast; you're moving fast in the right direction.
    You build quality in from the start, instead of trying to test or refactor it in later.

So, the next time your team faces a gnarly story, resist the urge to dive headfirst into the IDE. 
Take a breath, write a page, and get a second opinion. You’ll find that a small investment in thinking saves a huge amount of time in coding.

How does your team handle complex design? Do you have a process for "just enough" 
documentation? Share your thoughts in the comments below!


Thursday, August 14, 2025

Bug Driven Development (BDD): When "Done" Really Means "Debugging Hell"

 

Introduction

In the world of Agile and Scrum, the term "Done" is sacred. A story is supposed to be complete: tested, reviewed, and ready for production. But in some teams, "Done" is just the beginning of a never-ending cycle of bug fixes. This anti-pattern has a name: Bug Driven Development (BDD)!

What is Bug Driven Development (BDD)?

BDD is a dysfunctional workflow where:

  1. A developer claims a story is "Done" in Sprint Review.
  2. QA (or worse, users) finds a flood of bugs that should never have existed.
  3. The next sprint is spent fixing what was supposedly "finished."
  4. The cycle repeats, creating technical debt, frustration, and burnout.

Unlike Behavior-Driven Development (the good BDD), where tests define requirements, Bug-Driven Development means bugs define the real scope of work.


The Face of BDD

Bug Driven Development (The Unintentional Anti-Pattern)

  • Symptoms:
    • "Works on my machine" mentality.
    • Zero (or flaky) unit/integration tests.
    • QA backlog grows faster than the dev sprint velocity.
  • Root Causes:
    • Poor estimating and planning, leading to rushed implementation to meet sprint deadlines without proper validation.
    • No code reviews (or rubber-stamp approvals).
    • Learned helplessness—engineers assume bugs are inevitable.
    • A culture that favors speed over quality and treats bugs as natural.
    • Lack of accountability—no consequences for shipping broken code.
    • Low-skilled engineers who don’t understand defensive programming.

Why BDD is a Silent Killer

  • Wastes Time: Fixing preventable bugs drains 50%+ of dev capacity (Microsoft Research found devs spend 50-75% of time debugging).
  • Kills Morale: Engineers hate working in bug factories.
  • Destroys Trust: Stakeholders stop believing in "Done."
  • Increases Costs: Late-stage bug fixes are 100x costlier (IBM Systems Sciences Institute).

How to Escape BDD (Before It Kills Your Team)

1. Enforce Real "Definition of Done" (DoD)

  • No story is "Done" without:
    • ✅ Unit/Integration tests.
    • ✅ Peer-reviewed code.
    • ✅ Passing QA (not "mostly working").

2. Shift Left on Quality

  • Test-first mindset: Write tests before code (TDD).
  • Automate validation: CI/CD pipelines should block buggy code.
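To make the test-first loop concrete, here is a minimal sketch in plain Java. The `PriceCalculator` class and its rules are hypothetical, and a real project would use JUnit rather than `main` with asserts; the point is only the order of work: the assertions were written first, failed, and the implementation exists to make them pass.

```java
// Hypothetical test-first sketch: the assertions in main() were written
// before applyDiscount() existed, and then drove the implementation.
public class PriceCalculator {

    // Applies a percentage discount; rejects invalid input rather than
    // silently producing a wrong price (defensive programming).
    public static double applyDiscount(double price, int percent) {
        if (price < 0 || percent < 0 || percent > 100) {
            throw new IllegalArgumentException("invalid price or percent");
        }
        return price * (100 - percent) / 100.0;
    }

    // Stand-in for a unit test suite (run with java -ea to enable asserts).
    public static void main(String[] args) {
        assert applyDiscount(200.0, 10) == 180.0;
        assert applyDiscount(99.0, 0) == 99.0;
        boolean rejected = false;
        try {
            applyDiscount(100.0, 150);
        } catch (IllegalArgumentException e) {
            rejected = true;
        }
        assert rejected;
        System.out.println("all checks passed");
    }
}
```

A CI pipeline that compiles and runs checks like these on every push is the "block buggy code" gate described above.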

3. Stop Rewarding Speed Over Quality

  • Measure and penalize escaped defects (bugs found post-"Done").
  • Celebrate clean code, not just closed tickets.

4. Fire Bad Engineers (If Necessary)

  • Low-skilled engineers can learn, but Bad Engineers won't.
  • If someone refuses to improve, they’re a culture toxin.

Conclusion: From BDD to Brilliance

Bug-Driven Development isn’t Agile—it’s technical debt in motion. The fix? Stop accepting "Done" until it’s really done. Otherwise, prepare for a future where your sprints are just bug-fixing marathons.

Question for You:
Does your team practice BDD? Share your horror stories below!

Saturday, July 12, 2025

Why Feature Teams Beat Siloed Development: Lessons from a Cloud Migration

When I joined Company X (name withheld to protect its identity), they were in the midst of a massive modernization effort—replacing legacy monolithic systems with sleek, cloud-native microservices running on Google Cloud. On paper, it was a forward-thinking move. But there was a catch: the engineering teams were strictly divided into backend and frontend squads, each working in isolation.

The backend team built REST APIs. The frontend team consumed them. They coordinated via Google Chat, ADO tickets, and API contracts—yet, when it came time for User Acceptance Testing (UAT), chaos ensued. Bugs surfaced. Assumptions clashed. Finger-pointing began.

What went wrong?

The Problem with Siloed Teams

The traditional "Backend vs. Frontend" split seems logical at first:

  • Backend engineers focus on APIs, databases, and business logic.
  • Frontend developers build UIs, handling state management and user interactions.

But in practice, this separation creates three major headaches:

  1. Late Integration Surprises

    • Teams work on different timelines, delaying end-to-end testing until late in the cycle.
    • By the time APIs and UIs meet, mismatches in data structures, error handling, or performance become costly to fix.
  2. Communication Overhead

    • Instead of real-time collaboration, teams rely on documentation and meetings—which often lag behind actual development.
    • A backend engineer might design an API that "makes sense" to them but is awkward for frontend consumption.
  3. Lack of Ownership

    • When something breaks, it’s easy to say: "That’s a frontend issue" or "The backend payload is wrong."
    • No single team feels responsible for the entire user experience.

A Better Way: Feature Teams

What if, instead of splitting teams by technical layer, we organized them by features?

A feature team is a small, cross-functional pod that includes:
✔ Backend developers
✔ Frontend developers
✔ (Optional) QA, DevOps, Data Engineers 

Their mission? Deliver a complete, working slice of functionality—not just a backend API or a UI mockup.

Why This Works Better

  1. Early and Continuous Integration

    • Since the team builds vertically, they test integrations daily—not just in UAT.
    • Bugs are caught early, reducing last-minute fire drills.
  2. Tighter Collaboration

    • Backend and frontend devs sit together (or pair remotely), discussing API design in real-time.
    • No more "This isn’t what we agreed on in the spec!" surprises.
  3. End-to-End Ownership

    • The team owns the entire feature, from database to UI.
    • No more blame games—just collective problem-solving.
  4. Faster Delivery

    • Features move smoothly from development to testing to production.
    • Less waiting on external dependencies.

What I Wish We Had Done Differently

Looking back, Company X’s cloud migration could have been smoother and faster with feature teams. Instead of:
"Backend will deliver the API in Sprint 3, frontend will integrate in Sprint 4,"

We could have had:
"Team A ships the 'Checkout Flow' by Sprint 3—fully working, tested, and deployed."

Key Takeaways

  • Silos slow you down. Separation of frontend and backend creates friction.
  • Feature teams align with Agile & DevOps principles—focusing on working software, not just technical outputs.
  • Own the whole feature, not just a layer. This reduces risk and improves quality.

If you're leading a modernization effort (especially in microservices or cloud migrations), break the silos early. Build feature teams, not fragmented departments. Your UAT phase will thank you.


What’s your experience? Have you seen siloed teams cause integration nightmares? Or have you successfully shifted to feature-driven development? Share your thoughts below! 🚀


Tuesday, July 30, 2024

Interview Question: "Describe a Challenging Project You Worked On"

 

A common interview question is: "Describe a challenging project you worked on." Mine dates back to 2011, when AWS had only a few RDS choices, unlike the many options available today.

Throughout my career, I have worked on many interesting projects, but I will focus on my time at Precor, a fitness equipment manufacturer. In 2011, Precor was building on the concept of "Connected Fitness" to allow fitness machines to connect to the internet. This enabled users to download workouts, save workouts, watch instructional videos, read e-books while running on a treadmill, and enjoy many other features.

Precor needed to build a team from the ground up for this project, as they previously only made fitness machines with basic embedded software. I was the second hire after the engineering manager and became Principal Engineer. My mission was to design and build on the vision of "connected fitness." Precor had two main teams: one working on the console (P80 equipment, a dedicated terminal attached to a fitness machine) and my team, working on the backend systems powering all machines in gyms, clubs, and hotels.

Team Composition:


My team consisted of an Engineering Manager, a Principal Engineer (me) as team lead/architect, five engineers, and a Product Owner. We followed a Scrum approach.
 

My Mission:


I was tasked with building two categories of APIs, leading the team in defining much of the architecture and the code modules:


1. APIs to help club operators understand machine utilization.  
2. APIs to empower exercisers.

Focus on the Exerciser's API:


Every console connected to the internet had a dedicated UI running on an embedded Linux machine, powered by an ARM Cortex CPU with decent video capability, such as rendering YouTube videos. The Fitness Equipment (FE) served as an API client. Another client type was the Mobile App, built by a third party.

The backend systems were built as microservices running on the AWS Cloud. The Exerciser API was a REST API leveraging OAuth2 for user authorization. The use case for exercisers was to create and track their fitness goals using both a mobile app and the fitness machines, regardless of location, as long as they were using Precor connected fitness equipment.

For club owners, the use case was to better serve their customers with modern machines and understand machine utilization, idle time, and receive custom alerts for machine malfunctions. They could also generate custom reports on user exercise frequency to predict membership cancellations.

Exerciser API Features:


The Exerciser API allowed users to log in via RFID, enabling the machine to adjust settings such as angle, inclination, and speed on a treadmill, and start recording exercise data, including calories and duration. Users could check their daily and weekly exercise progress towards their goals on the mobile app, which could be customized for goals like getting fit or losing weight. Users were awarded badges on the mobile app for achieving milestones, such as 1,000 steps, accompanied by a congratulatory message and a cool image.


The Exerciser API:

 

 


On the backend, we built the stack with Java and the Spring Framework, using Apache as the HTTP server. The primary database was RDS (MySQL), and we used DynamoDB, a key-value NoSQL datastore, for high-volume, write-heavy data. Redis was used to track denial-of-service attacks.

DynamoDB was utilized for the Fitness Machine APIs to store frequent heartbeat data and log messages. To buffer between the server and storage, we implemented a message queue in front of DynamoDB.



The goals of this project, known as Preva, were to increase user retention, attract new members, drive secondary revenue generation, and help gym members achieve their goals.

Precor conducted several studies to measure each of these goals, and I will link to these studies on the Precor website:

  • Increase retention
  • Attract new members
  • Drive secondary revenue generation
  • Help gym members achieve their goals


Conclusion:


This project was both fun and challenging due to several factors: tight deadlines, the innovation involved, learning cloud computing, leading a team, and ultimately helping people improve their lives by promoting a healthy lifestyle.



Wednesday, January 7, 2015

Spring Boot Camper and REST Assured Testing Library

I almost never blog, but I will attempt once more. Hopefully I will make it a habit.

I have recently created a sample project to demonstrate how simple and cool it is to write REST APIs using Spring Boot, and how to test them with integration tests that run as fast as unit tests using REST Assured.

This is going to be a series of samples, each focused on showcasing one aspect of the framework or a technique.
The first one is camp_rest_assured, which demonstrates how to integrate Spring Boot with the REST Assured library, with an extra bonus: a technique I created that uses a custom converter service to do request validation with custom messages collected from representation beans annotated with JSR-303 validations.

Check out the sampler project here:
https://github.com/phavelar/boot-camper

I will be adding detailed explanations soon. Stay Tuned!

Friday, November 2, 2012

Grails URL Encoding

Bug: Form data being encoded twice in Grails 2.1.1


I recently ran into a character-encoding bug in a Grails 2.1.1 application that manifested only when the web app was deployed to production.
All tests were passing during CI builds as well as in the local development environment.
It was a tricky issue to figure out, so I want to share it here, hoping to save you some time if you encounter this problem and are fortunate enough to find this post :)

Say you're dealing with i18n and have UTF-8 form-encoded text data (application/x-www-form-urlencoded). Somehow, after posting text in Cyrillic, the contents were being doubly encoded, messing up the original text data.

The issue does not manifest in development mode. Or rather, I discovered that if you use the IntelliJ IDE to launch the web application ("exploded mode"), all is normal; that is, the Cyrillic text is properly encoded. Instead, if you build the war using the "grails war" command and manually deploy it to Tomcat, the bug appears.

Digging deeper, we found the following sloppy code used to encode the form data:

URLEncoder.encode(formData)

As you can see from the Javadoc, this is a deprecated method, where the resulting string may vary depending on the platform's default encoding.

I'm not sure why the platform default changes when packaging the war via the "grails war" command versus running the war from within the IDE, but the fact is that this cost us a few hours of debugging.

Method Summary:

  static String encode(String s)
      Deprecated. The resulting string may vary depending on the platform's default encoding. Instead, use the encode(String, String) method to specify the encoding.
  static String encode(String s, String enc)
      Translates a string into application/x-www-form-urlencoded format using a specific encoding scheme.

To fix, simply change the code to URLEncoder.encode(formData, "UTF-8")
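A minimal demonstration of the safe form (the Cyrillic sample string is my own, not from the original application):

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

public class EncodeDemo {

    // Always name the charset: the one-argument URLEncoder.encode(String)
    // is deprecated precisely because it uses the platform default encoding.
    public static String encodeUtf8(String formData) {
        try {
            return URLEncoder.encode(formData, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            throw new IllegalStateException("UTF-8 is always supported", e);
        }
    }

    public static void main(String[] args) {
        // Identical output on every platform, packaged or exploded.
        System.out.println(encodeUtf8("привет")); // %D0%BF%D1%80%D0%B8%D0%B2%D0%B5%D1%82
    }
}
```

With the charset pinned, the output no longer depends on whatever file.encoding the container happens to start with.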

So, discovering the packaged-versus-exploded war behavior was half the battle in being able to reproduce this bug. However, this could have been avoided altogether if the developer had paid attention to the compiler's deprecation warnings. Or not done this:

-Dgrails.log.deprecated=false  // turns off deprecation logging in development mode

Hope this post can help some fellow developers!

Happy Coding!




 

Thursday, October 18, 2012


Using Guava Ranges to Implement Rules-based Logic


Consider the requirement of implementing an awarding mechanism based on goal-completion percentage.
This kind of requirement is commonly found in loyalty programs. For example, if the user achieves between 70% and 79% of his/her original goal, the user gets a 10-point reward. Likewise, if the user completes 80% to 89%, the user gets 15 points, and so on. The basic idea is to have a set of percentage ranges associated with points. In addition, the percentage-range/point combinations must be changeable to support evolving requirements.
I'm currently working on a fitness application, so I'm going to use this domain to illustrate a coding technique for rules-based logic without using explicit IF statements. In this fitness application, we wish to award "badges" to an exerciser, based on the percentage of goal completion.

The first thing that comes to mind is to implement a series of "if-else" blocks to determine the percentage range the exerciser is at, according to his goal completion:

 int lookupPoints(int goal)
 {
   int points = 0;
   if (goal >= 70 && goal <= 79) {
     points = 10;
   }
   else if (goal >= 80 && goal <= 89) {
     points = 15;
   }
   else if (goal >= 90 && goal <= 99) {
     points = 25;
   }
   else if (goal >= 100 && goal <= 109) {
     points = 50;
   }
   else if (goal >= 110) {  // >= so that exactly 110 is not left unscored
     points = 60;
   }
   return points;
 }


This approach may be fine, but it is spaghetti code. As requirements change, you may have to add or modify the existing range boundaries and point values.

Another approach to consider is leveraging a rules engine. I think using a rules engine, like Drools is overkill for this simple case.

Is there a way to accomplish the same thing, without using any explicit "if"s, while making it possible to easily modify the ranges and point values?
Yes; after all, you're reading this to find out how! With this technique, there is no need to change the underlying decision-making algorithm when requirements change; all you need to do is change the data setup, just like a fixture.

What kind of sorcery is this? Guava libraries to the rescue!

Enter the Range class (follow the link for its Javadoc).
A range (or "interval") defines the boundaries around a contiguous span of values of some Comparable type. So we can express all the intervals in the code block above this way:


 Ranges.closedOpen(70, 80);   // [70, 80): same as goal >= 70 && goal <= 79 for integer percentages
 Ranges.closedOpen(80, 90);
 Ranges.closedOpen(90, 100);
 Ranges.closedOpen(100, 110);
 Ranges.atLeast(110);

What we need next is a way to associate each range with its "points" value.
We can use a Map for that:

 Map<Range<Integer>, Integer> pointAwardMap = new HashMap<Range<Integer>, Integer>();
 pointAwardMap.put(Ranges.closedOpen(70, 80), 10);
 pointAwardMap.put(Ranges.closedOpen(80, 90), 15);
 // ... etc., until
 pointAwardMap.put(Ranges.atLeast(110), 60);

Now that we have points associated with ranges, we need a way to do the lookup based on the current goal value. That can be accomplished using Guava's Collections2.filter method:

 int lookupAwardPoints(final Integer completedGoal)
 {
    Collection<Range<Integer>> percentileRange = filter(pointAwardMap.keySet(), new Predicate<Range<Integer>>()
    {
      @Override
      public boolean apply(Range<Integer> input)
      {
        return input.contains(completedGoal);
      }
    });
    return percentileRange.isEmpty() ? 0 : pointAwardMap.get(percentileRange.iterator().next());
 }

So, now that you know the thought process, all that is left is to organize the code in a nice, clean way, removing duplication and making it easy to add new rules:

1:  import com.google.common.base.Predicate;  
2:  import com.google.common.collect.Range;  
3:  import com.google.common.collect.Ranges;  
4:  import java.util.Collection;  
5:  import java.util.HashMap;  
6:  import java.util.Map;  
7:  import static com.google.common.collect.Collections2.filter;  
8:  public class PointAward  
9:  {  
10:    private final Map<Range<Integer>, Integer> pointAwardMap = new HashMap<Range<Integer>, Integer>();  
11:    {  
12:      addRangeAndPointAward(70, 80, 10);  
13:      addRangeAndPointAward(80, 90, 15);  
14:      addRangeAndPointAward(90, 100, 25);  
15:      addRangeAndPointAward(100, 110, 50);  
16:      addRangeAndPointAward(110, 60);  
17:    }  
18:    public int lookupAwardPoints(final Integer percentCompleted)  
19:    {  
20:      Collection<Range<Integer>> percentileRange = filter(pointAwardMap.keySet(), new Predicate<Range<Integer>>()  
21:      {  
22:        @Override  
23:        public boolean apply(Range<Integer> input)  
24:        {  
25:          return input.contains(percentCompleted);  
26:        }  
27:      });  
28:      return percentileRange.isEmpty() ? 0 : pointAwardMap.get(percentileRange.iterator().next());  
29:    }  
30:    private void addRangeAndPointAward(int lowerEnd, int pointAward)  
31:    {  
32:      pointAwardMap.put(Ranges.atLeast(lowerEnd), pointAward);  
33:    }  
34:    private void addRangeAndPointAward(int lowerEnd, int upperEnd, int pointAward)  
35:    {  
36:      pointAwardMap.put(Ranges.closedOpen(lowerEnd, upperEnd), pointAward);  
37:    }  
38:  }  

Note that if your rules change, all you need to do is modify the "fixture"-like code on lines 12-16; there is no need to ever change the lookupAwardPoints() function. Also notice that we don't have any ifs or spaghetti code.
Granted, this code is more sophisticated than the first version, but it gives you a lot of flexibility.
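As an aside, if you would rather not pull in Guava at all, a similar data-driven lookup can be sketched with the JDK's own TreeMap. This is my alternative take, not the technique above, and it assumes the bands are contiguous (each lower bound runs up to the next one):

```java
import java.util.Map;
import java.util.TreeMap;

// JDK-only alternative sketch: keys are the lower bounds of each percentage
// band, values are the awarded points. floorEntry(goal) returns the entry
// with the greatest key that does not exceed the goal.
public class JdkPointAward {

    private final TreeMap<Integer, Integer> pointsByLowerBound = new TreeMap<Integer, Integer>();

    public JdkPointAward() {
        pointsByLowerBound.put(70, 10);   // 70-79   -> 10 points
        pointsByLowerBound.put(80, 15);   // 80-89   -> 15
        pointsByLowerBound.put(90, 25);   // 90-99   -> 25
        pointsByLowerBound.put(100, 50);  // 100-109 -> 50
        pointsByLowerBound.put(110, 60);  // 110+    -> 60
    }

    public int lookupAwardPoints(int percentCompleted) {
        Map.Entry<Integer, Integer> rule = pointsByLowerBound.floorEntry(percentCompleted);
        return rule == null ? 0 : rule.getValue();  // below 70: no award
    }
}
```

This works precisely because the bands tile the number line from 70 upward; the Guava Range version expresses gaps and boundary semantics more explicitly.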

I hope I showed you a useful trick!

Until next time!