“Lean Evaluations” for Measuring the Impact of Digital Farmer Services

In the rapidly evolving digital farmer services (DFS) sector, there’s a growing need for meaningful, rigorous evidence of impact without relying on resource-heavy randomized controlled trials (RCTs). Through a partnership with the Bill & Melinda Gates Foundation and Busara, 60 Decibels piloted a new evaluation method that balances rigor with efficiency.

Introducing Lean Evaluation: Bridging the Gap

60 Decibels’ Lean Data methodology, known for its agility and low cost, leverages short, phone-based interviews to gather feedback from end users, providing directional insights into outcomes like increased production or income. This approach typically concludes in under four months. Our challenge was to develop a method that, while still agile, allows for causal inference—something more robust than Lean Data without the costs and time demands of RCTs and quasi-experimental designs (QEDs). This led us to develop the lean evaluation methodology, piloted with two DFS companies in Sub-Saharan Africa. 

Key Features of Lean Evaluation

To create this approach, we adapted the difference-in-differences (DiD) method with a few practical adjustments:

  • Brief Phone Interviews
    Instead of in-person surveys, we used 20-30 minute phone interviews, gathering only essential data to support cost and time efficiency. This leverages the Lean Data approach, keeping interviews concise and feasible across large, dispersed groups.
  • Retrospective Baseline
    Without pre-registration contact, we established a retrospective baseline, asking farmers to describe outcomes from the previous agricultural season. Conducting these as soon after registration as possible minimizes recall bias.
  • Provider-Selected Comparison Groups
    We asked DFS providers to identify comparison groups among farmers they had previously contacted but who hadn’t enrolled. While this method assumes consistent time-related factors across groups, we recognize potential limitations around self-selection.
  • Reduced Sample Size
    With operational constraints in mind, we aimed for 600-800 respondents per study—smaller than traditional DiD requirements but sufficient to support statistically meaningful analysis given our focus.
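The core estimator behind these adjustments is standard difference-in-differences: the change in the outcome for registered farmers minus the change for the comparison group, which nets out trends common to both. A minimal sketch with simulated data (all numbers are illustrative, not pilot results):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 350  # farmers per arm, consistent with a 600-800 total sample

# Hypothetical retrospective-baseline and endline outcomes (e.g. input
# spend) for registered (treatment) and non-registered (comparison)
# farmers. A shared seasonal trend (+10) affects both groups; the
# simulated service effect adds a further +15 for treatment farmers.
treat_pre = rng.normal(100, 20, n)
treat_post = treat_pre + 10 + 15 + rng.normal(0, 5, n)
comp_pre = rng.normal(100, 20, n)
comp_post = comp_pre + 10 + rng.normal(0, 5, n)

# Difference-in-differences: change in treatment minus change in comparison.
# The common +10 trend cancels, recovering the ~15 service effect.
did = (treat_post.mean() - treat_pre.mean()) - (comp_post.mean() - comp_pre.mean())
print(round(did, 1))  # close to 15 with this sample size
```

The parallel-trends assumption noted above does the heavy lifting here: if non-registrants were on a different trajectory for reasons related to why they didn’t enroll, the subtraction no longer isolates the service effect.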

Insights from Testing Lean Evaluation

After two pilot studies, we identified several challenges and considerations unique to the DFS context:

Baseline and Comparison Group Challenges

It’s challenging to establish a true baseline since contact information typically becomes available only after registration. Our retrospective-baseline design anticipated this, but it proved harder in practice because registrations arrived in a slow trickle rather than one large batch, stretching the window between registration and the baseline interview. This can delay or otherwise affect baseline measurements, and comparison groups may differ from registered users in ways that affect the findings, for example if there is a systematic reason farmers chose not to register for the service.

Higher Attrition with DFS

Phone-based lean evaluations can face higher attrition between baseline and endline (a 12-month period here) due to factors like signal issues and changes in contact details. For DFS, there’s an additional layer: attrition linked to low sustained use. A high share of users who had registered at baseline did not continue using the service, and so could no longer be classified as “treatment” at endline.
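In practice this means recruiting well above the target endline sample. A back-of-envelope planning calculation, with purely illustrative rates (not figures from the pilots):

```python
# Recruitment planning under two compounding losses: farmers who cannot
# be re-reached by phone at endline, and registrants who stopped using
# the service and so no longer count as "treatment". Rates are
# hypothetical assumptions for illustration.
target_endline = 300   # completed treatment interviews needed at endline
reach_rate = 0.70      # share of baseline respondents reachable 12 months on
still_using = 0.50     # share of registrants still using the service

# Inflate the baseline recruitment target by both loss factors.
recruit = target_endline / (reach_rate * still_using)
print(int(round(recruit)))  # → 857 registrants needed at baseline
```

Even modest non-use rates roughly double the required baseline sample, which is why sustained-use services suit this method better.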

Focus on Crop-Specific and Intermediate Outcomes

Lean evaluations, constrained by phone interview length, tend to focus on intermediate outcomes over long-term metrics like income. For DFS not tied to a specific crop, measuring impact is even more complex, so assessing behavioral and practice shifts as indicators of potential long-term benefits was more practical than measuring yield and income impacts. 


Early Stage Risks in DFS

Many DFS companies are still building scale, which introduces risks when trying to gauge true impact. Evaluations often involve new services, customers, or regions, adding uncertainty.

Resource Planning is Essential

While lean evaluations are more efficient than RCTs, they require considerable resources—up to $100,000 and around 18 months for completion, depending on various factors.


Conclusion: Finding the Right Fit for DFS Evaluation

Lean evaluations offer a middle ground for evaluating DFS, especially for NGOs or agricultural development programs with flexibility in comparison group selection. However, the method’s effectiveness depends on meeting the outlined conditions. For early-stage or commercial DFS companies, 60 Decibels’ Lean Data or cross-sectional comparisons may be more viable, though they come with trade-offs in causal rigor.

Ultimately, evaluation choice should reflect the context and goals of each DFS, balancing operational constraints with the level of rigor needed to provide actionable insights.

For stakeholders who do want to pursue lean evaluations, we’ve identified conditions that enhance their effectiveness in the DFS landscape:

  • Focus on Observable Medium-Term Outcomes
    Prioritize outcomes that are observable within a year, such as changes in practices or investment behavior.
  • Service Type and Take-Up
    Services with clear, immediate applications and high take-up rates—like input financing or insurance—lend themselves better to lean evaluations.
  • DFS Provider Readiness
    Effective evaluations require a DFS provider equipped to track user registration, service usage, and identify comparable non-users for analysis.
  • Realistic Expectations on Resources and Timeline
    Although lean evaluations are less intensive than RCTs, they still need time, skilled personnel, and funding.
  • Control for Self-Selection Bias in Comparison Groups
    Careful planning and ongoing adjustments are necessary to ensure comparison groups closely resemble the treatment group to mitigate bias.