Diagnosis Methodology

Data sources, definitions, and analytical approach

Data Source

Source

Inveragro sales transaction database

Time Period

2016 - 2026 (11 years)

Total Transactions

138,900 sales records

Unique Customers

6,605 identified by NIT

Data Quality
Cleaning and validation steps

The raw data was cleaned and validated through the following process:

  • Removed duplicate transactions (same NIT, date, invoice)
  • Standardized customer identifiers (NIT format normalization)
  • Validated date ranges and corrected obvious errors
  • Excluded returns and credit notes from revenue calculations
  • Geocoded addresses to city/zone level
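
The deduplication, NIT normalization, and return-exclusion steps above can be sketched in plain Python. This is a minimal illustration, not the pipeline actually used: the field names `nit`, `date`, `invoice`, and `doc_type`, the values in `RETURN_TYPES`, and the normalization rule (keep the digits before any check-digit hyphen) are all assumptions. Only `valor_neto` appears in the source.

```python
import re

# Hypothetical document types treated as returns or credit notes.
RETURN_TYPES = {"return", "credit_note"}


def normalize_nit(raw):
    """Keep only the digits before any check-digit hyphen (assumed rule)."""
    base = str(raw).split("-")[0]
    return re.sub(r"\D", "", base)


def clean(records):
    """Deduplicate on (NIT, date, invoice) and drop returns/credit notes."""
    seen = set()
    cleaned = []
    for rec in records:
        nit = normalize_nit(rec["nit"])
        key = (nit, rec["date"], rec["invoice"])
        if key in seen:
            continue  # duplicate transaction
        seen.add(key)
        if rec.get("doc_type") in RETURN_TYPES:
            continue  # excluded from revenue calculations
        cleaned.append({**rec, "nit": nit})
    return cleaned
```

Normalizing the NIT before building the deduplication key means that two rows differing only in NIT formatting are treated as the same transaction.
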
Key Definitions

Active Customer

A customer with at least one purchase in the last 12 months

Churned Customer

A customer with prior transaction history but no purchases in the last 12 months

Customer Tenure

Time between first and last purchase for each customer
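
The three definitions above can be expressed as a small function. This is a sketch under stated assumptions: the source does not say which reference date "the last 12 months" is measured from, so `as_of` is a hypothetical parameter, and the 12-month window is approximated as 365 days.

```python
from datetime import date, timedelta


def customer_status(purchase_dates, as_of):
    """Return ('active' or 'churned', tenure_in_days) for one customer.

    purchase_dates: non-empty list of datetime.date values.
    as_of: reference date for the 12-month window (assumed parameter).
    """
    first, last = min(purchase_dates), max(purchase_dates)
    cutoff = as_of - timedelta(days=365)  # 12 months approximated as 365 days
    status = "active" if last >= cutoff else "churned"
    tenure_days = (last - first).days  # time between first and last purchase
    return status, tenure_days
```

Under this definition a customer with a single purchase has a tenure of zero days.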

Revenue Attribution

Total valor_neto (net value) from all transactions for a given customer, year, or segment
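
As an illustration of this definition, revenue attribution reduces to a grouped sum of `valor_neto`. The sketch below assumes each record is a dictionary; the keys `nit` and `date` and the ISO date-string format are assumptions, not taken from the source.

```python
from collections import defaultdict


def revenue_by(records, key):
    """Sum valor_neto over records, grouped by key(record)."""
    totals = defaultdict(float)
    for rec in records:
        totals[key(rec)] += rec["valor_neto"]
    return dict(totals)


# Example grouping functions (field names are hypothetical).
by_customer = lambda rec: rec["nit"]
by_year = lambda rec: rec["date"][:4]  # assumes ISO dates such as "2020-01-05"
```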

RFM Segmentation
Customer scoring methodology

Customers are scored on three dimensions using quintiles (1-5):

Dimension | Metric                   | Score 5 (Best) | Score 1 (Worst)
Recency   | Days since last purchase | ≤30 days       | >365 days
Frequency | Total order count        | Top 20%        | Bottom 20%
Monetary  | Total revenue            | Top 20%        | Bottom 20%
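
A minimal sketch of rank-based quintile scoring follows. It matches the Frequency and Monetary rows as stated (top 20% scores 5, bottom 20% scores 1). For Recency the table gives only the two endpoints in days, so the intermediate cut-offs are not reproduced here; the `higher_is_better=False` option shows how Recency could be scored if it were also treated as a quintile, which is an assumption. The function name and the tie-breaking behavior are illustrative.

```python
def quintile_scores(values, higher_is_better=True):
    """Map each value to a score from 1 to 5 by its rank-based quintile.

    Ties are broken by input order, so equal values near a quintile
    boundary can receive different scores (a simplification).
    """
    n = len(values)
    # Worst value first, so rank 0 lands in the lowest quintile.
    order = sorted(range(n), key=lambda i: values[i],
                   reverse=not higher_is_better)
    scores = [0] * n
    for rank, idx in enumerate(order):
        scores[idx] = rank * 5 // n + 1
    return scores


# Frequency or Monetary: larger is better.
f_scores = quintile_scores([10, 50, 20, 40, 30])
# Recency in days: smaller is better.
r_scores = quintile_scores([5, 400, 90, 30, 200], higher_is_better=False)
```
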
Segment Assignment Rules

Segment          | R Score | F Score | M Score
Champions        | ≥4      | ≥4      | ≥4
Loyal            | ≥4      | ≥3      | ≥3
Cannot Lose      | ≤2      | ≥4      | any
At Risk          | ≤2      | ≥3      | ≥3
Lost/Hibernating | ≤2      | ≤2      | any
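
The rules above can be written as a single function. Some rules overlap (every Champion also meets the Loyal thresholds, and some Cannot Lose customers also meet the At Risk thresholds), so this sketch assumes the rows are evaluated top to bottom and the first match wins. The source does not state the precedence, and it does not name a segment for score combinations that match no row (for example R = 3); the label "Other" is hypothetical.

```python
def assign_segment(r, f, m):
    """Return the segment for RFM scores r, f, m (each 1 to 5).

    Rules are checked in table order; the first match wins (assumed).
    """
    if r >= 4 and f >= 4 and m >= 4:
        return "Champions"
    if r >= 4 and f >= 3 and m >= 3:
        return "Loyal"
    if r <= 2 and f >= 4:
        return "Cannot Lose"
    if r <= 2 and f >= 3 and m >= 3:
        return "At Risk"
    if r <= 2 and f <= 2:
        return "Lost/Hibernating"
    return "Other"  # hypothetical label for unmatched combinations
```
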
Limitations
  • Customer identification relies on NIT; the same entity may appear under multiple NITs
  • 2021 appears as a partial year due to a data collection gap
  • No cost data is available, so margin analysis is not possible
  • Competitive intelligence is not included in the analysis