Diagnosis Methodology
Data sources, definitions, and analytical approach
Data Source
Source
Inveragro sales transaction database
Time Period
2016 - 2026 (11 years)
Total Transactions
138,900 sales records
Unique Customers
6,605 identified by NIT
Data Quality
Cleaning and validation steps
The raw data was cleaned and validated through the following process:
- Removed duplicate transactions (same NIT, date, invoice)
- Standardized customer identifiers (NIT format normalization)
- Validated date ranges and corrected obvious errors
- Excluded returns and credit notes from revenue calculations
- Geocoded addresses to city/zone level
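The deduplication, NIT normalization, and return-exclusion steps above can be sketched roughly as follows. This is a minimal illustration, not the pipeline actually used: the field names `nit`, `date`, `invoice`, and `doc_type`, the document-type labels, and the digits-only normalization rule are all assumptions. Only `valor_neto` appears in the source.

```python
import re


def normalize_nit(raw: str) -> str:
    """Keep digits only (assumed rule: drops dots, dashes, and spaces)."""
    return re.sub(r"\D", "", raw)


def clean_transactions(rows):
    """Deduplicate on (NIT, date, invoice) and drop returns and credit notes.

    `rows` is a list of dicts; the keys used here are hypothetical names.
    """
    seen = set()
    cleaned = []
    for row in rows:
        # Normalize first so that differently formatted NITs compare equal.
        nit = normalize_nit(row["nit"])
        key = (nit, row["date"], row["invoice"])
        if key in seen:
            continue  # duplicate transaction
        seen.add(key)
        if row["doc_type"] in {"return", "credit_note"}:
            continue  # excluded from revenue calculations
        cleaned.append({**row, "nit": nit})
    return cleaned
```

Normalizing the NIT before deduplicating means that two records differing only in NIT punctuation are treated as the same transaction. Date validation and geocoding are left out because the source does not describe their rules.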
Key Definitions
Active Customer
A customer with at least one purchase in the 12 months before the analysis date
Churned Customer
A customer with prior transaction history but no purchases in the 12 months before the analysis date
Customer Tenure
Time between first and last purchase for each customer
Revenue Attribution
Total valor_neto (net value) from all transactions for a given customer, year, or segment
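As a rough illustration of how these definitions might be computed, the sketch below classifies a customer, measures tenure, and sums `valor_neto` by a grouping key. The `as_of` reference date, the 365-day approximation of "12 months", and all function names are assumptions made for the example.

```python
from datetime import date, timedelta


def customer_status(purchase_dates, as_of):
    """Return 'active' if the latest purchase is within ~12 months of as_of."""
    if not purchase_dates:
        raise ValueError("a customer needs at least one purchase on record")
    cutoff = as_of - timedelta(days=365)  # assumption: 12 months = 365 days
    return "active" if max(purchase_dates) >= cutoff else "churned"


def tenure_days(purchase_dates):
    """Days between a customer's first and last purchase."""
    return (max(purchase_dates) - min(purchase_dates)).days


def revenue_by(transactions, key):
    """Sum valor_neto per distinct value of `key` (e.g. customer, year, segment)."""
    totals = {}
    for t in transactions:
        totals[t[key]] = totals.get(t[key], 0.0) + t["valor_neto"]
    return totals
```

For example, `customer_status([date(2022, 3, 1)], date(2025, 6, 30))` returns `"churned"`, because the only purchase is older than the cutoff. A customer with a single purchase has a tenure of zero days under this definition.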
RFM Segmentation
Customer scoring methodology
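Customers are scored from 1 (worst) to 5 (best) on three dimensions. Frequency and Monetary are scored by quintile; Recency is scored against the day thresholds shown: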
| Dimension | Metric | Score 5 (Best) | Score 1 (Worst) |
|---|---|---|---|
| Recency | Days since last purchase | ≤30 days | >365 days |
| Frequency | Total order count | Top 20% | Bottom 20% |
| Monetary | Total revenue | Top 20% | Bottom 20% |
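The quintile scoring used for Frequency and Monetary could be sketched as below. This is one plausible rank-based implementation, not necessarily the one used in the analysis; the function name and the tie-breaking behavior are assumptions. Recency is omitted because the table gives only the thresholds for scores 5 and 1, not the intermediate cut points.

```python
def quintile_scores(values):
    """Map each customer id to a 1-5 score by rank; 5 = top 20% of values.

    `values` is a dict of {customer_id: metric}, e.g. order count or revenue.
    Ties are broken by insertion order (an assumption; sorted() is stable).
    """
    ranked = sorted(values, key=values.get)  # ascending: lowest metric first
    n = len(ranked)
    # Position i in the ascending ranking falls into bucket (i * 5) // n.
    return {cid: (i * 5) // n + 1 for i, cid in enumerate(ranked)}
```

With ten customers, the two lowest values receive a score of 1 and the two highest receive a score of 5. When the customer count is not a multiple of five, the buckets differ in size by at most one.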
Segment Assignment Rules
| Segment | R Score | F Score | M Score |
|---|---|---|---|
| Champions | ≥4 | ≥4 | ≥4 |
| Loyal | ≥4 | ≥3 | ≥3 |
| Cannot Lose | ≤2 | ≥4 | any |
| At Risk | ≤2 | ≥3 | ≥3 |
| Lost/Hibernating | ≤2 | ≤2 | any |
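Several of these rules overlap (for example, a customer with R=1, F=5, M=5 satisfies both Cannot Lose and At Risk), so an implementation needs a precedence order. The sketch below assumes the rules are evaluated top to bottom and the first match wins; that ordering, the function name, and the `"Unassigned"` fallback label are assumptions not stated in the source.

```python
def assign_segment(r: int, f: int, m: int) -> str:
    """Assign an RFM segment, checking rules in table order (first match wins)."""
    if r >= 4 and f >= 4 and m >= 4:
        return "Champions"
    if r >= 4 and f >= 3 and m >= 3:
        return "Loyal"
    if r <= 2 and f >= 4:  # M score: any
        return "Cannot Lose"
    if r <= 2 and f >= 3 and m >= 3:
        return "At Risk"
    if r <= 2 and f <= 2:  # M score: any
        return "Lost/Hibernating"
    # Hypothetical fallback: the table does not cover every score combination.
    return "Unassigned"
```

Under this reading, `assign_segment(1, 5, 5)` returns `"Cannot Lose"` rather than `"At Risk"`. Combinations such as R=3, or R=2 with F=3 and M=2, match none of the listed rules and fall through to the fallback.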
Limitations
- Customer identification relies on NIT; the same entity may appear under multiple NITs
- 2021 appears as a partial year due to a data collection gap
- No cost data is available, so margin analysis is not possible
- Competitive intelligence is not included in the analysis