Introduction
Most ML tutorials end at model accuracy. Production ML engineering starts there. This post covers the full lifecycle — training, evaluation, serialisation, and serving — based on the Inventory Forecasting System I built using XGBoost.
The Problem
Given 18 months of historical sales data for a small business (product, date, quantity, price, promotions), predict demand for the next 30 days per SKU.
Feature Engineering
Raw timestamps are useless to tree models. Extract temporal features:
import pandas as pd

def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    df['date'] = pd.to_datetime(df['date'])
    df['day_of_week'] = df['date'].dt.dayofweek
    df['month'] = df['date'].dt.month
    df['week_of_year'] = df['date'].dt.isocalendar().week.astype(int)
    df['is_month_end'] = df['date'].dt.is_month_end.astype(int)
    # Lag features: demand 7 and 28 days ago, per SKU
    df['lag_7'] = df.groupby('sku_id')['quantity'].shift(7)
    df['lag_28'] = df.groupby('sku_id')['quantity'].shift(28)
    # Rolling mean over the previous 7 days; shift(1) keeps today's value out
    df['rolling_mean_7'] = (
        df.groupby('sku_id')['quantity']
        .transform(lambda x: x.shift(1).rolling(7).mean())
    )
    return df.dropna()
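The `shift(1)` inside the rolling mean deserves a second look. Without it, the window at row i includes row i's own target value, which leaks the label into the features. A minimal illustration on a toy series:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60, 70, 80])

# Leaky: the window at row i includes row i's own value
leaky = s.rolling(3).mean()

# Safe: shift(1) first so the window only sees past values
safe = s.shift(1).rolling(3).mean()

print(leaky.iloc[3])  # 30.0 — averages rows 1..3, including today
print(safe.iloc[3])   # 20.0 — averages rows 0..2 only
```

The same logic applies per SKU inside the groupby in `engineer_features`.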
Training with Early Stopping
import xgboost as xgb
from sklearn.model_selection import TimeSeriesSplit

# CV folds for hyperparameter tuning (tuning loop omitted here)
tscv = TimeSeriesSplit(n_splits=5)

params = {
    'objective': 'reg:squarederror',
    'max_depth': 6,
    'learning_rate': 0.05,
    'n_estimators': 1000,
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'early_stopping_rounds': 50,
}
model = xgb.XGBRegressor(**params)
model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    verbose=100,
)
Time series splits are critical. Never use a random train/test split on temporal data; you'll leak future information into training.
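To make the fold ordering concrete, here is what `TimeSeriesSplit` produces on 20 time-ordered rows (a sketch on synthetic indices; the real splits run over the training frame). Every training index strictly precedes every validation index:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(20).reshape(-1, 1)  # 20 time-ordered rows
tscv = TimeSeriesSplit(n_splits=4)

for fold, (train_idx, val_idx) in enumerate(tscv.split(X)):
    # Training window grows; validation window always lies after it
    print(fold, train_idx.max(), val_idx.min(), val_idx.max())
# fold 0 trains on rows 0..3 and validates on 4..7, and so on
```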
Evaluation
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
mape = mean_absolute_percentage_error(y_test, y_pred) * 100
print(f"MAE: {mae:.1f} units")
print(f"MAPE: {mape:.1f}%")
# MAE: 12.3 units
# MAPE: 8.7%
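One caveat: `mean_absolute_percentage_error` divides by the actuals, so days with zero demand blow MAPE up. Demand data has plenty of zero days. A common workaround (an assumption on my part, not part of the pipeline above) is to report MAPE only over nonzero actuals:

```python
import numpy as np

def mape_nonzero(y_true, y_pred):
    """MAPE computed only where the actual demand is nonzero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = y_true != 0
    return np.mean(np.abs((y_true[mask] - y_pred[mask]) / y_true[mask])) * 100

# The zero-demand day is excluded instead of dominating the average
print(mape_nonzero([0, 10, 20], [1, 9, 22]))
```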
Serialisation
from datetime import datetime

import joblib

# Save model + feature list together — prevents feature mismatch bugs
artifact = {
    'model': model,
    'features': list(X_train.columns),
    'trained_at': datetime.utcnow().isoformat(),
}
joblib.dump(artifact, 'models/demand_forecast_v2.pkl')
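The bundled feature list pays off at load time: the serving side can check that an incoming frame exposes exactly the columns the model was trained on, instead of failing deep inside `predict`. A minimal sketch of the roundtrip (path and column names are illustrative, and a placeholder stands in for the trained model):

```python
import os
import tempfile

import joblib

artifact = {
    'model': None,  # stand-in for the trained regressor
    'features': ['day_of_week', 'month', 'lag_7', 'rolling_mean_7'],
    'trained_at': '2024-01-01T00:00:00',
}
path = os.path.join(tempfile.mkdtemp(), 'demand_forecast_v2.pkl')
joblib.dump(artifact, path)

loaded = joblib.load(path)
incoming_cols = {'day_of_week', 'month', 'lag_7'}  # a request missing one feature
missing = set(loaded['features']) - incoming_cols
print(missing)  # {'rolling_mean_7'} — caught before predict() ever runs
```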
Django REST Endpoint
# views.py
import joblib
import pandas as pd
from rest_framework.views import APIView
from rest_framework.response import Response

# engineer_features must be importable here: reuse the training-time
# implementation so serving features match training exactly

# Loaded once at import time, not per request
artifact = joblib.load('models/demand_forecast_v2.pkl')
model = artifact['model']
feature_cols = artifact['features']

class ForecastView(APIView):
    def post(self, request):
        data = request.data.get('data', [])
        df = pd.DataFrame(data)
        df = engineer_features(df)
        X = df[feature_cols]
        predictions = model.predict(X).tolist()
        return Response({'forecast': predictions, 'unit': 'quantity'})
Gotchas in Production
- Feature drift — retrain monthly; monitor MAE on a holdout set
- Cold start for new SKUs — fall back to category average
- Negative predictions — clip to 0 with np.clip(predictions, 0, None)
- Model versioning — store artifacts in S3 with version tags, never overwrite
- Latency — XGBoost predict on 100 rows takes ~5ms; for batch jobs, vectorise, don't loop
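The cold-start fallback above can be sketched as a simple lookup (column names and the helper are illustrative, not the production code):

```python
import pandas as pd

history = pd.DataFrame({
    'sku_id':   ['A1', 'A1', 'B2'],
    'category': ['snacks', 'snacks', 'drinks'],
    'quantity': [10, 14, 30],
})
category_avg = history.groupby('category')['quantity'].mean()

def forecast_with_fallback(sku_id, category, model_forecast=None):
    """Use the model's forecast when available; otherwise the category mean,
    and the global mean for a category never seen before."""
    if model_forecast is not None:
        return model_forecast
    return category_avg.get(category, history['quantity'].mean())

print(forecast_with_fallback('C9', 'snacks'))  # 12.0 — mean of the snacks SKUs
```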
Results
The system reduced stock-outs by 34% and excess inventory holding costs by 22% for a 3-location business — purely from a better demand signal, with no change to ordering processes.