Introduction
Most ML tutorials end at model accuracy. Production ML engineering starts there. This post covers the full lifecycle — training, evaluation, serialisation, and serving — based on the Inventory Forecasting System I built using XGBoost.
The Problem
Given 18 months of historical sales data for a small business (product, date, quantity, price, promotions), predict demand for the next 30 days per SKU.
Feature Engineering
Raw timestamps are useless to tree models. Extract temporal features:
import pandas as pd

def engineer_features(df: pd.DataFrame) -> pd.DataFrame:
    df = df.copy()
    df['date'] = pd.to_datetime(df['date'])
    df['day_of_week'] = df['date'].dt.dayofweek
    df['month'] = df['date'].dt.month
    df['week_of_year'] = df['date'].dt.isocalendar().week.astype(int)
    df['is_month_end'] = df['date'].dt.is_month_end.astype(int)
    # Lag features: demand 7 and 28 days ago, per SKU
    df['lag_7'] = df.groupby('sku_id')['quantity'].shift(7)
    df['lag_28'] = df.groupby('sku_id')['quantity'].shift(28)
    # Rolling mean over the previous 7 days; shift(1) keeps today's value out
    df['rolling_mean_7'] = (
        df.groupby('sku_id')['quantity']
        .transform(lambda x: x.shift(1).rolling(7).mean())
    )
    return df.dropna()
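The `shift(1)` inside the rolling mean deserves a second look. Without it, the window at row i includes row i's own target value, which leaks the label into the features. A minimal illustration on a toy series:

```python
import pandas as pd

s = pd.Series([10, 20, 30, 40, 50, 60, 70, 80])

# Leaky: the window at row i includes row i's own value
leaky = s.rolling(3).mean()

# Safe: shift(1) first so the window only sees past values
safe = s.shift(1).rolling(3).mean()

print(leaky.iloc[3])  # 30.0 — averages rows 1..3, including today
print(safe.iloc[3])   # 20.0 — averages rows 0..2 only
```

The same logic applies per SKU inside the groupby in `engineer_features`.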
Training with Early Stopping
import xgboost as xgb
from sklearn.model_selection import TimeSeriesSplit

# CV folds for hyperparameter tuning (tuning loop omitted here)
tscv = TimeSeriesSplit(n_splits=5)

params = {
    'objective': 'reg:squarederror',
    'max_depth': 6,
    'learning_rate': 0.05,
    'n_estimators': 1000,
    'subsample': 0.8,
    'colsample_bytree': 0.8,
    'early_stopping_rounds': 50,
}
model = xgb.XGBRegressor(**params)
model.fit(
    X_train, y_train,
    eval_set=[(X_val, y_val)],
    verbose=100,
)
Time series splits are critical. Never use a random train/test split on temporal data; you'll leak future information into training.
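To make the fold ordering concrete, here is what `TimeSeriesSplit` produces on 20 time-ordered rows (a sketch on synthetic indices; the real splits run over the training frame). Every training index strictly precedes every validation index:

```python
import numpy as np
from sklearn.model_selection import TimeSeriesSplit

X = np.arange(20).reshape(-1, 1)  # 20 time-ordered rows
tscv = TimeSeriesSplit(n_splits=4)

for fold, (train_idx, val_idx) in enumerate(tscv.split(X)):
    # Training window grows; validation window always lies after it
    print(fold, train_idx.max(), val_idx.min(), val_idx.max())
# fold 0 trains on rows 0..3 and validates on 4..7, and so on
```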
Evaluation
from sklearn.metrics import mean_absolute_error, mean_absolute_percentage_error
y_pred = model.predict(X_test)
mae = mean_absolute_error(y_test, y_pred)
mape = mean_absolute_percentage_error(y_test, y_pred) * 100
print(f"MAE: {mae:.1f} units")
print(f"MAPE: {mape:.1f}%")
# MAE: 12.3 units
# MAPE: 8.7%
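One caveat: `mean_absolute_percentage_error` divides by the actuals, so days with zero demand blow MAPE up. Demand data has plenty of zero days. A common workaround (an assumption on my part, not part of the pipeline above) is to report MAPE only over nonzero actuals:

```python
import numpy as np

def mape_nonzero(y_true, y_pred):
    """MAPE computed only where the actual demand is nonzero."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    mask = y_true != 0
    return np.mean(np.abs((y_true[mask] - y_pred[mask]) / y_true[mask])) * 100

# The zero-demand day is excluded instead of dominating the average
print(mape_nonzero([0, 10, 20], [1, 9, 22]))
```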
Serialisation
from datetime import datetime

import joblib

# Save model + feature list together — prevents feature mismatch bugs
artifact = {
    'model': model,
    'features': list(X_train.columns),
    'trained_at': datetime.utcnow().isoformat(),
}
joblib.dump(artifact, 'models/demand_forecast_v2.pkl')
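The bundled feature list pays off at load time: the serving side can check that an incoming frame exposes exactly the columns the model was trained on, instead of failing deep inside `predict`. A minimal sketch of the roundtrip (path and column names are illustrative, and a placeholder stands in for the trained model):

```python
import os
import tempfile

import joblib

artifact = {
    'model': None,  # stand-in for the trained regressor
    'features': ['day_of_week', 'month', 'lag_7', 'rolling_mean_7'],
    'trained_at': '2024-01-01T00:00:00',
}
path = os.path.join(tempfile.mkdtemp(), 'demand_forecast_v2.pkl')
joblib.dump(artifact, path)

loaded = joblib.load(path)
incoming_cols = {'day_of_week', 'month', 'lag_7'}  # a request missing one feature
missing = set(loaded['features']) - incoming_cols
print(missing)  # {'rolling_mean_7'} — caught before predict() ever runs
```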
Django REST Endpoint
# views.py
import joblib
import pandas as pd
from rest_framework.views import APIView
from rest_framework.response import Response

# engineer_features must be importable here: reuse the training-time
# implementation so serving features match training exactly

# Loaded once at import time, not per request
artifact = joblib.load('models/demand_forecast_v2.pkl')
model = artifact['model']
feature_cols = artifact['features']

class ForecastView(APIView):
    def post(self, request):
        data = request.data.get('data', [])
        df = pd.DataFrame(data)
        df = engineer_features(df)
        X = df[feature_cols]
        predictions = model.predict(X).tolist()
        return Response({'forecast': predictions, 'unit': 'quantity'})
Gotchas in Production
- Feature drift — retrain monthly; monitor MAE on a holdout set
- Cold start for new SKUs — fall back to category average
- Negative predictions — clip to 0 with np.clip(predictions, 0, None)
- Model versioning — store artifacts in S3 with version tags, never overwrite
- Latency — XGBoost predict on 100 rows takes ~5ms; for batch jobs, vectorise, don't loop
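The cold-start fallback above can be sketched as a simple lookup (column names and the helper are illustrative, not the production code):

```python
import pandas as pd

history = pd.DataFrame({
    'sku_id':   ['A1', 'A1', 'B2'],
    'category': ['snacks', 'snacks', 'drinks'],
    'quantity': [10, 14, 30],
})
category_avg = history.groupby('category')['quantity'].mean()

def forecast_with_fallback(sku_id, category, model_forecast=None):
    """Use the model's forecast when available; otherwise the category mean,
    and the global mean for a category never seen before."""
    if model_forecast is not None:
        return model_forecast
    return category_avg.get(category, history['quantity'].mean())

print(forecast_with_fallback('C9', 'snacks'))  # 12.0 — mean of the snacks SKUs
```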
Results
The system reduced stock-outs by 34% and excess inventory holding costs by 22% for a 3-location business — purely from a better demand signal, with no change to ordering processes.