machine-learning-engineering

General Machine Learning Engineering practices. Covers model lifecycle, feature engineering, evaluation metrics, and deployment (MLOps).

Updated 2/8/2026

SKILL.md

Machine Learning Engineering Standards

This skill covers the end-to-end lifecycle of building and deploying ML models, from simple regressions to complex neural networks.

1. Problem Framing

Supervised vs Unsupervised: Do we have labeled data (e.g., patient outcomes)?
Regression vs Classification: Are we predicting a number (blood sugar level) or a category (Risk: High/Low)?
Baseline: Always establish a dumb heuristic baseline (e.g., "Predict the average") before training a model. If your model doesn't beat that baseline, it's useless.
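
As a minimal sketch of the baseline check, the snippet below compares a "predict the average" baseline against a model's predictions using mean absolute error. The blood sugar readings and the model predictions are invented for illustration, and MAE is an assumed choice of metric; in practice a library helper such as scikit-learn's `DummyRegressor` could play the same role.

```python
from statistics import mean


def mean_absolute_error(y_true, y_pred):
    """Average absolute difference between targets and predictions."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def mean_baseline_predictions(y_train, n):
    """Predict the training-set mean for each of n rows."""
    avg = mean(y_train)
    return [avg] * n


# Hypothetical blood sugar readings (mg/dL), invented for illustration.
y_train = [90, 110, 100, 120, 80]
y_test = [95, 105, 115]

# Hypothetical predictions from some trained model on the test rows.
model_preds = [97, 101, 112]

baseline_preds = mean_baseline_predictions(y_train, len(y_test))
baseline_mae = mean_absolute_error(y_test, baseline_preds)
model_mae = mean_absolute_error(y_test, model_preds)

# The model is only worth keeping if it beats the dumb baseline.
beats_baseline = model_mae < baseline_mae
print(f"baseline MAE={baseline_mae:.2f}, model MAE={model_mae:.2f}, "
      f"beats baseline: {beats_baseline}")
```

The baseline is computed from the training targets only, so the comparison on the test rows stays honest.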

2. Data Engineering (Feature Store)

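Garbage In, Garbage Out: Data cleaning typically takes most of the effort in an ML project (a common rule of thumb says about 80%).
Normalization: Scale inputs (0-1 or -1 to 1), fitting the scaler on the training set only. Neural networks often train poorly on unscaled data; tree-based models are largely insensitive to scaling.
Categorical Encoding: One-Hot Encoding for low-cardinality features; Embeddings for high-cardinality ones.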
Splitting: STRICT separation of Train / Validation / Test sets to avoid data leakage.
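
The splitting and scaling rules above can be sketched as follows. This is an illustrative stdlib-only version, not a prescribed implementation: the function names, the 70/15/15 proportions, and the patient ages are all assumptions. The point it shows is that scaling parameters are learned from the training split alone and then reused on the validation and test splits, which is one way to avoid leakage.

```python
import random


def train_val_test_split(rows, val_frac=0.15, test_frac=0.15, seed=0):
    """Shuffle once, then cut into three disjoint lists."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_frac)
    n_val = round(len(shuffled) * val_frac)
    test = shuffled[:n_test]
    val = shuffled[n_test:n_test + n_val]
    train = shuffled[n_test + n_val:]
    return train, val, test


def fit_min_max(values):
    """Learn scaling parameters from the training values only."""
    return min(values), max(values)


def apply_min_max(values, lo, hi):
    """Scale with previously fitted parameters.

    Held-out values may fall slightly outside 0-1; that is expected.
    """
    span = (hi - lo) or 1.0  # guard against a constant feature
    return [(v - lo) / span for v in values]


ages = list(range(20, 80))  # hypothetical patient ages
train, val, test = train_val_test_split(ages)

lo, hi = fit_min_max(train)  # fitted on train only
train_scaled = apply_min_max(train, lo, hi)
val_scaled = apply_min_max(val, lo, hi)
test_scaled = apply_min_max(test, lo, hi)

print(len(train), len(val), len(test))
```

Fitting the scaler on the full dataset before splitting would let statistics from the test rows influence training, which is the leakage the strict separation is meant to prevent.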

3. Model Selection Strategy

Tabular Data (Excel, SQL): XGBoost / LightGBM / CatBoost usually beat Deep Learning.
Unstructured Data (Images, Text): Deep Learning (Transformers, CNNs).
Start Simple: Logistic Regression -> Random Forest -> Gradient Boosting -> Neural Net. Don't jump to Deep Learning immediately.
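
One way to make "Start Simple" concrete is to move to a more complex model only when it improves the validation score by a meaningful margin. The sketch below is a hypothetical selection rule, not a standard algorithm: the function name, the `min_gain` threshold, and the validation scores are all invented for illustration.

```python
def pick_simplest_adequate(candidates, min_gain=0.01):
    """Pick a model from candidates ordered simplest to most complex.

    candidates: list of (name, validation_score), higher score is better.
    A more complex model replaces the current pick only if it improves
    the score by at least min_gain.
    """
    chosen_name, chosen_score = candidates[0]
    for name, score in candidates[1:]:
        if score - chosen_score >= min_gain:
            chosen_name, chosen_score = name, score
    return chosen_name, chosen_score


# Hypothetical validation scores, ordered by model complexity.
results = [
    ("logistic_regression", 0.81),
    ("random_forest", 0.84),
    ("gradient_boosting", 0.86),
    ("neural_net", 0.862),
]

choice = pick_simplest_adequate(results)
print(choice)
```

With these made-up numbers the neural net's small gain over gradient boosting does not justify the extra complexity, so gradient boosting is kept.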

4. MLOps (Deployment)

Model format: ONNX is a widely supported open format for portability across frameworks and runtimes.
Serving:
- Realtime: An API (e.g., FastAPI) wrapping model.predict().
- Batch: Nightly jobs processing thousands of rows.
Drift Monitoring: Models rot. Monitor the input distribution. If inputs change (e.g., "Patient age range changed"), retrain.
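
As a rough sketch of input drift monitoring, the snippet below computes the Population Stability Index (PSI) between a reference sample from training time and a live sample. PSI is one of several possible drift measures and is a choice made here, not something this skill prescribes. The bin count, the smoothing constant `eps`, the 0.25 alert threshold (a commonly cited rule of thumb), and the age data are all assumptions.

```python
import math


def psi(expected, actual, n_bins=10, eps=1e-6):
    """Population Stability Index between a reference and a live sample."""
    lo, hi = min(expected), max(expected)
    width = (hi - lo) / n_bins or 1.0  # guard against a constant feature

    def proportions(values):
        counts = [0] * n_bins
        for v in values:
            idx = int((v - lo) / width)
            # Clamp so values outside the reference range land in edge bins.
            counts[min(max(idx, 0), n_bins - 1)] += 1
        # Floor empty bins at eps so the logarithm stays defined.
        return [max(c / len(values), eps) for c in counts]

    e, a = proportions(expected), proportions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ei, ai in zip(e, a))


# Hypothetical patient ages at training time vs. in production.
train_ages = list(range(20, 60))
live_ages = list(range(50, 90))  # the age range has shifted upward

DRIFT_THRESHOLD = 0.25  # assumed rule-of-thumb alert level

score = psi(train_ages, live_ages)
needs_retraining = score > DRIFT_THRESHOLD
print(f"PSI={score:.2f}, retrain: {needs_retraining}")
```

A check like this would typically run per feature on a schedule, with the reference sample frozen at training time.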

5. Evaluation Metrics

Accuracy is misleading (especially in imbalanced medical data).
Use Precision when False Positives are costly and Recall when False Negatives are costly.
For medical screening, Recall usually wins (better to have a false alarm than miss a diagnosis).
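
A small worked example shows why accuracy misleads on imbalanced data. The numbers are invented: 1,000 patients of whom 10 are truly positive, one model that always predicts "negative", and one screening model that catches 9 of the 10 cases at the cost of 40 false alarms. Undefined precision or recall (zero denominator) is reported as 0.0 here, which is a convention assumed for this sketch.

```python
def confusion_counts(y_true, y_pred):
    """Return (tp, fp, fn, tn) for binary labels where 1 is the positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    return tp, fp, fn, tn


def metrics(y_true, y_pred):
    tp, fp, fn, tn = confusion_counts(y_true, y_pred)
    accuracy = (tp + tn) / len(y_true)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # assumed convention
    recall = tp / (tp + fn) if (tp + fn) else 0.0     # assumed convention
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Hypothetical screening data: 10 positives, 990 negatives.
y_true = [1] * 10 + [0] * 990

# Model A never flags anyone.
always_negative = [0] * 1000

# Model B catches 9 of 10 positives and raises 40 false alarms.
screening = [1] * 9 + [0] + [1] * 40 + [0] * 950

m_a = metrics(y_true, always_negative)
m_b = metrics(y_true, screening)
print("always negative:", m_a)
print("screening model:", m_b)
```

Model A has the higher accuracy but misses every case, while Model B trades some accuracy and precision for the recall that screening needs.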

Install

Requires askill CLI v1.0+

Metadata

License: unknown

Publisher: benjamin09111

Tags

api, database, observability, testing