Python Libraries for Explainable AI (XAI)

Introduction

Explainable AI (XAI) tools help us interpret and understand machine learning and deep learning models. Python provides several open-source libraries for this purpose. Each library offers unique techniques for visualizing, explaining, and validating model decisions.

Moroccan Darija: There are many Python libraries that help us understand how a model behaves. Each library has its own way of explaining and interpreting the model's decisions.

Main Python Libraries for Explainable AI

1. LIME (Local Interpretable Model-agnostic Explanations)

LIME explains individual predictions by fitting a simple, interpretable surrogate model (usually linear) on perturbed samples around a specific instance. It works with any black-box model.

pip install lime
from lime import lime_tabular
explainer = lime_tabular.LimeTabularExplainer(training_data, feature_names=features)
explanation = explainer.explain_instance(sample, model.predict_proba)
explanation.show_in_notebook()

Darija: LIME explains each prediction on its own, and it works with any type of model.

2. SHAP (SHapley Additive exPlanations)

SHAP uses game theory to assign importance values (Shapley values) to features. It supports deep learning, tree models, and tabular data.

pip install shap
import shap
explainer = shap.Explainer(model)
shap_values = explainer(X)
shap.summary_plot(shap_values, X)

Darija: SHAP gives every feature a number that shows how much it contributed to the result.

3. ELI5 (Explain Like I’m Five)

ELI5 helps visualize feature importance and weights in linear and tree-based models. It also integrates with scikit-learn.

pip install eli5
import eli5
eli5.show_weights(model, feature_names=feature_names)

Darija: ELI5 is a simple library that shows which features matter most in the model in an understandable way.

4. InterpretML

Developed by Microsoft, InterpretML combines global and local explanation methods. It supports classical ML and deep learning models.

pip install interpret
from interpret.glassbox import ExplainableBoostingClassifier
from interpret import show
model = ExplainableBoostingClassifier()
model.fit(X_train, y_train)
ebm_local = model.explain_local(X_test, y_test)
show(ebm_local)

Darija: InterpretML is a library from Microsoft that lets you explain both classical and deep models.

5. Captum

Captum is a PyTorch library for model interpretability. It provides methods like Integrated Gradients, DeepLIFT, and Saliency Maps for neural networks.

pip install captum
from captum.attr import IntegratedGradients
ig = IntegratedGradients(model)
attributions = ig.attribute(inputs, target=0)

Darija: Captum is a PyTorch library that provides explanations for deep models such as CNNs and RNNs.

6. Alibi

Alibi supports tabular, image, and text models. It provides counterfactual explanations, anchors, and saliency maps.

pip install alibi
from alibi.explainers import AnchorTabular
explainer = AnchorTabular(model.predict, feature_names=feature_names)
explainer.fit(X_train)
explanation = explainer.explain(X_test[0])

Darija: Alibi works with text, images, and tabular data, and provides explanations such as counterfactuals.

7. DALEX

DALEX (moDel Agnostic Language for Exploration and eXplanation) is used for model comparison and feature importance analysis. It supports Python and R.

pip install dalex
import dalex as dx
exp = dx.Explainer(model, X, y)
exp.model_parts().plot()

Darija: DALEX helps us compare models and see which features have the strongest influence.

8. Skater

Skater provides model-agnostic interpretability for black-box models, supporting visualization and local explanations (the project is no longer actively maintained).

pip install skater
from skater.core.explanations import Interpretation
from skater.model import InMemoryModel
interpreter = Interpretation(X, feature_names=features)
skater_model = InMemoryModel(model.predict_proba, examples=X)
interpreter.feature_importance.feature_importance(skater_model)

Darija: Skater works with many types of models and provides local and visual explanations.

9. tf-explain

tf-explain integrates directly with Keras models to provide Grad-CAM, Occlusion Sensitivity, and SmoothGrad visualizations.

pip install tf-explain
from tf_explain.core.grad_cam import GradCAM
explainer = GradCAM()
# validation_data is a tuple of (images, labels) to run Grad-CAM on
grid = explainer.explain(validation_data, model, class_index=0)
explainer.save(grid, ".", "gradcam_result.png")

Darija: tf-explain helps us understand image models using Grad-CAM and other methods.

Comparison Table

Library | Main Use | Supports Deep Learning
LIME | Local explanations | Yes
SHAP | Feature importance (global/local) | Yes
ELI5 | Linear/tree models | No
InterpretML | Hybrid explanations | Yes
Captum | Neural network interpretation | Yes
Alibi | Counterfactual and anchors | Yes
DALEX | Model comparison | Yes
Skater | Black-box interpretation | No
tf-explain | Visual CNN explanations | Yes

10 Exercises for Practice

  1. Install and test the LIME library with a small scikit-learn model.
  2. Use SHAP to explain predictions of a deep learning model.
  3. Visualize feature importance using ELI5.
  4. Train a simple classifier and explain it using DALEX.
  5. Use Captum with a PyTorch CNN and interpret a specific layer.
  6. Compare explanations from LIME and SHAP for the same dataset.
  7. Generate counterfactual examples using Alibi.
  8. Visualize Grad-CAM on a Keras CNN model using tf-explain.
  9. Explain a regression model using InterpretML.
  10. Build a dashboard comparing XAI results from multiple libraries.

Internal Linking Suggestions

[internal link: Explainable AI in Deep Learning]

[internal link: Deep Learning Basics]

[internal link: AI Ethics and Transparency]

Conclusion

Python offers a rich ecosystem for Explainable AI. Libraries like LIME, SHAP, and Captum help visualize and understand complex model behavior. Choosing the right tool depends on your model type and data domain.

Darija Summary: Python has many libraries that help us explain models, such as LIME, SHAP, and Captum. Each one has its own use depending on the type of data and model.

How Transformers Work in Detail

Introduction

Transformers are deep learning models that changed the way machines understand text, images, and even sound. They power systems like ChatGPT, BERT, and DALL·E. In this tutorial, we explain how transformers work step by step using clear logic and simple math.

Moroccan Darija: Transformer models changed the way machines understand text, images, and even sound. In this lesson we explain in detail, and in a simple way, how they work.

Core Concepts Explained

Transformers are based on a mechanism called Self-Attention. This allows the model to understand the relationship between all words in a sentence at once.

Example sentence: “The cat sat on the mat.” The model looks at how each word relates to the others. For example, “cat” relates more to “sat” and “mat” than to “the”.
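
To make this concrete, here is a minimal sketch of scaled dot-product attention, the computation behind self-attention; the tensor sizes are arbitrary choices for the demo:

import torch
import torch.nn.functional as F

def scaled_dot_product_attention(q, k, v):
    # q, k, v: (seq_len, d_model) vectors for one sentence
    d_model = q.size(-1)
    # score every word against every other word
    scores = q @ k.transpose(-2, -1) / d_model ** 0.5
    # softmax turns the scores into weights that sum to 1 for each word
    weights = F.softmax(scores, dim=-1)
    # each output vector is a weighted mix of the value vectors
    return weights @ v, weights

# toy example: 6 "words", each an 8-dimensional vector
x = torch.randn(6, 8)
out, attn = scaled_dot_product_attention(x, x, x)
print(attn.shape)  # (6, 6): one attention weight for every word pair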

Main Components:

  • Input Embeddings: Convert words into numerical vectors.
  • Positional Encoding: Add order information to embeddings (a small sketch follows below).
  • Self-Attention Layers: Compute relationships between words.
  • Feed Forward Networks: Process the attention outputs.
  • Layer Normalization: Stabilize training.
  • Residual Connections: Help keep information flowing.

Moroccan Darija: The main components are:

  • Embeddings convert the words into numbers.
  • Positional Encoding gives the words their order.
  • Self-Attention looks at the relationship between each word and the others.
  • Feed Forward processes the results.
  • Normalization helps with training.
  • Residual Connections preserve the information.
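
As a small illustration of the positional-encoding bullet above, this sketch builds the sinusoidal encoding from the original Transformer paper; the sequence length and dimension are arbitrary demo values:

import torch

def sinusoidal_positional_encoding(seq_len, d_model):
    # one row per position, one column per embedding dimension
    positions = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)
    dims = torch.arange(0, d_model, 2, dtype=torch.float32)
    angle_rates = positions / (10000 ** (dims / d_model))
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(angle_rates)  # even dimensions
    pe[:, 1::2] = torch.cos(angle_rates)  # odd dimensions
    return pe

pe = sinusoidal_positional_encoding(seq_len=10, d_model=16)
print(pe.shape)  # (10, 16); added to the word embeddings before attention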

Syntax and Model Structure

A Transformer model is made of two parts: an Encoder and a Decoder.

  • Encoder: Reads the input sentence and creates hidden representations.
  • Decoder: Generates output based on encoder outputs and previous words.
# Simple transformer-like encoder block in PyTorch
import torch
from torch import nn

class SimpleTransformer(nn.Module):
    def __init__(self, vocab_size, embed_dim, num_heads, hidden_dim):
        super().__init__()
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        # batch_first=True so inputs are shaped (batch, seq_len, embed_dim)
        self.attention = nn.MultiheadAttention(embed_dim, num_heads, batch_first=True)
        self.feed_forward = nn.Sequential(
            nn.Linear(embed_dim, hidden_dim),
            nn.ReLU(),
            nn.Linear(hidden_dim, embed_dim)
        )
        self.norm1 = nn.LayerNorm(embed_dim)
        self.norm2 = nn.LayerNorm(embed_dim)

    def forward(self, x):
        # x: (batch, seq_len) of token ids
        x = self.embedding(x)
        # self-attention: queries, keys, and values all come from x
        attn_output, _ = self.attention(x, x, x)
        x = self.norm1(x + attn_output)            # residual connection + norm
        x = self.norm2(x + self.feed_forward(x))   # residual connection + norm
        return x

Practical Examples

Example 1: Understanding Word Relationships

from transformers import AutoTokenizer, AutoModel
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")

inputs = tokenizer("The cat sat on the mat", return_tensors="pt")
outputs = model(**inputs)
print(outputs.last_hidden_state.shape)

This shows how BERT encodes each token into a vector that contains contextual information.

Example 2: Machine Translation

from transformers import MarianMTModel, MarianTokenizer

model_name = "Helsinki-NLP/opus-mt-en-fr"
tokenizer = MarianTokenizer.from_pretrained(model_name)
model = MarianMTModel.from_pretrained(model_name)

text = "I love learning AI"
inputs = tokenizer(text, return_tensors="pt")
translated = model.generate(**inputs)
print(tokenizer.decode(translated[0], skip_special_tokens=True))

Darija Explanation: This example shows how a Transformer performs machine translation, for example from English to French.

Example 3: Text Generation

from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")

input_ids = tokenizer.encode("Artificial Intelligence is", return_tensors="pt")
output = model.generate(input_ids, max_length=20)
print(tokenizer.decode(output[0], skip_special_tokens=True))

This example shows how GPT models use the decoder part to generate coherent sentences.

Explanation of Each Example

  • Example 1: Shows attention in action — every word depends on context.
  • Example 2: Uses encoder-decoder structure for translation.
  • Example 3: Uses decoder-only architecture to predict next words.

10 Exercises for Practice

  1. Explain what self-attention means in your own words.
  2. Write a Python function to normalize attention scores using softmax.
  3. Modify the SimpleTransformer class to add dropout.
  4. Compare encoder-only (BERT) and decoder-only (GPT) architectures.
  5. Try using a different tokenizer and see how tokenization changes.
  6. Visualize attention scores for a simple sentence using any library.
  7. Train a small transformer on a custom text dataset.
  8. Explain why positional encoding is needed.
  9. Implement a small feed-forward block in PyTorch.
  10. Experiment with different numbers of heads in MultiheadAttention.

Internal Linking Suggestions

[internal link: Machine Learning Basics]

[internal link: Neural Networks Guide]

[internal link: Attention Mechanism Explained]

Conclusion

Transformers changed AI by allowing parallel training and better context understanding. They are now the foundation for models in NLP, vision, and multimodal AI.

Darija Summary: Transformers caused a revolution in the world of artificial intelligence, and they let models understand text and images in a powerful and fast way.

Automated Machine Learning (AutoML)

Introduction

Automated Machine Learning (AutoML) is the process of automating the steps involved in building a machine learning model. It helps users automatically preprocess data, select algorithms, tune hyperparameters, and evaluate models without deep manual intervention.

Moroccan Darija: Automated machine learning (AutoML) is an approach that lets the system carry out the model-building steps on its own, such as cleaning the data, choosing algorithms, and tuning the settings, without much human intervention.

Why AutoML is Important

  • Saves time and effort for data scientists and engineers.
  • Finds the best model automatically.
  • Improves model performance through optimized hyperparameters.
  • Makes machine learning accessible to non-experts.

Moroccan Darija: AutoML saves time and effort, gives strong results, and makes artificial intelligence easy to use even for people without much expertise.

Core Concepts Explained

  • Data Preprocessing: Cleaning and transforming raw data automatically.
  • Feature Engineering: Creating new features or selecting the most useful ones.
  • Model Selection: Choosing the best algorithm (e.g., Random Forest, XGBoost, etc.).
  • Hyperparameter Optimization: Automatically tuning parameters for better performance (a minimal sketch follows this list).
  • Ensembling: Combining multiple models to achieve higher accuracy.
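
Before handing the search over to an AutoML tool, it helps to see what hyperparameter optimization looks like done by hand; here is a minimal sketch with scikit-learn's GridSearchCV (the model and grid are illustrative choices, not tied to any AutoML framework):

from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV

X, y = load_iris(return_X_y=True)

# candidate hyperparameter values to try
param_grid = {"n_estimators": [50, 100], "max_depth": [3, 5, None]}

# GridSearchCV fits one model per combination and keeps the best (5-fold CV)
search = GridSearchCV(RandomForestClassifier(random_state=42), param_grid, cv=5)
search.fit(X, y)

print(search.best_params_, search.best_score_)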

Popular AutoML Frameworks

Framework | Main Features
Auto-sklearn | Open-source library for automatic model selection and tuning
TPOT | Uses genetic programming to find the best pipelines
H2O AutoML | Scalable, fast AutoML tool supporting multiple algorithms
Google Cloud AutoML | Cloud-based AutoML for vision, text, and tabular data
PyCaret | Low-code AutoML library for Python

Moroccan Darija: There are many tools such as Auto-sklearn, TPOT, H2O, and PyCaret that help us do AutoML easily and in different ways.

Python Example: AutoML with PyCaret


# Install PyCaret
# pip install pycaret

from pycaret.datasets import get_data
from pycaret.classification import *

# Load dataset
data = get_data('iris')

# Initialize AutoML environment
s = setup(data=data, target='species', session_id=123)

# Compare all models automatically
best_model = compare_models()

# Print best model
print(best_model)

Explanation of the Example

  • The PyCaret library is used for AutoML in Python.
  • The setup() function initializes preprocessing and data handling automatically.
  • compare_models() tests multiple algorithms and returns the best-performing one.

Moroccan Darija: PyCaret does everything on its own, from cleaning the data to comparing models. In the end it gives us the best model for this case.

Example: AutoML with Auto-sklearn


# Install Auto-sklearn
# pip install auto-sklearn

import autosklearn.classification
from sklearn.model_selection import train_test_split
from sklearn.datasets import load_iris
from sklearn.metrics import accuracy_score

# Load data
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Run AutoML
automl = autosklearn.classification.AutoSklearnClassifier(time_left_for_this_task=60)
automl.fit(X_train, y_train)

# Evaluate
y_pred = automl.predict(X_test)
print("Accuracy:", accuracy_score(y_test, y_pred))

Explanation of the Auto-sklearn Example

  • The system automatically tests several models and hyperparameters.
  • It returns the best-performing combination based on evaluation metrics.
  • The result shows model accuracy on test data.

Moroccan Darija: Auto-sklearn tries many models and settings, and then gives us the best possible result on the data.

Benefits of AutoML

  • Reduces manual experimentation and trial-and-error.
  • Enables faster deployment of models.
  • Improves productivity in AI workflows.
  • Ensures consistent evaluation across models.

Limitations of AutoML

  • Less control over algorithm internals.
  • Requires computational resources.
  • May overfit if not configured properly.
  • Not always interpretable for complex models.

Moroccan Darija: Despite its benefits, AutoML has some drawbacks, such as limited control, heavy computational demands, and results that are sometimes hard to understand.

Best Practices

  • Use AutoML for baseline experiments before fine-tuning manually.
  • Always interpret model results with explainability tools (e.g., SHAP, LIME).
  • Limit computation time using time budgets.
  • Validate AutoML results with independent test data.

10 Exercises for Practice

  1. Define AutoML and its main purpose.
  2. List three frameworks used for automated machine learning.
  3. Install and test PyCaret with the Iris dataset.
  4. Use Auto-sklearn to build a model and report its accuracy.
  5. Change the time limit in Auto-sklearn and observe performance differences.
  6. Use PyCaret to compare models for a regression task.
  7. Explain the role of feature engineering in AutoML.
  8. Identify the limitations of AutoML in real-world use cases.
  9. Apply SHAP to explain predictions from an AutoML model.
  10. Discuss how AutoML helps accelerate MLOps pipelines.
Bias and Fairness in Artificial Intelligence

Introduction

Bias and fairness are critical topics in artificial intelligence and machine learning. A biased AI model can lead to unfair, discriminatory, or unethical outcomes. Fairness ensures that AI systems make decisions that are consistent, transparent, and equal for all individuals or groups.

Moroccan Darija: Bias and fairness in artificial intelligence are very important topics. A biased model can produce unjust or discriminatory decisions. Fairness means the model works justly and treats everyone in the same way.

What is Bias in AI?

Bias happens when an AI system systematically favors or disadvantages certain groups. It can come from data, algorithms, or human choices during model design.

  • Data Bias: When the dataset does not represent all groups fairly.
  • Label Bias: When labels reflect human prejudice or errors.
  • Algorithmic Bias: When the model amplifies existing unfair patterns.

Moroccan Darija: Bias appears when the data does not represent everyone, when the labels contain human errors, or when the algorithm amplifies discrimination that already existed.

Fairness in Machine Learning

Fairness means that predictions are not dependent on sensitive attributes like gender, race, or age. There are different definitions of fairness, including:

  • Demographic Parity: Each group gets positive predictions at similar rates.
  • Equal Opportunity: True positive rates are equal across groups.
  • Equalized Odds: Both true positive and false positive rates are equal across groups.

Moroccan Darija: Fairness means the decision should not depend on age, gender, or origin. There are many ways to measure fairness, such as equality of prediction rates or of chances of success.
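
As a rough sketch of how these definitions translate into code, the arrays y_true, y_pred, and group below are made-up examples; demographic parity compares positive prediction rates per group, and equal opportunity compares true positive rates:

import numpy as np

# hypothetical ground truth, predictions, and sensitive attribute
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0])
group  = np.array(["a", "a", "a", "a", "b", "b", "b", "b"])

for g in np.unique(group):
    mask = group == g
    # demographic parity: rate of positive predictions in this group
    positive_rate = y_pred[mask].mean()
    # equal opportunity: true positive rate in this group
    tpr = y_pred[mask & (y_true == 1)].mean()
    print(g, "positive rate:", positive_rate, "TPR:", tpr)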

Python Example: Detecting Bias in a Model


import pandas as pd
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Sample dataset
data = pd.DataFrame({
    "age": [22, 45, 35, 26, 50, 29],
    "gender": ["male", "female", "female", "male", "female", "male"],
    "income": [30000, 60000, 55000, 35000, 62000, 40000],
    "approved": [0, 1, 1, 0, 1, 0]
})

# Encode gender as numeric
data["gender_num"] = data["gender"].map({"male": 0, "female": 1})

X = data[["age", "gender_num", "income"]]
y = data["approved"]

model = LogisticRegression()
model.fit(X, y)
predictions = model.predict(X)

data["predicted"] = predictions
accuracy = accuracy_score(y, predictions)

print("Model Accuracy:", accuracy)
print("\nPredictions by gender:")
print(data.groupby("gender")[["predicted"]].mean())

Explanation of the Example

  • The dataset includes features like age, gender, and income.
  • The model predicts loan approval.
  • By comparing average predictions by gender, we can check if the model favors one group.

Moroccan Darija: This code trains a simple model to predict loan approval. Then we check whether the result leans toward one gender, to find out whether the model is biased.

Mitigating Bias in AI Models

  • Preprocessing: Balance datasets before training (e.g., re-sampling, re-weighting; a re-weighting sketch follows below).
  • In-processing: Modify learning algorithms to reduce bias (e.g., fairness constraints).
  • Post-processing: Adjust predictions after training to improve fairness.

Moroccan Darija: To fix bias we can adjust the data before training, change the way the model learns, or correct the results after training.
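
The re-weighting idea from the preprocessing bullet can be approximated with scikit-learn's sample_weight mechanism; a minimal sketch, reusing data, X, and y from the example above:

from sklearn.linear_model import LogisticRegression
from sklearn.utils.class_weight import compute_sample_weight

# give each row a weight inversely proportional to its group frequency
weights = compute_sample_weight(class_weight="balanced", y=data["gender"])

weighted_model = LogisticRegression()
weighted_model.fit(X, y, sample_weight=weights)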

Example: Balancing Data Before Training


from sklearn.utils import resample

# Suppose 'female' group is smaller
female_data = data[data.gender == "female"]
male_data = data[data.gender == "male"]

# Resample females to balance
female_upsampled = resample(female_data, replace=True, n_samples=len(male_data), random_state=42)
balanced_data = pd.concat([male_data, female_upsampled])

print("Data balanced successfully:")
print(balanced_data.gender.value_counts())

Best Practices for Fair AI

  • Use diverse and representative datasets.
  • Continuously monitor models after deployment.
  • Collaborate with ethicists and domain experts.
  • Report model fairness metrics in documentation.
  • Test for bias before releasing an AI product.

Moroccan Darija: To guarantee fairness, the data should be diverse, the model should be monitored even after launch, and we should always check that the results are balanced.

Tools for Bias and Fairness Analysis

Tool | Purpose
AI Fairness 360 (IBM) | Detect and mitigate bias in datasets and models.
Fairlearn (Microsoft) | Measure and reduce unfairness in machine learning.
What-If Tool (Google) | Visualize model predictions and fairness metrics.
SHAP / LIME | Explain model decisions and feature influence.

Moroccan Darija: There are many tools that help detect and correct bias, such as AI Fairness 360, Fairlearn, and the What-If Tool.

10 Exercises for Practice

  1. Define bias and fairness in AI using your own words.
  2. List three types of bias and provide an example for each.
  3. Use the sample Python code to check if your model is biased by gender.
  4. Add a new sensitive attribute (like age) and measure fairness again.
  5. Balance your dataset using re-sampling and compare results.
  6. Install and use AI Fairness 360 to detect bias in your model.
  7. Explain the difference between preprocessing and post-processing techniques.
  8. Apply SHAP or LIME to explain model decisions on sensitive features.
  9. Discuss how bias affects AI systems in healthcare or finance.
  10. Write a short report suggesting how to design fair AI systems.
Experiment Tracking with MLflow

Introduction

Experiment tracking is the process of recording and managing all information about machine learning experiments, such as model parameters, metrics, and results. MLflow is a powerful open-source platform that simplifies this process by providing easy tracking, reproducibility, and deployment.

Moroccan Darija: Experiment tracking in machine learning is the process of recording all the information about experiments, such as parameters, results, and metrics. MLflow is an open-source platform that makes this work easier and keeps experiments organized and easy to reproduce.

Why Experiment Tracking Matters

  • Ensures reproducibility of experiments.
  • Helps compare different models easily.
  • Keeps track of hyperparameters and results.
  • Facilitates collaboration in data teams.
  • Supports model versioning and deployment.

Moroccan Darija: Experiment tracking helps us go back to any old experiment, compare models, and find the best settings without wasting time.

Core Concepts of MLflow

  • Run: A single experiment execution where metrics and parameters are logged.
  • Experiment: A collection of related runs under one project.
  • Artifact: Files like models, plots, or logs stored for each run.
  • Tracking Server: A centralized service to manage experiments and results.

Setting Up MLflow


# Install MLflow
pip install mlflow

# Start MLflow tracking server locally
mlflow ui

This command starts the MLflow user interface at http://localhost:5000, where you can view all your experiments in the browser.

Moroccan Darija: After installation, we can open the MLflow interface at http://localhost:5000 to see all our experiments in an organized way.

Python Example: Logging Experiments with MLflow


import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load data
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Define experiment
mlflow.set_experiment("iris-rf-experiment")

# Start run
with mlflow.start_run():
    n_estimators = 100
    max_depth = 5
    model = RandomForestClassifier(n_estimators=n_estimators, max_depth=max_depth, random_state=42)
    model.fit(X_train, y_train)
    accuracy = model.score(X_test, y_test)
    
    # Log parameters and metrics
    mlflow.log_param("n_estimators", n_estimators)
    mlflow.log_param("max_depth", max_depth)
    mlflow.log_metric("accuracy", accuracy)
    
    # Save model
    mlflow.sklearn.log_model(model, "model")

print("Experiment logged successfully.")

Explanation of the Example

  • The experiment is named iris-rf-experiment.
  • Model parameters and accuracy are recorded using mlflow.log_param() and mlflow.log_metric().
  • The model is saved automatically as an artifact.
  • All results can be viewed on the MLflow UI.

Moroccan Darija: This code creates a new experiment named iris-rf-experiment, logs parameters such as the number of trees and the depth, computes the accuracy, and then records the model in the MLflow interface.

Comparing Multiple Experiments

MLflow lets you compare different runs visually. You can see parameter values, metrics, and charts side by side to pick the best model.

Moroccan Darija: In MLflow we can compare experiments to find out which model performed well, and each experiment clearly shows its results and metrics.
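
Runs can also be compared programmatically; a small sketch, assuming the iris-rf-experiment logged above (column names follow MLflow's params./metrics. prefix convention):

import mlflow

# look up the experiment created earlier and fetch all of its runs
experiment = mlflow.get_experiment_by_name("iris-rf-experiment")
runs = mlflow.search_runs([experiment.experiment_id])  # pandas DataFrame

# sort runs by logged accuracy to find the best configuration
best = runs.sort_values("metrics.accuracy", ascending=False)
print(best[["run_id", "params.n_estimators", "params.max_depth", "metrics.accuracy"]].head())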

Advanced MLflow Features

  • MLflow Projects: Package code and environment for reproducibility.
  • MLflow Models: Manage model deployment formats.
  • MLflow Registry: Store and version trained models.
  • MLflow UI: Visualize all experiments in one dashboard.

Best Practices for Experiment Tracking

  • Always log both parameters and metrics.
  • Tag each experiment with dataset and model version.
  • Use consistent experiment naming conventions.
  • Regularly clean unused runs to save storage.
  • Automate experiment logging in your training pipeline.

Example: Registering Models in MLflow


# After logging a model, you can register it
result = mlflow.register_model(
    "runs:/<run_id>/model",
    "IrisModelRegistry"
)

print("Model registered with version:", result.version)

Moroccan Darija: After the experiment we can register the model in the Model Registry so we can use it in the application or in the next update.

10 Exercises for Practice

  1. Define what experiment tracking is and why it’s important.
  2. Install MLflow and start the local tracking server.
  3. Run the provided example and view results in the MLflow UI.
  4. Add more hyperparameters (like min_samples_split) to your logged model.
  5. Log confusion matrix as an artifact in MLflow.
  6. Compare two models using different hyperparameters.
  7. Export experiment results as a CSV file from the MLflow UI.
  8. Register your best model using MLflow Model Registry.
  9. Integrate MLflow tracking with your existing ML pipeline.
  10. Explain the difference between MLflow Tracking, Projects, and Models.
Explainable AI (XAI) Tutorial

Introduction

Explainable AI (XAI) refers to techniques and methods that help humans understand and trust the decisions made by artificial intelligence systems. It focuses on making models transparent, interpretable, and accountable. As AI systems become more complex, XAI helps ensure fairness, safety, and compliance.

Moroccan Darija: Explainable AI (XAI) is a set of methods that let people understand how the model makes its decisions. The goal is transparency and fairness so that we can trust the intelligent system.

Why Explainable AI is Important

  • Builds trust between humans and AI systems.
  • Helps detect bias and unfair decisions.
  • Improves debugging and model performance.
  • Supports legal and ethical compliance.
  • Enhances decision-making in sensitive domains (healthcare, finance).

Moroccan Darija: XAI is important because it gives us confidence in the model, helps us discover errors, and is useful in sensitive fields such as medicine and banking.

Core Concepts Explained

  • Interpretability: How easily a human can understand the model’s logic.
  • Transparency: How clearly the inner workings of the model are visible.
  • Post-hoc Explanation: Explaining decisions after the model has made predictions.
  • Feature Importance: Identifying which input variables most affect predictions.

Types of Explainability

Type | Description | Example Methods
Global | Explains the entire model behavior | Feature Importance, Partial Dependence Plots
Local | Explains a single prediction | LIME, SHAP

Moroccan Darija: There is global explanation, which shows how the model behaves overall, and local explanation, which focuses on explaining a single decision only.

Python Example: Explainable AI with LIME


# Install LIME if not installed
# pip install lime scikit-learn

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from lime.lime_tabular import LimeTabularExplainer

# Load dataset
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Train model
model = RandomForestClassifier(n_estimators=100, random_state=42)
model.fit(X_train, y_train)

# Create LIME explainer
explainer = LimeTabularExplainer(
    X_train,
    feature_names=['sepal_length', 'sepal_width', 'petal_length', 'petal_width'],
    class_names=['setosa', 'versicolor', 'virginica'],
    discretize_continuous=True
)

# Explain one prediction
i = 3
exp = explainer.explain_instance(X_test[i], model.predict_proba)
exp.show_in_notebook(show_table=True)

Explanation of the Example

  • The Random Forest model predicts flower species.
  • LIME explains why the model made a specific prediction.
  • It shows which features contributed most to that decision.

Moroccan Darija: This example trains a model on the flower dataset. Then LIME gives us a detailed explanation of the model's decision and shows us which features most influenced the classification.

Another Example: Using SHAP for Feature Importance


# Install SHAP if not installed
# pip install shap

import shap

# Create SHAP explainer
explainer = shap.TreeExplainer(model)
shap_values = explainer.shap_values(X_test)

# Plot feature importance
shap.summary_plot(shap_values, X_test, feature_names=['sepal_length','sepal_width','petal_length','petal_width'])

Explanation of the SHAP Example

  • SHAP uses game theory to explain model predictions.
  • It shows how each feature pushes the prediction higher or lower.
  • The summary plot visualizes global feature importance.

Moroccan Darija: SHAP relies on game theory to explain decisions. It tells us which features pushed the value up or down, and gives a clear chart of the important features.

Best Practices for Explainable AI

  • Choose interpretable models for critical systems (e.g., linear models, decision trees; see the sketch after this list).
  • Use post-hoc explainers like LIME or SHAP for complex models.
  • Communicate explanations in simple terms to end-users.
  • Validate explanations with domain experts.
  • Regularly test for bias and fairness.
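
For the first point, a glass-box model can often be read directly; a quick sketch that trains a shallow decision tree on the same X_train and y_train used above and prints its built-in importances and rules:

from sklearn.tree import DecisionTreeClassifier, export_text

feature_names = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width']

tree = DecisionTreeClassifier(max_depth=3, random_state=42)
tree.fit(X_train, y_train)

# built-in global importance, no post-hoc explainer needed
for name, importance in zip(feature_names, tree.feature_importances_):
    print(name, round(importance, 3))

# the learned rules themselves are human-readable
print(export_text(tree, feature_names=feature_names))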

10 Exercises for Practice

  1. Define Explainable AI and its importance in real-world systems.
  2. Differentiate between interpretability and transparency.
  3. Implement a simple interpretable model (e.g., Decision Tree) and visualize feature importance.
  4. Install and use LIME to explain one prediction.
  5. Install and use SHAP to analyze feature contributions globally.
  6. Compare explanations from LIME and SHAP for the same model.
  7. Create a visualization that shows bias in model predictions.
  8. Explain one AI decision to a non-technical user in simple language.
  9. Evaluate how explainability affects model trust and fairness.
  10. Document your explainability process for model governance.
Model Versioning in Machine Learning

Introduction

Model versioning is the process of tracking and managing different versions of machine learning models over time. It helps data scientists and engineers keep control of changes, reproduce results, and deploy the right model confidently.

Moroccan Darija: Model versioning is the way we organize the different versions of a model. This way we can know what changed, reproduce experiments, and easily work with the right version.

Why Model Versioning Matters

  • Ensures reproducibility of experiments.
  • Keeps a record of model improvements.
  • Prevents confusion during deployment.
  • Allows rollback to a previous stable version.
  • Facilitates collaboration between data teams.

Moroccan Darija: Version tracking lets us go back to any old version, see what improved, and work as a team without problems.

Core Concepts Explained

  • Model Metadata: Information about the model (parameters, dataset version, metrics).
  • Version Identifier: A unique tag or hash assigned to each model version.
  • Artifact Storage: A repository where model files are saved (like S3, DVC remote, MLflow server).
  • Version Control Integration: Tools like Git or DVC that track experiments, data, and models.

Example: Using DVC for Model Versioning

DVC (Data Version Control) is a tool that helps track datasets and models like Git tracks code.


# Initialize DVC in your project
dvc init

# Add a trained model file to version control
dvc add model.pkl

# Save the change in Git
git add model.pkl.dvc .gitignore
git commit -m "Add model version 1.0"

# Push model to remote storage
dvc remote add -d myremote s3://my-bucket/models
dvc push

This process saves your model with its metadata and links it to the specific experiment that produced it.

Moroccan Darija: With DVC we can track our model just like code. We add it, commit it to Git, and then push it to the server. Each version is linked to a specific experiment.

Python Example: Tracking Model Versions with MLflow

MLflow is another tool used to manage model versions, experiments, and deployments.


import mlflow
import mlflow.sklearn
from sklearn.ensemble import RandomForestClassifier
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

# Load data
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)

# Train model
model = RandomForestClassifier(n_estimators=50, random_state=42)
model.fit(X_train, y_train)
accuracy = model.score(X_test, y_test)

# Track model with MLflow
mlflow.set_experiment("iris-classification")
with mlflow.start_run():
    mlflow.log_param("n_estimators", 50)
    mlflow.log_metric("accuracy", accuracy)
    mlflow.sklearn.log_model(model, "model")

print("Model version logged successfully.")

Explanation of the Example

  • The model and parameters are saved automatically by MLflow.
  • Each experiment run creates a unique version ID.
  • All metrics and models can be viewed in the MLflow UI.

Moroccan Darija: MLflow records the model, the parameters, and the results of every experiment. Every time you run a new experiment, a new version is created with its own ID.

Best Practices for Model Versioning

  • Tag models with clear version numbers (v1.0, v1.1).
  • Always log dataset version and preprocessing steps.
  • Store evaluation metrics with each model.
  • Use automation to register models after training.
  • Link each model to its source code commit (see the sketch below).
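
One way to link a logged model to its source commit is to record the commit hash as a run tag; a minimal sketch, assuming the project is a Git repository and MLflow is set up as in the example above:

import subprocess
import mlflow

# read the current Git commit hash (assumes the project is a Git repository)
commit = subprocess.check_output(["git", "rev-parse", "HEAD"]).decode().strip()

with mlflow.start_run():
    # tag the run so the logged model can be traced back to the exact code
    mlflow.set_tag("git_commit", commit)
    mlflow.set_tag("model_version", "v1.0")
    # ... train and log the model here, as in the example above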

Common Tools for Model Versioning

Tool | Main Purpose
Git | Code versioning
DVC | Data and model tracking
MLflow | Experiment and model version management
Weights & Biases (W&B) | Visualization and experiment tracking
TensorBoard | Model metrics visualization

Moroccan Darija: There are many tools that help with tracking: Git for code, DVC for data and models, and MLflow for experiments. Each one handles an important part of the machine learning workflow.

10 Exercises for Practice

  1. Define what model versioning means in machine learning.
  2. List three benefits of using model versioning.
  3. Explain the difference between Git and DVC.
  4. Set up DVC in a local project and add a model file.
  5. Use MLflow to log model metrics and parameters.
  6. Create a versioning system using model version tags (v1, v2).
  7. Track both data and model versions for one experiment.
  8. Connect DVC or MLflow with remote storage (like S3 or Google Drive).
  9. Build a Python script to automatically register new model versions.
  10. Discuss how versioning improves collaboration in AI projects.