Pandas Essentials for AI and Data Beginners
Pandas is the main library for data handling in Python. It gives you tools to load, clean, explore, and prepare datasets. AI and machine learning depend on clean data, and Pandas makes this process simple.
1. Importing Pandas
import pandas as pd
pd is the conventional alias for Pandas. Use it so your code matches most tutorials and documentation.
2. Creating a DataFrame
data = {
"name": ["Sara", "Ali", "Yassine"],
"score": [85, 92, 78],
"age": [21, 23, 22]
}
df = pd.DataFrame(data)
print(df)
A DataFrame is like an Excel sheet: a table of labeled rows and columns.
3. Loading Data From Files
Pandas reads the most common file formats with one function per format:
- CSV.
- Excel.
- JSON.
df = pd.read_csv("data.csv")
df = pd.read_excel("file.xlsx")
df = pd.read_json("info.json")
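The file names above only work if those files exist on disk. As a minimal sketch that runs without any file, the CSV text below is held in memory; its contents are an assumption chosen to mirror the earlier example.

```python
import io
import pandas as pd

# Hypothetical CSV content, kept in memory so no file is needed.
csv_text = "name,score,age\nSara,85,21\nAli,92,23\nYassine,78,22\n"

# read_csv accepts a file path or any file-like object.
df = pd.read_csv(io.StringIO(csv_text))

print(df.shape)   # (number of rows, number of columns)
print(df.dtypes)  # the type Pandas inferred for each column
```

Note that read_excel usually needs an extra package such as openpyxl installed to open .xlsx files.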
4. Inspecting Data
print(df.head())
print(df.tail())
df.info()  # info() prints its report itself and returns None
print(df.describe())
- head: the first rows.
- tail: the last rows.
- info: column types and non-null counts.
- describe: summary statistics for numeric columns.
5. Selecting Columns
print(df["name"])
print(df[["name", "score"]])
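A single name in brackets returns one column (a Series). To select several columns, pass a list of names inside the brackets, which gives the double brackets and returns a DataFrame.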
6. Selecting Rows
Use iloc for position-based selection (0 is the first row). Use loc for label-based selection (the values in the index).
print(df.iloc[0])
print(df.iloc[0:2])
print(df.loc[0])
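With the default index, positions and labels are the same numbers, so iloc and loc look identical. The sketch below uses a hypothetical custom index (10, 20, 30) to make the difference visible.

```python
import pandas as pd

df = pd.DataFrame(
    {"name": ["Sara", "Ali", "Yassine"], "score": [85, 92, 78]},
    index=[10, 20, 30],  # hypothetical labels, chosen for illustration
)

first_by_position = df.iloc[0]    # row at position 0
row_by_label = df.loc[20]         # row whose label is 20
position_slice = df.iloc[0:2]     # positions 0 and 1, end excluded
label_slice = df.loc[10:20]       # labels 10 through 20, end included

print(first_by_position["name"], row_by_label["name"])
```

The slicing rule is the part that surprises most beginners: iloc excludes the end position, while loc includes the end label.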
7. Filtering Rows With Conditions
high_scores = df[df["score"] > 80]
print(high_scores)
To filter on two conditions, combine them with & and wrap each condition in parentheses.
f = df[(df["score"] > 80) & (df["age"] < 23)]
print(f)
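The same pattern extends to other logical operators. As a small sketch using the sample data from section 2, | means "or", ~ means "not", and isin tests membership in a list; the thresholds here are arbitrary.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sara", "Ali", "Yassine"],
    "score": [85, 92, 78],
    "age": [21, 23, 22],
})

# OR: score above 90, or age under 22.
either = df[(df["score"] > 90) | (df["age"] < 22)]

# NOT: rows whose score is not above 80.
not_high = df[~(df["score"] > 80)]

# Membership: rows whose name appears in the list.
selected = df[df["name"].isin(["Sara", "Yassine"])]

print(either)
print(not_high)
print(selected)
```

Python's own and, or, and not keywords do not work on columns, which is why the symbols are needed.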
8. Adding and Updating Columns
df["passed"] = df["score"] >= 80
print(df)
You can also update values.
df["score"] = df["score"] + 5
9. Handling Missing Data
Check missing values.
print(df.isnull().sum())
Fill missing values.
df["age"] = df["age"].fillna(df["age"].mean())
Drop rows with missing values.
df = df.dropna()
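The sample data from section 2 has no gaps, so the three steps above show nothing when run on it. Here is a minimal sketch on a variant of that data with two values deliberately removed (the missing entries are an assumption for illustration).

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sara", "Ali", "Yassine"],
    "score": [85, None, 78],  # Ali's score is missing
    "age": [21, 23, None],    # Yassine's age is missing
})

# Count missing values in each column.
missing_per_column = df.isnull().sum()
print(missing_per_column)

# Fill the missing age with the mean of the known ages (21 and 23).
df["age"] = df["age"].fillna(df["age"].mean())

# Drop rows that still contain a missing value (the missing score).
cleaned = df.dropna()
print(cleaned)
```

Filling and dropping are different choices: filling keeps the row but invents a value, dropping loses the row. Which one is right depends on the dataset.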
10. Sorting Data
df_sorted = df.sort_values("score", ascending=False)
print(df_sorted)
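sort_values also accepts a list of columns, which helps when scores are tied. The sketch below assumes slightly different sample values so that a tie exists: sort by score from high to low, then by age from low to high.

```python
import pandas as pd

# Hypothetical values: Sara and Yassine share the same score.
df = pd.DataFrame({
    "name": ["Sara", "Ali", "Yassine"],
    "score": [85, 92, 85],
    "age": [21, 23, 20],
})

ranked = df.sort_values(
    ["score", "age"],
    ascending=[False, True],  # one direction per column
).reset_index(drop=True)      # renumber rows after sorting

print(ranked)
```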
11. Grouping Data
Grouping helps with summarization.
grouped = df.groupby("age")["score"].mean()
print(grouped)
You can apply several aggregation functions at once with agg.
df.groupby("age").agg({"score": ["mean", "max"]})
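The dictionary form of agg produces two-level column names, which can be awkward for beginners. One alternative is named aggregation, sketched below; the output column names and the extra student "Lina" are assumptions added so that each age group has more than one row.

```python
import pandas as pd

df = pd.DataFrame({
    "name": ["Sara", "Ali", "Yassine", "Lina"],
    "score": [85, 92, 78, 95],
    "age": [21, 23, 21, 23],
})

# Each keyword is an output column: (source column, function).
summary = df.groupby("age").agg(
    mean_score=("score", "mean"),
    max_score=("score", "max"),
    students=("name", "count"),
).reset_index()

print(summary)
```

The result is a flat table with one row per age, which is easier to sort, filter, or export.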
12. Merging DataFrames
Pandas supports joins like SQL.
merged = pd.merge(df1, df2, on="id", how="inner")
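The how argument sets the join type:
- inner. Keeps only keys found in both DataFrames.
- left. Keeps every row of the first DataFrame.
- right. Keeps every row of the second DataFrame.
- outer. Keeps all keys from both DataFrames.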
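The merge line above uses df1 and df2 without defining them. As a sketch, here are two small hypothetical tables that share an id column, so the join types can be compared side by side.

```python
import pandas as pd

# Hypothetical tables: ids 2 and 3 appear in both.
df1 = pd.DataFrame({"id": [1, 2, 3], "name": ["Sara", "Ali", "Yassine"]})
df2 = pd.DataFrame({"id": [2, 3, 4], "score": [92, 78, 60]})

inner = pd.merge(df1, df2, on="id", how="inner")  # ids in both tables
left = pd.merge(df1, df2, on="id", how="left")    # every row of df1
right = pd.merge(df1, df2, on="id", how="right")  # every row of df2
outer = pd.merge(df1, df2, on="id", how="outer")  # every id from either table

print(inner)
print(left)
print(outer)
```

Where a key has no match on the other side, Pandas fills the missing cells with NaN.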
13. Removing Columns or Rows
df = df.drop("age", axis=1)  # axis=1 removes a column
df = df.drop(0)  # the default axis removes the row labeled 0
14. Converting Data Types
df["age"] = df["age"].astype(int)
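Check column types with df.dtypes before model training. astype(int) raises an error if the column still contains missing values, so handle those first.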
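Numbers loaded from files sometimes arrive as text. The sketch below assumes a hypothetical table where both columns are strings and one age is the word "unknown". It uses astype for the clean column and pd.to_numeric with errors="coerce" for the dirty one, which turns unparseable values into NaN instead of raising an error.

```python
import pandas as pd

# Hypothetical raw data: every value is a string.
raw = pd.DataFrame({
    "age": ["21", "23", "unknown"],
    "score": ["85", "92", "78"],
})

# Clean column: a direct conversion works.
raw["score"] = raw["score"].astype(int)

# Dirty column: coerce bad values to NaN rather than failing.
raw["age"] = pd.to_numeric(raw["age"], errors="coerce")

print(raw.dtypes)
print(raw)
```

After coercion the age column is a float column with one missing value, which can then be filled or dropped as in section 9.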
15. Exporting Data
df.to_csv("output.csv", index=False)
df.to_excel("output.xlsx", index=False)
16. Mini Projects With Pandas
Project 1. Sales Analysis
- Load sales CSV.
- Group by product.
- Compute revenue.
- Sort by top sellers.
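One possible sketch of Project 1 follows. The column names (product, units, unit_price) and the data are assumptions, and the CSV is kept in memory so the example runs without a file.

```python
import io
import pandas as pd

# Hypothetical sales data; replace with pd.read_csv on a real file.
csv_text = """product,units,unit_price
pen,10,1.5
notebook,4,3.0
pen,6,1.5
bag,1,20.0
"""

sales = pd.read_csv(io.StringIO(csv_text))

# Revenue per row, then total revenue per product.
sales["revenue"] = sales["units"] * sales["unit_price"]
top_sellers = (
    sales.groupby("product")["revenue"]
    .sum()
    .sort_values(ascending=False)
)

print(top_sellers)
```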
Project 2. Student Grades Report
- Load student data.
- Fill missing grades.
- Create pass or fail column.
- Export results.
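Project 2 could be sketched as below. The grades, the pass mark of 60, and the choice to fill gaps with the mean are all assumptions. The report is exported to a string here; passing a file name to to_csv would write a file instead.

```python
import pandas as pd

# Hypothetical student data with one missing grade.
students = pd.DataFrame({
    "name": ["Sara", "Ali", "Yassine"],
    "grade": [85, None, 55],
})

# Fill the missing grade with the mean of the known grades.
students["grade"] = students["grade"].fillna(students["grade"].mean())

# Assumed pass mark: 60.
students["result"] = students["grade"].apply(
    lambda g: "pass" if g >= 60 else "fail"
)

# With no path given, to_csv returns the CSV text.
report_csv = students.to_csv(index=False)
print(report_csv)
```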
Project 3. Data Cleaning Script
- Load messy dataset.
- Drop duplicates.
- Fix types.
- Handle missing values.
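A minimal sketch of Project 3, following the steps in the same order. The messy table is hypothetical: ids stored as text, one duplicated row, and one age recorded as "n/a". Filling with the median is one option among several.

```python
import pandas as pd

# Hypothetical messy data.
messy = pd.DataFrame({
    "id": ["1", "2", "2", "3"],
    "age": ["21", "23", "23", "n/a"],
})

# 1. Drop exact duplicate rows.
clean = messy.drop_duplicates().copy()

# 2. Fix types: ids to integers, ages to numbers (bad values become NaN).
clean["id"] = clean["id"].astype(int)
clean["age"] = pd.to_numeric(clean["age"], errors="coerce")

# 3. Handle missing values: fill with the median age.
clean["age"] = clean["age"].fillna(clean["age"].median())

clean = clean.reset_index(drop=True)
print(clean)
```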
Quick Recap (originally in Moroccan Darija)
Pandas helps you work with data in a simple way. You load the data, clean it, sort it, group it, and prepare it for machine learning.
- df.head() to look at the data.
- df["col"] to get a column.
- Filtering to select rows.
- groupby to compute statistics.
- merge to combine datasets.
If you can work well with Pandas, you can build AI projects without complications.
Conclusion
Pandas offers strong tools for data loading, cleaning, filtering, and grouping. These essentials prepare your datasets for machine learning and deep learning. Learn them well to build strong AI workflows.