teach.pascalyim.com

Resources

Documents, notebooks, datasets and guides for download.

Datasets

Mini datasets — used in chapters ML 1 to 3

  • titanic_mini.csv · 7 KB

    Titanic survival — features: Sex, Age, FirstClass, Children, Survived. Binary classification for ML-3.

    Download
  • abalone_mini.csv · 103 KB

    Abalone: predict age (Rings) from physical measurements (Length, Diameter, Height, Weight). Regression.

    Download
  • cancer_mini.csv · 15 KB

    Breast Cancer Wisconsin (UCI). Cell-nucleus features computed from digitized images + diagnosis (M/B → 0/1). Used in ML-3.

    Download
  • house_mini.csv · 753 KB

    House prices (regression on area, number of rooms, etc.). Used in ML-2.

    Download
  • co2_mini.csv · 28 KB

    Atmospheric CO2 concentration over time. Regression / time series.

    Download
  • cos_mini.csv · 4 KB

    Synthetic cosine data to illustrate non-linear / polynomial regression.

    Download
  • iris_mini.csv · 1 KB

    The classic Iris (sepal/petal length/width → species). Beginner classification.

    Download
  • passengers_mini.csv · 1 KB

    Monthly airline passengers, classic time series for ML-1.

    Download
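The cancer_mini.csv entry above mentions a diagnosis recoded as M/B → 0/1. A minimal sketch of that kind of recoding, using only the standard library, could look like the following. The column names (`radius`, `texture`, `diagnosis`), the three sample rows, and the direction of the mapping (M → 0, B → 1, read literally from the order written above) are all assumptions for illustration; check the actual file header before relying on them.

```python
import csv
import io

# Hypothetical three-row excerpt; the real cancer_mini.csv may use
# different column names and many more features.
SAMPLE = """radius,texture,diagnosis
17.99,10.38,M
13.54,14.36,B
20.57,17.77,M
"""

# Assumed mapping, read literally from "M/B -> 0/1": M -> 0, B -> 1.
LABEL_MAP = {"M": 0, "B": 1}


def load_rows(text, label_column="diagnosis"):
    """Parse CSV text into (features, labels) with numeric values."""
    features, labels = [], []
    for row in csv.DictReader(io.StringIO(text)):
        # Remove the label from the row, then treat the rest as features.
        labels.append(LABEL_MAP[row.pop(label_column)])
        features.append([float(value) for value in row.values()])
    return features, labels


X, y = load_rows(SAMPLE)
print(y)  # [0, 1, 0]
```

To read a downloaded file instead of the inline sample, the same function can be fed the file's text, for example `load_rows(open("cancer_mini.csv").read())`, provided the label column name matches.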

Classic datasets — chapter ML 4 and exercises

  • penguins.csv · 14 KB

    Palmer Penguins (3 species, bill/flipper measurements). Modern alternative to Iris.

    Download
  • mushrooms.csv · 366 KB

    UCI Mushroom (8,124 entries). Binary classification (edible / poisonous) from morphological descriptors only.

    Download
  • student.csv · 67 KB

    UCI Student Performance (math/Portuguese grades, family context). Regression or multiclass classification.

    Download
  • creditcard.csv · 42 MB

    Credit card fraud detection (heavily imbalanced, ~0.17% fraud). Classic benchmark for minority-class problems.

    Download
  • titanic.csv · 60 KB

    Full Titanic (Kaggle), with more features than titanic_mini: Name, Ticket, Cabin, Fare, Embarked, …

    Download
  • cancer.csv · 123 KB

    Full Wisconsin Breast Cancer (30 features). Wider than cancer_mini.

    Download
  • churn.csv · 274 KB

    Telco churn: predict customer cancellation from profile and usage data.

    Download
  • stars.csv · 8 KB

    Star classification (temperature, luminosity, radius → spectral type). Multiclass, pedagogical.

    Download
  • adult.csv · 4 MB

    UCI Adult / Census Income. Predict whether income > $50K/year (binary classification with bias to analyze).

    Download
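The creditcard.csv entry above quotes a fraud rate of about 0.17%. A short sketch shows why plain accuracy is misleading at that rate: a model that never predicts fraud still scores very high. The labels below are synthetic (17 frauds in 10,000 rows, chosen to match 0.17%) and are not read from the actual file; the helper functions are illustrative, not part of any course notebook.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred, positive=1):
    """Fraction of actual positives that the predictions recover."""
    found = sum(1 for t, p in zip(y_true, y_pred)
                if t == positive and p == positive)
    total = sum(1 for t in y_true if t == positive)
    return found / total if total else 0.0


# Synthetic labels: 17 frauds in 10,000 rows = 0.17 %.
N_TOTAL, N_FRAUD = 10_000, 17
y_true = [1] * N_FRAUD + [0] * (N_TOTAL - N_FRAUD)

# Trivial baseline: always predict "not fraud".
y_pred = [0] * N_TOTAL

baseline_accuracy = accuracy(y_true, y_pred)
baseline_recall = recall(y_true, y_pred)
print(f"accuracy = {baseline_accuracy:.4f}, recall = {baseline_recall:.4f}")
# accuracy = 0.9983, recall = 0.0000
```

The baseline catches no fraud at all, which is why minority-class problems are usually judged with recall, precision, or similar metrics rather than accuracy alone.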

Exercise datasets — for ML 5 (large)

  • house_prices.csv · 450 KB

    Kaggle Ames Housing — complex regression on ~80 features, house prices in Ames (Iowa). For ML-5.

    Download
  • mercedes_test.csv · 6.2 MB

    Kaggle Mercedes-Benz Greener Manufacturing: predict test duration on vehicles. Industrial regression.

    Download
  • stroke.csv · 2.6 MB

    Stroke prediction from medical and demographic variables. Imbalanced binary classification.

    Download
  • mnist.csv · 122 MB

    MNIST in CSV (70,000 handwritten digits, 28×28 pixels each, labels 0-9). Multiclass classification; prefer local execution.

    Download
  • sign.csv · 101 MB

    Sign Language MNIST: sign language alphabet as 28×28 pixel images.

    Download
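Both mnist.csv and sign.csv store 28×28 images as flat CSV rows. A minimal sketch of turning one row back into a 2-D image follows. It assumes the common layout where the first value is the label and the next 784 values are pixels in row-major order; that layout is an assumption, not something stated on this page, so check the file header first. The sample row is synthetic.

```python
def row_to_image(row, side=28):
    """Split one CSV row into (label, image), image being a list of rows.

    Assumes the layout [label, pixel_0, ..., pixel_783] in row-major order.
    """
    label = int(row[0])
    pixels = [int(value) for value in row[1:]]
    if len(pixels) != side * side:
        raise ValueError(f"expected {side * side} pixels, got {len(pixels)}")
    # Cut the flat pixel list into `side` consecutive rows of `side` pixels.
    image = [pixels[r * side:(r + 1) * side] for r in range(side)]
    return label, image


# Synthetic row standing in for one line of the file: label 7, then
# 784 fake pixel values (as strings, the way csv.reader would yield them).
fake_row = ["7"] + [str(i % 256) for i in range(28 * 28)]

label, image = row_to_image(fake_row)
print(label, len(image), len(image[0]))  # 7 28 28
```

With a real file, each row produced by `csv.reader` (after skipping any header line) can be passed to `row_to_image` in the same way.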

Course notebooks

  • DL-1 · DL1 - linear neuron
    dl1-ecpk-pascal-yim.ipynb · 1081 KB · 5/13/2026
    Download
  • DL-2 · DL2 - classification
    dl-2-ecpk-py (3).ipynb · 555 KB · 5/14/2026
    Download
  • DL-3 · DL3 - Convolutions 1
    dl3-ecpk-pascal-yim.ipynb · 597 KB · 5/14/2026
    Download
  • DL-4 · DL4 - Convolution 2
    dl4-ecpk-py.ipynb · 526 KB · 5/15/2026
    Download
  • ML-1 · ML1: review
    ml1-ecpk-pascal-yim.ipynb · 329 KB · 5/11/2026
    Download
  • ML-2 · ML2: regression
    ml2-ecpk-pascal-yim.ipynb · 1040 KB · 5/11/2026
    Download
  • ML-3 · ML3: Classification
    ml3-ecpk-pascal-yim.ipynb · 61 KB · 5/11/2026
    Download
  • ML-4 · ML4: data
    ml4-ecpk-pascal-yim.ipynb · 90 KB · 5/11/2026
    Download

Extra datasets

  • Alien vs. Predator dataset
    archive (3).zip · 14476 KB · 5/15/2026
    Download