A Dataset for Supervised Learning

The Iris dataset from UC-Irvine Machine Learning repository:

  • Columns are called input variables or features or attributes.
  • The outcome we are trying to predict, is called output variables or targets.
  • A row in the table is called training example or instance.
  • The whole table is called (training) dataset.
  • The problem of predicting is called classification.

The variables are:

  • sepal_length: Sepal length, in centimeters, used as input.
  • sepal_width: Sepal width, in centimeters, used as input.
  • petal_length: Petal length, in centimeters, used as input.
  • petal_width: Petal width, in centimeters, used as input.
  • class: Iris Setosa, Versicolor, or Virginica, used as the target.