[Scikit-Learn] Tutorial (0) What is "Scikit-Learn"

Scikit-Learn is an open source machine learning library for Python. Its functionality falls into six areas:

Classification

Identifies which category a sample belongs to.
Application: Spam detection, Image identification
Algorithm: SVM, Nearest neighbors, Random forest
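As a minimal sketch of a classification workflow, the example below trains an SVM on the bundled Iris dataset. The 25% hold-out split and the default SVC settings are arbitrary choices made for illustration, not recommendations.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# Load the bundled Iris dataset: 150 samples, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out 25% of the samples for evaluation (assumed ratio, chosen for the demo).
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Fit a support vector classifier with its default RBF kernel.
clf = SVC()
clf.fit(X_train, y_train)

# score() returns the mean accuracy on the held-out samples.
accuracy = clf.score(X_test, y_test)
print(f"Test accuracy: {accuracy:.2f}")
```

The same fit / predict / score pattern applies to the other classifiers listed above, such as nearest neighbors or random forest.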

Regression

Predicts a continuous-valued attribute for a sample.
Application: Drug response, Stock prices
Algorithm: SVR, Ridge regression, Lasso
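A regression model follows the same pattern as a classifier. Here is a rough sketch using Ridge regression on the bundled Diabetes dataset; the value alpha=1.0 is simply the library default and has not been tuned.

```python
from sklearn.datasets import load_diabetes
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split

# Diabetes dataset: 442 samples, 10 numeric features, continuous target.
X, y = load_diabetes(return_X_y=True)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

# Ridge is linear regression with an L2 penalty; alpha sets the penalty strength.
reg = Ridge(alpha=1.0)
reg.fit(X_train, y_train)

# For regressors, score() returns the R^2 coefficient of determination.
r2 = reg.score(X_test, y_test)
y_pred = reg.predict(X_test)
print(f"R^2 on held-out data: {r2:.2f}")
```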

Clustering

Automatically groups similar samples into clusters.
Application: Customer segmentation, Grouping experiment outcomes
Algorithm: K-Means, Spectral clustering, Mean-shift
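Clustering needs no labels. The sketch below runs K-Means on the Iris features only; choosing 3 clusters is an assumption based on knowing the dataset has three species, which you would not know in a real unsupervised setting.

```python
from sklearn.cluster import KMeans
from sklearn.datasets import load_iris

# Use only the features; the true labels are discarded.
X, _ = load_iris(return_X_y=True)

# n_clusters=3 is an assumption made for this demo.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=0)

# fit_predict() learns the cluster centers and returns a cluster index per sample.
labels = kmeans.fit_predict(X)

print("Cluster centers shape:", kmeans.cluster_centers_.shape)
print("First ten assignments:", labels[:10])
```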

Dimensionality reduction

Reduces the number of random variables we need to consider.
Application: Visualization, Increased efficiency
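To illustrate the visualization use case, here is a small sketch that projects the 64-dimensional Digits dataset onto 2 components with PCA. PCA is one possible technique picked for this example; the text above does not name a specific algorithm.

```python
from sklearn.datasets import load_digits
from sklearn.decomposition import PCA

# Digits dataset: 1797 images of 8x8 pixels, flattened to 64 features each.
X, y = load_digits(return_X_y=True)

# Keep the two directions of largest variance so the data can be plotted in 2D.
pca = PCA(n_components=2)
X_2d = pca.fit_transform(X)

print(X.shape, "->", X_2d.shape)
print("Explained variance ratio:", pca.explained_variance_ratio_)
```

The resulting two columns can be passed to any plotting library as x and y coordinates, colored by the label y.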

Model selection

Compares, validates, and chooses parameters and models.
Application: Improve accuracy by adjusting parameters
Module: Grid search, Cross validation, Metrics
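Grid search and cross validation are usually used together. The following sketch tries a small, hand-picked parameter grid for an SVM with 5-fold cross validation; the grid values are assumptions chosen only to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)

# Candidate hyperparameters (illustrative values, not a recommended grid).
param_grid = {"C": [0.1, 1, 10], "kernel": ["linear", "rbf"]}

# Every combination is evaluated with 5-fold cross validation on accuracy.
search = GridSearchCV(SVC(), param_grid, cv=5, scoring="accuracy")
search.fit(X, y)

print("Best parameters:", search.best_params_)
print(f"Best mean CV accuracy: {search.best_score_:.2f}")
```

After fitting, search.best_estimator_ holds a model refit on the full data with the best parameter combination.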

Preprocessing

Extracts features from raw data and normalizes them.
Application: Transforming input data, such as text, for use with machine learning algorithms
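As one example of normalization, the sketch below standardizes the Wine features so that each column has zero mean and unit variance. StandardScaler is just one of several scalers in sklearn.preprocessing and is used here as an illustration.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.preprocessing import StandardScaler

# Wine dataset: 178 samples, 13 features measured on very different scales.
X, _ = load_wine(return_X_y=True)

# Learn per-column mean and standard deviation, then rescale every column.
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

print("Column means before:", np.round(X.mean(axis=0)[:3], 2))
print("Column means after: ", np.round(X_scaled.mean(axis=0)[:3], 2))
```

In a real workflow the scaler should be fit on the training split only and then applied to the test split with transform().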

Data Set

Scikit-Learn ships with several small toy datasets that users can test their models on:

Boston house-prices (removed in scikit-learn 1.2)
Iris
Diabetes
Digits
Linnerud
Wine
Breast cancer Wisconsin

However, note that these datasets are too small to stand in for real-world data.
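The toy datasets can be loaded through the sklearn.datasets module. The sketch below loads each one and prints its shape. The Boston house-prices loader is left out because it is no longer available in recent scikit-learn releases.

```python
from sklearn import datasets

# Map a readable name to the loader function for each bundled toy dataset.
loaders = {
    "Iris": datasets.load_iris,
    "Diabetes": datasets.load_diabetes,
    "Digits": datasets.load_digits,
    "Linnerud": datasets.load_linnerud,
    "Wine": datasets.load_wine,
    "Breast cancer Wisconsin": datasets.load_breast_cancer,
}

shapes = {}
for name, loader in loaders.items():
    bunch = loader()  # a Bunch object with .data, .target and a description
    shapes[name] = bunch.data.shape
    print(f"{name}: {bunch.data.shape[0]} samples, {bunch.data.shape[1]} features")
```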

In the following posts, I will walk through applying different models to these datasets as I continue this Scikit-Learn introductory tutorial series.

You can also refer to the algorithm map provided on the official Scikit-Learn website.
