Scikit Learn
[NLP] The TF-IDF In Text Mining
TF-IDF (Term Frequency – Inverse Document Frequency) is a well-known word weighting technique: it measures how important a word is to a document within a collection of texts.
[Python] Use ShuffleSplit() To Process Cross-Validation Step
Cross-validation is an important concept in data splitting for machine learning. Simply put, when we want to train a model, we need to split the data into training data and testing data.
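A short sketch of scikit-learn's ShuffleSplit, which draws several independent random train/test splits (the toy array below is made up):

```python
# ShuffleSplit: 5 independent random permutations, each holding out
# 30% of the samples as a test set.
import numpy as np
from sklearn.model_selection import ShuffleSplit

X = np.arange(20).reshape(10, 2)   # 10 toy samples, 2 features each

ss = ShuffleSplit(n_splits=5, test_size=0.3, random_state=0)
sizes = [(len(train_idx), len(test_idx)) for train_idx, test_idx in ss.split(X)]
print(sizes)   # each split: 7 training samples, 3 test samples
```

Unlike KFold, the test sets of different splits may overlap, since each split is an independent shuffle.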
[Solved] graphviz.backend.ExecutableNotFound: failed to execute [‘dot’, ‘-Tpdf’, ‘-O’, ‘Digraph.gv’], make sure the Graphviz executables are on your systems’ PATH
Today I was using PyTorch to build a model when I suddenly needed to submit my technical report, so I quickly found a tool to visualize the model: torchviz.
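The error above usually means the Graphviz Python package is installed but the Graphviz system binaries (the `dot` executable) are not on PATH. A typical fix, assuming a Debian/Ubuntu system (other platforms differ):

```shell
# The Python graphviz/torchviz packages only generate DOT source;
# rendering requires the Graphviz system binaries on PATH.
sudo apt-get install graphviz

# macOS (Homebrew) equivalent:
# brew install graphviz

# Verify that "dot" is now visible on PATH:
dot -V
```

On Windows, the Graphviz installer must be run and its bin directory added to PATH manually.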
[Machine Learning] Introduction to the Three Model Evaluation Metrics: Precision, Recall, and F1-score
Precision, Recall, and F1-score are three fairly well-known model evaluation metrics, mostly used for binary classification (for multi-class problems, the macro- and micro-averaged variants are suitable). Below is a brief description of each of these metrics.
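A minimal sketch of the three metrics with scikit-learn on a made-up binary classification result:

```python
# precision = TP / (TP + FP); recall = TP / (TP + FN);
# F1 is the harmonic mean of precision and recall.
from sklearn.metrics import precision_score, recall_score, f1_score

y_true = [1, 0, 1, 1, 0, 1]
y_pred = [1, 0, 0, 1, 0, 1]   # one positive was missed (a false negative)

p = precision_score(y_true, y_pred)   # 3 / (3 + 0) = 1.0
r = recall_score(y_true, y_pred)      # 3 / (3 + 1) = 0.75
f = f1_score(y_true, y_pred)          # 2 * 1.0 * 0.75 / 1.75 ≈ 0.857
print(p, r, f)
```

For multi-class labels, passing `average="macro"` or `average="micro"` to these same functions gives the averaged variants mentioned above.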
[Scikit-Learn] Using “train_test_split()” to split your data
When we want to train a model, we need to split our data into training data and test data. We use the training data to train the model and make sure it never sees the test data. This is very important, because the test data is what we use to assess the quality of the model.
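A short sketch of scikit-learn's train_test_split on a toy dataset (the arrays and the 80/20 ratio below are illustrative choices):

```python
# train_test_split shuffles the data and carves off a held-out test set;
# random_state makes the shuffle reproducible.
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)   # 10 toy samples, 2 features each
y = np.arange(10)                  # matching toy labels

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)
print(X_train.shape, X_test.shape)   # (8, 2) and (2, 2)
```

Passing `X` and `y` together keeps each sample aligned with its label across the split.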