Machine Learning Tutorial Python – 8: Logistic Regression (Binary Classification)

Logistic regression is used for classification problems in machine learning. This tutorial shows how to use sklearn's LogisticRegression class to solve a binary classification problem: predicting whether a customer will buy life insurance. At the end there is an interesting exercise for you to solve.
Supervised machine learning problems usually fall into two types: (1) regression, where the predicted value is continuous, and (2) classification, where the predicted value is categorical. Despite its name, logistic regression is mainly used for classification.
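
A minimal sketch of that workflow, assuming scikit-learn and NumPy are installed. The ages and labels below are made up for illustration and are not the tutorial's CSV; the variable names are hypothetical as well.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Made-up toy data (NOT the tutorial's CSV): age and whether insurance was bought.
ages = np.array([18, 19, 21, 22, 23, 25, 26, 27, 28, 29, 31, 35, 38, 40,
                 42, 45, 46, 47, 49, 50, 52, 54, 55, 56, 58, 60, 61, 62])
bought = np.array([0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 0, 1, 0,
                   1, 1, 1, 1, 1, 1, 0, 1, 1, 1, 1, 1, 1, 1])

# sklearn expects a 2D feature array of shape (n_samples, n_features).
X = ages.reshape(-1, 1)

# Hold out 20% of the rows to check the model on data it has not seen.
X_train, X_test, y_train, y_test = train_test_split(
    X, bought, test_size=0.2, random_state=0
)

model = LogisticRegression()
model.fit(X_train, y_train)

predictions = model.predict(X_test)          # hard 0/1 class labels
probabilities = model.predict_proba(X_test)  # columns: P(class 0), P(class 1)
accuracy = model.score(X_test, y_test)       # fraction of correct predictions

print("predictions:", predictions)
print("accuracy:", accuracy)
```

Each row of `probabilities` sums to 1, and `predict` returns the class whose probability is larger.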
Code:
Exercise: Open the notebook above from GitHub and go to the end.

Topics covered in this video:
0:01 – Theory (difference between linear regression and classification)
1:18 – What is logistic regression?
1:26 – Classification types (binary vs multiclass classification)
1:53 – Explanation of logistic regression using the example of whether a person will buy insurance based on their age
5:38 – Sigmoid or Logit function
8:18 – Coding (the example predicts whether a person will buy insurance based on their age)
14:36 – sklearn predict_proba() function
15:49 – Exercise (predict employee retention based on salary, distance to work, promotion, department, etc.)
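
The sigmoid mentioned at 5:38 is the function that turns a linear score into a probability between 0 and 1; the logit is its inverse. A small sketch using only the standard library follows. The slope and intercept are hypothetical values chosen so that the 50% point falls at age 40; they are not coefficients from the video.

```python
import math


def sigmoid(z: float) -> float:
    """Logistic (sigmoid) function: maps any real number into the range 0..1."""
    # Branch on the sign so math.exp never receives a large positive argument.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def logit(p: float) -> float:
    """Inverse of the sigmoid: maps a probability in (0, 1) back to a real score."""
    return math.log(p / (1.0 - p))


# Hypothetical coefficients for illustration only.
SLOPE = 0.1
INTERCEPT = -4.0


def probability_of_buying(age: float) -> float:
    """Linear score (slope * age + intercept) passed through the sigmoid."""
    return sigmoid(SLOPE * age + INTERCEPT)


for age in (20, 40, 60):
    print(age, round(probability_of_buying(age), 3))
```

Logistic regression learns the slope and intercept from data; the sigmoid itself stays fixed.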

Next Video:
Machine Learning Tutorial Python – 8 Logistic Regression (Multiclass Classification):

Popular Playlists:
Data Science Full Course:

Data Science Project:

Machine learning tutorials:

Pandas:

matplotlib:

Python:

Jupyter Notebook:

To download the CSV files and code for all tutorials: go to the GitHub repository, click the green button to clone or download the entire repository, and then open the relevant folder to find the specific file.

January 10th, 2020 | Python Video Tutorials | 42 Comments

42 Comments

  1. Sandeep Yadav January 10, 2020 at 1:08 am - Reply

    Finally got the Python version of Andrew Ng's machine learning course, with a better explanation.
    thanks.

  2. Zaid Zeee January 10, 2020 at 1:08 am - Reply

    This video is really, really helpful.
    Thank you so much for this amazing knowledge.
    Please make more videos.

  3. aleisley January 10, 2020 at 1:08 am - Reply

    Can only go up to 78% and that's with tuning the hyperparameters. Thanks btw codebasics!

  4. nissy Pradeep January 10, 2020 at 1:08 am - Reply

    Why do we always fit the model on the training set and not on the test set, yet use transform on both? What is the difference between fit and transform?

  5. Arun Sharma January 10, 2020 at 1:08 am - Reply

    is this possible to have 0 and 1 both for the same age and we can compute them further based on probability?

  6. Bandham Manikanta January 10, 2020 at 1:08 am - Reply

    Perfect explanation on logistic regression.

    Loved it. Thanks a lot.

  7. raja ram January 10, 2020 at 1:08 am - Reply

    Design and build a binary classifier over the dataset. Explain your algorithm and its

    configuration. Explain your findings into both numerical and graphical

    representations. Evaluate the performance of the model and verify the accuracy and

    the effectiveness of your model. can u explain

  8. mridul ahmed January 10, 2020 at 1:08 am - Reply

    wow so nice. Thanx to explain in a very nice way.

  9. Amanullah Mahabub January 10, 2020 at 1:08 am - Reply

    You guys are life savers. man love your videos.

  10. Aarushi Gupta January 10, 2020 at 1:08 am - Reply

    It's good but too slow to listen

  11. Rida Mehdawe January 10, 2020 at 1:08 am - Reply

    among several videos, this one is the best. appreciated

  12. salvin dsouza January 10, 2020 at 1:08 am - Reply

    14:40 blooper !!

  13. Praful Maka January 10, 2020 at 1:08 am - Reply

    Nice explanation!

  14. George Trialonis January 10, 2020 at 1:08 am - Reply

    Thank you very much for the videos on ML, AI, Python, etc. They help me learn a lot. Your explanations are clear and well understood. Thanks.

  15. Sunny veer Pratap Singh January 10, 2020 at 1:08 am - Reply

    bro you are best .. tried to swirl thru other online videos and then I end up watching your videos and I understand better .

  16. Hardik Vegad January 10, 2020 at 1:08 am - Reply

    Sir, why didn't you drop one dummy column from salary_high, salary_medium, salary_low in the exercise question? Won't that create a dummy variable trap?

  17. umbul banin January 10, 2020 at 1:08 am - Reply

    great job sir

  18. Raju Prudhvi January 10, 2020 at 1:08 am - Reply

    Thanks sir, can you give me a prediction plot visualization?

  19. Islamic Way January 10, 2020 at 1:08 am - Reply

    What is y_train?

  20. fahim shahriar January 10, 2020 at 1:08 am - Reply

    sir, in your given exercise can we drop the independent variable by backward elimination process ??

  21. weerapast ruenrurngdee January 10, 2020 at 1:08 am - Reply

    Thank you so much 🙂

  22. Matt Chase January 10, 2020 at 1:08 am - Reply

    what about the step function?

  23. Maxim Kuznetsov January 10, 2020 at 1:08 am - Reply

    Cool

  24. rudr'a rajput January 10, 2020 at 1:08 am - Reply

    when i predict the following let see
    In[22]: model.predict(56)
    #it show me following error, please give me solution.
    ---------------------------------------------------------------------------
    ValueError                                Traceback (most recent call last)
    <ipython-input-18-f6c77a36af5e> in <module>
    ----> 1 model.predict(56)

    c:\users\user\appdata\local\programs\python\python37-32\lib\site-packages\sklearn\linear_model\base.py in predict(self, X)
        287 Predicted class label per sample.
        288 """
    --> 289 scores = self.decision_function(X)
        290 if len(scores.shape) == 1:
        291 indices = (scores > 0).astype(np.int)

    c:\users\user\appdata\local\programs\python\python37-32\lib\site-packages\sklearn\linear_model\base.py in decision_function(self, X)
        263 "yet" % {'name': type(self).__name__})
        264
    --> 265 X = check_array(X, accept_sparse='csr')
        266
        267 n_features = self.coef_.shape[1]

    c:\users\user\appdata\local\programs\python\python37-32\lib\site-packages\sklearn\utils\validation.py in check_array(array, accept_sparse, accept_large_sparse, dtype, order, copy, force_all_finite, ensure_2d, allow_nd, ensure_min_samples, ensure_min_features, warn_on_dtype, estimator)
        512 "Reshape your data either using array.reshape(-1, 1) if "
        513 "your data has a single feature or array.reshape(1, -1) "
    --> 514 "if it contains a single sample.".format(array))
        515 # If input is 1D raise error
        516 if array.ndim == 1:

    ValueError: Expected 2D array, got scalar array instead:
    array=56.
    Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

  25. Chris LAM January 10, 2020 at 1:08 am - Reply

    shift-tab.

  26. Ankit DS January 10, 2020 at 1:08 am - Reply

    Bro, this command is not working – giving below error – I did a dir on LinearRegression and can see 'predict_proba' in it, but still getting below error

    'LinearRegression' object has no attribute 'predict_proba'

  27. Vishnu Dutt January 10, 2020 at 1:08 am - Reply

    awesome explanation

  28. hardik darji January 10, 2020 at 1:08 am - Reply

    Sir, I tried the exercise with 'HR_comma_sep.csv'.
    I cannot get accuracy > 0.78… how can I improve the model?
    thanks a lot …

  29. Prajual Pillai January 10, 2020 at 1:08 am - Reply

    you should not force an accent.

  30. Patrik Buess January 10, 2020 at 1:08 am - Reply

    thx Sir!

  31. Hemanth Peddi January 10, 2020 at 1:08 am - Reply

    what is penalty=l2 in the output 5 at 13:56 ? Can you please explain the parameters of the function

  32. shubham jain January 10, 2020 at 1:08 am - Reply

    sir I have got an error "too many values to unpack" at 11:24,please help me to resolve this issue.

  33. Arunav Rath January 10, 2020 at 1:08 am - Reply

    Superb..I have question though, after building the model in the exercise, how do we apply to new employees? I mean I want to check the probability of retaining a new set of employees.How to do that?

  34. Ankit Parashar January 10, 2020 at 1:08 am - Reply

    76% with salary

  35. Umashankar verma January 10, 2020 at 1:08 am - Reply

    love your videos.

  36. Justin Dates January 10, 2020 at 1:08 am - Reply

    how do you import the csv file?

  37. siddhant ranjan January 10, 2020 at 1:08 am - Reply

    accuracy-100%

  38. Md Irshad January 10, 2020 at 1:08 am - Reply

    Hi sir can you make a video for given exercise …so that we can understand how to analyse also ..pls😊 your video is awesome..

  39. Sunday Honesty January 10, 2020 at 1:08 am - Reply

    Thank you Codebasics for helping me understand linear regression. I have a question for Codebasics and everyone; please do answer. Thanks in advance.
    I want to perform a logistic regression. I was asked to use state, political party, and votes received as my independent variables and to predict whether a political party wins or loses. My country has 36 states, and I want to use the 3 dominant parties as a case study. My problem is how to lay out this data: I am unable to put party in a separate column unless I take one political party and one state, do the prediction explicitly, and then move on to another.
    Please, I really need your help to resolve this issue. Thanks in advance.

  40. Uttam Dey January 10, 2020 at 1:08 am - Reply

    Sir, in the exercise section how did you decide that the df['left']==1 is the employees who left and df['left']==0 are the one retained. As I initially thought vice versa.Please respond to this query how to decide in such circumstance.

    By the way the tutorials are really helpful and thank you very much for the helpful tutorials.

  41. Riefvan Achmad Masrury January 10, 2020 at 1:08 am - Reply

    Really nice and clear explanation, will be very useful for my students

  42. Naveen kumar M January 10, 2020 at 1:08 am - Reply

    Thanks codebasics for such a clear explanations with examples.

    For the exercise problem, satisfaction level, average monthly hours, promotion last 5 years and salary are independent variables right ??
    If so, why not 'time_spend_company' and 'Work_accident' considered ?
    Can you please explain me actually i didn't get how to conclude the variables as dependent and independent specifically for this exercise problem. .
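
Two of the errors reported in comments 24 and 26 above can be reproduced and avoided with a short sketch. It assumes scikit-learn and NumPy are installed; the six data points are made up for illustration. The first part shows that `predict` needs a 2D input even for a single age, as the traceback's own hint about `reshape(1, -1)` suggests. The second part shows that `predict_proba` is available on `LogisticRegression` but not on `LinearRegression`.

```python
import numpy as np
from sklearn.linear_model import LinearRegression, LogisticRegression

# Made-up toy data for illustration only: one feature (age), binary label.
X = np.array([[20], [25], [30], [45], [50], [60]])
y = np.array([0, 0, 0, 1, 1, 1])

model = LogisticRegression().fit(X, y)

# Passing a bare scalar raises ValueError because sklearn expects a 2D array.
try:
    model.predict(56)
    scalar_error = None
except ValueError as exc:
    scalar_error = exc

# One sample with one feature: wrap it as [[56]] or reshape to (1, -1).
single_prediction = model.predict([[56]])
single_probability = model.predict_proba(np.array([56]).reshape(1, -1))

# predict_proba exists on the classifier, not on the linear regressor.
linear_has_proba = hasattr(LinearRegression(), "predict_proba")
logistic_has_proba = hasattr(model, "predict_proba")

print("scalar input error:", scalar_error)
print("prediction for age 56:", single_prediction)
print("LinearRegression has predict_proba:", linear_has_proba)
print("LogisticRegression has predict_proba:", logistic_has_proba)
```

If `predict_proba` is reported missing, it is worth checking that the model was created from `LogisticRegression` rather than `LinearRegression`.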
