Data Science Full Course - Learn Data Science Beginners day




So what you're going to do is build a classifier that predicts by using these 165 observations. You'll feed all of these 165 observations to your classifier, and it will predict the output every time a new patient's details are fed to the classifier. Now, out of these 165 cases, let's say that the classifier predicted yes 110 times and no 55 times. All right, so yes basically stands for yes, the person has the disease, and no stands for no, the person does not have the disease. That's pretty self-explanatory. So it predicted that 110 times the patient has the disease and 55 times that no, the patient doesn't have the disease. However, in reality only 105 patients in the sample have the disease and 60 patients do not have the disease, right? So how do you calculate the accuracy of your model? You basically build the confusion matrix.

All right. This is how the matrix looks. n basically denotes the total number of observations that you have, which is 165 in our case. Actual denotes the actual values in the data set, and predicted denotes the values predicted by the classifier. So the actual value is no here and the predicted value is no here, so your classifier was correctly able to classify 50 cases as no. But 10 of these cases it incorrectly classified, meaning that the actual value here is no but your classifier predicted it as yes. That's why this value is 10. And over here, similarly, it wrongly predicted that five patients do not have the disease whereas they actually did have it, and it correctly predicted 100 patients who have the disease. All right. I know this is a little bit confusing. But if you look at these values: no, no, 50 means that it correctly predicted 50 values as no. No, yes means that it wrongly predicted yes for the values where it was supposed to predict no.

All right. Now, what exactly is this true positive, true negative and all of that? I'll tell you what exactly it is. True positives are the cases in which we predicted yes and they do actually have the disease. So it is basically this value: we predicted yes here, and they did have the disease. So we have 100 true positives, right? Similarly, true negative is where we predicted no and they don't have the disease, meaning that this is correct. False positive is where we predicted yes, but they do not actually have the disease. This is also known as a type 1 error. False negative is where we predicted no, but they actually do have the disease. So guys, basically true positives and true negatives are the correct classifications. All right. So this was the confusion matrix, and I hope this concept is clear. Again guys, if you have doubts, please leave them in the comment section.

So guys, that was descriptive statistics. Now, before we go to probability, I promised you all that we'll run a small demo in R. We'll try and understand how mean, median and mode work in R. Okay, so let's do that first. So guys, again, what we just discussed so far was descriptive statistics. Next we're going to discuss probability, and then we'll move on to inferential statistics. Inferential statistics is basically the second type of statistics. Okay, now to make things more clear for you, let me just zoom in. So guys, it's always best to perform practical implementations in order to understand the concepts in a better way. So here we'll be executing a small demo that will show you how to calculate the mean, median, mode, variance and standard deviation, and how to study the variables by plotting a histogram.
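Going back to the confusion matrix above, here is a minimal sketch that plugs in the four counts quoted in the walkthrough and checks that they add up to the totals mentioned. The variable names are only illustrative, and the accuracy line assumes the usual definition (correct predictions divided by all predictions), which the video does not spell out.

```python
# Counts quoted in the walkthrough (165 patients in total).
tn = 50   # actual no,  predicted no  (true negatives)
fp = 10   # actual no,  predicted yes (false positives, type 1 error)
fn = 5    # actual yes, predicted no  (false negatives)
tp = 100  # actual yes, predicted yes (true positives)

total = tn + fp + fn + tp

# Assumed definition: accuracy = correct predictions / all predictions.
accuracy = (tp + tn) / total

print(total)               # 165 observations
print(tp + fp, tn + fn)    # predicted yes / predicted no: 110 55
print(tp + fn, tn + fp)    # actual yes / actual no: 105 60
print(round(accuracy, 3))  # 0.909
```

The row and column sums match the numbers in the example (110 and 55 predicted, 105 and 60 actual), which is a quick way to confirm that each cell has been read correctly.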




Okay. Don't worry if you don't know what a histogram is. It's basically a frequency plot. There's no big science behind it. All right, this is a very simple demo, but it also forms the foundation that every machine learning algorithm is built upon. You can say that most of the machine learning algorithms, actually all the machine learning algorithms and deep learning algorithms, have this basic concept behind them. So you need to know how mean, median, mode and all of that is calculated.

So guys, I'm using the R language to perform this, and I'm running this in RStudio. For those of you who don't know the R language, I will leave a couple of links in the description box. You can go through those videos. So what we're doing is randomly generating numbers and storing them in a variable called data, right? If you want to see the generated numbers, just run the line data. This variable basically stores all our numbers.

Now, what we're going to do is calculate the mean. All you have to do in R is specify the word mean along with the data that you're calculating the mean of, and I've assigned this whole thing to a variable called mean, which will just hold the mean value of this data. So now let's look at the mean. For that I'll use a function called print and pass mean. All right. So our mean is around 5.99.

Next is calculating the median. It's very simple, guys. All you have to do is use the function median and pass the data as a parameter to this function. That's all you have to do. So R provides functions for each and everything. Statistics is very easy when it comes to R, because R is basically a statistical language. So all you have to do is just name the function, and that function is already built into R. Okay, so your median is around 6.4. Similarly, we will calculate the mode. Let's run this function. I basically created a small function for calculating the mode.
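The demo itself is run in R and its code is not shown in the transcript. As a rough stand-in, here is a Python sketch (standard library only) of the statistics this demo computes: the mean, median and mode covered so far, plus the variance and standard deviation that come next. The data, the seed and the helper name `mode_of` are made up for illustration, so the numbers will not match the 5.99 and 6.4 seen in the video.

```python
import random
import statistics

random.seed(1)  # arbitrary seed so the run is repeatable
# Stand-in data: 50 random integers between 1 and 10 (not the video's data).
data = [random.randint(1, 10) for _ in range(50)]

def mode_of(values):
    """Return the most frequently occurring value (first one seen on ties)."""
    counts = {}
    for v in values:
        counts[v] = counts.get(v, 0) + 1
    return max(counts, key=counts.get)

mean_value = statistics.mean(data)
median_value = statistics.median(data)
mode_value = mode_of(data)
variance = statistics.variance(data)  # sample variance (n - 1 denominator)
std_dev = statistics.stdev(data)      # square root of the sample variance

print(mean_value, median_value, mode_value, variance, std_dev)
```

Like the demo, the mode needs a small hand-written helper here, while the other statistics are single function calls.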
So guys, this is our mode, meaning that this is the most recurrent value. Right, now we're going to calculate the variance and the standard deviation. For that, again, we have a function in R called var. All right.

All you have to do is pass the data to that function. Similarly, we'll calculate the standard deviation, which is basically the square root of your variance. Now we'll print the standard deviation, right? This is our standard deviation value. Now, finally, we will just plot a small histogram. A histogram is nothing but a frequency plot. It'll show you how frequently a data point is occurring. So this is the histogram that we've just created. It's quite simple in R, because R has a lot of packages and a lot of inbuilt functions that support statistics. It is a statistical language that is mainly used by data scientists, data analysts and machine learning engineers, because they don't have to sit and code these functions. All they have to do is mention the name of the function and pass the corresponding parameters.

So guys, that was the entire descriptive statistics module, and now we will discuss probability. Okay. So before we understand what exactly probability is, let me clear up a very common misconception. People often tend to ask me this question: what is the relationship between statistics and probability? So probability and statistics are related fields. Probability is a mathematical method used for statistical analysis. Therefore we can say that probability and statistics are interconnected branches of mathematics that deal with analyzing the relative frequency of events. Probability makes use of statistics and statistics makes use of probability, so they're very interconnected fields. That is the relationship between statistics and probability.

Now, let's understand what exactly probability is. Probability is the measure of how likely an event is to occur. To be more precise, it is the ratio of desired outcomes to the total outcomes. Now, the probabilities of all outcomes always sum up to 1. A probability cannot go beyond 1. Okay.

So your probability can be 0, or it can be 1, or it can be in the form of decimals like 0.52 or 0.55, or 0.5, 0.7, 0.9. But its value will always stay between 0 and 1. Okay, another famous example of probability is the rolling a dice example. So when you roll a dice you get six possible outcomes, right? You get the one, two, three, four, five and six faces of a dice. Now each possibility only has one outcome. So what is the probability that on rolling a dice you will get a 3? The probability is 1 by 6, right? Because out of six faces there's only one face which has the number 3 on it. So the probability of getting a 3 when you roll a dice is 1 by 6. Similarly, if you want to find the probability of getting the number 5, again the probability is going to be 1 by 6.
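The dice example can be written out as a tiny sketch of the "desired outcomes over total outcomes" ratio. The helper name `probability` is hypothetical, and the sketch assumes a fair six-sided dice where every face is equally likely.

```python
from fractions import Fraction

# The six faces of a fair dice: every face is assumed equally likely.
faces = [1, 2, 3, 4, 5, 6]

def probability(event, sample_space):
    """Ratio of desired outcomes to total outcomes."""
    favourable = [outcome for outcome in sample_space if outcome in event]
    return Fraction(len(favourable), len(sample_space))

p_three = probability({3}, faces)
p_five = probability({5}, faces)
total = sum(probability({face}, faces) for face in faces)

print(p_three, p_five, total)  # 1/6 1/6 1
```

Summing the probability of each individual face gives exactly 1, which is the "all outcomes sum up to 1" property in action.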




All right. So all of this will sum up to 1. So guys, this is exactly what probability is. It's a very simple concept. We all learnt it from 8th standard onwards, right? Now, let's understand the different terminologies that are related to probability. There are three terminologies that you often come across when we talk about probability. We have something known as the random experiment. It's basically an experiment or a process for which the outcomes cannot be predicted with certainty. That's why you use probability: you're going to use probability in order to predict the outcome with some sort of certainty. Sample space is the entire possible set of outcomes of a random experiment, and an event is one or more outcomes of an experiment.

So consider the example of rolling a dice. Let's say that you want to find out the probability of getting a 2 when you roll the dice. Finding this probability is the random experiment. The sample space is basically your entire set of possibilities. So one, two, three, four, five, six are there, and out of that you need to find the probability of getting a 2, right? So all the possible outcomes basically represent your sample space. 1 to 6 are all your possible outcomes, and this represents your sample space. Now, an event is one or more outcomes of an experiment. So in this case my event is getting a 2 when I roll a dice. So guys, this is basically what random experiment, sample space and event really mean.

All right, now let's discuss the different types of events. There are two types of events that you should know about: disjoint and non-disjoint events. Disjoint events are events that do not have any common outcome. For example, if you draw a single card from a deck of cards, it cannot be a king and a queen, correct? It can either be a king or it can be a queen. Non-disjoint events are events that have common outcomes.

For example, a student can get 100 marks in statistics and 100 marks in probability.





All right, and also the outcome of a ball delivered can be a no-ball and it can be a 6, right? So this is what non-disjoint events are. All right? These are very simple to understand. Now, let's move on and look at the different types of probability distribution. I'll be discussing the three main probability distribution functions: I'll be talking about the probability density function, the normal distribution and the central limit theorem.

Okay, the probability density function, also known as PDF, is concerned with the relative likelihood for a continuous random variable to take on a given value. So the PDF gives the probability of a variable that lies between the range a and b. Basically what you're trying to do is find the probability of a continuous random variable over a specified range. Now, this graph denotes the PDF of a continuous variable. This graph is also known as the bell curve, right? It's famously called the bell curve because of its shape.

There are three important properties that you need to know about a probability density function. First, the graph of a PDF will be continuous over a range. This is because you're finding the probability that a continuous variable lies between a and b. The second property is that the area bounded by the curve of a density function and the x-axis is equal to 1. Basically the area below the curve is equal to 1, because it denotes probability, and again the probability cannot range above 1. It has to be between 0 and 1. Property number three is that the probability that our random variable assumes a value between a and b is equal to the area under the PDF bounded by a and b. What this means is that the probability value is denoted by the area of the graph. So whatever value you get here is basically the probability that a random variable will lie between a and b.

All right. So I hope all of you have understood the probability density function.

It's basically the probability of finding the value of a continuous random variable between the range a and b. All right. Now, let's look at our next distribution, which is the normal distribution.
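To make the second and third PDF properties concrete, here is a small sketch. It assumes a standard normal distribution and the range a = -1 to b = 1 purely as an example (the video's graph does not give specific numbers), and it compares a crude numerical area under the curve with the exact probability from the cumulative distribution function.

```python
from statistics import NormalDist

# Assumed example distribution: standard normal (mean 0, standard deviation 1).
dist = NormalDist(mu=0.0, sigma=1.0)
a, b = -1.0, 1.0  # assumed example range

# Exact probability that the variable lies between a and b.
p_between = dist.cdf(b) - dist.cdf(a)

# Property 3: the same probability as the area under the PDF between a and b,
# approximated with many thin rectangles (midpoint rule).
steps = 10_000
width = (b - a) / steps
area = sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width

# Property 2: the total area under the curve is 1 (checked over a wide range).
total_area = dist.cdf(10.0) - dist.cdf(-10.0)

print(round(p_between, 4), round(area, 4), round(total_area, 4))
```

The first two printed values agree, which is exactly the statement that the probability between a and b equals the area under the curve bounded by a and b.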




Now, the normal distribution, which is also known as the Gaussian distribution, is a probability distribution that denotes the symmetric property of the mean. The idea behind this function is that the data near the mean occurs more frequently than the data away from the mean. So what it means to say is that the data around the mean represents the entire data set. If you just take a sample of data around the mean, it can represent the entire data set. Now, similar to the probability density function, the normal distribution appears as a bell curve, right?

When it comes to the normal distribution, there are two important factors. We have the mean of the population and the standard deviation. The mean determines the location of the center of the graph, and the standard deviation determines the height and width of the graph. So if the standard deviation is large, the curve is going to look something like this: it'll be short and wide. And if the standard deviation is small, the curve is tall and narrow. All right. So this was it about the normal distribution.

Now, let's look at the central limit theorem. The central limit theorem states that the sampling distribution of the mean of any independent random variable will be normal or nearly normal if the sample size is large enough. Now, that's a little confusing. Okay, let me break it down for you. In simple terms, if we had a large population and we divided it into many samples, then the mean of each sample from the population will be almost equal to the mean of the entire population, right? Meaning that the sample means are normally distributed around the population mean. So if you compare the mean of each of the samples, it will almost be equal to the mean of the population. This graph basically gives a clearer understanding of the central limit theorem, right? You can see each sample here, and the mean of each sample is almost along the same line. Okay.

So this is exactly what the central limit theorem states. Now, the accuracy, or the resemblance to the normal distribution, depends on two main factors. The first is the number of sample points that you consider, and the second is the shape of the underlying population. Now, the shape obviously depends on the standard deviation and the mean. So guys, the central limit theorem basically states that the sample means will be normally distributed in such a way that the mean of each sample will roughly coincide with the mean of the actual population. All right, in short, that's what the central limit theorem states.
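A quick simulation can make the central limit theorem less abstract. The sketch below is only an illustration under assumed settings: the population is deliberately skewed (exponentially distributed), and the population size, sample size, number of samples and seed are all made-up choices, not values from the video.

```python
import random
import statistics

random.seed(0)  # arbitrary seed so the run is repeatable

# Assumed population: 20,000 values from a skewed (exponential) distribution.
population = [random.expovariate(1.0) for _ in range(20_000)]
population_mean = statistics.mean(population)

# Draw many samples and record the mean of each one.
sample_size = 50
sample_means = [
    statistics.mean(random.sample(population, sample_size))
    for _ in range(500)
]

mean_of_means = statistics.mean(sample_means)
spread_of_means = statistics.stdev(sample_means)

print(round(population_mean, 3), round(mean_of_means, 3))
print(round(statistics.stdev(population), 3), round(spread_of_means, 3))
```

Even though the population itself is far from bell-shaped, the sample means cluster tightly around the population mean, and their spread is much smaller than the spread of the population. Increasing `sample_size` tightens the cluster further.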




All right, and this holds true mostly for a large data set. For a small data set there are more deviations when compared to a large data set. This is because of the scaling factor, right? The smallest deviation in a small data set will change the value very drastically, but in a large data set a small deviation will not matter at all.

Now, let's move on and look at our next topic, which is the different types of probability. This is an important topic, because most of your problems can be solved by understanding which type of probability you should use to solve the problem. So we have three important types of probability: marginal, joint and conditional probability. Let's discuss each of these.

The probability of an event occurring, unconditioned on any other event, is known as marginal or unconditional probability. So let's say that you want to find the probability that a card drawn is a heart. The probability will be 13 by 52, since there are 52 cards in a deck and 13 of them are hearts. So your marginal probability will be 13 by 52. That's about marginal probability.

Now, let's understand what joint probability is. Joint probability is a measure of two events happening at the same time. Let's say that the two events are A and B. So the probability of events A and B both occurring is the probability of the intersection of A and B. For example, if you want to find the probability that a card is a four and red, that would be a joint probability, because you're finding a card that is a 4 and the card has to be red in color.

So the answer to this would be 2 by 52, because we have one 4 in hearts and one 4 in diamonds, correct? Both of these are red in color, therefore our probability is 2 by 52, and if you reduce it further it is 1 by 26. So this is what joint probability is all about.

Moving on, let's look at what exactly conditional probability is. If the probability of an event or an outcome is based on the occurrence of a previous event or outcome, then you call it a conditional probability. So the conditional probability of an event B is the probability that the event will occur given that an event A has already occurred. If A and B are dependent events, then the expression for conditional probability is given by this. Now, this first term on the left-hand side, which is P(B|A), is basically the probability of event B occurring given that event A has already occurred. So like I said, if A and B are dependent events then this is the expression, but if A and B are independent events, then the expression for conditional probability is like this, right? So guys, P(A) and P(B) are obviously the probability of A and the probability of B.

Now, let's move on. In order to understand conditional probability, joint probability and marginal probability, let's look at a small use case. Basically we're going to take a data set which examines the salary package and the training undergone by candidates. In this there are 60 candidates without training and 45 candidates who have enrolled for Edureka's training. Now, the task here is to assess the training against the salary package. Let's look at this in a little more depth. In total we have 105 candidates, out of which 60 have not enrolled for Edureka's training and 45 have enrolled for Edureka's training. This is the small survey that was conducted, and this is the rating of the package or the salary that they got, right?

So if you read through the data, you can understand that there were five candidates without Edureka training who got a very poor salary package. Similarly, there are 30 candidates with Edureka training who got a good package. So guys, basically you're comparing the salary package of a person depending on whether or not they've enrolled for Edureka training. This is our data set. Now, let's look at our problem statement: find the probability that a candidate has undergone Edureka's training. Quite simple,




which type of probability is this? This is marginal probability, right? So the probability that a candidate has undergone Edureka's training is obviously 45 divided by 105, since 45 is the number of candidates with Edureka training and 105 is the total number of candidates. So you get a value of approximately 0.43. That's the probability that a candidate has undergone Edureka's training.

Next question: find the probability that a candidate has attended Edureka's training and also has a good package. Now, this is obviously a joint probability problem, right? So how do you calculate this? Since our table is nicely formatted, we can directly find that the people who have gotten a good package along with Edureka training are 30. So out of 105 people, 30 people have Edureka training and a good package. They are specifically asking for people with Edureka training, remember that. The question is to find the probability that a candidate has attended Edureka's training and also has a good package. So we need to consider two factors: a candidate who has attended Edureka's training and who has a good package. Clearly that number is 30, so it is 30 divided by the total number of candidates, which is 105. So here you get the answer clearly.

Next we have: find the probability that a candidate has a good package given that he has not undergone training. Now, this is clearly conditional probability, because here you're defining a condition. You're saying that you want to find the probability of a candidate who has a good package given that he's not undergone any training. The condition is that he's not undergone any training. So the number of people who have not undergone training is 60, and out of those, five have got a good package. That's why this is 5 by 60 and not 5 by 105, because here they have clearly mentioned: has a good package given that he has not undergone training.
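The three questions from the use case can be recomputed from the counts quoted in the walkthrough. This is a minimal sketch that uses only those quoted counts (the full survey table is not reproduced in the transcript); the variable names are illustrative.

```python
from fractions import Fraction

# Counts quoted in the walkthrough.
total_candidates = 105
with_training = 45
without_training = 60
good_with_training = 30     # trained candidates with a good package
good_without_training = 5   # untrained candidates with a good package

# Marginal: P(training)
p_training = Fraction(with_training, total_candidates)

# Joint: P(training and good package)
p_training_and_good = Fraction(good_with_training, total_candidates)

# Conditional: P(good package | no training), restricted to the 60 untrained.
p_good_given_no_training = Fraction(good_without_training, without_training)

print(p_training, round(float(p_training), 2))                              # 3/7 0.43
print(p_training_and_good, round(float(p_training_and_good), 2))            # 2/7 0.29
print(p_good_given_no_training, round(float(p_good_given_no_training), 2))  # 1/12 0.08
```

Note how only the denominator changes between the joint and the conditional case: the joint probability divides by all 105 candidates, while the conditional one divides by the 60 candidates who satisfy the condition.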
You only have to consider people who have not undergone training, right? Only five people who have not undergone training have gotten a good package. So 5 divided by 60: you get a probability of around 0.08, which is pretty low, right? Okay. So this was all about the different types of probability. Now, let's move on and look at our last topic in probability, which is Bayes' theorem. Now guys, Bayes' theorem is a very important concept when it comes to statistics and probability. It is majorly used in the Naive Bayes algorithm, for those of you who aren't aware.





Now, Naive Bayes is a supervised learning classification algorithm, and it is mainly used in Gmail spam filtering, right? A lot of you might have noticed that if you open up Gmail, you'll see that you have a folder called spam. All of that is carried out through machine learning, and the algorithm used there is Naive Bayes. So now let's discuss what exactly Bayes' theorem is and what it denotes.

Bayes' theorem is used to show the relation between one conditional probability and its inverse. Basically it's nothing but the probability of an event occurring based on prior knowledge of conditions that might be related to the same event. Mathematically, Bayes' theorem is represented like this, as shown in this equation. The term on the left-hand side is what is known as the posterior, which means the probability of occurrence of A given an event B. The second term is referred to as the likelihood ratio. This measures the probability of occurrence of B given an event A. Now, P(A) is also known as the prior, which refers to the actual probability distribution of A, and P(B) is again the probability of B. This is Bayes' theorem.

In order to better understand Bayes' theorem, let's look at a small example. Let's say that we have three bowls: bowl A, bowl B and bowl C. Bowl A contains two blue balls and four red balls, bowl B contains eight blue balls and four red balls, and bowl C contains one blue ball and three red balls. Now, if we draw one ball from each bowl, what is the probability of drawing a blue ball from bowl A if we know that we drew exactly two blue balls in total? If you didn't understand the question, please read it again. I shall pause for a second or two. Right. So I hope all of you have understood the question. Okay.

Now, what I'm going to do is draw a blueprint for you and tell you how exactly to solve the problem. But I want you all to give me the solution to this problem, right? I'll tell you what exactly the steps are, but I want you to come up with the solution on your own. The formula is also given to you. Everything is given to you. All you have to do is come up with the final answer.

Let's look at how you can solve this problem. So first of all, let A be the event of picking a blue ball from bowl A, and let X be the event of picking exactly two blue balls, because these are the two events that we need to calculate the probability of. So there are two events that you need to consider here. One is the event of picking a blue ball from bowl A, and the other is the event of picking exactly two blue balls. These two are represented by A and X respectively. So what we want is the probability of occurrence of event A given X, which means: given that we're picking exactly two blue balls, what is the probability that we are picking a blue ball from bowl A? By the definition of conditional probability, this is exactly what our equation will look like. This is the occurrence of event A given an event X, this is the probability of A and X, and this is the probability of X alone, correct? And what we need to do is find these two probabilities: the probability of A and X occurring together, and the probability of X. Okay. This is the entire solution.

So how do you find the probability of X? Step one is to find the probability of X, and X basically represents the event of picking exactly two blue balls. This can happen in three ways. In the first case, you'll pick one blue ball from bowl A and one from bowl B, with a red ball from bowl C. In the second case,

you can pick one blue ball from A and another blue ball from C. In the third case, you can pick a blue ball from bowl B and a blue ball from bowl C. Right? These are the three ways in which it is possible, so you need to find the probability of each of these. Step two is that you need to find the probability of A and X occurring together. This is the sum of terms 1 and 2. This is because in both of these cases we are picking a blue ball from bowl A, correct? So go ahead, find out this probability and let me know your answer in the comment section. All right.
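If you want to check your own answer to the bowl problem, the blueprint above can be followed step by step in a few lines. This is only a sketch: it assumes the three draws are independent and that every ball in a bowl is equally likely to be picked, and the names (`p_blue`, `p_red`, `term_ab` and so on) are illustrative.

```python
from fractions import Fraction

# Probability of drawing a blue ball from each bowl (blue balls / all balls).
p_blue = {
    "A": Fraction(2, 2 + 4),
    "B": Fraction(8, 8 + 4),
    "C": Fraction(1, 1 + 3),
}

def p_red(bowl):
    """Probability of drawing a red ball from the given bowl."""
    return 1 - p_blue[bowl]

# Step 1: the three ways of getting exactly two blue balls (event X).
term_ab = p_blue["A"] * p_blue["B"] * p_red("C")  # blue from A and B, red from C
term_ac = p_blue["A"] * p_red("B") * p_blue["C"]  # blue from A and C, red from B
term_bc = p_red("A") * p_blue["B"] * p_blue["C"]  # blue from B and C, red from A
p_x = term_ab + term_ac + term_bc

# Step 2: A and X together is the sum of the terms that include a blue from A.
p_a_and_x = term_ab + term_ac

# Definition of conditional probability: P(A | X) = P(A and X) / P(X).
p_a_given_x = p_a_and_x / p_x

print(p_x, p_a_and_x, p_a_given_x)  # 11/36 7/36 7/11
```

Using exact fractions rather than decimals keeps each of the three terms easy to compare against a hand calculation.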






We'll see if you get the answer right. I gave you the entire solution to this. All you have to do is substitute the values, right? If you want a second or two, I'm going to pause on this screen so that you can go through it more clearly. Remember that you need to calculate two probabilities. The first probability that you need to calculate is the event of picking a blue ball from bowl A together with picking exactly two blue balls. The second probability you need to calculate is the event of picking exactly two blue balls. These are the two probabilities you need to calculate, so remember that, and this is the solution. All right, so guys, make sure you mention your answers in the comment section.

For now, let's move on and look at our next topic, which is inferential statistics. So guys, we just completed the probability module. Now we will discuss inferential statistics, which is the second type of statistics. We discussed descriptive statistics earlier. So like I mentioned earlier, inferential statistics, also known as statistical inference, is a branch of statistics that deals with forming inferences and predictions about a population based on a sample of data taken from the population. And the question you should ask is: how does one form inferences or predictions from a sample? The answer is, you use point estimation.

Now you must be wondering what point estimation is. Point estimation is concerned with the use of the sample data to measure a single value which serves as an approximate value, or the best estimate, of an unknown population parameter. That's a little confusing. Let me break it down for you. For example, in order to calculate the mean of a huge population, what we do is first draw a sample out of the population, and then we find the sample mean. The sample mean is then used to estimate the population mean. This is basically a point estimate. You're estimating the value of one of the parameters of the population, right?

Basically the mean: you're trying to estimate the value of the mean. This is what point estimation is. There are two main terms in point estimation: something known as the estimator and something known as the estimate. The estimator is a function of the sample that is used to find out the estimate. In this example it's basically the sample mean. So a function that calculates the sample mean is known as the estimator, and the realized value of the estimator is the estimate, right?
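The estimator versus estimate distinction maps neatly onto code: the estimator is a function, and the estimate is the number it returns for one particular sample. The sketch below uses a made-up population (normally distributed around 50) and an arbitrary seed and sample size, none of which come from the video.

```python
import random
import statistics

random.seed(42)  # arbitrary seed so the run is repeatable

# Made-up population; in practice the population mean would be unknown.
population = [random.gauss(50, 10) for _ in range(10_000)]

def sample_mean(sample):
    """The estimator: a function of the sample."""
    return sum(sample) / len(sample)

# Draw one sample and apply the estimator to it.
sample = random.sample(population, 100)
estimate = sample_mean(sample)  # the estimate: the realized value

population_mean = statistics.mean(population)
print(round(estimate, 2), round(population_mean, 2))
```

Drawing a different sample would give a slightly different estimate from the same estimator, which is why a single point estimate carries no information about its own uncertainty.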





So I hope point estimation is clear. Now, how do you find the estimates? There are four common ways in which you can do this. The first one is the method of moments. Here what you do is form an equation in the sample data set and then analyze the similar equation in the population data set as well, like the population mean, population variance and so on. So in simple terms, what you're doing is taking some known facts about the population and extending those ideas to the sample. Once you do that, you can analyze the sample and estimate more essential or more complex values.

Next we have maximum likelihood. This method basically uses a model to estimate a value. Maximum likelihood is majorly based on probability, so there's a lot of probability involved in this method. Next we have the Bayes estimator. This works by minimizing the errors or the average risk. The Bayes estimator has a lot to do with Bayes' theorem. All right, let's not get into the depth of these estimation methods. Finally, we have the best unbiased estimators. In this method, there are several unbiased estimators that can be used to approximate a parameter. Okay.

So guys, these were a couple of methods that are used to find the estimate, but the most well-known method to find the estimate is known as interval estimation. This is one of the most important estimation methods. All right, this is where the confidence interval also comes into the picture. Apart from interval estimation, we also have something known as margin of error. So I'll be discussing all of this in the upcoming slides.

So first let's understand what an interval estimate is. An interval, or range of values, which is used to estimate a population parameter is known as an interval estimate, right? That's very understandable. Basically what I'm trying to say is that you're going to estimate the value of a parameter. Let's say you're trying to find the mean of a population.

What you're going to do is build a range, and your value will lie in that range or in that interval. So this way your output is going to be more accurate, because you've not predicted a single point estimate. Instead, you have estimated an interval within which your value might occur, right? Okay. Now, this image clearly shows how the point estimate and the interval estimate are different. So the interval estimate is obviously more accurate, because you're not just focusing on a particular value or a particular point in order to predict the parameter. Instead, you're saying that the value might be within this range between the lower confidence limit and the upper confidence limit. All right, this denotes the range or the interval.






Okay, if you're still confusedabout interval estimation, let me give you a small example if I stated that I will take30 minutes to reach the theater. This is knownas Point estimation. Okay, but if I stated that I will takebetween 45 minutes to an hour to reach the theater. This is an exampleof Will estimation all right. I hope it's clear. Now now interval estimationgives rise to two important statistical terminologies oneis known as confidence interval and the other is knownas margin of error. All right. So there's it's important that you pay attention to both of these terminologiesconfidence interval is one of the most significant measures that are used to check how essential machinelearning model is. All right. So what is confidence intervalconfidence interval is the measure of your confidence that the intervalestimated contains the population parameteror the population mean or any of those parametersright now statisticians use confidence intervalto describe the amount of uncertainty associated with the sample estimate ofa population parameter now guys, this is a lot of definition. Let me just make youunderstand confidence interval with a small example. Okay. Let's say that you perform a survey and you surveya group of cat owners. The see how many cans of catfood they purchase in one year. Okay, you test your statistics at the 99percent confidence level and you geta confidence interval of hundred comma 200 this means that you think that the cat owners by between hundred to twohundred cans in a year and also since the confidencelevel is 99% shows that you're very confidentthat the results are, correct. Okay. I hope all of youare clear with that. Alright, so your confidenceinterval here will be a hundred and two hundred and your confidence levelwill be 99% Right? 
That's the difference between the confidence interval and the confidence level. Your value is going to lie within your confidence interval, and your confidence level shows how confident you are about your estimation. I hope that was clear.

Let's look at the margin of error. Now, the margin of error for a given level of confidence is the greatest possible distance between the point estimate and the value of the parameter that it is estimating. You can say that it is a deviation from the actual point estimate. The margin of error can be calculated using this formula: E = z_c × σ / √n. Here z_c denotes the critical value for the chosen confidence level, and it is multiplied by the standard deviation σ divided by the root of the sample size, where n is the sample size.

Now, let's understand how you can estimate confidence intervals. The level of confidence, which is denoted by c, is the probability that the interval estimate contains the population parameter. Let's say that you're trying to estimate the mean. The area beneath this curve between -z and z is nothing but the probability that the interval estimate contains the population parameter. It should basically contain the value that you are predicting. Now, these points, -z and z, are known as critical values; they are basically the lower limit and the upper limit for your confidence level. There's also something known as the z-score. This score can be found by using the standard normal table. If you look it up anywhere on Google, you'll find the z-score table or the standard normal table. To understand how this is done, let's look at a small example. Let's say that the level of confidence is 90%. This means that you are 90% confident that the interval contains the population mean. The remaining 10%, out of the hundred percent, is equally distributed over the two tail regions.
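The table lookup described above can also be done in code. The following is a minimal sketch, assuming Python's standard-library `statistics.NormalDist` stands in for the printed standard normal table; the function name `critical_z` is made up for this illustration.

```python
from statistics import NormalDist

def critical_z(confidence_level: float) -> float:
    """Return z_c so that the area between -z_c and z_c under the
    standard normal curve equals the given confidence level."""
    # Whatever is left over is split equally between the two tails.
    tail_area = (1.0 - confidence_level) / 2.0
    # inv_cdf gives the z value with the requested area to its left.
    return NormalDist().inv_cdf(1.0 - tail_area)

for level in (0.90, 0.95, 0.99):
    print(level, round(critical_z(level), 3))
```

For a 90% level, each tail holds 0.05, so the lookup is the z value with 0.95 of the area to its left.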
Okay, so you have 0.05 here and 0.05 over here. On either side of c, you distribute the leftover percentage. Now, these z-scores are calculated from the table, as I mentioned before: 1.645 is read from the standard normal table. So guys, that's how you estimate the level of confidence.

To sum it up, let me tell you the steps that are involved in constructing a confidence interval. First, you start by identifying a sample statistic. This is the statistic that you will use to estimate a population parameter; it can be anything, like the mean of the sample. Next, you select a confidence level. The confidence level describes the uncertainty of a sampling method. After that, you find the margin of error, which we discussed earlier, based on the equation that I explained in the previous slide. Finally, you specify the confidence interval.

Now, let's look at a problem statement to better understand this concept. A random sample of 32 textbook prices is taken from a local college bookstore. The mean of the sample is so and so, and the sample standard deviation is this. Use a 95% confidence level and find the margin of error for the mean price of all textbooks in the bookstore. Now, this is a very straightforward question. If you want, you can read the question again. All you have to do is substitute the values into the equation. So guys, we know the formula for the margin of error. You take the z-score from the table. After that we have the standard deviation, which is 23.44, and n stands for the number of samples here.
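The substitution described above can be checked numerically. This is a rough sketch, assuming the sample standard deviation is 23.44 and that the z critical value comes from `statistics.NormalDist` instead of a printed table; `margin_of_error` is a name chosen for this example only.

```python
from math import sqrt
from statistics import NormalDist

def margin_of_error(std_dev: float, sample_size: int, confidence_level: float) -> float:
    """Margin of error E = z_c * s / sqrt(n) for a mean."""
    tail_area = (1.0 - confidence_level) / 2.0
    z_c = NormalDist().inv_cdf(1.0 - tail_area)  # about 1.96 for 95%
    return z_c * std_dev / sqrt(sample_size)

# Values from the textbook-price problem: s = 23.44, n = 32, 95% level.
moe = margin_of_error(std_dev=23.44, sample_size=32, confidence_level=0.95)
print(round(moe, 2))
```

The sample mean is not needed for the margin of error itself; it only matters when you go on to write the interval as mean minus E to mean plus E.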





The number of samples is 32, basically 32 textbooks. So your margin of error is going to be approximately 8.12. This is a pretty simple question. I hope all of you understood this.

Now that you know the idea behind the confidence interval, let's move ahead to one of the most important topics in statistical inference, which is hypothesis testing. Statisticians use hypothesis testing to formally check whether a hypothesis is accepted or rejected. Hypothesis testing is an inferential statistical technique used to determine whether there is enough evidence in a data sample to infer that a certain condition holds true for an entire population. To understand the characteristics of a general population, we take a random sample and analyze the properties of the sample. We test whether or not the identified conclusion represents the population accurately, and finally we interpret the results. Whether or not to accept the hypothesis depends upon the percentage value that we get from the hypothesis test.

To better understand this, let's look at a small example. Before that, there are a few steps that are followed in hypothesis testing. You begin by stating the null and the alternative hypothesis; I'll tell you what exactly these terms are. Then you formulate an analysis plan. After that, you analyze the sample data, and finally you interpret the results.

Now, to understand the entire hypothesis testing process, we'll look at a good example. Consider four boys, Nick, John, Bob and Harry. These boys were caught bunking a class and they were asked to stay back at school and clean the classroom as a punishment. So what John did is he decided that the four of them would take turns to clean the classroom. He came up with a plan of writing each of their names on chits and putting them in a bowl. Every day, they had to pick a name from the bowl, and that person had to clean the class, right?
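One way to put numbers on the draw just described: assuming each day's draw is independent and all four names are equally likely, the chance that a particular name is not drawn on a given day is 3/4, and the chance that it is skipped for n days running is (3/4)^n. The sketch below only illustrates that arithmetic; the function name is hypothetical.

```python
def prob_not_picked(days: int, num_names: int = 4) -> float:
    """Probability that one specific name is never drawn in `days`
    independent, unbiased draws from `num_names` chits."""
    per_day = (num_names - 1) / num_names  # 3 out of 4 for a single day
    return per_day ** days

for days in (1, 3, 12):
    print(days, "day(s):", round(prob_not_picked(days) * 100, 1), "%")
```

The longer the streak, the less plausible it is that the draw is fair, which is the intuition that hypothesis testing formalizes.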
That sounds fair enough. Now, it has been three days and everybody's name has come up except John's. Assuming that this event is completely random and free of bias, what is the probability that John is not cheating? This can be solved by using hypothesis testing. We'll begin by calculating the probability of John not being picked for a day, assuming that the event is free of bias. We get 3 out of 4, which is basically 75%. 75% is fairly high. Now, if John is not picked for three days in a row, the probability drops down to approximately 42%. Let's consider a situation where John is not picked for 12 days in a row: the probability drops down to 3.2 percent. So the probability of John cheating becomes fairly high, right?

In order for statisticians to come to a conclusion, they define what is known as the threshold value. Considering the above situation, if the threshold value is set to 5 percent, it would indicate that if the probability lies below 5%, then John is cheating his way out of detention. But if the probability is above the threshold value, then John is just lucky and his name isn't getting picked. So probability and hypothesis testing give rise to two important components of hypothesis testing: the null hypothesis and the alternative hypothesis. The null hypothesis is basically approving the assumption; the alternative hypothesis is when your result disproves the assumption. Therefore, in our example, if the probability of an event occurring is less than 5%, which it is, then the event is biased; hence it proves the alternative hypothesis.

Undoubtedly, machine learning is the most in-demand technology in today's market. Its applications are wide-ranging.
They range from self-driving cars to predicting deadly diseases such as ALS. The high demand for machine learning skills is the motivation behind today's session. So let me discuss the agenda with you first. We're going to begin the session by understanding the need for machine learning and why it is important. After that, we'll look at what exactly machine learning is, and then we'll discuss a couple of machine learning definitions. Once we're done with that, we'll look at the machine learning process and how you can solve a problem by using it. Next, we will discuss the types of machine learning, which include supervised, unsupervised and reinforcement learning. Once we're done with that, we'll discuss the different types of problems that can be solved by using machine learning. Finally, we will end this session by looking at a demo where we'll see how you can perform weather forecasting by using machine learning.

All right, so guys, let's get started with our first topic. What is the importance, or what is the need, for machine learning? Since the technical revolution, we've been generating an immeasurable amount of data. As per research, we're generating around 2.5 quintillion bytes of data every single day, and it is estimated that by 2020, 1.7 MB of data will be created every second for every person on earth. Now that is a lot of data, right?







This data comes from sources such as the cloud, IoT devices, social media and all of that. Since all of us are very interested in the internet right now, we're generating a lot of data. You have no idea how much data we generate through social media: all the chatting that we do, all the images that we post on Instagram, the videos that we watch, all of this generates a lot of data. Now, how does machine learning fit into all of this? Since we're producing this much data, we need a method that can analyze, process and interpret it, a method that can make sense out of data. And that method is machine learning.

Now, there are a lot of top-tier and data-driven companies, such as Netflix and Amazon, which build machine learning models by using tons of data in order to identify profitable opportunities, and if they want to avoid any unwanted risk, they make use of machine learning. So through machine learning you can predict risk, you can predict profits, and you can identify opportunities which will help you grow your business.

Now I'll show you a couple of examples of where machine learning is used. I'm sure all of you have binge-watched on Netflix. The most important thing about Netflix is its recommendation engine. Most of Netflix's revenue comes from its recommendation engine. The recommendation engine basically studies the movie viewing patterns of its users and then recommends relevant movies to them. It recommends movies depending on the user's interests, the type of movies the user watches and all of that. So that is how Netflix uses machine learning.

Next, we have Facebook's auto-tagging feature. The logic behind Facebook's auto-tagging feature is machine learning and neural networks.
I'm not sure how many of you know this, but Facebook makes use of the DeepFace face verification system, which is based on machine learning and neural networks. DeepFace basically studies the facial features in an image and tags your friends and family.

Another such example is Amazon's Alexa. Alexa is basically an advanced virtual assistant that is based on natural language processing and machine learning. It can do more than just play music for you: it can book your Uber, it can connect with other IoT devices at your house, it can track your health, it can order food online and all of that. So data and machine learning are basically the main factors behind Alexa's power.

Another such example is the Google spam filter. Gmail basically makes use of machine learning to filter out spam messages. If any of you open your Gmail inbox, you'll see that there are separate sections: there's one for primary, there's social, there's spam, and there's the general mail. Gmail makes use of machine learning algorithms and natural language processing to analyze emails in real time and then classify them as either spam or non-spam. This is another famous application of machine learning.

So to sum this up, let's look at a few reasons why machine learning is so important. The first reason is obviously the increase in data generation. Because of the excessive production of data, we need a method that can be used to structure, analyze and draw useful insights from data. This is where machine learning comes in: it uses data to solve problems and find solutions to the most complex tasks faced by organizations. Another important reason is that it improves decision-making. By making use of various algorithms, machine learning can be used to make better business decisions. For example, machine learning is used to forecast sales. It is used to predict any downfalls in the stock market.
It is used to identify risks, anomalies and so on. The next reason is that it uncovers patterns and trends in data. Finding hidden patterns and extracting key insights from data is the most essential part of machine learning. By building predictive models and using statistical techniques, machine learning allows you to dig beneath the surface and explore the data at a minute scale. Understanding data and extracting patterns manually will take a lot of days; if you do this through machine learning algorithms, you can perform such computations in less than a second. Another reason is that it solves complex problems. From detecting genes that are linked to the deadly ALS disease, to building self-driving cars and face detection systems, machine learning can be used to solve the most complex problems.

So guys, now that you know why machine learning is so important, let's look at what exactly machine learning is. The term machine learning was first coined by Arthur Samuel in the year 1959. Looking back, that year was probably the most significant in terms of technological advancements. If you browse the net for what machine learning is, you'll get at least a hundred different definitions. The first and very formal definition was given by Tom M. Mitchell. The definition says that a computer program is said to learn from experience E with respect to some class of tasks T and performance measure P if its performance at tasks in T, as measured by P, improves with experience E. Now I know this is a little confusing, so let's break it down into simple words. In simple terms, machine learning is a subset of artificial intelligence which provides machines the ability to learn automatically and improve from experience without being explicitly programmed to do so. In that sense, it is the practice of getting machines to solve problems by gaining the ability to think. But wait, how can a machine think or make decisions?
Well, if you feed a machine a good amount of data, it will learn how to interpret, process and analyze this data by using machine learning algorithms. Now guys, look at this figure on top. This figure basically shows how the machine learning process really works. Machine learning begins by feeding the machine lots and lots of data. By using this data, the machine is trained to detect hidden insights and trends. These insights are then used to build a machine learning model by using an algorithm, in order to solve a problem. So basically you're going to feed a lot of data to the machine, the machine is going to get trained on this data and draw useful insights and patterns from it, and then it's going to build a model by using machine learning algorithms. This model will help you predict the outcome or solve any complex problem or business problem. So that's a simple explanation of how machine learning works.

Now, let's move on and look at some of the most commonly used machine learning terms. First of all, we have algorithm. This is quite self-explanatory. An algorithm is a set of rules or statistical techniques which are used to learn patterns from data. An algorithm is the logic behind a machine learning model. An example of a machine learning algorithm is linear regression. I'm not sure how many of you have heard of linear regression; it's the most simple and basic machine learning algorithm.

Next, we have model. The model is the main component of machine learning. The model will basically map the input to your output by using the machine learning algorithm and the data that you're feeding the machine. So basically the model is a representation of the entire machine learning process.
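To make the difference between an algorithm and a model concrete, here is a minimal sketch, assuming ordinary least squares with a single input as the linear regression algorithm. The data points are made up for illustration, and `fit_line` and `predict` are hypothetical names, not part of any library.

```python
def fit_line(xs, ys):
    """The algorithm: ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical training data that happens to follow y = 2x + 1 exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

# The model: the learned numbers that map an input to an output.
slope, intercept = fit_line(xs, ys)

def predict(x):
    return slope * x + intercept

print(slope, intercept, predict(5.0))
```

The function `fit_line` is the algorithm, while the pair `(slope, intercept)` it returns is the model that maps new inputs to outputs.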
So the model is basically fed input, which has a lot of data, and then it will output a particular result or outcome by using machine learning algorithms.

Next, we have something known as a predictor variable. A predictor variable is a feature of the data that can be used to predict the output. For example, let's say that you're trying to predict the weight of a person depending on the person's height and age. Over here, the predictor variables are height and age, because you're using the height and age of a person to predict the person's weight. Weight, on the other hand, is the response or target variable. The response variable is the feature or output variable that needs to be predicted by using the predictor variables.

After that, we have something known as training data. The data that is fed to a machine learning model is always split into two parts: first we have the training data, and then we have the testing data. Training data is used to build the machine learning model. Usually the training data is much larger than the testing data, because if you're trying to train the machine, you're going to feed it a lot more data. Testing data is just used to validate and evaluate the efficiency of the model. So that was training data and testing data.
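The terms above can be shown on a toy data set. This is only a sketch: the records are invented, the 80/20 split ratio is an assumption (the session only says the training part is much larger), and `train_test_split` here is a small hand-written helper, not a library function.

```python
import random

# Hypothetical records: (height_cm, age_years, weight_kg).
# Height and age are the predictor variables; weight is the response variable.
records = [
    (150, 12, 41), (160, 15, 50), (165, 18, 58), (170, 22, 65), (172, 25, 68),
    (175, 30, 74), (178, 35, 79), (180, 40, 82), (168, 45, 72), (182, 50, 88),
]

def train_test_split(rows, test_fraction=0.2, seed=0):
    """Shuffle a copy of the rows and hold out a fraction for testing."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

train, test = train_test_split(records)

# Separate predictors from the response for the training part.
X_train = [(height, age) for height, age, _ in train]
y_train = [weight for _, _, weight in train]

print(len(train), "training rows,", len(test), "testing rows")
```

Shuffling before splitting keeps the held-out rows from all coming from one end of the data.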






So guys, these were a few terms that I thought you should know before we move any further. Now, let's move on and discuss the machine learning process. This is going to get very interesting, because I'm going to give you an example and make you understand how the machine learning process works.

First of all, let's define the different stages, or steps, involved in the machine learning process. The machine learning process always begins with defining the objective, or the problem that you're trying to solve. Next is data gathering or data collection; the data that you need to solve this problem is collected at this stage. This is followed by data preparation or data processing. After that, you have data exploration and analysis, and the next stage is building a machine learning model. This is followed by model evaluation, and finally you have prediction, or your output.

Now, let's try to understand this entire process with an example. Our problem statement here is to predict the possibility of rain by studying the weather conditions. Let's say that you're given this problem statement and you're asked to use the machine learning process to solve it. So let's get started.

The first step is to define the objective of the problem statement. In the first stage of a machine learning process, you must understand what exactly needs to be predicted. In our case, the objective is to predict the possibility of rain by studying weather conditions. At this stage, it is also essential to take mental notes on what kind of data can be used to solve this problem, or the type of approach that you can follow to get to the solution. A few questions that are worth asking during this stage are: What are we trying to predict? What are the target features, and what are the predictor variables? What kind of input data do we need?
And what kind of problem are we facing? Is it a binary classification problem, or is it a clustering problem? Now, don't worry if you don't know what classification and clustering are; I'll be explaining this in the upcoming slides. So guys, this was the first step of a machine learning process, which is to define the objective of the problem.

Now, let's move on and look at step number two, which is data collection or data gathering. At this stage, you must be asking questions such as: What kind of data is needed to solve the problem? Is the data available? And if it is available, how can I get it? Once you know the type of data that is required, you must understand how you can derive this data. Data collection can be done manually or by web scraping, but if you're a beginner and you're just looking to learn machine learning, you don't have to worry about getting the data. There are thousands of data resources on the web; you can just go ahead and download datasets from websites such as Kaggle. Coming back to the problem at hand, the data needed for weather forecasting includes measures such as humidity level, temperature, pressure, locality, whether or not you live in a hill station, and so on. Such data must be collected and stored for analysis.

The next stage in machine learning is preparing your data. The data you collect is almost never in the right format, so you'll encounter a lot of inconsistencies in the data set. This includes missing values, redundant variables, duplicate values and so on. Removing such values is very important, because they might lead to wrongful computations and predictions. That's why at this stage you must scan the entire data set for any inconsistencies and fix them.

The next step is exploratory data analysis. Data analysis is all about diving deep into data and finding all the hidden data mysteries. This is where you become a detective.
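The data preparation stage described above (missing values, duplicate values) can be sketched in a few lines. The weather records and field names below are invented for illustration, and dropping incomplete rows is just one possible choice; filling in missing values is another.

```python
# Hypothetical raw weather records, including a duplicate and a missing value.
raw_records = [
    {"humidity": 85, "temperature": 18.0, "pressure": 1002, "rain": True},
    {"humidity": 85, "temperature": 18.0, "pressure": 1002, "rain": True},   # duplicate row
    {"humidity": 40, "temperature": None, "pressure": 1018, "rain": False},  # missing value
    {"humidity": 55, "temperature": 27.5, "pressure": 1015, "rain": False},
]

def clean(rows):
    """Drop rows with missing values and exact duplicate rows."""
    seen = set()
    cleaned = []
    for row in rows:
        if any(value is None for value in row.values()):
            continue  # skip incomplete rows
        key = tuple(sorted(row.items()))  # hashable fingerprint of the row
        if key in seen:
            continue  # skip rows we have already kept
        seen.add(key)
        cleaned.append(row)
    return cleaned

clean_records = clean(raw_records)
print(len(raw_records), "raw rows ->", len(clean_records), "clean rows")
```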
So EDA, or exploratory data analysis, is like the brainstorming stage of machine learning. Data exploration involves understanding the patterns and the trends in your data. At this stage, all the useful insights are drawn and all the correlations between the variables are understood. You might ask, what sort of correlations are you talking about? For example, in the case of predicting rainfall, we know that there is a strong possibility of rain if the temperature has fallen low. Such correlations have to be understood and mapped at this stage.

This stage is followed by stage number five, which is building a machine learning model. All the insights and patterns that you derive during data exploration are used to build the machine learning model. This stage always begins by splitting the data set into two parts: the training data and the testing data. Earlier in the session, I already told you what training and testing data are. The training data will be used to build and analyze the model, and the logic of the model will be based on the machine learning algorithm that is being implemented. In the case of predicting rainfall, since the output will be in the form of true or false, we can use a classification algorithm like logistic regression. Choosing the right algorithm depends on the type of problem you're trying to solve, the data set you have, and the level of complexity of the problem. In the upcoming sections, we'll be discussing the different types of problems that can be solved by using machine learning.
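As a rough illustration of the model building stage described above, here is a from-scratch logistic regression trained with plain gradient descent. Everything specific in it is an assumption made for the sketch: the eight weather rows are invented, only humidity and temperature are used as predictors, and the learning rate and epoch count are arbitrary choices. In practice you would more likely reach for a library implementation.

```python
import math

# Hypothetical training rows: (humidity %, temperature in C) -> 1 if it rained, else 0.
training_rows = [
    ((90, 15), 1), ((85, 18), 1), ((80, 16), 1), ((95, 14), 1),
    ((40, 30), 0), ((35, 32), 0), ((50, 28), 0), ((45, 31), 0),
]

def scale(features):
    """Bring both features into a roughly 0-1 range so gradient descent behaves."""
    humidity, temperature = features
    return (humidity / 100.0, temperature / 40.0)

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train(rows, learning_rate=0.5, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0, 0.0]
    b = 0.0
    n = len(rows)
    for _ in range(epochs):
        grad_w = [0.0, 0.0]
        grad_b = 0.0
        for features, label in rows:
            x = scale(features)
            error = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - label
            grad_w[0] += error * x[0]
            grad_w[1] += error * x[1]
            grad_b += error
        w[0] -= learning_rate * grad_w[0] / n
        w[1] -= learning_rate * grad_w[1] / n
        b -= learning_rate * grad_b / n
    return w, b

def predict_rain(model, features):
    """Return True when the predicted probability of rain is at least 0.5."""
    w, b = model
    x = scale(features)
    return sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5

model = train(training_rows)
print(predict_rain(model, (88, 17)))  # humid and cool
print(predict_rain(model, (38, 33)))  # dry and hot
```

The made-up data encodes the correlation mentioned above: the rainy rows are the cool, humid ones, so the learned weights push the prediction toward rain when the temperature is low.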






So don't worry if you don't know what a classification algorithm is and what logistic regression is. All you need to know is that at this stage, you'll be building a machine learning model by using a machine learning algorithm and the training data set.

The next step in the machine learning process is model evaluation and optimization. After building a model by using the training data set, it is finally time to put the model to a test. The testing data set is used to check the efficiency of the model and how accurately it can predict the outcome. Once you calculate the accuracy, any improvements in the model have to be implemented at this stage. Methods like parameter tuning and cross-validation can be used to improve the performance of the model.

This is followed by the last stage, which is predictions. Once the model is evaluated and improved, it is finally used to make predictions. The final output can be a categorical variable or it can be a continuous quantity. In our case, for predicting the occurrence of rainfall, the output will be a categorical variable, in the sense that our output will be in the form of true or false, yes or no. Yes basically represents that it is going to rain, and no represents that it won't rain. As simple as that. So guys, that was the entire machine learning process.
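The evaluation and prediction stages described above can be sketched as follows. The true outcomes and the predicted probabilities are made-up numbers standing in for a real testing data set and a real model's output, and the 0.5 cutoff is an assumption.

```python
def accuracy(actual, predicted):
    """Fraction of test cases where the prediction matches the true outcome."""
    correct = sum(a == p for a, p in zip(actual, predicted))
    return correct / len(actual)

# Hypothetical testing data: did it actually rain on each day?
actual = [True, False, True, False, False]

# Hypothetical model output: predicted probability of rain for the same days.
predicted_probability = [0.9, 0.2, 0.4, 0.1, 0.7]

# Turn each probability into a categorical prediction with a 0.5 cutoff.
predicted = [p >= 0.5 for p in predicted_probability]
labels = ["Yes" if p else "No" for p in predicted]

print(labels)
print("accuracy:", accuracy(actual, predicted))
```

If the accuracy on the testing data is not good enough, this is the point at which you would go back and tune the model before using it for real predictions.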

