Data Science Full Course - Learn Data Science Beginners day
So what you're going to do is build a classifier that predicts by using
these 165 observations. You'll feed all of these 165 observations to your
classifier, and it will predict the output every time a new patient's
details are fed to the classifier. Right, now out of these 165 cases,
let's say that the classifier predicted yes 110 times and no 55 times.
Alright, so yes basically stands for yes, the person has the disease, and
no stands for no, the person does not have the disease. All right, that's
pretty self-explanatory. So it predicted 110 times that the patient has
the disease and 55 times that the patient doesn't have the disease.
However, in reality only 105 patients in the sample have the disease and
60 patients do not have the disease, right? So how do you calculate the
accuracy of your model? You basically build the confusion matrix. All
right. This is how the matrix looks. Here n basically denotes the total
number of observations that you have, which is 165 in our case. Actual
denotes the actual values in the data set, and predicted denotes the
values predicted by the classifier.

So the actual value is no here and the predicted value is no here. So
your classifier was correctly able to classify 50 cases as no. All right,
since both of these are no, 50 it was correctly able to classify, but 10
of these cases it incorrectly classified, meaning that your actual value
here is no but your classifier predicted it as yes. All right, that's why
this 10 is over here. Similarly, it wrongly predicted that five patients
do not have the disease whereas they actually did have it, and it
correctly predicted 100 patients who have the disease. All right. I know
this is a little bit confusing. But if you look at these values: no, no,
50, meaning that it correctly predicted 50 values. No, yes means that it
wrongly predicted yes for the values where it was supposed to predict no.
All right.

Now what exactly is this true positive, true negative and all of that?
I'll tell you what exactly it is. So true positives are the cases in
which we predicted a yes and they do actually have the disease. All
right, so it is basically this value: we predicted a yes here, and they
did have the disease. So we have 100 true positives, right? Similarly,
true negative is we predicted no and they don't have the disease, meaning
that this is correct. False positive is we predicted yes, but they do not
actually have the disease. All right.
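As a minimal sketch of the accuracy calculation this example is building towards, the four cells of the confusion matrix can be written down and combined. Python is used here only for illustration (the video's own demo is in R), the variable names are assumptions, and the false-negative cell is the 5 patients wrongly predicted as not having the disease:

```python
# Confusion-matrix counts from the 165-patient example.
tp = 100  # predicted yes, actually has the disease (true positives)
tn = 50   # predicted no, actually does not have the disease (true negatives)
fp = 10   # predicted yes, actually does not have the disease (false positives)
fn = 5    # predicted no, actually has the disease (false negatives)

total = tp + tn + fp + fn

# Row and column totals should match the figures quoted in the example.
predicted_yes = tp + fp
predicted_no = tn + fn
actual_yes = tp + fn
actual_no = tn + fp

# Accuracy is the share of all observations that were classified correctly.
accuracy = (tp + tn) / total

print(f"total={total}, predicted yes={predicted_yes}, predicted no={predicted_no}")
print(f"actual yes={actual_yes}, actual no={actual_no}")
print(f"accuracy={accuracy:.3f}")
```

The totals reproduce the 110 predicted-yes, 55 predicted-no, 105 actual-yes and 60 actual-no figures, which is a quick way to check that the cells were read off correctly.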
A false positive is also known as a Type I error. False negative is we
predicted no, but they actually do have the disease (a Type II error). So
guys, basically true positives and true negatives are the correct
classifications. All right. So this was the confusion matrix, and I hope
this concept is clear. Again guys, if you have doubts, please leave them
in the comment section.

So guys, that was descriptive statistics. Now, before we go to
probability, I promised that we'll run a small demo in R. All right,
we'll try and understand how mean, median and mode work in R. Okay, so
let's do that first. So guys, again, what we just discussed so far was
descriptive statistics. All right, next we're going to discuss
probability, and then we'll move on to inferential statistics. Okay,
inferential statistics is basically the second type of statistics. Okay,
now to make things more clear for you, let me just zoom in. So guys, it's
always best to perform practical implementations in order to understand
the concepts in a better way. Okay, so here we'll be executing a small
demo that will show you how to calculate the mean, median, mode, variance
and standard deviation, and how to study the variables by plotting a
histogram.
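The video runs this demo in R inside RStudio. As a rough equivalent, here is a minimal Python sketch of the same quantities using the standard `statistics` module. The data values are hypothetical (the video generates its numbers randomly, so its results of about 5.99 and 6.4 will not match), and the text histogram stands in for the plotted one:

```python
import statistics
from collections import Counter

# Hypothetical data; the video generates its numbers randomly in R.
data = [4, 8, 6, 5, 3, 8, 9, 7, 8, 2]

mean_value = statistics.mean(data)
median_value = statistics.median(data)
mode_value = statistics.mode(data)          # most frequently occurring value
variance_value = statistics.variance(data)  # sample variance (divides by n - 1)
std_dev_value = statistics.stdev(data)      # square root of the sample variance

print("mean:", mean_value)
print("median:", median_value)
print("mode:", mode_value)
print("variance:", round(variance_value, 3))
print("standard deviation:", round(std_dev_value, 3))

# A histogram is a frequency plot: one row per value, one star per occurrence.
frequencies = Counter(data)
for value in sorted(frequencies):
    print(f"{value:>2} | {'*' * frequencies[value]}")
```

`statistics.variance` and `statistics.stdev` divide by n - 1, which is also what R's `var` and `sd` do.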
Okay. Don't worry if you don't know what a histogram is. It's basically a
frequency plot. There's no big science behind it. Alright, this is a very
simple demo, but it also forms the foundation that every machine learning
algorithm is built upon. Okay, you can say that most of the machine
learning algorithms, actually all the machine learning algorithms and
deep learning algorithms, have this basic concept behind them. Okay, you
need to know how mean, median, mode and all of that is calculated.

So guys, I'm using the R language to perform this, and I'm running this
on RStudio. For those of you who don't know the R language, I will leave
a couple of links in the description box. You can go through those
videos. So what we're doing is we are randomly generating numbers and
storing them in a variable called data, right? So if you want to see the
generated numbers, just run the line data, right? This variable basically
stores all our numbers. All right. Now, what we're going to do is
calculate the mean. All you have to do in R is specify the word mean
along with the data that you're calculating the mean of, and I've
assigned this whole thing to a variable called mean, which just holds the
mean value of this data. So now let's look at the mean. For that I'll use
a function called print and pass it mean. All right. So our mean is
around 5.99. Okay. Next is calculating the median. It's very simple,
guys. All you have to do is use the function median, all right, and pass
the data as a parameter to this function. That's all you have to do. So R
provides functions for each and everything. All right, statistics is very
easy when it comes to R because R is basically a statistical language.
Okay. So all you have to do is just name the function, and that function
is readily inbuilt in R. Okay, so your median is around 6.4. Similarly,
we will calculate the mode. All right. Let's run this function. I
basically created a small function for calculating the mode. So guys,
this is our mode, meaning that this is the most recurrent value. Right,
now we're going to calculate the variance and the standard deviation. For
that, again, we have a function in R called var, all right.
All you have to do is pass the data to that function. Okay, similarly
we'll calculate the standard deviation, which is basically the square
root of your variance. Right, now we'll print the standard deviation.
Right, this is our standard deviation value. Now, finally, we will just
plot a small histogram. A histogram is nothing but a frequency plot, all
right, it'll show you how frequently a data point is occurring. So this
is the histogram that we've just created. It's quite simple in R because
R has a lot of packages and a lot of inbuilt functions that support
statistics. All right. It is a statistical language that is mainly used
by data scientists, data analysts and machine learning engineers, because
they don't have to sit and code these functions. All they have to do is
mention the name of the function and pass the corresponding parameters.
So guys, that was the entire descriptive statistics module, and now we
will discuss probability.

Okay. So before we understand what exactly probability is, let me clear
up a very common misconception. People often tend to ask me this
question: what is the relationship between statistics and probability? So
probability and statistics are related fields. All right. Probability is
a mathematical method used for statistical analysis. Therefore we can say
that probability and statistics are interconnected branches of
mathematics that deal with analyzing the relative frequency of events.
Probability makes use of statistics and statistics makes use of
probability, all right, they're very interconnected fields. So that is
the relationship between statistics and probability.

Now, let's understand what exactly probability is. Probability is the
measure of how likely an event is to occur. To be more precise, it is the
ratio of desired outcomes to the total outcomes. Now, the probabilities
of all outcomes always sum up to 1. A probability cannot go beyond one.
Okay. So your probability can be 0, or it can be 1, or it can be in the
form of decimals like 0.52 or 0.55, or it can be 0.5, 0.7, 0.9. But its
value will always stay in the range between 0 and 1. Okay, a famous
example of probability is the rolling a dice example. So when you roll a
dice you get six possible outcomes, right? You get the one, two, three,
four, five and six faces of a dice. Now each possibility only has one
outcome. So what is the probability that on rolling a dice you will get
3? The probability is 1 by 6, right, because there's only one face which
has the number 3 on it out of six faces. So the probability of getting 3
when you roll a dice is 1 by 6. Similarly, if you want to find the
probability of getting the number 5, again the probability is going to be
1 by 6.
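The dice example can be written out as a short sketch. This is illustrative Python rather than anything shown in the video, and it assumes a fair six-sided die so that every face is equally likely:

```python
from fractions import Fraction

# Sample space of a fair six-sided die (fairness is an assumption here).
faces = [1, 2, 3, 4, 5, 6]

# Probability = desired outcomes / total outcomes; each face appears once.
probability = {face: Fraction(1, len(faces)) for face in faces}

p_three = probability[3]
p_five = probability[5]
total_probability = sum(probability.values())

print("P(3) =", p_three)
print("P(5) =", p_five)
print("sum over all faces =", total_probability)
```

Using `Fraction` keeps the values exact, so the sum over all six faces comes out as exactly 1 rather than a rounded float.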
All right. So all of these will sum up to 1. All right, so guys, this is
exactly what probability is. It's a very simple concept. We all learnt it
from 8th standard onwards, right? Now, let's understand the different
terminologies that are related to probability. There are three
terminologies that you often come across when we talk about probability.
We have something known as the random experiment. Okay. It's basically an
experiment or a process for which the outcomes cannot be predicted with
certainty. All right. That's why you use probability. You're going to use
probability in order to predict the outcome with some sort of certainty.
Sample space is the entire possible set of outcomes of a random
experiment, and an event is one or more outcomes of an experiment.

So if you consider the example of rolling a dice, let's say that you want
to find out the probability of getting a 2 when you roll the dice. Okay.
So rolling the dice is the random experiment. The sample space is
basically your entire set of possibilities. Okay. So one, two, three,
four, five, six are there, and out of that you need to find the
probability of getting a 2, right? So all the possible outcomes will
basically represent your sample space. 1 to 6 are all your possible
outcomes, and this represents your sample space. Now an event is one or
more outcomes of an experiment. So in this case my event is to get a 2
when I roll a dice, right? So guys, this is basically what random
experiment, sample space and event really mean.

Alright, now let's discuss the different types of events. There are two
types of events that you should know about: disjoint and non-disjoint
events. Disjoint events are events that do not have any common outcome.
For example, if you draw a single card from a deck of cards, it cannot be
a king and a queen, correct? It can either be a king or it can be a
queen. Now non-disjoint events are events that have common outcomes. For
example, a student can get hundred marks in statistics and hundred marks
in probability.
All right, and also the outcome of a ball delivered can be a no ball and
it can be a 6, right? So this is what non-disjoint events are. These are
very simple to understand, right?

Now, let's move on and look at the different types of probability
distribution. All right, I'll be discussing three main topics here. I'll
be talking about the probability density function, the normal
distribution and the central limit theorem. Okay, the probability density
function, also known as the PDF, is concerned with the relative
likelihood for a continuous random variable to take on a given value.
Alright, so the PDF gives the probability of a variable that lies in the
range between a and b. So basically what you're trying to do is find the
probability of a continuous random variable over a specified range. Okay.
Now this graph denotes the PDF of a continuous variable. This particular
graph is also known as the bell curve, right? It's famously called the
bell curve because of its shape. There are three important properties
that you need to know about a probability density function. First, the
graph of a PDF will be continuous over a range. This is because you're
finding the probability that a continuous variable lies between a and b,
right? The second property is that the area bounded by the curve of a
density function and the x-axis is equal to 1. Basically the area below
the curve is equal to 1, all right, because it denotes probability.
Again, the probability cannot be more than one; it has to be between 0
and 1. Property number three is that the probability that a random
variable assumes a value between a and b is equal to the area under the
PDF bounded by a and b.
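Properties two and three can be checked numerically. The sketch below is illustrative Python, not from the video. It assumes a standard normal distribution (mean 0, standard deviation 1) as the bell curve, since the video does not say which distribution its graph shows, and the function names are made up for this example:

```python
from statistics import NormalDist

# Assumed example distribution: a standard normal bell curve.
dist = NormalDist(mu=0.0, sigma=1.0)


def probability_between(a, b):
    """P(a <= X <= b), computed from the cumulative distribution function."""
    return dist.cdf(b) - dist.cdf(a)


def area_under_pdf(a, b, steps=10_000):
    """Approximate the area under the PDF between a and b (midpoint rule)."""
    width = (b - a) / steps
    return sum(dist.pdf(a + (i + 0.5) * width) for i in range(steps)) * width


# Property 3: the probability of landing between a and b equals the area
# under the PDF bounded by a and b.
p_within_one_sd = probability_between(-1.0, 1.0)
area_within_one_sd = area_under_pdf(-1.0, 1.0)

# Property 2: the total area under the curve is 1 (a very wide range is
# used here as a stand-in for the whole x-axis).
approx_total_area = probability_between(-10.0, 10.0)

print(round(p_within_one_sd, 4), round(area_within_one_sd, 4))
print(round(approx_total_area, 6))
```

The two ways of getting the probability between -1 and 1 agree, which is exactly what the third property says.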
Okay. Now what this means is that the probability value is denoted by the
area of the graph. All right, so whatever value you get here, which is
basically the area, is the probability that a random variable will lie in
the range between a and b. All right. So I hope all of you have
understood the probability density function. It's basically the
probability of finding the value of a continuous random variable in the
range between a and b. All right. Now, let's look at our next
distribution, which is the normal distribution.

The normal distribution, which is also known as the Gaussian
distribution, is a probability distribution that is symmetric about the
mean, right? The idea behind this function is that the data near the mean
occurs more frequently than the data away from the mean. So what it means
to say is that the data around the mean represents the entire data set.
Okay. So if you just take a sample of data around the mean, it can
represent the entire data set. Now, similar to the probability density
function, the normal distribution appears as a bell curve, right? Now
when it comes to the normal distribution, there are two important
factors. All right, we have the mean of the population and the standard
deviation. Okay, so the mean determines the location of the center of the
graph, right, and the standard deviation determines the height and width
of the graph. Okay. So if the standard deviation is large, the curve is
going to look something like this. All right, it'll be short and wide.
And if the standard deviation is small, the curve is tall and narrow. All
right. So this was it about the normal distribution.

Now, let's look at the central limit theorem. The central limit theorem
states that the sampling distribution of the mean of any independent
random variable will be normal or nearly normal if the sample size is
large enough. Now, that's a little confusing. Okay. Let me break it down
for you. In simple terms, if we had a large population and we divided it
into many samples, then the mean of all the samples from the population
will be almost equal to the mean of the entire population, right? Meaning
that the sample means are approximately normally distributed around the
population mean. Right? So if you compare the mean of each of the
samples, it will almost be equal to the mean of the population. Right? So
this graph basically gives a clearer understanding of the central limit
theorem. All right, you can see each sample here, and the mean of each
sample is almost along the same line, right? Okay. So this is exactly
what the central limit theorem states. Now the accuracy or the
resemblance to the normal distribution depends on two main factors,
right? So the first is the number of sample points that you consider. All
right, and the second is the shape of the underlying population. Now the
shape obviously depends on the standard deviation and the mean of a
sample, correct? So guys, the central limit theorem basically states that
the sample means will be normally distributed in such a way that the mean
of each sample will roughly coincide with the mean of the actual
population. All right, in short terms, that's what the central limit
theorem states.
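A small simulation can make the theorem more concrete. This is an illustrative Python sketch, not part of the video. The population (an exponential distribution with mean 1), the sample size and the number of samples are all assumptions chosen for the example; the population is deliberately skewed and not bell-shaped:

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable

# Assumed population: exponential with rate 1, so mean 1 and std dev 1.
POPULATION_MEAN = 1.0
POPULATION_STD_DEV = 1.0
SAMPLE_SIZE = 50
NUM_SAMPLES = 2000


def draw_sample_mean():
    """Draw one sample from the population and return its mean."""
    sample = [random.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    return statistics.fmean(sample)


sample_means = [draw_sample_mean() for _ in range(NUM_SAMPLES)]

# The sample means cluster around the population mean ...
mean_of_means = statistics.fmean(sample_means)

# ... and their spread shrinks like std dev / sqrt(sample size).
spread_of_means = statistics.stdev(sample_means)
expected_spread = POPULATION_STD_DEV / SAMPLE_SIZE ** 0.5

print("mean of sample means:", round(mean_of_means, 3))
print("spread of sample means:", round(spread_of_means, 3))
print("spread predicted by the theorem:", round(expected_spread, 3))
```

Even though the individual values are far from normally distributed, the means of the samples line up closely around the population mean, which is the behaviour the graph in the video illustrates.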
All right, and this holds true only for a large data set. For a small
data set there are more deviations when compared to a large data set.
This is because of the scaling factor, right? The smallest deviation in a
small data set will change the value very drastically, but in a large
data set a small deviation will not matter at all.

Now, let's move on and look at our next topic, which is the different
types of probability. This is an important topic, because most of your
problems can be solved by understanding which type of probability you
should use to solve the problem, right? So we have three important types
of probability: marginal, joint and conditional probability. So let's
discuss each of these. Now the probability of an event occurring
unconditioned on any other event is known as marginal or unconditional
probability. So let's say that you want to find the probability that a
card drawn is a heart. All right. The probability will be 13 by 52, since
there are 52 cards in a deck and 13 of them are hearts. So your marginal
probability will be 13 by 52. That's about marginal probability.

Now, let's understand what joint probability is. Joint probability is a
measure of two events happening at the same time. Okay, let's say that
the two events are A and B. So the probability of events A and B both
occurring is the probability of the intersection of A and B. So for
example, if you want to find the probability that a card is a four and
red, that would be a joint probability. All right, because you're finding
a card that is a 4 and the card has to be red in color.
So the answer to this would be 2 by 52, because we have one 4 in hearts
and one 4 in diamonds, correct? Both of these are red in color, therefore
our probability is 2 by 52, and if you reduce it further it is 1 by 26,
right? So this is what joint probability is all about.

Moving on, let's look at what exactly conditional probability is. So if
the probability of an event or an outcome is based on the occurrence of a
previous event or outcome, then you call it a conditional probability.
Okay. So the conditional probability of an event B is the probability
that the event will occur given that an event A has already occurred.
Right? So if A and B are dependent events, then the expression for
conditional probability is P(B|A) = P(A and B) / P(A). Now the term on
the left-hand side, which is P(B|A), is basically the probability of
event B occurring given that event A has already occurred. So like I
said, if A and B are dependent events then this is the expression, but if
A and B are independent events, then the expression for conditional
probability is simply P(B|A) = P(B), right? So guys, P(A) and P(B) are
obviously the probability of A and the probability of B. Right, now let's
move on.

Now in order to understand conditional probability, joint probability and
marginal probability, let's look at a small use case. Okay, now basically
we're going to take a data set which examines the salary package and
training undergone by candidates. Okay. Now in this there are 60
candidates without training and 45 candidates who have enrolled for
Edureka's training, right? Now the task here is you have to assess the
training with the salary package. Okay. Let's look at this in a little
more depth. So in total we have 105 candidates, out of which 60 have not
enrolled for Edureka's training and 45 have enrolled for Edureka's
training. All right. This is the small survey that was conducted, and
this is the rating of the package or the salary that they got, right? So
if you read through the data, you can understand there were five
candidates without Edureka training who got a very poor salary package.
Okay. Similarly, there are 30 candidates with Edureka training who got a
good package, right? So guys, basically you're comparing the salary
package of a person depending on whether or not they've enrolled for
Edureka training, right? This is our data set.

Now, let's look at our problem statement: find the probability that a
candidate has undergone Edureka's training. Quite simple. Which type of
probability is this? This is marginal probability. Right? So the
probability that a candidate has undergone Edureka's training is
obviously 45 divided by 105, since 45 is the number of candidates with
Edureka training and 105 is the total number of candidates. So you get a
value of approximately 0.43. All right, that's the probability of a
candidate having undergone Edureka's training.

Next question: find the probability that a candidate has attended
Edureka's training and also has a good package. Now, this is obviously a
joint probability problem, right? So how do you calculate this? Now,
since our table is quite well formatted, we can directly find that the
people who have gotten a good package along with Edureka training are 30,
right? So out of 105 people, 30 people have Edureka training and a good
package, right? They're specifically asking for people with Edureka
training, remember that. The question is: find the probability that a
candidate has attended Edureka's training and also has a good package.
Alright, so we need to consider two factors, that is a candidate who has
attended Edureka's training and who has a good package. So clearly that
number is 30. 30 divided by the total number of candidates, which is 105,
right? So here you get the answer clearly.

Next we have: find the probability that a candidate has a good package
given that he has not undergone training. Okay. Now this is clearly
conditional probability, because here you're defining a condition. You're
saying that you want to find the probability of a candidate who has a
good package given that he's not undergone any training, right? The
condition is that he's not undergone any training.
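The three questions in this use case differ mainly in what goes into the denominator. As a minimal illustrative sketch in Python (not from the video), using only the counts quoted in the use case, with variable names that are assumptions made for this example:

```python
from fractions import Fraction

# Counts quoted in the use case.
total_candidates = 105
with_training = 45
without_training = 60
good_package_with_training = 30
good_package_without_training = 5

# Marginal: candidate has undergone training, out of everyone.
p_training = Fraction(with_training, total_candidates)

# Joint: candidate has training AND a good package, out of everyone.
p_training_and_good = Fraction(good_package_with_training, total_candidates)

# Conditional: good package GIVEN no training, so only the 60 untrained
# candidates are counted in the denominator.
p_good_given_no_training = Fraction(
    good_package_without_training, without_training
)

print("marginal:", p_training, "=", round(float(p_training), 2))
print("joint:", p_training_and_good, "=", round(float(p_training_and_good), 2))
print(
    "conditional:",
    p_good_given_no_training,
    "=",
    round(float(p_good_given_no_training), 2),
)
```

The marginal and joint probabilities divide by all 105 candidates, while the conditional probability divides by the 60 candidates who satisfy the condition.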
All right. So the number of people who have not undergone training is 60,
and out of those, five have got a good package, right? So that's why this
is 5 by 60 and not 5 by 105, because here they have clearly mentioned
"has a good package given that he has not undergone training". You have
to only consider people who have not undergone training, right? So only
five people who have not undergone training have gotten a good package,
right? So 5 divided by 60, you get a probability of around 0.08, which is
pretty low, right? Okay. So this was all about the different types of
probability.

Now, let's move on and look at our last topic in probability, which is
Bayes' theorem. Now guys, Bayes' theorem is a very important concept when
it comes to statistics and probability.
It is majorly used in the Naive Bayes algorithm. For those of you who
aren't aware, Naive Bayes is a supervised learning classification
algorithm, and a classic use of it is email spam filtering, right? A lot
of you might have noticed that if you open up Gmail, you'll see that you
have a folder called spam, right? That kind of filtering is carried out
through machine learning, and Naive Bayes is a classic algorithm for that
task, right?

So now let's discuss what exactly Bayes' theorem is and what it denotes.
Bayes' theorem is used to show the relation between one conditional
probability and its inverse. All right, it's basically nothing but the
probability of an event occurring based on prior knowledge of conditions
that might be related to the same event. Okay. So mathematically, Bayes'
theorem is represented like this, right, as shown in this equation:
P(A|B) = P(B|A) P(A) / P(B). The term on the left-hand side is what is
known as the posterior, which means the probability of occurrence of A
given an event B, right? The second term, P(B|A), is referred to as the
likelihood; this measures the probability of occurrence of B given an
event A. Now P(A) is also known as the prior, which refers to the actual
probability distribution of A, and P(B) is again the probability of B,
right? This is Bayes' theorem.

In order to better understand Bayes' theorem, let's look at a small
example. Let's say that we have three bowls: bowl A, bowl B and bowl C.
Okay, bowl A contains two blue balls and four red balls, bowl B contains
eight blue balls and four red balls, and bowl C contains one blue ball
and three red balls. Now if we draw one ball from each bowl, what is the
probability of drawing a blue ball from bowl A if we know that we drew
exactly a total of two blue balls? Right, if you didn't understand the
question, please read it. I shall pause for a second or two. Right. So I
hope all of you have understood the question. Okay. Now what I'm going to
do is draw a blueprint for you and tell you how exactly to solve the
problem. But I want you all to give me the solution to this problem,
right? I'll draw a blueprint. I'll tell you what exactly the steps are,
but I want you to come up with a solution on your own, right? The formula
is also given to you. Everything is given to you. All you have to do is
come up with the final answer.
Right? Let's look at how you can solve this problem. So first of all, let
A be the event of picking a blue ball from bowl A, and let X be the event
of picking exactly two blue balls, right, because these are the two
events that we need to calculate the probability of. One is the event of
picking a blue ball from bowl A and the other is the event of picking
exactly two blue balls. Okay. So these two are represented by A and X
respectively. So what we want is the probability of occurrence of event A
given X, which means that given that we're picking exactly two blue
balls, what is the probability that we are picking a blue ball from bowl
A? So by the definition of conditional probability, this is exactly what
our equation will look like: P(A|X) = P(A and X) / P(X). Correct? This is
basically the probability of occurrence of event A given event X, this is
the probability of A and X, and this is the probability of X alone,
correct? And what we need to do is find these two probabilities, which
are the probability of A and X occurring together and the probability of
X. Okay.
This is the entire solution. So how do you find the probability of X? X
basically represents the event of picking exactly two blue balls, and
this can happen in three ways. In the first case, you pick a blue ball
from bowl A and a blue ball from bowl B, with a red ball from bowl C. In
the second case, you pick a blue ball from A and another blue ball from
C, with a red ball from B. In the third case, you pick a blue ball from
bowl B and a blue ball from bowl C, with a red ball from A. Right? These
are the three ways in which it is possible. So step one is that you need
to find the probability of each of these and add them up. Step two is
that you need to find the probability of A and X occurring together. This
is the sum of terms 1 and 2. Okay, this is because in both of these cases
we are picking a blue ball from bowl A, correct? So go ahead, find out
this probability and let me know your answer in the comment section. All
right.
We'll see if you get the answer right. I gave you the entire solution to
this. All you have to do is substitute the values, right? If you want a
second or two, I'm going to pause on the screen so that you can go
through this in a clearer way. Right? Remember that you need to calculate
two probabilities. The first probability that you need to calculate is
that of picking a blue ball from bowl A together with picking exactly two
blue balls. Okay, the second probability you need to calculate is that of
picking exactly two blue balls. All right. These are the two
probabilities you need to calculate, so remember that, and this is the
solution. All right, so guys, make sure you mention your answers in the
comment section.

For now, let's move on and look at our next topic, which is inferential
statistics. So guys, we just completed the probability module. Right, now
we will discuss inferential statistics, which is the second type of
statistics. We discussed descriptive statistics earlier. Alright, so like
I mentioned earlier, inferential statistics, also known as statistical
inference, is a branch of statistics that deals with forming inferences
and predictions about a population based on a sample of data taken from
the population. All right, and the question you should ask is: how does
one form inferences or predictions from a sample? The answer is you use
point estimation. Okay.
Now you must be wondering what point estimation is. Point estimation is
concerned with the use of the sample data to measure a single value which
serves as an approximate value or the best estimate of an unknown
population parameter. That's a little confusing. Let me break it down for
you. For example, in order to calculate the mean of a huge population,
what we do is first draw out a sample of the population and then find the
sample mean, right? The sample mean is then used to estimate the
population mean. This is basically a point estimate: you're estimating
the value of one of the parameters of the population, right? Basically
the mean; you're trying to estimate the value of the mean. This is what
point estimation is. There are two main terms in point estimation.
There's something known as the estimator and something known as the
estimate. The estimator is a function of the sample that is used to find
out the estimate. Alright, in this example it's basically the sample
mean, right? So a function that calculates the sample mean is known as
the estimator, and the realized value of the estimator is the estimate,
right?
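The estimator and estimate distinction maps neatly onto code: the estimator is a function, and the estimate is the number it returns for one particular sample. The following is an illustrative Python sketch, not from the video. The population (100,000 values drawn around 170 with standard deviation 10) and the sample size of 500 are hypothetical:

```python
import random
import statistics

random.seed(7)  # fixed seed so the sketch is repeatable

# Hypothetical "huge population" whose mean we pretend not to know.
population = [random.gauss(170.0, 10.0) for _ in range(100_000)]
population_mean = statistics.fmean(population)


def sample_mean(sample):
    """Estimator: a function of the sample that produces the estimate."""
    return sum(sample) / len(sample)


# Draw a sample from the population and apply the estimator to it.
sample = random.sample(population, 500)
estimate = sample_mean(sample)  # the realized value is the estimate

print("population mean (normally unknown):", round(population_mean, 2))
print("point estimate from the sample:", round(estimate, 2))
```

Running the estimator on a different sample would give a slightly different estimate, which is the uncertainty that interval estimation tries to capture.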
So I hope point estimation is clear. Now, how do you find the estimates?
There are four common ways in which you can do this. The first one is the
method of moments. Here what you do is form an equation in the sample
data set and then analyze the similar equation in the population data set
as well, like the population mean, population variance and so on. So in
simple terms, what you're doing is taking some known facts about the
population and extending those ideas to the sample. Alright, once you do
that, you can analyze the sample and estimate more essential or more
complex values, right? Next, we have maximum likelihood. This method
basically uses a model to estimate a value. All right. Now maximum
likelihood is majorly based on probability, so there's a lot of
probability involved in this method. Next, we have the Bayes estimator.
This works by minimizing the errors or the average risk. Okay, the Bayes
estimator has a lot to do with Bayes' theorem. All right, let's not get
into the depth of these estimation methods. Finally, we have the best
unbiased estimators. In this method, there are several unbiased
estimators that can be used to approximate a parameter. Okay.

So guys, these were a couple of methods that are used to find the
estimate, but the most well-known method to find the estimate is known as
interval estimation. Okay. This is one of the most important estimation
methods, all right, and this is where the confidence interval also comes
into the picture. Right, apart from interval estimation, we also have
something known as margin of error. So I'll be discussing all of this in
the upcoming slides. So first let's understand what an interval estimate
is. Okay, an interval or range of values which is used to estimate a
population parameter is known as an interval estimate, right? That's very
understandable.
Basically what they're trying to say is you're going to estimate the
value of a parameter. Let's say you're trying to find the mean of a
population. What you're going to do is build a range, and your value will
lie in that range or in that interval. All right. So this way your output
is going to be more accurate, because you've not predicted a single
point; instead, you have estimated an interval within which your value
might occur, right? Okay. Now this image clearly shows how the point
estimate and the interval estimate are different. The interval estimate
is obviously more accurate, because you're not just focusing on a
particular value or a particular point in order to estimate the
parameter. Instead, you're saying that the value might be within this
range between the lower confidence limit and the upper confidence limit.
All right, this denotes the range or the interval. Okay, if you're still
confused about interval estimation, let me give you a small example. If I
stated that I will take 30 minutes to reach the theater, this is known as
point estimation. Okay, but if I stated that I will take between 45
minutes and an hour to reach the theater, this is an example of interval
estimation. All right. I hope it's clear now. Now interval estimation
gives rise to two important statistical terminologies: one is known as
the confidence interval and the other is known as the margin of error.
All right. So it's important that you pay attention to both of these
terminologies. The confidence interval is one of the most significant
measures used to check how reliable a machine learning model is. All
right. So what is a confidence interval? A confidence interval is the
measure of your confidence that the estimated interval contains the
population parameter, such as the population mean. Statisticians use a
confidence interval to describe the amount of uncertainty associated
with a sample estimate of a population parameter. Now, guys, this is a
lot of definition. Let me help you understand the confidence interval
with a small example. Okay. Let's say that you perform a survey of a
group of cat owners to see how many cans of cat food they purchase in
one year. Okay, you test your statistics at the 99 percent confidence
level and you get a confidence interval of (100, 200). This means that
you think the cat owners buy between 100 and 200 cans in a year, and
since the confidence level is 99%, it shows that you're very confident
that the results are correct. Okay. I hope all of you are clear with
that.
All right, so your confidence interval here is (100, 200) and your
confidence level is 99%, right? That's the difference between confidence
interval and confidence level. So your value is going to lie within your
confidence interval, and your confidence level shows how confident you
are about your estimation, right? I hope that was clear.

Let's look at margin of error. Now, the margin of error for a given
level of confidence is the greatest possible distance between the point
estimate and the value of the parameter that it is estimating. You can
say that it is the deviation from the actual point estimate, right? Now,
the margin of error can be calculated using this formula: z_c multiplied
by the standard deviation, divided by the root of the sample size. Here
z_c denotes the critical value for the chosen confidence level, and n is
basically the sample size. Now, let's understand how you can estimate
the confidence intervals. So, guys, the level of confidence, which is
denoted by c, is the probability that the interval estimate contains the
population parameter. Let's say that you're trying to estimate the mean.
All right.
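Written out, the margin-of-error formula described in words above, and the interval it leads to for a mean, look roughly like this. The symbols E for the margin of error, \bar{x} for the sample mean, \mu for the population mean, and \sigma for the standard deviation are notation assumed here, since the slide itself is not part of the transcript.

```latex
E = z_c \cdot \frac{\sigma}{\sqrt{n}},
\qquad
\bar{x} - E \;\le\; \mu \;\le\; \bar{x} + E
```

When the population standard deviation is not known, the sample standard deviation is commonly substituted for \sigma.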
So the level of confidence is the probability that the interval estimate
contains the population parameter. So this interval between -z and z, or
the area beneath this curve, is nothing but the probability that the
interval estimate contains the population parameter, all right? It
should basically contain the value that you are estimating, right? Now,
these are known as critical values. These are basically your lower limit
and your upper limit for the confidence level. Also, there's something
known as the z-score. Now, this score can be calculated by using the
standard normal table, right? If you look it up anywhere on Google,
you'll find the z-score table or the standard normal table. To
understand how this is done, let's look at a small example. Okay, let's
say that the level of confidence is 90%. This means that you are 90%
confident that the interval contains the population mean. Okay, so the
remaining 10%, out of a hundred percent, is equally distributed over
these tail regions. Okay, so you have 0.05 here and 0.05 over here,
right? So on either side of c you distribute the leftover percentage.
Now, these z-scores are calculated from the table, as I mentioned
before. All right, 1.645 is calculated from the standard normal table.
Okay. So, guys, that's how you estimate the level of
confidence. So, to sum it up, let me tell you the steps that are
involved in constructing a confidence interval. First, you start by
identifying a sample statistic. Okay. This is the statistic that you
will use to estimate a population parameter. This can be anything, like
the mean of the sample. Next, you select a confidence level. Now, the
confidence level describes the uncertainty of a sampling method, right?
After that, you find something known as the margin of error, right? We
discussed margin of error earlier, so you find this based on the
equation that I explained in the previous slide. Then you finally
specify the confidence interval. All right.

Now, let's look at a problem statement to better understand this
concept. A random sample of 32 textbook prices is taken from a local
college bookstore. The mean of the sample is so-and-so, and the sample
standard deviation is this. Use a 95% confidence level and find the
margin of error for the mean price of all textbooks in the bookstore.
Okay. Now, this is a very straightforward question. If you want, you can
read the question again. All you have to do is substitute the values
into the equation. All right, so, guys, we know the formula for margin
of error. You take the z-score from the table. After that, we have the
standard deviation, which is 23.44, and n stands for the number of
samples here.
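The substitution can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, and instead of reading the z-score off a printed table it uses the standard library's `statistics.NormalDist`, which should give the same critical values.

```python
import math
from statistics import NormalDist

def critical_value(confidence_level):
    """Two-sided z critical value: the leftover probability is split
    equally between the two tails of the standard normal curve."""
    tail = (1 - confidence_level) / 2
    return NormalDist().inv_cdf(1 - tail)

def margin_of_error(confidence_level, std_dev, n):
    """z_c multiplied by the standard deviation, divided by sqrt(n)."""
    return critical_value(confidence_level) * std_dev / math.sqrt(n)

# The 90% example from earlier: 0.05 in each tail.
print(f"z for 90% confidence: {critical_value(0.90):.3f}")

# The textbook problem: 95% confidence, standard deviation 23.44, n = 32.
error = margin_of_error(0.95, std_dev=23.44, n=32)
print(f"z for 95% confidence: {critical_value(0.95):.2f}")
print(f"margin of error: {error:.2f}")
```

Keeping the critical value in its own function makes it easy to try other confidence levels without touching the formula.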
The number of samples is 32, basically 32 textbooks. So approximately
your margin of error is going to be around 8.12. This is a pretty simple
question. All right, I hope all of you understood this.

Now that you know the idea behind the confidence interval, let's move
ahead to one of the most important topics in statistical inference,
which is hypothesis testing, right? So, basically, statisticians use
hypothesis testing to formally check whether a hypothesis is accepted or
rejected. Okay, hypothesis testing is an inferential statistical
technique used to determine whether there is enough evidence in a data
sample to infer that a certain condition holds true for an entire
population. So, to understand the characteristics of a general
population, we take a random sample and analyze the properties of the
sample, right? We test whether or not the identified conclusion
represents the population accurately, and finally we interpret the
results. Now, whether or not to accept the hypothesis depends upon the
percentage value that we get from the hypothesis test.
So, to better understand this, let's look at a small example. Before
that, there are a few steps that are followed in hypothesis testing. You
begin by stating the null and the alternative hypothesis. All right,
I'll tell you what exactly these terms are. Then you formulate an
analysis plan, right? After that, you analyze the sample data, and
finally you interpret the results. Right, now to understand the entire
hypothesis testing process, we'll look at a good example. Okay, now
consider four boys: Nick, John, Bob, and Harry. These boys were caught
bunking a class, and they were asked to stay back at school and clean
the classroom as a punishment, right? So what John did is he decided
that the four of them would take turns to clean the classroom. He came
up with a plan of writing each of their names on chits and putting them
in a bowl. Now, every day they had to pick a name from the bowl, and
that person had to clean the class, right? That sounds pretty fair. Now,
it has been three days and everybody's name has come up except John's.
Assuming that this event is completely random and free of bias, what is
the probability of John not cheating, right? Or what is the probability
that he's not actually cheating? This can be solved by using
hypothesis testing. Okay. So we'll begin by calculating the probability
of John not being picked for a day. All right, so we're going to assume
that the event is free of bias. So we need to find out the probability
of John not cheating, right? First, we'll find the probability that John
is not picked for a day, right? We get 3 out of 4, which is basically
75%. 75% is fairly high. So if John is not picked for three days in a
row, the probability drops down to approximately 42%. Okay, so three
days in a row means the probability drops down to 42 percent. Now, let's
consider a situation where John is not picked for 12 days in a row. The
probability drops down to 3.2 percent. Okay, so the probability of John
cheating becomes fairly high, right? So, in order for statisticians to
come to a conclusion, they define what is known as the threshold value.
Right, considering the above situation, if the threshold value is set to
5 percent, it would indicate that if the probability lies below 5%, then
John is cheating his way out of detention. But if the probability is
above the threshold value, then John is just lucky and his name isn't
getting picked. So probability and hypothesis testing give rise to two
important components of hypothesis testing, which are the null
hypothesis and the alternative hypothesis. The null hypothesis is
basically approving the assumption; the alternative hypothesis is when
your result disproves the assumption, right? Therefore, in our example,
if the probability of an event occurring is less than 5%, which it is,
then the event is biased; hence, it proves the alternative hypothesis.
Machine learning is the most in-demand technology in today's market. Its
applications range from self-driving cars to predicting deadly diseases
such as ALS. The high demand for machine learning skills is the
motivation behind today's session. So let me discuss the agenda with you
first. Now, we're going to begin the session by understanding the need
for machine learning and why it is important. After that, we'll look at
what exactly machine learning is, and then we'll discuss a couple of
machine learning definitions. Once we're done with that, we'll look at
the machine learning process and how you can solve a problem by using
the machine learning process. Next, we will discuss the types of machine
learning, which include supervised, unsupervised, and reinforcement
learning. Once we're done with that, we'll discuss the different types
of problems that can be solved by using machine learning. Finally, we
will end this session by looking at a demo where we'll see how you can
perform weather forecasting by using machine learning.

All right, so, guys, let's get started with our first topic. So what is
the importance, or what is the need, for machine learning? Now, since
the technical revolution, we've been generating an immeasurable amount
of data. As per research, we generate around 2.5 quintillion bytes of
data every single day, and it is estimated that by 2020, 1.7 MB of data
will be created every second for every person on earth. Now, that is a
lot of data, right?
Now, this data comes from sources such as the cloud, IoT devices, social
media, and all of that. Since all of us are very interested in the
internet right now, we're generating a lot of data. All right, you have
no idea how much data we generate through social media: all the chatting
that we do, all the images that we post on Instagram, the videos that we
watch, all of this generates a lot of data. Now, how does machine
learning fit into all of this? Since we're producing this much data, we
need to find a method that can analyze, process, and interpret this much
data. All right, we need a method that can make sense out of data, and
that method is machine learning. Now, there are a lot of top-tier
companies and data-driven companies, such as Netflix and Amazon, which
build machine learning models by using tons of data in order to identify
profitable opportunities. And if they want to avoid any unwanted risk,
they make use of machine learning. All right, so through machine
learning you can predict risk, you can predict profits, and you can
identify opportunities that will help you grow your business. So now
I'll show you a couple of examples of where machine learning is used.
All right, so I'm sure all
of you have binge-watched on Netflix. Now, the most important thing
about Netflix is its recommendation engine. All right. Most of Netflix's
revenue comes from its recommendation engine. So the recommendation
engine basically studies the movie viewing patterns of its users and
then recommends relevant movies to them. All right, it recommends movies
depending on users' interests, the type of movies the user watches, and
all of that. All right, so that is how Netflix uses machine learning.

Next, we have Facebook's auto-tagging feature. Now, the logic behind
Facebook's auto-tagging feature is machine learning and neural networks.
I'm not sure how many of you know this, but Facebook makes use of the
DeepFace face verification system, which is based on machine learning,
natural language processing, and neural networks. So DeepFace basically
studies the facial features in an image and tags your friends and
family.

Another such example is Amazon's Alexa. Now, Alexa is basically an
advanced-level virtual assistant that is based on natural language
processing and machine learning. Now, it can do more than just play
music for you. All right, it can book your Uber, it can connect with
other IoT devices at your house, it can track your health, it can order
food online, and all of that. So data and machine learning are basically
the main factors behind Alexa's power.

Another such example is the Google spam filter. So, guys, Gmail
basically makes use of machine learning to filter out spam messages. If
any of you just open your Gmail inbox, you'll see that there are
separate sections. There's one for primary, there's social, there's
spam, and there's general mail. Now, basically, Gmail makes use of
machine learning algorithms and natural language processing to analyze
emails in real time and then classify them as either spam or non-spam.
Now, this is another famous application of machine learning. So, to sum
this up, let's
look at a few reasons why machine learning is so important. So the first
reason is obviously the increase in data generation. Because of the
excessive production of data, we need a method that can be used to
structure, analyze, and draw useful insights from data. This is where
machine learning comes in. It uses data to solve problems and find
solutions to the most complex tasks faced by organizations. Another
important reason is that it improves decision-making. So by making use
of various algorithms, machine learning can be used to make better
business decisions. For example, machine learning is used to forecast
sales, it is used to predict any downfalls in the stock market, and it
is used to identify risks, anomalies, and so on. Now, the next reason is
that it uncovers patterns and trends in data. Finding hidden patterns
and extracting key insights from data is the most essential part of
machine learning. So by building predictive models and using statistical
techniques, machine learning allows you to dig beneath the surface and
explore the data at a minute scale. Now, understanding data and
extracting patterns manually will take a lot of days. Now, if you do
this through machine learning algorithms, you can perform such
computations in less than a second. Another reason is that it solves
complex problems. So from detecting genes that are linked to the deadly
ALS disease to building self-driving cars and building face detection
systems, machine learning can be used to solve the most complex
problems.

So, guys, now that you know why machine learning is so important, let's
look at what exactly machine learning is. The term machine learning was
first coined by Arthur Samuel in the year 1959. Now, looking back, that
year was probably the most significant in terms of technological
advancements.
Now, if you browse through the net about what machine learning is,
you'll get at least a hundred different definitions. Now, the first and
very formal definition was given by Tom M. Mitchell. Now, the definition
says that a computer program is said to learn from experience E with
respect to some class of tasks T and performance measure P if its
performance at tasks in T, as measured by P, improves with experience E.
All right. Now, I know this is a little confusing, so let's break it
down into simple words. Now, in simple terms, machine learning is a
subset of artificial intelligence which provides machines the ability to
learn automatically and improve from experience without being explicitly
programmed to do so. In that sense, it is the practice of getting
machines to solve problems by gaining the ability to think. But wait,
now how can a machine think or make decisions? Well, if you feed a
machine a good amount of data, it will learn how to interpret, process,
and analyze this data by using machine learning algorithms. Okay.

Now, guys, look at this figure on top. Now, this figure basically shows
how a machine learning algorithm, or how the machine learning process,
really works. So machine learning begins by feeding the machine lots and
lots of data. Okay, by using this data, the machine is trained to detect
hidden insights and trends. Now, these insights are then used to build a
machine learning model by using an algorithm in order to solve a
problem. Okay. So basically you're going to feed a lot of data to the
machine. The machine is going to get trained by using this data. It's
going to use this data and it's going to draw useful insights and
patterns from it, and then it's going to build a model by using machine
learning algorithms. Now, this model will help you predict the outcome
or help you solve any complex problem or any business problem. So that's
a simple explanation of how machine learning works.
Now, let's move on and look at some of the most commonly used machine
learning terms. So, first of all, we have algorithm. Now, this is quite
self-explanatory. Basically, an algorithm is a set of rules or
statistical techniques that are used to learn patterns from data. Now,
an algorithm is the logic behind a machine learning model. All right, an
example of a machine learning algorithm is linear regression. I'm not
sure how many of you have heard of linear regression. It's the most
simple and basic machine learning algorithm. All right. Next, we have
model. Now, the model is the main component of machine learning. All
right. So the model will basically map the input to your output by using
the machine learning algorithm and by using the data that you're feeding
the machine.
So basically the model is a representation of the entire machine
learning process. So the model is basically fed input, which has a lot
of data, and then it outputs a particular result or outcome by using
machine learning algorithms. Next, we have something known as a
predictor variable. Now, a predictor variable is a feature of the data
that can be used to predict the output. So, for example, let's say that
you're trying to predict the weight of a person depending on the
person's height and age. All right. So over here the predictor variables
are height and age, because you're using the height and age of a person
to predict the person's weight. All right, so the height and the age are
the predictor variables. Now, weight, on the other hand, is the response
or the target variable. So the response variable is the feature or the
output variable that needs to be predicted by using the predictor
variables. All right, after that we have something known as training
data. So, guys, the data that is fed to a machine learning model is
always split into two parts. First we have the training data and then we
have the testing data. Now, training data is basically used to build the
machine learning model. So usually training data is much larger than the
testing data, because obviously if you're trying to train the machine,
then you're going to feed it a lot more data. Testing data is just used
to validate and evaluate the efficiency of the model. All right, so that
was training data and testing data.
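These terms can be sketched together using the height, age, and weight example. The records are invented, the helper name is hypothetical, and the 80/20 ratio is just a common choice assumed here, since the session only says that the training part is the larger one.

```python
import random

# Hypothetical records: (height_cm, age_years, weight_kg).
records = [
    (150, 25, 52), (160, 31, 58), (165, 22, 61), (170, 40, 70),
    (172, 35, 72), (175, 28, 74), (180, 45, 82), (185, 33, 86),
    (158, 50, 60), (168, 27, 66),
]

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of the rows and split it into (training, testing)."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = max(1, round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

train, test = train_test_split(records)

# Predictor variables (height, age) and response variable (weight).
X_train = [(height, age) for height, age, _ in train]
y_train = [weight for _, _, weight in train]

print(f"{len(train)} training rows, {len(test)} testing rows")
```

Shuffling before splitting keeps the testing rows from all coming out of one end of the data set.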
So, guys, these were a few terms that I thought you should know before
we move any further. Okay. Now, let's move on and discuss the machine
learning process. Now, this is going to get very interesting, because
I'm going to give you an example and make you understand how the machine
learning process works. So, first of all, let's define the different
stages, or the different steps, involved in the machine learning
process.
objectiveor defining the problem that you're trying to solvenext is is
data Gathering or data collection. Now the data that youneed to solve
this problem is collected at this stage. This is followedby data
preparation or data processing after that. You have dataexploration and
Analysis. Isis and the nextstage is building a machine learning model.
This is followedby model evaluation. And finally you haveprediction or
your output. Now, let's try to understandthis entire process with an
example. So our problem statement hereis to predict the possibility of
rain by studyingthe weather conditions. So let's say that you're givena
problem statement and you're asked to usea machine learning process to
solve this problem statement. So let's get started. Alright, so the
first step is to define the objective of the problem statement. Our
objective here is to predict the possibility of rain by studying the
weather conditions. Now, in the first stage of a machine learning
process, you must understand what exactly needs to be predicted. Now, in
our case, the objective is to predict the possibility of rain by
studying weather conditions, right? So at this stage it is also
essential to take mental notes on what kind of data can be used to solve
this problem, or the type of approach that you can follow to get to the
solution. All right, a few questions that are worth asking during this
stage are: What are we trying to predict? What are the target features,
or what are the predictor variables? What kind of input data do we need?
And what kind of problem are we facing? Is it a binary classification
problem or is it a clustering problem? Now, don't worry if you don't
know what classification and clustering are. I'll be explaining this in
the upcoming slides. So, guys, this was the first step of a machine
learning process, which is to define the objective of the problem. All
right. Now, let's move on and look at
step number two. So step number two is basically data collection or data
gathering. Now, at this stage, you must be asking questions such as:
What kind of data is needed to solve the problem? Is the data available?
And if it is available, how can I get the data? Okay. So once you know
the type of data that is required, you must understand how you can
derive this data. Data collection can be done manually or by web
scraping, but if you're a beginner and you're just looking to learn
machine learning, you don't have to worry about getting the data. Okay,
there are thousands of data resources on the web. You can just go ahead
and download datasets from websites such as Kaggle. Okay, now, coming
back to the problem at hand, the data needed for weather forecasting
includes measures such as humidity level, temperature, pressure,
locality, whether or not you live in a hill station, and so on. So,
guys, such data must be collected and stored for analysis.

Now, the next stage in machine learning is preparing your data. The data
you collect is almost never in the right format. So basically you'll
encounter a lot of inconsistencies in the data set. Okay, this includes
missing values, redundant variables, duplicate values, and so on.
Removing such values is very important because they might lead to
wrongful computations and predictions. So that's why at this stage you
must scan the entire data set for any inconsistencies.
You have to fix them at this stage. Now, the next step is exploratory
data analysis. Now, data analysis is all about diving deep into data and
finding all the hidden data mysteries. Okay. This is where you become a
detective. So EDA, or exploratory data analysis, is like the
brainstorming stage of machine learning. Data exploration involves
understanding the patterns and the trends in your data.
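One common exploration step is to measure how strongly two variables move together. The sketch below computes a Pearson correlation between temperature and rain by hand; the readings are invented and the function name is hypothetical, so treat it as an illustration of the idea rather than a recipe.

```python
import math

def pearson_correlation(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up daily readings: temperature in Celsius, and rain as 1 or 0.
temperature = [31, 29, 27, 24, 22, 20, 18, 16]
rained = [0, 0, 0, 0, 1, 1, 1, 1]

r = pearson_correlation(temperature, rained)
print(f"correlation between temperature and rain: {r:.2f}")
```

A value close to -1 would mean that rain tends to come with lower temperatures in this sample, and a value near 0 would mean no clear linear relationship.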
drawn and all the correlations. Turns between thevariables are
understood. So you might ask what sort of correlations areyou talking
about? For example in the caseof predicting rain fall. We know that
there isa strong possibility of rain if the temperaturehas fallen low.
Okay. So such correlationshave to be understood and mapped at this
stage. Now. This stage is followedby stage number 5, which is buildinga
machine learning model. So all the insightsand the patterns that you
derive during data exploration are usedto build the machine learning. So
this stage always Beginsby splitting the data set into two parts
training dataand the testing data. So earlier in the session. I already
told you what training and testing data isnow the training data will be
used to buildand analyze the model and the logic of the model will be
based on the machinelearning algorithm that is being implemented. Okay.
Now, in the case of predicting rainfall, since the output will be in the
form of true or false, we can use a classification algorithm like
logistic regression. Now, choosing the right algorithm depends on the
type of problem you're trying to solve, the data set you have, and the
level of complexity of the problem. So in the upcoming sections we'll be
discussing different types of problems that can be solved by using
machine learning.
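For a feel of what building such a model involves, here is a small from-scratch sketch of logistic regression trained with stochastic gradient descent. Everything in it is an assumption made for illustration: the function names, the two features, the tiny invented training set, and the learning rate and epoch count. A real project would normally rely on an established library instead.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic_regression(rows, labels, learning_rate=0.1, epochs=2000):
    """Fit weights and a bias with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            score = sum(w * x for w, x in zip(weights, features)) + bias
            error = sigmoid(score) - label  # gradient of the log loss
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
            bias -= learning_rate * error
    return weights, bias

def predict_rain(weights, bias, features):
    """True means 'rain', False means 'no rain'."""
    score = sum(w * x for w, x in zip(weights, features)) + bias
    return sigmoid(score) >= 0.5

# Made-up training data: (humidity, temperature), both scaled to 0-1.
X_train = [(0.9, 0.2), (0.8, 0.3), (0.85, 0.25),
           (0.3, 0.8), (0.2, 0.9), (0.4, 0.7)]
y_train = [1, 1, 1, 0, 0, 0]  # 1 = rain, 0 = no rain

weights, bias = train_logistic_regression(X_train, y_train)
print(predict_rain(weights, bias, (0.88, 0.22)))  # a humid, cool day
print(predict_rain(weights, bias, (0.25, 0.85)))  # a dry, hot day
```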
So don't worry if you don't know what a classification algorithm is and
what logistic regression is. Okay. So all you need to know is that at
this stage you'll be building a machine learning model by using a
machine learning algorithm and by using the training data set.

The next step in the machine learning process is model evaluation and
optimization. So after building a model by using the training data set,
it is finally time to put the model to a test. Okay. So the testing data
set is used to check the efficiency of the model and how accurately it
can predict the outcome. So once you calculate the accuracy, any
improvements in the model have to be implemented in this stage. Okay, so
methods like parameter tuning and cross-validation can be used to
improve the performance of the model.

This is followed by the last stage, which is predictions. So once the
model is evaluated and improved, it is finally used to make predictions.
The final output can be a categorical variable or it can be a continuous
quantity. In our case, for predicting the occurrence of rainfall, the
output will be a categorical variable, in the sense that our output will
be in the form of true or false, yes or no. Yes basically represents
that it is going to rain, and no represents that it won't rain. Okay, as
simple as that. So, guys, that was the entire machine learning process.