AI_DL_Assignment / 11. Assessing Model Performance /3. Finding and Viewing Misclassified Data.srt
1
00:00:00,830 --> 00:00:05,740
So welcome to 11.3, where we actually find and view our misclassified data.
2
00:00:06,120 --> 00:00:12,790
If you recall correctly, our last classifier was actually confusing sevens with twos.
3
00:00:13,110 --> 00:00:16,800
So how do we actually see the sevens and twos that our classifier
4
00:00:17,040 --> 00:00:18,860
is mixing up?
5
00:00:18,930 --> 00:00:22,180
So now, let's first talk a bit about why this is useful.
6
00:00:22,180 --> 00:00:27,210
Firstly, this is actually an underused technique; I don't see that many computer vision or data scientists
7
00:00:27,690 --> 00:00:30,800
using this technique to identify a classifier's weaknesses.
8
00:00:30,840 --> 00:00:35,850
I think it's crucial, because by looking at what it is misclassifying, you can actually figure out: oh, I need
9
00:00:35,850 --> 00:00:39,240
more of this type of data to make my classifier smarter.
10
00:00:39,240 --> 00:00:43,520
Maybe we need to augment more, to add some robustness to our classifier.
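The "augment more" idea can be sketched in plain NumPy, for example by randomly shifting digit images a couple of pixels; `augment_shift` is a hypothetical helper written for illustration, not code from this lesson:

```python
import numpy as np

def augment_shift(images, max_shift=2, seed=0):
    """Randomly shift each image by up to max_shift pixels in x and y.
    A cheap augmentation that adds some translation robustness."""
    rng = np.random.default_rng(seed)
    out = np.empty_like(images)
    for i, img in enumerate(images):
        dx, dy = rng.integers(-max_shift, max_shift + 1, size=2)
        out[i] = np.roll(np.roll(img, dy, axis=0), dx, axis=1)
    return out

# Toy batch of 28x28 "digit" images standing in for real training data
batch = np.random.rand(4, 28, 28)
augmented = augment_shift(batch)
print(augmented.shape)  # (4, 28, 28)
```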
11
00:00:43,890 --> 00:00:49,590
So viewing the misclassified test data can tell us a lot of things. Sometimes what is confusing it is the
12
00:00:49,590 --> 00:00:51,690
classes looking similar, even to us.
13
00:00:52,080 --> 00:00:56,550
Maybe it's a more complex pattern, maybe we need to add more deep layers, and maybe our training data
14
00:00:56,550 --> 00:01:02,340
is mislabeled. That actually happens quite a bit to me, because I tend to label a lot of my datasets myself,
15
00:01:02,370 --> 00:01:07,660
which is tedious, exhausting, and sometimes error-prone.
16
00:01:07,680 --> 00:01:09,370
So let's see how we do this.
17
00:01:09,380 --> 00:01:11,310
Now, on to our IPython notebook.
18
00:01:11,330 --> 00:01:16,110
But before I go ahead, let me just show you what this actually tells us.
19
00:01:16,110 --> 00:01:19,560
This is some real-life data that has been misclassified by our classifier.
20
00:01:20,020 --> 00:01:23,900
So it should be quite illustrative for you.
21
00:01:24,180 --> 00:01:30,060
But basically, what happens here is that this is the data input here, the input image.
22
00:01:30,090 --> 00:01:33,350
So it actually was a 6, but our classifier predicted a 0.
23
00:01:33,660 --> 00:01:38,410
Now, this is clearly a 6, so our classifier is doing something very wrong here.
24
00:01:38,520 --> 00:01:39,470
This one is an 8.
25
00:01:39,480 --> 00:01:45,270
But we can kind of forgive our classifier slightly, because it sort of looks like someone wrote a 2 intentionally,
26
00:01:45,700 --> 00:01:49,340
and then maybe their pen skipped and they weren't able to continue with it.
27
00:01:49,410 --> 00:01:55,340
So possibly, because the 2 is the most pronounced part of this number,
28
00:01:55,500 --> 00:01:58,210
I can see why it classified this as a 2.
29
00:01:58,470 --> 00:01:59,900
This one looks like a 9.
30
00:02:00,150 --> 00:02:02,420
Clearly a 9: our classifier predicted it was a 9.
31
00:02:02,430 --> 00:02:04,460
However, it actually was an 8.
32
00:02:04,470 --> 00:02:06,900
So when I said it was clearly a 9: it was actually an 8.
33
00:02:06,900 --> 00:02:14,220
Someone wrote it very poorly and basically made this bottom circle of the 8 very small, or perhaps it was
34
00:02:14,220 --> 00:02:15,070
mislabeled data.
35
00:02:15,120 --> 00:02:15,920
We don't even know.
36
00:02:16,140 --> 00:02:18,290
But let's trust our dataset for now,
37
00:02:18,390 --> 00:02:20,370
and let's assume this was an 8
38
00:02:20,400 --> 00:02:22,460
that was basically misclassified.
39
00:02:23,970 --> 00:02:29,230
This one is a 4, though it actually looks like a 9 to me, to be honest. My handwriting wasn't good either;
40
00:02:29,520 --> 00:02:32,380
I got scolded for that all the time in high school and primary school.
41
00:02:32,730 --> 00:02:33,330
So yeah.
42
00:02:33,360 --> 00:02:34,820
So we can understand that one.
43
00:02:34,890 --> 00:02:38,570
This one, I'd say, is definitely a 6, or a G even.
44
00:02:38,570 --> 00:02:41,010
I mean, these should all be digits, to be fair.
45
00:02:41,460 --> 00:02:47,460
So we can see it predicted a 5, but it should've gotten that as a 6, because, you know, you don't
46
00:02:47,460 --> 00:02:49,010
draw a 5 like this.
47
00:02:49,080 --> 00:02:50,230
So yeah.
48
00:02:50,250 --> 00:02:56,620
So let's go into our IPython notebook and see how we actually create plots or generate images like this.
49
00:02:57,120 --> 00:02:57,420
OK.
50
00:02:57,430 --> 00:03:03,330
So how do we find the misclassified data from our IPython notebook, from our Python code,
51
00:03:03,390 --> 00:03:04,430
basically.
52
00:03:04,530 --> 00:03:06,690
So let's think about this quickly, right?
53
00:03:06,810 --> 00:03:10,230
We have our test data labels and test data.
54
00:03:10,440 --> 00:03:12,340
And we have our training data as well.
55
00:03:12,360 --> 00:03:15,680
So how do we figure out which labels have been predicted wrong?
56
00:03:15,960 --> 00:03:17,310
And that's actually fairly easy.
57
00:03:17,310 --> 00:03:25,170
All we need to do is compare y_test with y_pred, and that's what this function, np.absolute,
58
00:03:25,170 --> 00:03:27,060
does, right?
59
00:03:27,120 --> 00:03:32,790
It basically creates an array that stores a nonzero value when a misclassification occurs.
60
00:03:32,790 --> 00:03:35,500
And basically, we use np.nonzero on this result.
61
00:03:35,760 --> 00:03:42,120
We basically create a mask here: when the result is greater than 0, which means that
62
00:03:42,120 --> 00:03:48,750
it's misclassified, we get the indices here, and these indices now will correspond to the actual
63
00:03:48,960 --> 00:03:54,250
digits in y_test and y_pred that were actually misclassified.
64
00:03:54,420 --> 00:03:56,220
So take index 247, for example.
65
00:03:56,310 --> 00:04:03,450
If you put some brackets in and index into it, that was an actual misclassified data image
66
00:04:03,540 --> 00:04:04,390
input.
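As a minimal sketch of this step, with made-up labels standing in for the real y_test and y_pred:

```python
import numpy as np

# Hypothetical ground-truth labels and predictions (digits 0-9)
y_test = np.array([7, 2, 1, 0, 4, 1, 4, 9])
y_pred = np.array([7, 2, 1, 6, 4, 1, 9, 9])

# A nonzero absolute difference marks a misclassification
result = np.absolute(y_test - y_pred)

# Indices of the misclassified test samples
misclassified_indices = np.nonzero(result > 0)[0]
print(misclassified_indices)  # -> [3 6]
```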
67
00:04:04,740 --> 00:04:08,290
So let's run this, and we get this here.
68
00:04:08,910 --> 00:04:11,700
And it does this quite quickly, as you can see.
69
00:04:11,700 --> 00:04:15,930
This is providing you actually have y_pred; we got y_pred earlier.
70
00:04:15,930 --> 00:04:21,150
Basically, if you remember correctly, from this here, model.predict_classes, and then we generated
71
00:04:21,150 --> 00:04:24,840
our confusion matrix and classification report.
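The confusion matrix and classification report come from scikit-learn's metrics module; here is a tiny self-contained sketch with toy labels standing in for the real test set (assuming scikit-learn is installed):

```python
import numpy as np
from sklearn.metrics import confusion_matrix, classification_report

# Toy labels standing in for the real y_test / y_pred
y_test = np.array([0, 1, 2, 2, 1, 0])
y_pred = np.array([0, 1, 2, 1, 1, 0])

# Rows are true classes, columns are predicted classes
print(confusion_matrix(y_test, y_pred))
print(classification_report(y_test, y_pred))
```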
72
00:04:24,840 --> 00:04:30,120
So now let's display it using OpenCV. I've actually commented out some lines here, and that's because
73
00:04:30,120 --> 00:04:35,850
if you wanted to load a model: that would assume we didn't train this model here, and you just wanted to load
74
00:04:35,940 --> 00:04:42,000
a model, you can load it in here as "classifier", just change "model" to "classifier" here, and basically do
75
00:04:42,000 --> 00:04:42,710
the same thing.
76
00:04:42,750 --> 00:04:48,230
And I've commented out this line here, which was a print statement where I was just printing the labels.
77
00:04:48,300 --> 00:04:48,860
OK.
78
00:04:49,110 --> 00:04:53,730
We're going to actually display it all in one image for the first 10 misclassifications.
79
00:04:53,730 --> 00:04:54,810
So let's take a look.
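A rough sketch of this display step, using matplotlib in place of the OpenCV routine from the notebook; plot_misclassified and the noise images are made up for illustration:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
import numpy as np

def plot_misclassified(x_test, y_test, y_pred, indices, n=10):
    """Show the first n misclassified images, titled 'predicted / true'."""
    n = min(n, len(indices))
    fig, axes = plt.subplots(1, n, figsize=(2 * n, 2))
    for ax, idx in zip(np.atleast_1d(axes), indices[:n]):
        ax.imshow(x_test[idx], cmap="gray")
        ax.set_title(f"{y_pred[idx]} / {y_test[idx]}", color="green")
        ax.axis("off")
    return fig

# Noise images standing in for x_test; labels disagree at indices 1 and 3
x_test = np.random.rand(5, 28, 28)
y_test = np.array([3, 8, 1, 6, 0])
y_pred = np.array([3, 4, 1, 5, 0])
fig = plot_misclassified(x_test, y_test, y_pred, indices=[1, 3])
print(len(fig.axes))  # 2
```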
80
00:04:57,390 --> 00:05:01,640
So there it is now; let's look at this.
81
00:05:01,760 --> 00:05:02,580
Exactly.
82
00:05:02,900 --> 00:05:08,470
So what it tells us is that this was the input image, and this is what it predicted, in green.
83
00:05:08,690 --> 00:05:10,830
And this is the actual true value, a 4.
84
00:05:11,180 --> 00:05:12,000
So that's interesting.
85
00:05:12,080 --> 00:05:13,180
Let's take a look at another one.
86
00:05:15,180 --> 00:05:17,560
This one is actually a 6, predicted as a 0.
87
00:05:18,100 --> 00:05:19,680
Let's keep going.
88
00:05:19,680 --> 00:05:21,130
This one is an 8, actually,
89
00:05:21,140 --> 00:05:22,410
but it predicted a 4.
90
00:05:22,680 --> 00:05:23,490
Same for this one.
91
00:05:23,490 --> 00:05:24,030
This one.
92
00:05:24,020 --> 00:05:25,780
This one... this is pretty cool.
93
00:05:26,010 --> 00:05:30,630
So you can keep going and go through all the misclassifications if you want to see what is actually confusing
94
00:05:30,670 --> 00:05:31,360
your classifier.