Course: DS 4002 - Data Science Project Course
Authors: Navya Annapareddy, Tony Albini, Kunaal Sarnaik, & Jaya Vellayan
Professor: Brian Wright
Date: January 15th, 2021
The following section serves as a general introduction to the topic that will be discussed throughout this research analysis — Chest X-Ray Diagnostic Image Classification. See the sub-sections embedded below for further information: Motivation, Background, General Research Question, & Relevant Research.
According to the Mayo Clinic, Chest X-Rays are one of the most common diagnostic tools utilized in the clinical setting; their usage rates fall just behind those of the electrocardiogram and bedside ultrasound.$^{1}$ Because it can reveal so much about the structures inside a patient's chest, a Chest X-Ray is often among the first procedures a physician orders when heart or lung problems are suspected. Furthermore, due to the numerous, severe respiratory and cardiovascular complications of COVID-19, the use of Chest X-Rays within public and private hospitals is expected to increase greatly within the coming years.$^{2}$ However, an increase in the number of such procedures disproportionately affects physicians specializing in an already arduous medical field — radiology.
According to a study conducted by radiologists Bruls & Kwee at the Zuyderland Medical Center in the Netherlands, the workload for radiologists during on-call hours has increased dramatically worldwide over the past 15 years.$^{3}$ In their research manuscript, Bruls & Kwee call for the radiologist and technician workforce to be expanded, both to avoid potential burn-out and to maintain the quality and safety of radiological care. In the United States alone, the workload of radiologists has increased at an alarming rate over the past two decades.$^{4}$ Furthermore, previous studies have shown that the number of errors attributed to this heavy workload has increased in step.$^{5-6}$ For patients, these trends are extremely concerning for two primary reasons: 1) a potential health risk may go unnoticed, and 2) repeat imaging means additional radiation exposure.
According to the Radiological Society of North America, most missed radiologic diagnoses are attributable to image interpretation errors by radiologists.$^{7}$ In a study conducted by RSNA researchers Bruno et al., the rate of missed, incorrect, or delayed diagnoses in the United States was estimated to be as high as 10-15%. An especially alarming case these researchers describe is that of a 4-year-old boy who had swallowed a coin that became lodged in his esophagus. According to Bruno et al., a skilled pediatric radiologist missed this clear diagnostic indicator twice, leaving any mention of a coin out of the clinical image interpretation. Similar errors across the country and the world likely result in many diseases and medical problems going undiagnosed, making them all the harder to treat as they progress to later stages.
However, the complications resulting from radiologic image interpretation errors do not stop there; in addition to allowing preexisting medical conditions to worsen for lack of a diagnosis, a radiologic error often leads to more imaging down the road for the patient.$^{5}$ As such, one instance of radiation exposure can translate into several. Although the radiation dose from a single Chest X-Ray is relatively small, repeated exposure is still concerning for a patient who requires frequent imaging because of a chronic disease such as Chronic Obstructive Pulmonary Disease (COPD).$^{8}$
Given both the causes and complications of radiologist image interpretation errors, what if there were a way to better screen medical images such as those resulting from a Chest X-Ray? More specifically, can deep learning and convolutional neural networks (CNNs) help with this application? The following research project will leverage machine learning techniques and state-of-the-art CNN architectures to analyze this proposition. A dataset of several thousand Chest X-Ray images will be utilized, exploratory data analysis and feature engineering will be conducted, and a finalized CNN model will be constructed to classify diagnostic Chest X-Ray images for disease. In the end, the model could potentially serve as a preliminary screening tool in the clinical setting to aid radiologists in their image interpretation. As such, the overarching goal of this project is to contribute to the innovative research concentrated in medical image classification in order to increase clinical workflow efficiency, reduce both physician workload and error, and, most importantly, improve patient outcomes.
Chest X-Rays utilize very small doses of ionizing radiation to produce pictures of the thoracic cavity. Commonly utilized to evaluate the lungs, heart, and chest wall, Chest X-Rays serve as important tools for investigating symptoms such as shortness of breath, persistent cough, fever, chest pain, or traumatic injury.$^{8}$ Furthermore, they can also be used to monitor chronic diseases and disorders such as pneumonia, emphysema, and cancer. Listed below is an overview of what the Chest X-Ray can reveal about any given patient's body$^{1}$:
Lung Condition - Detecting cancer, infection, chronic conditions, complications, or air collecting around a lung potentially causing collapse.
Heart-Related Lung Problems - Detecting changes or problems in the lungs that stem from preexisting heart problems.
Heart Size and Outline - Detecting changes in size and shape of the heart, which may additionally be a sign of heart failure, excess fluid, or heart valve abnormalities.
Blood Vessels - Detecting aneurysms, congenital heart disease, or other problems with the aorta, pulmonary artery, coronary artery, and vena cava.
Calcium Deposits - Detecting the presence of calcium, which may correspond to excess fat in blood vessels, damage to heart valves, coronary artery abnormalities, or abnormalities of the heart muscle and its protective sac.
Fractures - Detecting rib or spine fractures, as well as other problems with bones (cancer, osteomyelitis, etc.).
Postoperative Changes - Detecting problems that emerge after a surgery, the most common being intubation complications involving the esophagus.
Pacemakers, Defibrillators, & Catheters - Detecting any problems with the placement of these devices.
Convolutional Neural Networks (CNNs) are a class of artificial neural networks that have become especially popular in computer vision tasks.$^{9}$ The architecture of a CNN, consisting of various layers, is designed to take advantage of the two-dimensional structure of an input image.$^{10}$ This is achieved with local connections and tied weights followed by some form of pooling, which results in translation-invariant features between layers. An additional benefit is ease of training: compared to fully connected networks, CNNs require far fewer parameters.
The architecture of a CNN, as mentioned previously, consists of one or more layers. In general, three main types of layers are utilized in building any given CNN:
1. Convolutional Layers
The convolutional layer is always the first layer of a CNN. The input to this layer is a tensor holding the raw pixel values of the image to be classified.$^{11}$ The shape of this input tensor is (# of images, image height, image width, input channels). The number of input channels is generally 3, corresponding to the RGB values of any given pixel. The convolutional layer takes this input tensor and applies filters (also called neurons or kernels): each filter is slid, or convolved, across the image, and at each position the filter's weights are multiplied element-wise with the pixel values in the area it covers (the neuron's receptive field) and summed, leaving one value. Repeating this at every location produces a feature map for that filter, and using more filters produces more feature maps. In the end, the information in the original image tensor can be represented with far fewer parameters, which is the hallmark function of a CNN.
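To make the filter and feature-map idea concrete, the minimal sketch below passes a single randomly generated image-shaped tensor through one convolutional layer and prints the shape of the resulting feature maps. It uses the Keras API imported later in this notebook; the filter count and kernel size are illustrative assumptions, not the settings of our final model.

```python
import numpy as np
from keras.models import Sequential
from keras.layers.convolutional import Conv2D

# One fake "image": a batch of 1, 128x128 pixels, 3 RGB channels
dummy_image = np.random.rand(1, 128, 128, 3).astype("float32")

# A single convolutional layer with 32 filters (kernels), each 3x3
conv_only = Sequential([
    Conv2D(32, kernel_size=(3, 3), activation="relu", input_shape=(128, 128, 3))
])

# Each filter is convolved across the image, producing one feature map per filter
feature_maps = conv_only.predict(dummy_image)
print(feature_maps.shape)  # (1, 126, 126, 32): 32 feature maps from 32 filters
```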
2. Fully Connected (Dense) Layers
The next main type of layer in a CNN is the fully connected layer, sometimes referred to as the dense layer. This layer takes an input volume and outputs an N-dimensional vector, where N is the number of classes the program has to choose from. Each number in this N-dimensional vector represents the probability of a certain class. For example, if one wanted to classify images as red, blue, or green, the output of a dense layer might be something along the lines of [0.2 0.5 0.3], indicating that the model finds a 20% chance of the image being red, a 50% chance of it being blue, and a 30% chance of it being green. The fully connected layer does this by looking at the output of the previous layer, which represents activation maps of high-level features in the image, and determining which features most correlate with a particular class.
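As a hedged illustration of the N-dimensional output described above, the sketch below feeds a random 64-dimensional feature vector (standing in for the output of earlier layers) through a dense softmax layer with N = 3 classes; the sizes are arbitrary and chosen only to mirror the three-class scheme used later in this notebook.

```python
import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# A toy "flattened feature vector" standing in for the output of earlier layers
features = np.random.rand(1, 64).astype("float32")

# Fully connected (dense) layer with N = 3 output classes and a softmax activation
dense_head = Sequential([Dense(3, activation="softmax", input_shape=(64,))])

probs = dense_head.predict(features)
print(probs)        # one probability per class, e.g. [[0.31 0.42 0.27]]
print(probs.sum())  # the softmax probabilities sum to 1.0
```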
3. Pooling Layers
The third and final main type of CNN layer is the pooling layer. A pooling layer operates on each feature map taken from its input to create a new set of the same number of pooled feature maps.$^{12}$ Essentially, it summarizes each feature map patch by patch, and this is performed with two common functions: 1) Average Pooling, and 2) Maximum Pooling. Average Pooling calculates the average value of each patch on the feature map, while Max Pooling calculates the maximum value of each patch. The end result of a pooling layer is a downsampled, summarized version of the features detected in the input image. This is what adds the model's invariance to local translation.
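The short sketch below stacks the three layer types in the order just described and prints the output shape of each stage, which shows the pooling layer halving the spatial dimensions of the feature maps before the dense layer produces class probabilities. The layer sizes are illustrative assumptions only, not the architecture trained later in this notebook.

```python
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.layers.convolutional import Conv2D, MaxPooling2D

toy_cnn = Sequential([
    Conv2D(32, (3, 3), activation="relu", input_shape=(128, 128, 3)),  # -> (126, 126, 32) feature maps
    MaxPooling2D(pool_size=(2, 2)),                                    # -> (63, 63, 32), downsampled
    Flatten(),                                                         # -> one long feature vector
    Dense(3, activation="softmax"),                                    # -> 3 class probabilities
])

# summary() lists each layer with its output shape and parameter count
toy_cnn.summary()
```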
The above layers outline the general components comprising the architecture of any given CNN, yet several variants of these, as well as additional layer types, also exist and are used widely for image classification. However, after the CNN architecture is laid out with respect to layer size and ordering, the model still must be trained; CNNs achieve this training through backpropagation.$^{11}$ This training process is composed of four distinct stages:
1. The Forward Pass
This is the training phase in which an image is passed through the CNN's entire layer hierarchy. For the very first image, the output is likely to assign roughly equal probability to each class being used to classify the image, since the randomly initialized weights give no preference to any class. The CNN then compares its output against the label that the training image is supposed to be classified as and backpropagates through use of the loss function.
2. The Loss Function
For the CNN's predicted label to match the intended training label for any given image, the loss of that training image, as calculated by the loss function, must be driven toward zero. Essentially, this is an optimization problem, in which the loss function is used to find out which inputs (the weights applied to the feature maps) most directly contributed to the loss (error) of the network. Common loss functions include MSE (Mean Squared Error), Hinge Loss, and Cross-Entropy, the last of which will be utilized later in this notebook.
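Since categorical cross-entropy is the loss used later in this notebook, the small NumPy sketch below (with made-up probability vectors) shows how it assigns a small loss to a prediction that puts most of its probability on the true class and a large loss to one that does not.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy between a one-hot label and a predicted probability vector."""
    return -np.sum(y_true * np.log(y_pred + eps))

y_true = np.array([0.0, 1.0, 0.0])            # one-hot label: the true class is class 1

confident_right = np.array([0.05, 0.90, 0.05])
confident_wrong = np.array([0.80, 0.10, 0.10])

print(categorical_cross_entropy(y_true, confident_right))  # ~0.105 (small loss)
print(categorical_cross_entropy(y_true, confident_wrong))  # ~2.303 (large loss)
```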
3. The Backward Pass
Given the loss, as calculated by any one of the loss functions listed previously, a backward pass is now performed through the CNN to find out which weights contributed most to the loss. Using these contributions, the CNN adjusts its weights so that the loss decreases. This is done by calculating the change in loss with respect to the change in each weight (the gradient).
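In practice the backward pass is handled by automatic differentiation. The hedged sketch below uses toy numbers, a single scalar weight, and TensorFlow's GradientTape (TensorFlow is imported alongside Keras in this notebook) to compute the change in loss with respect to the weight for a simple squared-error loss.

```python
import tensorflow as tf

# Automatic differentiation of a toy loss with respect to a single weight
x, target = 2.0, 1.0          # toy input and true label
w = tf.Variable(0.9)          # the weight whose contribution to the loss we want

with tf.GradientTape() as tape:
    prediction = w * x                     # forward pass: 1.8
    loss = (prediction - target) ** 2      # squared-error loss: 0.64

grad = tape.gradient(loss, w)              # backward pass: dL/dw
print(grad.numpy())                        # approximately 3.2 for these toy numbers
```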
4. Weight Update
Now that the contribution of each weight in the CNN's layers to the loss is known, the filter weights are updated such that the loss decreases. The learning rate is crucial to this phase, as it is essentially the length of the step taken in each weight update. Lower learning rates may take a long time to converge to the minimum loss, while larger learning rates may result in jumps that are too large and imprecise to reach the optimal point.
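Continuing the toy example, the sketch below applies one weight update by hand: the gradient found in the backward pass is scaled by an arbitrary, illustrative learning rate and subtracted from the weight, which lowers the loss. A CNN performs the same computation simultaneously for every filter weight.

```python
# One gradient-descent step for a single weight w under the loss L(w) = (w * x - target)^2
x, target = 2.0, 1.0       # toy input and true label
w = 0.9                    # current weight
learning_rate = 0.1        # step size used in the weight update

prediction = w * x                       # forward pass: 1.8
loss = (prediction - target) ** 2        # loss before the update: 0.64
grad = 2 * (prediction - target) * x     # backward pass, dL/dw: 3.2

w = w - learning_rate * grad             # weight update: 0.9 - 0.1 * 3.2 = 0.58
new_loss = (w * x - target) ** 2         # loss after the update: (1.16 - 1.0)^2 = 0.0256
print(w, new_loss)
```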
An in-depth analysis utilizing data science and machine learning principles in the context of medical Chest X-Ray Image Classification will be conducted in an attempt to address the following research question:
How do class imbalances affect the performance of a state-of-the-art Convolutional Neural Network model in classifying Chest X-Ray Images?
Using this question as a guide, hypotheses will be developed and a comprehensive analysis that leverages state-of-the-art CNN principles will be conducted. A dataset containing several thousand Chest X-Ray images with diagnostic indications of a number of cardiac and pulmonary diseases will be analyzed (see the Dataset Overview & Cleaning section for more information). The dataset will be cleaned, pre-processed, and explored, after which iterative model runs and feature engineering steps will be performed to test how class imbalances affect the performance of state-of-the-art CNN architectures.
The utilization of Convolutional Neural Networks within the medical space, specifically for image classification, is both an established and an emerging area of innovative research. The distinct advantage of utilizing CNNs for such an application is that the algorithms can be generalized to solve many different kinds of medical image classification tasks through a process called transfer learning.$^{13}$ Given the large rise in medical data and released images, many researchers have applied deep learning networks to successfully classify images such as CT scans, MRIs, and ultrasound results as pathological or physiological. However, given the focus of this project, the following are some relevant studies pertaining to classifying pulmonary and cardiac disorders by training CNNs on Chest X-Ray images.
In late 2019, a group of researchers, Jain et al., on a joint project from Bharati Vidyapeeth's College of Engineering and the Karunya Institute of Technology and Sciences, utilized several state-of-the-art CNN architectures to classify Chest X-Rays as positive or negative for pneumonia.$^{14}$ Using the VGG16, VGG19, ResNet50, and Inception-v3 architectures, Jain et al. found that these cutting-edge, pre-trained algorithms achieved testing accuracies of 87.28%, 88.46%, 77.56%, and 70.99%, respectively. Interestingly, two custom models, consisting of two and three convolutional layers, achieved testing accuracies of 85.26% and 92.31%, respectively. These findings suggest that CNNs might achieve better performance on medical image classification when the models are trained specifically on the dataset being classified, since pre-trained models are trained on everyday objects such as cats and dogs. As a result, the pre-trained models, although state-of-the-art, may not detect the complex intricacies within medical images needed for sufficient classification.
Furthermore, with the rise of the COVID-19 pandemic in the past two years, substantial research has gone into Chest X-Ray image classification as a method of detecting the virus early and often. In November 2020, researchers at the University of Waterloo (Canada), Wang et al., developed a tailored deep CNN design specifically for the detection of COVID-19 cases from Chest X-Ray images.$^{15}$ After consolidating 13,975 Chest X-Ray images across 13,870 patient cases, their COVID-Net model achieved a respectable accuracy of 93.3%. Furthermore, they reported that this custom-tailored model achieved better accuracies than both the pre-trained VGG and ResNet CNN architectures. Similarly, a joint team of researchers from the University of Azad Jammu & Kashmir in Pakistan and Stony Brook University in New York utilized CNNs to classify Chest X-Ray images as positive for COVID-19, positive for bacterial pneumonia, positive for non-COVID viral pneumonia, or negative for any pathology.$^{16}$ Using a deep convolutional network based on the state-of-the-art U-Net CNN design, the researchers, Hussain et al., achieved an admirable accuracy of 79.52% on this complex multi-class classification.
Overall, the use of deep CNNs within the medical image classification space is still emerging. However, many of these prior analyses focus on a single disease and specialize their networks accordingly. As such, the goal of these prior projects is to simulate radiologist-level classification, which ultimately may not be feasible, as it would suggest replacing radiologists with artificial neural networks. Our project will instead focus on serving as an initial screening tool for Chest X-Rays, classifying them by finding or severity, so that radiologists can refocus their efforts rather than being replaced entirely.
The following section describes the dataset being used to address the general research question listed above, as well as details how the dataset was pre-processed for analysis.
The dataset being utilized in this project is a sub-sample of a dataset provided by the NIH containing 112,120 Chest X-Ray images obtained from 30,805 unique patients. The labels in the original dataset were generated through the use of Natural Language Processing techniques to text-mine disease classifications from the associated radiological reports. The labels are expected to be approximately 90% accurate and suitable for weakly-supervised learning. The images were already resized to 1024x1024 pixels, and the sub-sample that will specifically be utilized in this notebook, due to computational limitations, contains 5,606 images. The images are classified using 15 different labels (14 diseases and one "No Finding" label), and some images carry more than one disease label (see below). Finally, the dataset also includes comma-separated files of patient demographics and image bounding boxes.
Link to dataset$^{17}$: https://www.kaggle.com/nih-chest-xrays/data?select=Data_Entry_2017.csv
The 15 different classifications in the dataset are as follows:
Atelectasis
Consolidation
Infiltration
Pneumothorax
Edema
Emphysema
Fibrosis
Effusion
Pneumonia
Pleural Thickening
Cardiomegaly
Nodule
Mass
Hernia
No Finding
One or more of the disease labels above (multi-label images)
Together, these labels yield a total of 140 distinct classification combinations among the images in our 5,606-image sub-sample, as discussed later.
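As a quick sanity check on this count (a hedged sketch: it assumes the labels csv has already been read into the dataframe df, which is not done until the Dataset Overview & Cleaning code below), the number of distinct label combinations can be counted directly:

```python
# Assumes df has been loaded from sample_labels.csv as shown later in this notebook
print(df["Finding Labels"].nunique())  # expected to print 140 for the 5,606-image sub-sample
```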
The following sub-section includes a step-by-step overview of how the dataset was loaded into the Google Colab environment, cleaned, and prepared for exploratory data analysis.
The following code chunk outlines the numerous packages, libraries, and modules that were imported to the notebook for the analysis to be conducted in Python. The tools are listed below:
import pandas as pd
import numpy as np
from os import listdir, walk
from os.path import isfile, join
import itertools
import sys,requests
import plotly.express as px
import plotly.graph_objects as go
import plotly.figure_factory as ff
from plotly.offline import iplot, init_notebook_mode
import matplotlib.pyplot as plt
import seaborn as sns
import math
import cv2
import pickle
from collections import Counter
from pylab import rcParams
#USE IF CONVERTING TO HTML
# import plotly.offline as py
# py.init_notebook_mode(connected=False)
from sklearn.feature_selection import chi2
from sklearn.preprocessing import OneHotEncoder
from sklearn.preprocessing import LabelEncoder
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.model_selection import cross_val_score
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV
from sklearn.inspection import permutation_importance
from sklearn.metrics import classification_report, confusion_matrix
import sklearn.metrics as skmet
pd.set_option('mode.chained_assignment', None)
import keras
import keras.utils
from keras.utils import to_categorical
from keras.models import Model
from keras.applications.resnet50 import ResNet50
from keras.applications.vgg16 import VGG16
from keras.preprocessing import image
from keras.preprocessing.image import load_img, ImageDataGenerator
from keras.applications.resnet50 import preprocess_input, decode_predictions
from keras.callbacks import Callback, EarlyStopping, ReduceLROnPlateau, ModelCheckpoint
from sklearn.utils import class_weight
from keras.layers.convolutional import Conv2D, MaxPooling2D
from keras.models import Sequential
from keras.layers import Dense, Activation, Flatten, Dropout
from keras.optimizers import SGD
from keras import utils as np_utils
import tensorflow as tf
# Custom Keras callback that appends each epoch's logged metrics to a dictionary
# and saves that dictionary to disk at the end of every epoch.
class history(Callback):
    def __init__(self, savepath):
        super(history, self).__init__()
        self.savepath = savepath
        self.history = {}

    def on_epoch_end(self, epoch, logs=None):
        for k, v in logs.items():
            self.history.setdefault(k, []).append(v)
        np.save(self.savepath, self.history)
The following section mounts the drive to the correct directory and imports the images from the dataset of interest into the Colab environment.
The following code chunk contains JavaScript code for preventing Colab's runtime from disconnecting after the session is left idle. Due to the length of our analysis, this script must be run in the browser's console so that the CNN models can be trained on the images without interruption.
''' function ConnectButton(){
console.log("Connect pushed");
document.querySelector("#top-toolbar > colab-connect-button").shadowRoot.querySelector("#connect").click()
}
setInterval(ConnectButton,300000); '''
The following code chunk mounts the Notebook's drive to the correct directory necessary in order to import the images being utilized to train the CNN models in subsequent analysis.
# Load the Drive helper and mount
from google.colab import drive
drive.mount('/content/drive')
#!ls "/content/drive/MyDrive/DS_4002_JTerm2021/Week Two/Code/data/sample_images"
The following code chunk standardizes the paths of the image folder, the label file, and the pickled arrays for the imaging dataset. It also defines a 'weight_path' variable, which will be utilized when fitting the pretrained models from the Keras API (VGG, ResNet, & DenseNet) to the imaging data.
folder_path = "/content/drive/MyDrive/DS_4002_JTerm2021/Week Two/Code/data/sample_images/images"
file_path = "/content/drive/MyDrive/DS_4002_JTerm2021/Week Two/Code/data/sample_images/sample_labels.csv"
folder_path_append = "/content/drive/MyDrive/DS_4002_JTerm2021/Week Two/Code/data/sample_images/images/"
pkl_path = '/content/drive/MyDrive/DS_4002_JTerm2021/Week Two/Code/img_arrays.pkl'
pkl_pathx = '/content/drive/MyDrive/DS_4002_JTerm2021/Week Two/Code/img_list_X.pkl'
pkl_pathy = '/content/drive/MyDrive/DS_4002_JTerm2021/Week Two/Code/img_list_Y.pkl'
weight_path = '/content/drive/MyDrive/DS_4002_JTerm2021/Week Two/Code/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5'
#Weight source: https://github.com/zdata-inc/applied_deep_learning/blob/master/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
recode_classes={0: 'No Finding', 1: 'Single Finding', 2: 'Multiple Findings'}
recoded_list = list(recode_classes.values())
def getAllFilesInDirectory(directoryPath: str):
return [(directoryPath + "/" + f) for f in listdir(directoryPath) if isfile(join(directoryPath, f))]
Now that the relevant Python tools are imported and we are in the directory that our images are stored in, let us have a look at the dataset.
The following code chunk reads the csv file associated with the 5,606-image sub-sample that we will be utilizing to train the CNNs into the Google Colab environment.
df = pd.read_csv(file_path)
df.head(5)
| | Image Index | Finding Labels | Follow-up # | Patient ID | Patient Age | Patient Gender | View Position | OriginalImageWidth | OriginalImageHeight | OriginalImagePixelSpacing_x | OriginalImagePixelSpacing_y |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 00000013_005.png | Emphysema|Infiltration|Pleural_Thickening|Pneu... | 5 | 13 | 060Y | M | AP | 3056 | 2544 | 0.139 | 0.139 |
| 1 | 00000013_026.png | Cardiomegaly|Emphysema | 26 | 13 | 057Y | M | AP | 2500 | 2048 | 0.168 | 0.168 |
| 2 | 00000017_001.png | No Finding | 1 | 17 | 077Y | M | AP | 2500 | 2048 | 0.168 | 0.168 |
| 3 | 00000030_001.png | Atelectasis | 1 | 30 | 079Y | M | PA | 2992 | 2991 | 0.143 | 0.143 |
| 4 | 00000032_001.png | Cardiomegaly|Edema|Effusion | 1 | 32 | 055Y | F | AP | 2500 | 2048 | 0.168 | 0.168 |
As shown in the output above, the csv contains the image index, the finding labels assigned to that image, and then demographic information about the patients as well as the size and pixel spacing of the original images. We can see right off the bat that the images were resized down to 1024x1024 pixels from larger original sizes, and that the classifications can contain multiple findings and be quite complex.
The following section pre-processes the csv file in order to prepare for our exploratory data analysis.
The following section performs several pre-processing steps on the dataset. First, the dataframe is categorized using the following map: No Finding --> 0, One Finding --> 1, and Multiple Findings --> 2. This is necessary for subsequent analysis of the three classes and for training the CNN. Another dataframe is also created that retains only the 7 most frequent classifications found in the csv file. These alterations to the original dataframe are part of pre-processing and preparation for exploratory data analysis, so that sufficient hypotheses pertaining to the dataset and the general research question can be constructed.
df["path"] = folder_path_append+df["Image Index"]
print(df.shape)
categorized_df = df.copy()
categorized_df["Class"] = ""
categorized_df["Class"][(categorized_df["Finding Labels"] == "No Finding")] = 0
categorized_df["Class"][categorized_df["Finding Labels"].str.contains("|", regex=False)] = 2
categorized_df["Class"][(categorized_df["Class"] != 0) & (categorized_df["Class"] != 2)] = 1
categorized_df["Class"] = categorized_df["Class"].astype('int32')
categorized_df
findings = ["No Finding", "Infiltration", "Effusion", "Atelectasis", "Nodule", "Pneumothorax", "Mass"]
is_finding = categorized_df["Finding Labels"].isin(findings)
target_findings = categorized_df[is_finding].reset_index()
recoded_df = categorized_df[['Finding Labels', 'path', 'Class']]
print(recoded_df.shape)
categorized_df.head(1)
(5606, 12)
(5606, 3)
| | Image Index | Finding Labels | Follow-up # | Patient ID | Patient Age | Patient Gender | View Position | OriginalImageWidth | OriginalImageHeight | OriginalImagePixelSpacing_x | OriginalImagePixelSpacing_y | path | Class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 00000013_005.png | Emphysema|Infiltration|Pleural_Thickening|Pneu... | 5 | 13 | 060Y | M | AP | 3056 | 2544 | 0.139 | 0.139 | /content/drive/MyDrive/DS_4002_JTerm2021/Week ... | 2 |
The following code chunk builds a small dataframe ('target_data') containing the 7 most frequent finding labels and their counts, so that subsequent visualizations can be constructed.
target_data = df["Finding Labels"].value_counts().rename_axis('Finding').reset_index(name='Count').head(7)
target_data
| | Finding | Count |
|---|---|---|
| 0 | No Finding | 3044 |
| 1 | Infiltration | 503 |
| 2 | Effusion | 203 |
| 3 | Atelectasis | 192 |
| 4 | Nodule | 144 |
| 5 | Pneumothorax | 114 |
| 6 | Mass | 99 |
The following sub-section performs exploratory data analysis on both the csv file associated with the images and the images themselves. This is conducted in order to finalize the hypotheses of interest when answering the general research question. At the end of the sub-section, the hypotheses of our investigation are listed.
In order to approach our general research question and construct hypotheses, we will pose key questions to guide our exploratory data analysis of the imaging dataset we will be utilizing. They are as follows:
all_finding_labels = df.groupby(["Finding Labels"]).size().reset_index(name='Count')
fig = px.bar(all_finding_labels,
x="Finding Labels",
y="Count",
color_discrete_sequence=px.colors.qualitative.D3,
title='Distribution of all Finding Labels'
)
fig.show()
all_finding_labels = target_findings.groupby(["Finding Labels"]).size().reset_index(name='Count')
fig = px.bar(all_finding_labels,
x="Finding Labels",
y="Count",
color_discrete_sequence=px.colors.qualitative.D3,
title='Distribution of Filtered Findings Labels'
)
fig.show()
all_finding_labels = target_findings.groupby(["Finding Labels"]).size().reset_index(name='Count')
fig = px.pie(all_finding_labels,
values="Count",
names = "Finding Labels",
color_discrete_sequence=px.colors.qualitative.D3,
title='Distribution of Filtered Findings Labels'
)
fig.show()
binary_df = df.copy()
binary_df["Finding Labels"][binary_df["Finding Labels"] != "No Finding"] = "Finding"
binary_df["Finding Labels"].value_counts().rename_axis('Finding').reset_index(name='Count')
num_normal = (binary_df["Finding Labels"] == "No Finding").sum()
num_abnormal = (binary_df["Finding Labels"] == "Finding").sum()
layout = go.Layout(title='Original Base Rates of Target Classes (Findings)')
fig = go.Figure(data=[go.Pie(labels=["No Finding", "Finding"],
values=[num_normal, num_abnormal],
pull=[0, 0.1],
)], layout=layout)
fig.update_traces(hoverinfo='label+value', textinfo='percent', textfont_size=20,
marker=dict(line=dict(color='#000000', width=2)))
fig.show()
tertiary_df = df.copy()
tertiary_df["Finding Labels"][tertiary_df["Finding Labels"].str.contains("|", regex=False)] = "Multiple Findings"
tertiary_df["Finding Labels"][(tertiary_df["Finding Labels"] != "No Finding") & (tertiary_df["Finding Labels"] != "Multiple Findings")] = "One Finding"
tertiary_df["Finding Labels"].value_counts().rename_axis('Finding').reset_index(name='Count')
num_nofinding = (tertiary_df["Finding Labels"] == "No Finding").sum()
num_onefinding = (tertiary_df["Finding Labels"] == "One Finding").sum()
num_mulfinding = (tertiary_df["Finding Labels"] == "Multiple Findings").sum()
layout = go.Layout(title='Original Base Rates of Target Classes')
fig = go.Figure(data=[go.Pie(labels=["No Finding", "One Finding", "Multiple Findings"],
values=[num_nofinding, num_onefinding, num_mulfinding],
pull=[0.1, 0, 0],
)], layout=layout)
fig.update_traces(hoverinfo='label+value', textinfo='percent', textfont_size=20,
marker=dict(line=dict(color='#000000', width=2)))
fig.show()
num_no_findings = (df['Finding Labels'] == "No Finding").sum()
num_keep = (target_findings["Finding Labels"] != "No Finding").sum()
num_drop = 5606 - 4299  # 4,299 of the 5,606 images remain after filtering to the seven most frequent single labels
layout = go.Layout(title='Dropped vs. Retained Data')
fig = go.Figure(data=[go.Pie(labels=["Kept with No Findings", "Kept with Findings", "Dropped"],
values=[num_no_findings, num_keep, num_drop],
pull=[0, 0, 0.1],
)], layout=layout)
fig.update_traces(hoverinfo='label+value', textinfo='percent', textfont_size=20,
marker=dict(line=dict(color='#000000', width=2)))
fig.show()
tertiary_df = target_findings.copy()
tertiary_df["Finding Labels"][tertiary_df["Finding Labels"].str.contains("|", regex=False)] = "Multiple Findings"
tertiary_df["Finding Labels"][(tertiary_df["Finding Labels"] != "No Finding") & (tertiary_df["Finding Labels"] != "Multiple Findings")] = "One Finding"
tertiary_df["Finding Labels"].value_counts().rename_axis('Finding').reset_index(name='Count')
num_nofinding = (tertiary_df["Finding Labels"] == "No Finding").sum()
num_onefinding = (tertiary_df["Finding Labels"] == "One Finding").sum()
num_mulfinding = (tertiary_df["Finding Labels"] == "Multiple Findings").sum()
layout = go.Layout(title='Filtered Base Rates of Target Classes')
fig = go.Figure(data=[go.Pie(labels=["No Finding", "One Finding", "Multiple Findings"],
values=[num_nofinding, num_onefinding, num_mulfinding],
pull=[0.1, 0, 0],
)], layout=layout)
fig.update_traces(hoverinfo='label+value', textinfo='percent', textfont_size=20,
marker=dict(line=dict(color='#000000', width=2)))
fig.show()
comp_g = target_findings.groupby(["Finding Labels", "Patient Gender"]).size().reset_index(name='Count')
fig = px.bar(comp_g,
x="Finding Labels",
y="Count",
color="Patient Gender",
barmode='group',
color_discrete_sequence=px.colors.qualitative.D3,
title='Distribution of Filtered Findings Labels by Gender')
fig.show()
plotted_cv2 = cv2.resize(np.array(cv2.imread(recoded_df_sample.path[0], cv2.IMREAD_GRAYSCALE)), (128,128))  # read as a single-channel grayscale image
plt.imshow(plotted_cv2, cmap = 'gray', interpolation = 'bicubic')
print(plotted_cv2.shape)
(128, 128)
plotted_cv2 = cv2.resize(np.array(cv2.imread(recoded_df_sample.path[0])), (128,128))
plt.imshow(plotted_cv2, cmap = 'gray', interpolation = 'bicubic')
print(plotted_cv2.shape)
(128, 128, 3)
def plot_single(image_path):
path = image_path
print(image_path)
image_id = path.split('/')[10]
row = df[df['Image Index'] == image_id].values[0]
label = str(row[1])
img = image.load_img(image_path, target_size=(200, 200))
plt.title(image_id+"\n"+label)
plt.imshow(img)
plt.axis('off')
plt.show()
def plot_index(index, DataFrame):
row = DataFrame.iloc[index].values
image_id = str(row[0])
label = str(row[1])
image_path = str(row[11])
print(image_path)
img = image.load_img(image_path, target_size=(200, 200))
plt.title(image_id+"\n"+label)
plt.imshow(img)
plt.axis('off')
plt.show()
img1 = cv2.imread(image_path,0)
plt.hist(img1.ravel(),256,[0,256])
plt.title("Pixel intensity vs. number of pixels")
plt.show()
histg = cv2.calcHist([img1],[0],None,[256],[0,256])
plt.plot(histg)
plt.title("Pixel intensity vs. number of pixels")
plt.show()
plot_index(10, df)
/content/drive/MyDrive/DS_4002_JTerm2021/Week Two/Code/data/sample_images/images/00000061_025.png
plot_index(41, df)
/content/drive/MyDrive/DS_4002_JTerm2021/Week Two/Code/data/sample_images/images/00000243_001.png
def plot_history(history):
train_loss_history = history.history['loss']
validation_loss_history = history.history['val_loss']
train_acc_history = history.history['accuracy']
validation_acc_history = history.history['val_accuracy']
plt.plot(train_loss_history, '-ob')
plt.plot(validation_loss_history, '-or')
plt.xlabel("Epoch (count)")
plt.ylabel("Loss")
plt.legend(["Training", "Validation"])
plt.title("Training and Validation Losses as a Function of the Number of Epochs")
ax = plt.axes()
ax.grid(False)
ax.set_facecolor('white')
plt.show()
plt.show()
print("\n")
plt.plot(train_acc_history, '-ob')
plt.plot(validation_acc_history, '-or')
plt.xlabel("Epoch (count)")
plt.ylabel("Accuracy (%)")
plt.legend(["Training", "validation"])
plt.title("Training and Validation Accuracies as a Function of the Number of Epochs")
ax = plt.axes()
ax.grid(False)
ax.set_facecolor('white')
plt.show()
plt.show()
def evaluate_model(predictions, y_test):
num_correct = 0
#Confusion Matrix Set
confusion_matrix = np.zeros((3, 3), dtype=int)
for i in range(len(predictions)):
if predictions[i] == y_test[i]:
num_correct += 1
confusion_matrix[predictions[i]][y_test[i]] += 1
accuracy = (float(num_correct)) / (float(len(predictions)))
return accuracy, confusion_matrix
def plot_confusion(confusion_matrix):
confusion_labels = ['No Finding', '1 Finding', '1+ Findings']
sns.set()
fig, ax = plt.subplots(figsize=(8,6))
ax = sns.heatmap(confusion_matrix, annot=True, fmt='d', square=True, ax=ax, annot_kws={"fontsize":16},
linecolor="black", linewidth=0.1, xticklabels=confusion_labels, yticklabels=confusion_labels, cmap="rocket", cbar_kws={'label':'Count'})
plt.setp(ax.get_xticklabels(), fontsize=16, va='center', ha='center')
plt.setp(ax.get_yticklabels(), fontsize=16, va='center', ha='center')
plt.ylabel('Predicted', fontsize=18)
plt.xlabel('Actual', fontsize=18)
ax.set_title("Confusion Matrix", fontsize=24)
fig.tight_layout()
plt.show()
def getTrueFalsePosNeg(confusion_matrix, num):
if num == 0:
TP = float(confusion_matrix[0][0])
TN = float(confusion_matrix[1][1]+confusion_matrix[1][2]+confusion_matrix[2][1]+confusion_matrix[2][2])
FP = float(confusion_matrix[0][1]+confusion_matrix[0][2])
FN = float(confusion_matrix[1][0]+confusion_matrix[2][0])
elif num == 1:
TP = float(confusion_matrix[1][1])
TN = float(confusion_matrix[0][0]+confusion_matrix[0][2]+confusion_matrix[2][0]+confusion_matrix[2][2])
FP = float(confusion_matrix[1][0]+confusion_matrix[1][2])
FN = float(confusion_matrix[0][1]+confusion_matrix[2][1])
else :
TP = float(confusion_matrix[2][2])
TN = float(confusion_matrix[0][0]+confusion_matrix[0][1]+confusion_matrix[1][0]+confusion_matrix[1][1])
FP = float(confusion_matrix[2][0]+confusion_matrix[2][1])
FN = float(confusion_matrix[0][2]+confusion_matrix[1][2])
return TP, TN, FP, FN
def summaryStatistics(TP, TN, FP, FN):
if TP+TN+FP+FN == 0:
acc="---"
else:
acc = round(((TP+TN) / (TP+TN+FP+FN)), 2)
if TP+FP == 0:
pre="---"
else:
pre = round(((TP) / (TP+FP)), 2)
if TP+FN == 0:
rec="---"
else:
rec = round(((TP) / (TP+FN)), 2)
if TN+FP == 0:
spe="---"
else:
spe = round(1 - ((FP) / (TN+FP)), 2)
if rec == "---" or pre == "---":
f1="---"
else:
f1 = round((((2)*(rec)*(pre)) / (rec+pre)), 2)
return [acc, pre, rec, spe, f1]
def tableResults(none_stats, one_stats, mul_stats):
cellColor='lightskyblue'
headerColor='dodgerblue'
theTable = plt.table(
cellText=[
none_stats,
one_stats,
mul_stats
],
cellColours=[
[cellColor, cellColor, cellColor, cellColor, cellColor],
[cellColor, cellColor, cellColor, cellColor, cellColor],
[cellColor, cellColor, cellColor, cellColor, cellColor]
],
cellLoc='center',
rowLabels=['NONE', 'ONE', 'MUL'],
rowColours=[headerColor, headerColor, headerColor],
rowLoc='center',
colLabels=['ACCURACY', 'PRECISION', 'RECALL', 'SPECIFICITY', 'F1-Score'],
colColours=[headerColor, headerColor, headerColor, headerColor, headerColor],
colLoc='center',
loc='center'
)
theTable.auto_set_font_size(False)
theTable.set_fontsize(16)
theTable.scale(2, 2)
ax=plt.gca()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.box(on=None)
plt.show()
def evaluate_binary_model(predictions, y_test):
num_correct = 0
#Confusion Matrix Set
confusion_matrix = np.zeros((2, 2), dtype=int)
for i in range(len(predictions)):
if predictions[i] == y_test[i]:
num_correct += 1
confusion_matrix[predictions[i]][y_test[i]] += 1
accuracy = (float(num_correct)) / (float(len(predictions)))
return accuracy, confusion_matrix
def plot_binary_confusion(confusion_matrix):
confusion_labels = ['No Finding', 'Finding']
sns.set()
fig, ax = plt.subplots(figsize=(8,6))
ax = sns.heatmap(confusion_matrix, annot=True, fmt='d', square=True, ax=ax, annot_kws={"fontsize":16},
linecolor="black", linewidth=0.1, xticklabels=confusion_labels, yticklabels=confusion_labels, cmap="rocket", cbar_kws={'label':'Count'})
plt.setp(ax.get_xticklabels(), fontsize=16, va='center', ha='center')
plt.setp(ax.get_yticklabels(), fontsize=16, va='center', ha='center')
plt.ylabel('Predicted', fontsize=18)
plt.xlabel('Actual', fontsize=18)
ax.set_title("Binary Confusion Matrix", fontsize=24)
fig.tight_layout()
plt.show()
def binary_summary_Statistics(confusion_matrix):
TP = float(confusion_matrix[1][1])
TN = float(confusion_matrix[0][0])
FP = float(confusion_matrix[1][0])
FN = float(confusion_matrix[0][1])
if TP+TN+FP+FN == 0:
acc="---"
else:
acc = round(((TP+TN) / (TP+TN+FP+FN)), 2)
if TP+FP == 0:
pre="---"
else:
pre = round(((TP) / (TP+FP)), 2)
if TP+FN == 0:
rec="---"
else:
rec = round(((TP) / (TP+FN)), 2)
if TN+FP == 0:
spe="---"
else:
spe = round(1 - ((FP) / (TN+FP)), 2)
if rec == "---" or pre == "---":
f1="---"
else:
f1 = round((((2)*(rec)*(pre)) / (rec+pre)), 2)
return [acc, pre, rec, spe, f1]
def binary_tableResults(stats):
cellColor='lightskyblue'
headerColor='dodgerblue'
theTable = plt.table(
cellText=[
stats,
],
cellColours=[
[cellColor, cellColor, cellColor, cellColor, cellColor],
],
cellLoc='center',
rowLabels=['Finding'],
rowColours=[headerColor],
rowLoc='center',
colLabels=['ACCURACY', 'PRECISION', 'RECALL', 'SPECIFICITY', 'F1-Score'],
colColours=[headerColor, headerColor, headerColor, headerColor, headerColor],
colLoc='center',
loc='center'
)
theTable.auto_set_font_size(False)
theTable.set_fontsize(16)
theTable.scale(2, 2)
ax=plt.gca()
ax.get_xaxis().set_visible(False)
ax.get_yaxis().set_visible(False)
plt.box(on=None)
plt.show()
def plot_history_binary(history):
print(history.history.keys())
train_loss_history = history.history['loss']
validation_loss_history = history.history['val_loss']
train_acc_history = history.history['binary_accuracy']
validation_acc_history = history.history['val_binary_accuracy']
plt.plot(train_loss_history, '-ob')
plt.plot(validation_loss_history, '-or')
plt.xlabel("Epoch (count)")
plt.ylabel("Loss")
plt.legend(["Training", "Validation"])
plt.title("Training and Validation Losses as a Function of the Number of Epochs")
ax = plt.axes()
ax.grid(False)
ax.set_facecolor('white')
plt.show()
print("\n")
plt.plot(train_acc_history, '-ob')
plt.plot(validation_acc_history, '-or')
plt.xlabel("Epoch (count)")
plt.ylabel("Accuracy (%)")
plt.legend(["Training", "validation"])
plt.title("Training and Validation Accuracies as a Function of the Number of Epochs")
ax = plt.axes()
ax.grid(False)
ax.set_facecolor('white')
plt.show()
def image_to_array(df, img_width, img_height):
i = 0
target_df = pd.DataFrame()
img_array, labels = [], []
for index, row in df.iterrows():
figure = cv2.resize(np.array(cv2.imread(row[1])), (img_width,img_height))
img_array.append(figure)
labels.append(row[2])
i += 1
target_df["images"] = img_array
target_df["labels"] = labels
return target_df
target_df = image_to_array(recoded_df_sample, 128, 128)
#picklefile = open(pkl_path, 'wb')
#pickle.dump(target_df, picklefile)
#picklefile.close()
print(target_df.shape)
print(target_df.images[0].shape)
(5606, 2)
(128, 128, 3)
recoded_df_sample = recoded_df.head(30)
def image_to_array_list(df, img_width, img_height):
i = 0
img_array, labels = [], []
for index, row in df.iterrows():
figure = cv2.resize(np.array(cv2.imread(row[1])), (img_width,img_height))
img_array.append(figure)
labels.append(row[2])
i += 1
return img_array, labels
X_list, y_list = image_to_array_list(recoded_df, 128, 128)
picklefile = open(pkl_pathx, 'wb')
pickle.dump(X_list, picklefile)
picklefile.close()
picklefile = open(pkl_pathy, 'wb')
pickle.dump(y_list, picklefile)
picklefile.close()
picklefile = open(pkl_path, 'rb')
target_df = pickle.load(picklefile)
target_df.shape
(5606, 2)
target_df.labels.value_counts()
0    3044
1    1582
2     980
Name: labels, dtype: int64
target_df.columns
Index(['images', 'labels'], dtype='object')
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
X,y = np.asarray(X_list), np.asarray(y_list)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
X_train = X_train[:80]
y_train = y_train[:80]
X_test = X_test[:20]
y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 3)
y_test_OHE = to_categorical(y_test, num_classes = 3)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened arrays back to (num_images, 128, 128, 3) for the CNN input
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 4484 and X_test is 1122
Y_train is 4484 and Y_test is 1122
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 4484 and X_test is 1122
y_train is 4484 and y_test is 1122
X_train final shape is:  (4484, 49152)
X_train final shape is:  (4484, 128, 128, 3)
print(y_train_OHE.shape)
#y_train_OHE
print(y_train.shape)
#y_train
(4484, 3)
(4484,)
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(3)}
pd.DataFrame(data=weight_data)
| | labels | weight |
|---|---|---|
| 0 | 0 | 0.613885 |
| 1 | 1 | 1.181205 |
| 2 | 2 | 1.906803 |
def model(X_train,y_train,X_test,y_test,input_weights):
num_class = 3
epochs = 100
base_model = VGG16(weights = weight_path, include_top=False, input_shape=(128, 128, 3))
# Add a new top layer
x = base_model.output
x = Flatten()(x)
# Softmax
prediction_layer = Dense(num_class, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=prediction_layer)
# Do not train base layers; only top
for layer in base_model.layers:
layer.trainable = False
# Compiler = categorical_crossentropy
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_acc', patience=3, verbose=1)]
# model.summary()
# Fitting model
b_history = model.fit(X_train,y_train, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), verbose=1,callbacks = [history('metrics')])
#b_history = model.fit(X_train,y_train, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), verbose=1)
modelScore = model.evaluate(X_test,y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Printing confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
#print(true_predictions)
#print(predictions)
return model, b_history, predictions, true_predictions
model_output = model(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/100 - 141/141 - 16s 56ms/step - loss: 7.7445 - accuracy: 0.3655 - val_loss: 5.6906 - val_accuracy: 0.4046
Epoch 2/100 - 141/141 - 7s 49ms/step - loss: 4.8818 - accuracy: 0.4286 - val_loss: 5.3383 - val_accuracy: 0.3939
...
Epoch 97/100 - 141/141 - 7s 51ms/step - loss: 0.0023 - accuracy: 1.0000 - val_loss: 4.6615 - val_accuracy: 0.4661
Epoch 98/100 - 141/141 - 7s 51ms/step - loss: 0.0023 - accuracy: 1.0000 - val_loss: 4.7097 - val_accuracy: 0.4581
(training log truncated; training accuracy converges to 1.0000 while validation accuracy plateaus around 0.44-0.48)
99/100 141/141 [==============================] - 7s 51ms/step - loss: 0.0022 - accuracy: 1.0000 - val_loss: 4.7144 - val_accuracy: 0.4581 Epoch 100/100 141/141 [==============================] - 7s 51ms/step - loss: 0.0022 - accuracy: 1.0000 - val_loss: 4.6841 - val_accuracy: 0.4599 Batch Loss: 4.684112071990967 Accuracy: 0.45989304780960083
plot_history(model_output[1])
test_accuracy, conf_matrix = evaluate_model(model_output[2], model_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", test_accuracy)
plot_confusion(conf_matrix)
Testing Accuracy: 0.45989304812834225
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| None Finding | 0.59 | 0.64 | 0.58 | 0.6 | 0.61 |
| One Finding | 0.6 | 0.27 | 0.33 | 0.7 | 0.3 |
| Multiple Finding | 0.73 | 0.31 | 0.31 | 0.83 | 0.31 |
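For reference, the per-class counts and summary statistics reported above follow the standard one-vs-rest reading of a multiclass confusion matrix. A minimal sketch of that computation is below (rows are assumed to be true labels and columns predicted labels; the notebook's own getTrueFalsePosNeg and summaryStatistics helpers are defined earlier and may differ in detail):

```python
import numpy as np

def one_vs_rest_counts(conf_matrix, class_idx):
    # TP, FP, FN, TN for one class, treating all other classes as "negative"
    cm = np.asarray(conf_matrix)
    tp = cm[class_idx, class_idx]
    fp = cm[:, class_idx].sum() - tp   # predicted as this class, actually another
    fn = cm[class_idx, :].sum() - tp   # actually this class, predicted as another
    tn = cm.sum() - tp - fp - fn
    return tp, fp, fn, tn

def one_vs_rest_stats(tp, fp, fn, tn):
    # Accuracy, precision, recall, specificity, and F-1 from the one-vs-rest counts
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, recall, specificity, f1
```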
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df = pd.DataFrame(y, columns=['label'])
zero_indices = sampling_df[sampling_df["label"] == 0].index.tolist()[:980]
print(len(zero_indices))
one_indices = sampling_df[sampling_df["label"] == 1].index.tolist()[:980]
print(len(one_indices))
mult_indices = sampling_df[sampling_df["label"] == 2].index.tolist()[:980]
print(len(mult_indices))
print(len(zero_indices)+len(one_indices)+len(mult_indices))
sampled_indices = zero_indices + one_indices + mult_indices
len(sampled_indices)
980 980 980 2940
2940
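The sampling cell above balances the three classes by keeping the first 980 indices of each label. A minimal alternative sketch, not used in this notebook, that draws a random balanced subsample instead (it assumes pandas >= 1.1 for DataFrame.groupby(...).sample(...)):

```python
import numpy as np
import pandas as pd

def balanced_undersample_indices(labels, per_class=980, seed=42):
    # Randomly keep `per_class` examples of each label instead of the first N per class
    df = pd.DataFrame({"label": labels})
    sampled = df.groupby("label").sample(n=per_class, random_state=seed)
    return sampled.index.to_numpy()

# Hypothetical usage with the arrays loaded above:
# idx = balanced_undersample_indices(y, per_class=980)
# X_bal, y_bal = X[idx], y[idx]
```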
X = np.asarray([X[i] for i in sampled_indices])
y = np.asarray([y[i] for i in sampled_indices])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
    X_train = X_train[:80]
    y_train = y_train[:80]
    X_test = X_test[:20]
    y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 3)
y_test_OHE = to_categorical(y_test, num_classes = 3)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened arrays back into (n, 128, 128, 3) image tensors
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 2352 and X_test is 588
Y_train is 2352 and Y_test is 588
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 2352 and X_test is 588
y_train is 2352 and y_test is 588
X_train final shape is:  (2352, 49152)
X_train final shape is:  (2352, 128, 128, 3)
print(y_train_OHE.shape)
#y_train_OHE
print(y_train.shape)
#y_train
(2352, 3) (2352,)
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(3)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|---|---|
| 0 | 0 | 1.0 |
| 1 | 1 | 1.0 |
| 2 | 2 | 1.0 |
def model(X_train, y_train, X_test, y_test, input_weights):
    num_class = 3
    epochs = 100
    base_model = VGG16(weights=weight_path, include_top=False, input_shape=(128, 128, 3))
    # Add a new top layer
    x = base_model.output
    x = Flatten()(x)
    # Softmax output over the three classes
    prediction_layer = Dense(num_class, activation='softmax')(x)
    model = Model(inputs=base_model.input, outputs=prediction_layer)
    # Do not train base layers; only the new top
    for layer in base_model.layers:
        layer.trainable = False
    # Compile with categorical cross-entropy loss
    model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['accuracy'])
    callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=3, verbose=1)]  # defined but not passed to fit() below
    # model.summary()
    # Fit the model; only the custom history callback is attached
    b_history = model.fit(X_train, y_train, epochs=epochs, class_weight=input_weights, validation_data=(X_test, y_test), verbose=1, callbacks=[history('metrics')])
    #b_history = model.fit(X_train, y_train, epochs=epochs, class_weight=input_weights, validation_data=(X_test, y_test), verbose=1)
    modelScore = model.evaluate(X_test, y_test, verbose=0)
    # Print evaluation metrics
    print(f'Batch Loss: {modelScore[0]}')
    print(f'Accuracy: {modelScore[1]}')
    # Class predictions for the confusion matrix
    y_pred = model.predict(X_test)
    true_predictions = np.argmax(y_test, axis=1)
    predictions = np.argmax(y_pred, axis=1)
    #print(true_predictions)
    #print(predictions)
    return model, b_history, predictions, true_predictions
model_output = model(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Training log (epochs 1-100, 74 steps/epoch, ~4 s/epoch): training loss fell from 8.83 to 0.0019 and training accuracy rose from 0.34 to 1.00, while validation loss only improved from 5.79 to about 4.67 and validation accuracy stayed between roughly 0.41 and 0.44.
Batch Loss: 4.669287204742432
Accuracy: 0.42006802558898926
plot_history(model_output[1])
test_accuracy, conf_matrix = evaluate_model(model_output[2], model_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", test_accuracy)
plot_confusion(conf_matrix)
Testing Accuracy: 0.4200680272108844
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| None Finding | 0.62 | 0.45 | 0.47 | 0.7 | 0.46 |
| One Finding | 0.57 | 0.38 | 0.37 | 0.68 | 0.37 |
| Multiple Finding | 0.65 | 0.43 | 0.42 | 0.75 | 0.42 |
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
# Collapse label 2 (multiple findings) into label 1 so the task becomes binary: finding vs. no finding
for n, i in enumerate(y_list):
    if i == 2:
        y_list[n] = 1
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
    X_train = X_train[:80]
    y_train = y_train[:80]
    X_test = X_test[:20]
    y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 2)
y_test_OHE = to_categorical(y_test, num_classes = 2)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened arrays back into (n, 128, 128, 3) image tensors
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 4484 and X_test is 1122
Y_train is 4484 and Y_test is 1122
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 4484 and X_test is 1122
y_train is 4484 and y_test is 1122
X_train final shape is:  (4484, 49152)
X_train final shape is:  (4484, 128, 128, 3)
print(y_train_OHE.shape)
#y_train_OHE
print(y_train.shape)
#y_train
(4484, 2) (4484,)
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(2)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|---|---|
| 0 | 0 | 0.920828 |
| 1 | 1 | 1.094067 |
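For reference, scikit-learn's 'balanced' mode weights each class as n_samples / (n_classes * n_samples_in_class), i.e. inversely to its frequency. A minimal sketch of the same computation:

```python
import numpy as np

def balanced_class_weights(y):
    # n_samples / (n_classes * count_per_class), matching sklearn's 'balanced' heuristic
    classes, counts = np.unique(y, return_counts=True)
    weights = len(y) / (len(classes) * counts)
    return {int(c): float(w) for c, w in zip(classes, weights)}

# With the 5606 binary labels above (about 3044 "no finding" vs 2562 "finding"),
# this gives weights of roughly 0.92 and 1.09, matching the table.
```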
def binary_model(X_train, y_train, X_test, y_test, input_weights):
    num_class = 2
    epochs = 20
    base_model = VGG16(weights=weight_path, include_top=False, input_shape=(128, 128, 3))
    # Add a new top layer
    x = base_model.output
    x = Flatten()(x)
    # Softmax output over the two classes
    prediction_layer = Dense(num_class, activation='softmax')(x)
    model = Model(inputs=base_model.input, outputs=prediction_layer)
    # Do not train base layers; only the new top
    for layer in base_model.layers:
        layer.trainable = False
    # Compile with categorical cross-entropy loss
    model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['accuracy'])
    callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=3, verbose=1)]  # defined but not passed to fit() below
    # model.summary()
    # Fit the model; only the custom history callback is attached
    b_history = model.fit(X_train, y_train, epochs=epochs, class_weight=input_weights, validation_data=(X_test, y_test), verbose=1, callbacks=[history('metrics')])
    #b_history = model.fit(X_train, y_train, epochs=epochs, class_weight=input_weights, validation_data=(X_test, y_test), verbose=1)
    modelScore = model.evaluate(X_test, y_test, verbose=0)
    # Print evaluation metrics
    print(f'Batch Loss: {modelScore[0]}')
    print(f'Accuracy: {modelScore[1]}')
    # Class predictions for the confusion matrix
    y_pred = model.predict(X_test)
    true_predictions = np.argmax(y_test, axis=1)
    predictions = np.argmax(y_pred, axis=1)
    #print(true_predictions)
    #print(predictions)
    return model, b_history, predictions, true_predictions
binary_model_output = binary_model(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Training log (epochs 1-20, 141 steps/epoch, ~7 s/epoch): training loss fell from 5.61 to 0.30 and training accuracy rose from 0.56 to 0.89, while validation loss improved from 4.01 to about 2.45 and validation accuracy stayed between roughly 0.56 and 0.60.
Batch Loss: 2.4491090774536133
Accuracy: 0.5828877091407776
plot_history(binary_model_output[1])
binary_test_accuracy, binary_conf_matrix = evaluate_binary_model(binary_model_output[2], binary_model_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", binary_test_accuracy)
plot_binary_confusion(binary_conf_matrix)
Testing Accuracy: 0.5828877005347594
binary_stats = binary_summary_Statistics(binary_conf_matrix)
print("Binary Finding Results")
print("Accuracy", binary_stats[0])
print("Precision", binary_stats[1])
print("Recall (Sensitivity)", binary_stats[2])
print("Specificity", binary_stats[3])
print("F-1 Score", binary_stats[4])
Binary Finding Results
Accuracy 0.58
Precision 0.53
Recall (Sensitivity) 0.48
Specificity 0.66
F-1 Score 0.5
binary_tableResults(binary_stats)
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
# Collapse label 2 (multiple findings) into label 1 so the task becomes binary: finding vs. no finding
for n, i in enumerate(y_list):
    if i == 2:
        y_list[n] = 1
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df_binary = pd.DataFrame(y, columns=['label'])
zero_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 0].index.tolist()[:2562]
print(len(zero_indices_binary))
one_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 1].index.tolist()[:2562]
print(len(one_indices_binary))
print(len(zero_indices_binary)+len(one_indices_binary))
sampled_indices_binary = zero_indices_binary + one_indices_binary
len(sampled_indices_binary)
2562 2562 5124
5124
X = np.asarray([X[i] for i in sampled_indices_binary])
y = np.asarray([y[i] for i in sampled_indices_binary])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
    X_train = X_train[:80]
    y_train = y_train[:80]
    X_test = X_test[:20]
    y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 2)
y_test_OHE = to_categorical(y_test, num_classes = 2)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened arrays back into (n, 128, 128, 3) image tensors
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 4099 and X_test is 1025
Y_train is 4099 and Y_test is 1025
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 4099 and X_test is 1025
y_train is 4099 and y_test is 1025
X_train final shape is:  (4099, 49152)
X_train final shape is:  (4099, 128, 128, 3)
print(y_train_OHE.shape)
#y_train_OHE
print(y_train.shape)
#y_train
(4099, 2) (4099,)
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(2)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|---|---|
| 0 | 0 | 1.0 |
| 1 | 1 | 1.0 |
binary_model_output = binary_model(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Training log (epochs 1-20, 129 steps/epoch, ~7 s/epoch): training loss fell from 5.24 to 0.34 and training accuracy rose from 0.53 to 0.86, while validation loss improved from 3.74 to about 2.49 and validation accuracy stayed between roughly 0.56 and 0.59.
Batch Loss: 2.4860992431640625
Accuracy: 0.5756097435951233
plot_history(binary_model_output[1])
binary_test_accuracy, binary_conf_matrix = evaluate_binary_model(binary_model_output[2], binary_model_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", binary_test_accuracy)
plot_binary_confusion(binary_conf_matrix)
Testing Accuracy: 0.5756097560975609
binary_stats = binary_summary_Statistics(binary_conf_matrix)
print("Binary Finding Results")
print("Accuracy", binary_stats[0])
print("Precision", binary_stats[1])
print("Recall (Sensitivity)", binary_stats[2])
print("Specificity", binary_stats[3])
print("F-1 Score", binary_stats[4])
Binary Finding Results
Accuracy 0.58
Precision 0.57
Recall (Sensitivity) 0.64
Specificity 0.51
F-1 Score 0.6
binary_tableResults(binary_stats)
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df = pd.DataFrame(y, columns=['label'])
zero_indices = sampling_df[sampling_df["label"] == 0].index.tolist()[:980]
print(len(zero_indices))
one_indices = sampling_df[sampling_df["label"] == 1].index.tolist()[:980]
print(len(one_indices))
mult_indices = sampling_df[sampling_df["label"] == 2].index.tolist()[:980]
print(len(mult_indices))
print(len(zero_indices)+len(one_indices)+len(mult_indices))
sampled_indices = zero_indices + one_indices + mult_indices
len(sampled_indices)
980 980 980 2940
2940
X = np.asarray([X[i] for i in sampled_indices])
y = np.asarray([y[i] for i in sampled_indices])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
    X_train = X_train[:80]
    y_train = y_train[:80]
    X_test = X_test[:20]
    y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 3)
y_test_OHE = to_categorical(y_test, num_classes = 3)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened arrays back into (n, 128, 128, 3) image tensors
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 2352 and X_test is 588
Y_train is 2352 and Y_test is 588
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 2352 and X_test is 588
y_train is 2352 and y_test is 588
X_train final shape is:  (2352, 49152)
X_train final shape is:  (2352, 128, 128, 3)
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(3)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|---|---|
| 0 | 0 | 1.0 |
| 1 | 1 | 1.0 |
| 2 | 2 | 1.0 |
def augmented_multiclass_model(X_train, y_train, X_test, y_test, input_weights):
    num_class = 3
    epochs = 20
    base_model = VGG16(weights=weight_path, include_top=False, input_shape=(128, 128, 3))
    # Add a new top layer
    x = base_model.output
    x = Flatten()(x)
    # Softmax output over the three classes
    prediction_layer = Dense(num_class, activation='softmax')(x)
    model = Model(inputs=base_model.input, outputs=prediction_layer)
    # Do not train base layers; only the new top
    for layer in base_model.layers:
        layer.trainable = False
    # Compile with categorical cross-entropy loss
    model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['accuracy'])
    callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_accuracy', patience=3, verbose=1)]  # defined but not passed to fit() below
    # model.summary()
    # Image Data Generator from Keras for on-the-fly augmentation
    generator = ImageDataGenerator(
        rotation_range=20,
        horizontal_flip=True,
        vertical_flip=True,
    )
    generator.fit(X_train)
    itr = generator.flow(X_train, y_train)
    # Fit the model on the augmented data
    b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test, y_test), verbose=1, callbacks=[history('metrics')])
    modelScore = model.evaluate(X_test, y_test, verbose=0)
    # Print evaluation metrics
    print(f'Batch Loss: {modelScore[0]}')
    print(f'Accuracy: {modelScore[1]}')
    # Class predictions for the confusion matrix
    y_pred = model.predict(X_test)
    true_predictions = np.argmax(y_test, axis=1)
    predictions = np.argmax(y_pred, axis=1)
    return model, b_history, predictions, true_predictions
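The generator above applies random rotations of up to 20 degrees and horizontal/vertical flips on the fly, so each epoch sees slightly different versions of the training images. As a purely illustrative sanity check, not part of the original pipeline, one augmented batch can be pulled and displayed; this sketch assumes the X_train_1d_new and y_train_OHE arrays prepared above and pixel values in the 0-255 range:

```python
import numpy as np
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.image import ImageDataGenerator  # already in scope in the notebook

# Same augmentation settings as the model above
preview_gen = ImageDataGenerator(rotation_range=20, horizontal_flip=True, vertical_flip=True)
x_batch, y_batch = next(preview_gen.flow(X_train_1d_new, y_train_OHE, batch_size=9))

# Display the nine augmented chest X-ray tiles in a 3x3 grid
fig, axes = plt.subplots(3, 3, figsize=(6, 6))
for img, ax in zip(x_batch, axes.ravel()):
    ax.imshow(np.clip(img, 0, 255).astype('uint8'))  # assumes 0-255 pixel values
    ax.axis('off')
plt.show()
```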
augmented_mc_output = augmented_multiclass_model(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Training log (epochs 1-20, 74 steps/epoch, ~10 s/epoch with augmentation): training loss fell from 10.38 to 2.60 and training accuracy rose from 0.34 to 0.54, while validation loss improved from 6.52 to 3.58 and validation accuracy stayed between roughly 0.45 and 0.48.
Batch Loss: 3.5809977054595947
Accuracy: 0.4710884392261505
plot_history(augmented_mc_output[1])
test_accuracy, conf_matrix = evaluate_model(augmented_mc_output[2], augmented_mc_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", test_accuracy)
plot_confusion(conf_matrix)
Testing Accuracy: 0.4710884353741497
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| None Finding | 0.68 | 0.5 | 0.62 | 0.71 | 0.55 |
| One Finding | 0.6 | 0.42 | 0.4 | 0.71 | 0.41 |
| Multiple Finding | 0.65 | 0.49 | 0.41 | 0.78 | 0.45 |
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
# Collapse label 2 (multiple findings) into label 1 so the task becomes binary: finding vs. no finding
for n, i in enumerate(y_list):
    if i == 2:
        y_list[n] = 1
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df_binary = pd.DataFrame(y, columns=['label'])
zero_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 0].index.tolist()[:2562]
print(len(zero_indices_binary))
one_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 1].index.tolist()[:2562]
print(len(one_indices_binary))
print(len(zero_indices_binary)+len(one_indices_binary))
sampled_indices_binary = zero_indices_binary + one_indices_binary
len(sampled_indices_binary)
2562 2562 5124
5124
X = np.asarray([X[i] for i in sampled_indices_binary])
y = np.asarray([y[i] for i in sampled_indices_binary])
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(2)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|---|---|
| 0 | 0 | 1.0 |
| 1 | 1 | 1.0 |
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
    X_train = X_train[:80]
    y_train = y_train[:80]
    X_test = X_test[:20]
    y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 2)
y_test_OHE = to_categorical(y_test, num_classes = 2)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened arrays back into (n, 128, 128, 3) image tensors
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 4099 and X_test is 1025
Y_train is 4099 and Y_test is 1025
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 4099 and X_test is 1025
y_train is 4099 and y_test is 1025
X_train final shape is:  (4099, 49152)
X_train final shape is:  (4099, 128, 128, 3)
def augmented_binary_model(X_train, y_train, X_test, y_test, input_weights):
    num_class = 2
    epochs = 20
    base_model = VGG16(weights=weight_path, include_top=False, input_shape=(128, 128, 3))
    # Add a new top layer
    x = base_model.output
    x = Flatten()(x)
    # Softmax output over the two classes
    prediction_layer = Dense(num_class, activation='softmax')(x)
    model = Model(inputs=base_model.input, outputs=prediction_layer)
    # Do not train base layers; only the new top
    for layer in base_model.layers:
        layer.trainable = False
    # Compile with binary cross-entropy loss
    model.compile(loss='binary_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['binary_accuracy'])
    callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_binary_accuracy', patience=3, verbose=1)]  # defined but not passed to fit() below
    # model.summary()
    # Image Data Generator from Keras for on-the-fly augmentation
    generator = ImageDataGenerator(
        rotation_range=20,
        horizontal_flip=True,
        vertical_flip=True,
    )
    generator.fit(X_train)
    itr = generator.flow(X_train, y_train)
    # Fit the model on the augmented data
    b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test, y_test), verbose=1, callbacks=[history('metrics')])
    modelScore = model.evaluate(X_test, y_test, verbose=0)
    # Print evaluation metrics
    print(f'Batch Loss: {modelScore[0]}')
    print(f'Accuracy: {modelScore[1]}')
    # Class predictions for the confusion matrix
    y_pred = model.predict(X_test)
    true_predictions = np.argmax(y_test, axis=1)
    predictions = np.argmax(y_pred, axis=1)
    #print(true_predictions)
    #print(predictions)
    return model, b_history, predictions, true_predictions
augmented_bin_output = augmented_binary_model(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Training log (epochs 1-20, 129 steps/epoch, ~18 s/epoch with augmentation): training loss fell from 3.72 to 1.18 and training binary accuracy rose from 0.56 to 0.65, while validation loss improved from 2.98 to about 1.54 and validation binary accuracy stayed between roughly 0.58 and 0.63.
Batch Loss: 1.5429900884628296
Accuracy: 0.5960975885391235
plot_history_binary(augmented_bin_output[1])
dict_keys(['loss', 'binary_accuracy', 'val_loss', 'val_binary_accuracy'])
binary_test_accuracy, binary_conf_matrix = evaluate_binary_model(augmented_bin_output[2], augmented_bin_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", binary_test_accuracy)
plot_binary_confusion(binary_conf_matrix)
Testing Accuracy: 0.5960975609756097
binary_stats = binary_summary_Statistics(binary_conf_matrix)
print("Binary Finding Results")
print("Accuracy", binary_stats[0])
print("Precision", binary_stats[1])
print("Recall (Sensitivity)", binary_stats[2])
print("Specificity", binary_stats[3])
print("F-1 Score", binary_stats[4])
Binary Finding Results
Accuracy 0.6
Precision 0.62
Recall (Sensitivity) 0.5
Specificity 0.69
F-1 Score 0.55
binary_tableResults(binary_stats)
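For reference, the binary metrics reported above can be recovered directly from the 2x2 confusion matrix. The sketch below is a minimal stand-in, assuming the sklearn-style layout cm[i, j] = count of samples with true class i predicted as class j; it is not necessarily the exact implementation of binary_summary_Statistics used in this notebook.
import numpy as np

def binary_metrics_from_confusion(cm):
    # Unpack counts from a 2x2 confusion matrix (rows = true class, columns = predicted class)
    tn, fp, fn, tp = cm[0, 0], cm[0, 1], cm[1, 0], cm[1, 1]
    accuracy = (tp + tn) / cm.sum()
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0        # sensitivity
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    f1 = 2 * precision * recall / (precision + recall) if (precision + recall) else 0.0
    return accuracy, precision, recall, specificity, f1

# e.g. binary_metrics_from_confusion(binary_conf_matrix)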
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df = pd.DataFrame(y, columns=['label'])
zero_indices = sampling_df[sampling_df["label"] == 0].index.tolist()[:980]
print(len(zero_indices))
one_indices = sampling_df[sampling_df["label"] == 1].index.tolist()[:980]
print(len(one_indices))
mult_indices = sampling_df[sampling_df["label"] == 2].index.tolist()[:980]
print(len(mult_indices))
print(len(zero_indices)+len(one_indices)+len(mult_indices))
sampled_indices = zero_indices + one_indices + mult_indices
len(sampled_indices)
980 980 980 2940
2940
X = np.asarray([X[i] for i in sampled_indices])
y = np.asarray([y[i] for i in sampled_indices])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
X_train = X_train[:80]
y_train = y_train[:80]
X_test = X_test[:20]
y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 3)
y_test_OHE = to_categorical(y_test, num_classes = 3)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened rows back into 128x128x3 image tensors
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train flattened shape is: ", X_train_1d.shape)
print("X_train final shape is: ", X_train_1d_new.shape)
X_train is 2352 and X_test is 588
Y_train is 2352 and Y_test is 588
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 2352 and X_test is 588
y_train is 2352 and y_test is 588
X_train flattened shape is: (2352, 49152)
X_train final shape is: (2352, 128, 128, 3)
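The load / balance / split / one-hot / reshape steps above are repeated verbatim before each model in this notebook. A consolidated helper could look like the sketch below; prepare_balanced_split is a hypothetical name, and it assumes the same pickled arrays (pkl_pathx, pkl_pathy) and 128x128x3 images used throughout.
import pickle
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from tensorflow.keras.utils import to_categorical

def prepare_balanced_split(pkl_pathx, pkl_pathy, per_class, num_classes, test_size=0.2):
    # Load the pickled image and label arrays
    with open(pkl_pathx, 'rb') as fx, open(pkl_pathy, 'rb') as fy:
        X = np.asarray(pickle.load(fx))
        y = np.asarray(pickle.load(fy))
    # Cap each label at per_class samples to balance the classes
    labels = pd.DataFrame(y, columns=['label'])
    keep = []
    for c in range(num_classes):
        keep += labels[labels['label'] == c].index.tolist()[:per_class]
    X, y = X[keep], y[keep]
    # Split, one-hot encode, and return image tensors
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=test_size)
    return (X_train.reshape(-1, 128, 128, 3),
            X_test.reshape(-1, 128, 128, 3),
            to_categorical(y_train, num_classes=num_classes),
            to_categorical(y_test, num_classes=num_classes))

# e.g. for the three-class task:
# X_train_1d_new, X_test_1d_new, y_train_OHE, y_test_OHE = \
#     prepare_balanced_split(pkl_pathx, pkl_pathy, per_class=980, num_classes=3)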
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight(class_weight='balanced', classes=np.unique(y), y=y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(3)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|--------|--------|
| 0 | 0      | 1.0    |
| 1 | 1      | 1.0    |
| 2 | 2      | 1.0    |
def augmented_multiclass_model(X_train,y_train,X_test,y_test,input_weights):
num_class = 3
epochs = 100
base_model = VGG16(weights = weight_path, include_top=False, input_shape=(128, 128, 3))
# Add a new top layer
x = base_model.output
x = Flatten()(x)
# Softmax
prediction_layer = Dense(num_class, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=prediction_layer)
# Do not train base layers; only top
for layer in base_model.layers:
layer.trainable = False
# Compile with categorical cross-entropy loss
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_acc', patience=3, verbose=1)]
# model.summary()
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True,
vertical_flip=True,
zoom_range = 0.2,
fill_mode = 'nearest'
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Derive predicted and true class labels for the confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
return model, b_history, predictions, true_predictions
augmented_mc_output = augmented_multiclass_model(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/100 74/74 [==============================] - 12s 148ms/step - loss: 1.1649 - accuracy: 0.3353 - val_loss: 15.5809 - val_accuracy: 0.3605 Epoch 2/100 74/74 [==============================] - 11s 145ms/step - loss: 1.1073 - accuracy: 0.3788 - val_loss: 15.6497 - val_accuracy: 0.3776 Epoch 3/100 74/74 [==============================] - 11s 146ms/step - loss: 1.0822 - accuracy: 0.4042 - val_loss: 16.1567 - val_accuracy: 0.3724 Epoch 4/100 74/74 [==============================] - 11s 144ms/step - loss: 1.0688 - accuracy: 0.4145 - val_loss: 14.7239 - val_accuracy: 0.3827 Epoch 5/100 74/74 [==============================] - 11s 142ms/step - loss: 1.0428 - accuracy: 0.4335 - val_loss: 16.6185 - val_accuracy: 0.3980 Epoch 6/100 74/74 [==============================] - 10s 141ms/step - loss: 1.0312 - accuracy: 0.4656 - val_loss: 18.8118 - val_accuracy: 0.3980 Epoch 7/100 74/74 [==============================] - 11s 142ms/step - loss: 1.0312 - accuracy: 0.4767 - val_loss: 17.5612 - val_accuracy: 0.3929 Epoch 8/100 74/74 [==============================] - 11s 143ms/step - loss: 1.0416 - accuracy: 0.4560 - val_loss: 16.9751 - val_accuracy: 0.3980 Epoch 9/100 74/74 [==============================] - 11s 143ms/step - loss: 1.0363 - accuracy: 0.4693 - val_loss: 17.2968 - val_accuracy: 0.4150 Epoch 10/100 74/74 [==============================] - 10s 140ms/step - loss: 1.0374 - accuracy: 0.4653 - val_loss: 17.8172 - val_accuracy: 0.4031 Epoch 11/100 74/74 [==============================] - 11s 142ms/step - loss: 1.0094 - accuracy: 0.4774 - val_loss: 17.6410 - val_accuracy: 0.4082 Epoch 12/100 74/74 [==============================] - 10s 141ms/step - loss: 1.0054 - accuracy: 0.4997 - val_loss: 17.9621 - val_accuracy: 0.3980 Epoch 13/100 74/74 [==============================] - 11s 142ms/step - loss: 1.0051 - accuracy: 0.4662 - val_loss: 18.2877 - val_accuracy: 0.4133 Epoch 14/100 74/74 [==============================] - 10s 141ms/step - loss: 1.0122 - accuracy: 0.4876 - val_loss: 20.2918 - val_accuracy: 0.3912 Epoch 15/100 74/74 [==============================] - 11s 142ms/step - loss: 1.0187 - accuracy: 0.4705 - val_loss: 18.9361 - val_accuracy: 0.3946 Epoch 16/100 74/74 [==============================] - 10s 141ms/step - loss: 1.0120 - accuracy: 0.4873 - val_loss: 18.6203 - val_accuracy: 0.4031 Epoch 17/100 74/74 [==============================] - 11s 142ms/step - loss: 1.0362 - accuracy: 0.4570 - val_loss: 17.2264 - val_accuracy: 0.4014 Epoch 18/100 74/74 [==============================] - 10s 140ms/step - loss: 1.0219 - accuracy: 0.4859 - val_loss: 17.7919 - val_accuracy: 0.4201 Epoch 19/100 74/74 [==============================] - 10s 140ms/step - loss: 1.0005 - accuracy: 0.4832 - val_loss: 22.8754 - val_accuracy: 0.4014 Epoch 20/100 74/74 [==============================] - 10s 140ms/step - loss: 0.9916 - accuracy: 0.4974 - val_loss: 20.9421 - val_accuracy: 0.4099 Epoch 21/100 74/74 [==============================] - 10s 140ms/step - loss: 0.9920 - accuracy: 0.5011 - val_loss: 15.8258 - val_accuracy: 0.4167 Epoch 22/100 74/74 [==============================] - 10s 141ms/step - loss: 1.0035 - accuracy: 0.4833 - val_loss: 17.9785 - val_accuracy: 0.4184 Epoch 23/100 74/74 [==============================] - 10s 140ms/step - loss: 1.0082 - accuracy: 0.4880 - val_loss: 18.7684 - val_accuracy: 0.4201 Epoch 24/100 74/74 [==============================] - 10s 141ms/step - loss: 1.0022 - accuracy: 0.4852 - val_loss: 17.7363 - val_accuracy: 0.4082 Epoch 25/100 74/74 [==============================] - 10s 
141ms/step - loss: 1.0135 - accuracy: 0.4797 - val_loss: 18.0832 - val_accuracy: 0.4167 Epoch 26/100 74/74 [==============================] - 10s 140ms/step - loss: 0.9952 - accuracy: 0.4888 - val_loss: 16.7057 - val_accuracy: 0.4167 Epoch 27/100 74/74 [==============================] - 10s 141ms/step - loss: 1.0024 - accuracy: 0.5037 - val_loss: 18.4833 - val_accuracy: 0.4184 Epoch 28/100 74/74 [==============================] - 11s 142ms/step - loss: 1.0005 - accuracy: 0.4776 - val_loss: 16.7275 - val_accuracy: 0.4269 Epoch 29/100 74/74 [==============================] - 11s 142ms/step - loss: 0.9727 - accuracy: 0.5075 - val_loss: 17.2522 - val_accuracy: 0.4218 Epoch 30/100 74/74 [==============================] - 10s 141ms/step - loss: 0.9953 - accuracy: 0.4921 - val_loss: 14.6003 - val_accuracy: 0.4320 Epoch 31/100 74/74 [==============================] - 11s 142ms/step - loss: 0.9940 - accuracy: 0.4912 - val_loss: 15.3165 - val_accuracy: 0.4286 Epoch 32/100 74/74 [==============================] - 11s 146ms/step - loss: 0.9981 - accuracy: 0.4844 - val_loss: 15.0657 - val_accuracy: 0.4286 Epoch 33/100 74/74 [==============================] - 10s 140ms/step - loss: 0.9861 - accuracy: 0.5061 - val_loss: 15.6266 - val_accuracy: 0.4303 Epoch 34/100 74/74 [==============================] - 11s 143ms/step - loss: 1.0144 - accuracy: 0.4618 - val_loss: 15.6313 - val_accuracy: 0.4269 Epoch 35/100 74/74 [==============================] - 10s 140ms/step - loss: 0.9926 - accuracy: 0.4872 - val_loss: 16.0744 - val_accuracy: 0.4218 Epoch 36/100 74/74 [==============================] - 10s 141ms/step - loss: 1.0160 - accuracy: 0.4845 - val_loss: 16.9906 - val_accuracy: 0.4201 Epoch 37/100 74/74 [==============================] - 11s 143ms/step - loss: 1.0094 - accuracy: 0.5042 - val_loss: 16.7221 - val_accuracy: 0.4133 Epoch 38/100 74/74 [==============================] - 10s 140ms/step - loss: 0.9925 - accuracy: 0.4898 - val_loss: 16.1545 - val_accuracy: 0.4150 Epoch 39/100 74/74 [==============================] - 11s 143ms/step - loss: 1.0010 - accuracy: 0.5007 - val_loss: 15.3418 - val_accuracy: 0.4286 Epoch 40/100 74/74 [==============================] - 10s 141ms/step - loss: 0.9884 - accuracy: 0.5245 - val_loss: 14.9493 - val_accuracy: 0.4320 Epoch 41/100 74/74 [==============================] - 10s 141ms/step - loss: 1.0074 - accuracy: 0.4915 - val_loss: 15.4279 - val_accuracy: 0.4269 Epoch 42/100 74/74 [==============================] - 11s 143ms/step - loss: 1.0054 - accuracy: 0.4862 - val_loss: 16.4184 - val_accuracy: 0.4269 Epoch 43/100 74/74 [==============================] - 11s 142ms/step - loss: 0.9806 - accuracy: 0.5232 - val_loss: 15.8512 - val_accuracy: 0.4354 Epoch 44/100 74/74 [==============================] - 10s 141ms/step - loss: 0.9705 - accuracy: 0.5148 - val_loss: 14.5774 - val_accuracy: 0.4320 Epoch 45/100 74/74 [==============================] - 11s 142ms/step - loss: 0.9797 - accuracy: 0.5023 - val_loss: 16.0051 - val_accuracy: 0.4286 Epoch 46/100 74/74 [==============================] - 11s 142ms/step - loss: 0.9831 - accuracy: 0.5020 - val_loss: 15.9419 - val_accuracy: 0.4303 Epoch 47/100 74/74 [==============================] - 10s 140ms/step - loss: 0.9830 - accuracy: 0.5092 - val_loss: 17.7909 - val_accuracy: 0.4167 Epoch 48/100 74/74 [==============================] - 10s 139ms/step - loss: 1.0046 - accuracy: 0.4850 - val_loss: 16.3658 - val_accuracy: 0.4337 Epoch 49/100 74/74 [==============================] - 10s 139ms/step - loss: 0.9959 - accuracy: 0.4791 - 
val_loss: 15.9669 - val_accuracy: 0.4320 Epoch 50/100 74/74 [==============================] - 10s 137ms/step - loss: 1.0258 - accuracy: 0.4761 - val_loss: 18.6968 - val_accuracy: 0.4065 Epoch 51/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9877 - accuracy: 0.5165 - val_loss: 16.5901 - val_accuracy: 0.4371 Epoch 52/100 74/74 [==============================] - 10s 139ms/step - loss: 0.9943 - accuracy: 0.5136 - val_loss: 17.5629 - val_accuracy: 0.4133 Epoch 53/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9662 - accuracy: 0.5164 - val_loss: 16.8893 - val_accuracy: 0.4150 Epoch 54/100 74/74 [==============================] - 10s 139ms/step - loss: 0.9858 - accuracy: 0.4918 - val_loss: 15.4880 - val_accuracy: 0.4201 Epoch 55/100 74/74 [==============================] - 10s 140ms/step - loss: 0.9853 - accuracy: 0.4989 - val_loss: 14.7059 - val_accuracy: 0.4303 Epoch 56/100 74/74 [==============================] - 10s 137ms/step - loss: 0.9797 - accuracy: 0.5118 - val_loss: 16.0162 - val_accuracy: 0.4303 Epoch 57/100 74/74 [==============================] - 10s 141ms/step - loss: 0.9947 - accuracy: 0.4955 - val_loss: 14.6129 - val_accuracy: 0.4116 Epoch 58/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9821 - accuracy: 0.5087 - val_loss: 16.6462 - val_accuracy: 0.4252 Epoch 59/100 74/74 [==============================] - 10s 139ms/step - loss: 0.9971 - accuracy: 0.4997 - val_loss: 16.3020 - val_accuracy: 0.4133 Epoch 60/100 74/74 [==============================] - 10s 139ms/step - loss: 0.9823 - accuracy: 0.4957 - val_loss: 16.8345 - val_accuracy: 0.4167 Epoch 61/100 74/74 [==============================] - 10s 139ms/step - loss: 0.9711 - accuracy: 0.5052 - val_loss: 14.8208 - val_accuracy: 0.4354 Epoch 62/100 74/74 [==============================] - 10s 141ms/step - loss: 1.0041 - accuracy: 0.4897 - val_loss: 14.2121 - val_accuracy: 0.4286 Epoch 63/100 74/74 [==============================] - 10s 140ms/step - loss: 0.9747 - accuracy: 0.5165 - val_loss: 14.0001 - val_accuracy: 0.4371 Epoch 64/100 74/74 [==============================] - 10s 140ms/step - loss: 0.9763 - accuracy: 0.5110 - val_loss: 15.3712 - val_accuracy: 0.4252 Epoch 65/100 74/74 [==============================] - 10s 137ms/step - loss: 0.9678 - accuracy: 0.5145 - val_loss: 13.7674 - val_accuracy: 0.4388 Epoch 66/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9736 - accuracy: 0.4998 - val_loss: 13.9772 - val_accuracy: 0.4507 Epoch 67/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9773 - accuracy: 0.5350 - val_loss: 14.3370 - val_accuracy: 0.4371 Epoch 68/100 74/74 [==============================] - 10s 137ms/step - loss: 0.9702 - accuracy: 0.5249 - val_loss: 15.8747 - val_accuracy: 0.4269 Epoch 69/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9781 - accuracy: 0.5295 - val_loss: 13.4902 - val_accuracy: 0.4422 Epoch 70/100 74/74 [==============================] - 10s 141ms/step - loss: 0.9759 - accuracy: 0.5097 - val_loss: 14.7561 - val_accuracy: 0.4320 Epoch 71/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9868 - accuracy: 0.4944 - val_loss: 16.4040 - val_accuracy: 0.4269 Epoch 72/100 74/74 [==============================] - 10s 136ms/step - loss: 0.9769 - accuracy: 0.5135 - val_loss: 15.6886 - val_accuracy: 0.4303 Epoch 73/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9918 - accuracy: 0.5019 - val_loss: 14.9216 - val_accuracy: 0.4218 Epoch 
74/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9754 - accuracy: 0.5217 - val_loss: 15.6621 - val_accuracy: 0.4116 Epoch 75/100 74/74 [==============================] - 10s 139ms/step - loss: 0.9651 - accuracy: 0.5112 - val_loss: 14.4741 - val_accuracy: 0.4235 Epoch 76/100 74/74 [==============================] - 11s 142ms/step - loss: 0.9844 - accuracy: 0.5044 - val_loss: 14.9119 - val_accuracy: 0.4269 Epoch 77/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9902 - accuracy: 0.4896 - val_loss: 15.8317 - val_accuracy: 0.4218 Epoch 78/100 74/74 [==============================] - 10s 141ms/step - loss: 0.9771 - accuracy: 0.5143 - val_loss: 17.1754 - val_accuracy: 0.4252 Epoch 79/100 74/74 [==============================] - 10s 139ms/step - loss: 0.9952 - accuracy: 0.4964 - val_loss: 14.0091 - val_accuracy: 0.4218 Epoch 80/100 74/74 [==============================] - 10s 137ms/step - loss: 0.9893 - accuracy: 0.5109 - val_loss: 15.8790 - val_accuracy: 0.4252 Epoch 81/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9762 - accuracy: 0.4968 - val_loss: 16.5136 - val_accuracy: 0.4184 Epoch 82/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9925 - accuracy: 0.4914 - val_loss: 15.1917 - val_accuracy: 0.4286 Epoch 83/100 74/74 [==============================] - 10s 137ms/step - loss: 0.9778 - accuracy: 0.5060 - val_loss: 14.6217 - val_accuracy: 0.4303 Epoch 84/100 74/74 [==============================] - 10s 136ms/step - loss: 0.9643 - accuracy: 0.5172 - val_loss: 16.1189 - val_accuracy: 0.4184 Epoch 85/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9765 - accuracy: 0.5208 - val_loss: 13.4334 - val_accuracy: 0.4303 Epoch 86/100 74/74 [==============================] - 10s 139ms/step - loss: 0.9705 - accuracy: 0.5223 - val_loss: 15.9242 - val_accuracy: 0.4218 Epoch 87/100 74/74 [==============================] - 10s 137ms/step - loss: 0.9853 - accuracy: 0.5227 - val_loss: 14.8702 - val_accuracy: 0.4320 Epoch 88/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9872 - accuracy: 0.5124 - val_loss: 15.1403 - val_accuracy: 0.4286 Epoch 89/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9610 - accuracy: 0.5276 - val_loss: 16.4791 - val_accuracy: 0.4422 Epoch 90/100 74/74 [==============================] - 10s 136ms/step - loss: 0.9724 - accuracy: 0.5087 - val_loss: 14.3552 - val_accuracy: 0.4269 Epoch 91/100 74/74 [==============================] - 10s 137ms/step - loss: 0.9888 - accuracy: 0.5056 - val_loss: 16.4131 - val_accuracy: 0.4320 Epoch 92/100 74/74 [==============================] - 10s 141ms/step - loss: 0.9878 - accuracy: 0.5317 - val_loss: 13.9111 - val_accuracy: 0.4218 Epoch 93/100 74/74 [==============================] - 10s 137ms/step - loss: 0.9794 - accuracy: 0.5053 - val_loss: 13.9693 - val_accuracy: 0.4252 Epoch 94/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9649 - accuracy: 0.5136 - val_loss: 15.1871 - val_accuracy: 0.4252 Epoch 95/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9796 - accuracy: 0.5139 - val_loss: 15.1319 - val_accuracy: 0.4167 Epoch 96/100 74/74 [==============================] - 10s 137ms/step - loss: 0.9645 - accuracy: 0.5217 - val_loss: 13.8465 - val_accuracy: 0.4320 Epoch 97/100 74/74 [==============================] - 10s 136ms/step - loss: 0.9892 - accuracy: 0.5039 - val_loss: 14.8170 - val_accuracy: 0.4252 Epoch 98/100 74/74 [==============================] - 
10s 136ms/step - loss: 0.9607 - accuracy: 0.5139 - val_loss: 15.2745 - val_accuracy: 0.4218 Epoch 99/100 74/74 [==============================] - 10s 136ms/step - loss: 0.9852 - accuracy: 0.5014 - val_loss: 16.3693 - val_accuracy: 0.4150 Epoch 100/100 74/74 [==============================] - 10s 138ms/step - loss: 0.9660 - accuracy: 0.5220 - val_loss: 14.8654 - val_accuracy: 0.4184 Batch Loss: 14.865427017211914 Accuracy: 0.4183673560619354
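Before training on augmented data it can help to eyeball what the ImageDataGenerator actually produces. The cell below is an optional sanity-check sketch (not part of the original pipeline) that draws one small batch with a similar augmentation configuration; the featurewise centering/normalization options are omitted because they require fitting the generator first, and X_train_1d_new is assumed to hold 0-255 pixel values.
import matplotlib.pyplot as plt
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Same geometric augmentations as the multiclass model above, without the featurewise statistics
preview_gen = ImageDataGenerator(rotation_range=20, width_shift_range=0.2,
                                 height_shift_range=0.2, horizontal_flip=True,
                                 vertical_flip=True, zoom_range=0.2, fill_mode='nearest')
batch = next(preview_gen.flow(X_train_1d_new, batch_size=4, shuffle=False))
fig, axes = plt.subplots(1, 4, figsize=(12, 3))
for ax, img in zip(axes, batch):
    ax.imshow(img.astype('uint8'))
    ax.axis('off')
plt.show()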
plot_history(augmented_mc_output[1])
test_accuracy, conf_matrix = evaluate_model(augmented_mc_output[2], augmented_mc_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", test_accuracy)
plot_confusion(conf_matrix)
Testing Accuracy: 0.41836734693877553
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
None Finding Results
Accuracy 0.58
Precision 0.42
Recall (Sensitivity) 0.78
Specificity 0.48
F-1 Score 0.55
One Finding Results
Accuracy 0.61
Precision 0.37
Recall (Sensitivity) 0.07
Specificity 0.93
F-1 Score 0.12
Multiple Finding Results
Accuracy 0.65
Precision 0.43
Recall (Sensitivity) 0.46
Specificity 0.74
F-1 Score 0.44
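The per-class numbers above come from reading the 3x3 confusion matrix one class at a time. A minimal one-vs-rest sketch of that bookkeeping is shown below, assuming rows are true classes and columns are predicted classes; it mirrors what getTrueFalsePosNeg appears to compute but is not claimed to be its exact code.
import numpy as np

def one_vs_rest_counts(cm, k):
    cm = np.asarray(cm)
    tp = cm[k, k]
    fp = cm[:, k].sum() - tp      # predicted class k, true class differs
    fn = cm[k, :].sum() - tp      # true class k, predicted something else
    tn = cm.sum() - tp - fp - fn  # everything not involving class k
    return tp, fp, fn, tn

# e.g. one_vs_rest_counts(conf_matrix, 0) for the "no finding" class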
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
# Collapse label 2 (multiple findings) into label 1 so the task becomes finding vs. no finding
for n, i in enumerate(y_list):
if i == 2:
y_list[n] = 1
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df_binary = pd.DataFrame(y, columns=['label'])
zero_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 0].index.tolist()[:2562]
print(len(zero_indices_binary))
one_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 1].index.tolist()[:2562]
print(len(one_indices_binary))
print(len(zero_indices_binary)+len(one_indices_binary))
sampled_indices_binary = zero_indices_binary + one_indices_binary
len(sampled_indices_binary)
2562 2562 5124
5124
X = np.asarray([X[i] for i in sampled_indices_binary])
y = np.asarray([y[i] for i in sampled_indices_binary])
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight(class_weight='balanced', classes=np.unique(y), y=y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(2)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|--------|--------|
| 0 | 0      | 1.0    |
| 1 | 1      | 1.0    |
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
X_train = X_train[:80]
y_train = y_train[:80]
X_test = X_test[:20]
y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 2)
y_test_OHE = to_categorical(y_test, num_classes = 2)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened rows back into 128x128x3 image tensors
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train flattened shape is: ", X_train_1d.shape)
print("X_train final shape is: ", X_train_1d_new.shape)
X_train is 4099 and X_test is 1025
Y_train is 4099 and Y_test is 1025
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 4099 and X_test is 1025
y_train is 4099 and y_test is 1025
X_train flattened shape is: (4099, 49152)
X_train final shape is: (4099, 128, 128, 3)
def augmented_binary_model(X_train,y_train,X_test,y_test,input_weights):
num_class = 2
epochs = 20
base_model = VGG16(weights = weight_path, include_top=False, input_shape=(128, 128, 3))
# Add a new top layer
x = base_model.output
x = Flatten()(x)
# Softmax
prediction_layer = Dense(num_class, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=prediction_layer)
# Do not train base layers; only top
for layer in base_model.layers:
layer.trainable = False
# Compile with binary cross-entropy loss
model.compile(loss='binary_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['binary_accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_binary_accuracy', patience=3, verbose=1)]
# model.summary()
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=30,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True,
vertical_flip=True,
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Derive predicted and true class labels for the confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
#print(true_predictions)
#print(predictions)
return model, b_history, predictions, true_predictions
augmented_bin_output = augmented_binary_model(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/20 129/129 [==============================] - 18s 134ms/step - loss: 0.7102 - binary_accuracy: 0.5345 - val_loss: 8.5682 - val_binary_accuracy: 0.5356 Epoch 2/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6652 - binary_accuracy: 0.6154 - val_loss: 7.8139 - val_binary_accuracy: 0.5805 Epoch 3/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6575 - binary_accuracy: 0.6235 - val_loss: 6.9420 - val_binary_accuracy: 0.6263 Epoch 4/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6486 - binary_accuracy: 0.6326 - val_loss: 8.0641 - val_binary_accuracy: 0.6166 Epoch 5/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6408 - binary_accuracy: 0.6476 - val_loss: 7.3500 - val_binary_accuracy: 0.6439 Epoch 6/20 129/129 [==============================] - 17s 132ms/step - loss: 0.6382 - binary_accuracy: 0.6534 - val_loss: 7.9155 - val_binary_accuracy: 0.6380 Epoch 7/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6370 - binary_accuracy: 0.6620 - val_loss: 7.2770 - val_binary_accuracy: 0.6576 Epoch 8/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6324 - binary_accuracy: 0.6534 - val_loss: 8.9382 - val_binary_accuracy: 0.6273 Epoch 9/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6302 - binary_accuracy: 0.6609 - val_loss: 8.7483 - val_binary_accuracy: 0.6254 Epoch 10/20 129/129 [==============================] - 17s 131ms/step - loss: 0.6236 - binary_accuracy: 0.6719 - val_loss: 8.5564 - val_binary_accuracy: 0.6283 Epoch 11/20 129/129 [==============================] - 17s 131ms/step - loss: 0.6231 - binary_accuracy: 0.6579 - val_loss: 9.1956 - val_binary_accuracy: 0.6224 Epoch 12/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6368 - binary_accuracy: 0.6498 - val_loss: 8.8069 - val_binary_accuracy: 0.6263 Epoch 13/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6339 - binary_accuracy: 0.6523 - val_loss: 9.5690 - val_binary_accuracy: 0.6215 Epoch 14/20 129/129 [==============================] - 17s 131ms/step - loss: 0.6321 - binary_accuracy: 0.6550 - val_loss: 10.2514 - val_binary_accuracy: 0.6146 Epoch 15/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6243 - binary_accuracy: 0.6548 - val_loss: 9.0190 - val_binary_accuracy: 0.6263 Epoch 16/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6315 - binary_accuracy: 0.6601 - val_loss: 8.8494 - val_binary_accuracy: 0.6176 Epoch 17/20 129/129 [==============================] - 17s 129ms/step - loss: 0.6264 - binary_accuracy: 0.6669 - val_loss: 7.9629 - val_binary_accuracy: 0.6341 Epoch 18/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6136 - binary_accuracy: 0.6729 - val_loss: 8.4127 - val_binary_accuracy: 0.6361 Epoch 19/20 129/129 [==============================] - 17s 130ms/step - loss: 0.6249 - binary_accuracy: 0.6573 - val_loss: 9.8455 - val_binary_accuracy: 0.6195 Epoch 20/20 129/129 [==============================] - 17s 132ms/step - loss: 0.6177 - binary_accuracy: 0.6630 - val_loss: 9.6979 - val_binary_accuracy: 0.6215 Batch Loss: 9.697927474975586 Accuracy: 0.621463418006897
plot_history_binary(augmented_bin_output[1])
dict_keys(['loss', 'binary_accuracy', 'val_loss', 'val_binary_accuracy'])
binary_test_accuracy, binary_conf_matrix = evaluate_binary_model(augmented_bin_output[2], augmented_bin_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", binary_test_accuracy)
plot_binary_confusion(binary_conf_matrix)
Testing Accuracy: 0.6214634146341463
binary_stats = binary_summary_Statistics(binary_conf_matrix)
print("Binary Finding Results")
print("Accuracy", binary_stats[0])
print("Precision", binary_stats[1])
print("Recall (Sensitivity)", binary_stats[2])
print("Specificity", binary_stats[3])
print("F-1 Score", binary_stats[4])
Binary Finding Results
Accuracy 0.62
Precision 0.78
Recall (Sensitivity) 0.38
Specificity 0.89
F-1 Score 0.51
binary_tableResults(binary_stats)
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df = pd.DataFrame(y, columns=['label'])
zero_indices = sampling_df[sampling_df["label"] == 0].index.tolist()[:980]
print(len(zero_indices))
one_indices = sampling_df[sampling_df["label"] == 1].index.tolist()[:980]
print(len(one_indices))
mult_indices = sampling_df[sampling_df["label"] == 2].index.tolist()[:980]
print(len(mult_indices))
print(len(zero_indices)+len(one_indices)+len(mult_indices))
sampled_indices = zero_indices + one_indices + mult_indices
len(sampled_indices)
980 980 980 2940
2940
X = np.asarray([X[i] for i in sampled_indices])
y = np.asarray([y[i] for i in sampled_indices])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
X_train = X_train[:80]
y_train = y_train[:80]
X_test = X_test[:20]
y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 3)
y_test_OHE = to_categorical(y_test, num_classes = 3)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened rows back into 128x128x3 image tensors
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train flattened shape is: ", X_train_1d.shape)
print("X_train final shape is: ", X_train_1d_new.shape)
X_train is 2352 and X_test is 588
Y_train is 2352 and Y_test is 588
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 2352 and X_test is 588
y_train is 2352 and y_test is 588
X_train flattened shape is: (2352, 49152)
X_train final shape is: (2352, 128, 128, 3)
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight(class_weight='balanced', classes=np.unique(y), y=y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(3)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|--------|--------|
| 0 | 0      | 1.0    |
| 1 | 1      | 1.0    |
| 2 | 2      | 1.0    |
def augmented_multiclass_model(X_train,y_train,X_test,y_test,input_weights):
num_class = 3
epochs = 20
base_model = ResNet50(weights = 'imagenet', include_top=False, input_shape=(128, 128, 3))
# Add a new top layer
x = base_model.output
x = Flatten()(x)
# Softmax
prediction_layer = Dense(num_class, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=prediction_layer)
# Do not train base layers; only top
for layer in base_model.layers:
layer.trainable = False
# Compile with categorical cross-entropy loss
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_acc', patience=3, verbose=1)]
# model.summary()
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
rotation_range=20,
horizontal_flip=True,
vertical_flip=True,
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Derive predicted and true class labels for the confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
return model, b_history, predictions, true_predictions
augmented_mc_output = augmented_multiclass_model(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/20 74/74 [==============================] - 14s 158ms/step - loss: 1.9342 - accuracy: 0.4030 - val_loss: 1.5113 - val_accuracy: 0.4456 Epoch 2/20 74/74 [==============================] - 10s 135ms/step - loss: 1.4521 - accuracy: 0.4429 - val_loss: 1.4499 - val_accuracy: 0.4915 Epoch 3/20 74/74 [==============================] - 10s 135ms/step - loss: 1.2905 - accuracy: 0.4847 - val_loss: 1.5705 - val_accuracy: 0.4116 Epoch 4/20 74/74 [==============================] - 10s 137ms/step - loss: 1.3379 - accuracy: 0.4642 - val_loss: 1.3598 - val_accuracy: 0.4949 Epoch 5/20 74/74 [==============================] - 10s 136ms/step - loss: 1.1991 - accuracy: 0.5098 - val_loss: 1.5182 - val_accuracy: 0.4643 Epoch 6/20 74/74 [==============================] - 10s 137ms/step - loss: 1.2088 - accuracy: 0.5049 - val_loss: 1.5161 - val_accuracy: 0.4541 Epoch 7/20 74/74 [==============================] - 10s 140ms/step - loss: 1.2050 - accuracy: 0.5314 - val_loss: 1.3677 - val_accuracy: 0.4813 Epoch 8/20 74/74 [==============================] - 10s 138ms/step - loss: 1.0850 - accuracy: 0.5600 - val_loss: 1.3643 - val_accuracy: 0.4779 Epoch 9/20 74/74 [==============================] - 10s 137ms/step - loss: 1.0482 - accuracy: 0.5490 - val_loss: 1.6114 - val_accuracy: 0.4405 Epoch 10/20 74/74 [==============================] - 10s 137ms/step - loss: 1.1291 - accuracy: 0.5308 - val_loss: 1.3536 - val_accuracy: 0.4694 Epoch 11/20 74/74 [==============================] - 10s 137ms/step - loss: 1.0809 - accuracy: 0.5610 - val_loss: 1.3461 - val_accuracy: 0.4728 Epoch 12/20 74/74 [==============================] - 10s 136ms/step - loss: 1.0598 - accuracy: 0.5572 - val_loss: 1.4559 - val_accuracy: 0.4660 Epoch 13/20 74/74 [==============================] - 10s 136ms/step - loss: 0.9873 - accuracy: 0.5783 - val_loss: 1.4902 - val_accuracy: 0.4796 Epoch 14/20 74/74 [==============================] - 10s 138ms/step - loss: 1.0326 - accuracy: 0.5651 - val_loss: 1.4062 - val_accuracy: 0.4932 Epoch 15/20 74/74 [==============================] - 10s 136ms/step - loss: 0.9615 - accuracy: 0.5891 - val_loss: 1.4306 - val_accuracy: 0.4864 Epoch 16/20 74/74 [==============================] - 10s 136ms/step - loss: 0.9707 - accuracy: 0.5879 - val_loss: 1.4169 - val_accuracy: 0.4847 Epoch 17/20 74/74 [==============================] - 10s 140ms/step - loss: 0.9045 - accuracy: 0.6123 - val_loss: 1.5299 - val_accuracy: 0.4558 Epoch 18/20 74/74 [==============================] - 10s 136ms/step - loss: 0.8932 - accuracy: 0.6268 - val_loss: 1.3602 - val_accuracy: 0.5000 Epoch 19/20 74/74 [==============================] - 10s 137ms/step - loss: 0.9549 - accuracy: 0.5879 - val_loss: 1.3497 - val_accuracy: 0.5000 Epoch 20/20 74/74 [==============================] - 10s 137ms/step - loss: 0.9165 - accuracy: 0.6202 - val_loss: 1.6347 - val_accuracy: 0.5000 Batch Loss: 1.6346724033355713 Accuracy: 0.5
plot_history(augmented_mc_output[1])
test_accuracy, conf_matrix = evaluate_model(augmented_mc_output[2], augmented_mc_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", test_accuracy)
plot_confusion(conf_matrix)
Testing Accuracy: 0.5
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
None Finding Results
Accuracy 0.68
Precision 0.53
Recall (Sensitivity) 0.74
Specificity 0.65
F-1 Score 0.62
One Finding Results
Accuracy 0.66
Precision 0.26
Recall (Sensitivity) 0.06
Specificity 0.92
F-1 Score 0.1
Multiple Finding Results
Accuracy 0.66
Precision 0.51
Recall (Sensitivity) 0.65
Specificity 0.67
F-1 Score 0.57
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
# Collapse label 2 (multiple findings) into label 1 so the task becomes finding vs. no finding
for n, i in enumerate(y_list):
if i == 2:
y_list[n] = 1
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df_binary = pd.DataFrame(y, columns=['label'])
zero_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 0].index.tolist()[:2562]
print(len(zero_indices_binary))
one_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 1].index.tolist()[:2562]
print(len(one_indices_binary))
print(len(zero_indices_binary)+len(one_indices_binary))
sampled_indices_binary = zero_indices_binary + one_indices_binary
len(sampled_indices_binary)
2562 2562 5124
5124
X = np.asarray([X[i] for i in sampled_indices_binary])
y = np.asarray([y[i] for i in sampled_indices_binary])
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight(class_weight='balanced', classes=np.unique(y), y=y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(2)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|--------|--------|
| 0 | 0      | 1.0    |
| 1 | 1      | 1.0    |
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
X_train = X_train[:80]
y_train = y_train[:80]
X_test = X_test[:20]
y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 2)
y_test_OHE = to_categorical(y_test, num_classes = 2)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened rows back into 128x128x3 image tensors
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train flattened shape is: ", X_train_1d.shape)
print("X_train final shape is: ", X_train_1d_new.shape)
X_train is 4099 and X_test is 1025
Y_train is 4099 and Y_test is 1025
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 4099 and X_test is 1025
y_train is 4099 and y_test is 1025
X_train flattened shape is: (4099, 49152)
X_train final shape is: (4099, 128, 128, 3)
def augmented_binary_model(X_train,y_train,X_test,y_test,input_weights):
num_class = 2
epochs = 20
base_model = ResNet50(weights = 'imagenet', include_top=False, input_shape=(128, 128, 3))
# Add a new top layer
x = base_model.output
x = Flatten()(x)
# Softmax
prediction_layer = Dense(num_class, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=prediction_layer)
# Do not train base layers; only top
for layer in base_model.layers:
layer.trainable = False
# Compile with binary cross-entropy loss
model.compile(loss='binary_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['binary_accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_binary_accuracy', patience=3, verbose=1)]
# model.summary()
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
# featurewise_center=True,
# featurewise_std_normalization=True,
rotation_range=20,
horizontal_flip=True,
vertical_flip=True,
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Derive predicted and true class labels for the confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
#print(true_predictions)
#print(predictions)
return model, b_history, predictions, true_predictions
augmented_bin_output = augmented_binary_model(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/20 129/129 [==============================] - 21s 142ms/step - loss: 0.9038 - binary_accuracy: 0.5866 - val_loss: 0.8701 - val_binary_accuracy: 0.5717 Epoch 2/20 129/129 [==============================] - 18s 137ms/step - loss: 0.7613 - binary_accuracy: 0.6423 - val_loss: 0.8422 - val_binary_accuracy: 0.5863 Epoch 3/20 129/129 [==============================] - 18s 137ms/step - loss: 0.7442 - binary_accuracy: 0.6435 - val_loss: 0.7862 - val_binary_accuracy: 0.6029 Epoch 4/20 129/129 [==============================] - 17s 135ms/step - loss: 0.7138 - binary_accuracy: 0.6625 - val_loss: 0.7419 - val_binary_accuracy: 0.6400 Epoch 5/20 129/129 [==============================] - 18s 136ms/step - loss: 0.7053 - binary_accuracy: 0.6521 - val_loss: 0.7478 - val_binary_accuracy: 0.6371 Epoch 6/20 129/129 [==============================] - 17s 134ms/step - loss: 0.6856 - binary_accuracy: 0.6707 - val_loss: 0.7517 - val_binary_accuracy: 0.6302 Epoch 7/20 129/129 [==============================] - 18s 136ms/step - loss: 0.6850 - binary_accuracy: 0.6614 - val_loss: 0.7770 - val_binary_accuracy: 0.6117 Epoch 8/20 129/129 [==============================] - 18s 138ms/step - loss: 0.6583 - binary_accuracy: 0.6823 - val_loss: 0.7511 - val_binary_accuracy: 0.6215 Epoch 9/20 129/129 [==============================] - 17s 135ms/step - loss: 0.6422 - binary_accuracy: 0.6947 - val_loss: 0.7770 - val_binary_accuracy: 0.6205 Epoch 10/20 129/129 [==============================] - 18s 136ms/step - loss: 0.6557 - binary_accuracy: 0.6872 - val_loss: 0.7550 - val_binary_accuracy: 0.6234 Epoch 11/20 129/129 [==============================] - 18s 136ms/step - loss: 0.6365 - binary_accuracy: 0.6961 - val_loss: 0.7795 - val_binary_accuracy: 0.5980 Epoch 12/20 129/129 [==============================] - 17s 135ms/step - loss: 0.6349 - binary_accuracy: 0.6921 - val_loss: 0.7294 - val_binary_accuracy: 0.6380 Epoch 13/20 129/129 [==============================] - 18s 137ms/step - loss: 0.6379 - binary_accuracy: 0.6927 - val_loss: 0.7502 - val_binary_accuracy: 0.6468 Epoch 14/20 129/129 [==============================] - 18s 137ms/step - loss: 0.6379 - binary_accuracy: 0.6948 - val_loss: 0.7464 - val_binary_accuracy: 0.6380 Epoch 15/20 129/129 [==============================] - 18s 137ms/step - loss: 0.6559 - binary_accuracy: 0.6872 - val_loss: 0.7709 - val_binary_accuracy: 0.6312 Epoch 16/20 129/129 [==============================] - 18s 138ms/step - loss: 0.6125 - binary_accuracy: 0.7059 - val_loss: 0.8563 - val_binary_accuracy: 0.5893 Epoch 17/20 129/129 [==============================] - 18s 136ms/step - loss: 0.6104 - binary_accuracy: 0.7039 - val_loss: 0.7285 - val_binary_accuracy: 0.6322 Epoch 18/20 129/129 [==============================] - 18s 136ms/step - loss: 0.5903 - binary_accuracy: 0.7167 - val_loss: 0.7473 - val_binary_accuracy: 0.6312 Epoch 19/20 129/129 [==============================] - 17s 135ms/step - loss: 0.5937 - binary_accuracy: 0.7080 - val_loss: 0.7663 - val_binary_accuracy: 0.6263 Epoch 20/20 129/129 [==============================] - 18s 137ms/step - loss: 0.5927 - binary_accuracy: 0.7228 - val_loss: 0.7391 - val_binary_accuracy: 0.6429 Batch Loss: 0.739121675491333 Accuracy: 0.642926812171936
plot_history_binary(augmented_bin_output[1])
dict_keys(['loss', 'binary_accuracy', 'val_loss', 'val_binary_accuracy'])
binary_test_accuracy, binary_conf_matrix = evaluate_binary_model(augmented_bin_output[2], augmented_bin_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", binary_test_accuracy)
plot_binary_confusion(binary_conf_matrix)
Testing Accuracy: 0.6429268292682927
binary_stats = binary_summary_Statistics(binary_conf_matrix)
print("Binary Finding Results")
print("Accuracy", binary_stats[0])
print("Precision", binary_stats[1])
print("Recall (Sensitivity)", binary_stats[2])
print("Specificity", binary_stats[3])
print("F-1 Score", binary_stats[4])
Binary Finding Results
Accuracy 0.64
Precision 0.6
Recall (Sensitivity) 0.72
Specificity 0.58
F-1 Score 0.65
binary_tableResults(binary_stats)
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df = pd.DataFrame(y, columns=['label'])
zero_indices = sampling_df[sampling_df["label"] == 0].index.tolist()[:980]
print(len(zero_indices))
one_indices = sampling_df[sampling_df["label"] == 1].index.tolist()[:980]
print(len(one_indices))
mult_indices = sampling_df[sampling_df["label"] == 2].index.tolist()[:980]
print(len(mult_indices))
print(len(zero_indices)+len(one_indices)+len(mult_indices))
sampled_indices = zero_indices + one_indices + mult_indices
len(sampled_indices)
980 980 980 2940
2940
X = np.asarray([X[i] for i in sampled_indices])
y = np.asarray([y[i] for i in sampled_indices])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
X_train = X_train[:80]
y_train = y_train[:80]
X_test = X_test[:20]
y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 3)
y_test_OHE = to_categorical(y_test, num_classes = 3)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened rows back into 128x128x3 image tensors
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train flattened shape is: ", X_train_1d.shape)
print("X_train final shape is: ", X_train_1d_new.shape)
X_train is 2352 and X_test is 588
Y_train is 2352 and Y_test is 588
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 2352 and X_test is 588
y_train is 2352 and y_test is 588
X_train flattened shape is: (2352, 49152)
X_train final shape is: (2352, 128, 128, 3)
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight(class_weight='balanced', classes=np.unique(y), y=y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(3)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|--------|--------|
| 0 | 0      | 1.0    |
| 1 | 1      | 1.0    |
| 2 | 2      | 1.0    |
def augmented_multiclass_model_low_lr(X_train,y_train,X_test,y_test,input_weights):
num_class = 3
epochs = 10
base_model = VGG16(weights = weight_path, include_top=False, input_shape=(128, 128, 3))
# Add a new top layer
x = base_model.output
x = Flatten()(x)
# Softmax
prediction_layer = Dense(num_class, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=prediction_layer)
# Do not train base layers; only top
for layer in base_model.layers:
layer.trainable = False
# Compile with categorical cross-entropy loss
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_acc', patience=3, verbose=1)]
# model.summary()
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True,
vertical_flip=True,
zoom_range = 0.2,
fill_mode = 'nearest'
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Derive predicted and true class labels for the confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
return model, b_history, predictions, true_predictions
augmented_mc_output = augmented_multiclass_model_low_lr(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/10 74/74 [==============================] - 11s 136ms/step - loss: 1.1971 - accuracy: 0.3241 - val_loss: 8.7984 - val_accuracy: 0.3810 Epoch 2/10 74/74 [==============================] - 10s 131ms/step - loss: 1.1001 - accuracy: 0.4032 - val_loss: 7.5736 - val_accuracy: 0.4405 Epoch 3/10 74/74 [==============================] - 10s 131ms/step - loss: 1.0511 - accuracy: 0.4440 - val_loss: 8.8487 - val_accuracy: 0.4694 Epoch 4/10 74/74 [==============================] - 10s 132ms/step - loss: 1.0495 - accuracy: 0.4502 - val_loss: 9.0712 - val_accuracy: 0.4643 Epoch 5/10 74/74 [==============================] - 10s 135ms/step - loss: 1.0155 - accuracy: 0.4748 - val_loss: 9.3814 - val_accuracy: 0.4762 Epoch 6/10 74/74 [==============================] - 10s 135ms/step - loss: 1.0240 - accuracy: 0.4792 - val_loss: 13.2570 - val_accuracy: 0.4677 Epoch 7/10 74/74 [==============================] - 10s 132ms/step - loss: 1.0169 - accuracy: 0.4917 - val_loss: 12.6170 - val_accuracy: 0.4677 Epoch 8/10 74/74 [==============================] - 10s 132ms/step - loss: 0.9953 - accuracy: 0.4991 - val_loss: 12.6648 - val_accuracy: 0.4677 Epoch 9/10 74/74 [==============================] - 10s 131ms/step - loss: 1.0075 - accuracy: 0.4837 - val_loss: 11.1492 - val_accuracy: 0.4762 Epoch 10/10 74/74 [==============================] - 10s 131ms/step - loss: 0.9818 - accuracy: 0.5051 - val_loss: 13.6334 - val_accuracy: 0.4745 Batch Loss: 13.633390426635742 Accuracy: 0.47448980808258057
plot_history(augmented_mc_output[1])
test_accuracy, conf_matrix = evaluate_model(augmented_mc_output[2], augmented_mc_output[3]) #predictions, true_predictions
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| None Finding | 0.64 | 0.46 | 0.82 | 0.57 | 0.59 |
| One Finding | 0.63 | 0.36 | 0.04 | 0.96 | 0.07 |
| Multiple Findings | 0.68 | 0.5 | 0.62 | 0.7 | 0.55 |
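The per-class reporting block above is repeated verbatim after every model run below; a small helper along these lines (hypothetical, simply wrapping the notebook's existing getTrueFalsePosNeg and summaryStatistics helpers) could replace it:

def report_class_stats(conf_matrix, class_index, class_name):
    # Hypothetical convenience wrapper around the helpers defined earlier in this notebook.
    tfpn = getTrueFalsePosNeg(conf_matrix, class_index)
    stats = summaryStatistics(tfpn[0], tfpn[1], tfpn[2], tfpn[3])
    print(class_name, "Results")
    print("Accuracy", stats[0])
    print("Precision", stats[1])
    print("Recall (Sensitivity)", stats[2])
    print("Specificity", stats[3])
    print("F-1 Score", stats[4])
    return stats

# Example usage, equivalent to the block above:
# none_stats = report_class_stats(conf_matrix, 0, "None Finding")
# one_stats = report_class_stats(conf_matrix, 1, "One Finding")
# mul_stats = report_class_stats(conf_matrix, 2, "Multiple Finding")
# tableResults(none_stats, one_stats, mul_stats)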
def augmented_multiclass_model_high_lr(X_train,y_train,X_test,y_test,input_weights):
num_class = 3
epochs = 10
base_model = VGG16(weights = weight_path, include_top=False, input_shape=(128, 128, 3))
# Add a new top layer
x = base_model.output
x = Flatten()(x)
# Softmax
prediction_layer = Dense(num_class, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=prediction_layer)
# Do not train base layers; only top
for layer in base_model.layers:
layer.trainable = False
# Compiler = categorical_crossentropy
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(lr=0.01), metrics=['accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_acc', patience=3, verbose=1)]
# model.summary()
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True,
vertical_flip=True,
zoom_range = 0.2,
fill_mode = 'nearest'
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Printing confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
return model, b_history, predictions, true_predictions
augmented_mc_output = augmented_multiclass_model_high_lr(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/10: 74/74 - 11s 141ms/step - loss: 3.6155 - accuracy: 0.3755 - val_loss: 44.2836 - val_accuracy: 0.4286
Epoch 2/10: 74/74 - 10s 133ms/step - loss: 2.5508 - accuracy: 0.4320 - val_loss: 111.3256 - val_accuracy: 0.4014
Epoch 3/10: 74/74 - 10s 132ms/step - loss: 2.9178 - accuracy: 0.4222 - val_loss: 88.5409 - val_accuracy: 0.4048
Epoch 4/10: 74/74 - 10s 133ms/step - loss: 2.5883 - accuracy: 0.4205 - val_loss: 108.7519 - val_accuracy: 0.3827
Epoch 5/10: 74/74 - 10s 133ms/step - loss: 3.2590 - accuracy: 0.4323 - val_loss: 79.9754 - val_accuracy: 0.4320
Epoch 6/10: 74/74 - 10s 131ms/step - loss: 2.5387 - accuracy: 0.4618 - val_loss: 122.4371 - val_accuracy: 0.3605
Epoch 7/10: 74/74 - 10s 131ms/step - loss: 2.4474 - accuracy: 0.4723 - val_loss: 90.6664 - val_accuracy: 0.4116
Epoch 8/10: 74/74 - 10s 133ms/step - loss: 2.6598 - accuracy: 0.4738 - val_loss: 87.4363 - val_accuracy: 0.4082
Epoch 9/10: 74/74 - 10s 130ms/step - loss: 2.7565 - accuracy: 0.4622 - val_loss: 82.6910 - val_accuracy: 0.4235
Epoch 10/10: 74/74 - 10s 131ms/step - loss: 2.7403 - accuracy: 0.4361 - val_loss: 85.7159 - val_accuracy: 0.4201
Batch Loss: 85.71588897705078
Accuracy: 0.42006802558898926
plot_history(augmented_mc_output[1])
test_accuracy, conf_matrix = evaluate_model(augmented_mc_output[2], augmented_mc_output[3]) #predictions, true_predictions
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| None Finding | 0.62 | 0.43 | 0.66 | 0.6 | 0.52 |
| One Finding | 0.6 | 0.38 | 0.17 | 0.85 | 0.23 |
| Multiple Findings | 0.62 | 0.43 | 0.47 | 0.69 | 0.45 |
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
# Collapse the "multiple findings" class (label 2) into the "finding" class (label 1) for the binary task
for n, i in enumerate(y_list):
    if i == 2:
        y_list[n] = 1
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df_binary = pd.DataFrame(y, columns=['label'])
zero_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 0].index.tolist()[:2562]
print(len(zero_indices_binary))
one_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 1].index.tolist()[:2562]
print(len(one_indices_binary))
print(len(zero_indices_binary)+len(one_indices_binary))
sampled_indices_binary = zero_indices_binary + one_indices_binary
len(sampled_indices_binary)
2562 2562 5124
5124
X = np.asarray([X[i] for i in sampled_indices_binary])
y = np.asarray([y[i] for i in sampled_indices_binary])
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(2)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|---|---|
| 0 | 0 | 1.0 |
| 1 | 1 | 1.0 |
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
X_train = X_train[:80]
y_train = y_train[:80]
X_test = X_test[:20]
y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 2)
y_test_OHE = to_categorical(y_test, num_classes = 2)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened images back to (128, 128, 3); a loop is not needed for this
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 4099 and X_test is 1025
Y_train is 4099 and Y_test is 1025
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 4099 and X_test is 1025
y_train is 4099 and y_test is 1025
X_train final shape is: (4099, 49152)
X_train final shape is: (4099, 128, 128, 3)
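As a reference for how the targets are handled here: to_categorical one-hot encodes the integer labels for training, and np.argmax later inverts that encoding on both the predictions and the test labels. A minimal standalone sketch for the two-class case:

from keras.utils import to_categorical
import numpy as np

labels = np.array([0, 1, 1, 0])
one_hot = to_categorical(labels, num_classes=2)  # [[1,0],[0,1],[0,1],[1,0]]
recovered = np.argmax(one_hot, axis=1)           # array([0, 1, 1, 0]) -- back to integer labels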
def augmented_binary_model(X_train,y_train,X_test,y_test,input_weights):
num_class = 2
epochs = 10
base_model = VGG16(weights = weight_path, include_top=False, input_shape=(128, 128, 3))
# Add a new top layer
x = base_model.output
x = Flatten()(x)
# Softmax
prediction_layer = Dense(num_class, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=prediction_layer)
# Do not train base layers; only top
for layer in base_model.layers:
layer.trainable = False
# Compile with binary cross-entropy loss
model.compile(loss='binary_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['binary_accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_binary_accuracy', patience=3, verbose=1)]
# model.summary()
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True,
vertical_flip=True,
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Printing confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
#print(true_predictions)
#print(predictions)
return model, b_history, predictions, true_predictions
augmented_bin_output = augmented_binary_model(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/10: 129/129 - 19s 141ms/step - loss: 0.7099 - binary_accuracy: 0.5190 - val_loss: 4.8192 - val_binary_accuracy: 0.6010
Epoch 2/10: 129/129 - 17s 135ms/step - loss: 0.6758 - binary_accuracy: 0.5926 - val_loss: 6.0910 - val_binary_accuracy: 0.6195
Epoch 3/10: 129/129 - 17s 132ms/step - loss: 0.6588 - binary_accuracy: 0.6131 - val_loss: 6.7163 - val_binary_accuracy: 0.6283
Epoch 4/10: 129/129 - 17s 132ms/step - loss: 0.6435 - binary_accuracy: 0.6396 - val_loss: 7.1379 - val_binary_accuracy: 0.6332
Epoch 5/10: 129/129 - 17s 132ms/step - loss: 0.6381 - binary_accuracy: 0.6381 - val_loss: 6.6853 - val_binary_accuracy: 0.6459
Epoch 6/10: 129/129 - 17s 134ms/step - loss: 0.6305 - binary_accuracy: 0.6511 - val_loss: 8.4177 - val_binary_accuracy: 0.6137
Epoch 7/10: 129/129 - 18s 136ms/step - loss: 0.6323 - binary_accuracy: 0.6574 - val_loss: 7.4597 - val_binary_accuracy: 0.6351
Epoch 8/10: 129/129 - 17s 134ms/step - loss: 0.6278 - binary_accuracy: 0.6619 - val_loss: 8.4336 - val_binary_accuracy: 0.6273
Epoch 9/10: 129/129 - 17s 133ms/step - loss: 0.6393 - binary_accuracy: 0.6558 - val_loss: 7.2995 - val_binary_accuracy: 0.6488
Epoch 10/10: 129/129 - 17s 131ms/step - loss: 0.6301 - binary_accuracy: 0.6571 - val_loss: 7.7845 - val_binary_accuracy: 0.6371
Batch Loss: 7.784481048583984
Accuracy: 0.6370731592178345
plot_history_binary(augmented_bin_output[1])
binary_test_accuracy, binary_conf_matrix = evaluate_binary_model(augmented_bin_output[2], augmented_bin_output[3]) #predictions, true_predictions
binary_stats = binary_summary_Statistics(binary_conf_matrix)
print("Binary Finding Results")
print("Accuracy", binary_stats[0])
print("Precision", binary_stats[1])
print("Recall (Sensitivity)", binary_stats[2])
print("Specificity", binary_stats[3])
print("F-1 Score", binary_stats[4])
binary_tableResults(binary_stats)
dict_keys(['loss', 'binary_accuracy', 'val_loss', 'val_binary_accuracy'])
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| Binary Finding | 0.64 | 0.73 | 0.45 | 0.82 | 0.56 |
def augmented_binary_model_higher_lr(X_train,y_train,X_test,y_test,input_weights):
num_class = 2
epochs = 10
base_model = VGG16(weights = weight_path, include_top=False, input_shape=(128, 128, 3))
# Add a new top layer
x = base_model.output
x = Flatten()(x)
# Softmax
prediction_layer = Dense(num_class, activation='softmax')(x)
model = Model(inputs=base_model.input, outputs=prediction_layer)
# Do not train base layers; only top
for layer in base_model.layers:
layer.trainable = False
# Compile with binary cross-entropy loss
model.compile(loss='binary_crossentropy', optimizer=keras.optimizers.Adam(lr=0.01), metrics=['binary_accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_binary_accuracy', patience=3, verbose=1)]
# model.summary()
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True,
vertical_flip=True,
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Printing confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
#print(true_predictions)
#print(predictions)
return model, b_history, predictions, true_predictions
augmented_bin_output = augmented_binary_model_higher_lr(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/10: 129/129 - 19s 139ms/step - loss: 1.3094 - binary_accuracy: 0.5695 - val_loss: 39.6638 - val_binary_accuracy: 0.5288
Epoch 2/10: 129/129 - 17s 135ms/step - loss: 1.0033 - binary_accuracy: 0.6037 - val_loss: 26.7083 - val_binary_accuracy: 0.5815
Epoch 3/10: 129/129 - 17s 133ms/step - loss: 1.1784 - binary_accuracy: 0.5926 - val_loss: 37.4213 - val_binary_accuracy: 0.5551
Epoch 4/10: 129/129 - 18s 136ms/step - loss: 1.0494 - binary_accuracy: 0.6047 - val_loss: 35.9734 - val_binary_accuracy: 0.5873
Epoch 5/10: 129/129 - 17s 132ms/step - loss: 1.2273 - binary_accuracy: 0.5990 - val_loss: 33.5626 - val_binary_accuracy: 0.6137
Epoch 6/10: 129/129 - 17s 131ms/step - loss: 1.2917 - binary_accuracy: 0.6172 - val_loss: 38.6981 - val_binary_accuracy: 0.6078
Epoch 7/10: 129/129 - 17s 130ms/step - loss: 1.1350 - binary_accuracy: 0.6125 - val_loss: 66.0832 - val_binary_accuracy: 0.5141
Epoch 8/10: 129/129 - 17s 130ms/step - loss: 1.0961 - binary_accuracy: 0.6170 - val_loss: 66.3189 - val_binary_accuracy: 0.5288
Epoch 9/10: 129/129 - 17s 132ms/step - loss: 1.1578 - binary_accuracy: 0.6332 - val_loss: 46.4646 - val_binary_accuracy: 0.5951
Epoch 10/10: 129/129 - 17s 130ms/step - loss: 1.0636 - binary_accuracy: 0.6167 - val_loss: 31.4029 - val_binary_accuracy: 0.6185
Batch Loss: 31.402904510498047
Accuracy: 0.6185365915298462
plot_history_binary(augmented_bin_output[1])
binary_test_accuracy, binary_conf_matrix = evaluate_binary_model(augmented_bin_output[2], augmented_bin_output[3]) #predictions, true_predictions
binary_stats = binary_summary_Statistics(binary_conf_matrix)
print("Binary Finding Results")
print("Accuracy", binary_stats[0])
print("Precision", binary_stats[1])
print("Recall (Sensitivity)", binary_stats[2])
print("Specificity", binary_stats[3])
print("F-1 Score", binary_stats[4])
binary_tableResults(binary_stats)
dict_keys(['loss', 'binary_accuracy', 'val_loss', 'val_binary_accuracy'])
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| Binary Finding | 0.62 | 0.64 | 0.55 | 0.69 | 0.59 |
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df = pd.DataFrame(y, columns=['label'])
zero_indices = sampling_df[sampling_df["label"] == 0].index.tolist()[:980]
print(len(zero_indices))
one_indices = sampling_df[sampling_df["label"] == 1].index.tolist()[:980]
print(len(one_indices))
mult_indices = sampling_df[sampling_df["label"] == 2].index.tolist()[:980]
print(len(mult_indices))
print(len(zero_indices)+len(one_indices)+len(mult_indices))
sampled_indices = zero_indices + one_indices + mult_indices
len(sampled_indices)
980 980 980 2940
2940
X = np.asarray([X[i] for i in sampled_indices])
y = np.asarray([y[i] for i in sampled_indices])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
X_train = X_train[:80]
y_train = y_train[:80]
X_test = X_test[:20]
y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 3)
y_test_OHE = to_categorical(y_test, num_classes = 3)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened images back to (128, 128, 3); a loop is not needed for this
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 2352 and X_test is 588
Y_train is 2352 and Y_test is 588
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 2352 and X_test is 588
y_train is 2352 and y_test is 588
X_train final shape is: (2352, 49152)
X_train final shape is: (2352, 128, 128, 3)
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(3)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|---|---|
| 0 | 0 | 1.0 |
| 1 | 1 | 1.0 |
| 2 | 2 | 1.0 |
def augmented_multiclass_model_custom(X_train,y_train,X_test,y_test,input_weights):
num_class = 3
epochs = 20
# Creating a Sequential CNN from scratch: three Conv/ReLU/MaxPool blocks followed by a dense head with dropout
model = Sequential()
model.add(Conv2D(32, (3, 3), input_shape=(128, 128, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Conv2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(num_class))
model.add(Activation('softmax'))
# Add a new top layer
#x = base_model.output
#x = Flatten()(x)
# Softmax
#prediction_layer = Dense(num_class, activation='softmax')(x)
#model = Model(inputs=base_model.input, outputs=prediction_layer)
# Do not train base layers; only top
#for layer in base_model.layers:
# layer.trainable = False
# Compiler = categorical_crossentropy
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_acc', patience=3, verbose=1)]
# model.summary()
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
featurewise_center=True,
featurewise_std_normalization=True,
rotation_range=20,
width_shift_range=0.2,
height_shift_range=0.2,
horizontal_flip=True,
vertical_flip=True,
zoom_range = 0.2,
fill_mode = 'nearest'
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Printing confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
return model, b_history, predictions, true_predictions
augmented_mc_output = augmented_multiclass_model_custom(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/10: 74/74 - 9s 119ms/step - loss: 1.0960 - accuracy: 0.3723 - val_loss: 43.4747 - val_accuracy: 0.3180
Epoch 2/10: 74/74 - 9s 116ms/step - loss: 1.0757 - accuracy: 0.4097 - val_loss: 77.6616 - val_accuracy: 0.3180
Epoch 3/10: 74/74 - 9s 116ms/step - loss: 1.0638 - accuracy: 0.4331 - val_loss: 73.4242 - val_accuracy: 0.3180
Epoch 4/10: 74/74 - 9s 118ms/step - loss: 1.0584 - accuracy: 0.4500 - val_loss: 82.0337 - val_accuracy: 0.3180
Epoch 5/10: 74/74 - 9s 119ms/step - loss: 1.0569 - accuracy: 0.4334 - val_loss: 98.6399 - val_accuracy: 0.3180
Epoch 6/10: 74/74 - 9s 119ms/step - loss: 1.0609 - accuracy: 0.4402 - val_loss: 77.1101 - val_accuracy: 0.3180
Epoch 7/10: 74/74 - 9s 121ms/step - loss: 1.0519 - accuracy: 0.4369 - val_loss: 95.0960 - val_accuracy: 0.3180
Epoch 8/10: 74/74 - 9s 121ms/step - loss: 1.0545 - accuracy: 0.4346 - val_loss: 74.0448 - val_accuracy: 0.3180
Epoch 9/10: 74/74 - 9s 120ms/step - loss: 1.0554 - accuracy: 0.4361 - val_loss: 78.9248 - val_accuracy: 0.3180
Epoch 10/10: 74/74 - 9s 119ms/step - loss: 1.0426 - accuracy: 0.4598 - val_loss: 82.9443 - val_accuracy: 0.3180
Batch Loss: 82.94430541992188
Accuracy: 0.31802719831466675
plot_history(augmented_mc_output[1])
test_accuracy, conf_matrix = evaluate_model(augmented_mc_output[2], augmented_mc_output[3]) #predictions, true_predictions
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| None Finding | 0.32 | 0.32 | 1.0 | 0.0 | 0.48 |
| One Finding | 0.66 | --- | 0.0 | 1.0 | --- |
| Multiple Findings | 0.66 | --- | 0.0 | 1.0 | --- |
from keras.applications.densenet import DenseNet121, preprocess_input
from keras.layers import Input
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df = pd.DataFrame(y, columns=['label'])
zero_indices = sampling_df[sampling_df["label"] == 0].index.tolist()[:980]
print(len(zero_indices))
one_indices = sampling_df[sampling_df["label"] == 1].index.tolist()[:980]
print(len(one_indices))
mult_indices = sampling_df[sampling_df["label"] == 2].index.tolist()[:980]
print(len(mult_indices))
print(len(zero_indices)+len(one_indices)+len(mult_indices))
sampled_indices = zero_indices + one_indices + mult_indices
len(sampled_indices)
980 980 980 2940
2940
X = np.asarray([X[i] for i in sampled_indices])
y = np.asarray([y[i] for i in sampled_indices])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
X_train = X_train[:80]
y_train = y_train[:80]
X_test = X_test[:20]
y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 3)
y_test_OHE = to_categorical(y_test, num_classes = 3)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened images back to (128, 128, 3); a loop is not needed for this
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 2352 and X_test is 588
Y_train is 2352 and Y_test is 588
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 2352 and X_test is 588
y_train is 2352 and y_test is 588
X_train final shape is: (2352, 49152)
X_train final shape is: (2352, 128, 128, 3)
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(3)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|---|---|
| 0 | 0 | 1.0 |
| 1 | 1 | 1.0 |
| 2 | 2 | 1.0 |
def myFunc(image):
    # Convert an RGB image to HSV colour space and return it as a PIL Image.
    # Defined here as a candidate colour-space preprocessing step, but not used in the runs below.
    image = np.array(image)
    hsv_image = cv2.cvtColor(image, cv2.COLOR_RGB2HSV)
    return Image.fromarray(hsv_image)
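If this colour-space conversion were wired into the augmentation pipeline, Keras's ImageDataGenerator accepts a preprocessing_function, but it expects a rank-3 NumPy array back rather than a PIL Image. A hypothetical NumPy-returning variant (not part of the notebook's actual runs) might look like:

def rgb_to_hsv_array(image):
    # Hypothetical alternative to myFunc that keeps the NumPy format
    # expected by ImageDataGenerator's preprocessing_function.
    return cv2.cvtColor(image.astype(np.uint8), cv2.COLOR_RGB2HSV).astype(np.float32)

# e.g. generator = ImageDataGenerator(preprocessing_function=rgb_to_hsv_array, rotation_range=10, ...)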
def augmented_multiclass_model_dense_tune_batch(X_train,y_train,X_test,y_test,input_weights):
num_class = 3
epochs = 15
img_in = Input(X_train.shape[1:])
model = DenseNet121(include_top= False,
weights='imagenet',
input_tensor= img_in,
input_shape= X_train.shape[1:],
pooling ='avg')
x = model.output
predictions = Dense(num_class, activation="softmax", name="predictions")(x)
model = Model(inputs=img_in, outputs=predictions)
# Add a new top layer
#x = base_model.output
#x = Flatten()(x)
# Softmax
#prediction_layer = Dense(num_class, activation='softmax')(x)
#model = Model(inputs=base_model.input, outputs=prediction_layer)
# Do not train base layers; only top
#for layer in base_model.layers:
# layer.trainable = False
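# Note: unlike the VGG16 runs above, the DenseNet121 base is left trainable here
# (the freezing loop is commented out), so the entire network is fine-tuned end to end.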
# Compiler = categorical_crossentropy
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_acc', patience=3, verbose=1)]
#model.compile(loss='crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['binary_accuracy'])
#callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_binary_accuracy', patience=3, verbose=1)]
# model.summary()
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
rotation_range=10,
width_shift_range=0.2,
height_shift_range=0.1,
horizontal_flip=True,
vertical_flip=True,
zoom_range = 0.15
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
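# Note: because x is a generator, the batch_size argument below appears to be ignored in this run;
# the 74 steps per epoch in the log are consistent with the generator's default batch size of 32 over 2352 training images.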
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), batch_size=35, verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Printing confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
return model, b_history, predictions, true_predictions
augmented_mc_output = augmented_multiclass_model_dense_tune_batch(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/15: 74/74 - 25s 188ms/step - loss: 1.2524 - accuracy: 0.4294 - val_loss: 1.6238 - val_accuracy: 0.3571
Epoch 2/15: 74/74 - 12s 164ms/step - loss: 1.0649 - accuracy: 0.4807 - val_loss: 1.2141 - val_accuracy: 0.4609
Epoch 3/15: 74/74 - 12s 163ms/step - loss: 1.0503 - accuracy: 0.4916 - val_loss: 1.2746 - val_accuracy: 0.4592
Epoch 4/15: 74/74 - 12s 162ms/step - loss: 0.9912 - accuracy: 0.5253 - val_loss: 1.1597 - val_accuracy: 0.4677
Epoch 5/15: 74/74 - 12s 165ms/step - loss: 0.9866 - accuracy: 0.5358 - val_loss: 1.0923 - val_accuracy: 0.5085
Epoch 6/15: 74/74 - 12s 164ms/step - loss: 0.9599 - accuracy: 0.5380 - val_loss: 1.1518 - val_accuracy: 0.4524
Epoch 7/15: 74/74 - 12s 164ms/step - loss: 0.9290 - accuracy: 0.5509 - val_loss: 1.4250 - val_accuracy: 0.4388
Epoch 8/15: 74/74 - 12s 165ms/step - loss: 0.9050 - accuracy: 0.5742 - val_loss: 1.2505 - val_accuracy: 0.4320
Epoch 9/15: 74/74 - 12s 163ms/step - loss: 0.8590 - accuracy: 0.5867 - val_loss: 1.1570 - val_accuracy: 0.4490
Epoch 10/15: 74/74 - 12s 164ms/step - loss: 0.8943 - accuracy: 0.5697 - val_loss: 1.1916 - val_accuracy: 0.4830
Epoch 11/15: 74/74 - 12s 165ms/step - loss: 0.8324 - accuracy: 0.6221 - val_loss: 1.1487 - val_accuracy: 0.4932
Epoch 12/15: 74/74 - 12s 164ms/step - loss: 0.8319 - accuracy: 0.6129 - val_loss: 1.1358 - val_accuracy: 0.4694
Epoch 13/15: 74/74 - 12s 166ms/step - loss: 0.8281 - accuracy: 0.6161 - val_loss: 1.3157 - val_accuracy: 0.4677
Epoch 14/15: 74/74 - 12s 165ms/step - loss: 0.8330 - accuracy: 0.6177 - val_loss: 1.1505 - val_accuracy: 0.4813
Epoch 15/15: 74/74 - 12s 167ms/step - loss: 0.7945 - accuracy: 0.6383 - val_loss: 1.2244 - val_accuracy: 0.4847
Batch Loss: 1.2243845462799072
Accuracy: 0.48469388484954834
#2 1 15 35
plot_history(augmented_mc_output[1])
test_accuracy, conf_matrix = evaluate_model(augmented_mc_output[2], augmented_mc_output[3]) #predictions, true_predictions
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| None Finding | 0.71 | 0.54 | 0.6 | 0.76 | 0.57 |
| One Finding | 0.62 | 0.48 | 0.19 | 0.87 | 0.27 |
| Multiple Findings | 0.64 | 0.45 | 0.73 | 0.6 | 0.56 |
#2 1 1 35
plot_history(augmented_mc_output[1])
test_accuracy, conf_matrix = evaluate_model(augmented_mc_output[2], augmented_mc_output[3]) #predictions, true_predictions
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| None Finding | 0.72 | 0.57 | 0.51 | 0.82 | 0.54 |
| One Finding | 0.61 | 0.43 | 0.07 | 0.95 | 0.12 |
| Multiple Findings | 0.58 | 0.41 | 0.89 | 0.44 | 0.56 |
#2 1 15 32
plot_history(augmented_mc_output[1])
test_accuracy, conf_matrix = evaluate_model(augmented_mc_output[2], augmented_mc_output[3]) #predictions, true_predictions
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| None Finding | 0.71 | 0.53 | 0.62 | 0.74 | 0.57 |
| One Finding | 0.6 | 0.46 | 0.34 | 0.76 | 0.39 |
| Multiple Findings | 0.7 | 0.51 | 0.59 | 0.75 | 0.55 |
#2 1 1 35
plot_history(augmented_mc_output[1])
test_accuracy, conf_matrix = evaluate_model(augmented_mc_output[2], augmented_mc_output[3]) #predictions, true_predictions
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| None Finding | 0.72 | 0.57 | 0.49 | 0.83 | 0.53 |
| One Finding | 0.62 | 0.5 | 0.38 | 0.77 | 0.43 |
| Multiple Findings | 0.7 | 0.51 | 0.73 | 0.69 | 0.6 |
from keras.applications.densenet import DenseNet121, preprocess_input
from keras.layers import Input
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
picklefile = open(pkl_pathx, 'rb')
X_list = pickle.load(picklefile)
picklefile = open(pkl_pathy, 'rb')
y_list = pickle.load(picklefile)
# Collapse the "multiple findings" class (label 2) into the "finding" class (label 1) for the binary task
for n, i in enumerate(y_list):
    if i == 2:
        y_list[n] = 1
X,y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606 Y is 5606
sampling_df_binary = pd.DataFrame(y, columns=['label'])
zero_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 0].index.tolist()[:2562]
print(len(zero_indices_binary))
one_indices_binary = sampling_df_binary[sampling_df_binary["label"] == 1].index.tolist()[:2562]
print(len(one_indices_binary))
print(len(zero_indices_binary)+len(one_indices_binary))
sampled_indices_binary = zero_indices_binary + one_indices_binary
len(sampled_indices_binary)
2562 2562 5124
5124
X = np.asarray([X[i] for i in sampled_indices_binary])
y = np.asarray([y[i] for i in sampled_indices_binary])
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Block training for testing
block_train = False
if block_train:
X_train = X_train[:80]
y_train = y_train[:80]
X_test = X_test[:20]
y_test = y_test[:20]
print(type(X))
print(type(y))
y_train_OHE = to_categorical(y_train, num_classes = 2)
y_test_OHE = to_categorical(y_test, num_classes = 2)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened images back to (128, 128, 3); a loop is not needed for this
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 4099 and X_test is 1025
Y_train is 4099 and Y_test is 1025
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 4099 and X_test is 1025
y_train is 4099 and y_test is 1025
X_train final shape is: (4099, 49152)
X_train final shape is: (4099, 128, 128, 3)
label_encoded_array = np.unique(y)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i : unbalanced_weights[i] for i in range(2)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|---|---|
| 0 | 0 | 1.0 |
| 1 | 1 | 1.0 |
def augmented_binary_model_dense_tune_batch(X_train,y_train,X_test,y_test,input_weights):
num_class = 2
epochs = 15
img_in = Input(X_train.shape[1:])
model = DenseNet121(include_top= False,
weights='imagenet',
input_tensor= img_in,
input_shape= X_train.shape[1:],
pooling ='avg')
x = model.output
predictions = Dense(num_class, activation="softmax", name="predictions")(x)
model = Model(inputs=img_in, outputs=predictions)
# Compile with binary cross-entropy loss
model.compile(loss='binary_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['binary_accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_binary_accuracy', patience=3, verbose=1)]
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
rotation_range=10,
width_shift_range=0.2,
height_shift_range=0.1,
horizontal_flip=True,
vertical_flip=True,
zoom_range = 0.15
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), batch_size=35, verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Printing confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
return model, b_history, predictions, true_predictions
augmented_bin_output = augmented_binary_model_dense_tune_batch(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/densenet/densenet121_weights_tf_dim_ordering_tf_kernels_notop.h5
29089792/29084464 - 0s 0us/step
Epoch 1/15: 129/129 - 41s 183ms/step - loss: 0.7392 - binary_accuracy: 0.6022 - val_loss: 0.7642 - val_binary_accuracy: 0.5522
Epoch 2/15: 129/129 - 20s 154ms/step - loss: 0.6434 - binary_accuracy: 0.6575 - val_loss: 0.6885 - val_binary_accuracy: 0.6137
Epoch 3/15: 129/129 - 20s 156ms/step - loss: 0.6413 - binary_accuracy: 0.6538 - val_loss: 0.6886 - val_binary_accuracy: 0.6517
Epoch 4/15: 129/129 - 20s 157ms/step - loss: 0.6309 - binary_accuracy: 0.6564 - val_loss: 0.6955 - val_binary_accuracy: 0.6205
Epoch 5/15: 129/129 - 20s 156ms/step - loss: 0.6034 - binary_accuracy: 0.6886 - val_loss: 0.6549 - val_binary_accuracy: 0.6459
Epoch 6/15: 129/129 - 20s 157ms/step - loss: 0.6070 - binary_accuracy: 0.6675 - val_loss: 0.6081 - val_binary_accuracy: 0.6810
Epoch 7/15: 129/129 - 20s 157ms/step - loss: 0.5890 - binary_accuracy: 0.6916 - val_loss: 0.6876 - val_binary_accuracy: 0.6068
Epoch 8/15: 129/129 - 20s 157ms/step - loss: 0.5984 - binary_accuracy: 0.6829 - val_loss: 0.6431 - val_binary_accuracy: 0.6634
Epoch 9/15: 129/129 - 20s 157ms/step - loss: 0.5627 - binary_accuracy: 0.7121 - val_loss: 0.5985 - val_binary_accuracy: 0.7024
Epoch 10/15: 129/129 - 20s 158ms/step - loss: 0.5731 - binary_accuracy: 0.7117 - val_loss: 0.6676 - val_binary_accuracy: 0.6546
Epoch 11/15: 129/129 - 20s 157ms/step - loss: 0.5801 - binary_accuracy: 0.7041 - val_loss: 0.7784 - val_binary_accuracy: 0.5795
Epoch 12/15: 129/129 - 20s 158ms/step - loss: 0.5512 - binary_accuracy: 0.7097 - val_loss: 0.6820 - val_binary_accuracy: 0.6380
Epoch 13/15: 129/129 - 20s 156ms/step - loss: 0.5612 - binary_accuracy: 0.6990 - val_loss: 0.6096 - val_binary_accuracy: 0.6741
Epoch 14/15: 129/129 - 20s 156ms/step - loss: 0.5606 - binary_accuracy: 0.7231 - val_loss: 0.7701 - val_binary_accuracy: 0.6244
Epoch 15/15: 129/129 - 21s 158ms/step - loss: 0.5428 - binary_accuracy: 0.7299 - val_loss: 0.8021 - val_binary_accuracy: 0.5902
Batch Loss: 0.8020676374435425
Accuracy: 0.5902438759803772
plot_history_binary(augmented_bin_output[1])
binary_test_accuracy, binary_conf_matrix = evaluate_binary_model(augmented_bin_output[2], augmented_bin_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", binary_test_accuracy)
plot_binary_confusion(binary_conf_matrix)
binary_stats = binary_summary_Statistics(binary_conf_matrix)
print("Binary Finding Results")
print("Accuracy", binary_stats[0])
print("Precision", binary_stats[1])
print("Recall (Sensitivity)", binary_stats[2])
print("Specificity", binary_stats[3])
print("F-1 Score", binary_stats[4])
binary_tableResults(binary_stats)
dict_keys(['loss', 'binary_accuracy', 'val_loss', 'val_binary_accuracy'])
Testing Accuracy: 0.5902439024390244
| Class | Accuracy | Precision | Recall (Sensitivity) | Specificity | F-1 Score |
|---|---|---|---|---|---|
| Binary Finding | 0.59 | 0.56 | 0.94 | 0.21 | 0.7 |
def augmented_multiclass_model_dense_tune_batch(X_train,y_train,X_test,y_test,input_weights):
num_class = 3
epochs = 15
img_in = Input(X_train.shape[1:])
model = DenseNet121(include_top= False,
weights='imagenet',
input_tensor= img_in,
input_shape= X_train.shape[1:],
pooling ='avg')
x = model.output
predictions = Dense(num_class, activation="softmax", name="predictions")(x)
model = Model(inputs=img_in, outputs=predictions)
# Compiler = categorical_crossentropy
model.compile(loss='categorical_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['accuracy'])
callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_acc', patience=3, verbose=1)]
# Adding the Image Data Generator from Keras
generator = ImageDataGenerator(
rotation_range=10,
width_shift_range=0.2,
height_shift_range=0.1,
horizontal_flip=True,
vertical_flip=True,
zoom_range = 0.15
)
generator.fit(X_train)
itr = generator.flow(X_train, y_train)
# Fitting the Model to the Augmented Data
b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test,y_test), batch_size=35, verbose=1,callbacks = [history('metrics')])
modelScore = model.evaluate(X_test, y_test, verbose=0)
# Printing metrics
print(f'Batch Loss: {modelScore[0]}')
print(f'Accuracy: {modelScore[1]}')
# Printing confusion matrix
y_pred = model.predict(X_test)
true_predictions = np.argmax(y_test,axis = 1)
predictions = np.argmax(y_pred,axis = 1)
return model, b_history, predictions, true_predictions
#2 1 1 35
plot_history(augmented_mc_output[1])
test_accuracy, conf_matrix = evaluate_model(augmented_mc_output[2], augmented_mc_output[3]) #predictions, true_predictions
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
None Finding Results
Accuracy 0.72
Precision 0.57
Recall (Sensitivity) 0.49
Specificity 0.83
F-1 Score 0.53
One Finding Results
Accuracy 0.62
Precision 0.5
Recall (Sensitivity) 0.38
Specificity 0.77
F-1 Score 0.43
Multiple Findings Results
Accuracy 0.7
Precision 0.51
Recall (Sensitivity) 0.73
Specificity 0.69
F-1 Score 0.6
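The per-class figures above come from the getTrueFalsePosNeg and summaryStatistics helpers defined earlier in the notebook. As a minimal sketch of the underlying one-vs-rest idea (not the project's exact helpers), the four counts for a single class k can be read off a square confusion matrix whose rows are true labels and whose columns are predictions:

# Hedged sketch: one-vs-rest counts for class k from a square confusion matrix cm
def one_vs_rest_counts(cm, k):
    tp = cm[k, k]                    # predicted k and truly k
    fp = cm[:, k].sum() - tp         # predicted k but truly another class
    fn = cm[k, :].sum() - tp         # truly k but predicted as another class
    tn = cm.sum() - tp - fp - fn     # everything else
    return tp, fp, tn, fn

Accuracy, precision, recall, specificity, and F-1 then follow from the usual definitions applied to these four counts.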
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
# Load the pickled image and label lists
with open(pkl_pathx, 'rb') as picklefile:
    X_list = pickle.load(picklefile)
with open(pkl_pathy, 'rb') as picklefile:
    y_list = pickle.load(picklefile)
X, y = np.asarray(X_list), np.asarray(y_list)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Optionally shrink the splits to a small subset for quick testing
block_train = False
if block_train:
    X_train = X_train[:80]
    y_train = y_train[:80]
    X_test = X_test[:20]
    y_test = y_test[:20]
print(type(X))
print(type(y))
# One-hot encode the three classes (no finding, one finding, multiple findings)
y_train_OHE = to_categorical(y_train, num_classes=3)
y_test_OHE = to_categorical(y_test, num_classes=3)
# Number of features per image once flattened (128 * 128 * 3 = 49152)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened arrays back into 128x128x3 images (a single reshape; no loop needed)
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 4484 and X_test is 1122
Y_train is 4484 and Y_test is 1122
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 4484 and X_test is 1122
y_train is 4484 and y_test is 1122
X_train final shape is:  (4484, 49152)
X_train final shape is:  (4484, 128, 128, 3)
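Because the flatten-and-reshape round trip preserves element order, it leaves the pixel values untouched. A quick sanity check (a hedged illustration, not part of the original pipeline) would be:

# Hedged check: the reshaped array is element-for-element identical to the original training images
assert np.array_equal(X_train, X_train_1d_new)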
label_encoded_array = np.unique(y)
# 'balanced' class weights: n_samples / (n_classes * per-class count)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i: unbalanced_weights[i] for i in range(3)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|---|---|
| 0 | 0 | 0.613885 |
| 1 | 1 | 1.181205 |
| 2 | 2 | 1.906803 |
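For reference, scikit-learn's 'balanced' heuristic weights each class inversely to its frequency, weight_c = n_samples / (n_classes * count_c). A minimal sketch reproducing the table above from the label array y (an illustration, not part of the original pipeline):

# Hedged check of the 'balanced' formula: n_samples / (n_classes * per-class count)
counts = np.bincount(y.astype(int))
manual_weights = len(y) / (len(counts) * counts)
print(dict(enumerate(manual_weights)))   # should match the weight column above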
model_output = augmented_multiclass_model_dense_tune_batch(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/15: 141/141 - 33s 167ms/step - loss: 1.3702 - accuracy: 0.3911 - val_loss: 1.1816 - val_accuracy: 0.4537
Epoch 2/15: 141/141 - 22s 156ms/step - loss: 1.1245 - accuracy: 0.4494 - val_loss: 1.1021 - val_accuracy: 0.4679
Epoch 3/15: 141/141 - 22s 156ms/step - loss: 1.1026 - accuracy: 0.4585 - val_loss: 1.0823 - val_accuracy: 0.4180
Epoch 4/15: 141/141 - 22s 156ms/step - loss: 1.0551 - accuracy: 0.4757 - val_loss: 1.0666 - val_accuracy: 0.4661
Epoch 5/15: 141/141 - 22s 156ms/step - loss: 1.0268 - accuracy: 0.5027 - val_loss: 1.0461 - val_accuracy: 0.4483
Epoch 6/15: 141/141 - 22s 158ms/step - loss: 0.9848 - accuracy: 0.5154 - val_loss: 1.3522 - val_accuracy: 0.4251
Epoch 7/15: 141/141 - 22s 157ms/step - loss: 1.0091 - accuracy: 0.5226 - val_loss: 0.9895 - val_accuracy: 0.4742
Epoch 8/15: 141/141 - 22s 158ms/step - loss: 0.9740 - accuracy: 0.5220 - val_loss: 1.0658 - val_accuracy: 0.4537
Epoch 9/15: 141/141 - 22s 158ms/step - loss: 0.9446 - accuracy: 0.5390 - val_loss: 1.0722 - val_accuracy: 0.4804
Epoch 10/15: 141/141 - 23s 160ms/step - loss: 0.9432 - accuracy: 0.5424 - val_loss: 0.9738 - val_accuracy: 0.4911
Epoch 11/15: 141/141 - 22s 158ms/step - loss: 0.9412 - accuracy: 0.5467 - val_loss: 0.9687 - val_accuracy: 0.4884
Epoch 12/15: 141/141 - 23s 160ms/step - loss: 0.9436 - accuracy: 0.5473 - val_loss: 1.1145 - val_accuracy: 0.4661
Epoch 13/15: 141/141 - 22s 159ms/step - loss: 0.9456 - accuracy: 0.5636 - val_loss: 0.9523 - val_accuracy: 0.5490
Epoch 14/15: 141/141 - 22s 158ms/step - loss: 0.9148 - accuracy: 0.5660 - val_loss: 0.9765 - val_accuracy: 0.5446
Epoch 15/15: 141/141 - 22s 157ms/step - loss: 0.9210 - accuracy: 0.5756 - val_loss: 0.9782 - val_accuracy: 0.5241
Batch Loss: 0.9781773686408997
Accuracy: 0.5240641832351685
plot_history(model_output[1])
test_accuracy, conf_matrix = evaluate_model(model_output[2], model_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", test_accuracy)
plot_confusion(conf_matrix)
none_TFPN = getTrueFalsePosNeg(conf_matrix, 0)
none_stats = summaryStatistics(none_TFPN[0], none_TFPN[1], none_TFPN[2], none_TFPN[3])
print("None Finding Results")
print("Accuracy", none_stats[0])
print("Precision", none_stats[1])
print("Recall (Sensitivity)", none_stats[2])
print("Specificity", none_stats[3])
print("F-1 Score", none_stats[4])
one_TFPN = getTrueFalsePosNeg(conf_matrix, 1)
one_stats = summaryStatistics(one_TFPN[0], one_TFPN[1], one_TFPN[2], one_TFPN[3])
print("One Finding Results")
print("Accuracy", one_stats[0])
print("Precision", one_stats[1])
print("Recall (Sensitivity)", one_stats[2])
print("Specificity", one_stats[3])
print("F-1 Score", one_stats[4])
mul_TFPN = getTrueFalsePosNeg(conf_matrix, 2)
mul_stats = summaryStatistics(mul_TFPN[0], mul_TFPN[1], mul_TFPN[2], mul_TFPN[3])
print("Finding Finding Results")
print("Accuracy", mul_stats[0])
print("Precision", mul_stats[1])
print("Recall (Sensitivity)", mul_stats[2])
print("Specificity", mul_stats[3])
print("F-1 Score", mul_stats[4])
tableResults(none_stats, one_stats, mul_stats)
Testing Accuracy: 0.5240641711229946
None Finding Results
Accuracy 0.67
Precision 0.66
Recall (Sensitivity) 0.68
Specificity 0.66
F-1 Score 0.67
One Finding Results
Accuracy 0.61
Precision 0.35
Recall (Sensitivity) 0.34
Specificity 0.73
F-1 Score 0.34
Multiple Findings Results
Accuracy 0.76
Precision 0.41
Recall (Sensitivity) 0.41
Specificity 0.85
F-1 Score 0.41
def augmented_binary_model_dense_tune_batch(X_train, y_train, X_test, y_test, input_weights):
    num_class = 2
    epochs = 15
    img_in = Input(X_train.shape[1:])
    # DenseNet121 backbone pre-trained on ImageNet, topped with global average pooling
    model = DenseNet121(include_top=False,
                        weights='imagenet',
                        input_tensor=img_in,
                        input_shape=X_train.shape[1:],
                        pooling='avg')
    x = model.output
    predictions = Dense(num_class, activation="softmax", name="predictions")(x)
    model = Model(inputs=img_in, outputs=predictions)
    # Compile with binary cross-entropy loss and the Adam optimizer
    model.compile(loss='binary_crossentropy', optimizer=keras.optimizers.Adam(lr=0.0001), metrics=['binary_accuracy'])
    # Early-stopping callback (defined here but not passed to model.fit below)
    callbacks_list = [keras.callbacks.EarlyStopping(monitor='val_binary_accuracy', patience=3, verbose=1)]
    # Image data generator from Keras for on-the-fly augmentation
    generator = ImageDataGenerator(
        rotation_range=10,
        width_shift_range=0.2,
        height_shift_range=0.1,
        horizontal_flip=True,
        vertical_flip=True,
        zoom_range=0.15
    )
    generator.fit(X_train)
    itr = generator.flow(X_train, y_train)
    # Fitting the model to the augmented data (batch_size has no effect when data comes from a generator)
    b_history = model.fit(x=itr, epochs=epochs, class_weight=input_weights, validation_data=(X_test, y_test),
                          batch_size=35, verbose=1, callbacks=[history('metrics')])
    modelScore = model.evaluate(X_test, y_test, verbose=0)
    # Printing the held-out loss and accuracy
    print(f'Batch Loss: {modelScore[0]}')
    print(f'Accuracy: {modelScore[1]}')
    # Class predictions and ground truth for the confusion matrix
    y_pred = model.predict(X_test)
    true_predictions = np.argmax(y_test, axis=1)
    predictions = np.argmax(y_pred, axis=1)
    return model, b_history, predictions, true_predictions
#X,y = tf.convert_to_tensor(target_df['images'], np.float32), tf.convert_to_tensor(target_df['labels'], np.float32)
# Load the pickled image and label lists
with open(pkl_pathx, 'rb') as picklefile:
    X_list = pickle.load(picklefile)
with open(pkl_pathy, 'rb') as picklefile:
    y_list = pickle.load(picklefile)
# Collapse the multiple-findings class (2) into the finding-present class (1) for the binary task
for n, i in enumerate(y_list):
    if i == 2:
        y_list[n] = 1
X, y = np.asarray(X_list), np.asarray(y_list)
print(f'X is {len(X)}')
print(f'Y is {len(y)}')
X is 5606
Y is 5606
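The element-wise remapping loop above can also be written as a single vectorized operation. A minimal equivalent sketch (assuming y_list holds the integer labels just loaded from the pickle):

# Hedged alternative: collapse label 2 ("multiple findings") into label 1 ("finding present") in one step
y = np.where(np.asarray(y_list) == 2, 1, np.asarray(y_list))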
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'Y_train is {len(y_train)} and Y_test is {len(y_test)}')
# Optionally shrink the splits to a small subset for quick testing
block_train = False
if block_train:
    X_train = X_train[:80]
    y_train = y_train[:80]
    X_test = X_test[:20]
    y_test = y_test[:20]
print(type(X))
print(type(y))
# One-hot encode the two classes (no finding vs. finding present)
y_train_OHE = to_categorical(y_train, num_classes=2)
y_test_OHE = to_categorical(y_test, num_classes=2)
# Number of features per image once flattened (128 * 128 * 3 = 49152)
X_train_scaled = X_train.shape[1]*X_train.shape[2]*X_train.shape[3]
X_test_scaled = X_test.shape[1]*X_test.shape[2]*X_test.shape[3]
X_train_1d = X_train.reshape(X_train.shape[0], X_train_scaled)
X_test_1d = X_test.reshape(X_test.shape[0], X_test_scaled)
print(f'X_train is {len(X_train)} and X_test is {len(X_test)}')
print(f'y_train is {len(y_train)} and y_test is {len(y_test)}')
# Reshape the flattened arrays back into 128x128x3 images (a single reshape; no loop needed)
X_train_1d_new = X_train_1d.reshape(len(X_train_1d), 128, 128, 3)
X_test_1d_new = X_test_1d.reshape(len(X_test_1d), 128, 128, 3)
print("X_train final shape is: ",X_train_1d.shape)
print("X_train final shape is: ",X_train_1d_new.shape)
X_train is 4484 and X_test is 1122
Y_train is 4484 and Y_test is 1122
<class 'numpy.ndarray'>
<class 'numpy.ndarray'>
X_train is 4484 and X_test is 1122
y_train is 4484 and y_test is 1122
X_train final shape is:  (4484, 49152)
X_train final shape is:  (4484, 128, 128, 3)
label_encoded_array = np.unique(y)
# 'balanced' class weights: n_samples / (n_classes * per-class count)
unbalanced_weights = class_weight.compute_class_weight('balanced', np.unique(y), y)
weight_data = {'labels': label_encoded_array, 'weight': unbalanced_weights}
#class_dict = dict(0=unbalanced_weights[0],1=unbalanced_weights[1], 2=unbalanced_weights[2])
class_dict = {i: unbalanced_weights[i] for i in range(2)}
pd.DataFrame(data=weight_data)
|   | labels | weight |
|---|---|---|
| 0 | 0 | 0.920828 |
| 1 | 1 | 1.094067 |
binary_model_output = augmented_binary_model_dense_tune_batch(X_train_1d_new, y_train_OHE, X_test_1d_new, y_test_OHE, class_dict)
Epoch 1/15: 141/141 - 34s 176ms/step - loss: 0.7486 - binary_accuracy: 0.6084 - val_loss: 0.8506 - val_binary_accuracy: 0.5963
Epoch 2/15: 141/141 - 22s 156ms/step - loss: 0.6639 - binary_accuracy: 0.6515 - val_loss: 0.6512 - val_binary_accuracy: 0.6515
Epoch 3/15: 141/141 - 22s 158ms/step - loss: 0.6357 - binary_accuracy: 0.6611 - val_loss: 0.7027 - val_binary_accuracy: 0.5891
Epoch 4/15: 141/141 - 22s 157ms/step - loss: 0.6193 - binary_accuracy: 0.6757 - val_loss: 0.6487 - val_binary_accuracy: 0.6533
Epoch 5/15: 141/141 - 22s 156ms/step - loss: 0.6256 - binary_accuracy: 0.6731 - val_loss: 0.7564 - val_binary_accuracy: 0.5526
Epoch 6/15: 141/141 - 22s 156ms/step - loss: 0.6178 - binary_accuracy: 0.6756 - val_loss: 0.6122 - val_binary_accuracy: 0.6720
Epoch 7/15: 141/141 - 22s 157ms/step - loss: 0.5926 - binary_accuracy: 0.6967 - val_loss: 0.6404 - val_binary_accuracy: 0.6506
Epoch 8/15: 141/141 - 22s 158ms/step - loss: 0.5937 - binary_accuracy: 0.6899 - val_loss: 0.6302 - val_binary_accuracy: 0.6649
Epoch 9/15: 141/141 - 22s 157ms/step - loss: 0.5829 - binary_accuracy: 0.7015 - val_loss: 0.6282 - val_binary_accuracy: 0.6515
Epoch 10/15: 141/141 - 22s 157ms/step - loss: 0.5886 - binary_accuracy: 0.6898 - val_loss: 0.6314 - val_binary_accuracy: 0.6711
Epoch 11/15: 141/141 - 22s 158ms/step - loss: 0.5738 - binary_accuracy: 0.7154 - val_loss: 0.6490 - val_binary_accuracy: 0.6480
Epoch 12/15: 141/141 - 22s 159ms/step - loss: 0.5799 - binary_accuracy: 0.6909 - val_loss: 0.6359 - val_binary_accuracy: 0.6693
Epoch 13/15: 141/141 - 23s 160ms/step - loss: 0.5554 - binary_accuracy: 0.7234 - val_loss: 0.7364 - val_binary_accuracy: 0.5954
Epoch 14/15: 141/141 - 22s 159ms/step - loss: 0.5567 - binary_accuracy: 0.7238 - val_loss: 0.6293 - val_binary_accuracy: 0.6622
Epoch 15/15: 141/141 - 22s 158ms/step - loss: 0.5378 - binary_accuracy: 0.7443 - val_loss: 0.6362 - val_binary_accuracy: 0.6756
Batch Loss: 0.636225163936615
Accuracy: 0.675579309463501
plot_history_binary(binary_model_output[1])
binary_test_accuracy, binary_conf_matrix = evaluate_binary_model(binary_model_output[2], binary_model_output[3]) #predictions, true_predictions
print("Testing Accuracy: \n", binary_test_accuracy)
plot_binary_confusion(binary_conf_matrix)
binary_stats = binary_summary_Statistics(binary_conf_matrix)
print("Binary Finding Results")
print("Accuracy", binary_stats[0])
print("Precision", binary_stats[1])
print("Recall (Sensitivity)", binary_stats[2])
print("Specificity", binary_stats[3])
print("F-1 Score", binary_stats[4])
binary_tableResults(binary_stats)
dict_keys(['loss', 'binary_accuracy', 'val_loss', 'val_binary_accuracy'])
Testing Accuracy: 0.6755793226381461
Binary Finding Results
Accuracy 0.68
Precision 0.69
Recall (Sensitivity) 0.52
Specificity 0.81
F-1 Score 0.59
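The binary summary statistics above follow the standard definitions. A minimal sketch of how they can be recovered from a 2x2 confusion matrix (an illustration assuming rows are true labels, columns are predictions, and class 1, "finding present", is the positive class; this is not the project's binary_summary_Statistics helper itself):

# Hedged sketch: standard binary metrics from a 2x2 confusion matrix
tn, fp, fn, tp = binary_conf_matrix.ravel()          # assumes row = true label, column = prediction
accuracy = (tp + tn) / (tp + tn + fp + fn)
precision = tp / (tp + fp)
recall = tp / (tp + fn)                              # sensitivity
specificity = tn / (tn + fp)
f1 = 2 * precision * recall / (precision + recall)
print(round(accuracy, 2), round(precision, 2), round(recall, 2), round(specificity, 2), round(f1, 2))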