The following are 100 randomly picked images which are predicted as bar plots. The highlighted images (6 of the 100) are incorrectly classified as bar plots.

![]()

Axes detection

First, the image is converted into a black-and-white image, and the maximum number of continuous 1s along each row and each column is obtained. Next, over all columns, the maximum value of these max-continuous-1s counts is picked. A certain threshold (=10) is assumed, and the first column whose max-continuous 1s falls within this threshold region is taken as the y-axis. A similar approach is followed for the x-axis, except that the last such row is picked.

Threshold

We experimented with threshold values of 0, 5, 10, and 12, and found that a threshold of 10 gives the best results for axes detection. The table below shows the accuracy of axes detection for the varying threshold values. With this threshold, both the x- and y-axes are detected correctly for 1006 out of 1254 images in the test data set. Below are some of the failed cases in axes detection.

![]()

Text detection

Amazon Rekognition is used to detect text in the image. Text is detected using the DetectText API (detect_text), and only the text boxes with confidence >= 80 are considered. To improve text detection, a double-pass algorithm is employed: the polygons corresponding to the text found in the first pass are filled with white, and text detection is run a second time on this new image, again keeping only detections with confidence >= 80. A bounding box is then calculated from the polygon coordinates (or vertices) in the AWS Rekognition output; there is an issue with these bounding boxes for vertical text or text at an angle. Finally, a sweeping line is run from the x-axis (found by the axes-detection algorithm) to the bottom of the image to check where it intersects the maximum number of text boxes, and the text boxes below the x-axis (and to the right of the y-axis) are filtered.
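The axes-detection procedure described above (longest run of 1s per row and column, with a threshold) can be sketched as follows. The function names `max_run_of_ones` and `detect_axes`, and the exact tie-breaking rule (runs within `threshold` of the best run count as "in the region"), are assumptions, since the post does not show code:

```python
import numpy as np

def max_run_of_ones(arr):
    """Length of the longest run of 1s in a 1-D 0/1 array."""
    best = cur = 0
    for v in arr:
        cur = cur + 1 if v else 0
        best = max(best, cur)
    return best

def detect_axes(binary_img, threshold=10):
    """binary_img: 2-D numpy array of 0/1 (1 = dark pixel).
    Returns (y_axis_column, x_axis_row)."""
    col_runs = [max_run_of_ones(binary_img[:, c]) for c in range(binary_img.shape[1])]
    row_runs = [max_run_of_ones(binary_img[r, :]) for r in range(binary_img.shape[0])]
    # y-axis: the FIRST column whose longest run is within `threshold` of the best column
    best_col = max(col_runs)
    y_axis = next(c for c, run in enumerate(col_runs) if run >= best_col - threshold)
    # x-axis: the LAST row whose longest run is within `threshold` of the best row
    best_row = max(row_runs)
    x_axis = max(r for r, run in enumerate(row_runs) if run >= best_row - threshold)
    return y_axis, x_axis
```

On a clean plot the axis lines are the longest dark runs, so the first qualifying column (leftmost vertical line) and last qualifying row (bottommost horizontal line) fall on the y- and x-axes respectively.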
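The double-pass text detection described above can be sketched with boto3's Rekognition client. The helper names (`confident_detections`, `polygon_pixels`, `double_pass_detect`) are assumptions for illustration; the `detect_text` call and its response fields (`TextDetections`, `Type`, `Confidence`, `Geometry.Polygon`) are the actual Rekognition API:

```python
def confident_detections(response, min_conf=80):
    """Keep only WORD detections with Confidence >= min_conf
    (restricting to words rather than lines is an assumption)."""
    return [d for d in response['TextDetections']
            if d['Type'] == 'WORD' and d['Confidence'] >= min_conf]

def polygon_pixels(polygon, width, height):
    """Rekognition polygon vertices are 0-1 ratios of image size;
    convert them to pixel (x, y) tuples."""
    return [(p['X'] * width, p['Y'] * height) for p in polygon]

def double_pass_detect(image_path, min_conf=80):
    """Pass 1: detect text; fill each detected polygon with white;
    pass 2: detect text again on the masked image."""
    import io
    import boto3                      # third-party; assumes AWS credentials are configured
    from PIL import Image, ImageDraw  # third-party (Pillow)

    client = boto3.client('rekognition')
    img = Image.open(image_path).convert('RGB')

    with open(image_path, 'rb') as f:
        first = confident_detections(client.detect_text(Image={'Bytes': f.read()}), min_conf)

    draw = ImageDraw.Draw(img)
    for det in first:
        draw.polygon(polygon_pixels(det['Geometry']['Polygon'], *img.size), fill='white')

    buf = io.BytesIO()
    img.save(buf, format='PNG')
    second = confident_detections(client.detect_text(Image={'Bytes': buf.getvalue()}), min_conf)
    return first + second
```

Masking the first-pass hits with white lets the second pass pick up text that Rekognition initially missed, such as labels partially overlapped by higher-confidence detections.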
$ cd google-images-download && python setup.py install

Step 2: The downloaded images are carefully reviewed, and the incorrect images are removed.

Below is the count of images for each type:

Plot type

The following are the training data used, and the model files.

![]()

Pretrained models VGG-19, ResNet152V2, InceptionV3, and EfficientNetB3 are used to train on the images and are then run on the test images to classify them into 13 different types, such as Bar chart, Line graph, and Pie chart. The accuracy is calculated using stratified five-fold cross validation. The accuracy of the models is given in the table below; we see that the accuracy is around 84% for all the models used to train the data. The following are the training accuracy and loss curves captured during the training phase for each fold of cross validation.
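The stratified five-fold evaluation described above can be sketched with scikit-learn. The `cross_validate_plots` helper and the caller-supplied `train_and_score` function are hypothetical names for illustration; the post does not show its training loop:

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

def cross_validate_plots(images, labels, train_and_score, n_splits=5, seed=42):
    """Stratified k-fold: every fold keeps roughly the same per-class
    proportions as the full label set, which matters when the 13 plot-type
    classes have unequal image counts."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    fold_scores = []
    for train_idx, val_idx in skf.split(images, labels):
        # train_and_score fits a model on the training split and returns its
        # accuracy on the validation split (e.g. a fine-tuned VGG-19)
        fold_scores.append(train_and_score(images[train_idx], labels[train_idx],
                                           images[val_idx], labels[val_idx]))
    return float(np.mean(fold_scores))
```

Averaging the per-fold validation accuracies gives a single figure comparable across the four pretrained backbones.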