DSpace Repository

DEEP LEARNING MODELS FOR IMAGE ENHANCEMENT AND INTERPRETATION


dc.contributor.advisor Dailey, Matthew N.
dc.contributor.author Farooq, Muhammad
dc.contributor.other Parnichkun, Manukid
dc.date.accessioned 2021-05-18T08:05:21Z
dc.date.available 2021-05-18T08:05:21Z
dc.date.issued 2020-05-18
dc.identifier.uri http://www.cs.ait.ac.th/xmlui/handle/123456789/1004
dc.description.abstract Deep learning methods have produced remarkable results on many classic problems in computer vision and image processing in the last few years. Many previously unsolved problems have been solved using deep learning models that transform input images into output images or into other representations such as distributions over a set of categories. Examples from image processing include super-resolution, denoising, and colorization, where the input is a degraded image (low-resolution, noisy, or grayscale) and the output is a high-quality image. Examples from computer vision include flow estimation and change detection, where the input is a pair of images and the output is a flow field or a mask indicating where the two input images differ. This dissertation explores the use of deep learning techniques to solve two specific image enhancement and interpretation problems: face super-resolution (SR) and vehicle change detection (VCD). The goal of face SR is to enhance the quality and resolution of human faces in low-quality images such as surveillance video footage. Face super-resolution has important applications in problems such as face recognition, 3D face reconstruction, face alignment, and face parsing. The majority of published face SR reconstruction work deals with synthetic data or with videos recorded under carefully controlled conditions; there is relatively little published work on reconstruction of real-world LR human faces acquired under more challenging conditions such as surveillance camera settings. This dissertation focuses on SR reconstruction of human faces extracted from real-world, low-quality surveillance video. Change detection is another important problem in computer vision that until now has been applied primarily in geographic information systems, especially to satellite imagery.
Change detection in satellite images is relatively straightforward, because successive images of the same area can be precisely aligned using planar homographies and then compared directly. The main challenges are to ignore noise, cloud cover, shadows, and atmospheric differences when making the before/after comparison. Change detection becomes much more challenging, however, when images of arbitrary 3D objects acquired from different points of view must be compared. I deal with this more challenging problem in this dissertation. Most super-resolution (SR) methods proposed to date do not use real ground-truth high-resolution (HR) and low-resolution (LR) image pairs to learn models; instead, the vast majority use synthetic LR images generated by undersampling the HR images. This approach yields excellent performance on similar synthetic LR data, but performance degrades sharply on real-world, poor-quality surveillance video footage. A promising alternative is to apply recent advances in style transfer for unpaired datasets, but state-of-the-art work along these lines (Bulat et al., 2018) has used LR images and HR images from completely different datasets, introducing more variation between the HR and LR domains than strictly necessary. In this dissertation, I propose methods that overcome both of these limitations, applying unpaired style transfer learning methods to face SR using real-world HR and LR datasets that share important properties. The key is to acquire roughly paired training data from the high-quality main stream and the lower-quality sub-stream of the same IP camera. Based on this principle, I have constructed four datasets comprising more than 400 people, with 1–15 weakly aligned real HR-LR pairs for each subject. I describe a style-transfer CycleGAN approach that produces impressive super-resolved images for low-quality test images never seen during training.
The second problem I target is vehicle change detection (VCD), which aims to solve problems arising in mobility applications related to inspection of a vehicle after it has been used. The vehicle owner's interest is to detect any damage that may have occurred to his or her vehicle while it was in use, while the user's interest is to document that any damage visible on the vehicle was pre-existing. To address this problem, I again turn to deep learning models: one for car masking, one for image alignment, one for damage image generation, and one for change detection. I design and implement a deep learning model that precisely aligns two images of the same car by warping one image onto the other. I utilize a separate deep learning model to generate sample damage images from undamaged images, and I also design and implement deep learning models that fuse damage patches onto car images in a realistic way. Finally, I apply deep learning models that detect changes on vehicles from a pair of aligned input images. The contributions of the dissertation are: 1) I introduce a deep learning model called SR-CGAN (an adapted CycleGAN) that produces impressive super-resolved images for real-world, low-quality test images; 2) I introduce a new deep learning model called IACGAN that precisely aligns two images of the same vehicle by warping one image onto the other; 3) I demonstrate the feasibility of utilizing a deep learning model to generate sample damage images from undamaged images; 4) I develop new deep learning models that fuse damage patches onto car images in a realistic way; 5) I provide a baseline for deep learning models that detect changes on vehicles from pairs of aligned input images. In this dissertation, I lay the groundwork for a longer-term research program on deep learning models for image enhancement and interpretation. en_US
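The abstract's "align with a planar homography, then compare directly" principle for satellite change detection can be illustrated with a minimal numpy sketch. This is not the dissertation's pipeline: the scene is synthetic random texture, the alignment transform is a known pure translation (a special case of a homography — in practice it would be estimated from feature matches, e.g. with RANSAC), and the injected bright square stands in for a genuine ground change.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic "before" tile standing in for a satellite image.
before = rng.integers(0, 255, (200, 200), dtype=np.uint8)

# "After" acquisition: same scene shifted by (dy, dx) — a pure-translation
# special case of the planar homography relating two nadir views — plus
# one genuinely changed region (hypothetical new structure).
dy, dx = 3, 5
after = np.zeros_like(before)
after[dy:, dx:] = before[:200 - dy, :200 - dx]
after[80:120, 80:120] = 255  # the injected change to be detected

# Step 1: warp "before" into the "after" frame using the known transform
# (in a real system, the transform is estimated, not given).
aligned = np.zeros_like(before)
aligned[dy:, dx:] = before[:200 - dy, :200 - dx]

# Step 2: once aligned, change detection is a direct pixel-wise comparison.
diff = np.abs(aligned.astype(np.int16) - after.astype(np.int16))
change_mask = diff > 50  # boolean mask of detected change
```

The mask fires only inside the injected square; everywhere else the aligned pixels match exactly. The hard cases the abstract mentions (noise, clouds, shadows) show up precisely as spurious activations in `change_mask` that a real system must suppress.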
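The unpaired style-transfer idea behind the CycleGAN-based SR approach rests on a cycle-consistency objective: an image mapped HR→LR and back LR→HR should reconstruct itself. The numpy sketch below shows only the loss structure; the two "generators" are hypothetical deterministic stand-ins (average-pool degradation and nearest-neighbour upsampling), not the learned CNNs used in the dissertation.

```python
import numpy as np

def g_hr2lr(x):
    """Stand-in HR->LR generator: 2x2 average-pool degradation (hypothetical)."""
    h, w = x.shape
    return x.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def g_lr2hr(x):
    """Stand-in LR->HR generator: nearest-neighbour 2x upsampling (hypothetical)."""
    return np.repeat(np.repeat(x, 2, axis=0), 2, axis=1)

def cycle_consistency_loss(hr):
    """L1 reconstruction error after the HR -> LR -> HR round trip.

    In CycleGAN-style training this term is added to the adversarial
    losses so the generators can be trained without paired HR/LR data.
    """
    return np.abs(g_lr2hr(g_hr2lr(hr)) - hr).mean()
```

A constant image survives the round trip exactly (loss 0), while any image with detail finer than the pooling window incurs a positive loss — which is exactly the signal that pushes a learned LR→HR generator to hallucinate plausible high-frequency detail.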
dc.description.sponsorship AIT AI Center; University of the Punjab; Fellowship from AIT en_US
dc.language.iso en en_US
dc.subject Deep learning models en_US
dc.title DEEP LEARNING MODELS FOR IMAGE ENHANCEMENT AND INTERPRETATION en_US
dc.type Dissertation en_US

