Application of Deep Learning Algorithms for Lithographic Mask Characterization

Abstract: Optical lithography is a technique used to transfer patterns from a given photomask to a photoresist on top of a semiconductor wafer. One of the key challenges in lithographic printing is the appearance of defects on the photomask. Printable defects affect the lithographic process by causing errors in both the phase and the magnitude of the light and, consequently, in the sizes and positions of the printed features. Since it is not yet possible to produce defect-free masks, methods to inspect and repair mask defects play a significant role. This master thesis proposes and investigates the application of Convolutional Neural Networks (CNNs) to characterize and classify defects on lithographic masks. CNNs, as one class of deep learning algorithms, have achieved good results in image classification for other problems in the past.

The simulation software Dr.LiTHO of Fraunhofer IISB is used to simulate aerial images of defect-free masks and of masks with different types and locations of defects. Specifically, we compute aerial images of regular arrays of 38 nm and 25 nm wide squares imaged with typical settings of EUV lithography (λ = 13.5 nm, NA = 0.33). Only absorber defects on the mask are studied. Five types of defects are simulated: extrusion, intrusion, oversize, undersize, and center spot.
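To put the feature and defect sizes into perspective, the characteristic scale λ/NA of the imaging system can be evaluated directly from these settings. The short Python snippet below is only an illustrative back-of-the-envelope check; the values of λ and NA are taken from the settings above, everything else is generic arithmetic.

```python
# Back-of-the-envelope check of the EUV imaging scale used in this work.
wavelength_nm = 13.5   # EUV wavelength (lambda), from the simulation settings
na = 0.33              # numerical aperture of the projection optics

resolution_scale = wavelength_nm / na  # lambda / NA, roughly 40.9 nm
for defect_size_nm in (4, 5, 6):
    normalized = defect_size_nm / resolution_scale
    print(f"{defect_size_nm} nm defect  ~  {normalized:.2f} lambda/NA")

# A 5 nm defect corresponds to roughly 0.12 lambda/NA, i.e. far below
# the 25 nm and 38 nm feature sizes considered here.
```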

Depending on the position of the defect, extrusion and intrusion defects are further classified into 8 subtypes, which results in a total of 19 classes for the classification. The final architecture of the CNN contains 5 convolutional layers (conv. layers), most of which are followed by a max-pooling layer. Filters of mixed sizes (3 × 3 and 5 × 5) are used for the conv. layers. The convolution stride is fixed to 1 pixel. The spatial padding of the conv. layer input is chosen such that the spatial resolution is preserved after convolution, i.e. the padding is 1 pixel for all conv. layers. Two separate networks are trained for the detection of the defect type and of the defect location. Another algorithm is used to combine the results of the two networks.
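The following PyTorch sketch illustrates how a network of this kind could be set up. The filter counts, the exact placement of the max-pooling layers, the activation functions, the output-class counts of the two networks, and the use of single-channel aerial images as input are illustrative assumptions, not the actual hyper-parameters of the trained models.

```python
import torch
import torch.nn as nn

class DefectCNN(nn.Module):
    """Sketch of a 5-conv-layer classifier in the spirit of the architecture
    described above; hyper-parameters are illustrative assumptions."""

    def __init__(self, num_classes: int):
        super().__init__()
        self.features = nn.Sequential(
            # Mixed 3x3 and 5x5 filters, stride 1, padding chosen so that
            # the spatial resolution is preserved after each convolution.
            nn.Conv2d(1, 16, kernel_size=5, stride=1, padding=2), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(16, 32, kernel_size=3, stride=1, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(32, 64, kernel_size=3, stride=1, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 64, kernel_size=3, stride=1, padding=1), nn.ReLU(),
            nn.MaxPool2d(2),
            nn.Conv2d(64, 128, kernel_size=3, stride=1, padding=1), nn.ReLU(),
        )
        self.classifier = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Flatten(),
            nn.Linear(128, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.classifier(self.features(x))

# Two separate networks: one for the defect type, one for its location.
type_net = DefectCNN(num_classes=5)      # extrusion, intrusion, oversize, undersize, center spot
location_net = DefectCNN(num_classes=8)  # 8 possible positions for extrusion/intrusion
```

The mapping of a defect-type prediction and a position prediction onto one of the 19 final classes is then performed in a separate post-processing step, as described above.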

An accuracy of 99.9% on the training set and 99.3% on the test set is achieved for the detection of the defect type. The network trained for location detection reaches 98.7% accuracy on the training set and 98.0% on the test set. The performance of the models is also assessed with other measures such as the confusion matrix, precision, and recall. The robustness of the models is studied by systematically removing images of certain defect sizes from the training set. Moreover, we investigate the relation between defect size and the accuracy of the models. The defect type detection model can predict the type of defects down to a size of 5 nm, or 0.12 λ/NA, which is well below the classical resolution limit. The location detection model shows slightly different behavior for the two sets of images: it predicts the location of defects larger than 4 nm for the 25 nm features and larger than 6 nm for the 38 nm features.
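As an illustration of how the performance measures mentioned above (accuracy, confusion matrix, precision, recall) can be obtained, the following snippet uses scikit-learn; the label arrays shown here are placeholders, not results from the actual test set.

```python
from sklearn.metrics import (accuracy_score, confusion_matrix,
                             precision_score, recall_score)

# Placeholder ground-truth and predicted class labels for a small test set;
# in practice these would come from the trained type or location network.
y_true = [0, 1, 2, 2, 1, 0]
y_pred = [0, 1, 2, 1, 1, 0]

print("accuracy :", accuracy_score(y_true, y_pred))
print("precision:", precision_score(y_true, y_pred, average="macro"))
print("recall   :", recall_score(y_true, y_pred, average="macro"))
print("confusion matrix:\n", confusion_matrix(y_true, y_pred))
```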