Three-dimensional methodology for photogrammetric acquisition of the soft tissues of the face: a new clinical-instrumental protocol

Background The objective of this study is to define an acquisition protocol that is clear, precise, repeatable, simple, fast and that is useful for analysis of the anthropometric characteristics of the soft tissue of the face. Methods The analysis was carried out according to a new clinical-instrumental protocol that comprises four distinct phases: (1) setup of portable equipment in the space in which field analysis will be performed, (2) preparation of the subject and spatial positioning, (3) scanning of the subject with different facial expressions, and (4) treatment and processing of data. The protocol was tested on a sample comprising 66 female subjects (64 Caucasian, 1 Ethiopian, and 1 Brazilian) who were the finalists of an Italian national beauty contest in 2010. To illustrate the potential of the method, we report here the measurements and full analysis that were carried out on the facial model of one of the subjects who was scanned. Results This new protocol for the acquisition of faces is shown to be fast (phase 1, about 1 h; phase 2, about 1.5 min; phase 3, about 1.5 min; phase 4, about 15 min), simple (phases 1 to 3 requiring a short operator training period; only phase 4 requires expert operators), repeatable (with direct palpation of anatomical landmarks and marking of their positions on the face, the problem of identification of these same landmarks on the digital model is solved), reliable and precise (average precision of measurements, 0.5 to 0.6 mm over the entire surface of the face). Conclusions This standardization allows the mapping of the subjects to be carried out following the same conditions in a reliable and fast process for all of the subjects scanned.


Background
The need for three-dimensional (3D), noninvasive methods and diagnostic tools that can be used in addition to, or as an alternative to, radiographic methods has stimulated a growing interest in facial anthropometry. This is based on the mapping and measurement of the soft tissue of the face [1,2]. To make better use of the potential of these methods, it is necessary to have reliable anthropometric data of reference populations, which is made possible through the analysis of a suitably large number of representative samples. It is therefore essential to have a highly standardized clinical-instrumental protocol that allows precise, repeatable, fast, and simple data acquisition that can be used for subjects of different ages and abilities to cooperate. This can then be used for the survey of suitably large samples that are representative of different subjects, and it should also be easy enough to carry out repeated mapping over time (i.e., monitoring of the spontaneous evolution of clinical cases or results obtained under given treatments).
There are various methods for the acquisition of data relating to the shape of a 3D object [3][4][5][6]. Digital closerange photogrammetry is suitable for use in medicine. The 3D information is obtained through the acquisition and comparison of a number of specific photographic images [7][8][9] that make use of the principle of triangulation, as shown schematically in Figure 1. For each couple of camera stations (CS2, CS3) rigidly positioned in the space, for each corresponding point P(x, y) in the two images, it is possible to calculate the third coordinate z as the value of b in the triangle, knowing the distance and angles between the cameras (h, α, β).
The photographs must be taken from at least two different positions, such that through the intersection of the virtual lines drawn from each of the cameras to the object (here, the face), it is possible to define the 3D coordinates of the points of interest and, therefore, the shape and size of the object [10]. The positions and angles of orientation of each of the cameras must be known for all of the pictures in a dataset [11].
A 3D model of the facial soft tissue can then be obtained in the form of a point cloud, which provides a list of points described according to their spatial coordinates. Through the connection and recognition of the characteristics that bind these 3D points in the cloud, a mesh is created, i.e., a reconstruction of the face that consists of tiny polygons, usually triangles. A computer equipped with the photogrammetric software can then be used to process the images (Figure 2).
A new method of 3D scanning can be used to create dense 3D point clouds and to reconstruct a detailed virtual model of the surfaces, which is known as photobased scanning. This process uses digital cameras in combination with algorithms that analyze digital images [11]. The algorithm for the processing of the scans compares image pairs on the basis of small portions (or patches), area by area, evaluating which patches correspond to one another. Once the optimal correspondence is defined, the position and orientation already computed for photographs are used to calculate the position of this patch in the 3D space. The accuracy and reliability of 3D measurements of facial soft tissue were evaluated by Swennen et al. [12] through comparisons of data obtained on the same sample of subjects using 3D computer tomography and 3D stereophotogrammetry. The measurement of the facial soft tissue with 3D computer tomography showed great accuracy, except for the landmarks of the hairline, the eyebrows, and the eyelids. In contrast, 3D stereophotogrammetry showed great accuracy and reliability, except for the bony landmarks. Thus, the combination of 3D computer tomography and 3D stereophotogrammetry for the analysis of facial soft tissue allows these problems related to the hairline, eyebrows, eyelids, and the bony landmarks to be overcome [12]. The study of Ghoddousi et al. [13] compared the accuracy of the mapping of facial soft tissue using 3D stereophotogrammetry, manual anthropometric measurements, and 2D photographs. These three methods showed good repeatability of measurements. The level of accuracy of the 3D stereophotogrammetric system was very satisfactory, both for the measurement of distances and for the mapping of surfaces. Indeed, the measurements obtained with the 3D systems were sufficiently accurate and reliable enough for clinical use [14].
In surface anthropometry, indirect methods (i.e., those without direct physical contact between the subject and the mapping system), such as photogrammetry, have several advantages over direct methods (i.e., those with direct physical contact needed). This can be seen when the sensors of the equipment used for direct methods can deform the facial surface during measurements, albeit only slightly, and thus contribute as a source of inaccuracy. Instead, photogrammetric methods are relatively insensitive to slight movements of the body, as the time of interaction with the subject is shorter, and therefore, this method is less influenced by any behavior of the Figure 1 The principle of triangulation as used for digital close-range photogrammetry. CS2 and CS3, camera stations; h, α, β, distance and angles between the cameras; b, calculated z distance of the single point from the cameras.
subject. Moreover, some measurements, such as those relating to the eyes, are difficult to obtain with direct mapping methods, without causing discomfort or possible injury to the patient [12,14,15].
Previous studies investigated on the accuracy of the measurements made using photogrammetric face scanners. Rangel et al. [16] presented a study using 3dMD™ (Atlanta, GA, USA) stereophotogrammetric system: they obtained a 3D digital model having an average error of 0.35 mm with a standard deviation of 0.32 mm. Khambay et al. [17] validated a high-resolution 3D imaging system (Di3D™, Dimensional Imaging Ltd., Glasgow, Scotland, UK) using only ten landmarks on facial plaster casts, comparing the results with those obtained by a coordinate-measuring machine (CMM), reporting a system error of 0.2 mm and a reproducibility error of less than 0.5 mm. Winder et al. [18] used 18 landmarks on a mannequin head to validate a Di3D™ system, using comparison laser scans and digital caliper measures, having mean differences between measures of 0.62 mm. In 2010, Lübbers et al. [19] examined the precision and accuracy of the 3dMD™ system using 41 landmarks on a mannequin head: they report a mean global error of 0.2 mm (range 0.1 to 0.5mm). More recently, they [20] repeated the experiments using 61 landmarks on two real faces, reporting a mean global error of 0.41 mm (range 0 to 3.3 mm). Deli et al. [21,22] compared the results achieved with various photogrammetric systems having different numbers and types of cameras. Galantucci et al. [23] recently also developed a simple photogrammetric system for automatic capture and measurement of facial soft tissues during movement: digital close-range photogrammetry was used to acquire the spatial coordinates of facial landmark points and track their movements.
In [24], the Face Shape Maxi5 3D photogrammetric scanner developed by Polishape 3D srl (Bari, Italy), the one used in this study, was validated in terms of reproducibility and accuracy. Measurements were taken over a set of 23 anthropological soft tissue facial landmarks marked on two different dummies, assessing photogrammetric software precision and system measurement accuracy. The operator error was measured by repeatedly digitizing landmarks on the 3D model, and it was around 0.059 mm. The reproducibility error was calculated by digitizing landmarks on two different occasions. The average Euclidean distance between the two matched sets of coordinates was thus computed, giving a result of approximately 0.090 mm. Each dummy was digitized for comparison using a CMM of documented accuracy (0.5 μm). The as-obtained landmark coordinates were considered as the 'gold standard'. The system error mean value was then found to be equal to 0.425 mm, with a standard deviation of 0.142 mm. Therefore, accuracy results suggested that the 3D scanning system used in this work was reliable enough to capture the facial morphology for clinical and anthropological usage.
This methodology thus provides higher accuracy with respect to laser scanning (Vivid 910i, Minolta, Osaka, Japan; a method based on triangulation and widely used in the literature for 3D facial scans), which has an accuracy of ±0.38 and ±0.31 mm in the X and Y directions, respectively, and ±0.2 mm in the Z direction.
The aim of the study was to define a standardized clinical-instrumental procedure to perform such facial scanning and mapping with the use of a specific 3D photogrammetric methodology that allows the reconstruction of 3D digital models of the face. Thus, with the aid of dedicated computer software, this allows quantitative and qualitative anthropometric evaluations of the surface characteristics of the facial soft tissue.

Methods
The 3D photogrammetric method that was designed and implemented in the Laboratory of Rapid Prototyping and Reverse Engineering in Politecnico di Bari (Italy) using the equipment shown in Figure 3. The choice of photogrammetry over other optical scanning methods, such as laser or structured light, was done because photogrammetry is the only method that allows simultaneous acquisition of a large visual and angular field in a much shorter time (1/100 to 1/5,000 s) than with other techniques [3].
The equipment used included five high-definition digital single-lens reflex cameras with flash (Canon 40D, 10 Mpx) that were suitably mounted on a specific rigid structure and connected via computer for their simultaneous operation. The processing of the point clouds and the creation of the virtual 3D models were carried out through the use of photogrammetric reconstruction algorithms [25,26]. The operational protocol designed for this facial scanning consisted of four phases defined as follows [27]: Phase 1: initial setup and preparation of the scanning device. The cameras are mounted on an aluminum frame structure on a robust tripod. The structure contains synchronization and data communication electronics, electric tension transformers, and all of the connecting cables. This frame is designed and built to allow the correct orientation with respect to the face to be scanned ( Figure 3). The structure is positioned at a distance of about 1 m from the subject and connected to a laptop computer used for the remote control of the cameras and to record the images acquired. Two multi-phosphorus fluorescent daylight lamps are positioned for the diffuse illumination of the subject and to facilitate the focusing operations. The background is provided with coded targets, in front of which the subject is seated on a stool that is height adjustable. Phase 2: preparation of the face to be mapped. The face must be free of make-up, with the hair back, to clearly and completely expose the parts of the face to be mapped and the natural texture of the skin. An operator identifies the anatomical skin landmarks on the face through direct method of inspection and palpation. In the present study, 33 landmarks were used, as selected from those according to Sforza et al. and Swennen and Farkas [28][29][30][31][32], and as illustrated in Figure 4. These were then marked by the same operator using hypoallergenic eyeliner to more easily locate the positions of these various anatomical landmarks ( Figure 5). The operator then applies three especially made hypoallergenic coded targets to the face at points N' , T_dx, and T_sx. The main skin landmarks used are [29,[33][34][35] midpoints (trichion (Tr), glabella (G), skin nasion (N'), pronasal (Prn), subnasal (Sn), labialis superior (Ls), labial inferior (Li), sub labialis (Sl), skin pogonion (Pg'), skin menton (Me')) and bilateral points (upper orbital (Os), frontotemporal (Ft), endocanthion (En), exocanthion (Ex), chresta philtri (Chp), cheilion (Ch), tragion (T'), nasal wing crest (Ac), skin antegonial point (Go')). Phase 3: spatial positioning of the subject to be scanned ( Figure 6). For the measurement of the scale during the scan, a specific reference sample is positioned on the subject. The subject is seated in front of the background of the set, which contains the coded targets, on the adjustable height stool positioned facing, and at about 1 m from, the photographic equipment. The head of the subject must be suitably framed by all of the cameras, both at the front and at the sides. It is advisable to use a mirror positioned directly above the cameras, which should be an integral part of the equipment, so that the subject can more easily comply with the conditions for the natural head position, which incorporates a natural expression of the soft tissue, with the eyes gazing into the distance. Three scans are carried out under different shooting conditions: the subject alone at rest (without any specific reference sample that would be useful for subsequent measurements) and two more scans with the inclusion of the  reference sample that is applied directly on the subject: the first of these under conditions of rest, and the second, while smiling. The necessary instructions are provided to ensure that during the scanning, while resting, the subject retains their usual dental occlusion (so as not to alter their anthropometric measurements), with the lips at rest and maintaining a very natural facial expression; even the acquisition while the subject is smiling has to maintain the same conditions. Phase 4: data processing. Using the appropriate procedures, a very dense 3D cloud of the spatial points is obtained (potentially with millions of points). The 3D individual digital models are created from this 3D cloud ( Figure 7). Finally, an automatic procedure for the extraction of the 3D models and the storage of the coordinates of the selected landmarks is also created.
This new clinical-instrumental protocol for photogrammetric acquisition was applied for the first time to acquire the faces of a sample of 66 female subjects (64 Caucasian, 1 Ethiopian, and 1 Brazilian), who are finalists in an Italian national beauty contest in 2010. Each subject explicitly expressed their free and voluntary participation in this study and allowed the 3D photogrammetric scanning of their face and related facial analyses. The 3D scanning of their faces was all carried out on the same day, having at least three different scans for each subject. The software used for photogrammetric processing was PhotoModeler™ Scanner version 6 (Eos Systems Inc., Vancouver, BC, Canada), while for the development of the 3D models, Geomagic™ version10 (Morrisville, NC, USA) was used. PhotoModeler provides estimates of how precise the positions of the points computed are, using the processing algorithm.
On the basis of these coordinates, it is possible to identify a series of anthropometric measurements and angles that can provide a guide and tool for surgeons and orthodontists involved in the reevaluation of female faces. The ith linear distance was calculated as where x, y, and z are the spatial coordinates of the j and k landmarks. The (rp-rq) ith angle measurement was calculated (in sexagesimal degrees) using Equation 2:

Results
This new protocol is very fast for the acquisition of the face. There is the need for about 1 h for the first phase of the protocol (the preparation of the scanning system and of the set must be performed only once for the installation of the equipment, and then it can work for the acquisition of the whole sample to be scanned in that environment). The subject preparation (second phase) takes about 1.5 min per subject; the correct subject positioning and the acquisition of three scans (third phase) also takes for about 1.5 min per subject. The time taken by the system for each acquisition is actually 6 ms, as the duration of the flash. The fourth phase of data processing requires about 15 min by a skilled operator (often an engineer) for the manipulation of the 3D images. From previous studies using this methodology [21,[24][25][26], the average accuracy of the measurements made onto the digital model, referring to the entire surface of the face, was 0.5 to 0.6 mm, while for the individual points (the coded targets), the measurement accuracy obtained was 0.03 mm in the X and Y directions and up to 0.15 mm in the Z direction.
These three scans were then used to build the 3D digital models under the different shooting conditions of the scanning subject (subject under resting conditions without and with references samples and while smiling).
These 3D digital models allow the automatic extraction of the precise spatial coordinates of the anatomical landmarks selected ( Figure 7). As indicated above, the direct pre-marking of the selected landmarks on the face makes these easily identifiable and uniquely recognizable on the virtual model of the face, allowing extremely precise anthropometric measurements [22,24,26].
The standardization of the methodology made it possible to define the samples evenly, recreating the same conditions of data acquisition for all the subjects. Following these requirements for each of the four phases of the protocol provided good repeatability for data acquisition [3,24,36], which was associated with excellent mapping precision, which is an intrinsic characteristic of this methodology [24,37]. Figure 7 shows the development of the 3D model of one of the subjects of the sample analyzed (Miss TF) and specifically the subject who should be the most attractive of all as she won the final of the 2010 competition. Figure 8 shows the 3D spatial mapping of the anthropometric landmarks obtained via computer on the digital images of subject Miss TF.
The 3D coordinates of the anatomical landmarks and the estimation of the accuracy of each measurement for subject Miss TF are listed in Table 1.The linear and angular measurements and the measurement ratios of subject Miss TF are given in Tables 2 and 3.

Discussion
The protocol that we have developed here has highlighted some key elements that have to be met to ensure the full and correct performance and reliability of the measures.
It is essential that the system has full synchronization of the shutters of the cameras. In phase 1 it is important to assure the correct positioning of the system and a good illumination of the face. In phase 2, for a good repeatability and comparability of measurements, the positioning of the anatomical landmarks must be done very accurately. For multiple scanning, the operations of this second phase must be performed by the same operator to provide a uniform standard of the mapping (the same intra-operator error in all of the mapping carried out) and also to avoid incorporating interoperator error into the methodology of the mapping (this error must be considered in the case of measurements taken by different operators). During phase 3, close attention must be paid to the correct positioning of the head. Indeed, although a 3D model is theoretically insensitive to the position of the head, this is not actually true because the distribution of the soft tissue of the face can change under the directional force of gravity. During this third phase, close attention must also be paid to the correct facial expression as the distribution of the soft tissue of the face can change as a result of the facial expression itself. In phase 4, it is crucial to make use of suitable hardware and software for the processing of the point clouds as these contain huge amounts of data. The automatic mapping of the spatial coordinates of the soft tissue pre-marked landmarks of the 3D virtual model allows for numerous measurements to be made for subsequent post-processing (e.g., lengths, angles, linear and angular ratios). These measurements are useful to obtain multiple high-precision biometric measurements.
The estimations of the accuracy of measurements are in line with what is generally considered to be acceptable for facial anthropometric measurements (1.5 mm) and with what has been obtained in previous studies (0.5 to 0.6 mm) [21,22,24,26]. For the subject illustrated as an example, the average of the estimations of precision was 0.43 mm. The analysis of the data shows that the greatest uncertainty in the measurement (i.e., the greatest value of the precision vector) was associated with some points that were not marked directly on the face with the eyeliner but were identified on the virtual model after the 3D reconstruction (Ex, En, Ch) and with some points that were detected by only two of the cameras (Ac, Ft, Go). The smallest uncertainty was associated with the estimate of the position of the N-nasion (0.05 mm), while the largest uncertainty was associated with Ft_l, the left frontotemporal skin landmark (1.14 mm).
It is interesting to evaluate whether the measures correspond to average values or to measures of attractiveness. As an example, we have reported here the analysis of subject Miss TF, in terms of the neoclassical canons of facial proportion [38] and the evaluation of the profile according to Peck and Peck [39]. The neoclassical canons that were used for the comparisons were as follows [38]:  Vertical proportions. The three facial sections of Tr-N, N-Sn, and Sn-Me must have the same length. Naso-aural proportions. The distances N-Sn and Sa-Sba must be the same. Naso-aural slope. The slope of the nose should be equal to the slope of the ear.
Horizontal proportions. The interocular width of En_r-En_l must be equal to the width of the eye as En_r-Ex_r and En_l-Ex_l.
Orbito-nasal proportions. The interocular width of En_r-En_l must be equal to the width of the nose as Ac_r-Ac_l.   Nose-mouth proportions. The width of the nose of Ac_r-Ac_l must be two thirds of the width of the mouth as Ch_r-Ch_l.
Nose-face proportions. The width of the nose of Ac_r-Ac_l must be one quarter of the width of the face as Zy_r-Zy_l.
From the analysis of the data obtained from the measurements taken on Miss TF (Figure 9), it is clear that the naso-aural, horizontal, and nose-mouth proportions of her face are very close to the neoclassical canons (Table 4). According to Farkas et al. [38], the orbito-nasal proportions of the lower third of her face, the naso-aural slope, and the proportions of the lower third of her face deviate slightly from the neoclassical canons. Then, according to Piero della Francesca and Durer reported in Farkas [38], the vertical and nose-facial proportions and those of the lower third of her face deviate substantially from the neoclassical canons.
The analysis of the measurements of Miss TF performed according to a profile analysis of Peck and Peck [39] (Table 5) shows that her facial angle and jaw angle are very close to the average values reported by their analysis of attractive women. Her nasal angle and the maxillo-facial angle differ slightly, while her mandibular, nasal-maxillary, and total vertical angle show the largest deviations from the average values of Peck and Peck [39].

Conclusions
The protocol described here allows the study of anthropometric characteristics of faces, making it easy to apply various methods of facial morphometric analysis on 3D digital models, both in profile and as a frontal view. The analysis of the neoclassical canons [38,40] and the analysis of profiles according to Peck and Peck [39] are also described here, and it would also be easy to perform further esthetic analyses, such as of the profile according to Bütow [41], the cephalometric analysis of soft tissue (ACTM) according to Arnett and McLaughlin [33], the 3D facial morphometry method according to Ferrario et al. [42], and the method of Baik et al. [43] to mention a few.
The direct pre-marking of the anatomical landmarks on the face makes it easy to automatically recognize their 3D positions on the 3D digital model as these are clearly visible and easily identifiable with respect to the other structures of the face, and this improves the goodness of the measurement of the coordinates of these points. The speed of this methodology made it possible to apply it in a facial mapping situation that was very particular, i.e., the national final of a popular Italian beauty contest. This provided data on the anthropometric characteristics of the faces of contestants who were considered to be the more attractive, as they had passed the initial wide selection made by juries across the whole of Italy. The selection had thus started with a sample size of about 20,000 girls to arrive at this sample of only 66 finalists in 2010. Indeed, it is the speed of the method that makes it relatively easy to apply in different facial mapping situations, on large sample groups or for extensive clinical trials. This can also be applied to different age groups and in the case of subjects who cannot cooperate fully due to problems relating to age or the presence of various handicaps. For the subjects being examined, this methodology appears to be well accepted as it is noninvasive, fast (it takes very little time for the subject), and easy to perform (the collaboration requested from the patient is minimal).

Consent
Written informed consent was obtained from the patient for the publication of this report and any accompanying images.  According to Peck and Peck [39].