The wind industry has expanded rapidly. As wind farms age, their operations and maintenance issues gain in significance. The industry has been affected by failures of wind turbine components such as main bearings, gearboxes, and generators. Replacing failed components increases the cost of energy production. Therefore, research in fault identification and condition monitoring is warranted. This study investigates the detection of wind turbine gearbox faults based on vibration acceleration data provided by the National Renewable Energy Laboratory (NREL). Data mining methods [1, 2] are applied to identify the faults in the time domain.

**NREL Gearbox Test Facility**

The data used in this research originate from a damaged gearbox of a test wind turbine. The gearbox was retested at the Dynamometer Test Facility (DTF) at NREL. To retest the gearbox, the complete nacelle and drive train of the test wind turbine were installed at the DTF. The nacelle was rigidly fixed to the floor without the hub, rotor, and yaw bearing. Figure 1 shows a diagram of the DTF.

The gearbox included three stages: the low-speed stage (LSST), the intermediate-speed stage (ISST), and the high-speed stage (HSST). It was instrumented with over 125 sensors. Figure 2 shows the side view of the gearbox. As shown in Figure 2, the low-speed shaft (LSS) is connected to the rotor and the high-speed shaft (HSS) is connected to the generator.

To investigate the root cause of the gearbox damage and conduct the fault identification analysis, vibration data needed to be collected. Therefore, 12 accelerometers were mounted on the outside of the gearbox, generator, and main bearing to measure the vibration acceleration. Vibration data from all 12 accelerometers were collected at 40 kHz using a high-speed data acquisition system. Besides the vibration data, the corresponding torque of the low-speed shaft and the generator speed were recorded.

The direction of the drive train vibration acceleration is described in a three-dimensional coordinate system. The origin of the coordinate system is the intersection of the planet carrier rotation axis and the plane cutting the torque arm cylinders in half along their length. The x-axis lies along the main shaft axis toward the downwind side, and the y-axis is horizontal and perpendicular to the x-axis. The z-axis is orthogonal to the x- and y-axes. Figure 3 illustrates the coordinate system of the vibration acceleration.

Although the vibration acceleration of the system is described in three dimensions, each mounted accelerometer senses only one or two directions of acceleration. Table 1 presents the locations of the accelerometers, the measured directions of vibration acceleration, and the units of the recorded data. Figure 4 illustrates the locations of the 12 accelerometers.

Three test cases were conducted by NREL. In Case 1 the nominal speed of the high-speed shaft is set to 1800 rpm and the electrical power is set to 25 percent of the rated power. In Case 2 the nominal speed of the high-speed shaft is the same as in Case 1, but the electrical power is set to 50 percent of the rated power. Because the shaft speed is unchanged, the torque in Case 2 is twice the torque in Case 1. In Case 3 the generator speed is 1200 rpm and the torque is at 25 percent. Each test lasted 10 min.
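The factor of two between the Case 1 and Case 2 torques follows from the relation between power, torque, and angular speed. A short worked version is given below; the symbols P, τ, and ω are introduced here for illustration, and drivetrain and generator losses are neglected, which is an assumption not stated in the test description.

```latex
P = \tau\,\omega
\quad\Longrightarrow\quad
\frac{\tau_{2}}{\tau_{1}}
  = \frac{P_{2}/\omega}{P_{1}/\omega}
  = \frac{0.50\,P_{\mathrm{rated}}}{0.25\,P_{\mathrm{rated}}}
  = 2
```

The same ratio holds on the low-speed shaft, since a fixed gear ratio scales both torques by the same factor.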

**Data Processing**

To analyze the gearbox vibration in the time domain, jerk is utilized. Jerk describes the rate of change of acceleration, and it is often used to indicate the excitation of vibration. For the high-frequency vibration acceleration data described in the previous section, the jerk is approximated as in (1).

where *J* is jerk, *a* is acceleration, *t* is the time index, and *T* represents the sampling interval.
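Assuming (1) is a first-order backward difference, J(t) ≈ (a(t) − a(t−1)) / T, the computation can be sketched as follows. The function name `approximate_jerk` and the sample values are hypothetical; a forward or central difference would fit the same symbol definitions equally well.

```python
import numpy as np

def approximate_jerk(acceleration, sampling_interval):
    """Backward-difference estimate of jerk: J[t] = (a[t] - a[t-1]) / T.

    This difference scheme is an assumption made for illustration.
    """
    a = np.asarray(acceleration, dtype=float)
    # np.diff returns a[t] - a[t-1] for t = 1..n-1, so the result has n-1 samples.
    return np.diff(a) / sampling_interval

# A 40 kHz sampling rate gives a sampling interval of T = 1/40000 s.
T = 1.0 / 40_000
a = np.array([0.0, 0.5, 0.25, 0.25])  # made-up acceleration samples
jerk = approximate_jerk(a, T)
```

The output is one sample shorter than the input because the first acceleration value has no predecessor.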

Since the sampling frequency is high (40 kHz), the number of data points in a 10-min test is large. Therefore, the high-frequency jerk data (40 kHz) are converted into much lower-frequency data (1/15 Hz) by computing the mean of the jerk over 15-s intervals. The standard deviation and the maximum value of the jerk in each 15-s interval are also computed.
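At 40 kHz, each 15-s interval holds 600,000 samples, so a 10-min test reduces to 40 values per statistic and per channel. A minimal sketch of this aggregation is shown below. The function name `window_statistics` is hypothetical, the population standard deviation (NumPy's default) is an assumption, and the maximum is taken over the signed jerk rather than its magnitude, which is also an assumption.

```python
import numpy as np

def window_statistics(jerk, sampling_rate_hz, window_seconds):
    """Mean, standard deviation, and maximum of jerk over consecutive windows."""
    jerk = np.asarray(jerk, dtype=float)
    samples_per_window = int(round(sampling_rate_hz * window_seconds))
    n_windows = jerk.size // samples_per_window
    # Drop any trailing partial window, then view the data as (window, sample).
    windows = jerk[: n_windows * samples_per_window].reshape(
        n_windows, samples_per_window
    )
    return windows.mean(axis=1), windows.std(axis=1), windows.max(axis=1)

# Small synthetic stand-in: 4 Hz sampling and 2-s windows (8 samples per window).
demo = np.arange(24, dtype=float)
mean, std, peak = window_statistics(demo, sampling_rate_hz=4, window_seconds=2)
```

For the recorded data the call would use `sampling_rate_hz=40_000` and `window_seconds=15`.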

**Fault Identification Methodology**

In this section, clustering analysis [1] is utilized to investigate the failed components in the gearbox. Clustering analysis is an unsupervised method of data analysis. Clustering algorithms group observations into clusters by evaluating similarities among the observed data. The component failure can be identified by examining the pattern similarity of the jerk data measured by accelerometers mounted at different locations of the drivetrain.

The clustering analysis aims to group the 12 sensors according to their jerk data. The time series of the jerk described in the previous section are utilized in the clustering analysis. The k-means algorithm [3] is modified in this study to establish the clusters. In the original k-means algorithm, the number of clusters, k, must be set in advance by the analyst. In this study, a clustering cost function is introduced to evaluate the cluster quality for each value of k.
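The form of the clustering cost function is not given here, so the sketch below only illustrates the general idea of running k-means for several values of k and keeping the one with the lowest cost. The cost used (mean point-to-centroid distance divided by the smallest centroid-to-centroid distance), the farthest-point initialisation, and all function names are assumptions for illustration and should not be read as the modification used in the study. Each row of `points` stands in for one sensor's feature vector.

```python
import numpy as np

def kmeans(points, k, n_iter=100):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    points = np.asarray(points, dtype=float)
    # Start from the first point, then repeatedly add the point farthest from
    # the centroids chosen so far (chosen here only for repeatability).
    centroids = [points[0]]
    for _ in range(1, k):
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centroids], axis=0)
        centroids.append(points[np.argmax(d)])
    centroids = np.array(centroids)
    for _ in range(n_iter):
        dist = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        # Recompute each centroid; an empty cluster keeps its previous centroid.
        new = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(k)
        ])
        if np.allclose(new, centroids):
            break
        centroids = new
    return labels, centroids

def clustering_cost(points, labels, centroids):
    """Hypothetical cost: compactness over separation (lower is better)."""
    points = np.asarray(points, dtype=float)
    within = np.linalg.norm(points - centroids[labels], axis=1).mean()
    k = len(centroids)
    between = min(
        np.linalg.norm(centroids[i] - centroids[j])
        for i in range(k) for j in range(i + 1, k)
    )
    return within / between

def choose_k(points, k_values):
    """Run k-means for each candidate k (k >= 2) and return the lowest-cost k."""
    costs = {}
    for k in k_values:
        labels, centroids = kmeans(points, k)
        costs[k] = clustering_cost(points, labels, centroids)
    return min(costs, key=costs.get), costs

# Synthetic stand-in for 12 sensors: three tight groups of four feature vectors.
offsets = np.array([[0.1, 0.0], [-0.1, 0.0], [0.0, 0.1], [0.0, -0.1]])
centers = np.array([[0.0, 0.0], [5.0, 0.0], [0.0, 5.0]])
points = np.vstack([c + offsets for c in centers])

best_k, costs = choose_k(points, range(2, 6))
labels, centroids = kmeans(points, best_k)
```

On this synthetic data the cost is lowest at k = 3, because a smaller k merges two groups and a larger k places two centroids inside one group.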

The results of the clustering analysis for Cases 1 and 2 are the same and are illustrated in Figure 5. As shown there, the 12 accelerometers are grouped into three clusters by the modified k-means algorithm. In Cluster 1, most of the sensors measure the vibration acceleration of the gearbox low-speed stage. Cluster 2 contains data from two sensors that monitor the acceleration of the main bearing. Sensors that measure the vibration acceleration in the intermediate- and high-speed stages make up Cluster 3. To further analyze the data in the three clusters, the Euclidean distance between the centroids of the clusters is calculated. The shorter the distance, the more similar the two clusters are. Figure 6 shows the cluster distances for Cases 1 and 2.
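The centroid comparison can be sketched as a pairwise distance matrix. The function name `centroid_distances` and the centroid coordinates below are illustrative only and are not values from the study.

```python
import numpy as np

def centroid_distances(centroids):
    """Pairwise Euclidean distances between cluster centroids."""
    c = np.asarray(centroids, dtype=float)
    # Broadcasting yields every centroid-to-centroid difference vector.
    diff = c[:, None, :] - c[None, :, :]
    return np.sqrt((diff ** 2).sum(axis=2))

# Illustrative centroids only (not values from the study).
centroids = np.array([[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]])
D = centroid_distances(centroids)
# D[0, 1] is 5.0 and D[0, 2] is 10.0; D is symmetric with a zero diagonal.
```

Entry `D[i, j]` is the distance between clusters i and j, so a single row gives the distance from one reference cluster to all others.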

Since the gearbox test was conducted to examine the failed components of the gearbox, the vibration of the main bearing is treated as normal in this research, which makes Cluster 2 the reference. In Case 1 of Figure 6, the distance between the centroids of Cluster 1 and Cluster 2 is 2.31, while the distance between the centroids of Cluster 3 and Cluster 2 is 5.72. Case 2 in Figure 6 shows a similar result. Based on the results in Figure 6, the components sensed by the accelerometers in Cluster 3 are considered the primary failed components, because Cluster 3 lies much farther from Cluster 2 than Cluster 1 does. Some components sensed by the sensors in Cluster 1 are also considered failed, since the vibration data from the two sensors installed to monitor the same stage belong to two different clusters.
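The reasoning above amounts to ranking clusters by their distance from the reference cluster. A small sketch using the two Case 1 distances quoted in the text follows; the variable names are hypothetical, and no decision threshold is implied because none is given.

```python
# Case 1 centroid distances to Cluster 2 (main bearing), as quoted in the text.
distance_to_reference = {"Cluster 1": 2.31, "Cluster 3": 5.72}

# Rank clusters from most to least dissimilar to the reference cluster.
ranked = sorted(distance_to_reference, key=distance_to_reference.get, reverse=True)
most_suspect = ranked[0]

# How much farther Cluster 3 lies from the reference than Cluster 1 does.
ratio = distance_to_reference["Cluster 3"] / distance_to_reference["Cluster 1"]
```

Here `most_suspect` is Cluster 3, which lies roughly 2.5 times farther from the main-bearing cluster than Cluster 1.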

**Conclusion**

In this article, vibration acceleration data from an impaired wind turbine gearbox, provided by NREL, were analyzed to identify the faulty stage of the gearbox. In the analysis, the vibration acceleration data were transformed into jerk, the rate of change of vibration acceleration. Correlation coefficient analysis and a modified k-means clustering approach were introduced to identify the faulty stage of the gearbox. The suspected faulty stages were confirmed when the gearbox was disassembled and inspected.

**References:**

1) A. Kusiak and A. Verma, A Data-Mining Approach to Monitoring Wind Turbines, IEEE Transactions on Sustainable Energy, Vol. 3, No. 1, 2012, pp. 150-157.

2) A. Kusiak and A. Verma, Prediction of Status Patterns of Wind Turbines: A Data-Mining Approach, ASME Journal of Solar Energy Engineering, Vol. 133, No. 1, 2011, pp. 011008-1 – 011008-10.

3) P.N. Tan, M. Steinbach and V. Kumar, Introduction to Data Mining, Boston, MA: Addison Wesley, 2006.