Volume 7, Issue 5, October Issue 2019, Pages: 452–461
Authors: Mohit Nain, B.K. Hooda 
Abstract: Rainfall analysis is vital for agricultural production as well as for regulatory purposes, and it plays an important role in designing water harvesting structures and in crop planning. In the present study, monthly rainfall data for 42 years (1970–2011), covering 27 rain gauge stations of Haryana, were used for the classification and identification of homogeneous rainfall stations in Haryana. Clustering of rainfall stations for the monsoon period was done using Ward's method on the common principal component (CPC) scores. The results showed that there are four clusters of rain gauge stations with similar monsoon rainfall spread over Haryana: Cluster I consisted of six stations, Cluster II of eight stations, Cluster III of ten stations and Cluster IV of three stations. Cluster analysis of mean monthly rainfall was also performed using Ward's method. On the basis of mean monthly rainfall, four clusters of rain gauge stations were again observed: Cluster I consisted of five stations, Cluster II of eight stations, Cluster III of ten stations and Cluster IV of four stations. The two analyses gave patterns in close agreement, and it was found that Haryana can be grouped into four clusters based on monsoon rainfall.
Full Text:

1 Introduction

Rainfall is one of the most variable climatic characteristics, and it varies both spatially and temporally. India is a tropical country, and its agricultural planning and water utilization depend mainly on monsoon rainfall. More than 75% of India's rainfall occurs during the monsoon season. The agriculture of the state depends on the rainfall received, and rainfall characteristics such as magnitude, frequency and intensity vary both spatially and temporally. The random nature of rainfall occurrence suggests the need for its statistical analysis and logical interpretation. In particular, the monthly rainfall of a region is very helpful for farmers in deciding when and where to sow and reap for successful cultivation with proper utilization of available water and irrigation facilities. The Eastern agro-climatic zone of Haryana has high rainfall (>400 mm), whereas the Western agro-climatic zone receives less rainfall (200–400 mm); maximum rainfall reaches 800 mm in the northern districts of Panchkula, Ambala, Yamunanagar, Kurukshetra etc. (Agro Climatic Atlas of Haryana, Technical Bulletin No. 15, 2010).

There are two main cropping seasons in Haryana, Rabi and Kharif. Wheat is the main crop of the Rabi season. The second main crop season is Kharif, which coincides with hot weather and the southwest monsoon season. In the Kharif season the main crops are rice (Eastern agro-climatic zone) and cotton (Western agro-climatic zone). Frequent abnormalities in the magnitude and distribution of rainfall make cropping riskier.

Multivariate techniques are very useful tools for finding hydrologically homogeneous regions and for classifying regions based on meteorological data such as rainfall. Gadgil & Iyengar (1980) applied principal component analysis to derive patterns of temporal variation of the rainfall at fifty-three stations in peninsular India and found eight clusters.
Further, Kulkarni & Rao (2000) used the Common Principal Components (CPC) approach to classify the 20 districts of Andhra Pradesh based on monthly rainfall data. Similarly, Kulkarni & Reddy (1994) used the average linkage method to group the districts of Andhra Pradesh and found that the districts were classified into 5 to 7 clusters, depending on the season. Further, Muñoz-Díaz & Rodrigo (2004) used Ward's clustering method and principal component analysis to find climatically homogeneous zones based on seasonal rainfall for 32 Spanish localities, and found the cluster analysis technique to be more suitable than principal component analysis. Similarly, cluster analysis has been used by various researchers in various regions of India for identifying homogeneous rainfall regimes; among the most popular works are Venkatesh & Jose, 2007 (Western Ghats region of Karnataka); Yashwant & Sananse, 2015 (Marathwada region in Maharashtra) and Shirin & Thomas, 2016 (Kerala). Oliveira-Júnior et al. (2017) identified three homogeneous rainfall regions in Tocantins State, Brazil, using Ward's algorithm of cluster analysis. Similarly, Terassi & Galvani (2017) identified the homogeneous rainfall regions in the eastern watersheds of the State of Paraná, Brazil. Recently, Siraj-Ud-Doulah & Islam (2019) analyzed monthly rainfall data from 34 climate stations of Bangladesh using five agglomerative hierarchical clustering measures and found that Ward's method based on Euclidean distance, K-means and fuzzy clustering were the most suitable methods in this particular case. They found seven different climate zones in Bangladesh. Similarly, Gonçalves et al.
(2018) used annual mean precipitation and found six homogeneous regions through cluster analysis using Ward's agglomeration method, applied to a historical series of 31 years (1960–1990) at 413 satellite monitoring points in the state of Pará, in the Amazon, where the selected years occurred during an El Niño or a La Niña event.

The aim of this study was to identify homogeneous regions (rain gauge stations) in Haryana using cluster analysis and common principal component analysis techniques. For the study, monthly rainfall data for 42 years (1970–2011), covering 27 rain gauge stations of Haryana, were used for the identification of homogeneous rainfall stations in Haryana.

2 Material and Methods

2.1. Location of study and Data

The state of Haryana is located in north-western India and occupies 1.3 per cent of the geographical area of the country. The state extends between latitudes 27°39' and 30°55' N and longitudes 74°27' and 77°36' E. The data for this study comprise monthly rainfall data obtained from the Indian Meteorological Department (IMD), Pune, covering 27 rain gauge stations scattered across all the districts of Haryana state. Depending on availability, 42 years of data (1970–2011) were obtained for the rain gauge stations Fatehabad, Gurgaon, Sohana, Jind, Narwana, Firozpur Jhirka, Nuh, Panipat, Rohtak, Sonipat, Sirsa, Hisar, Bawal, Karnal, Ambala and Kaithal. For the stations Tohana, Jhajjar, Dujana, Kalka, Dadupur and Jagadhari, the data for 2006 were missing. For the stations Dadri, Ballabgarh, Thanesar, Hassanpur and Narnaul, data were available for 36 years (data for 1984–1990 were not available).

2.2. Ward's Cluster Method

Cluster analysis (CA) is a convenient method for identifying homogeneous groups of objects called clusters. There are a number of methods that can be used to carry out cluster analysis; in this study, Ward's (1963) method of cluster analysis, also known as the "minimum variance method", was used.
This method differs from other hierarchical clustering methods because it uses an analysis of variance approach to evaluate the distances between clusters. In Ward's method the within-cluster sum of squares is minimized, and the clusters with the minimum between-cluster distance are merged. Suppose two clusters $C_k$ and $C_l$ are merged to form a new cluster $C_m$; then the Euclidean distance between the new cluster and another cluster $C_j$ is given by the formula:

$$d_{j,m} = \frac{(n_j + n_k)\,d_{jk} + (n_j + n_l)\,d_{jl} - n_j\,d_{kl}}{n_j + n_m}$$

where $n_j$, $n_k$, $n_l$ and $n_m$ are the numbers of objects in clusters j, k, l and m, respectively (so that $n_m = n_k + n_l$), and $d_{jk}$, $d_{jl}$ and $d_{kl}$ represent the distances between the observations in clusters j and k, between j and l, and between k and l, respectively (Ramos, 2001). Thus, Ward's algorithm can be implemented by updating a stored Euclidean distance between cluster centroids. Although clustering results may be sensitive to the chosen method, Blashfield (1976) found that Ward's method provides the most accurate solutions among the hierarchical methods.

2.3.1. Clustering under Multiple Sampling

The common principal components approach (Kulkarni & Rao, 2000) was used and is described as follows. Suppose there are n objects to be classified into k (< n) homogeneous groups. Suppose that the jth object (j = 1, ..., n) has observations recorded by drawing a random sample of size $N_j$ from it. Let X be the random vector consisting of p variables; then $X_{ij}$ represents the ith observation vector on the jth object (i = 1, ..., $N_j$; j = 1, ..., n). Thus, on the basis of the observation vectors $X_{ij}$, the n objects are to be classified into k (< n) distinct groups. This approach involves determining a vector subspace which represents the vector subspaces of all the objects as closely as possible. Several developments have taken place in this field (Flury, 1988). Suppose that principal component analysis has been carried out for each of the n objects.
Furthermore, suppose the first q (< p) principal components are adequate for summarizing the total variance of each of the covariance matrices. Let $L_t$ (q × p) be the matrix of these vectors corresponding to the tth object (t = 1, ..., n), whose rows are the eigenvectors of these principal components. Let $\Sigma_t$ be the covariance matrix of the tth object and let $H$ (p × p) $= \sum_{t=1}^{n} L_t' L_t$ be a matrix whose first q (< k) principal components represent the "common principal components".

To obtain the common principal components, each of the covariance matrices of the rain gauge stations was subjected to principal component analysis. It was observed that the first 3 PCs accounted for at least 85 per cent of the sample variance and so adequately summarized the total variance of the 4 rainfall variables in all the 27 stations. Hence the matrix $L_t$ (t = 1, 2, ..., 27) was defined on the basis of the first 3 components. Using these CPCs, component scores for the stations (based on mean rainfall) were obtained, and clustering was carried out on the basis of these scores.

3 Results and Discussion

For clustering the rainfall stations' data for the monsoon period (June–September), the common principal components approach was used. It was observed that the first 3 PCs accounted for at least 86 per cent of the sample variance and so adequately summarized the total variance of the 4 rainfall variables (June–September) in all the 27 stations. Hence the matrix $L_t$ (t = 1, ..., 27) was defined on the basis of the first 3 components. The results of the principal component analysis of the matrix $H = \sum_{t} L_t' L_t$ given below, which give the common principal components, are presented in Table 1.

H =
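The clustering step described in Section 2.2 can be illustrated with a minimal sketch. It applies the cluster-distance update formula of that section to a handful of score vectors and stops when a requested number of clusters remains. Several things here are assumptions and not taken from the study: the function name `ward_clusters` is hypothetical, the toy `scores` are invented for illustration and are not the Haryana station scores, and the initial distances are squared Euclidean distances, which is the convention under which this update reproduces the minimum-variance criterion.

```python
from itertools import combinations


def ward_clusters(points, k):
    """Agglomerative Ward clustering using the update formula of Section 2.2.

    points: list of equal-length score vectors (one per station).
    k: number of clusters at which merging stops.
    Returns the clusters as a sorted list of sorted index lists.
    """
    # Active clusters: id -> member indices. Ids 0..n-1 are the singletons.
    members = {i: [i] for i in range(len(points))}
    # Assumption: start from squared Euclidean distances between points.
    dist = {}
    for a, b in combinations(members, 2):
        dist[(a, b)] = sum((x - y) ** 2 for x, y in zip(points[a], points[b]))

    def d(a, b):
        # Distances are stored once, under the key with the smaller id first.
        return dist[(a, b)] if a < b else dist[(b, a)]

    next_id = len(points)
    while len(members) > k:
        # Pick the pair of clusters (k, l) with the smallest distance.
        ck, cl = min(combinations(sorted(members), 2), key=lambda p: d(*p))
        nk, nl = len(members[ck]), len(members[cl])
        nm = nk + nl
        for cj in members:
            if cj in (ck, cl):
                continue
            nj = len(members[cj])
            # d_{j,m} = [(nj+nk) d_jk + (nj+nl) d_jl - nj d_kl] / (nj+nm)
            dist[(cj, next_id)] = (
                (nj + nk) * d(cj, ck) + (nj + nl) * d(cj, cl) - nj * d(ck, cl)
            ) / (nj + nm)
        # Replace clusters k and l by the merged cluster m.
        members[next_id] = members.pop(ck) + members.pop(cl)
        next_id += 1
    return sorted(sorted(m) for m in members.values())


# Hypothetical two-dimensional component scores for five stations.
scores = [(0.0, 0.0), (0.1, 0.0), (5.0, 5.0), (5.1, 5.0), (10.0, 0.0)]
print(ward_clusters(scores, 3))  # [[0, 1], [2, 3], [4]]
print(ward_clusters(scores, 2))  # [[0, 1], [2, 3, 4]]
```

In practice a library routine for Ward linkage would normally be preferred; the explicit loop is shown only to make the role of each term in the update formula visible.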
References

Blashfield RK (1976) Mixture Model tests of Cluster Analysis: Accuracy of four Agglomerative Hierarchical Methods. Psychological Bulletin 83: 377–388.
Flury B (1988) Common Principal Components and Related Multivariate Models. John Wiley and Sons, New York, USA.
Gonçalves MF, Blanco CJC, Santos VC dos, Oliveira LL dos S (2018) Homogenous regions and rainfall probability models considering El Niño and La Niña in the State of Pará in the Amazon. Acta Scientiarum Technology 40: 1:10.
Gadgil S, Iyengar RN (1980) Cluster Analysis of Rainfall Stations of the Indian Peninsula. Quarterly Journal of the Royal Meteorological Society 106: 873–886.
Kulkarni BS, Rao GN (2000) The Common Principal Components Approach for Clustering Under Multiple Sampling. Journal of Indian Society for Agricultural Statistics 53: 1–11.
Kulkarni BS, Reddy DD (1994) The Cluster Analysis Approach for Classification of Andhra Pradesh on the Basis of Rainfall. Mausam 45: 325–332.
Muñoz-Díaz D, Rodrigo FS (2004) Spatio-temporal patterns of Seasonal Rainfall in Spain (1912–2000) Using Cluster and Principal Component Analysis: Comparison. Annals of Geophysics 22: 1435–1448.
Oliveira-Júnior JF de, Xavier FMG, Teodoro Pe, Gois D de, Delgado RC (2017) Cluster Analysis Identified Rainfall
Ramos MC (2001) Divisive and Hierarchical Clustering Techniques to Analyze Variability of Rainfall Distribution Patterns in a Mediterranean Region. Atmospheric Research 57: 123–138.
Singh D, Singh R, Anuragh SC, Rao VUM, Singh S (2010) Agro Climatic Atlas of Haryana. Technical Bulletin No. 15, Department of Agricultural Meteorology, CCS HAU, Hisar. Pp 80.
Shirin AHS, Thomas R (2016) Regionalization of Rainfall in Kerala State. Procedia Technology 24: 15–22.
Swaminathan S, Meganathan S (2018) Identifying efficient clustering techniques for classifying rainfall data. Asian Journal of Microbiology, Biotechnology and Environmental Sciences 20: 566–568.
Siraj-Ud-Doulah M, Islam MN (2019) Defining Homogenous climatic Zones of Bangladesh using Cluster Analysis. International Journal of Statistics and Mathematics 6: 119–129.
Terassi PMB, Galvani E (2017) Identification of Homogeneous Rainfall Regions in the Eastern Watersheds of the State of Paraná, Brazil. Climate 53: 113.
Venkatesh B, Jose MK (2007) Identification of Homogeneous Rainfall Regimes in Parts of Western Ghat Regions of Karnataka. Journal of Earth System Science 116: 321–329.
Ward JH (1963) Hierarchical Grouping to Optimize an Objective Function. Journal of American Statistical Association 58: 236–244.
Yashwant S, Sananse SL (2015) Comparisons of Different Methods of Cluster Analysis with Application to Rainfall Data. International Journal of innovative Research in Science, Engineering and Technology 4: 10861–10872.
