Spatial Analysis on School
1 Introduction
Education is the cornerstone of social and economic development, recognized globally for its transformative impact. The importance of education is underscored by Sustainable Development Goal 4, which advocates for inclusive and equitable quality education for all [1]. Similarly, Brunei Darussalam’s national vision, Wawasan Brunei 2035, prioritizes education as a fundamental driver of its development goals [2]. One critical factor in effective education is accessibility, which has spurred interest in understanding the spatial distribution of educational facilities.
This study employs spatial statistical methods, including Global Moran’s I and Local Getis-Ord \(G_i^*\), to analyze the spatial autocorrelation of schools across the country. It also evaluates whether schools are strategically located to serve the population effectively. Specifically, the study addresses three key research questions:
- Are schools in Brunei Darussalam spatially clustered or dispersed?
- If clustered, where are the clusters concentrated?
- Do school clusters align with areas of high population?
This study is motivated by Tobler’s First Law of Geography, which states that “everything is related to everything else, but near things are more related than distant things” [3]. The first two research questions examine the spatial correlation of school locations, while the third provides a practical example of how these results can inform analyses in social sciences and other fields.
Importantly, this study does not assess the quality of education, nor does it aim to address broader social science questions. Instead, the primary goal is to offer essential baseline data on the spatial distribution of schools, serving as a foundation for future research into educational equity and outcomes in Brunei, as well as the relationship between geography and education.
The paper is structured as follows: Section 2 reviews relevant literature, establishing the context and methodological framework. Section 3 introduces the study area and dataset. Section 4 outlines the methodologies, while Section 5 presents the results, identifying key patterns and trends. Finally, Section 6 concludes the paper, summarizing the findings, discussing implications, and proposing directions for future research.
2 Literature Review
Extensive studies have explored the development and general aspects of education in Brunei [4–7]. However, research utilizing quantitative or spatial methodologies to assess educational effectiveness remains limited. Notably, no existing work has provided a comprehensive spatial analysis of educational accessibility and its alignment with population needs. This paper seeks to address this research gap.
Spatial autocorrelation, the measurement of similarity between spatially distributed variables, has evolved significantly since its theoretical origins in the 19th century. Early ideas, such as Ravenstein’s exploration of distance effects on spatial phenomena, laid the groundwork for modern spatial analysis [8]. Its formalization began in the mid-20th century through the efforts of many researchers, including Michael F. Dacey and others who advanced the theoretical and practical tools for spatial analysis [8]. These collective advancements have established spatial autocorrelation as a widely used technique in geography, econometrics, and beyond, with applications ranging from cluster detection to modeling spatial relationships.
Several methods exist for measuring spatial autocorrelation, including Geary’s C, Moran’s I, and Getis-Ord statistics. Among these, Moran’s I is the most widely used [9]. A fundamental concept underpinning all spatial autocorrelation methods is the spatial weight, which quantifies neighbour relationships between regions on a map. If location \(i\) is a neighbour of location \(j\), then \(w_{ij} \neq 0\); otherwise \(w_{ij} = 0\). Usually, a location \(i\) is not considered to be a neighbour of itself, and hence \(w_{ii} = 0\). Common weighting schemes include:
- Contiguity-Based Weights
  - Rook Contiguity: Spatial units share a common edge.
  - Queen Contiguity: Spatial units share a common edge or vertex.
- Distance-Based Weights
  - Inverse Distance Weighting (IDW): Closer units have higher weights.
  - Fixed Distance Weighting: Units within a specified distance have a weight of 1, others have a weight of 0.
  - K-Nearest Neighbours (KNN): Each unit is assigned weights based on the K closest units.
For this study, rook contiguity weights are used. Mukims are treated as non-overlapping polygons, and two mukims are considered neighbours if they share a common boundary.
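The difference between rook and queen contiguity can be illustrated with a minimal sketch. It assumes a regular 3 x 3 grid as a stand-in for real polygons (mukim boundaries would require a GIS library), and it is written in plain Python purely for illustration; the function name `contiguity_weights` is hypothetical and is not part of the study's R workflow.

```python
# Illustrative sketch only: rook and queen contiguity on a regular grid.
# The grid is an assumption standing in for real mukim polygons.

def contiguity_weights(n_rows, n_cols, queen=False):
    """Return a binary weight matrix w[i][j] for cells numbered row by row."""
    n = n_rows * n_cols
    w = [[0] * n for _ in range(n)]
    for r in range(n_rows):
        for c in range(n_cols):
            i = r * n_cols + c
            for dr in (-1, 0, 1):
                for dc in (-1, 0, 1):
                    if dr == 0 and dc == 0:
                        continue  # a cell is not its own neighbour: w_ii = 0
                    if not queen and dr != 0 and dc != 0:
                        continue  # rook contiguity ignores corner-only contact
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < n_rows and 0 <= cc < n_cols:
                        w[i][rr * n_cols + cc] = 1
    return w

rook = contiguity_weights(3, 3)
queen = contiguity_weights(3, 3, queen=True)
centre = 4  # middle cell of the 3 x 3 grid
print(sum(rook[centre]), sum(queen[centre]))  # neighbour counts of the centre cell
```

On such a grid, the centre cell has more neighbours under queen contiguity than under rook contiguity, because diagonal cells touch it only at a vertex.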
3 Study Area and Data
3.1 Description of Study Area
Brunei Darussalam, commonly known as Brunei, is located on the northern coast of the island of Borneo in Southeast Asia. With an area of approximately 5,765 square kilometers, Brunei is bordered by the South China Sea to the north and surrounded by the Malaysian state of Sarawak. The nation’s territory is divided into two non-contiguous areas: the larger western section, comprising the Brunei-Muara, Tutong, and Belait districts, and the smaller eastern Temburong district. In the northeast of the larger section lies Brunei’s capital, Bandar Seri Begawan.
The districts of Brunei are subdivided into 39 smaller administrative zones known as mukims, each comprising a number of kampongs (villages). Brunei’s geography is characterized by a mix of urban centers, dense forests, and coastal lowlands. More than 70% of the nation is covered with forests, the majority of which lie inland, in the southern parts of Belait and Tutong, and across most of Temburong [10].
According to the 2021 census, Brunei has a population of approximately 445,000 [11]. The majority of the population is concentrated along the coastline, particularly in Bandar Seri Begawan, which serves as the administrative, cultural, and economic center of the nation. Brunei is a high-income country, boasting the second-highest per capita income and Human Development Index (HDI) in Southeast Asia, as well as the highest per capita Gross National Income (GNI) among OECD countries from 2005 to 2020 [12].
Education in Brunei is both free (for citizens) and compulsory for children aged 5 to 16, leading to a high literacy rate across the population. Given the nation’s wealth and commitment to education, it would be interesting to leverage spatial analysis in finding patterns and understanding how schools are clustered and distributed across the country.
3.2 Data Collection
The dataset comprises \(N = 252\) schools in Brunei Darussalam, sourced from the Ministry of Education’s Brunei Darussalam Education Statistics 2018 [13]. The decision to use the 2018 dataset stems from the lack of detailed data in more recent publications, which only provide summary versions. Specifically, the 2018 dataset includes:
- A complete listing of all schools in Brunei by sector
- Categorization of pre-primary to sixth-form institutions under the Ministry of Education (MOE Sector) into administrative clusters (Cluster 1–6)
- Student-teacher ratios and enrolment by sector and cluster
details which are not available in the summarised editions of the statistical book from recent years.
Since [13] is only available in PDF format, we converted it to a spreadsheet format using an online converter. The data was then extracted, cleaned, and reorganized in Microsoft Excel before being imported into R using the read_csv() function.
To retrieve the latitudes and longitudes of the schools, the osmdata_sf() function from the osmdata package was initially used. This approach, however, proved insufficient, as some schools were missing and others had abbreviated names. Consequently, only partial location data was obtained. To address this, left_join() was used to merge the available locations with the school listing, and the remaining coordinates were manually collected.
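The merge step can be sketched as a left join that keeps every school in the listing and flags those without coordinates. This is a plain Python illustration rather than the R code used in the study; the school names, coordinates, and the helper `left_join_coords` are all hypothetical.

```python
# Hypothetical school names and coordinates, for illustration only.
school_listing = ["School A", "School B", "School C"]
osm_coords = {
    "School A": (4.90, 114.94),
    "School C": (4.58, 114.20),
}

def left_join_coords(listing, coords):
    """Keep every school in the listing; coordinates are None when unmatched."""
    return [(name, coords.get(name)) for name in listing]

joined = left_join_coords(school_listing, osm_coords)

# Schools whose coordinates still need to be collected manually.
missing = [name for name, xy in joined if xy is None]
print(missing)
```

A left join is the natural choice here because the school listing is the authoritative set: unmatched schools must remain in the result so that their coordinates can be filled in by hand.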
3.3 Preliminary Data Analysis
With the exception of Pusat Pembangunan Belia and Pusat Latihan Kesenian dan Pertukangan Tangan, which serve as youth and community training centers, academic schools in Brunei are categorized into three main sectors: Ministry of Education (MOE), Ministry of Religious Affairs (MORA), and private institutions. The distribution of schools across these sectors includes 164 under MOE, 9 under MORA, and 77 private, comprising approximately 70% public (MOE, MORA) and 30% private. Generally, Figure 2 suggests that schools in Brunei are located near the shoreline, particularly towards the South China Sea.
In the MOE sector, schools from pre-primary to sixth form are organized into Clusters 1 to 6. While the number of schools in each cluster is relatively balanced, Clusters 3 and 4 have notably higher class counts and students, followed by Clusters 1 and 2, with Clusters 5 and 6 having the lowest.
Cluster | School | Class | Student |
---|---|---|---|
Cluster 1 | 25 | 453 | 9,505 |
Cluster 2 | 26 | 486 | 9,606 |
Cluster 3 | 25 | 566 | 11,064 |
Cluster 4 | 27 | 505 | 10,648 |
Cluster 5 | 29 | 379 | 6,183 |
Cluster 6 | 21 | 359 | 6,884 |
With regard to the student-teacher ratio, we concentrate on pre-primary through sixth-form schools, excluding vocational and higher education institutions due to their inconsistent structures and varying class arrangements. Across districts, Belait and Brunei-Muara have relatively higher student-teacher ratios (about 10) compared to Temburong and Tutong (approximately 7.6). By sector, MOE and MORA schools share similar values, whereas private schools have nearly double the student-teacher ratio, except in the Temburong district.
District | Student | Teacher | Student-Teacher Ratio |
---|---|---|---|
Belait | 12,955 | 1,239 | 10.46 |
Brunei-Muara | 68,188 | 6,892 | 9.89 |
Temburong | 1,893 | 248 | 7.63 |
Tutong | 9,029 | 1,180 | 7.65 |
Sector | Student | Teacher | Student-Teacher Ratio |
---|---|---|---|
MOE | 53,890 | 6,574 | 8.20 |
MORA | 5,483 | 670 | 8.18 |
Private | 32,692 | 2,315 | 14.12 |
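Since each ratio is simply the number of students divided by the number of teachers, the two tables can be checked against each other by recomputing the ratios from the counts. The sketch below does this in plain Python; the counts are copied from the tables, while the variable and function names are illustrative.

```python
# Student and teacher counts copied from the two tables above.
by_district = {
    "Belait": (12955, 1239),
    "Brunei-Muara": (68188, 6892),
    "Temburong": (1893, 248),
    "Tutong": (9029, 1180),
}
by_sector = {
    "MOE": (53890, 6574),
    "MORA": (5483, 670),
    "Private": (32692, 2315),
}

def ratios(table):
    """Student-teacher ratio for each row, rounded to two decimals."""
    return {name: round(s / t, 2) for name, (s, t) in table.items()}

district_ratios = ratios(by_district)
sector_ratios = ratios(by_sector)

# Both breakdowns should cover the same students and teachers overall.
total_by_district = tuple(map(sum, zip(*by_district.values())))
total_by_sector = tuple(map(sum, zip(*by_sector.values())))
print(district_ratios)
print(sector_ratios)
print(total_by_district == total_by_sector)
```

Comparing the district totals with the sector totals is a cheap way to confirm that both tables describe the same set of students and teachers.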
4 Methods
This section provides detailed descriptions of the spatial autocorrelation methods used to analyse the hotspots and clusters of schools. Due to the relatively small number of schools in Brunei (\(N = 252\)), the spatial autocorrelation analysis considers all schools as a whole, instead of by sector or cluster. To assess the relationship between the count of schools and population, a linear regression model is used.
4.1 Global spatial autocorrelation (GISA): Global Moran’s I
To examine whether schools in Brunei exhibit a clustered, dispersed, or random spatial pattern, we apply the Global Moran’s I test [14] using the global_moran_test() function from the sfdep package. The statistic is computed over the mukims in the study area, indexed by \(i, j = 1, 2, \ldots, N\), where \(N\) here denotes the number of mukims rather than the number of schools. The Moran’s I test statistic is defined as follows:
\[ I = \frac{N}{\sum_{i=1}^N \sum_{j=1}^N w_{ij}} \frac{\sum_{i=1}^N \sum_{j=1}^N w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^N (x_i - \bar{x})^2}, \]
where:
- \(x_i\) is the value of the study variable (count of schools) in mukim \(i\),
- \(\bar{x}\) is the mean number of schools per mukim,
- \(w_{ij}\) is the spatial weight between mukims \(i\) and \(j\).
For simplicity, rook contiguity neighbours are used for the spatial weights, as discussed in Section 2. This approach assigns \(w_{ij} = 1\) if mukims \(i\) and \(j\) share a common boundary, and \(w_{ij} = 0\) otherwise.
Moran’s I typically lies between \(-1\) and \(+1\). Values close to \(+1\) indicate positive spatial autocorrelation (i.e., clustering), where similar values, whether high or low, are near each other. Values close to \(-1\) indicate negative spatial autocorrelation (i.e., dispersion), where neighboring values differ markedly. Values near \(0\) suggest randomness, indicating an absence of spatial pattern. Figure 3 shows the three configurations of areas.
To determine the significance of the Moran’s I statistic, we rely on the Central Limit Theorem: the statistic is standardized into a Z-score, which is approximately normal under the null hypothesis, and a p-value is calculated from it. This allows us to test the following hypotheses:
- \(H_0: I = 0\) (no spatial autocorrelation),
- \(H_1: I \neq 0\) (presence of spatial autocorrelation).
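The statistic and its test can be made concrete with a minimal sketch on toy data. It implements the Moran’s I formula above in plain Python, but it replaces the normal-approximation test described in the text with a permutation test, which is a different (simulation-based) way of obtaining a p-value. It is not the sfdep implementation; the four-region chain, the values, and the function names are assumptions made for illustration.

```python
import random

def morans_i(x, w):
    """Global Moran's I for values x and a weight matrix w (list of lists)."""
    n = len(x)
    mean = sum(x) / n
    d = [v - mean for v in x]
    s0 = sum(sum(row) for row in w)  # sum of all spatial weights
    cross = sum(w[i][j] * d[i] * d[j] for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(v * v for v in d)

def permutation_p_value(x, w, n_perm=999, seed=1):
    """Two-sided pseudo p-value from random reshuffles of x over the regions."""
    rng = random.Random(seed)
    observed = morans_i(x, w)
    expected = -1.0 / (len(x) - 1)  # mean of I when there is no autocorrelation
    shuffled = list(x)
    extreme = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if abs(morans_i(shuffled, w) - expected) >= abs(observed - expected):
            extreme += 1
    return (extreme + 1) / (n_perm + 1)

# Toy example: four regions in a row, each touching only its immediate neighbours.
chain = [[0, 1, 0, 0],
         [1, 0, 1, 0],
         [0, 1, 0, 1],
         [0, 0, 1, 0]]
clustered = [1, 1, 5, 5]    # similar values sit next to each other
alternating = [1, 5, 1, 5]  # neighbouring values always differ

print(morans_i(clustered, chain), morans_i(alternating, chain))
print(permutation_p_value(clustered, chain))
```

The clustered arrangement gives a positive statistic and the alternating arrangement a negative one, matching the interpretation given above. With only four regions, the permutation p-value is not meaningful and is shown only to illustrate the mechanics.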
4.2 Local spatial autocorrelation (LISA): Local Getis-Ord
While a visual inspection suggests that certain kampongs may have a higher concentration of schools, we aim to quantify this pattern. Whereas global spatial autocorrelation tests confirm whether clustering exists, we use the Getis-Ord \(G_i^*\) statistic [16] to identify the specific areas where schools are concentrated. This statistic is computed using the hotspot_gistar function from the sfhotspot package.
In our analysis, the study area is subdivided into \(n\) square grid cells, indexed by \(i=1, 2, \ldots, n\). When no cell size is specified, the hotspot_gistar function chooses one automatically; for our data, it set the cell size to 3,400 meters (the side length of each square cell). For each grid cell \(i\), the \(G_i^*\) statistic is calculated as:
\[ G_i^* = \frac{\sum_j w_{ij} x_j}{\sum_j x_j} \]
where:
- \(x_j\) is the value of the study variable (count of schools) for grid cell \(j\),
- \(w_{ij}\) is the spatial weight between grid cells \(i\) and \(j\).
Similar to global spatial autocorrelation in Section 4.1, the spatial weights used are based on rook contiguity neighbours. However, there is one slight modification: the spatial weights \(w_{ii}\) are set to 1 rather than 0. This adjustment gives \(G_i^*\) a more localized perspective, which is valuable for identifying clusters centered directly on a point of interest rather than merely in its surrounding areas.
A statistically significant high \(G_i^*\) value indicates a “hotspot” or a cluster of high values, whereas a low \(G_i^*\) value suggests a “coldspot” or a cluster of low values.
To highlight only significant hotspot clusters, the output was filtered to include only values with \(G_i^* > 0\) and p-value \(< 0.05\). The output dataset is then cropped to Brunei’s boundary using st_intersection to refine the analysis. This method identifies school hotspots, areas where there are more schools than would be expected if they were distributed randomly.
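The \(G_i^*\) ratio defined above can be illustrated on a small grid. The sketch below is plain Python and computes only the unstandardized ratio, with \(w_{ii} = 1\) and rook neighbours; it does not produce the p-values used for filtering, and it is not the sfhotspot implementation. The grid of counts and the function name `gi_star` are hypothetical.

```python
def gi_star(counts):
    """Unstandardised G_i^* ratio for every cell of a rectangular grid of counts."""
    n_rows, n_cols = len(counts), len(counts[0])
    total = sum(sum(row) for row in counts)
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for r in range(n_rows):
        for c in range(n_cols):
            # w_ii = 1: the cell itself is included, plus its rook neighbours.
            local = counts[r][c]
            for dr, dc in ((-1, 0), (1, 0), (0, -1), (0, 1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < n_rows and 0 <= cc < n_cols:
                    local += counts[rr][cc]
            result[r][c] = local / total
    return result

# Hypothetical school counts per grid cell.
grid = [[0, 0, 0],
        [0, 4, 1],
        [0, 1, 0]]
g = gi_star(grid)
print(g[1][1], g[0][0])  # centre cell versus an empty corner
```

The centre cell and its neighbours contain every school in this toy grid, so its ratio is the largest possible, while the empty top-left corner scores zero. Deciding whether such a value is statistically significant requires a standardized version of the statistic, which this sketch omits.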
4.3 Linear Regression
To assess the relationship between the count of schools and population (by mukim), a linear regression model is used. For mukims indexed \(n=1,2,\ldots,N\), let \(Y_n\) be the count of schools and \(X_n\) be the population. Assuming a straightforward linear relationship between pairs of observations \(\{(Y_n, X_n)\}_{n=1}^N\), this model is given by \[ Y_n = \beta_0 + \beta_1X_n + \epsilon_n \] Here, \(\epsilon_n\) is a term quantifying the errors or inadequacies of the model, and the least squares estimates of the coefficients minimise the sum of these squared errors. This simple model allows us to quantify the effect of population on the count of schools, and even to make predictions about the count of schools in a particular mukim (hypothetical or not) given its population.
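For a single covariate, the least squares estimates have a closed form, which the following sketch computes together with \(R^2\). It is a plain Python illustration, not the R code used in the study; the populations, school counts, and the function name `fit_simple_ols` are hypothetical.

```python
def fit_simple_ols(x, y):
    """Least-squares estimates (beta0, beta1) and R^2 for y = beta0 + beta1 * x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    beta1 = sxy / sxx               # slope
    beta0 = y_bar - beta1 * x_bar   # intercept
    ss_res = sum((yi - (beta0 + beta1 * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - y_bar) ** 2 for yi in y)
    return beta0, beta1, 1.0 - ss_res / ss_tot

# Hypothetical mukim populations and school counts.
population = [1000, 2000, 3000, 4000]
schools = [2, 3, 3, 5]
beta0, beta1, r_squared = fit_simple_ols(population, schools)
print(beta0, beta1, r_squared)

# Predicted count of schools for a hypothetical mukim of 2,500 people.
print(beta0 + beta1 * 2500)
```

The slope \(\beta_1\) is expressed per person, so multiplying it by 10,000 gives the expected change in the count of schools per 10,000 additional residents.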
5 Results
The results section is organized into three parts, each corresponding to one of the research questions introduced in Section 1.
5.1 Are the schools clustered?
The Global Moran’s I analysis yielded an I value of 0.457. The positive Moran’s I statistic suggests a positive autocorrelation in the count of schools across mukims. Given the statistically significant result (p-value of \(4.54 \times 10^{-6} < 0.001\)), there is sufficient evidence to reject the null hypothesis \(H_0\), which assumes no spatial autocorrelation in the distribution of schools.
This finding supports the presence of a moderate to strong clustering tendency, implying that mukims with a similar number of schools, whether high or low, are geographically close to each other. The results of the Global Moran’s I test align with our visual observation of Figure 4 below.
Hypothetical distribution of schools if not clustered (random shuffle)
Visually, we also see that the clusters (mukims with brighter colors) are concentrated near the coastal regions. We will verify this using local spatial autocorrelation in the following subsection.
5.2 Locations of School Clusters
As highlighted in Figure 5, the primary concentration of schools is located in central Brunei-Muara, the capital district of Brunei. This is unsurprising given its status as the nation’s urban and administrative center. Other notable clusters outside the capital are located in:
- Temburong District: Mukim Bangar
- Tutong District: Mukim Pekan Tutong, Mukim Telisai
- Belait District: Mukim Kuala Belait, Mukim Seria
This result confirms that school clusters do indeed concentrate near the coastal regions. Furthermore, schools appear to be less abundant or accessible in the outskirts and areas outside the capital district, Brunei-Muara.
Another strength of the Getis-Ord analysis is its ability to pinpoint specific areas of clustering within each mukim, offering an advantage over the choropleth map (Figure 4). For example, schools cluster in the northeastern areas of Mukim Telisai but are more concentrated toward the south in Mukim Bangar (see Figure 5). This level of detail enables a more precise understanding of spatial clustering patterns.
5.3 Comparison to Distribution of Population
With a statistically significant result from the linear regression (p-value of \(2.54 \times 10^{-7} < 0.01\)), we observe a positive association between the count of schools and population by mukim. For every increase of 10,000 in population, the predicted count of schools increases by 3.5. With \(R^2 = 0.558\), the model explains approximately 55.8% of the variability in the count of schools, indicating a moderate fit.
Because the \(R^2\) value is only moderate, we complement the regression with a visual inspection. Although Figure 7 suggests a general alignment between school and population hotspots at the kampong level, a more detailed comparison reveals some notable differences. When we examine the top 10 kampongs by school count and by population (Table 4), only three kampongs, namely Kg. Mata-Mata, Kg. Panaga, and Kg. Sungai Akar, are shared across both lists.
At the mukim level (Table 5), the overlap is more prominent, with six mukims appearing in both top 10 lists for school count and population. This indicates that, while schools are generally located in highly populated mukims, they may not always be centered within the kampongs with the highest populations. Instead, the schools may be distributed across several kampongs within a populous mukim, possibly for reasons such as accessibility, land availability, or local demand variations.
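The top-10 comparison amounts to intersecting two ranked lists. A minimal Python sketch of that step is given below, using a top 3 instead of a top 10 to keep the example small; the area names and figures are placeholders, not real kampongs or mukims, and the helper names are hypothetical.

```python
def top_k(values, k):
    """Names of the k largest entries of a {name: value} mapping."""
    return sorted(values, key=values.get, reverse=True)[:k]

def shared_top_k(a, b, k):
    """Areas that appear in the top k of both mappings."""
    return set(top_k(a, k)) & set(top_k(b, k))

# Hypothetical figures; the area names are placeholders.
school_count = {"A": 9, "B": 7, "C": 4, "D": 2, "E": 1}
population = {"A": 500, "B": 9000, "C": 8000, "D": 7000, "E": 300}

print(sorted(shared_top_k(school_count, population, k=3)))
```

Running the same comparison at two spatial levels (kampong and mukim) then shows how the degree of overlap depends on the level of aggregation.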