1 Introduction
Education is a foundational pillar of national development and its people, influencing social well-being, economic growth, and long-term sustainability. The global significance of education is recognized in Sustainable Development Goal 4, which promotes inclusive and equitable quality education for all (Assembly 2015). Nationally, Brunei Darussalam’s national vision, Wawasan Brunei 2035, positions education as a cornerstone of the country’s long-term development goals (Brunei Darussalam 2020). Ensuring equitable access to education through sufficient infrastructure and fair resource distribution is critical to delivering quality learning experiences.
While several studies have examined general aspects of education in Brunei, there have been limited studies based on quantitative methods (Ebil and Shahrill 2023; Abdul Latif, Matzin, and Escoto-Kemp 2021; Salbrina, Deterding, and Nur Raihan 2024; Mohamad et al. 2018). This project aims to addresses that gap by conducting spatial analysis on schools across the mukims in Brunei, specifically using Bayesian hierarchical models to identify adminstrative regions in Brunei where school availability falls significantly below the national baseline, supporting future policy planning and school placements.
2 Data
This study focuses exclusively on government primary and secondary schools in Brunei Darussalam, as these institutions serve as the main access points to education for most youth. Bayesian spatial modeling was conducted using Brunei’s 39 mukims (administrative region) as the unit of analysis (areal analysis) to achieve a balance between geographic detail and interpretability. Key data variables that were used include school counts, administrative boundary data, population, and house prices. These datasets were mostly sourced from Bruneiverse Github Page, particularly via the bruneimap R package, and were subsequently cleaned, wrangled, and merged primarily using left_join() and rbind() (Jamil 2025a; Jamil et al. 2025).
The school dataset from 2018 was used as it is the most recent year for which disaggregated school-level data is available. Although more recent statistics exist, they are published only in summary form. Population data is drawn from the 2021 national census, the most recent census available in Brunei, despite the mismatch in years with the school dataset. Brunei conducts its national census every ten years, making the 2021 data the best option for population estimates.
To incorporate a socioeconomic indicator, we used spatiotemporal trend in average house prices from the existing literature (Jamil 2025b). These were calculated at the mukim level and included as a covariate in the Bayesian model. In cases where house price data were missing, values were imputed using predictions from an INLA-based Gaussian model. Manual imputations based on local knowledge were initially tested, but the INLA-predicted values were ultimately adopted, as both methods produced similar model outcomes. Given the nature of the data, house prices were treated as partially simulated estimates and may not fully reflect actual market values.
3 Methods
Choropleth map is used for exploratory data analysis to give an overview of school counts per mukim and overall geography of Brunei.
3.1 Spatial regression model
Let \(Y_i\) and \(E_i\) denote the observed and expected counts of schools, respectively, in mukim \(i \in \{1, \dotsc, n\}\). Let \(\theta_i\) represent the relative abundance of schools in mukim \(i\), analogous to a relative risk in disease mapping. The Bayesian hierarchical model (Banerjee, Carlin, and Gelfand 2003), incorporating spatial Besag-York-Mollié (BYM) model (Besag, York, and Mollié 1991) is specified as follow:
\[ Y_i \mid \theta_i \sim \text{Poisson}(E_i \cdot \theta_i), \quad i = 1, \dotsc, n \]
\[ \log(\theta_i) = \beta_0 + \beta_1 \cdot \text{pop youth}_i + \beta_2 \cdot \text{area}_i + \beta_3 \cdot \text{hp}_i + u_i + v_i, \]
where:
- \(\beta_0\) is the intercept,
- \(\beta_1\), \(\beta_2\), and \(\beta_3\) are regression coefficients for the standardized covariates:
- \(\text{pop youth}_i\): population youth aged 0-24 (in units of 10,000),
- \(\text{area}_i\): mukim size (in units of 10 km²),
- \(\text{hp}_i\): spatiotemporal trend in average houseprice,
- \(u_i\) is a structured spatial effect, modelled using an intrinsic conditional autoregressive (CAR) prior \(u_i \mid u_{-i} \sim \mathcal{N}(\bar{u}_{\delta_i}, \frac{1}{\tau_u n_{\delta_i}})\)
- \(v_i\) is an unstructured random effect, \(v_i \sim \text{Normal}(0, \frac{1}{\tau_v})\)
The spatial random effect \(u_i\) requires a neighborhood (adjacency) matrix. Here, We define two mukims as neighbors if they share at least one boundary point (Queen contiguity). The neighborhood graph is constructed using the poly2nb() function from the spdep package.
Model fitting was performed in a Bayesian framework using the Integrated Nested Laplace Approximation (INLA) (Rue, Martino, and Chopin 2009). Model adequacy with respect to spatial structure is assessed by by examining the spatial autocorrelation ofstandardiszed Pearson residual defined as: \[residual_i = \dfrac{Y_i - \mu_i}{\sqrt{\mu_i}} = \dfrac{Y_i - E_i \cdot \theta_i}{\sqrt{E_i \cdot \theta_i}},\] where \(Y_i\) is the observed count, \(E_i\) is the expected count (offset), and \(\theta_i\) is the relative risk. Significant residual spatial autocorrelation may indicate unaccounted spatial structure in the model.
The expected counts are computed using indirect standardization: \[E_i = \sum_{j=1}^{n} Y_j \cdot \frac{\text{pop}_i}{\sum_{j=1}^{n} \text{pop}_j}\]
3.2 Global Moran’s I
To examine whether Pearson residuals of our model exhibit a clustered, dispersed, or random spatial pattern, we apply the Global Moran’s I test using the global_moran_test() function from the spdep package. This test is computed for each mukim in the study area, indexed by \(i, j = 1, 2, \ldots, N\). The Moran’s I test statistic is defined as follows:
\[ I = \frac{N}{\sum_{i=1}^N \sum_{j=1}^N w_{ij}} \frac{\sum_{i=1}^N \sum_{j=1}^N w_{ij} (x_i - \bar{x})(x_j - \bar{x})}{\sum_{i=1}^N (x_i - \bar{x})^2} \in [-1,1], \]
where:
- \(x_i\) is the Pearson residual in mukim \(i\),
- \(\bar{x}\) is the mean residual per mukim,
- \(w_{ij}\) is the spatial weight between mukims \(i\) and \(j\).
For simplicity, the same Queen Contiguity neighbour is used. Moran’s I values are standardized, with values close to \(+1\) indicating positive spatial autocorrelation (i.e., clustering), where high or low values are near each other. Values close to \(-1\) indicate negative spatial autocorrelation (i.e., dispersion), where neighboring values differ significantly. Values near \(0\) suggest randomness, indicating an absence of spatial pattern. Figure 1 shows the three configurations of areas.
To determine the significance of the Moran’s I statistic, we employ the Central Limit Theorem to calculate p-values based on a Z-score, allowing us to test the following hypotheses:
- \(H_0: I = 0\) (no spatial autocorrelation),
- \(H_1: I \neq 0\) (presence of spatial autocorrelation).
4 Results
4.1 Exploratory Data Analysis
Figure 2 shows that mukims with higher schools count located in the coastal region near South China Sea. As expected, majority of mukims with high school count are in Brunei-Muara District (north-east of mainland), the most populated and urban district in Brunei.
4.2 Model
The model results show that the intercept is estimated at \(\hat{\beta}_0 = 0.824\), with a 95% credible interval of \((0.430, 1.215)\). The coefficient for youth population, \(\beta_1 = -0.123\), with a 95% credible interval of \((-0.163, -0.083)\), indicates a statistically significant negative relationship between population size and relative school abundance. Specifically, for every 10,000 increase in population, the relative abundance of schools decreases by approximately 12%, since \(\exp(-0.123) \approx 0.884\). In contrast, the coefficient for mukim size is positive \(\hat{\beta_2}=0.018\) (95% CI: 0.001, 0.036). However, the house price is not statistically significant \(\hat{\beta_3}=0.001\) (95% CI: -0.002, 0.003).
The estimated relative abundance (RA) of schools across mukims (Figure 3) indicates lower values in the northern coastal mukims, particularly in northern Brunei-Muara District, along the South China Sea. Conversely, higher RA values appear inland, especially in less densely populated regions.
To identify mukims with potentially inadequate school provision, non-exceedence probabilities were computed for a threshold of \(RA <0.7\). The analysis (Figure 4) indicates that it is highly likely that several mukims fall below this threshold, including Mukim Sengkurong, Mukim Gadong A, Mukim Gadong B, Mukim Berakas B, and Mukim Mentiri.
Lastly, the global Moran’s I statistic of \(0.213\) (\(p<0.05\)) provides statistically significant evidence to reject the null hypothesis of no spatial autocorrelation in the residuals.
5 Discussion
The results in Section 4.2 offer insights into both overall model performance and patterns of educational equity. Global Moran’s I statistic (0.213) is not substantially high (<0.3), suggesting that the model captures most, but not all, of the residual spatial dependence. This is illustrated in Figure 5, where the variation in residual colors suggests a reasonably good fit, though further improvement may be possible with additional spatial covariates or refined random effects. The negative relationship between population and school counts, as well as higher RA values in inland (rural) mukims than coastal areas with larger population suggests that school availability in rural mukims seem to be adequate, potentially due to legacy planning policies or intentional efforts to ensure equitable access in remote areas.
In contrast, urban and peri-urban mukims in the Brunei-Muara District, the country’s most densely populated and economically active region, show signs of disparities in school access. Notably, the high non-exceedance probability of \(RA < 0.7\) in Mukim Sengkurong, Gadong A & B, Berakas B, and Mentiri indicates that they have fewer schools than expected relative to their population sizes. These disparities highlight areas that may be underinvested in educational infrastructure, warranting closer policy attention.
Importantly, these mukims are also encompass several new government housing developments, such as Perpindahan Lugu and Perpindahan Tanah Jambu. This suggests a potential planning gap, where population is increasing due to housing developments, but educational infrastructure has not yet caught up.
However, inspection of Figure 2 reveals that these areas have a similar number of schools compared to their neighboring mukims. This suggests that while the absolute number of schools may not be unusually low, the rapid increase in population within these new housing areas may have outpaced school capacity, leading to a situation where demand exceeds supply within these specific zones.
Nevertheless, this situation represents a strategic opportunity. Prioritizing school construction in these fast-growing neighborhoods could significantly improve access to education, reduce commute times, lower transportation costs, and enhance the overall quality of life for residents. Locating schools closer to homes supports national goals related to sustainable urban development, walkability, and equity in public services.
5.1 Limitations
This study is limited to public schools, excluding private and international institutions that may affect school availability in some areas. The data is from 2018, potentially missing recent changes, especially in fast-growing neighborhoods. Next, our model used the general population data and is not age-standardized, are important for demand estimation. Housing price that was used as a proxy for socioeconomic status are missing for some mukims and is therefore imputed, introducing further uncertainty. Lastly, alternative models (Negative Binomial, zero-inflated Poisson) can be explored to better han- dle overdispersion.
6 Conclusions
In summary, our model reveals a negative relationship between size of population youth and relative school abundance, suggesting urban areas have fewer schools per capita than rural ones. Several mukims in the Brunei-Muara District, including Sengkurong, Gadong A & B, Berakas B, and Mentiri, appear to have fewer schools than expected, despite ongoing population growth driven by new housing developments. While total school counts may seem adequate, local demand in these areas may exceed capacity. Addressing this gap offers a strategic opportunity to improve access, reduce travel time, and support equitable urban planning.