Population Dynamics and Spatial Dependence: Evidence from Brazilian Cities ∗

Abstract: This paper focuses on Brazil’s population dynamics between 1970 and 2010. The first objective is to explore the behavior of Brazil’s population distribution, revisiting the traditional rank-size rule and Markov chain approaches. To increase the accuracy of the information on the dynamics and evolution of the population distribution, spatial dependence is introduced through spatial Markov chains. The shape of the distribution suggests that the divergence in population size across minimum comparable areas (MCAs) is slowing. The Zipf’s law estimates indicate that, decade by decade, the population distribution is moving away from the Pareto law. The Markov chain approach indicates, as its main evidence, a high persistence of MCAs in their own size class from one decade to the next over the entire period, and that different spatial contexts have different effects on regional transitions.


INTRODUCTION
Between 1970 and 2010, Brazil's population became increasingly concentrated in its cities. According to data from the United Nations Population Division, the share of Brazilians living in urban centers increased from 55.8 percent in 1970 to 87 percent in 2010, a high proportion for a developing country. This change, which reflects the great economic and social transformations that occurred in the country during this period, was also reflected in the size distribution of its cities. Data from the Brazilian Institute of Geography and Statistics (IBGE, 2011) indicate that 66 municipalities had more than 100,000 inhabitants in 1970, 105 municipalities in the early 1980s, 398 in 2000, and 488 in 2010. From 2000 to 2010, the number of cities with over 1 million inhabitants only increased from 13 to 14.
In the 1970s, Brazil experienced a phase of industrial consolidation. This phase, combined with the gradual replacement of subsistence farming by exportable products, contributed to the consolidation of old metropolises and the emergence of new ones. During the next two decades, Brazil experienced rising urban land prices as a result of accelerated urbanization. This rapid urbanization generated negative externalities, such as the congestion of public services, and led to a decentralization of population growth. As asserted by Martine (1994), the expansion of Brazil's agricultural frontier to the mid-west and the growth of the tertiary sector also contributed to urbanization in this region. Between 2000 and 2010, increases in Brazil's production of commodities and agricultural products generated jobs in towns far away from large traditional urban centers. However, an entire decade of price stability and income growth, mainly in the peripheral regions, and the new potential relevance of the big cities' human capital created additional conditions for the population dynamics in Brazil's cities (Da Mata et al., 2007; Silva et al., 2017).
In summary, three social forces have shaped the size distribution of Brazilian cities. First, agglomeration forces were associated with the industrialization process and, more recently, with the local concentration of human capital. These forces tend to increase the importance of traditional big cities and metropolises. Additionally, both the problems associated with the congestion or insufficiency of public services and goods provision and the expansion of the agricultural frontier acted to stimulate the growth of small- and medium-sized cities. Despite the great transformations experienced in Brazil between 1970 and 2010, the number of large cities (over 1 million inhabitants) was relatively stable and, by contrast, many small towns grew to be medium-sized.
In this research, we aim to increase the understanding of the impacts of these forces by answering two basic questions associated with city size distribution in Brazil. First, how did cities of different sizes grow during the period 1970 to 2010? Second, how are the movements within the distribution affected by spatial dependence? Specifically, we investigated the evolution of city size distribution and how this is affected by the location of the cities relative to their neighbors. The last point is motivated by the evidence provided by Resende (2013) and Silva et al. (2017): spatial context is essential for understanding the economic and population dynamism of Brazilian cities.
City size distribution has been extensively studied (Pumain and Moriconi-Ebrard, 1997; Dobkins and Ioannides, 2000; Black and Henderson, 2003; Gabaix and Ioannides, 2004; Le Gallo and Chasco, 2008; Lalanne, 2013; Soo, 2014), but most empirical studies consider only developed countries and few explicitly account for spatial dependence. Among these studies, Dobkins and Ioannides (2000) found evidence of divergent growth if spatial dependence is ignored but found convergent growth in the presence of spatial effects. Le Gallo and Chasco (2008) analyzed the evolution of the population growth of Spain's municipalities between 1900 and 2001. The authors estimated transition matrices associated with discrete Markov chains to obtain information concerning the movements of the urban groups within the population distribution, but they did not explore the approach suggested by Rey (2001), who proposes the use of a spatial Markov matrix. Lalanne (2013) investigated the hierarchical structure of the Canadian urban system and found that Zipf's law is not valid.
In the case of Brazil, few studies have investigated the behavior of city size distribution. Oliveira (2004) estimated Pareto coefficients for Brazil for a period beginning in 1936 to study the evolution of the size distribution of Brazilian cities and to test the validity of Zipf's law. The results do not support the conclusion that the rank-size rule applies to Brazil. Trindade and Sartoris (2009) examined the evolution of the size distribution of cities in Brazil between 1920 and 2000, and their results show evidence of divergence. Similar to Oliveira (2004), Trindade and Sartoris (2009) did not use a spatial Markov matrix. Monastério (2009) analyzed the changes in the spatial distribution of population and manufacturing employment in Brazil between 1872 and 1920 and considered exploratory spatial data analysis and Markov chains, as suggested by Rey (2001). He found that localities where the density of proximal neighbors was low tended to approach the low-density profile of their neighbors. Justo (2012) found evidence of low inter-class mobility and high persistence in the population distribution behavior of 431 municipalities between 1910 and 2010. However, the author did not use the spatial Markov matrix approach proposed by Rey (2001). Finally, Moro and Santos (2013) found low mobility between 1970 and 2010, but their sample only included municipalities that existed in 1970, which did not cover all of Brazil's territory.
© Southern Regional Science Association 2019.
This set of studies on the dynamism of Brazilian cities presents at least two notable limitations. First, they did not consider the entire set of Brazilian municipalities. Thus, the results only provide a partial picture of the country's dynamic city size distribution. Second, spatial dependence is scarcely considered when the dynamics of city size are studied and, as recently shown by Silva et al. (2017), capturing spatial spillovers is essential for understanding the population growth of Brazilian municipalities. The present study addresses both of these limitations.
To analyze the evolution of the Brazilian city size distribution, we begin by revisiting the traditional Zipf's law approach and subsequently model the evolution of the distribution as a Markovian stochastic process. Next, we consider spatial dependence through the spatial Markov chains developed by Rey (2001). Our three main results indicate that (i) Brazil's population distribution is, as the decades progress, moving away from the Pareto law; (ii) there is low inter-class mobility and a high likelihood for cities to remain in their own size class from one decade to another; and (iii) different spatial contexts have different effects on regional transitions.
In addition to this introduction, Section 2 presents the data and initial evidence. Section 3 explores the evidence using the traditional Zipf's law approach. Section 4 presents evidence about the city size distribution by using classic Markov chains. Section 5 presents the evidence of the importance of spatial interactions using spatial Markov chains. Section 6 presents the conclusions and policy implications.

DATA AND INITIAL EVIDENCE
The main source of data used in this analysis is the Brazilian Demographic Census, conducted by Brazil's IBGE, for the years 1970, 1980, 1991, 2000, and 2010. Although a municipality constitutes the smallest unit of observation in political and administrative terms, inter-temporal comparisons at a strictly municipal geographic level become inconsistent because of the changes in the number, area, and borders of municipalities that occurred over those decades. From 1970 to 2000, the number of municipalities increased from 3,952 to 5,565. Therefore, to allow for consistent comparisons over time, it is necessary to aggregate these municipalities into broader geographical areas, called minimum comparable areas (MCAs). Following the approach of Silva et al. (2017), for each census, the municipalities are grouped into MCAs to ensure their boundaries do not change during the study period. 1 This study includes 3,659 MCAs, which represent an aggregation of all Brazilian municipalities for each census from 1970 to 2010, covering the entire territory and avoiding selection bias problems.
Initially, to investigate the evolution of Brazil's population distribution shape from 1970 to 2010, a non-parametric normal kernel density with a bandwidth value of 0.0245 was estimated for the urban population distribution for each decade. 2 Relative population size is considered and Figure 1 presents the distributions of population size in 1970, 1991, and 2010. The number 1 on the horizontal axis indicates Brazil's average MCA size, 1.5 indicates 50 percent higher than the average, and so on.
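The density estimate described above can be sketched as follows. This is a minimal illustration using only the Python standard library, assuming a Gaussian kernel with the fixed bandwidth of 0.0245 reported in the text; the function names, the evaluation grid, and the population figures are hypothetical and are not taken from the census data.

```python
import math

def relative_size(populations):
    """Scale populations so that 1.0 corresponds to the cross-section mean."""
    mean = sum(populations) / len(populations)
    return [p / mean for p in populations]

def gaussian_kde(values, grid, bandwidth=0.0245):
    """Evaluate a fixed-bandwidth Gaussian kernel density estimate on `grid`."""
    n = len(values)
    norm = 1.0 / (n * bandwidth * math.sqrt(2.0 * math.pi))
    density = []
    for x in grid:
        total = sum(math.exp(-0.5 * ((x - v) / bandwidth) ** 2) for v in values)
        density.append(norm * total)
    return density

# Hypothetical MCA populations (illustrative only, not census data).
populations = [12_000, 18_500, 25_000, 40_000, 95_000, 310_000]
rel = relative_size(populations)

# Evaluate the density on relative sizes from 0 to 5 in steps of 0.01.
grid = [i * 0.01 for i in range(501)]
density = gaussian_kde(rel, grid)
```

On the relative scale, a value of 1 is the average MCA size, so the mass of the estimated density to the left of 1 corresponds to localities below the national average.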
The decrease in the concentration of MCAs around the mean over the sampled decades is remarkable (Figure 1). This deconcentration slows in each decade; that is, the sizes of the localities are not converging to a common level but diverging at a diminishing rate. The summary statistics in Table 1 clarify Figure 1. The distribution in 2010 is the most dispersed around the mean, and this is the trend between 1970 and 2010. Specifically, measured by the increase in the standard deviation, the distribution became approximately 20 percent more dispersed between 1970 and 2010. However, this divergence slows: the standard deviation grew by 9 percent between 1970 and 1980 but by only 2.1 percent between 2000 and 2010. With the reduction in the median over the sample decades, we observe that the greater dispersion arises mainly from an increased presence of cities above the mean.

ZIPF'S LAW
In this section, we extend the aforementioned analysis by considering the evolution of the size distribution of cities through Zipf's law. Denoting the rank of a city in the population size distribution by $R$ and the city's population by $S$, the rank-size rule states that the size distribution of cities is given by:
$$R = \alpha S^{-\beta} \tag{1}$$
where $\alpha$ and $\beta$ are the parameters. 3 $\beta$ is the Pareto exponent, and the size distribution of cities depends on its value. By capturing the relationship between the relative variation of city sizes and ranks, this coefficient provides information about the conditions of the city size distribution convergence or mobility. If $\beta > 1$ and tends to infinity, all cities tend to have the same size. By contrast, the smaller the value of $\beta < 1$, the weaker the reaction of the rank to variation in the city sizes, the harder the conditions of mobility, and the more the city size distribution spreads out. Finally, if $\beta = 1$, we have Zipf's law, and the cities' populations are proportional to their inverse rank position (Le Gallo and Chasco, 2008). Notably, however, because rank-size analysis does not provide information about the growth rate of the city sizes, the value of the coefficient by itself does not ensure that the city sizes converge.
Notes: Authors' estimates using data from Brazil's Demographic Census years 1970, 1980, 1991, 2000, and 2010.
Empirically, to avoid OLS bias when estimating the parameters of Equation (1), we follow the suggestion of Gabaix and Ibragimov (2011) and consider the modified log-log rank-size specification in Equation (2):
$$\ln\left(R - \tfrac{1}{2}\right) = \ln\alpha - \beta \ln S + \varepsilon \tag{2}$$
where $\varepsilon$ is an error term. Table 2 presents the estimates of the parameters of the rank-size Equation (2) for all of Brazil's MCAs in each decade using an OLS estimator. For 1970, the estimated Pareto coefficient is close to the value implied by Zipf's law, with an estimate of 0.95. Consistent with the time dynamics of the distributions presented in Figure 1, in the following decades, this coefficient deviates increasingly from unity, reaching 0.77 in 2010. 4,5
Notes to Table 2: ** indicates significance at the 1 percent level.
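The rank-minus-one-half regression can be sketched in a few lines. The example below is an illustration under stated assumptions rather than the authors' code: it uses only the Python standard library, the function name `rank_size_ols` is hypothetical, and the data are synthetic. The standard error uses the asymptotic approximation $\hat{\beta}\sqrt{2/n}$ suggested by Gabaix and Ibragimov (2011) in place of the conventional OLS standard error.

```python
import math

def rank_size_ols(sizes):
    """Fit ln(rank - 1/2) = a - beta * ln(size) by OLS.

    Returns (a, beta, se_beta), where se_beta = beta * sqrt(2 / n)
    is the Gabaix-Ibragimov asymptotic standard error.
    """
    ordered = sorted(sizes, reverse=True)              # rank 1 = largest unit
    x = [math.log(s) for s in ordered]
    y = [math.log(r - 0.5) for r in range(1, len(ordered) + 1)]
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    beta = -slope                                      # Pareto exponent
    se_beta = beta * math.sqrt(2.0 / n)
    return intercept, beta, se_beta

# Synthetic sizes constructed so that the shifted rank-size rule holds
# exactly with beta = 0.8 (illustrative only, not census data).
true_beta = 0.8
sizes = [1e6 * (r - 0.5) ** (-1.0 / true_beta) for r in range(1, 201)]
a_hat, beta_hat, se_hat = rank_size_ols(sizes)
```

Applied to each census cross-section separately, a routine of this kind would produce one exponent per decade, which is the quantity compared across years in Table 2.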
This result is qualitatively consistent with Trindade and Sartoris (2009), Justo (2012), and Moro and Santos (2013). However, the first two studies, unlike our analysis, used a high level of aggregation: 920 MCAs between 1920 and 2000, and 431 observational units between 1910 and 2010. This high level of aggregation leads to results that indicate a higher population concentration than would be obtained with less aggregated data. 6 Although Moro and Santos (2013) use the municipality as the observational unit (more disaggregated than MCAs), they consider only the urban population, which makes their results not quantitatively comparable with ours.

CITY SIZE DISTRIBUTION AND THE MARKOV TRANSITION DYNAMIC
Although useful for obtaining information about general mobility conditions, the rank-size relation is limited when examining the movements inside the size distribution of cities. Thus, in this section, we present evidence on the city size distribution dynamics according to the rank position movements as a Markovian stochastic process. By modeling the transition process of the MCAs directly, we can examine the evolution and trends in the MCAs' size distribution.
We denote $D_t$ as the cross-section distribution of the relative population of the MCAs at time $t$. Next, we discretize the population distribution into $K$ groups. To proceed with the estimation, we first need to assume that the frequency distribution follows a stationary first-order Markov process (Le Gallo and Chasco, 2008). This assumption requires transition probabilities, $m_{ij}$, of order 1, that is, probabilities that are independent of the classes occupied in earlier periods ($t-2$, $t-3$, ...). Following this assumption, the transition probability matrix, $M$, represents the evolution of a size distribution, where each element $(i, j)$ indicates the probability that a city in class $i$ at time $t$ will be in class $j$ in the following period (Le Gallo and Chasco, 2008). Treating $D_t$ as a $1 \times K$ row vector, the frequency of cities in each size class at time $t$ is, thus, given by the following:
$$D_t = D_{t-1} M \tag{3}$$
where the transition probability matrix, $M$, is:
$$M = \begin{pmatrix} m_{11} & m_{12} & \cdots & m_{1K} \\ m_{21} & m_{22} & \cdots & m_{2K} \\ \vdots & \vdots & \ddots & \vdots \\ m_{K1} & m_{K2} & \cdots & m_{KK} \end{pmatrix} \tag{4}$$
where $m_{ij} \geq 0$ represents the probability that the cities of a particular size class $i$ at time $t-1$ will be in class $j$ at time $t$, and $\sum_{j=1}^{K} m_{ij} = 1$. These elements of $M$ can be obtained with the maximum likelihood estimator (Amemiya, 1985; Hamilton, 1994), specifically,
$$\hat{m}_{ij} = \frac{n_{ij}}{n_i} \tag{5}$$
where $n_{ij}$ is the number of MCAs moving from class $i$ to class $j$ between $t-1$ and $t$, summed over all $T-1$ transitions, and $n_i$ is the total number of MCAs in class $i$ at the start of those transitions. As highlighted, for example, by Le Gallo and Chasco (2008), if the stationarity of transition is satisfied, then
$$D_{t+s} = D_t M^s \tag{6}$$
and we can define the steady-state distribution of $D_t$ when $s$ tends to infinity. In this case, the matrix of transition probabilities converges to $M^*$ of rank 1, and the steady-state distribution, $D^*$, is given by the following:
$$D^* = D^* M \tag{7}$$
When $t \to \infty$, $D^*$ represents the future distribution of the localities' size. Table 3 shows the traditional Markov transition probability matrix for four classes according to the quartiles for each decade between 1970 and 2010.
An MCA in the first quartile class is among the smallest 25 percent in terms of relative population in that year. Likewise, an MCA in the fourth quartile is among the largest 25 percent in terms of relative population.
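The estimation steps described above (quartile classes, the maximum likelihood transition matrix, and the steady-state distribution) can be sketched as follows. This is a minimal standard-library illustration, not the authors' implementation: the function names are hypothetical, classes are indexed from 0, ties in population are broken by sort order, and the steady state is obtained by simple iteration, which assumes an ergodic chain.

```python
def quartile_classes(values):
    """Assign each unit a quartile class 0..3 by its rank in the cross-section."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    classes = [0] * len(values)
    for position, i in enumerate(order):
        classes[i] = 4 * position // len(values)
    return classes

def transition_matrix(class_panels, k=4):
    """Maximum likelihood estimate m_ij = n_ij / n_i, pooling all transitions.

    `class_panels` is a list of cross-sections; each cross-section lists the
    class of every unit in the same unit order.
    """
    counts = [[0] * k for _ in range(k)]
    for start, end in zip(class_panels[:-1], class_panels[1:]):
        for i, j in zip(start, end):
            counts[i][j] += 1
    matrix = []
    for row in counts:
        n_i = sum(row)
        matrix.append([c / n_i if n_i else 0.0 for c in row])
    return matrix

def steady_state(matrix, iterations=10_000):
    """Iterate D <- D M from a uniform start (assumes an ergodic chain)."""
    k = len(matrix)
    d = [1.0 / k] * k
    for _ in range(iterations):
        d = [sum(d[i] * matrix[i][j] for i in range(k)) for j in range(k)]
    return d

# Hypothetical two-class matrix, used only to illustrate the iteration.
example = [[0.9, 0.1], [0.2, 0.8]]
d_star = steady_state(example)   # approaches [2/3, 1/3]
```

Because the classes here are quartiles recomputed in every census year, each class holds 25 percent of the units in every cross-section by construction.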
From Table 3, several points can be observed. First, if the MCA is in the $i$th class, the probability of being in the same class in the subsequent decade is at least 82.49 percent and at most 93.14 percent. These high probabilities on the main diagonal show low rates of inter-class mobility and a high persistence of MCAs in their own class from one decade to another over the entire period. This is a remarkable result when we consider Brazil's rapid urbanization. The evidence indicates that this process occurred while preserving basically the same pattern of distribution as in the 1970s, which is consistent with the persistence of Brazil's regional inequality during the period (Lima and Silveira Neto, 2015).
Second, the largest and smallest MCAs have a lower probability of moving to other categories, that is, these localities have less inter-class mobility than the medium-sized cities. This evidence suggests that the initial agglomeration gains of big Brazilian cities are considerable and in line with the evidence of an urban wage premium in Brazilian cities (Barufi et al., 2016). By contrast, the lower mobility observed for the group of smallest cities is consistent with the loss of scale economies involved in the municipalities' creation during the 1980s and 1990s and with poor infrastructure conditions in these cities (Lima and Silveira Neto, 2018). The higher inter-class mobility of the medium-sized cities indicates the presence of differentiated performance among them. More specifically, the highest transition probabilities among the classes are 9.71 percent and 9.51 percent, which occur, respectively, for the movements from the third to second and second to third quartiles.
Together with the high persistence of the largest and smallest MCAs in their initial class, this final piece of evidence indicates the major role of medium-sized localities in the processes of urban agglomeration that occurred in Brazil over the last 40 years. This evidence is in agreement with Andrade and Serra (2001), who asserted that medium-sized cities play a decisive role in the automatic decentralization of economic activities in Brazil. For example, Silveira Neto and Azzoni (2011) show that Brazil's manufacturing sector, initially located close to big cities, presents a clear tendency toward less spatial concentration between 1990 and 2010. Given the country's regional diversity, this differentiation observed in medium-sized cities can be associated with the cities' locations, a possibility we explore in the next section.
To determine the speed with which the urban municipalities move within the distribution, we estimate the mean first passage time for relative population. 7 On average, the number of years needed to reach any class other than the original class is relatively high: the shortest and longest passage times are 13 years (from class 1 to class 2) and 75.36 years (from class 1 to class 4), respectively. As expected, more distant classes take longer to reach. For example, an MCA originally in class 1 takes on average 33.6 years to reach class 3. The faster declines in classes 3 and 4 (20.28 and 14.58 years, respectively) may indicate that localities in these classes are more likely to lose relative population. 8 This evidence suggests a general progressive suburbanization process in which big cities slow or stop growing, favoring the progressive appearance of smaller population cores (Le Gallo and Chasco, 2008).
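A mean first passage time can be computed from a transition matrix by solving a small linear system. The sketch below is an illustration under stated assumptions, not the authors' procedure: it uses only the Python standard library, the function names are hypothetical, and it converts steps to years with a uniform 10 years per transition, although the census intervals are not all exactly 10 years.

```python
def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    aug = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]

def mean_first_passage(matrix, years_per_step=10.0):
    """Expected time to first reach class j from class i, for i != j.

    For a target class j, the expected number of steps t_i satisfies
    t_i = 1 + sum over c != j of m_ic * t_c, i.e. (I - M_without_j) t = 1.
    Diagonal entries (mean recurrence times) are left at 0.0 here.
    """
    k = len(matrix)
    result = [[0.0] * k for _ in range(k)]
    for j in range(k):
        others = [s for s in range(k) if s != j]
        a = [[(1.0 if r == c else 0.0) - matrix[r][c] for c in others]
             for r in others]
        steps = solve(a, [1.0] * len(others))
        for r, t in zip(others, steps):
            result[r][j] = t * years_per_step
    return result

# Hypothetical two-class chain: leaving class 0 takes 1 / 0.1 = 10 steps on
# average, and leaving class 1 takes 1 / 0.2 = 5 steps.
example = [[0.9, 0.1], [0.2, 0.8]]
passage_steps = mean_first_passage(example, years_per_step=1.0)
```

With a four-class matrix, the same routine returns a 4 x 4 table whose off-diagonal entries are the expected years to move from one quartile to another.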
To compare the changes over time, Table 4 presents the Markov transition probability matrix decade by decade. 9 The probabilities of MCAs remaining in the same class increased over time, which can be verified by observing the values on the main diagonals. This increase indicates low mobility among the classes. This evidence of stability in the behavior of the population distribution over time is consistent with the previous non-parametric normal kernel density estimates (Figure 1). The chi-square (Q) and likelihood ratio (LR) tests for homogeneity over time, suggested by Bickenbach and Bode (2003), yield Q and LR statistics of 261.0 and 241.81, respectively, both significant at the 1 percent level, indicating that the dynamics are not homogeneous over time. Notably, however, these test statistics tend to be inflated in the presence of positive spatial autocorrelation among spatial units (Fingleton, 1983a,b, 1986), which is exactly the context we found and discuss in the following section. Therefore, we cannot definitively reject the assumption of homogeneity. The ergodic distribution, or limit distribution, can be interpreted as the long-run equilibrium in the distribution of the relative population of the MCAs. In this study, as the relative population discretization was created from the quartiles, the ergodic distribution will naturally be similar to the initial distribution of classes (25 percent of MCAs in each class) and does not present notable results. However, we can use the tests suggested by Rey (2001) to compare the steady-state matrix of the classic case with the steady-state matrix conditioned by spatial lag, as we shall see in Section 5.

SPATIAL MARKOV DYNAMICS
An important limitation of traditional Markov chains, when studying the dynamics of cities, is that they do not consider the spatial dependence that may exist among spatial observational units. Spatial dependence can arise from measurement errors, such as boundary mismatches between the administrative data and actual market processes, but may also reflect shared amenities and knowledge spillovers among spatial units, trade and migration flows, and competition or complementarities among local governments. In the case of Brazil's cities and regions, evidence indicates that spatial dependence substantively affects the growth of cities and regional income (Lima and Silveira Neto, 2015; Silva et al., 2017). Furthermore, given the country's high regional diversity, it is possible that, by considering the specific location of municipalities, we can expand our understanding of the factors behind the higher inter-class mobility verified in medium-sized municipalities. Thus, we introduce spatial dependence into the previous analysis of the population distribution dynamics of Brazil's cities through the spatial Markov chains proposed by Rey (2001). Rey (2001) suggested a modification of the traditional Markov matrix: conditioning the transition probability, $m_{ij}$, on the initial class of the spatial lag of the variable in question. Here, this conditioning concerns the population size class of the spatial lag in the initial period. 10 This combination of the traditional Markov matrix and spatial autocorrelation is called the spatial Markov matrix. It can be constructed by decomposing the traditional matrix into $K$ conditional matrices of dimension $K \times K$, that is, a $K \times K \times K$ array. An explicit test of the influence of a locality's neighbors can be based on the comparison between the state transitions conditioned on the initial state of its spatial lag (Rey, 2001).
For the $k$th conditional matrix, an element $m_{ij|k}$ is the probability that a region in class $i$ at time $t$ moves to class $j$ in the next period, given that its spatial lag was in class $k$ at time $t$. The spatial transition matrix can be used to test the negative or positive influence of geographic neighbors on a region. 11 In our case, the MCAs are divided into four size classes (i.e., small, medium-small, medium-large, and large). For example, if we want to know the effect of medium-large neighbors on a locality's upward or downward transitions, we analyze the elements of the third conditional matrix, where the spatial lag is medium-large. For instance, the $m_{34|3}$ element stands for the probability that a region in the medium-large class moves upward, given that its neighbors are in the medium-large class.
Furthermore, it is possible to determine the influence of spatial dependence on the transition probability by comparing the elements of the traditional Markov transition matrix with the elements of the spatial Markov matrix. For example, if $m_{34} > m_{34|3}$, the probability of an upward movement of a city in the medium-large class, regardless of its neighbors, is higher than the probability of an upward movement of a city in the medium-large class with neighbors in the same class. By contrast, if the neighborhood has no effect on the probability of transition, then the conditional probability is equal to the probability of the traditional Markov matrix:
$$m_{ij|1} = m_{ij|2} = \cdots = m_{ij|K} = m_{ij}, \quad \forall\, i, j \tag{8}$$
The main benefit in analyzing the dynamics of the spatial conditioning is capturing the influence of neighborhood dimensions on the mobility of areas within the population hierarchy. In addition to providing a detailed view of the geographic dimension of the population distribution, it is possible to answer questions such as: is an MCA's probability of moving up or down in the distribution related to its neighbors? Or, is spatial dependence similar or different for upward and downward movements?
10 The population spatial lag is given by $\text{Pop. spatial lag}_i = \sum_{j=1}^{N} w_{ij}\,\text{Pop}_j$, where $w_{ij}$ is the $ij$th element of the nonnegative $N \times N$ matrix $W$ describing the arrangement of the spatial units. In this case, the standard contiguity neighbors matrix was used. In this spatial weight matrix, $w_{ij} = 1$ if $i$ and $j$ are contiguous neighbors and zero otherwise; the diagonal elements are also set to zero, because no spatial unit can be viewed as its own neighbor.
11 See Table 3 in Rey (2001).
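The spatial conditioning described above can be sketched as follows. This is a minimal standard-library illustration, not the authors' implementation: the function names and the toy inputs are hypothetical, classes are indexed from 0, and the spatial lag is the unstandardized sum over contiguous neighbors, as in the binary contiguity matrix of the footnote.

```python
def spatial_lag(values, neighbors):
    """Spatial lag with binary contiguity weights: lag_i = sum_j w_ij * x_j.

    `neighbors[i]` lists the indices of the units contiguous to unit i;
    a unit is never listed as its own neighbor.
    """
    return [sum(values[j] for j in neighbors[i]) for i in range(len(values))]

def spatial_markov(classes_t0, classes_t1, lag_classes_t0, k=4):
    """Build k conditional k x k transition matrices for one transition.

    Conditional matrix l uses only the units whose spatial lag was in
    class l at the start of the period. Returns (counts, probabilities).
    """
    counts = [[[0] * k for _ in range(k)] for _ in range(k)]
    for i, j, l in zip(classes_t0, classes_t1, lag_classes_t0):
        counts[l][i][j] += 1
    probabilities = []
    for block in counts:
        probabilities.append(
            [[c / sum(row) if sum(row) else 0.0 for c in row] for row in block]
        )
    return counts, probabilities

# Hypothetical line of three units: unit 1 touches units 0 and 2.
toy_population = [1.0, 2.0, 3.0]
toy_neighbors = [[1], [0, 2], [1]]
toy_lag = spatial_lag(toy_population, toy_neighbors)
```

To pool several decades, the counts from each transition can be added cell by cell before dividing by the row totals; the lag classes would be obtained by discretizing the lag values in the same way as the populations.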
To introduce spatial conditioning in the previous analysis, we use the standard contiguity neighbors matrix ($W$) to estimate the spatial Markov transition matrices. 12 In addition, we test the null hypothesis that there is no difference between each spatial transition matrix and the traditional transition matrix using an LR test proposed by Le Gallo (2004), where the statistic is given as follows:
$$LR = 2 \sum_{l=1}^{K} \sum_{i=1}^{K} \sum_{j=1}^{K} n_{ij}(l) \ln \frac{\hat{m}_{ij}(l)}{\hat{m}_{ij}} \tag{9}$$
which is asymptotically distributed as a Chi-squared statistic with $K(K-1)^2$ degrees of freedom, and where $K$ is the number of classes of the distribution ($K = 4$), $\hat{m}_{ij}$ is the maximum likelihood estimate (Equation (5)), $\hat{m}_{ij}(l)$ is the estimated probability that an MCA in class $i$ at $t-1$ is in class $j$ at $t$ given that its spatial lag is in class $l$ at $t-1$, and $n_{ij}(l)$ is the corresponding number of MCAs. The set of matrices, together with the test statistics, is reported in Table 5.
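A likelihood ratio statistic of this kind can be computed directly from the conditional transition counts. The sketch below is illustrative and rests on assumptions: it uses only the Python standard library, the function name `lr_spatial_test` is hypothetical, the pooled matrix is obtained by summing the conditional counts over lag classes, and cells with zero counts are skipped because they contribute nothing to the sum.

```python
import math

def lr_spatial_test(counts):
    """LR = 2 * sum_l sum_i sum_j n_ij(l) * ln(m_ij(l) / m_ij).

    `counts[l][i][j]` is the number of units moving from class i to class j
    whose spatial lag was in class l. Returns (statistic, degrees_of_freedom).
    """
    k = len(counts)
    pooled = [[sum(counts[l][i][j] for l in range(k)) for j in range(k)]
              for i in range(k)]
    stat = 0.0
    for l in range(k):
        for i in range(k):
            n_i_l = sum(counts[l][i])      # units in class i with lag class l
            n_i = sum(pooled[i])           # all units in class i
            for j in range(k):
                n = counts[l][i][j]
                if n > 0:                  # empty cells contribute nothing
                    conditional = n / n_i_l
                    unconditional = pooled[i][j] / n_i
                    stat += 2.0 * n * math.log(conditional / unconditional)
    return stat, k * (k - 1) ** 2

# Hypothetical counts for K = 2: identical conditional matrices give LR = 0.
same = [[[8, 2], [3, 7]], [[8, 2], [3, 7]]]
lr_same, dof = lr_spatial_test(same)

# Hypothetical counts where the neighbors' class shifts the transitions.
different = [[[9, 1], [5, 5]], [[5, 5], [1, 9]]]
lr_diff, _ = lr_spatial_test(different)
```

The statistic would then be compared with the critical value of a Chi-squared distribution with the returned degrees of freedom; that lookup is omitted here to keep the sketch free of external libraries.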
The numbers in the last two columns of Table 5 indicate that we should reject the null hypothesis that the spatially conditioned sub-matrices and the traditional transition matrix are not different at the 1 percent significance level. This suggests that to capture the transition dynamics of the size distribution, it is necessary to consider the spatial dependence among MCAs, that is, the initial location of the MCAs matters.
According to the transition matrices in Table 5, the neighbors of an MCA affect its transition probabilities over time. We also note that different spatial contexts have different effects on the transitions of different regions. Specifically, the probability of an upward transition to the next quartile increases when an MCA's neighbors are in higher classes. For example, for an MCA in the first quartile with neighbors in the same class, the probability of moving upward to the second quartile is 4.48 percent; however, if it is adjacent to localities in the fourth quartile, this probability increases to 13.26 percent. In the same manner, for an MCA in the third quartile with neighbors in the second, the probability of moving upward to the fourth quartile is 4.09 percent; however, if it is adjacent to localities in the fourth quartile, this probability increases to 11.66 percent. Notably, the probability of a downward transition to the next lower quartile does not necessarily increase with neighbors in smaller classes. These initial observations suggest that, from 1970 to 2010, the presence of bigger MCAs in the vicinity favored the growth of Brazil's MCAs. These observations are also consistent with the presence of positive spatial spillovers in population growth from the workforce employed in neighboring MCAs in the same period, which are documented by the spatial panel econometric analysis of Silva et al. (2017) and explained by easier and more robust economic interactions. Additionally, these observations are in line with the idea of a spatially concentrated dispersion of economic activities in Brazil identified by Sobrinho and Azzoni (2014), where the industrial dispersion from the big cities tends to favor neighboring cities and regions.
Introducing spatial conditioning also allows for the understanding of some of the characteristics of recent urban agglomeration processes in Brazil. Notably, the intermediate classes (2 and 3) present a higher probability of inter-class mobility, and the highest probability of moving upward occurs when the neighbors are in the most populous class (4). This evidence is consistent with an overflow of the population from large- to medium-sized cities and again highlights the major role of medium-sized localities in the processes of urban agglomeration that occurred in Brazil over the last 40 years.
It is possible to understand the influence of spatial dependence on the transition probability by comparing the elements of the traditional transition matrix with the elements of the spatial Markov matrix. For example, ignoring the spatial context (Table 5), the probability that an MCA in the third quartile moves down to the second quartile is 9.71 percent; this probability increases to 10.88 percent if the neighbors are in the first quartile (the least populated class). We can also observe, by comparing the traditional and spatial matrices, that having highly populated neighbors decreases the probability that less populous MCAs persist in the same class. Specifically, ignoring the spatial context, the probabilities of MCAs remaining in the first and second quartiles are 92.08 percent and 82.49 percent, respectively. These probabilities decrease to 86.28 percent and 74.74 percent, respectively, when highly populated neighbors surround these locations. Additionally, to observe how Brazil's urban hierarchy is spatially dependent, we can explore the steady-state distribution implied by each estimated conditional transition probability matrix from Table 5. The calculated steady-state distributions are presented in Table 6, together with the multinomial test statistics suggested by Rey (2001) for the difference between the steady-state distribution from a conditional distribution and the overall steady-state distribution. The values of these statistics are presented in the last two columns of Table 6 and indicate that, in all cases, one should reject the null hypothesis that the spatially conditioned steady-state distribution equals the unconditioned steady-state distribution.
According to the numbers in Table 6, Brazil's urban hierarchy appears strongly dependent on the neighbors that surround each location. For example, the long-run distribution for MCAs neighboring relatively less populated (class 1) locations has 54.72 percent of localities in the first quartile and 11.19 percent in the fourth quartile. By contrast, the long-run distribution for MCAs neighboring relatively highly populated (class 4) locations has just 5.37 percent of the localities in the first quartile and 53.57 percent in the fourth quartile.
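The steady-state distribution implied by a transition matrix can be computed as in the sketch below. The two matrices are hypothetical and chosen only for illustration; they are not the estimates reported in Table 5, and the helper name `steady_state` is likewise an assumption.

```python
import numpy as np


def steady_state(P):
    """Stationary distribution pi satisfying pi @ P = pi and sum(pi) = 1."""
    P = np.asarray(P, dtype=float)
    k = P.shape[0]
    # Stack the balance equations (P^T - I) pi = 0 with the constraint sum(pi) = 1.
    A = np.vstack([P.T - np.eye(k), np.ones(k)])
    b = np.zeros(k + 1)
    b[-1] = 1.0
    pi, *_ = np.linalg.lstsq(A, b, rcond=None)
    return pi


# Hypothetical conditional matrices (NOT the paper's estimates).
# Neighbors in the least populated class: downward moves are more likely.
P_low = np.array([
    [0.95, 0.05, 0.00, 0.00],
    [0.10, 0.85, 0.05, 0.00],
    [0.00, 0.10, 0.85, 0.05],
    [0.00, 0.00, 0.10, 0.90],
])
# Neighbors in the most populated class: upward moves are more likely.
P_high = np.array([
    [0.85, 0.15, 0.00, 0.00],
    [0.05, 0.70, 0.25, 0.00],
    [0.00, 0.05, 0.80, 0.15],
    [0.00, 0.00, 0.03, 0.97],
])

pi_low = steady_state(P_low)
pi_high = steady_state(P_high)
print("neighbors in class 1:", np.round(pi_low, 4))
print("neighbors in class 4:", np.round(pi_high, 4))
```

With these illustrative inputs, the long-run mass concentrates in the lowest class under `P_low` and in the highest class under `P_high`, which mirrors qualitatively the contrast between the conditional long-run distributions described in the text.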
As argued by Le Gallo and Chasco (2008), concentration of the frequencies in some classes, that is, a multimodal limit distribution, may be interpreted as a tendency toward stratification into different convergence clubs. As can be observed in the main diagonal, the frequency concentrates in the class that corresponds to the spatial lag, which may indicate different convergence clubs according to the spatial lag.
The traditional central place theory (Lösch, 1954; Christaller, 1966; Mulligan, 1984) highlights the different levels of a location's centrality, with the higher level of centrality often providing all the functions found in the lower level locations. Because different levels of centrality can arise from the spatially conditioned steady-state distributions, the aforementioned set of evidence presents some coherence with a hierarchical system. Notably, however, the results suggest that neighboring locations with similar levels of centrality present more complementarities than competition. This evidence is consistent with much better infrastructure existing in Brazil's biggest cities and their neighboring locations and with the process of spatially concentrated dispersion of the aforementioned economic activities.
Our results also appear consonant with approaches to urban systems based on the New Economic Geography (NEG) perspective. These approaches highlight two basic opposing forces that condition the economic landscape and the number of cities: population size and transport costs, which, respectively, favor and disfavor the number of cities (Fujita and Krugman, 1995; Fujita et al., 1999; Tabuchi et al., 2011). We note that both factors were clearly present in Brazil during the period from 1970 to 2010, and our set of evidence appears in line with their potential influence: the dispersion force arising from a bigger population could have acted to increase the dispersion of the Brazilian city size distribution, and the transport cost reduction could also have limited this process. The spatial dependence we have identified suggests that the locations in the neighborhoods of bigger cities benefited most from population growth.

CONCLUSIONS
The objective of this paper was to explore the behavior of the population size distribution of MCAs covering Brazil's entire territory between 1970 and 2010, a period of rapid urban development that has been fully studied. This research not only revisits the traditional rank-size rule and Markov chain approaches but, in line with evidence of strong spatial dependence among Brazil's cities (Silva et al., 2017), it also introduces the spatial Markov chain analysis proposed by Rey (2001).
The evidence obtained from non-parametric normal kernel density function estimates and Zipf's law estimation indicates that the population size distribution of MCAs is becoming more spread out and, as the years progress, moving away from the Pareto law. That is, the result shows that in the case of Brazil, over time, the ranking of cities is decreasingly influenced by their size.
The traditional Markov chain approach shows high probabilities in the main diagonal, indicating low inter-class mobility and a high persistence of MCAs in their own size class from one decade to another over the entire period. This result suggests significant rigidity in Brazil's urban hierarchy. The spatial Markov transition probability matrix and spatial ergodic distributions analysis substantively qualify this result. Specifically, we found that the probability of upward transition increases for MCAs with neighbors in high classes, and the MCAs grouped into medium classes have a higher probability of downward transition if their neighbors are in a less populated class. Similarly, the spatially conditioned long-run city size distributions indicate a strong (positive) influence of neighboring city size on a city's own size.
These sets of evidence are entirely consistent with studies about the spatial dependence of the population growth of Brazil's cities and the process of concentrated dispersion of economic activities in the Brazilian landscape during the period, and they appear to reflect the superior infrastructure existing in Brazil's biggest cities and their neighboring locations. Additionally, although Brazil's urban hierarchy does not appear to perfectly reflect the traditional central place theory hierarchy, it appears consistent with the roles of population growth and transport cost reduction in configuring urban systems present in more recent NEG models.
Brazil is a developing country characterized by its rapid urbanization, inequality, and cities with a strong presence of informal housing. The evidence obtained in this research indicates that city size distribution dynamics, in this context, are strongly conditioned by the spatial heterogeneities of this country's landscape, which makes location critical for the evolution of cities. Additional research is necessary to verify if similar patterns are generalizable to other developing countries.

Notes: Authors' estimates using data from Brazil's Demographic Census years 1970, 1980, 1991, 2000, and 2010. ** Indicates significance at the 1 percent level.