Socioeconomic factors associated with hospital deaths due to COVID-19 in Brazil

This paper aims to identify the socioeconomic, demographic, and clinical factors associated with COVID-19 deaths in Brazil using information from municipal and individual databases. The data were extracted from the IBGE and the Ministry of Tourism for municipalities and from the Ministry of Health for individuals, covering the period from January 1, 2020, to May 31, 2022. Data analysis was based on the estimation of odds ratios through logistic regression. The results show that the probability of death among individuals hospitalized with COVID-19 is greater for those living in cities with low GDP per capita, high illiteracy rates, and a high percentage of extreme poverty. In addition, individuals over 60 years old, males, racial minorities, and illiterate individuals were more likely to die from COVID-19. This study provides evidence that the effects of COVID-19 can be alleviated by improving socioeconomic conditions.


Introduction
The first half of 2020 introduced one of the world's biggest economic, social, and public health challenges: the coronavirus (severe acute respiratory syndrome coronavirus 2 or SARS-CoV-2). The disease caused by this new virus began in December 2019 after an outbreak of pneumonia of unknown cause in the city of Wuhan, the capital of the Chinese province of Hubei (WHO, 2020b). The advancement and severity of the disease led the World Health Organization (WHO) to declare COVID-19 a global pandemic on March 11, 2020 (WHO, 2020c). From the first COVID-19-induced death in January 2020 in China to May 31, 2022, approximately 527.12 million cases of COVID-19 were confirmed around the world, including about 6.30 million recorded deaths (WHO, 2020a). From the beginning of 2020, when the first case of COVID-19 was confirmed, to May 31, 2022, Brazil registered approximately 31.02 million confirmed cases, including about 667 thousand deaths (MS, 2020b).
To reduce the number of cases and deaths caused by COVID-19, many countries have implemented several measures to inhibit the transmission of the new coronavirus, including social distancing (Cohen and Kupferschmidt, 2020). In Brazil, the state and municipal governments took several actions to eradicate the virus to the extent that by the end of March 2020, all Brazilian states had introduced some measures of social distancing (Moraes et al., 2020). These measures, as pointed out by Stojkoski et al. (2020), have contributed to reducing the expected shock from the coronavirus. However, the magnitude of this reduction within the country varies.
Another measure adopted by many countries was the mandatory use of masks. In some countries, such as the United States (Lyu and Wehby, 2020; Van Dyke et al., 2020; Kwon et al., 2021) and Germany (Mitze et al., 2020), the use of masks is associated with a reduction in the transmission of COVID-19. There are no studies on the use of masks and their impact on COVID-19 in Brazil since there is no data on population mask usage.
The WHO considers that older individuals (people over 60 years of age) and individuals who have comorbidities (such as cardiovascular disease, diabetes, and chronic respiratory diseases) are at higher risk of developing severe disease (WHO, 2020b). Research on COVID-19 has shown that, among individuals hospitalized with the disease, those with comorbidities, the elderly, and males presented the greatest complications (Garg et al., 2020; Hou et al., 2020; Huang et al., 2020; Li et al., 2020; Liu et al., 2020; Wu et al., 2020).
There are few studies on the economic and social factors associated with COVID-19-induced deaths. However, these two factors have the potential to impact the dynamics of infectious diseases (Suk and Semenza, 2011), as is the case with COVID-19, and, therefore, must be analyzed. Some studies have sought to identify the association between these factors and deaths caused by COVID-19 (Abedi et al., 2021; Alsan et al., 2020; Qiu et al., 2020; You et al., 2020). In Brazil, there is evidence that economic inequalities have a positive relationship with the incidence of COVID-19 and the resulting deaths (Demenech et al., 2020). The Gini coefficient, which measures the income concentration of a population, is positively correlated with the increase in cases and death rates from the new coronavirus in Brazilian states (Demenech et al., 2020). Therefore, it is necessary to consider social and economic aspects in the analysis of deaths caused by COVID-19.
Given this context, this paper aims to identify the factors associated with COVID-19-induced deaths in Brazil, focusing on socioeconomic, demographic, and clinical aspects from January 1, 2020, to May 31, 2022. This association was measured based on the estimation of the odds ratio through logistic regression, which makes it possible to model the probability that the dependent variable, which is binary in this case, assumes a certain value as a function of a set of explanatory variables.
This research makes two important contributions by doing this analysis. First, it presents the literature on economics and epidemiology in a single study, showing that these two areas are interconnected and can make important contributions to analyzing the situation of COVID-19 in Brazil. Second, this study considers in its analysis the association of clinical factors of individuals as well as the socioeconomic factors of municipalities associated with COVID-19. It is important to note that, concerning the analysis of the clinical factors associated with COVID-19, there is a study on the state of Espírito Santo (Maciel et al., 2020). There is also a study that relates economic inequality with the incidence and mortality rates due to COVID-19 in Brazilian states; however, it does not consider clinical factors in its analysis (Demenech et al., 2020). This study differs from these two, as it considers, in addition to clinical factors, socioeconomic factors in its analysis through a combination of databases at the municipality level and at the individual level.
In general, this paper, by including variables from the municipalities, introduces a component related to the environment in which the individual infected by COVID-19 lives, to identify how the characteristics of that environment contribute to the outcome of the disease. This facilitates the understanding and analysis of the progress of COVID-19. The rest of the paper proceeds as follows: Section 2 captures the method used in this study and describes the database. Section 3 presents the results. Section 4 discusses the results, and Section 5 concludes the study.

Database
This study was carried out using a combination of databases at the municipal and individual levels. All municipal variables use the most up-to-date data available and are separated into two blocks. The first block includes two economic variables: the ln of the municipal GDP per capita in the year 2017 (the variable was used in its logarithmic form to avoid possible biases in the estimates due to the presence of outliers and a high standard deviation), extracted from the Instituto Brasileiro de Geografia e Estatística (IBGE, 2020b), and an extreme poverty variable, measured as the percentage of the municipal population living in extreme poverty (≤1% and >1%), which was extracted from the 2010 Census data (IBGE, 2010).
The second block presents the municipalities' sociodemographic characteristics. The variables ln of the population and ln of the population density (inhabitants/km²) were used in their logarithmic form for the same reasons as the variable ln of GDP per capita. The population of the municipalities is based on the latest IBGE population estimate (IBGE, 2020a), and the area of the municipalities was retrieved from the latest IBGE territorial areas survey (IBGE, 2020c). The following variables were extracted from the 2010 Census (IBGE, 2010): piped water (≤90% and >90%), expressed as the percentage of households with running water; garbage collection (≤90% and >90%), expressed as the percentage of households with garbage collection; and illiteracy rate.
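As a minimal sketch of the transformations described above, the snippet below computes population density and takes natural logarithms. The municipality records and field names are hypothetical and serve only as an illustration; they are not values from the IBGE sources.

```python
import math

# Hypothetical municipality records (illustrative values, not IBGE data).
municipalities = [
    {"name": "A", "gdp_per_capita": 20000.0, "population": 50000, "area_km2": 250.0},
    {"name": "B", "gdp_per_capita": 180000.0, "population": 12000000, "area_km2": 1500.0},
]


def add_log_variables(record):
    """Return a copy of the record with density and log-transformed variables."""
    density = record["population"] / record["area_km2"]  # inhabitants per km2
    out = dict(record)
    out["ln_gdp_per_capita"] = math.log(record["gdp_per_capita"])
    out["ln_population"] = math.log(record["population"])
    out["ln_population_density"] = math.log(density)
    return out


transformed = [add_log_variables(m) for m in municipalities]
for row in transformed:
    print(row["name"], round(row["ln_gdp_per_capita"], 3), round(row["ln_population_density"], 3))
```

In this toy example, the GDP per capita of B is nine times that of A on the raw scale, while on the log scale the gap is ln(9), roughly 2.2, which illustrates how the transformation dampens the influence of extreme values.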
Furthermore, based on data from the IBGE, the following variables were collected: type of composition (metropolitan region, integrated development regions, urban agglomerations, and other compositions), which indicates the composition to which the municipality belongs; border municipality (yes and no); and coastal municipality (yes and no). Finally, the last sociodemographic variable considered is the tourist municipality (yes and no) variable, which was extracted from the Brazilian Ministry of Tourism (2020).
At the individual level, the database refers to individuals hospitalized with COVID-19 and is drawn from the severe acute respiratory syndrome (SARS) database, which includes COVID-19 data, made available by OPENDATASUS on the Brazilian Ministry of Health website (MS, 2020a). The data considered for the present study cover the period from January 1, 2020, to May 31, 2022, and include only cases of SARS classified as COVID-19.
The dependent variable is the outcome of COVID-19, a binary variable where 0 represents other outcomes different from death by COVID-19 (hospital discharge, death from other causes, ongoing treatment) and 1 represents death by COVID-19.
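The coding of the dependent variable can be sketched as follows. The outcome labels are hypothetical stand-ins for the categories named above; the actual codes used in the SARS database may differ.

```python
# Hypothetical outcome labels; the actual coding in the SARS database may differ.
DEATH_BY_COVID = "death by COVID-19"
OTHER_OUTCOMES = {"hospital discharge", "death from other causes", "ongoing treatment"}


def encode_outcome(label):
    """Map an outcome label to the binary dependent variable (1 = death by COVID-19)."""
    if label == DEATH_BY_COVID:
        return 1
    if label in OTHER_OUTCOMES:
        return 0
    raise ValueError(f"unexpected outcome label: {label!r}")


records = ["hospital discharge", "death by COVID-19", "ongoing treatment", "death from other causes"]
y = [encode_outcome(r) for r in records]
print(y)  # [0, 1, 0, 0]
```

Raising an error on an unknown label, rather than silently mapping it to 0, is a design choice of this sketch so that unexpected categories are not counted as survivals.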
The characteristics of the individuals are divided into two sets: the sociodemographic characteristics and the clinical conditions of the individuals. The first set consists of the following variables: age group (in years: <30, 30-60, and ≥60), gender (male and female), race (white, black, yellow, pardos, indigenous, unidentified, and missing), education (illiterate, elementary 1, elementary 2, high school, higher education, and missing), region of residence (North, Northeast, Midwest, Southeast, and South) and area of residence (urban, rural, and missing).
The second set, which is analyzed together with the variables described in blocks 1, 2, and 3, consists of the individuals' clinical conditions: main symptoms (at least one and none), risk group (yes and no), main symptoms and risk groups (yes and no), ICU admission (yes, no, and missing), and ventilatory support (invasive, non-invasive, no, and missing).
For those variables that had more than 10% of missing observations, a category called "missing" was created to group them to avoid generating biased estimates. The main symptom variable considers the following symptoms: fever, cough, dyspnea, respiratory distress, and oxygen saturation. The variable risk group considers whether the patient has any comorbidity, such as chronic diseases, immunodeficiency, diabetes mellitus, asthma, puerperal women, and obesity, among others. Table 1 shows the expected association between the dependent variable, when it assumes the value 1 (that is, death), and the main explanatory variables of this study. For categorical variables, the expected association is always relative to the comparison category. In this study, some important variables for the outcome of COVID-19, such as vaccination, social distancing, and use of masks, were not considered in the analysis because there are insufficient data on the latter two. Regarding vaccination, although data exist, they are restricted to vaccination that began in 2021 and was mostly staggered.
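The rule for missing observations described above (a separate "missing" category once more than 10% of a variable's observations are missing) can be sketched as follows. The function name and the toy column are hypothetical, and the treatment of variables below the threshold is left open because the text does not state it.

```python
def add_missing_category(values, threshold=0.10, label="missing"):
    """Replace None with `label` when the share of None exceeds `threshold`.

    Below the threshold the values are returned unchanged; how such
    observations were handled is not stated in the text.
    """
    if not values:
        return list(values)
    share_missing = sum(v is None for v in values) / len(values)
    if share_missing > threshold:
        return [label if v is None else v for v in values]
    return list(values)


# Hypothetical toy column: 3 of 5 observations are missing (60% > 10%).
education = ["illiterate", None, "high school", None, None]
print(add_missing_category(education))
```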

Method
Logistic regression is a generalized linear model used to describe the relationship between a binary dependent variable, which assumes values of 0 or 1, and explanatory variables that may be qualitative or quantitative (Fávero et al., 2009). It models the probability that the dependent variable assumes a certain value as a function of a set of explanatory variables.
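A minimal sketch of this idea for a single explanatory variable: the logistic function maps a linear predictor to a probability, and exponentiating a coefficient gives the odds ratio for a one-unit increase in that variable. The intercept and slope below are hypothetical values chosen only for illustration.

```python
import math


def logistic(z):
    """Logistic (sigmoid) function: maps a real number to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)


# Hypothetical intercept and slope, chosen only for illustration.
alpha, beta = -1.0, 0.5

p0 = logistic(alpha + beta * 0)  # probability of y = 1 when x = 0
p1 = logistic(alpha + beta * 1)  # probability of y = 1 when x = 1

odds_ratio = odds(p1) / odds(p0)
print(round(odds_ratio, 6), round(math.exp(beta), 6))  # the two values coincide
```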
In this study, binomial logistic regression analysis was used to estimate the values of the crude and adjusted odds ratios (OR), as well as their respective 95% confidence intervals (95% CI), with the COVID-19 outcome as the dependent binary variable, where 0 represents other outcomes and 1 represents death by COVID-19. The data analysis was performed using Stata, version 15.1.
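For a single binary explanatory variable, the crude OR from a logistic regression equals the cross-product ratio of the corresponding 2×2 table, so it can be sketched without fitting a model. The counts below are hypothetical, and the interval is a conventional Wald interval on the log scale, which is an assumption of this sketch: the study reports robust confidence intervals computed in Stata, which may differ.

```python
import math


def crude_odds_ratio(exposed_deaths, exposed_other, unexposed_deaths, unexposed_other, z=1.96):
    """Crude odds ratio with a Wald confidence interval (z = 1.96 gives roughly 95%)."""
    or_ = (exposed_deaths * unexposed_other) / (exposed_other * unexposed_deaths)
    se_log_or = math.sqrt(
        1 / exposed_deaths + 1 / exposed_other + 1 / unexposed_deaths + 1 / unexposed_other
    )
    lower = math.exp(math.log(or_) - z * se_log_or)
    upper = math.exp(math.log(or_) + z * se_log_or)
    return or_, lower, upper


# Hypothetical counts: deaths and other outcomes in two groups.
or_, lo, hi = crude_odds_ratio(400, 600, 300, 700)
print(f"OR = {or_:.3f}, 95% CI = ({lo:.3f}, {hi:.3f})")
```

An interval that excludes 1, as in this toy example, corresponds to an association that is statistically significant at roughly the 5% level.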
Considering that this paper aims to identify the factors associated with COVID-19-induced death in Brazil with a focus on economic and social aspects, four blocks of logistic regression were estimated. Block 1 encompasses the municipal economic variables. Block 2 covers the sociodemographic variables of the municipalities. Block 3 presents the estimates of individuals' sociodemographic characteristics. Block 4 focuses on individuals' clinical characteristics.
Each block includes a univariate analysis to obtain the crude OR for each variable and a multivariate analysis to obtain the OR adjusted for all variables in the block. Thus, the OR for block 4 are adjusted for all the predictor variables described in this section, which enables the identification of a more direct effect of the association between COVID-19-induced deaths and the economic and sociodemographic characteristics of the municipalities. The logistic regression model is estimated by the maximum likelihood method. In the model, y denotes a binary variable in which 0 represents outcomes other than COVID-19-induced death (hospital discharge, death from other causes, ongoing treatment) and 1 represents COVID-19-induced death; X_i is the vector of explanatory variables referring to individual i; and R_a is the vector of explanatory variables referring to the municipality a where individual i resides.
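Under the definitions above, a standard logistic specification would take the following form. The intercept $\alpha$ and the coefficient vectors $\beta$ and $\gamma$ are notation assumed here for illustration; they do not appear in the text.

```latex
\begin{equation}
P(y = 1 \mid X_i, R_a)
  = \frac{\exp\left(\alpha + X_i \beta + R_a \gamma\right)}
         {1 + \exp\left(\alpha + X_i \beta + R_a \gamma\right)}
\end{equation}
```

In this form, the exponential of a coefficient is the odds ratio associated with a one-unit increase in the corresponding variable, holding the other variables fixed.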

Results
From January 1, 2020, to May 31, 2022, 1,944,391 people were diagnosed and hospitalized with COVID-19, of whom 610,639 (31.41%) died and 1,333,752 (68.59%) had other outcomes (hospital discharge, death from other causes, and ongoing treatment). Table 2 presents the sample distribution as well as the proportion of deaths by economic, sociodemographic, and health characteristics. Regarding the economic and sociodemographic aspects (the focus of this study), the proportion of deaths is higher among individuals who live in municipalities with lower GDP per capita, lower access to piped water, lower garbage collection coverage, and higher illiteracy rates, as well as among those who live in poorer regions (the North and Northeast, and rural areas) and in coastal municipalities. Regarding individuals, the proportion of deaths is higher among the elderly, males, and less educated individuals.
In Table 2, some variables have many missing values, such as education, race, area of residence, ICU admission, and ventilatory support. Therefore, the estimates for these variables must be interpreted with caution.
Table 3 presents the crude and adjusted OR between the variables in block 1 and COVID-19-induced deaths. The variable ln of GDP per capita proved to be statistically significant in both the crude and adjusted models, with a lower risk of death for individuals living in wealthier municipalities. Individuals within the group with the most extreme poverty (>1%) are more likely to die.
Table 4 presents the crude and adjusted OR between the variables in block 2 (the sociodemographic characteristics of the municipalities) and COVID-19-induced deaths. All variables before and after the adjustment were statistically significant, except for tourist municipality.
The results in Table 4 show that groups with greater piped water coverage and a higher percentage of garbage collection are less likely to die. Higher population density and higher illiteracy rates are associated with higher chances of death. Compared to living in a metropolitan region, living in the other types of composition is associated with lower chances of death (except for urban agglomerations in both the crude and adjusted OR models). Hospitalized individuals living in non-border municipalities are more likely to die than those living in border municipalities (15.6% more likely in the crude analysis and 10.3% in the adjusted analysis). Living in tourist municipalities is associated with lower chances of death, while living in coastal municipalities has a positive relationship with deaths by COVID-19.
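The percentages quoted above can be read as changes in the odds implied by an odds ratio, using percent change = (OR - 1) × 100. The sketch below applies this conversion; the OR values of 1.156 and 1.103 are those implied by the reported 15.6% and 10.3%, under the assumption that these percentages refer to odds.

```python
def percent_change_in_odds(odds_ratio):
    """Percent change in the odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


# ORs implied by the reported percentages (an assumption, see the text above).
for label, or_value in [("crude", 1.156), ("adjusted", 1.103)]:
    print(label, round(percent_change_in_odds(or_value), 1))
```

An OR below 1 yields a negative value, which corresponds to lower odds of death relative to the comparison category.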
As can be seen in Table 5, the crude and adjusted OR between the variables in block 3 (the individuals' sociodemographic characteristics) and deaths by COVID-19 were statistically significant for all variables, except for gender. Older individuals are more likely to die. Black and pardos individuals are more likely to die when compared to white individuals. Individuals with higher education levels are less likely to die from COVID-19 when compared to illiterate individuals. The same occurs for those people who live in rural areas in the adjusted analysis. People who live in the center-south of Brazil have lower chances of death from COVID-19.
Table notes: Source: elaborated by the authors based on the study database. All the variables included in the model were adjusted. Robust confidence intervals in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1.
Table 6 presents the estimates of the crude and adjusted OR between COVID-19-induced deaths and the variables in block 4, which include, in addition to the economic and social variables of the municipality and the individuals' sociodemographic variables, the clinical variables of the individuals. In both the univariate analysis and the multivariate analysis of the clinical characteristics of individuals, having none of the main symptoms and not being in the risk group were negatively associated with deaths resulting from COVID-19, while individuals who were admitted to the ICU and used invasive respiratory support had a higher risk of death.
In the multivariate analysis (adjusted for all variables) presented in Table 6, some variables lost statistical significance, namely: main symptoms and risk groups; ICU admission; the other compositions category of the type of composition variable; the yellow, pardos, and indigenous categories of the race variable; and all categories, except for elementary 1, of the education variable. The other variables remained statistically significant, including the female category of the gender variable and the Northeast category of the region of residence variable, which were not statistically significant before adjustment. For some variables, the magnitude of the OR changed in the adjusted model, reversing the direction of their association with the dependent variable. This occurred for the >90% categories of the piped water and garbage collection variables, which increased the risk of COVID-19-induced deaths. In some cases, the risk of death was reduced after the adjustment, as in the case of individuals hospitalized with the new coronavirus residing in non-tourist municipalities and non-rural areas.
Even after all variables were adjusted (Table 6), individuals living in wealthier municipalities (with a higher ln of GDP per capita) were less likely to die after being infected with COVID-19, while individuals living in municipalities with a higher population density, a higher illiteracy rate, and individuals in the group of greatest extreme poverty (>1%) had a higher chance of death by COVID-19. These chances are also greater for individuals residing in urban agglomerations, coastal municipalities, non-border municipalities, as well as in tourist municipalities.
Table notes: Source: elaborated by the authors based on the study database. All the variables included in the model were adjusted. Robust confidence intervals in parentheses. *** p < 0.01, ** p < 0.05, * p < 0.1.
The adjusted model in Table 6 also shows that more educated individuals, females, those who had not been admitted to the ICU and who did not use invasive ventilatory support, and those who were not in the risk group were less likely to die from COVID-19 when compared to their counterparts. When compared to white individuals, people of color were more likely to die from COVID-19. The same situation was noted for individuals living in the Northern region and the elderly.

Discussion
In Brazil, between January 1, 2020, and May 31, 2022, most of the people who were hospitalized and died after contracting COVID-19 were male, older than 60 years, pardos, residents of the Southeast region, and urban dwellers. These people lived mostly in municipalities with more than 1% of the population living in extreme poverty, tourist cities, and metropolitan regions. Most of the deaths occurred in municipalities with low GDP per capita, high population density, and high illiteracy rates.
The adjusted results of the logistic model reveal that the chances of death of an individual who was hospitalized due to COVID-19 decreased as the municipal GDP per capita increased. The opposite occurred for those hospitalized in municipalities where over 1% of the population lived in extreme poverty when compared to those who lived in cities where this percentage was 1% or below. These results are supported by the recent literature on the topic (Abedi et al., 2021; You et al., 2020), which indicates that low economic development might increase the chances of death by COVID-19.
Results of this study indicate that the chances of COVID-19-induced deaths increase when the population density and the size of the municipal population increase. The literature has presented different results that show the relationship between the population variable and COVID-19-induced deaths. In some cases, the results differ (Abedi et al., 2021; Stojkoski et al., 2020) from those of the present study, but in others, they align (You et al., 2020). These differences may be related to the behavior, access conditions, and absorption of information about the new coronavirus by the population within each locality, as well as methodological issues in each study, such as sample size and observation periods.
It is also observed that individuals living in coastal, tourist, and non-border municipalities are more likely to die from COVID-19 complications when compared to their counterparts. These results indicate that the rate of contracting COVID-19 is higher in municipalities with the highest flow of people (tourists and coastal), consequently increasing the chances of COVID-19-related deaths. However, individuals who do not reside in border municipalities are more likely to die, which corroborates the fact that Brazil is the epicenter of the disease in South America.
The adjusted estimates of the logistic model indicate that individuals over 60 years old are more likely to die when compared to individuals under 30 years old. These results corroborate many studies that have shown that elderly COVID-19 patients are more likely to die when compared to their younger counterparts (Garg et al., 2020; Gupta et al., 2020; Hou et al., 2020; Li et al., 2020; Wu et al., 2020). Regarding gender, hospitalized females were less likely to die when compared to males. This evidence has been documented in research that investigates the factors associated with COVID-19-induced deaths (Alsan et al., 2020; Gupta et al., 2020; Li et al., 2020).
The adjusted OR indicate that more educated individuals affected by COVID-19 were less likely to die compared to illiterate individuals. This result may indicate that the way individuals receive and process information may vary according to their level of education. Regarding race, black, pardos, and indigenous individuals who were hospitalized due to COVID-19 had a higher risk of death when compared to white individuals. This result may be linked to their level of exposure to COVID-19, which may be higher for racial minorities, especially blacks and pardos, as they experience less favorable economic situations than white people (Abedi et al., 2021; Alsan et al., 2020).
Regarding the region in which the individual is hospitalized due to COVID-19, individuals hospitalized in the Northern region have a higher risk of death from COVID-19 when compared to other regions. This result may be linked to the high incidence and mortality rates of COVID-19 in the Northern region when compared to other regions of the country and the high percentage of the population of the municipalities living in extreme poverty, which is 12.14%, the highest percentage in the country, according to the calculations in this research.
The OR adjusted for clinical characteristics reveal that individuals hospitalized with COVID-19 who did not present the main symptoms, those who were not in the risk group, and those who simultaneously presented none of the main symptoms and were not in the risk group had a lower risk of dying from COVID-19 complications when compared to their counterparts, in line with previous studies (Garg et al., 2020; Gupta et al., 2020; Hou et al., 2020; Huang et al., 2020; Li et al., 2020; Liu et al., 2020; Wu et al., 2020).
The results presented in this study also reveal that individuals who were not hospitalized in ICUs and those who did not receive invasive respiratory support were less likely to die from COVID-19 compared to those who were hospitalized in ICUs and those who received invasive respiratory support. These results may reflect a possible delay between the moment symptoms worsen and the intervention needed (ICU admission or use of ventilatory support) to relieve them in time to avoid deaths that these interventions could prevent. This possible delay may be due to a fragile hospital structure regarding the availability of beds, ICUs, and ventilatory support. Therefore, the fight against the COVID-19 pandemic is strongly linked to an improvement in hospital structure to adequately serve the population's needs, especially the most vulnerable portion, which depends almost exclusively on the public health system.
Caution is necessary when interpreting the estimates for municipalities, as they are estimates at the municipal level, not at the individual level. From the results of this study, it can be stated that the locality in which the hospitalized individual lives is an important factor in the outcome of the disease. The risk of dying from COVID-19 is higher in places with the worst socioeconomic indicators and in places with higher population density and a larger population. Moreover, other characteristics of the municipalities, including where they are located (coastal, border) and their composition, are significantly associated with COVID-19-related deaths. This indicates that the economic, social, and demographic factors that relate to the place of residence of individuals hospitalized because of the virus may increase or decrease the chances of COVID-19-induced deaths. Thus, the planning of actions to combat health crises such as that caused by the new coronavirus must not ignore the influence of local factors, especially socioeconomic ones, on the worsening of the crisis.
As in this study, some studies using different methodologies and with different objectives have highlighted that economic inequalities in Brazil can aggravate the impact of the COVID-19 pandemic on individuals living in less developed cities (Demenech et al., 2020; Pires et al., 2020). These economic inequalities can compromise the ability to react to the disease at both the individual and municipal levels (Demenech et al., 2020).
Ceteris paribus, the better the socioeconomic conditions, the lower the risk of COVID-19-induced deaths. This shows the importance of developing public policies that aim to improve socioeconomic conditions in Brazil to increase the capacity to react to COVID-19 or other diseases with similar characteristics that may affect the Brazilian population. These policies should focus on reducing extreme poverty and increasing the education level of the population, to improve access to and absorption of important information to combat a pandemic. In structural terms, it is necessary to guarantee a more inclusive and accessible health system, especially for those individuals experiencing less favorable economic situations. Policies to strengthen the Brazilian health system, such as the expansion of the primary care network and hospital structures, must be formulated. These policies are structural and long-term. Therefore, they should be discussed and implemented as soon as possible to reduce the impact of a future pandemic or health crisis like COVID-19 on Brazilians.

Conclusions
The results found in this study for the clinical and sociodemographic factors of individuals hospitalized with COVID-19 are widely supported in the literature on the subject. Individuals who were over 60 years of age, male, black, pardos, indigenous, or illiterate, who had at least one of the main symptoms, or who needed ICU admission or invasive ventilatory support had a greater risk of death by COVID-19 when compared to their counterparts. The same is true for hospitalized individuals living in municipalities with a large population and a high population density. Also, the risk of death by COVID-19 is higher for those who live in cities with low GDP per capita, high illiteracy rates, and a high percentage of extreme poverty.
This study achieved its objective of identifying the factors associated with deaths from COVID-19 in Brazil. It presents important results by relating these deaths to socioeconomic and demographic factors in addition to clinical factors. From this, it is possible to point out the main public policies that should be adopted to tackle the COVID-19 pandemic. These policies involve improving the socioeconomic conditions of the population and strengthening the public health system by expanding and improving hospital structures and access to them.
Although this paper has achieved its objective, it is important to highlight certain limitations regarding the use of the data in this study. As the database at the individual level is built from records filled out by the reporting health units, its quality depends on those who fill in the data. Unfortunately, some information has not been filled in, generating high percentages of missing data for some variables, such as the education variable, in which 64.41% of the observations are missing. However, the dependent variable (the outcome of COVID-19) was filled in. Considering the data at the municipality level, the main limitation relates to the time lag of some variables, especially those from the 2010 Census.