Exploratory Analysis of Swiss Fertility Data

This report summarizes an exploratory analysis of the Swiss dataset. The dataset contains 47 observations, one for each of 47 French-speaking provinces of Switzerland around 1888, and 6 variables. Of particular interest is the Fertility variable, a standardized measure of fertility.

A few summary statistics stand out. Fertility ranges from a minimum of 35 to a maximum of 92.5, with a mean of 70.14. Among the socioeconomic indicators, the average percent of males employed in agriculture across provinces was 50.66 at that time. In addition, the average percent of draftees with education beyond primary school was 10.98.

Relationship between Socioeconomic Indicators and Fertility

The bivariate correlations among the six variables show that several socioeconomic indicators appear to be related to fertility.

                  Fertility  Agriculture  Examination  Education  Catholic  Infant.Mortality
Fertility             1.000        0.353       -0.646     -0.664     0.464             0.417
Agriculture           0.353        1.000       -0.687     -0.640     0.401            -0.061
Examination          -0.646       -0.687        1.000      0.698    -0.573            -0.114
Education            -0.664       -0.640        0.698      1.000    -0.154            -0.099
Catholic              0.464        0.401       -0.573     -0.154     1.000             0.175
Infant.Mortality      0.417       -0.061       -0.114     -0.099     0.175             1.000

The correlation table shows a strong negative correlation between fertility and education (-0.664). This suggests that provinces where a larger share of draftees had education beyond primary school tend to have lower fertility.

Similarly, there is a strong negative correlation between fertility and examination (-0.646). The variable Examination represents the percent of draftees receiving the highest mark on the army examination. This suggests that provinces with a high percent of draftees scoring the highest mark on the army examination tend to have lower fertility.

Of the variables in the dataset, Catholic and Infant Mortality have the highest positive correlations with fertility, at 0.464 and 0.417 respectively. This suggests that provinces with a higher percent of Catholics, and provinces where a larger share of live births survive less than 1 year, tend to have higher fertility.

The plot comparing the percent of Catholics with the fertility measure also shows that most provinces have either a large majority or a small minority of Catholics. Only a few provinces appear to be relatively evenly split between Catholics and Protestants.
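The ranking described in this section can be reproduced directly from the Fertility row of the correlation table. The snippet below is a minimal sketch in Python, not the code used for the original analysis: the values are copied from the table, and the squared correlation is included only as a rough gauge of how much variation a one-variable linear fit would account for.

```python
# Correlations with Fertility, copied from the correlation table in this report.
fertility_corr = {
    "Agriculture": 0.353,
    "Examination": -0.646,
    "Education": -0.664,
    "Catholic": 0.464,
    "Infant.Mortality": 0.417,
}

# Rank by strength (absolute value); the sign is kept to report direction.
ranked = sorted(fertility_corr.items(), key=lambda kv: abs(kv[1]), reverse=True)

for name, r in ranked:
    direction = "negative" if r < 0 else "positive"
    # r * r is the squared correlation for a single-predictor linear fit.
    print(f"{name:<17}{r:>7.3f}  {direction:<9} r^2 = {r * r:.3f}")
```

Sorting by absolute value places Education and Examination first, followed by Catholic, Infant.Mortality, and Agriculture, which matches the order in which the indicators are discussed above.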

Relationships between Other Variables

In addition, there is a negative correlation of -0.64 between the percent of males involved in agriculture and the percent of draftees with education beyond primary school. This suggests that in provinces where education beyond primary school is more common, the percent of males involved in agriculture is lower. This is an intuitive relationship, since people who obtain more education are more likely to pursue occupations outside agriculture, although the correlation alone does not establish that explanation.
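To put the Agriculture and Education relationship in context, the remaining pairs in the correlation table can be scanned in the same way. The following is a minimal sketch in Python, with the matrix values copied from the table in this report; the variable names `names`, `corr`, and `pairs` are illustrative choices, not part of the original analysis.

```python
from itertools import combinations

names = ["Fertility", "Agriculture", "Examination",
         "Education", "Catholic", "Infant.Mortality"]

# Correlation matrix copied from the table in this report (rows follow `names`).
corr = [
    [ 1.000,  0.353, -0.646, -0.664,  0.464,  0.417],
    [ 0.353,  1.000, -0.687, -0.640,  0.401, -0.061],
    [-0.646, -0.687,  1.000,  0.698, -0.573, -0.114],
    [-0.664, -0.640,  0.698,  1.000, -0.154, -0.099],
    [ 0.464,  0.401, -0.573, -0.154,  1.000,  0.175],
    [ 0.417, -0.061, -0.114, -0.099,  0.175,  1.000],
]

# Sanity check: a correlation matrix must be symmetric.
n = len(names)
assert all(corr[i][j] == corr[j][i] for i in range(n) for j in range(n))

# Collect every pair that does not involve Fertility (index 0),
# then sort from strongest to weakest by absolute correlation.
pairs = [(names[i], names[j], corr[i][j])
         for i, j in combinations(range(1, n), 2)]
pairs.sort(key=lambda p: abs(p[2]), reverse=True)

for a, b, r in pairs:
    print(f"{a:<12} vs {b:<17}{r:>7.3f}")
```

By this ordering, Agriculture and Education is the third-strongest pair among the non-fertility variables, behind Examination with Education (0.698) and Agriculture with Examination (-0.687).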