Project Final Report

Team Members

Project Description

Project Goal & Social Problem

We chose earthquakes as our topic: they are natural disasters that can be devastating and unpredictable, especially in regions that are not used to them.

The aim of this project is to understand whether earthquakes around the world are related to one another. To that end, we looked for historical, regional and triggering links between earthquakes.

Project data & access to data

We knew that the choice of dataset mattered for making the earthquake data meaningful, so we chose the United States Geological Survey (USGS) for worldwide data and the Boğaziçi University Kandilli Observatory and Earthquake Research Institute (KOERI) for data specific to Turkey. We used the USGS and KOERI earthquake data for the years 2016-2020.

The datasets were easy to obtain through the web interface thanks to the APIs provided by the USGS and KOERI. The analysis uses only events with a magnitude >2.5, in order to increase accuracy and avoid confusion.
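As a rough illustration of the API access described above, the sketch below builds a query URL for the USGS FDSN event web service with the magnitude threshold applied at download time. The helper name `usgs_query_url` and the one-year window are assumptions made for illustration, not the project's actual download code, and the KOERI side is not shown.

```r
# Hypothetical helper: build a CSV query URL for the USGS FDSN event service.
usgs_query_url <- function(start, end, min_mag = 2.5) {
  sprintf(
    "https://earthquake.usgs.gov/fdsnws/event/1/query?format=csv&starttime=%s&endtime=%s&minmagnitude=%s",
    start, end, format(min_mag)
  )
}

url_2016 <- usgs_query_url("2016-01-01", "2017-01-01")
print(url_2016)

# The result could then be read directly, for example:
# data_2016 <- read.csv(url_2016, stringsAsFactors = FALSE)
```

Shorter time windows may be needed in practice, since the service limits how many events a single query may return.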

Actions taken

Within the scope of the project, we first cleaned the data we imported from the USGS and KOERI sites. The datasets also contained uncertain events, so we excluded them so that they would not affect the analysis. Restricting the import to magnitude >2.5 made our job much easier. Next, we removed about 2,000 rows with missing data. We recast the variables to appropriate data types and looked at summary statistics for the numeric variables to get an idea of the data. We then decided on the visualizations that we thought might be useful and drew them.
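The cleaning steps described above can be sketched on a small made-up table. The column names (`time`, `mag`, `depth`, `type`) and all values are assumptions for illustration; the real catalogues have more columns and the project's own cleaning code may differ.

```r
# Toy stand-in for an imported catalogue (values are made up for illustration).
quakes_raw <- data.frame(
  time  = c("2016-01-01T00:10:00.000Z", "2016-01-02T05:30:00.000Z",
            "2016-01-03T12:00:00.000Z", "2016-01-04T18:45:00.000Z"),
  mag   = c(2.6, NA, 4.1, 5.3),
  depth = c(10.0, 7.5, NA, 33.0),
  type  = c("earthquake", "earthquake", "explosion", "earthquake"),
  stringsAsFactors = FALSE
)

# Keep confirmed earthquakes only, then drop rows with any missing value.
quakes_clean <- quakes_raw[quakes_raw$type == "earthquake", ]
quakes_clean <- quakes_clean[complete.cases(quakes_clean), ]

# Summary statistics for the numeric variables.
print(summary(quakes_clean[, c("mag", "depth")]))

# Number of rows removed by the cleaning.
rows_removed <- nrow(quakes_raw) - nrow(quakes_clean)
print(rows_removed)
```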

  1. The datasets retrieved from KOERI and USGS have been imported.

  2. Column names and value formats have been modified for convenience.

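library(dplyr)      # select(), filter(), mutate(), %>%
library(lubridate)  # year(), month()

#KOERI
# Rename the Turkish column names to English and keep only rows of type "Ke"
turkey_tidyquake <- turkey_earthquake %>% 
                      select(No,
                             Event_ID = `Deprem Kodu`,
                             Date = `Olus tarihi`,
                             Origin_Time = `Olus zamani`,
                             Latitude = Enlem,
                             Longitude = Boylam,
                             Depth_km = `Der(km)`,
                             Mag = xM,
                             Type = Tip) %>% 
                      filter(Type == "Ke")
# Parse the date and derive year and month columns
turkey_tidyquake$Date <- as.Date(turkey_tidyquake$Date, format = "%Y.%m.%d")
turkey_tidyquake <- turkey_tidyquake %>% 
                      mutate(Year = year(Date),
                             Month = month(Date))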

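#USGS
# Parse the ISO 8601 timestamps as UTC. The result is stored as POSIXct
# because dplyr verbs do not accept the POSIXlt columns that strptime() returns.
data$time <- as.POSIXct(strptime(data$time, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC"))
data$year <- format(data$time, format="%Y")
data$month <- format(data$time, format="%m")
data$date <- format(data$time, format="%Y%m%d")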
#KOERI
# Bin magnitudes into classes; intervals are right-closed, so (2.4, 4] is labelled "2.5-4"
turkey_tidyquake <- turkey_tidyquake %>% 
                      mutate(magClass = cut(Mag, breaks=c(2.4,4,5,6,7,9),
                                            labels=c("2.5-4", "4-5", "5-6", "6-7", "7-9")))
#USGS
data <- mutate(data, magClass = cut(mag, breaks=c(2.4, 4, 5, 6, 7, 9),
                                    labels=c("2.5-4", "4-5", "5-6", "6-7", "7-9")))
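To make the binning above concrete, here is a minimal sketch that applies the same breaks and labels to a few made-up magnitudes. The vector `sample_mag` is purely illustrative. It shows that a value sitting exactly on a break falls into the lower class, and that anything above 9 is left unclassified.

```r
mag_breaks <- c(2.4, 4, 5, 6, 7, 9)
mag_labels <- c("2.5-4", "4-5", "5-6", "6-7", "7-9")

# Made-up magnitudes, including boundary values and one value above the last break.
sample_mag <- c(2.5, 4.0, 4.1, 6.5, 7.8, 9.1)

# cut() uses right-closed intervals by default: (2.4, 4], (4, 5], ...
sample_class <- cut(sample_mag, breaks = mag_breaks, labels = mag_labels)
print(data.frame(mag = sample_mag, magClass = sample_class))
```

Here 4.0 is assigned to "2.5-4" rather than "4-5", and 9.1 becomes `NA`, so a catalogue containing an event above magnitude 9 would need a wider last break.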

Earthquakes Around The World

As seen above, the earthquakes that occurred between 2016 and 2020 are plotted on the map. The fault line running through the Atlantic Ocean can be observed on the map.

Distribution of earthquakes in Turkey according to their magnitude

As seen above, earthquakes occur often from the northwest to the southwest of Turkey. We can also see that earthquakes are common along the North Anatolian Fault.

Pie Chart

Distribution of The Number of Earthquakes by Years.

As seen below, more than 20 thousand earthquakes occurred each year from 2016 to 2020. In 2018, nearly twice as many earthquakes occurred as in 2017. This is the highest count of earthquakes in these 5 years.

In Turkey, more than one thousand earthquakes occurred each year from 2016 to 2020. In 2017, more than 5,000 earthquakes occurred in Turkey, which is the peak for these 5 years.

library(ggplot2)  # ggplot(), geom_bar() and the other plotting functions

# Count earthquakes per year
year_turkey <- turkey_tidyquake %>%
                group_by(Year) %>% 
                tally()
# Bar chart of the yearly counts, one colour per bar
tr_yearthquake <- ggplot(year_turkey) + geom_bar(aes(x = Year, y = n, fill = as.factor(n)),
                                                 stat = "identity")+
                  scale_fill_brewer(palette = "Set1")+
                  theme_minimal()+
                  theme(legend.position = "none")+
                  ylab("Counts")
tr_yearthquake <- tr_yearthquake + ggtitle("Distribution of Earthquake Counts in Turkey by Years")
tr_yearthquake

Distribution of The Number of Earthquakes by Months.

More than 10 thousand earthquakes were observed in the world each year from 2016 to 2020. The number of earthquakes increased in summer, which suggests that there could be a relationship between temperature and earthquakes.
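One way to check whether such a monthly pattern is more than noise is a chi-squared goodness-of-fit test against an even rate over the calendar. The sketch below is not part of the original analysis, and the counts in `monthly_counts` are made up for illustration; they are not the project's results.

```r
# Made-up monthly counts for illustration only (not the project's results).
monthly_counts <- c(Jan = 9800, Feb = 9100, Mar = 9900, Apr = 9700,
                    May = 10100, Jun = 10400, Jul = 10900, Aug = 10800,
                    Sep = 9900, Oct = 9800, Nov = 9500, Dec = 9700)

# Expected share of each month if earthquakes were spread evenly over the
# calendar (non-leap year; leap days are ignored in this sketch).
days_in_month <- c(31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31)
expected_share <- days_in_month / sum(days_in_month)

# Chi-squared goodness-of-fit test against the even-rate hypothesis.
fit <- chisq.test(monthly_counts, p = expected_share)
print(fit$p.value)

# Ratio of observed to expected counts; values above 1 mark busier months.
print(round(monthly_counts / fit$expected, 2))
```

A small p-value would only indicate that the counts are uneven across months. Aftershock sequences cluster in time, so an uneven distribution would not by itself point to temperature as the cause.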

library(RColorBrewer)
# Count earthquakes per month (all years pooled)
month_world <- data %>% group_by(month) %>% tally()
# Bar chart of the monthly counts; the 9-colour Set1 palette is interpolated to 12 colours
world_monthquake <- ggplot(month_world) + geom_bar(aes(x=month, y=n, fill= as.factor(n)), stat="identity") +
     scale_fill_manual(values = colorRampPalette(brewer.pal(9,"Set1"))(12))+
     theme_minimal()+
     theme(legend.position = "none")
world_monthquake <- world_monthquake + ggtitle("Distribution of Earthquake Counts in The World by Months") + xlab("Months") + ylab("Counts")
world_monthquake

In Turkey, we also see that the number of earthquakes increases in the summer season, while most of the earthquakes occurred in January and February.

# Count earthquakes in Turkey per month (all years pooled)
tr_monthquake <- turkey_tidyquake %>% 
                  group_by(Month) %>% 
                  tally()
# Bar chart of the monthly counts with one axis tick per month
tr_month_plot <- ggplot(tr_monthquake) + geom_bar(aes(x=Month, y=n, fill= as.factor(n)),
                                                  stat = "identity") +
                  scale_fill_manual(values = colorRampPalette(brewer.pal(9,"Set1"))(12))+
                  scale_x_continuous(breaks = c(1:12))+
                  theme_minimal()+
                  theme(legend.position = "none")
tr_month_plot <- tr_month_plot + 
                  ggtitle("Distribution of Earthquake Counts in Turkey by Months")
tr_month_plot

Distribution of The Earthquakes by years and magnitude classes.

# USGS: count earthquakes per year and magnitude class, one panel per class
world_year_class <- data %>% group_by(year, magClass) %>% tally()
world_year_class %>%
  ggplot(aes(year, n)) +
  geom_point(size = 1, col = "red") +
  facet_wrap(~magClass, ncol = 2, scales = "free") +
  ggtitle("Number of Earthquakes by Magnitude Class and Year") +
  xlab("Year") + ylab("Number of Cases") +
  theme(plot.title = element_text(face = "bold", size = 14, hjust = 0.5)) +
  theme(axis.title = element_text(face = "bold", size = 12))

# KOERI: count earthquakes per year and magnitude class, one panel per class
tr_year_class <- turkey_tidyquake %>% group_by(Year, magClass) %>% tally()
tr_year_class %>%
  ggplot(aes(Year, n)) +
  geom_point(size = 1, col = "red") +
  facet_wrap(~magClass, ncol = 2, scales = "free") +
  ggtitle("Number of Earthquakes by Magnitude Class and Year") +
  xlab("Year") + ylab("Number of Cases") +
  theme(plot.title = element_text(face = "bold", size = 14, hjust = 0.5)) +
  theme(axis.title = element_text(face = "bold", size = 12))

Let’s examine the distribution of earthquakes by months and magnitude classes.

# USGS: count earthquakes per month and magnitude class, one panel per class
world_month_class <- data %>% group_by(month, magClass) %>% tally()
world_month_class %>%
  ggplot(aes(month, n)) +
  geom_point(size = 1, col = "red") +
  facet_wrap(~magClass, ncol = 2, scales = "free") +
  ggtitle("Number of Earthquakes by Magnitude Class and Month") +
  xlab("Months") + ylab("Number of Cases") +
  theme(plot.title = element_text(face = "bold", size = 14, hjust = 0.5)) +
  theme(axis.title = element_text(face = "bold", size = 12))

Density Plots

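library(ggpubr)  # ggdensity()

# Density of magnitudes in the worldwide (USGS) data
ggdensity(data$mag, 
          main = "Density Plot of Magnitude in The World",
          xlab = "Magnitude")

# Density of magnitudes in the Turkey (KOERI) data
ggdensity(turkey_tidyquake$Mag,
          main = "Density plot of magnitude in Turkey",
          xlab = "Magnitude")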

Results and Discussion

Our research led to the following findings. We learned the average earthquake magnitudes in the world. On the world map, we saw that earthquakes are concentrated along coastlines. We saw that the number of earthquakes in 2018 was almost double that of other years. We observed an increase in the number of earthquakes in the summer months, and this increase appears especially in the 2.5-4 magnitude range.

Thanks to our project, we obtained separate insights from the earthquake data of the world and of Turkey. We checked whether there is a link between earthquakes in the world and in Turkey. We compared the world averages with those of Turkey, which is known as an earthquake zone. In this process, we determined that we could make comparisons on the basis of magnitude, time of day, season and location (sea/land), and we focused on these factors in our research accordingly.
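The world-versus-Turkey comparison described above can be sketched as a table of magnitude-class shares next to the mean magnitudes. The two magnitude vectors below are made up for illustration and are not the project's data; `class_share` is a hypothetical helper, not a function from the project.

```r
# Made-up magnitudes for illustration only (not the project's data).
world_mag  <- c(2.6, 3.1, 4.4, 4.8, 5.2, 6.1, 3.3, 2.9)
turkey_mag <- c(2.5, 2.7, 3.0, 3.4, 4.2, 2.8)

mag_breaks <- c(2.4, 4, 5, 6, 7, 9)
mag_labels <- c("2.5-4", "4-5", "5-6", "6-7", "7-9")

# Hypothetical helper: share of events falling into each magnitude class.
class_share <- function(mag) {
  classes <- cut(mag, breaks = mag_breaks, labels = mag_labels)
  counts <- tabulate(classes, nbins = length(mag_labels))
  setNames(counts / length(mag), mag_labels)
}

# One row per region, one column per magnitude class.
comparison <- rbind(World = class_share(world_mag),
                    Turkey = class_share(turkey_mag))
print(round(comparison, 2))

# Mean magnitude per region.
mean_mag <- c(World = mean(world_mag), Turkey = mean(turkey_mag))
print(round(mean_mag, 2))
```

Comparing shares rather than raw counts keeps the two regions comparable even though the worldwide catalogue is much larger.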

Conclusion

In conclusion, we analyzed earthquake data for the world and Turkey for the years 2016-2020, obtained from USGS and KOERI, which we identified as reliable data sources. The analysis consisted of data import, cleaning, restructuring by data type and basic visualization. This allowed us to see visually whether the earthquakes that took place in the world and in Turkey are similar. It also let us examine whether our country, which we refer to as an earthquake zone, really has above-average earthquake activity compared to other countries in the world.

You can also access our project’s GitHub page here: Statistics of earthquake hazards in Turkey and comparison with the world

References