
Exploratory data analysis of COVID-19 - Interactive notebook and report
Introduction
COVID-19 has been one of the greatest health and social crises of our time. To understand the scale of the pandemic and make informed decisions, an Exploratory Data Analysis (EDA) is essential. This report walks step by step through an EDA of COVID-19 data, using tools and techniques to visualize, analyze, and understand how the virus spread in different countries.
1. Importing libraries and loading the dataset
We begin the EDA by importing the necessary libraries: pandas, matplotlib, seaborn, and numpy. We then load the dataset, which is in CSV format, using the read_csv function from the pandas library. Next, we clean the data by dropping columns that do not provide relevant information for our analysis.
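As a minimal sketch of this step, the snippet below loads a CSV with read_csv and drops a column. The report does not name the dataset or its columns, so the inline sample, the column names (iso_code, location, date, new_cases, new_deaths, tests_units) and the choice of column to drop are all assumptions made for illustration.

```python
import io

import pandas as pd

# Hypothetical sample standing in for the real CSV file.
# Column names and values are assumptions, not taken from the report.
SAMPLE_CSV = """iso_code,location,date,new_cases,new_deaths,tests_units
CHL,Chile,2021-01-01,3000,50,tests performed
CHL,Chile,2021-01-02,3200,55,tests performed
ARG,Argentina,2021-01-01,5000,100,tests performed
"""

# read_csv accepts a file path or any file-like object;
# parse_dates converts the date column to datetime values.
df = pd.read_csv(io.StringIO(SAMPLE_CSV), parse_dates=["date"])

# Drop a column assumed to be irrelevant for the analysis.
df = df.drop(columns=["tests_units"])

print(df.head())
```

With a real file, the StringIO object would be replaced by the path to the CSV.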
2. Filtering data by country
To perform a more specific analysis, we select a particular country. In our case, we choose Chile as an example. We filter the dataset by the selected country and store the result in a new dataframe to work with.
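The filtering step can be sketched with a boolean mask, as below. The column name location and the sample values are assumptions for illustration; the actual dataset may label countries differently.

```python
import pandas as pd

# Hypothetical data; the "location" column name is an assumption.
df = pd.DataFrame(
    {
        "location": ["Chile", "Chile", "Argentina"],
        "date": pd.to_datetime(["2021-01-01", "2021-01-02", "2021-01-01"]),
        "new_cases": [3000, 3200, 5000],
    }
)

country = "Chile"

# Keep only the rows for the selected country. The copy() call makes the
# result an independent dataframe, so later edits do not touch df.
df_country = df[df["location"] == country].copy().reset_index(drop=True)

print(df_country)
```

Changing the value of country is enough to repeat the analysis for another country.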
3. Data analysis and visualization
In this stage of the process, we dive into data analysis and visualization. Using line charts, we analyze and visualize features such as new cases, new deaths, positivity rate, ICU patients, and vaccinations over a timeline from the beginning of the pandemic to the present. Additionally, we calculate correlations between variables and create a heat map to better understand the relationships between the selected features.
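The sketch below illustrates both parts of this stage: one line chart per feature over time, and a correlation matrix drawn as a heat map. The feature names, the placeholder values and the use of seaborn's heatmap function are assumptions for illustration, not a record of how the notebook builds its figures.

```python
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns

# Hypothetical placeholder values for a single country; column names are assumed.
df_country = pd.DataFrame(
    {
        "date": pd.date_range("2021-01-01", periods=6, freq="D"),
        "new_cases": [3000, 3200, 3100, 3500, 3700, 3600],
        "new_deaths": [50, 55, 52, 60, 66, 63],
        "icu_patients": [800, 820, 815, 850, 880, 870],
    }
)

features = ["new_cases", "new_deaths", "icu_patients"]

# One line chart per feature, all sharing the same time axis.
fig, axes = plt.subplots(len(features), 1, figsize=(8, 6), sharex=True)
for ax, feature in zip(axes, features):
    ax.plot(df_country["date"], df_country[feature])
    ax.set_ylabel(feature)
axes[-1].set_xlabel("date")
fig.autofmt_xdate()

# Pairwise Pearson correlations between the selected features.
corr = df_country[features].corr()

# Heat map of the correlation matrix, with the color scale fixed to [-1, 1].
fig_hm, ax_hm = plt.subplots(figsize=(5, 4))
sns.heatmap(corr, annot=True, cmap="coolwarm", vmin=-1, vmax=1, ax=ax_hm)

print(corr)
```

In a notebook the figures render inline; in a script, plt.show() or fig.savefig would be needed to see them.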
4. Exporting the dataset
Once the cleaning and column selection stage is completed, we export the new dataset for use in interactive analyses with tools such as Plotly. This allows us to generate more interactive and attractive visualizations for the user.
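Exporting can be sketched with the to_csv method, as below. The output file name covid_clean.csv, the temporary directory and the sample columns are hypothetical; the report does not say where or under what name the cleaned dataset is saved.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Hypothetical cleaned dataset; column names and values are assumptions.
df_clean = pd.DataFrame(
    {
        "location": ["Chile", "Argentina"],
        "date": pd.to_datetime(["2021-01-01", "2021-01-01"]),
        "new_cases": [3000, 5000],
    }
)

# Hypothetical output location: a file in the system temporary directory.
out_path = Path(tempfile.gettempdir()) / "covid_clean.csv"

# index=False keeps the dataframe index out of the file.
df_clean.to_csv(out_path, index=False)

# Reload the file to confirm that the export can be read back.
reloaded = pd.read_csv(out_path, parse_dates=["date"])
print(reloaded.shape)
```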
5. Using the Plotly library
Finally, we use the Plotly library to create interactive visualizations. Using its charting functions, we generate bar charts that compare a feature, such as the number of cases, between two countries. We also create a world map that displays COVID-19 statistics for each country using a color scale.
Conclusions
Exploratory Data Analysis of COVID-19 data is a powerful tool that allows us to better understand the spread of the virus and make informed decisions. Through data visualization and correlation calculations, we can identify patterns, trends, and relationships between different variables. Libraries like Plotly let us create interactive visualizations that make the data easier for the user to understand.
Here is the PDF
- A PDF version of this page is available so you can read the full report. Link to the PDF