A Data Science Project- Part 1(a)

In the earlier post, we have discussed our hypothesis and different variables impacting life expectancy. In this post, we will start digging into data. I suggest you go through codebook(check the earlier post). It is important to understand variables first before doing any analysis. I have chosen SAS as data analysis language, but you are free to choose any language R or Python. The process and logic will be same, the just syntax will be different. Now in this part, we will explore all the variables. We will check frequency and distribution of variables. For numeric variables, you can draw the histogram, boxplot, quantiles to understand them and for the categorical variables, you can draw barplot, frequency tables.

Advertisements

One thought on “A Data Science Project- Part 1(a)

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s