"Data science is a combination of three things: quantitative analysis (for the rigor required to understand your data), programming (to process your data and act on your insights), and narrative (to help people comprehend what the data means)." — Darshan S
It can be overwhelming to start building a great data analyst portfolio. There are tons of concepts you have learned and plenty you haven’t. There are a million project ideas out there, and you aren’t sure where to begin. Here are a few things that will help you build a great portfolio for a data science role.
1. Use Kaggle
Kaggle is a go-to site for a lot of enthusiasts to find their data and begin their analysis.
You can pick an area of interest such as health care (body temperature, blood pressure, etc.), finance, movies, or food. Download the data, clean it, produce some visualizations and summary statistics, and there you go: you've just done a data project.
If Kaggle looks overwhelming, start with your own personal data, or conduct your own survey and ask people to fill out a questionnaire. Next time, step it up a little and look online for a dataset. Go with something you are interested in (e.g. video games, sports, or movies) and set yourself a question to answer. This might not feel like a major thing to show recruiters, but you'll be building the skills and confidence to move on to more involved projects.
Once you have worked on a few of the available datasets, try competing in one of Kaggle's popular competitions to get a hands-on feel for building things from scratch.
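As a rough illustration of the "load it, clean it, summarize it" loop described above, here is a minimal sketch using only Python's standard library. The CSV contents, the column names (`body_temp_c`, `systolic_bp`) and the helper functions are all made up for this example; in practice you would read a file you downloaded, and many people would reach for pandas instead.

```python
import csv
import io
import statistics

# Hypothetical sample standing in for a downloaded CSV file.
RAW_CSV = """patient_id,body_temp_c,systolic_bp
1,36.6,118
2,37.1,125
3,,130
4,38.2,141
5,36.9,
"""

def load_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))

def numeric_column(rows, name):
    """Return one column as floats, skipping blank cells."""
    return [float(r[name]) for r in rows if r[name].strip()]

def summarize(values):
    """Basic summary statistics for a list of numbers."""
    return {
        "count": len(values),
        "mean": round(statistics.mean(values), 2),
        "median": statistics.median(values),
        "min": min(values),
        "max": max(values),
    }

rows = load_rows(RAW_CSV)
temp_summary = summarize(numeric_column(rows, "body_temp_c"))
bp_summary = summarize(numeric_column(rows, "systolic_bp"))
print("body temperature:", temp_summary)
print("systolic blood pressure:", bp_summary)
```

Plotting is left out to keep the sketch dependency-free, but the same cleaned columns could be passed to whichever charting library you prefer.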
2. Get your hands dirty by cleaning the data

Getting started is the hardest part, but you'll never have impressive portfolio projects to show off if you don't start small.
When looking online for a dataset, you will find a number of freely available ones at https://registry.opendata.aws/. They are also compatible with AWS cloud computing resources.
The best part about these datasets is you can find something unique, whereas with Kaggle you might be repeating the work others have already done.
Open data from government sources is extremely useful. These datasets are often unstructured, and you need to spend time processing and cleaning the data in a meaningful way. You will face similar circumstances in the real world, so get your hands dirty and fully engage in the data cleaning process, which can often take up the majority of your time.
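To make the cleaning step concrete, here is a small sketch of the kind of work involved: trimming whitespace, normalizing case, parsing mixed date formats, marking missing values, and dropping duplicates. The records, field names, date formats and missing-value markers below are invented for illustration; a real dataset will have its own quirks that you discover as you go.

```python
from datetime import datetime

# Hypothetical messy records, as they might arrive from an open-data portal.
RAW_RECORDS = [
    {"city": " Springfield ", "date": "2021-03-01", "permits": "12"},
    {"city": "springfield", "date": "01/03/2021", "permits": "12"},   # same row, different formatting
    {"city": "Shelbyville", "date": "2021-03-02", "permits": "n/a"},  # missing value
    {"city": "SHELBYVILLE", "date": "2021-03-03", "permits": " 7"},
]

DATE_FORMATS = ("%Y-%m-%d", "%d/%m/%Y")          # assumed formats for this example
MISSING_MARKERS = {"", "n/a", "na", "null", "-"}  # assumed markers for "no data"

def parse_date(text):
    """Try each known format; return None if none of them match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None

def clean_record(raw):
    """Normalize one record: tidy the city name, parse the date, convert the count."""
    permits_text = raw["permits"].strip().lower()
    return {
        "city": raw["city"].strip().title(),
        "date": parse_date(raw["date"]),
        "permits": None if permits_text in MISSING_MARKERS else int(permits_text),
    }

def clean(records):
    """Clean every record, dropping unparseable dates and duplicate (city, date) pairs."""
    seen = set()
    cleaned = []
    for raw in records:
        rec = clean_record(raw)
        key = (rec["city"], rec["date"])
        if rec["date"] is None or key in seen:
            continue
        seen.add(key)
        cleaned.append(rec)
    return cleaned

cleaned_records = clean(RAW_RECORDS)
for rec in cleaned_records:
    print(rec)
```

Writing down each rule as its own small function also gives you something to talk through in an interview: why a value counts as missing, and why two rows count as duplicates.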
3. Working on your interests
Find something that interests you and do something with it. You'll find a project like that a lot easier to talk about if you get to the interview stage. Employers are often more interested in your thought process than in a shopping list of the models and programming languages you used to build it.
Don't overthink it; just do a project and see what comes out.
I often start with a dataset and work on applying the major algorithms: Linear Regression, Logistic Regression, SVM, Gradient Boosting, Random Forest, PCA, K-Means, Collaborative Filtering, KNN, and ARIMA.
The goal is not just to run the algorithms but to truly understand when and why to use each.
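One way to build that "when and why" intuition is to run two algorithms on the same data and compare their errors. The sketch below is a toy example under stated assumptions: the data (y = x squared) is synthetic, and both models are written from scratch with the standard library purely for illustration. In a real project you would more likely use a library such as scikit-learn.

```python
# Hypothetical toy data: y = x^2, a clearly non-linear relationship.
train_x = [float(x) for x in range(-5, 6)]  # -5, -4, ..., 5
train_y = [x * x for x in train_x]
test_x = [-4.5, -1.5, 0.5, 2.5, 4.5]
test_y = [x * x for x in test_x]

def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def knn_predict(xs, ys, query, k=2):
    """Average the targets of the k training points nearest to the query."""
    nearest = sorted(zip(xs, ys), key=lambda pair: abs(pair[0] - query))[:k]
    return sum(y for _, y in nearest) / k

def mean_squared_error(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)

slope, intercept = fit_line(train_x, train_y)
linear_preds = [slope * x + intercept for x in test_x]
knn_preds = [knn_predict(train_x, train_y, x) for x in test_x]

linear_mse = mean_squared_error(test_y, linear_preds)
knn_mse = mean_squared_error(test_y, knn_preds)
print(f"linear regression MSE: {linear_mse:.4f}")
print(f"k-nearest neighbours MSE: {knn_mse:.4f}")
```

On this symmetric, curved data the fitted line is flat, so linear regression does far worse than KNN. The lesson is not that KNN is better in general, only that a straight-line model is the wrong tool when the relationship is not linear.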
4. Using GitHub and WordPress
Use GitHub Pages to display your project work. It is very easy to set up (see https://pages.github.com/), and you can use it to show off some data cleaning and feature engineering on datasets you find interesting. Then gradually add to them or take on different challenges.
Alternatively, you can build a free website using WordPress, where you can share more of your ideas on data science, technology, or your hobbies in general. You can optimize your site for search engines (SEO) to help it rank higher in relevant Google searches. Congrats! Everyone in the world can now see your work and your passion for data science.
5. Communication skills matter

Use LinkedIn to connect with people who share similar interests. Don't hesitate to reach out to recruiters and have conversations about their business needs and how you can add value to their firm. Use www.meetup.com to find local data science and tech meetups. Attend them and network with people who share your interests; these are also great places to meet recruiters.
Quick tip: Pick an industry or business that interests you and try to identify a pain point in that business that data science can solve. That will impress hiring managers because it shows you actually understand their business and how to leverage data to create value, which is what they hire for at the end of the day.