Relevance of statistics in data science.
“Statistics is the discipline that concerns the collection, organization, analysis, interpretation and presentation of data.” — Wikipedia
From this definition, you can easily tell that statistics plays a huge role in data science, since collecting, organizing, analyzing, interpreting, and presenting data is a large part of what data scientists do. Statistics is key to analyzing data well and producing usable results. When exploring a dataset, statistical features such as the mean, median, and spread give deeper insight into how the data is structured and help uncover information that is not obvious at first glance. This is why it is important to understand statistics and to learn a range of techniques, including how and when to use each one.
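To make the idea of exploring a dataset with statistical features more concrete, here is a minimal sketch using Python's standard `statistics` module. The `orders` list is a made-up sample, not real data, and the choice of which features to compute is only an assumption about what a first look might include.

```python
import statistics

# Hypothetical sample: daily order counts, invented for illustration.
orders = [12, 15, 11, 14, 13, 15, 90]

# A few basic statistical features that describe the data's structure.
summary = {
    "mean": statistics.mean(orders),
    "median": statistics.median(orders),
    "stdev": statistics.stdev(orders),  # sample standard deviation
    "min": min(orders),
    "max": max(orders),
}

for name, value in summary.items():
    print(f"{name}: {value:.2f}")
```

In this sample the mean comes out well above the median, which hints that one unusually large value (the 90) is pulling the average up. That is the kind of hidden information a quick statistical summary can bring out.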
Importance of data preparation before data analysis.
Data preparation is important because it ensures the quality and accuracy of the data. Without it, results can be dirty and inaccurate, and important details can be overlooked. It's like cleaning chicken or fish before cooking to make sure it's clean and safe to eat. First, you inspect it to check whether it's still good to use. Then you remove the parts that need to be discarded, do whatever other preparation is needed, and finally you're good to go.
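The same inspect, discard, and tidy steps can be sketched in code. This is only an illustration under stated assumptions: the `raw_rows` records and the `prepare` function are hypothetical, and the rules used here (drop rows with a missing or invalid age, trim whitespace, drop duplicates) are just one possible cleaning policy.

```python
# Hypothetical raw records, invented for illustration.
raw_rows = [
    {"name": " Alice ", "age": "34"},
    {"name": "Bob", "age": ""},        # missing age
    {"name": "Carol", "age": "29"},
    {"name": " Alice ", "age": "34"},  # duplicate of the first row
    {"name": "Dan", "age": "-5"},      # impossible value
]


def prepare(rows):
    """Return cleaned rows: trimmed names, integer ages, no duplicates."""
    cleaned = []
    seen = set()
    for row in rows:
        name = row["name"].strip()
        age_text = row["age"].strip()
        # Discard rows whose age is empty or not a plain non-negative number.
        if not age_text.isdigit():
            continue
        age = int(age_text)
        # Skip rows that repeat an already-kept (name, age) pair.
        key = (name, age)
        if key in seen:
            continue
        seen.add(key)
        cleaned.append({"name": name, "age": age})
    return cleaned


clean_rows = prepare(raw_rows)
print(clean_rows)
```

Only the rows for Alice and Carol survive, each kept once, with names trimmed and ages converted to integers. Any analysis run afterwards works on data that has already been checked.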