Learning Path for Data Scientists
Fundamentals of Python and R (10 hrs)
- Basics of Python and R
- Conditionals and loops
- String and list objects.
- Functions & OOP concepts.
- Exception handling.
- Database programming.
Data scientists must know how to code. Start by learning the fundamentals of
two popular programming languages, Python and R.
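As a rough illustration of the topics listed above, here is a minimal Python sketch that touches conditionals, lists and strings, functions, a class, exception handling, and database access. The `Account` class and the helper names are hypothetical, made up for this example; the database part assumes the built-in `sqlite3` module with an in-memory database.

```python
import sqlite3


class Account:
    """A tiny class to illustrate OOP: state plus behaviour."""

    def __init__(self, owner, balance=0.0):
        self.owner = owner
        self.balance = balance

    def withdraw(self, amount):
        # Conditional: refuse to overdraw by raising an exception.
        if amount > self.balance:
            raise ValueError(f"insufficient funds for {self.owner}")
        self.balance -= amount
        return self.balance


def initials(names):
    """String and list handling: first letter of each non-empty name, upper-cased."""
    return [name.strip()[0].upper() for name in names if name.strip()]


def safe_withdraw(account, amount):
    """Exception handling: return True on success, False if the withdrawal fails."""
    try:
        account.withdraw(amount)
        return True
    except ValueError:
        return False


def total_balance(accounts):
    """Database programming: store rows in SQLite, then aggregate with SQL."""
    conn = sqlite3.connect(":memory:")
    try:
        conn.execute("CREATE TABLE accounts (owner TEXT, balance REAL)")
        conn.executemany(
            "INSERT INTO accounts VALUES (?, ?)",
            [(a.owner, a.balance) for a in accounts],
        )
        (total,) = conn.execute("SELECT SUM(balance) FROM accounts").fetchone()
        return total
    finally:
        conn.close()
```

For example, `safe_withdraw(Account("ada", 100.0), 500.0)` returns `False` instead of crashing, because the `ValueError` is caught inside the function.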
Data Wrangling (16 hrs)
- Reading CSV, JSON, XML and HTML files using Python
- NumPy & pandas
- Relational databases and data manipulation with SQL
- SciPy libraries
- Loading, cleaning, transforming, merging, and reshaping data
Once you have the core skill of programming covered, get into the nitty-gritty
of working with data by learning how to wrangle and visualize it.
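The load, clean, merge, and reshape steps can be sketched with nothing but Python's standard library. This is only an illustration under stated assumptions: the order and customer records are invented sample data, and the sketch uses the built-in `csv` and `json` modules rather than pandas, which would normally handle the same steps more concisely.

```python
import csv
import io
import json

# Hypothetical inputs, inlined so the sketch is self-contained.
ORDERS_CSV = """order_id,customer_id,amount
1,c1, 20.5
2,c2,
3,c1,4.5
"""

CUSTOMERS_JSON = '[{"id": "c1", "name": "Ada"}, {"id": "c2", "name": "Grace"}]'


def load_orders(text):
    """Read CSV rows and clean them: strip whitespace, drop rows with no amount."""
    rows = []
    for row in csv.DictReader(io.StringIO(text)):
        amount = row["amount"].strip()
        if not amount:
            continue  # cleaning: skip records with a missing value
        rows.append({"customer_id": row["customer_id"].strip(),
                     "amount": float(amount)})
    return rows


def total_per_customer(orders, customers):
    """Merge orders with customer names, then reshape into name -> total."""
    names = {c["id"]: c["name"] for c in customers}
    totals = {}
    for order in orders:
        name = names[order["customer_id"]]
        totals[name] = totals.get(name, 0.0) + order["amount"]
    return totals


orders = load_orders(ORDERS_CSV)
customers = json.loads(CUSTOMERS_JSON)
totals = total_per_customer(orders, customers)
```

Here the order with a missing amount is dropped during cleaning, so only customers with at least one valid order appear in `totals`.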
Statistics and Probability (8 hrs)
- Probability mass functions
- Probability density functions
- Cumulative distribution functions
- Modeling distributions
- Inferential statistics
- Estimation
- Hypothesis testing
- Implementation of statistical concepts in Python
It is hard to use data well without a working knowledge of statistics. Collect,
organize, analyze, interpret, and present data using these statistical concepts.
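To make a few of these ideas concrete, the sketch below computes a probability mass function, the matching cumulative distribution function, a point estimate, and a simple one-sided hypothesis test for a coin. The coin-flip numbers (9 heads in 10 flips) are made up for illustration, and the 5% threshold is just the conventional choice, not something this course prescribes.

```python
import math


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Binomial(n, p)."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)


def binomial_cdf(k, n, p):
    """P(X <= k): the CDF is the running sum of the PMF."""
    return sum(binomial_pmf(i, n, p) for i in range(k + 1))


def one_sided_p_value(heads, flips, p_null=0.5):
    """P(X >= heads) under the null hypothesis that the coin has bias p_null."""
    return sum(binomial_pmf(i, flips, p_null) for i in range(heads, flips + 1))


# Estimation: the sample proportion is the point estimate of the coin's bias.
heads, flips = 9, 10
p_hat = heads / flips

# Hypothesis testing: how surprising is this result if the coin is fair?
p_value = one_sided_p_value(heads, flips)
reject_null = p_value < 0.05  # conventional 5% significance level
```

With these numbers the p-value is 11/1024, so the fair-coin hypothesis would be rejected at the 5% level.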
Introduction to AI (4 hrs)
Learn how AI can help solve real-world problems using
machine learning & deep learning.
ML Models in Python (30 hrs)
- Building models using the algorithms below
- Linear and logistic regression
- Decision trees
- Support vector machines (SVMs)
- Random forests
- XGBoost
- K-nearest neighbours & hierarchical clustering
- Principal component analysis
- Text analytics and time series forecasting
Machines have increased our ability to interpret large volumes of complex data.
Combine aspects of computer science with statistics to formulate algorithms that
help machines draw insights from structured and unstructured data.
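As one small example from the list above, here is a from-scratch sketch of a k-nearest-neighbours classifier in plain Python. It is meant only to show the idea (measure distances, keep the k closest points, take a majority vote); the function name and the toy data are hypothetical, and in practice a library implementation would usually be used instead.

```python
import math
from collections import Counter


def knn_predict(train_points, train_labels, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Euclidean distance from the query to every training point.
    distances = [
        (math.dist(point, query), label)
        for point, label in zip(train_points, train_labels)
    ]
    # Keep the k closest neighbours and count their labels.
    nearest = sorted(distances, key=lambda pair: pair[0])[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Hypothetical toy data: two well-separated clusters.
points = [(1.0, 1.0), (1.2, 0.8), (0.9, 1.1), (5.0, 5.0), (5.2, 4.9), (4.8, 5.1)]
labels = ["low", "low", "low", "high", "high", "high"]
```

Calling `knn_predict(points, labels, (1.1, 1.0))` picks the three points of the first cluster as neighbours, so the vote goes to `"low"`.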
Data Visualisation using Matplotlib and Tableau (10 hrs)
- Interactive visualizations with Matplotlib
- Data visualizations using Tableau
- Tableau dashboards and storyboards
- Tableau and R integration
Complex data sets call for simple representations that are easy to follow.
Visualize and communicate key insights derived from data effectively by using
tools like Matplotlib and Tableau.
ML/DL using TensorFlow (20 hrs)
- Basics of neural networks
- Linear algebra
- Implementation of a neural network in vanilla Python
- Basics of TensorFlow
- Convolutional neural networks (CNNs)
- Recurrent neural networks (RNNs)
- Generative models
- Semi-supervised learning using GANs
- Seq-to-seq model
- Encoder and decoder
Go beyond superficial analysis of data by learning how to interpret it in depth.
Use deep neural networks in TensorFlow to uncover hidden structure even in
unlabeled and unstructured data.
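Before reaching for TensorFlow, it can help to see the smallest building block written by hand. The sketch below trains a single sigmoid neuron with batch gradient descent to learn logical AND. It is an illustrative assumption about what a "vanilla" implementation might look like, not the course's own code, and a real network would stack many such units into layers.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def train_neuron(inputs, targets, learning_rate=1.0, epochs=5000):
    """Fit one sigmoid neuron with batch gradient descent on cross-entropy loss."""
    weights = [0.0] * len(inputs[0])
    bias = 0.0
    n = len(inputs)
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(inputs, targets):
            # Forward pass: weighted sum followed by the activation.
            prediction = sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)
            # Backward pass: with sigmoid + cross-entropy the error term
            # simplifies to (prediction - y).
            error = prediction - y
            for i, xi in enumerate(x):
                grad_w[i] += error * xi
            grad_b += error
        # Update step: move against the average gradient.
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict(weights, bias, x):
    """Output of the trained neuron: a value between 0 and 1."""
    return sigmoid(sum(w * xi for w, xi in zip(weights, x)) + bias)


# Logical AND is linearly separable, so a single neuron is enough.
AND_INPUTS = [(0, 0), (0, 1), (1, 0), (1, 1)]
AND_TARGETS = [0, 0, 0, 1]
weights, bias = train_neuron(AND_INPUTS, AND_TARGETS)
```

Thresholding `predict(weights, bias, x)` at 0.5 then reproduces the AND truth table. A problem such as XOR would need at least one hidden layer, which is where frameworks like TensorFlow become convenient.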
Handling Big Data with Spark (12 hrs)
- Introduction to Big Data & Spark
- RDDs in Spark, DataFrames & Spark SQL
- Spark Streaming, MLlib & GraphX
Lastly, manage your infrastructure with a data engineering platform like Spark
so that you can focus your efforts on solving data problems rather than machine
problems.
Finally, do some beautiful projects.