Learning Path for Data Scientists

Fundamentals of Python and R (10 Hrs)

  • Basics of Python and R
  • Conditionals and loops
  • String and list objects
  • Functions & OOP concepts
  • Exception handling.
  • Database programming.
Data scientists must know how to code. Start by learning the fundamentals of two popular programming languages: Python and R.
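
As a small illustration of how the topics above fit together, here is a minimal sketch that combines a function, a loop, exception handling, and database programming. It uses only Python's built-in `sqlite3` module; the `modules` table, its rows, and the `total_hours` helper are made up for this example.

```python
import sqlite3


def total_hours(rows):
    """Sum the hours in (name, hours) rows, skipping non-numeric values."""
    total = 0
    for name, hours in rows:
        try:
            total += int(hours)
        except (TypeError, ValueError):
            # Exception handling: skip rows whose hours are not a number.
            continue
    return total


# Database programming: an in-memory SQLite database (nothing is written to disk).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE modules (name TEXT, hours TEXT)")
conn.executemany(
    "INSERT INTO modules VALUES (?, ?)",
    [("Python and R", "10"), ("Data Wrangling", "16"), ("Draft module", "n/a")],
)
rows = conn.execute("SELECT name, hours FROM modules").fetchall()
conn.close()

hours = total_hours(rows)
print(hours)
```

The row with "n/a" is skipped by the `except` branch instead of crashing the program, which is the usual reason to wrap type conversions in a `try` block.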

Data Wrangling (16 Hrs)

  • Reading CSV, JSON, XML and HTML files using Python
  • NumPy & pandas
  • Relational databases and data manipulation with SQL
  • SciPy libraries
  • Loading, cleaning, transforming, merging, and reshaping data
Once you have the core skill of programming covered, get your feet wet with the nitty-gritty of working with data by learning how to wrangle and visualize it.
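
The load, clean, transform, and merge steps listed above can be sketched in a few lines. This example deliberately sticks to the standard library (`csv`, `json`) so it runs anywhere; in practice pandas is the more common tool for the same workflow. The inline data and field names are hypothetical.

```python
import csv
import io
import json

# Hypothetical inline data standing in for a CSV file and a JSON file.
csv_text = "id,name,score\n1, Ada ,91\n2,Grace,\n3,Linus,78\n"
json_text = '[{"id": 1, "team": "ml"}, {"id": 3, "team": "infra"}]'

# Load: parse each source into plain Python structures.
people = list(csv.DictReader(io.StringIO(csv_text)))
teams = {row["id"]: row["team"] for row in json.loads(json_text)}

# Clean and transform: trim whitespace, convert types, drop rows with no score.
cleaned = [
    {"id": int(p["id"]), "name": p["name"].strip(), "score": int(p["score"])}
    for p in people
    if p["score"].strip()
]

# Merge: attach the team from the JSON source by matching on id.
merged = [{**p, "team": teams.get(p["id"])} for p in cleaned]
print(merged)
```

Dropping the row with the missing score is only one possible cleaning choice; filling in a default value would be equally valid, depending on the analysis.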

Statistics and Probability (8 Hrs)

  • Probability mass functions
  • Probability density functions
  • Cumulative distribution functions
  • Modeling distributions
  • Inferential statistics
  • Estimation
  • Hypothesis testing
  • Implementation of statistical concepts in Python
You cannot draw reliable conclusions from data without a grounding in statistics. Use these concepts to collect, organize, analyze, interpret, and present data.
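
To make "implementation of statistical concepts in Python" concrete, here is a minimal sketch of a two-sided one-sample z-test using the standard library's `statistics.NormalDist`. The sample values, the hypothesized mean `mu0`, and the assumption that the population standard deviation `sigma` is known are all invented for illustration.

```python
from math import sqrt
from statistics import NormalDist, mean

# Made-up measurements; we test whether their mean differs from mu0.
sample = [5.1, 4.9, 5.6, 5.8, 5.3, 5.4, 5.7, 5.2]
mu0 = 5.0    # hypothesized population mean (null hypothesis)
sigma = 0.4  # population standard deviation, assumed known for a z-test

n = len(sample)
# Estimation: the sample mean is the point estimate of the population mean.
x_bar = mean(sample)
# Test statistic: how many standard errors the estimate lies from mu0.
z = (x_bar - mu0) / (sigma / sqrt(n))
# Two-sided p-value from the standard normal CDF.
p_value = 2 * (1 - NormalDist().cdf(abs(z)))
reject = p_value < 0.05

print(f"z = {z:.3f}, p = {p_value:.4f}, reject H0: {reject}")
```

When the standard deviation has to be estimated from the sample, which is the usual case, a t-test is the appropriate replacement for this z-test.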

Introduction to AI (4 Hrs)

Learn how AI can help solve real-world problems using machine learning & deep learning.

ML Models in Python (30 Hrs)

  • Building models using the algorithms below:
  • Linear and logistic regression
  • Decision trees
  • Support vector machines (SVMs)
  • Random forests
  • XGBoost
  • K-nearest neighbours & hierarchical clustering
  • Principal component analysis
  • Text analytics and time series forecasting
Machines have increased our ability to interpret large volumes of complex data. Combine aspects of computer science with statistics to formulate algorithms that help machines draw insights from structured and unstructured data.
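
As a taste of the first algorithm in the list, the sketch below fits a simple linear regression with the closed-form least-squares formulas, in plain Python. It is meant to show the idea, not replace a library such as scikit-learn; the `fit_line` helper and the data points are hypothetical.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    x_mean = sum(xs) / n
    y_mean = sum(ys) / n
    # Slope = covariance(x, y) / variance(x); the 1/n factors cancel.
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    sxx = sum((x - x_mean) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return slope, intercept


# Made-up training data: hours studied vs. test score.
hours = [1, 2, 3, 4, 5]
scores = [52, 58, 61, 67, 72]

slope, intercept = fit_line(hours, scores)
predicted = slope * 6 + intercept  # prediction for an unseen input
print(slope, intercept, predicted)
```

The same fit-then-predict pattern carries over to the other algorithms in the list; only the model and its fitting procedure change.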

Data Visualisation using Matplotlib and Tableau (10 Hrs)

  • Interactive visualizations with Matplotlib
  • Data visualizations using Tableau
  • Tableau dashboards and stories
  • Tableau and R integration
Complex data sets call for simple representations that are easy to follow. Visualize and communicate key insights derived from data effectively by using tools like Matplotlib and Tableau.

ML/DL using TensorFlow (20 Hrs)

  • Basics of neural networks
  • Linear algebra
  • Implementation of a neural network from scratch (vanilla, no framework)
  • Basics of TensorFlow
  • Convolutional neural networks (CNNs)
  • Recurrent neural networks (RNNs)
  • Generative models
  • Semi-supervised learning using GANs
  • Seq-to-seq model
  • Encoder and decoder
Go beyond superficial analysis of data by learning how to interpret it in depth. Use deep neural networks built with TensorFlow to uncover hidden structure, even in unlabeled and unstructured data.
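
Before reaching for TensorFlow, it helps to see the smallest building block written by hand. The sketch below trains a single perceptron (one neuron with a step activation) on the logical AND function in plain Python. It is only one unit, not a full network, and the learning rate, epoch count, and helper names are arbitrary choices for illustration.

```python
# Training data for logical AND: ((x1, x2), target).
data = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights = [0, 0]
bias = 0
learning_rate = 1


def predict(x):
    """Step activation applied to the weighted sum of the inputs."""
    total = weights[0] * x[0] + weights[1] * x[1] + bias
    return 1 if total > 0 else 0


# Perceptron learning rule: nudge the weights by the error on each example.
for epoch in range(20):
    errors = 0
    for x, target in data:
        error = target - predict(x)
        if error != 0:
            weights[0] += learning_rate * error * x[0]
            weights[1] += learning_rate * error * x[1]
            bias += learning_rate * error
            errors += 1
    if errors == 0:
        break  # every example is classified correctly

predictions = [predict(x) for x, _ in data]
print(weights, bias, predictions)
```

A single perceptron can only separate classes with a straight line, which is why deeper networks with several layers and smooth activations are needed for the CNN and RNN topics listed above.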

Handling Big Data with Spark (12 Hrs)

  • Introduction to Big Data & Spark
  • RDDs in Spark, DataFrames & Spark SQL
  • Spark Streaming, MLlib & GraphX
Lastly, manage your infrastructure with a data engineering platform like Spark so that your efforts can be focused on solving data problems rather than problems of machines.

Finally, do some beautiful projects. 
