Curriculum
Module 1: Python
This module caters to beginners by acquainting them with the foundational concepts of Python programming. It covers everything from data types and loops to functions and data structures.
- Why Python?
- Python IDE
- Hello World Program
- Variables & Names
- String Basics
- List
- Tuple
- Set
- Dictionaries
- Conditional Statements
- For and While Loop
- Built-in Functions (Numbers and Math)
- User Defined Function
- Modules and Packages
- Common Errors in Python
Module 2: Python Advanced
Building on the Python basics, this module explores more advanced concepts, such as list comprehensions, file handling, and object-oriented programming. It also delves into other important topics like pickling and debugging in Python, offering a comprehensive understanding of advanced Python concepts:
- List Comprehension
- File Handling
- Debugging in Python
- Class and Objects
- Lambda, Filters and Map
- Regular Expressions
- Python PIP
- Read Excel Data in Python
- Iterators, Decorators and Generators
- Pickling
- Python JSON
Module 3: Algorithmic Thinking with Python
This module covers key concepts in algorithm design, including problem-solving strategies, algorithm analysis, data structures, and algorithmic paradigms.
- Introduction to Algorithmic Thinking
- Algorithm Efficiency and Time Complexity
- Example algorithms – binary search, Euclid’s algorithm
- Data structures – stack, heap, and binary trees
- Memory Management/Technologies
- Best Practices – keeping it simple, DRY code, naming conventions, comments, and docs
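As a taste of the example algorithms named above, here is a minimal sketch of binary search and Euclid's algorithm. The function names `binary_search` and `gcd` are illustrative choices, not taken from the course material.

```python
def binary_search(items, target):
    """Return the index of target in the sorted list items, or -1 if absent."""
    lo, hi = 0, len(items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if items[mid] == target:
            return mid
        if items[mid] < target:
            lo = mid + 1   # target can only be in the right half
        else:
            hi = mid - 1   # target can only be in the left half
    return -1


def gcd(a, b):
    """Euclid's algorithm for the greatest common divisor of two non-negative integers."""
    while b:
        a, b = b, a % b    # replace (a, b) with (b, a mod b) until b is zero
    return a
```

Binary search halves the search range on every step, so it runs in O(log n) time on a sorted list; Euclid's algorithm shrinks the pair of numbers on every step until the remainder is zero.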
Module 4: SQL Basic
In this module, we dive into SQL-based databases, covering the basics of SQL queries, schemas, and normalization.
- Database Introduction and Installation
- Data Modeling
- Normalization and Star schema
- ACID Transactions
- Data Types
- Data Definition Language (Create, Drop, Truncate, Alter)
- Data Manipulation Language (Select, Delete, Update, Insert)
- Data Control Language (Grant, Revoke)
- Transaction Control Language (Commit, Savepoint, Rollback)
- SQL Constraints (Primary Key, Foreign Key, Unique, Not NULL, CHECK, DEFAULT)
- Operators (Arithmetic, Logical, Bitwise, Comparison, Compound)
- Clauses in SQL (Where, Having, Group By, Order By)
Module 5: SQL Advanced
We continue with SQL-based databases, covering advanced queries, joins, date and time functions, and subqueries.
- Joins (Inner, Left, Right, Full Join, Equi Join, Non-Equi Join, Self Join)
- Mathematical functions (SQRT, PI, SQUARE, ROUND, CEILING)
- Conversion functions (changing the data types)
- General functions (COALESCE, NVL, NULLIF)
- Conditional expressions (IF, CASE)
- Date and time functions
- Numeric functions
- String Functions
- Subqueries
- Rank and Window Functions
- Integrating Python with SQL
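To illustrate what integrating Python with SQL can look like, the sketch below uses Python's built-in `sqlite3` module with an in-memory database. The tables, sample rows, and the function name `top_customers` are hypothetical and chosen only for illustration; the course may use a different database engine.

```python
import sqlite3


def top_customers(min_total):
    """Return (name, total) rows for customers whose orders sum to at least min_total."""
    conn = sqlite3.connect(":memory:")  # throwaway in-memory database
    conn.executescript("""
        CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE orders (
            id INTEGER PRIMARY KEY,
            customer_id INTEGER NOT NULL REFERENCES customers(id),
            amount REAL NOT NULL CHECK (amount >= 0)
        );
        INSERT INTO customers VALUES (1, 'Asha'), (2, 'Ben'), (3, 'Chen');
        INSERT INTO orders VALUES (1, 1, 120.0), (2, 1, 80.0), (3, 2, 50.0);
    """)
    # Inner join, aggregation, and a HAVING filter; the ? placeholder
    # passes min_total as a bound parameter instead of formatting it into the SQL.
    rows = conn.execute("""
        SELECT c.name, SUM(o.amount) AS total
        FROM customers AS c
        INNER JOIN orders AS o ON o.customer_id = c.id
        GROUP BY c.id, c.name
        HAVING SUM(o.amount) >= ?
        ORDER BY total DESC
    """, (min_total,)).fetchall()
    conn.close()
    return rows
```

Because the join is an inner join, the customer with no orders does not appear in the result at all.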
Module 6: Pandas
This module addresses the essential need for effective data handling. It introduces the Pandas library, detailing its various functions and features for efficient data manipulation and analysis:
- Introduction to Pandas
- Series Data Structure – Querying and Indexing
- DataFrame Data Structure – Querying, Indexing, and loading
- Merging data frames
- Group by operation
- Pivot table
- Date/Time functionality
- Example: Manipulating DataFrame
Module 7: Statistics & Probability with NumPy - Basic
Probability and statistics are essential for understanding, processing, and interpreting large amounts of data. We cover the basics, including probability theory, Bayes' theorem, and distributions, and explain why they matter. We also work through these concepts hands-on with NumPy.
- Why counting and probability theory?
- Basics of sample and event space
- Axioms of probability
- Total Probability theorem and Bayes Theorem
- Random variables, PMF and CDF
- Discrete Distributions – Bernoulli, Binomial and Geometric
- Expectation and its properties
- Variance and its properties
- Continuous Distributions – uniform, exponential and normal
- Sampling from continuous distributions
- Simulation techniques – simulating in NumPy
Module 8: Statistics & Probability with NumPy - Advanced
We continue with statistics and probability, covering descriptive and inferential statistics, hypothesis testing, and other relevant statistical methods.
- Inferential statistics – sample vs population
- CLT and its proof
- Chi-squared distribution and its properties
- Point and Interval Estimators
- Estimation technique – MLE
- Interval Estimator of μ with unknown σ
- Examples of estimators
- Hypothesis testing – I
- Hypothesis testing – II
- Hypothesis testing – III
Module 9: Data Visualization using Python
Data visualization places data in a visual context so that its patterns, trends, and correlations can be understood. We will build many visualizations with libraries such as Seaborn and Matplotlib, which in turn leads to effective storytelling.
- Read Complex JSON files
- Styling Tabulation
- Distribution of Data – Histogram
- Box Plot
- Data Visualization – Recap
- Pie Chart
- Donut Chart
- Stacked Bar Plot
- Relative Stacked Bar Plot
- Stacked Area Plot
- Scatter Plots
- Bar Plot
- Continuous vs Continuous Plot
- Line Plot
- Line Plot Covid Data
Module 10: Data Visualization (Tool) Power BI/Tableau (Add-on)
This module covers a range of topics essential for mastering Power BI and Tableau, including data preparation, data modeling, data visualization, and report creation.
- Power BI
- Introduction to Power BI
- Creating, Managing and Filtering Data
- Basic Plots in Power BI – Trend Analysis, Area, Ribbon, Scatterplots and Decomposition Trees
- Creating Power BI reports
- Creating interactive dashboards and deploying the dashboards
- Tableau
- Introduction to Tableau
- Connecting, managing and aggregating data
- Visual Analytics in Tableau
- Simple predictive analytics using Tableau
- Building Tableau Dashboards
Module 11: Introduction to Machine Learning
This module provides participants with a solid foundation in machine learning principles, algorithms, and methodologies.
- What is Machine Learning?
- Different types of Machine Learning problems (Supervised, Unsupervised, Reinforcement)
- Applications of Machine Learning
- The Machine Learning Pipeline
Module 12: Machine Learning: Data Collection
This module covers a wide range of topics related to data collection, including data acquisition strategies. You will learn how to identify relevant data sources and retrieve data from databases and APIs and through web scraping.
- Data Sources (Structured, Unstructured)
- Data Collection Techniques (APIs, Web Scraping, Sensors)
- Data Acquisition Ethics
Module 13: Machine Learning: Data Cleaning & Pre-Processing
This module covers a comprehensive range of topics related to data cleaning and preprocessing, including handling missing values, dealing with outliers, standardizing and scaling numerical features, encoding categorical variables, and feature engineering.
- Data Cleaning Techniques (Handling Missing Values, Outliers)
- Data Transformation (Scaling, Normalization, Encoding)
- Feature Engineering (Feature Selection, Creation)
- Balancing Data (Undersampling, Oversampling, and SMOTE)
Module 14: Machine Learning: Exploratory Data Analysis
This module covers a wide range of topics related to exploratory data analysis, including data visualization, summary statistics, correlation analysis, and dimensionality reduction techniques.
- Data Visualization Techniques (Histograms, Scatter Plots, Box Plots, etc.)
- Univariate, Bivariate and Multivariate Analysis
- Understanding Data Distribution and Relationships
- Identifying Patterns and Trends
- Feature Importance Analysis
Module 15: Machine Learning: Model Building
Model building is a crucial stage in the machine learning workflow, where practitioners leverage algorithms to learn patterns and make predictions from data. In this module, you will learn about supervised learning techniques.
- Supervised Learning
- Introduction to Supervised Learning
- Linear Regression (Regression)
- Logistic Regression (Classification)
- Decision Tree (Regression / Classification)
- Random Forest (Regression / Classification)
- Support Vector Machine (Regression / Classification)
- Naive Bayes (Classification)
- XGBoost (Regression / Classification)
- KNN (Regression / Classification)
- ARIMA (Forecasting)
Module 16: Machine Learning: Model Building - Continued
We continue with model building and delve into unsupervised learning techniques and reinforcement learning.
- Unsupervised Learning
- Introduction to Unsupervised Learning
- K-Means Clustering
- Hierarchical Clustering
- DBSCAN
- PCA
- Introduction to Reinforcement Learning
Module 17: Machine Learning: Model Evaluation & Hyperparameter Tuning
This module covers a comprehensive range of topics related to model evaluation and hyperparameter tuning. You will learn how to assess the performance of machine learning models using various evaluation metrics.
- Model Evaluation
- Regression (R², MAE, MSE, RMSE, etc.)
- Classification (Accuracy, Precision, Recall, F1-Score, AUC-ROC, etc.)
- Model Hyperparameter Tuning
- Random Search
- Grid Search
- Bayesian Optimization
- Cross Validation
- Early Stopping
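The classification metrics listed above can be computed by hand for a binary problem, which helps clarify what each one measures. The following is a minimal sketch in plain Python; the function name `classification_metrics` and its `positive` parameter are hypothetical, and in practice a library such as scikit-learn would typically be used instead.

```python
def classification_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall, and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)  # true positives
    fp = sum(1 for t, p in pairs if t != positive and p == positive)  # false positives
    fn = sum(1 for t, p in pairs if t == positive and p != positive)  # false negatives

    accuracy = sum(1 for t, p in pairs if t == p) / len(pairs)
    # Guard against division by zero when a class is never predicted or never present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}
```

Precision asks how many predicted positives were correct, recall asks how many actual positives were found, and F1 is their harmonic mean.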
Module 18: Machine Learning: Model Deployment
This module focuses on the final stage of the machine learning pipeline, where trained models are deployed into production environments to make predictions on new data.
- Saving and Loading Models
- Preparing Models for Production Environments
- Model Monitoring and Performance Tracking
- MLFlow
Module 19: Deep Learning with PyTorch: NN & ANN
This module provides participants with a comprehensive introduction to deep learning concepts and techniques using PyTorch. We will also discuss neural networks (NNs), the building blocks of deep learning, and artificial neural networks (ANNs).
- Fundamentals of Neural Networks: Limitations of ML; The Neuron; Linear perceptron as neurons
- Feed Forward Neural Networks: Linear Neurons and limitations; Sigmoid, Tanh and ReLU; Softmax
- Learning-I: Gradient Descent; Delta rule and learning rates; Gradient descent with sigmoidal Neurons
- Learning-II: Backpropagation; Stochastic and minibatch; Test set, validation set, and overfitting
- Preventing overfitting
- PyTorch Basics: Installation and setup of PyTorch; Tensors and operations in PyTorch
- Training Fundamentals: Autograd; Backpropagation; Gradient Descent; Training Pipeline.
- Regression with PyTorch: Linear Regression; Logistic Regression
- Dataset in PyTorch: Dataset and Dataloader; Dataset Transforms.
- Training Pipeline: Softmax and Crossentropy; Activation Functions
Module 20: Deep Learning with PyTorch: CNN
This module focuses specifically on CNNs, a specialized type of neural network designed to effectively capture spatial hierarchies and patterns present in images.
- Introduction to CNN Architecture
- Image Filter/Image Kernel
- Convolution layer and RGB
- Pooling Layer
Module 21: Deep Learning with PyTorch: RNN
This module is designed to provide a deep understanding of recurrent neural networks (RNNs) and their applications using PyTorch, a popular deep learning framework.
- Introduction to RNN Architecture
- Language models
- Generation with RNNs
- Drawback of RNN
Module 22: Deep Learning with PyTorch: LSTM
This module provides a thorough understanding of Long Short-Term Memory (LSTM) networks, including their architecture, training algorithms, and applications.
- Adding more memory: LSTM architecture
- Applications of LSTM
- Drawback of LSTM
Module 23: Deep Learning with PyTorch: Transformers & GAN
This module explores advanced deep learning concepts focusing on Transformers and Generative Adversarial Networks (GANs) using PyTorch.
- Introduction to Transformer Architecture
- Self Attention Layer
- Encoder
- Decoder
- Sequence to Sequence
- Transfer Learning (Hugging Face)
Module 24: Natural Language Processing (NLP)
This module offers participants a comprehensive introduction to the field of natural language processing, focusing on techniques and applications for analyzing and understanding human language data.
- Tokenization
- Normalization
- Stop word removal
- Stemming/Lemmatization
- Text Vectorization and Embedding
- Bag-of-Words (BoW)
- TF-IDF
- Word Embeddings
- Sentence Embeddings
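To make the vectorization topics above concrete, here is a minimal sketch of Bag-of-Words and TF-IDF in plain Python. It assumes whitespace tokenization with lowercasing, term frequency as count divided by document length, and inverse document frequency as log(N / df); libraries often use smoothed variants, so their numbers will differ. The function names are illustrative.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase the text and split on whitespace (a deliberately simple tokenizer)."""
    return text.lower().split()


def bag_of_words(docs):
    """Return one term-count mapping per document."""
    return [Counter(tokenize(doc)) for doc in docs]


def tf_idf(docs):
    """Return one {term: weight} dict per document using tf * log(N / df)."""
    bows = bag_of_words(docs)
    n_docs = len(docs)
    doc_freq = Counter()
    for bow in bows:
        doc_freq.update(bow.keys())  # count each term once per document
    weights = []
    for bow in bows:
        length = sum(bow.values())
        weights.append({
            term: (count / length) * math.log(n_docs / doc_freq[term])
            for term, count in bow.items()
        })
    return weights
```

A term that appears in every document gets a weight of zero under this definition, which is how TF-IDF downweights words that carry little distinguishing information.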
Module 25: Natural Language Processing (NLP) - Continued
In this module, we continue with NLP techniques and focus on applications of pre-trained models using Hugging Face.
- Applications of Pre-Trained Models (Hugging Face):
- Text Classification: Classifying text into predefined categories (e.g., sentiment analysis, spam detection).
- Machine Translation: Translating text from one language to another.
- Question Answering: Extracting answers to questions from a given context.
- Text Summarization: Condensing lengthy text into a shorter, informative summary.
- Text Generation: Generating different creative text formats like poems, code, scripts, etc. (depending on the model).
Module 26: Computer Vision: Image Pre-Processing
This module is designed to equip participants with the essential techniques and methodologies for preparing and pre-processing images in computer vision applications.
- Annotation: Marking important parts of the image, like objects or areas of interest.
- Data Augmentation: Making variations of the image by doing things like flipping, rotating, or changing colors. This helps the model learn better by seeing more examples.
- Normalization: Adjusting the brightness and contrast of the image to make it easier for the model to understand.
- Resizing: Making sure all images are the same size so the model can process them easily.
Module 27: Computer Vision: Image Classification
This module covers image classification, a fundamental task in computer vision, where the goal is to categorize images into predefined classes or categories based on their visual content.
- Convolutional Neural Networks (CNNs)
- Residual Networks (ResNets)
- Inception Networks
- MobileNets
- EfficientNet
Module 28: Computer Vision: Object Detection
This module delves into the techniques and methodologies for detecting and localizing objects within images or videos, a fundamental task in computer vision applications.
- Faster R-CNN
- YOLO (You Only Look Once)
- SSD (Single Shot Multibox Detector)
- Mask R-CNN
Module 29: Computer Vision: Image Segmentation
This module is dedicated to exploring advanced techniques for partitioning images into semantically meaningful regions, known as image segmentation.
Module 30: Cloud Computing using AWS
This module provides a comprehensive understanding of cloud computing principles and practical skills in utilizing Amazon Web Services (AWS), one of the leading cloud service providers.
- Cloud Infrastructure
- Overview of AWS services: compute, storage, networking, databases.
- Key AWS services: EC2, S3, VPC, RDS.
- Cloud Configurations & Services
- IAM for access control.
- CloudFormation for infrastructure as code.
- AWS Lambda for serverless computing.
- Elastic Beanstalk for application deployment.
Module 31: Cloud Computing using AWS - Continued
By the end of this module, you will have the skills and knowledge to use Amazon SageMaker to build, train, and deploy machine learning and deep learning models for a variety of use cases. We will walk through the end-to-end workflow of model development in SageMaker.
- Building & Deploying ML Model in SageMaker
- SageMaker for ML model building and deployment.
- Data preprocessing and model selection.
- Training, evaluation, and deployment of ML models.
- Building & Deploying DL Model in SageMaker
- Deep learning concepts and architectures.
- SageMaker for building and training DL models.
- Deployment of DL models with SageMaker endpoints.
Module 32: Cloud Computing using AWS: Hosting
In this module, we will learn how to deploy and host ML/DL applications on AWS infrastructure. We will also cover the deployment options available on AWS so that you can select the most suitable approach for your application's requirements.
- Hosting An ML/DL Application on AWS
- Integrating ML/DL models into web apps.
- Deployment and scaling on AWS infrastructure.
- Monitoring, logging, security, and compliance measures.
Module 33: Generative AI: Unleashing the Power of Language Models
Generative AI introduces learners to the cutting-edge field of generative artificial intelligence (AI), focusing on the remarkable capabilities of Large Language Models (LLMs) and their applications in various domains. The module provides a comprehensive overview of LLMs, prompt engineering techniques, and fine-tuning strategies.
- LLM (Large Language Model)
- Introduction to Large Language Models
- Description of GPT-3 and ChatGPT architecture
- Application of LLMs in various fields
- Basic description of other LLMs
- Learn GenAI with Llama, OpenAI, Gemini, Hugging Face
- Prompt Engineering
- Introduction to Prompt Engineering
- Overview of language models and their capabilities
- Understanding Language Model Responses
- Crafting Effective Prompts
- Controlling Model Output
- Fine-Tuning LLMs
- Fine-Tuning Techniques
- Task-specific fine-tuning vs. domain adaptation
- Architecture modifications for task-specific fine-tuning
- Dataset selection and curation for fine-tuning
- Implementing fine-tuning pipelines with PyTorch
- Hyperparameter tuning and optimization strategies