Data Science Course | Data Science Training Institute | Location: Hyderabad

Data Science Course Content

Notes, Code Sample – https://github.com/thedatajango/course_samples

• What is Data Science
• Why Data Science
• Applications of Data Science
• How much statistics is required
• How much mathematics is required
• How much demand there is in the IT industry
• Binomial Distribution
• Introduction to Probability
• Normal Distribution
• How to install Python (Anaconda)
• How to work with Jupyter Notebook
• How to work with Spyder IDE
• Strings
• Lists
• Tuples
• Sets
• Dictionaries
• Control Flows
• Keywords (continue, break, pass)
• Functions
• Formal/Positional/Keyword arguments
• Predefined functions (range, len, enumerate, zip)
• Series
• DataFrame
• df.groupby
• pd.crosstab
• df.apply
• df.map
• df.applymap
• Statistical Data Analysis
• Fixing missing values
• Finding outliers
• Data quality check
• Feature transformation
• Data Visualization (Matplotlib, Seaborn)
• Categorical to Categorical
• Categorical to Quantitative
• Quantitative to Quantitative
• Bi-Variate data analysis (Hypothesis Testing)
• Categorical and Quantitative (ANOVA)
• Categorical to Categorical (Chi-Square)
• Quantitative to Categorical (Chi-Square)
• Quantitative to Quantitative (Correlation)
• Multiple linear regression
• Train/Test Split
• Hypothesis testing (formal way)
• Feature selection methods (Backward, Forward and Mixed)
• Linear regression assumptions
• Normal Equation (Linear Algebraic way of solving linear equation)
• Multiple Linear Regression (SGD Regressor)
• Gradient Descent (Calculus way of solving linear equation)
• Feature Scaling (Min-Max vs Mean Normalization)
• Feature Transformation
• Polynomial Regression
• Matrix addition, subtraction, multiplication, division and transpose
• Train/Validation/Test split
• K-Fold Cross Validation
• The Problem of Over-fitting (Bias-Variance trade-off)
• Learning Curve
• Regularization (Ridge, Lasso and Elastic-Net)
• Hyper Parameter Tuning (GridSearchCV, RandomizedSearchCV)
• Logistic Regression (SGD Classifier)
• Accuracy measurements
• Precision
• Recall
• Precision-Recall Trade-off
• AUC Score
• ROC Curve
• Multi-class Classification
• One-vs-One
• One-vs-All
• Softmax Regression Classifier
• Multi-label Classification
• Multi-output Classification
• K-means
• Hierarchical
• Regression Trees vs Classification Trees
• Entropy
• Gini Index
• Information Gain
• Tree Pruning
• Bayes Theorem
• Naive Bayes Algorithm
• Introduction to Natural Language Processing
• Overview of Hadoop architecture
• Overview of YARN architecture
• Map-Reduce example
• Perceptron, Sigmoid Neuron
• Neural Network model representation
• How it works
• Forward-Propagation
• Back-Propagation
• Central Tendency (mean, median and mode)
• Interquartile Range
• Variance
• Standard Deviation
• Z-Score/T-Score
• Co-variance
• Correlation
• Bar Chart
• Histogram
• Box-and-whisker plot
• Dot-plot
• Line plot
• Scatter Plot
• One-dimensional Array
• Two-dimensional Array
• Predefined functions (arange, reshape, zeros, ones, empty, eye, linspace)
• Basic Matrix operations
• Scalar addition, subtraction, multiplication, division
• Matrix addition, subtraction, multiplication, division and transpose
• Slicing
• Indexing
• Looping
• Shape Manipulation
• Stacking
• Central Limit Theorem
• Confidence Interval and z-distribution table
• Statistical Significance
• Hypothesis testing
• P-value
• One-tailed and Two-tailed Tests
• Chi-Square Goodness of Fit Test
• F-Statistic (ANOVA)
• Kurtosis
• Skewness
• What is regression
• Simple linear regression
• Explanation of statistics (statsmodels – OLS)
• Evaluation metrics (R-Squared, Adjusted R-Squared, MSE, RMSE)
• Hypothesis testing (hacker's way)
• Label Encoding
• One-Hot (dummy variable) encoding
• Dummy variable trap
• Scikit-Learn → Custom Transformers
• Scikit-Learn → Pipeline
• Hold-out Data
• K-fold Cross-Validation
• Leave-one-Out
• Random Sub-sampling Cross-Validation
• Bootstrapping
• Pickle (pkl file)
• Model load from pkl file and prediction
• SVM Classifier (Soft/Hard – Margin)
• Linear SVM
• Non-Linear SVM
• Kernel SVM
• SVM Regression
• PCA
• Choosing Right Number of Dimensions or Principal Components
• Incremental PCA
• Kernel PCA
• Locally Linear Embedding (LLE)
• Random Forest
• Bagging
• Heterogeneous Ensemble Models
• Anomaly vs Classification
• Assumptions of normality
• Data transformation techniques
• Overview of Spark Context (--master yarn)
• Resilient Distributed Datasets (RDDs)
• RDD Operations (Transformations, Actions)
• Spark DataFrames
• Spark ML model with Pipeline
• Classification model, MulticlassMetrics

• Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge or insights from data in various forms, either structured or unstructured.
• Data Jango provides the best data science course in Hyderabad, taught by industry experts.
• The most important of these fields is Machine Learning (a branch of Artificial Intelligence). Simply put, Machine Learning is the field of study that gives computers the ability to learn from data without being explicitly programmed.
• In this course we cover all the concepts needed to become a successful Data Scientist: Descriptive Statistics, Inferential Statistics, Basic Python, Pandas, NumPy, StatsModels, Scikit-Learn, the mathematics behind Machine Learning algorithms (Gradient Descent, SVM, Kernel SVM, etc.), error analysis, most of the accuracy measures, and techniques for fine-tuning the model.
• Data science is a “concept to unify statistics, data analysis, machine learning and their related methods” in order to “understand and analyze actual phenomena” with data. It employs techniques and theories drawn from many fields within the context of mathematics, statistics, information science, and computer science.
• Turing Award winner Jim Gray imagined data science as a “fourth paradigm” of science (empirical, theoretical, computational and now data-driven) and asserted that “everything about science is changing because of the impact of information technology” and the data deluge.
• In 2012, when Harvard Business Review called it “The Sexiest Job of the 21st Century”, the term “data science” became a buzzword. It is now often used interchangeably with earlier concepts like business analytics, business intelligence, predictive modeling, and statistics. Even the suggestion that data science is sexy was a paraphrased reference to Dr. Hans Rosling’s 2011 BBC documentary quote, “Statistics, is now the sexiest subject around”. Nate Silver referred to data science as a sexed-up term for statistics. In many cases, earlier approaches and solutions are now simply re-branded as “data science” to be more attractive, which can cause the term to become “dilute[d] beyond usefulness.” While many university programs now offer a data science degree, there is no consensus on a definition or on suitable curriculum contents. To its discredit, however, many data science and big data projects fail to deliver useful results, often as a result of poor management and utilization of resources.
• Source: Wikipedia, “Data science”
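To illustrate what "learning from data without being explicitly programmed" can mean in practice, here is a minimal sketch of simple linear regression in plain Python. The function name fit_line and the data points are made up for illustration and are not taken from the course material; the rule relating x to y is estimated from the examples rather than written into the program.

```python
def fit_line(xs, ys):
    """Estimate slope and intercept of y = slope * x + intercept by least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations of x and y, and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training data that happens to follow y = 2x + 1.
x_train = [0, 1, 2, 3, 4]
y_train = [1, 3, 5, 7, 9]

slope, intercept = fit_line(x_train, y_train)

# Use the learned parameters to predict y for an input not in the training data.
prediction = slope * 5 + intercept
print(slope, intercept, prediction)
```

The program never contains the rule "multiply by 2 and add 1"; it recovers those two numbers from the examples, which is the same idea the regression topics in the syllabus build on at larger scale.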

About Data Science Course Training from Data Jango, Hyderabad:

• Data Jango, located in Hitec City, Hyderabad, provides both classroom and online training for the Data Science Course at a minimal fee compared with others. You will get real-time training from our faculty, who are well versed in R, Python, statistics, Machine Learning, Artificial Intelligence and NLP.
• Data Jango Hyderabad course: Data Science Course, 3 months, online or classroom

Who Can Join This Data Science Course Training

• Software Engineers
• IT Managers
• IT Professionals, Programmers
• Statisticians, Analysts
• MCA, MBA, BTech, MTech Students