R for Data Science Solutions [Video]

Cookbook
Yu-Wei, Chiu (David Chiu)

Over 100 hands-on tasks to help you effectively solve real-world data problems using the most popular R packages and techniques

Video Details

ISBN 13: 9781787129122
Course Length: 5 hours 32 minutes

Video Description

R is both data analysis software and a programming language. Data scientists, statisticians, and analysts use R for statistical analysis, data visualization, and predictive modeling. R is open source and integrates with other applications and systems. Compared to other data analysis platforms, R has an extensive set of data products, and its strong data visualization features help you understand and resolve problems in your data.

The first section of this course shows how to create R functions that avoid unnecessary duplication of code. You will learn how to prepare, process, and perform sophisticated ETL for heterogeneous data sources with R packages. A data manipulation example illustrates how to use the ‘dplyr’ and ‘data.table’ packages to efficiently process larger data structures. We also focus on ‘ggplot2’ and show you how to create advanced figures for data exploration.

In addition, you will learn how to build an interactive report using the ‘ggvis’ package. Later sections cover time series analysis and give detailed treatment of machine learning, including data classification, regression, clustering, association rule mining, and dimension reduction.

By the end of this course, you will be able to diagnose and resolve the problems you encounter while performing data analysis in R.

Style and Approach

This collection of independent videos offers a range of data analysis samples in simple and straightforward R code, providing step-by-step resources and time-saving methods to help you solve data problems efficiently.

Table of Contents

Functions in R
  • R Functions and Arguments
  • Understanding Environments
  • Working with Lexical Scoping
  • Understanding Closure
  • Performing Lazy Evaluation
  • Creating Infix Operators
  • Using the Replacement Function
  • Handling Errors in a Function
  • The Debugging Function
Data Extracting, Transforming, and Loading
  • Downloading Open Data
  • Reading and Writing CSV Files
  • Scanning Text Files
  • Working with Excel Files
  • Reading Data from Databases
  • Scraping Web Data
Data Pre-Processing and Preparation
  • Renaming the Data Variable
  • Converting Data Types
  • Working with Date Format
  • Adding New Records
  • Filtering Data
  • Dropping Data
  • Merging and Sorting Data
  • Reshaping Data
  • Detecting Missing Data
  • Imputing Missing Data
Data Manipulation
  • Enhancing a data.frame with a data.table
  • Managing Data with data.table
  • Performing Fast Aggregation with data.table
  • Merging Large Datasets with a data.table
  • Subsetting and Slicing Data with dplyr
  • Sampling Data with dplyr
  • Selecting Columns with dplyr
  • Chaining Operations in dplyr
  • Arranging Rows with dplyr
  • Eliminating Duplicated Rows with dplyr
  • Adding New Columns with dplyr
  • Summarizing Data with dplyr
  • Merging Data with dplyr
Visualizing Data with ggplot2
  • Creating Basic Plots with ggplot2
  • Changing Aesthetics Mapping
  • Introducing Geometric Objects
  • Performing Transformations
  • Adjusting Scales
  • Faceting
  • Adjusting Themes
  • Combining Plots
  • Creating Maps
Making Interactive Reports
  • Creating R Markdown Reports
  • Learning the Markdown Syntax
  • Embedding R Code Chunks
  • Creating Interactive Graphics with ggvis
  • Understanding Basic Syntax and Grammar
  • Controlling Axes and Legends and Using Scales
  • Adding Interactivity to a ggvis Plot
  • Creating an R Shiny Document
  • Publishing an R Shiny Report
Simulation from Probability Distributions
  • Generating Random Samples
  • Understanding Uniform Distributions
  • Generating Binomial Random Variates
  • Generating Poisson Random Variates
  • Sampling from a Normal Distribution
  • Sampling from a Chi-Squared Distribution
  • Understanding Student's t-Distribution
  • Sampling from a Dataset
  • Simulating the Stochastic Process
Statistical Inference in R
  • Getting Confidence Intervals
  • Performing Z-tests
  • Performing Student's t-Tests
  • Conducting Exact Binomial Tests
  • Performing Kolmogorov-Smirnov Tests
  • Working with Pearson's Chi-Squared Tests
  • Understanding the Wilcoxon Rank Sum and Signed Rank Tests
  • Conducting One-way ANOVA
  • Performing Two-way ANOVA
Rule and Pattern Mining with R
  • Transforming Data into Transactions
  • Displaying Transactions and Associations
  • Mining Associations with the Apriori Rule
  • Pruning Redundant Rules
  • Visualizing Association Rules
  • Mining Frequent Itemsets with Eclat
  • Creating Transactions with Temporal Information
  • Mining Frequent Sequential Patterns with cSPADE
Time Series Mining with R
  • Creating Time Series Data
  • Plotting a Time Series Object
  • Decomposing Time Series
  • Smoothing Time Series
  • Forecasting Time Series
  • Selecting an ARIMA Model
  • Creating an ARIMA Model
  • Forecasting with an ARIMA Model
  • Predicting Stock Prices with an ARIMA Model
Supervised Machine Learning
  • Fitting a Linear Regression Model with lm
  • Summarizing Linear Model Fits
  • Using Linear Regression to Predict Unknown Values
  • Measuring the Performance of the Regression Model
  • Performing a Multiple Regression Analysis
  • Selecting the Best-Fitted Regression Model with Stepwise Regression
  • Applying the Gaussian Model for Generalized Linear Regression
  • Performing a Logistic Regression Analysis
  • Building a Classification Model with Recursive Partitioning Trees
  • Visualizing Recursive Partitioning Tree
  • Measuring Model Performance with a Confusion Matrix
  • Measuring Prediction Performance Using ROCR
Unsupervised Machine Learning
  • Clustering Data with Hierarchical Clustering
  • Cutting Tree into Clusters
  • Clustering Data with the k-means Method
  • Clustering Data with the Density-Based Method
  • Extracting Silhouette Information from Clustering
  • Comparing Clustering Methods
  • Recognizing Digits Using the Density-Based Clustering Method
  • Grouping Similar Text Documents with k-means Clustering Method
  • Performing Dimension Reduction with Principal Component Analysis (PCA)
  • Determining the Number of Principal Components Using a Scree Plot
  • Determining the Number of Principal Components Using the Kaiser Method
  • Visualizing Multivariate Data Using a biplot

What You Will Learn

  • Get to know the functional characteristics of the R language
  • Extract, transform, and load data from heterogeneous sources
  • Understand how R can be used to tackle probability and statistics problems
  • Get simple R instructions to quickly organize and manipulate large datasets
  • Create professional data visualizations and interactive reports
  • Predict user purchase behavior by adopting a classification approach
  • Implement data mining techniques to discover items that are frequently purchased together
  • Group similar text documents by using various clustering methods
