
F = fillmissing(A,'constant',v) fills missing entries of an array or table with the constant value v. If A is a matrix or multidimensional array, then v can be either a scalar or a vector. When v is a vector, each element specifies the fill value in the corresponding column of A. If A is a table or timetable, then v can also be a cell array.
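As a minimal sketch of the difference between a scalar and a vector fill constant, the following uses a small matrix chosen only for illustration (the names Fs and Fv are not part of the original page).

```matlab
% Matrix with missing entries in every column
A = [1 NaN 3; NaN 5 NaN];

% Scalar v: every missing entry receives the same value
Fs = fillmissing(A,'constant',0);

% Vector v: v(k) is the fill value for column k of A
Fv = fillmissing(A,'constant',[10 20 30]);

disp(Fs)   % [1 0 3; 0 5 0]
disp(Fv)   % [1 20 3; 10 5 30]
```

The vector form is convenient when each column needs its own placeholder, for example a different sentinel value per measurement channel.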

Missing values are defined according to the data type of A:

NaN — double, single, duration, and calendarDuration
NaT — datetime
<missing> — string
<undefined> — categorical
' ' — char
{''} — cell array of character vectors

If A is a table, then the data type of each column defines the missing value for that column.
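One way to see these per-type definitions is to call ismissing, which reports the standard missing values for each data type. The sample arrays below are illustrative only, and the string example assumes a release that provides the missing value (assigned here by indexing).

```matlab
numVals  = [1 NaN 3];                     % double: NaN
timeVals = [datetime(2020,1,1) NaT];      % datetime: NaT
strVals  = ["a" "b" "c"];
strVals(2) = missing;                     % string: <missing>
catVals  = categorical({'x','','z'});     % categorical: <undefined>
chrVals  = 'a c';                         % char: blank character
cellVals = {'a','','c'};                  % cell array of character vectors: ''

disp(ismissing(numVals))    % 0 1 0
disp(ismissing(timeVals))   % 0 1
disp(ismissing(strVals))    % 0 1 0
disp(ismissing(catVals))    % 0 1 0
disp(ismissing(chrVals))    % 0 1 0
disp(ismissing(cellVals))   % 0 1 0
```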

F = fillmissing(A,method) fills missing entries using the method specified by method, which can be one of the following:

'previous' — previous non-missing value
'next' — next non-missing value
'nearest' — nearest non-missing value
'linear' — linear interpolation of neighboring, non-missing values (numeric, duration, and datetime data types only)
'spline' — piecewise cubic spline interpolation (numeric, duration, and datetime data types only)
'pchip' — shape-preserving piecewise cubic spline interpolation (numeric, duration, and datetime data types only)
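To compare the methods side by side, the sketch below applies four of them to the same vector. The data is made up for illustration and is arranged so that 'nearest' never has to break a tie between two equally distant neighbors.

```matlab
A = [1 NaN NaN 4 NaN NaN 7];

Fprev = fillmissing(A,'previous');  % carry the last known value forward
Fnext = fillmissing(A,'next');      % pull the next known value backward
Fnear = fillmissing(A,'nearest');   % take the closest known value
Flin  = fillmissing(A,'linear');    % interpolate between known neighbors

disp(Fprev)  % 1 1 1 4 4 4 7
disp(Fnext)  % 1 4 4 4 7 7 7
disp(Fnear)  % 1 1 4 4 4 7 7
disp(Flin)   % 1 2 3 4 5 6 7
```

'spline' and 'pchip' are called the same way. They differ from 'linear' only in the interpolant used between the known values.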

F = fillmissing(___,dim) specifies the dimension of A to operate along. By default, fillmissing operates along the first dimension whose size does not equal 1. For example, if A is a matrix, then fillmissing(A,'linear',2) operates across the columns of A, filling missing data row by row.

F = fillmissing(___,Name,Value) specifies additional parameters for filling missing values using one or more name-value pair arguments. For example, you can use fillmissing(A,'linear','SamplePoints',x) to specify a vector x for linear interpolation of missing values.

[F,TF] = fillmissing(___) also returns a logical array corresponding to the entries of A that were filled.
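A short sketch of the two-output form follows. The input vector and the name numFilled are illustrative, not taken from the original page.

```matlab
A = [1 NaN 3 NaN NaN 6];

% TF has the same size as A and is true where an entry was filled
[F,TF] = fillmissing(A,'linear');

disp(F)               % 1 2 3 4 5 6
disp(TF)              % 0 1 0 1 1 0
numFilled = nnz(TF);  % number of filled entries, 3 here
```

TF can be used as a mask, for instance to highlight filled points in a plot or to count how much of the data was imputed.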

Examples

Vector with `NaN` Values

Create a vector that contains NaN values and replace each NaN with the previous non-missing value.

A = [1 3 NaN 4 NaN NaN 5];
F = fillmissing(A,'previous')

F =

     1     3     3     4     4     4     5

Interpolate Missing Data

Use interpolation to replace NaN values in non-uniformly sampled data.

Define a vector of non-uniform sample points and evaluate the sine function over the points.

x = [-4*pi:0.1:0, 0.1:0.2:4*pi];
A = sin(x);

Inject NaN values into A.

A(A < 0.75 & A > 0.5) = NaN;

Fill the missing data using linear interpolation, and return the filled vector F and the logical vector TF. An entry of TF is 1 (true) where the corresponding entry of F was filled.

[F,TF] = fillmissing(A,'linear','SamplePoints',x);

Plot the original data and filled data.

plot(x,A,'.', x(TF),F(TF),'o')
xlabel('x')
ylabel('sin(x)')
legend('Original Data','Filled Missing Data')

Matrix with Missing Endpoints

Create a matrix with missing entries and fill across the columns (second dimension) one row at a time using linear interpolation. For each row, fill leading and trailing missing values with the nearest non-missing value in that row.

A = [NaN NaN 5 3 NaN 5 7 NaN 9 NaN;
     8 9 NaN 1 4 5 NaN 5 NaN 5;
     NaN 4 9 8 7 2 4 1 1 NaN]

A =

   NaN   NaN     5     3   NaN     5     7   NaN     9   NaN
     8     9   NaN     1     4     5   NaN     5   NaN     5
   NaN     4     9     8     7     2     4     1     1   NaN

F = fillmissing(A,'linear',2,'EndValues','nearest')

F =

     5     5     5     3     4     5     7     8     9     9
     8     9     5     1     4     5     5     5     5     5
     4     4     9     8     7     2     4     1     1     1

Table with Multiple Data Types

Fill missing values for table variables with different data types.

Create a table whose variables include categorical, double, and char data types.

A = table(categorical({'Sunny';'Cloudy';''}),[66;NaN;54],{'';'N';'Y'},[37;39;NaN],...
    'VariableNames',{'Description' 'Temperature' 'Rain' 'Humidity'})

A = 

    Description    Temperature    Rain    Humidity
    ___________    ___________    ____    ________

    Sunny           66            ''       37     
    Cloudy         NaN            'N'      39     
    <undefined>     54            'Y'     NaN

Replace all missing entries with the value from the previous entry. Because the missing character vector in the Rain variable is the first element, it has no previous value and is not replaced.

F = fillmissing(A,'previous')

F = 

    Description    Temperature    Rain    Humidity
    ___________    ___________    ____    ________

    Sunny          66             ''      37      
    Cloudy         66             'N'     39      
    Cloudy         54             'Y'     39

Replace the NaN values from the Temperature and Humidity variables in A with 0.

F = fillmissing(A,'constant',0,'DataVariables',{'Temperature','Humidity'})

F = 

    Description    Temperature    Rain    Humidity
    ___________    ___________    ____    ________

    Sunny          66             ''      37      
    Cloudy          0             'N'     39      
    <undefined>    54             'Y'      0

Alternatively, use the isnumeric function to identify the numeric variables to operate on.

F = fillmissing(A,'constant',0,'DataVariables',@isnumeric)

F = 

    Description    Temperature    Rain    Humidity
    ___________    ___________    ____    ________

    Sunny          66             ''      37      
    Cloudy          0             'N'     39      
    <undefined>    54             'Y'      0

Input Arguments

`A` — Input data
vector | matrix | multidimensional array | table | timetable

Input data, specified as a vector, matrix, multidimensional array, table, or timetable.

If A is a timetable, then only the data in the timetable variables is filled, not the row times. If the associated vector of row times contains a NaT or NaN value, then fillmissing produces an error. Row times must be unique and listed in ascending order.
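The following sketch illustrates timetable input, assuming a small timetable with unevenly spaced row times (the variable name Value and the dates are hypothetical). Because the row times act as the default sample points for a timetable, linear interpolation is weighted by elapsed time rather than by row index.

```matlab
% Row times at day 0, day 1, and day 4
rowTimes = datetime(2020,1,1) + days([0; 1; 4]);
TT = timetable(rowTimes,[1; NaN; 4],'VariableNames',{'Value'});

% The missing entry sits one quarter of the way from day 0 to day 4
F = fillmissing(TT,'linear');

disp(F.Value)   % 1, 1.75, 4
```

With evenly spaced integer sample points the same entry would be filled with 2.5, so the row times make a visible difference here.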

`v` — Fill constant
scalar | vector | cell array

Fill constant, specified as a scalar, vector, or cell array. v can be a vector when A is a matrix or multidimensional array. v can be a cell array when A is a table or timetable.
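As a hedged sketch of the cell array form, the example below assumes that each cell of v supplies the fill constant for the table variable in the same position, and that a character vector is an acceptable constant for a variable holding a cell array of character vectors. The table and its variable names (Num, Txt) are hypothetical.

```matlab
% One numeric variable and one text variable, each with a missing entry
T = table([1; NaN; 3],{'a'; ''; 'c'},'VariableNames',{'Num','Txt'});

% First cell fills Num, second cell fills Txt
F = fillmissing(T,'constant',{0,'unknown'});

disp(F)
```

A cell array is needed here because a single scalar cannot serve as a fill value for both a numeric and a text variable.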

`method` — Fill method
character vector

Fill method, specified as a character vector. method can be one of the following:

Method	Description
`'previous'`	previous non-missing value
`'next'`	next non-missing value
`'nearest'`	nearest non-missing value
`'linear'`	linear interpolation of neighboring, non-missing values (numeric, `duration`, and `datetime` data types only)
`'spline'`	piecewise cubic spline interpolation (numeric, `duration`, and `datetime` data types only)
`'pchip'`	shape-preserving piecewise cubic spline interpolation (numeric, `duration`, and `datetime` data types only)

Example: fillmissing(A,'linear')

Data Types: char

`dim` — Dimension to operate along
positive integer scalar

Dimension to operate along, specified as a positive integer scalar. If no value is specified, then the default is the first array dimension whose size does not equal 1.

When A is a table or timetable, dim is not supported. fillmissing operates along each table or timetable variable separately.

Consider a two-dimensional input array, A.

If dim=1, then fillmissing fills A column by column.
If dim=2, then fillmissing fills A row by row.
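The effect of dim can be sketched with a small matrix and the 'previous' method. The matrix is illustrative only. Note that a leading missing value has no previous neighbor along the operating dimension, so it stays missing.

```matlab
A = [1 NaN; NaN 4; 5 6];

% dim = 1: each column is filled independently, top to bottom
F1 = fillmissing(A,'previous',1);

% dim = 2: each row is filled independently, left to right
F2 = fillmissing(A,'previous',2);

disp(F1)   % [1 NaN; 1 4; 5 6]
disp(F2)   % [1 1; NaN 4; 5 6]
```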

Name-Value Pair Arguments

Specify optional comma-separated pairs of Name,Value arguments. Name is the argument name and Value is the corresponding value. Name must appear inside single quotes (' '). You can specify several name and value pair arguments in any order as Name1,Value1,...,NameN,ValueN.

Example: fillmissing(A,'previous','DataVariables',{'Temperature','Altitude'}) fills only the columns corresponding to the Temperature and Altitude variables of an input table.

`'EndValues'` — Method for handling endpoints
`'extrap'` (default) | `'previous'` | `'next'` | `'nearest'` | `'none'` | scalar

Method for handling endpoints, specified as the comma-separated pair consisting of 'EndValues' and one of 'extrap', 'previous', 'next', 'nearest', 'none', or a constant scalar value. The endpoint fill method handles leading and trailing missing values based on the following definitions:

Method	Description
`'extrap'`	same as `method`
`'previous'`	previous non-missing value
`'next'`	next non-missing value
`'nearest'`	nearest non-missing value
`'none'`	no fill value
scalar	constant value (numeric, `duration`, and `datetime` data types only)

Example: 'EndValues','next'
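The endpoint options can be compared on a vector whose first and last entries are missing. This is a sketch with made-up data; the interior gap is always filled by the main method ('linear' here), and only the leading and trailing entries change.

```matlab
A = [NaN 2 NaN 4 NaN];

Fextrap  = fillmissing(A,'linear');                        % default 'extrap'
Fnearest = fillmissing(A,'linear','EndValues','nearest');  % copy nearest known value
Fnone    = fillmissing(A,'linear','EndValues','none');     % leave endpoints missing
Fconst   = fillmissing(A,'linear','EndValues',0);          % constant scalar

disp(Fextrap)    % 1 2 3 4 5
disp(Fnearest)   % 2 2 3 4 4
disp(Fnone)      % NaN 2 3 4 NaN
disp(Fconst)     % 0 2 3 4 0
```

'none' is a cautious choice when extrapolating beyond the observed range is not justified by the data.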

`'SamplePoints'` — Sample points for fill method
vector

Sample points for fill method, specified as the comma-separated pair consisting of 'SamplePoints' and a vector of locations used for interpolating the input data. For example, for vectors A and x, x(i) is the location at which A(i) was sampled. The vector must be sorted and have no repeating elements. If A is a timetable, then the default is the associated vector of row times. Otherwise, the default is an index vector of consecutive, positive integers starting with 1.

Example: 'SamplePoints',x
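A minimal sketch of how sample points change an interpolated fill value follows, using illustrative data. With the default index vector the missing entry lies halfway between its neighbors; with the supplied x it lies much closer to the first one.

```matlab
A = [0 NaN 10];
x = [0 2 10];    % locations at which the entries of A were sampled

Fdefault = fillmissing(A,'linear');                     % sample points 1,2,3
Fsampled = fillmissing(A,'linear','SamplePoints',x);    % sample points 0,2,10

disp(Fdefault)   % 0 5 10
disp(Fsampled)   % 0 2 10
```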

`'DataVariables'` — Table variables to fill
variable name | cell array of variable names | numeric vector | logical vector | function handle

Table variables to fill, specified as the comma-separated pair consisting of 'DataVariables' and a variable name, a cell array of variable names, a numeric vector, a logical vector, or a function handle. The 'DataVariables' value indicates which columns of the input table to fill, and can be one of the following:

A character vector specifying a single table variable name
A cell array of character vectors where each element is a table variable name
A vector of table variable indices
A logical vector whose elements each correspond to a table variable, where true includes the corresponding variable and false excludes it
A function handle that takes a table variable as input and returns a logical scalar, such as @isnumeric

Example: 'Age'

Example: {'Height','Weight'}

Example: @iscategorical
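The index and logical forms of 'DataVariables' can be sketched as follows. The table and its variable names (A1, B1, C1) are hypothetical and chosen only for illustration.

```matlab
T = table([1; NaN],[NaN; 2],{''; 'b'},'VariableNames',{'A1','B1','C1'});

% Numeric indices: fill the first and second variables only
Fidx = fillmissing(T,'constant',0,'DataVariables',[1 2]);

% Logical vector: one element per variable, true selects A1 only
Flog = fillmissing(T,'constant',0,'DataVariables',[true false false]);

disp(Fidx)   % A1 = [1; 0], B1 = [0; 2], C1 unchanged
disp(Flog)   % A1 = [1; 0], B1 and C1 unchanged
```

Variables that are not selected are returned unchanged, so their missing entries remain in the output.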

Output Arguments