Package: naniar

naniar: Data Structures, Summaries, and Visualisations for Missing Data

Missing values are ubiquitous in data and need to be explored and handled in the initial stages of analysis. 'naniar' provides data structures and functions that facilitate the plotting of missing values and the examination of imputations. This allows missing data dependencies to be explored with minimal deviation from the common work patterns of 'ggplot2' and tidy data. The work is discussed in full in Tierney & Cook (2023) <doi:10.18637/jss.v105.i07>.
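
As a minimal sketch of the workflow described above, the following assumes 'naniar' and 'ggplot2' are installed and uses the `airquality` dataset that ships with base R (chosen here only for illustration; it is not mentioned on this page). It summarises the missing values and then draws a scatter plot in which missing values stay visible.

```r
library(naniar)
library(ggplot2)

# airquality (base R) has missing values in Ozone and Solar.R
total_missing <- n_miss(airquality)          # total number of missing values
var_summary   <- miss_var_summary(airquality) # one row per variable

print(total_missing)
print(var_summary)

# geom_miss_point() keeps rows with missing values in the plot by
# shifting them below the observed range instead of dropping them
p <- ggplot(airquality, aes(x = Ozone, y = Solar.R)) +
  geom_miss_point()
print(p)
```

The plot is built with the usual 'ggplot2' grammar; only the geom changes, which is the "minimal deviation" the description refers to.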

Authors: Nicholas Tierney [aut, cre], Di Cook [aut], Miles McBain [aut], Colin Fay [aut], Mitchell O'Hara-Wild [ctb], Jim Hester [ctb], Luke Smith [ctb], Andrew Heiss [ctb]


# Install 'naniar' in R:
install.packages('naniar')

122 exports, 646 stars, 8.24 score, 54 dependencies, 6 dependents, 12 mentions, 13.4k downloads

Last updated 4 months ago from:255cb2550f



Exploring Imputed Values

Rendered from exploring-imputed-values.Rmd using knitr::rmarkdown on Jul 14 2024.

Last update: 2022-12-11
Started: 2018-08-21

Gallery of Missing Data Visualisations

Rendered from naniar-visualisation.Rmd using knitr::rmarkdown on Jul 14 2024.

Last update: 2024-03-16
Started: 2017-08-04

Getting Started with naniar

Rendered from naniar.Rmd using knitr::rmarkdown on Jul 14 2024.

Last update: 2024-03-16
Started: 2024-03-16

Replacing values with NA

Rendered from replace-with-na.Rmd using knitr::rmarkdown on Jul 14 2024.

Last update: 2022-12-11
Started: 2018-01-18

Special Missing Values

Rendered from special-missing-values.Rmd using knitr::rmarkdown on Jul 14 2024.

Last update: 2024-03-16
Started: 2018-08-20

Readme and manuals

Help Manual

Help page | Topics
Add a column describing presence of any missing values | add_any_miss
Add a column describing if there are any missings in the dataset | add_label_missings
Add a column describing whether there is a shadow | add_label_shadow
Add a column that tells us which "missingness cluster" a row belongs to | add_miss_cluster
Add column containing number of missing data values | add_n_miss
Add column containing proportion of missing data values | add_prop_miss
Add a shadow column to dataframe | add_shadow
Add a shadow shifted column to a dataset | add_shadow_shift
Add a counter variable for a span of dataframe | add_span_counter
Helper function to determine whether there are any missings | any_row_miss
Identify if there are any or all missing or complete values | all_complete all_miss all_na any-all-na-complete any_complete any_miss any_na
Create shadows | as_shadow
Convert data into shadow format for doing an upset plot | as_shadow_upset
Bind a shadow dataframe to original data | bind_shadow
Add a shadow column to a dataset | cast_shadow
Add a shadow and a shadow_shift column to a dataset | cast_shadow_shift
Add a shadow column and a shadow shifted column to a dataset | cast_shadow_shift_label
Common number values for NA | common_na_numbers
Common string values for NA | common_na_strings
Long form representation of a shadow matrix | gather_shadow
Plot Missing Data Points | geom_miss_point
naniar-ggproto | GeomMissPoint naniar-ggproto StatMissPoint
Plot the number of missings per case (row) | gg_miss_case
Plot of cumulative sum of missing for cases | gg_miss_case_cumsum
Plot the number of missings for each variable, broken down by a factor | gg_miss_fct
Plot the number of missings in a given repeating span | gg_miss_span
Plot the pattern of missingness using an upset plot. | gg_miss_upset
Plot the number of missings for each variable | gg_miss_var
Plot of cumulative sum of missing value for each variable | gg_miss_var_cumsum
Plot which variables contain a missing value | gg_miss_which
Impute data with values shifted 10 percent below range. | impute_below
Impute data with values shifted 10 percent below range. | impute_below_all
Scoped variants of 'impute_below' | impute_below_at
Scoped variants of 'impute_below' | impute_below_if
Impute numeric values below a range for graphical exploration | impute_below.numeric
Impute a factor value into a vector with missing values | impute_factor impute_factor.character impute_factor.default impute_factor.factor impute_factor.shade
Impute a fixed value into a vector with missing values | impute_fixed impute_fixed.default
Impute the mean value into a vector with missing values | impute_mean impute_mean.default impute_mean.factor
Impute the median value into a vector with missing values | impute_median impute_median.default impute_median.factor
Impute the mode value into a vector with missing values | impute_mode impute_mode.default impute_mode.factor impute_mode.integer
Impute zero into a vector with missing values | impute_zero
Detect if this is a shade | any_shade are_shade is_shade
Label a missing from one column | label_miss_1d
Is there a missing value in the row of a dataframe? | label_missings
Little's missing completely at random (MCAR) test | mcar_test
Summarise the missingness in each case | miss_case_cumsum
Summarise the missingness in each case | miss_case_summary
Tabulate missings in cases. | miss_case_table
Proportions of missings in data, variables, and cases. | miss_prop_summary
Search and present different kinds of missing values | miss_scan_count
Collate summary measures from naniar into one tibble | miss_summary
Cumulative sum of the number of missings in each variable | miss_var_cumsum
Find the number of missing and complete values in a single run | miss_var_run
Summarise the number of missings for a given repeating span on a variable | miss_var_span
Summarise the missingness in each variable | miss_var_summary
Tabulate the missings in the variables | miss_var_table
Which variables contain missing values? | miss_var_which
Proportion of variables containing missings or complete values | complete_case_pct complete_case_prop complete_var_pct complete_var_prop miss-pct-prop-defunct miss_case_pct miss_case_prop miss_var_pct miss_var_prop
Return the number of complete values | n_complete
Return a vector of the number of complete values in each row | n_complete_row
Return the number of missing values | n_miss
Return a vector of the number of missing values in each row | n_miss_row
The number of variables with complete values | n-var-case-complete n_case_complete n_var_complete
The number of variables or cases with missing values | n-var-case-miss n_case_miss n_var_miss
Convert data into nabular form by binding shade to it | nabular
naniar: Data Structures, Summaries, and Visualisations for Missing Data | naniar-package naniar
West Pacific Tropical Atmosphere Ocean Data, 1993 & 1997. | oceanbuoys
Return the percent of complete values | pct_complete
Return the percent of missing values | pct_miss
Percentage of cases that contain a missing or complete values. | pct-miss-complete-case pct_complete_case pct_miss_case
Percentage of variables containing missings or complete values | pct-miss-complete-var pct_complete_var pct_miss_var
Pedestrian count information around Melbourne for 2016 | pedestrian
Return the proportion of complete values | prop_complete
Return a vector of the proportion of missing values in each row | prop_complete_row
Return the proportion of missing values | prop_miss
Return a vector of the proportion of missing values in each row | prop_miss_row
Proportion of cases that contain a missing or complete values. | prop-miss-complete-case prop_complete_case prop_miss_case
Proportion of variables containing missings or complete values | prop-miss-complete-var prop_complete_var prop_miss_var
Add special missing values to the shadow matrix | recode_shadow recode_shadow.grouped_df
Replace NA value with provided value | replace_na_with
Replace values with missings | replace_to_na
Replace values with missings | replace_with_na
Replace all values with NA where a certain condition is met | replace_with_na_all
Replace specified variables with NA where a certain condition is met | replace_with_na_at
Replace values with NA based on some condition, for variables that meet some predicate | replace_with_na_if
The Behavioral Risk Factor Surveillance System (BRFSS) Survey Data, 2009. | riskfactors
Scoped variants of 'impute_mean' | impute_mean_all impute_mean_at impute_mean_if scoped-impute_mean
Scoped variants of 'impute_median' | impute_median_all impute_median_at impute_median_if scoped-impute_median
Set a proportion or number of missing values | set-prop-n-miss set_n_miss set_prop_miss
Create new levels of missing | shade
Reshape shadow data into a long format | shadow_long
Shift missing values to facilitate missing data exploration/visualisation | shadow_shift
Unbind (remove) shadow from data, and vice versa | unbinders unbind_data unbind_shadow
Split a call into two components with a useful verb name. | where
Which rows and cols contain missings? | where_na
Which variables are shades? | which_are_shade
Which elements contain missings? | which_na
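
To illustrate two of the topics listed above, here is a small sketch that assumes 'naniar' is installed. It uses the base R `airquality` dataset and a made-up data frame (`raw`, with the hypothetical sentinel values -99 and "N/A") purely for illustration; neither comes from this page.

```r
library(naniar)

# bind_shadow() appends one shadow column per variable (suffix "_NA")
# recording whether each value is missing
aq_shadow <- bind_shadow(airquality)
print(names(aq_shadow))

# replace_with_na() turns sentinel values into NA, variable by variable
raw <- data.frame(
  x = c(1, -99, 3),
  y = c("a", "N/A", "c")
)
cleaned <- replace_with_na(raw, replace = list(x = -99, y = "N/A"))
print(cleaned)
```

The shadow columns keep the missingness information alongside the data, so it remains available for grouping or colouring after values have been imputed.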