larray.LArray.describe_by
LArray.describe_by(*args, **kwargs)
Descriptive summary statistics, excluding NaN values, along axes or for groups.
By default, the result includes the number of non-NaN values, the mean, the standard deviation, the minimum, the maximum, and the 25th, 50th and 75th percentiles.
Parameters:
*args : int or str or Axis or Group or any combination of those, optional
    Axes or groups to include in the result after aggregating. By default, the statistics are aggregated over the whole array.
percentiles : array-like, optional
    List of integer percentiles to include. Defaults to [25, 50, 75].
Returns: LArray
See also
Examples
>>> data = [[0, 6, 3, 5, 4, 2, 1, 3], [7, 5, 3, 2, 8, 5, 6, 4]]
>>> arr = LArray(data, 'gender=Male,Female;year=2013..2020').astype(float)
>>> arr
gender\year  2013  2014  2015  2016  2017  2018  2019  2020
       Male   0.0   6.0   3.0   5.0   4.0   2.0   1.0   3.0
     Female   7.0   5.0   3.0   2.0   8.0   5.0   6.0   4.0
>>> arr.describe_by('gender')
gender\statistic  count  mean  std  min   25%  50%   75%  max
            Male    8.0   3.0  2.0  0.0  1.75  3.0  4.25  6.0
          Female    8.0   5.0  2.0  2.0  3.75  5.0  6.25  8.0
>>> arr.describe_by('gender', (X.year[:2015], X.year[2018:]))
gender  year\statistic  count  mean  std  min  25%  50%  75%  max
  Male           :2015    3.0   3.0  3.0  0.0  1.5  3.0  4.5  6.0
  Male           2018:    3.0   2.0  1.0  1.0  1.5  2.0  2.5  3.0
Female           :2015    3.0   5.0  2.0  3.0  4.0  5.0  6.0  7.0
Female           2018:    3.0   5.0  1.0  4.0  4.5  5.0  5.5  6.0
>>> arr.describe_by('gender', percentiles=[50, 90])
gender\statistic  count  mean  std  min  50%  90%  max
            Male    8.0   3.0  2.0  0.0  3.0  5.3  6.0
          Female    8.0   5.0  2.0  2.0  5.0  7.3  8.0
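To make the figures in the tables above easier to check by hand, the following is a minimal pure-Python sketch of how one row of statistics can be computed. It is not larray's implementation: the helper names `percentile` and `describe` are hypothetical, and it assumes a sample standard deviation (divisor n - 1) and linearly interpolated percentiles, which are the conventions that reproduce the values shown in the examples.

```python
import math
import statistics


def percentile(sorted_values, q):
    """Percentile q (0..100) of a sorted, non-empty list, by linear interpolation."""
    pos = (len(sorted_values) - 1) * q / 100
    lower = math.floor(pos)
    upper = math.ceil(pos)
    frac = pos - lower
    return sorted_values[lower] + frac * (sorted_values[upper] - sorted_values[lower])


def describe(values, percentiles=(25, 50, 75)):
    """Hypothetical helper: summary statistics for one group, ignoring NaN values.

    Needs at least two non-NaN values, because the sample standard
    deviation is undefined for fewer.
    """
    clean = sorted(v for v in values if not math.isnan(v))
    stats = {
        "count": float(len(clean)),
        "mean": statistics.mean(clean),
        "std": statistics.stdev(clean),  # assumption: sample std (divisor n - 1)
        "min": clean[0],
    }
    for q in percentiles:
        stats[f"{q}%"] = percentile(clean, q)
    stats["max"] = clean[-1]
    return stats


# 'Male' row of the example array, years 2013..2020
male = [0.0, 6.0, 3.0, 5.0, 4.0, 2.0, 1.0, 3.0]

male_stats = describe(male)                       # all years
male_until_2015 = describe(male[:3])              # group year[:2015]
male_50_90 = describe(male, percentiles=(50, 90)) # custom percentiles
```

Here `male_stats` corresponds to the `Male` row of `arr.describe_by('gender')`, and `male_until_2015` to the `Male`, `:2015` row of the grouped call. A NaN appended to `male` would be dropped before any statistic is computed, so the count would stay at 8.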