lux.core.series.LuxSeries

class lux.core.series.LuxSeries(*args, **kw)[source]

A subclass of pd.Series that supports all 1-D Series operations.

__init__(*args, **kw)[source]

Initialize self. See help(type(self)) for accurate signature.

Methods

__init__(*args, **kw) Initialize self.
abs() Return a Series/DataFrame with absolute numeric value of each element.
add(other[, level, fill_value, axis]) Return Addition of series and other, element-wise (binary operator add).
add_prefix(prefix) Prefix labels with string prefix.
add_suffix(suffix) Suffix labels with string suffix.
agg([func, axis]) Aggregate using one or more operations over the specified axis.
aggregate([func, axis]) Aggregate using one or more operations over the specified axis.
align(other[, join, axis, level, copy, …]) Align two objects on their axes with the specified join method.
all([axis, bool_only, skipna, level]) Return whether all elements are True, potentially over an axis.
any([axis, bool_only, skipna, level]) Return whether any element is True, potentially over an axis.
append(to_append[, ignore_index, …]) Concatenate two or more Series.
apply(func[, convert_dtype, args]) Invoke function on values of Series.
argmax([axis]) Return int position of the largest value in the Series.
argmin([axis, skipna]) Return int position of the smallest value in the Series.
argsort([axis, kind, order]) Return the integer indices that would sort the Series values.
asfreq(freq[, method, fill_value]) Convert TimeSeries to specified frequency.
asof(where[, subset]) Return the last row(s) without any NaNs before where.
astype(dtype, copy, errors) Cast a pandas object to a specified dtype.
at_time(time, asof[, axis]) Select values at particular time of day (e.g., 9:30AM).
autocorr([lag]) Compute the lag-N autocorrelation.
backfill([axis, limit, downcast]) Synonym for DataFrame.fillna() with method='bfill'.
between(left, right[, inclusive]) Return boolean Series equivalent to left <= series <= right.
between_time(start_time, end_time, …[, axis]) Select values between particular times of the day (e.g., 9:00-9:30 AM).
bfill([axis, limit, downcast]) Synonym for DataFrame.fillna() with method='bfill'.
bool() Return the bool of a single element Series or DataFrame.
clip([lower, upper, axis]) Trim values at input threshold(s).
combine(other, func[, fill_value]) Combine the Series with a Series or scalar according to func.
combine_first(other) Combine Series values, choosing the calling Series’s values first.
compare(other, align_axis, …) Compare to another Series and show the differences.
convert_dtypes(infer_objects, …) Convert columns to best possible dtypes using dtypes supporting pd.NA.
copy(deep) Make a copy of this object’s indices and data.
corr(other[, method, min_periods]) Compute correlation with other Series, excluding missing values.
count([level]) Return number of non-NA/null observations in the Series.
cov(other, min_periods, ddof) Compute covariance with Series, excluding missing values.
cummax([axis, skipna]) Return cumulative maximum over a DataFrame or Series axis.
cummin([axis, skipna]) Return cumulative minimum over a DataFrame or Series axis.
cumprod([axis, skipna]) Return cumulative product over a DataFrame or Series axis.
cumsum([axis, skipna]) Return cumulative sum over a DataFrame or Series axis.
describe([percentiles, include, exclude, …]) Generate descriptive statistics.
diff(periods) First discrete difference of element.
div(other[, level, fill_value, axis]) Return Floating division of series and other, element-wise (binary operator truediv).
divide(other[, level, fill_value, axis]) Return Floating division of series and other, element-wise (binary operator truediv).
divmod(other[, level, fill_value, axis]) Return Integer division and modulo of series and other, element-wise (binary operator divmod).
dot(other) Compute the dot product between the Series and the columns of other.
drop([labels, axis, index, columns, level, …]) Return Series with specified index labels removed.
drop_duplicates([keep, inplace]) Return Series with duplicate values removed.
droplevel(level[, axis]) Return DataFrame with requested index / column level(s) removed.
dropna([axis, inplace, how]) Return a new Series with missing values removed.
duplicated([keep]) Indicate duplicate Series values.
eq(other[, level, fill_value, axis]) Return Equal to of series and other, element-wise (binary operator eq).
equals(other) Test whether two objects contain the same elements.
ewm(com, span, halflife, …) Provide exponential weighted (EW) functions.
expanding(min_periods, center, axis) Provide expanding transformations.
explode(ignore_index) Transform each element of a list-like to a row.
factorize(sort, na_sentinel) Encode the object as an enumerated type or categorical variable.
ffill([axis, limit, downcast]) Synonym for DataFrame.fillna() with method='ffill'.
fillna([value, method, axis, inplace, …]) Fill NA/NaN values using the specified method.
filter([items, axis]) Subset the dataframe rows or columns according to the specified index labels.
first(offset) Select initial periods of time series data based on a date offset.
first_valid_index() Return index for first non-NA/null value.
floordiv(other[, level, fill_value, axis]) Return Integer division of series and other, element-wise (binary operator floordiv).
ge(other[, level, fill_value, axis]) Return Greater than or equal to of series and other, element-wise (binary operator ge).
get(key[, default]) Get item from object for given key (ex: DataFrame column).
groupby(*args, **kwargs) Group Series using a mapper or by a Series of columns.
gt(other[, level, fill_value, axis]) Return Greater than of series and other, element-wise (binary operator gt).
head(n) Return the first n rows.
hist([by, ax]) Draw histogram of the input series using matplotlib.
idxmax([axis, skipna]) Return the row label of the maximum value.
idxmin([axis, skipna]) Return the row label of the minimum value.
infer_objects() Attempt to infer better dtypes for object columns.
interpolate(method, axis, limit, …) Fill NaN values using an interpolation method.
isin(values) Whether elements in Series are contained in values.
isna() Detect missing values.
isnull() Detect missing values.
item() Return the first element of the underlying data as a Python scalar.
items() Lazily iterate over (index, value) tuples.
iteritems() Lazily iterate over (index, value) tuples.
keys() Return alias for index.
kurt([axis, skipna, level, numeric_only]) Return unbiased kurtosis over requested axis.
kurtosis([axis, skipna, level, numeric_only]) Return unbiased kurtosis over requested axis.
last(offset) Select final periods of time series data based on a date offset.
last_valid_index() Return index for last non-NA/null value.
le(other[, level, fill_value, axis]) Return Less than or equal to of series and other, element-wise (binary operator le).
lt(other[, level, fill_value, axis]) Return Less than of series and other, element-wise (binary operator lt).
mad([axis, skipna, level]) Return the mean absolute deviation of the values over the requested axis.
map(arg[, na_action]) Map values of Series according to input correspondence.
mask(cond[, other, inplace, axis, level, …]) Replace values where the condition is True.
max([axis, skipna, level, numeric_only]) Return the maximum of the values over the requested axis.
mean([axis, skipna, level, numeric_only]) Return the mean of the values over the requested axis.
median([axis, skipna, level, numeric_only]) Return the median of the values over the requested axis.
memory_usage([index, deep]) Return the memory usage of the Series.
min([axis, skipna, level, numeric_only]) Return the minimum of the values over the requested axis.
mod(other[, level, fill_value, axis]) Return Modulo of series and other, element-wise (binary operator mod).
mode([dropna]) Return the mode(s) of the Series.
mul(other[, level, fill_value, axis]) Return Multiplication of series and other, element-wise (binary operator mul).
multiply(other[, level, fill_value, axis]) Return Multiplication of series and other, element-wise (binary operator mul).
ne(other[, level, fill_value, axis]) Return Not equal to of series and other, element-wise (binary operator ne).
nlargest([n, keep]) Return the largest n elements.
notna() Detect existing (non-missing) values.
notnull() Detect existing (non-missing) values.
nsmallest([n, keep]) Return the smallest n elements.
nunique(dropna) Return number of unique elements in the object.
pad([axis, limit, downcast]) Synonym for DataFrame.fillna() with method='ffill'.
pct_change([periods, fill_method, limit, freq]) Percentage change between the current and a prior element.
pipe(func, *args, **kwargs) Apply func(self, *args, **kwargs).
pop(item) Return item and drop it from the Series.
pow(other[, level, fill_value, axis]) Return Exponential power of series and other, element-wise (binary operator pow).
prod([axis, skipna, level, numeric_only, …]) Return the product of the values over the requested axis.
product([axis, skipna, level, numeric_only, …]) Return the product of the values over the requested axis.
quantile([q, interpolation]) Return value at the given quantile.
radd(other[, level, fill_value, axis]) Return Addition of series and other, element-wise (binary operator radd).
rank([axis]) Compute numerical data ranks (1 through n) along axis.
ravel([order]) Return the flattened underlying data as an ndarray.
rdiv(other[, level, fill_value, axis]) Return Floating division of series and other, element-wise (binary operator rtruediv).
rdivmod(other[, level, fill_value, axis]) Return Integer division and modulo of series and other, element-wise (binary operator rdivmod).
reindex([index]) Conform Series to new index with optional filling logic.
reindex_like(other, method, copy[, limit, …]) Return an object with matching indices as other object.
rename([index, axis, copy, inplace, level, …]) Alter Series index labels or name.
rename_axis([mapper, index, columns, axis, …]) Set the name of the axis for the index or columns.
reorder_levels(order) Rearrange index levels using input order.
repeat(repeats[, axis]) Repeat elements of a Series.
replace([to_replace, value, inplace, limit, …]) Replace values given in to_replace with value.
resample(rule[, axis, loffset, on, level]) Resample time-series data.
reset_index([level, drop, name, inplace]) Generate a new DataFrame or Series with the index reset.
rfloordiv(other[, level, fill_value, axis]) Return Integer division of series and other, element-wise (binary operator rfloordiv).
rmod(other[, level, fill_value, axis]) Return Modulo of series and other, element-wise (binary operator rmod).
rmul(other[, level, fill_value, axis]) Return Multiplication of series and other, element-wise (binary operator rmul).
rolling(window, …) Provide rolling window calculations.
round([decimals]) Round each value in a Series to the given number of decimals.
rpow(other[, level, fill_value, axis]) Return Exponential power of series and other, element-wise (binary operator rpow).
rsub(other[, level, fill_value, axis]) Return Subtraction of series and other, element-wise (binary operator rsub).
rtruediv(other[, level, fill_value, axis]) Return Floating division of series and other, element-wise (binary operator rtruediv).
sample([n, frac, replace, weights, …]) Return a random sample of items from an axis of object.
searchsorted(value[, side, sorter]) Find indices where elements should be inserted to maintain order.
sem([axis, skipna, level, ddof, numeric_only]) Return unbiased standard error of the mean over requested axis.
set_axis(labels, axis, inplace) Assign desired index to given axis.
set_flags(*, copy, allows_duplicate_labels) Return a new object with updated flags.
shift([periods, freq, axis, fill_value]) Shift index by desired number of periods with an optional time freq.
skew([axis, skipna, level, numeric_only]) Return unbiased skew over requested axis.
slice_shift(periods[, axis]) Equivalent to shift without copying data.
sort_index([axis, level]) Sort Series by index labels.
sort_values([axis]) Sort by the values.
squeeze([axis]) Squeeze 1 dimensional axis objects into scalars.
std([axis, skipna, level, ddof, numeric_only]) Return sample standard deviation over requested axis.
sub(other[, level, fill_value, axis]) Return Subtraction of series and other, element-wise (binary operator sub).
subtract(other[, level, fill_value, axis]) Return Subtraction of series and other, element-wise (binary operator sub).
sum([axis, skipna, level, numeric_only, …]) Return the sum of the values over the requested axis.
swapaxes(axis1, axis2[, copy]) Interchange axes and swap values axes appropriately.
swaplevel([i, j, copy]) Swap levels i and j in a MultiIndex.
tail(n) Return the last n rows.
take(indices[, axis, is_copy]) Return the elements in the given positional indices along an axis.
to_clipboard(excel, sep, **kwargs) Copy object to the system clipboard.
to_csv(path_or_buf, …) Write object to a comma-separated values (csv) file.
to_dict([into]) Convert Series to {label -> value} dict or dict-like object.
to_excel(excel_writer, sheet_name, na_rep, …) Write object to an Excel sheet.
to_frame([name]) Convert Series to DataFrame.
to_hdf(path_or_buf, key, mode, complevel, …) Write the contained data to an HDF5 file using HDFStore.
to_json(path_or_buf, …) Convert the object to a JSON string.
to_latex([buf, columns, col_space, header, …]) Render object to a LaTeX tabular, longtable, or nested table/tabular.
to_list() Return a list of the values.
to_markdown(buf, mode, index, …) Print Series in Markdown-friendly format.
to_numpy([dtype, copy, na_value]) A NumPy ndarray representing the values in this Series or Index.
to_pandas() Convert Lux Series to Pandas Series
to_period([freq, copy]) Convert Series from DatetimeIndex to PeriodIndex.
to_pickle(path, compression, …) Pickle (serialize) object to file.
to_sql(name, con[, schema, index_label, …]) Write records stored in a DataFrame to a SQL database.
to_string([buf, na_rep, float_format, …]) Render a string representation of the Series.
to_timestamp([freq, how, copy]) Cast to DatetimeIndex of Timestamps, at beginning of period.
to_xarray() Return an xarray object from the pandas object.
tolist() Return a list of the values.
transform(func, …) Call func on self producing a Series with transformed values.
transpose(*args, **kwargs) Return the transpose, which is by definition self.
truediv(other[, level, fill_value, axis]) Return Floating division of series and other, element-wise (binary operator truediv).
truncate([before, after, axis]) Truncate a Series or DataFrame before and after some index value.
tshift(periods[, freq]) Shift the time index, using the index’s frequency if available.
tz_convert(tz[, axis, level]) Convert tz-aware axis to target time zone.
tz_localize(tz[, axis, level, ambiguous]) Localize tz-naive index of a Series or DataFrame to target time zone.
unique() Overridden method for pd.Series.unique with cached results.
unstack([level, fill_value]) Unstack, also known as pivot, Series with MultiIndex to produce DataFrame.
update(other) Modify Series in place using values from passed Series.
value_counts(normalize, sort, ascending[, bins]) Return a Series containing counts of unique values.
var([axis, skipna, level, ddof, numeric_only]) Return unbiased variance over requested axis.
view([dtype]) Create a new view of the Series.
where(cond[, other, inplace, axis, level, …]) Replace values where the condition is False.
xs(key[, axis, level]) Return cross-section from the Series/DataFrame.

Attributes

T Return the transpose, which is by definition self.
array The ExtensionArray of the data backing this Series or Index.
at Access a single value for a row/column label pair.
attrs Dictionary of global attributes of this dataset.
axes Return a list of the row axis labels.
dtype Return the dtype object of the underlying data.
dtypes Return the dtype object of the underlying data.
empty
exported Get selected visualizations as exported Vis List
flags Get the properties associated with this pandas object.
hasnans Return True if there are any NaNs; enables various performance speedups.
iat Access a single value for a row/column pair by integer position.
iloc Purely integer-location based indexing for selection by position.
index The index (axis labels) of the Series.
is_monotonic Return boolean if values in the object are monotonic_increasing.
is_monotonic_decreasing Return boolean if values in the object are monotonic_decreasing.
is_monotonic_increasing Alias for is_monotonic.
is_unique Return boolean if values in the object are unique.
loc Access a group of rows and columns by label(s) or a boolean array.
name Return the name of the Series.
nbytes Return the number of bytes in the underlying data.
ndim Number of dimensions of the underlying data, by definition 1.
recommendation
shape Return a tuple of the shape of the underlying data.
size Return the number of elements in the underlying data.
values Return Series as ndarray or ndarray-like depending on the dtype.
groupby(*args, **kwargs)[source]

Group Series using a mapper or by a Series of columns.

A groupby operation involves some combination of splitting the object, applying a function, and combining the results. This can be used to group large amounts of data and compute operations on these groups.
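The three phases named above can be illustrated without pandas. The following is a minimal plain-Python sketch, not the pandas or Lux implementation; `groupby_mean` is a hypothetical helper introduced only for illustration, and it handles just the case where the grouping keys are given as a list aligned with the values.

```python
from collections import defaultdict
from statistics import mean


def groupby_mean(values, keys):
    """Hypothetical stand-in for ser.groupby(keys).mean()."""
    # Split: bucket each value under its group key.
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)
    # Apply: reduce every bucket with the aggregation function.
    # Combine: collect per-group results, ordered by key (mirrors sort=True).
    return {key: mean(bucket) for key, bucket in sorted(groups.items())}


speeds = [390.0, 350.0, 30.0, 20.0]
result = groupby_mean(speeds, ["a", "b", "a", "b"])
print(result)  # {'a': 210.0, 'b': 185.0}
```

The values match the first `ser.groupby(["a", "b", "a", "b"]).mean()` example in the Examples section below.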

Parameters:
  • by (mapping, function, label, or list of labels) – Used to determine the groups for the groupby. If by is a function, it’s called on each value of the object’s index. If a dict or Series is passed, the Series or dict VALUES will be used to determine the groups (the Series’ values are first aligned; see .align() method). If an ndarray is passed, the values are used as-is to determine the groups. A label or list of labels may be passed to group by the columns in self. Notice that a tuple is interpreted as a (single) key.
  • axis ({0 or 'index', 1 or 'columns'}, default 0) – Split along rows (0) or columns (1).
  • level (int, level name, or sequence of such, default None) – If the axis is a MultiIndex (hierarchical), group by a particular level or levels.
  • as_index (bool, default True) – For aggregated output, return object with group labels as the index. Only relevant for DataFrame input. as_index=False is effectively “SQL-style” grouped output.
  • sort (bool, default True) – Sort group keys. Get better performance by turning this off. Note this does not influence the order of observations within each group. Groupby preserves the order of rows within each group.
  • group_keys (bool, default True) – When calling apply, add group keys to index to identify pieces.
  • squeeze (bool, default False) –

    Reduce the dimensionality of the return type if possible, otherwise return a consistent type.

    Deprecated since version 1.1.0.

  • observed (bool, default False) – This only applies if any of the groupers are Categoricals. If True: only show observed values for categorical groupers. If False: show all values for categorical groupers.
  • dropna (bool, default True) –

    If True, and if group keys contain NA values, NA values together with row/column will be dropped. If False, NA values will also be treated as the key in groups.

    New in version 1.1.0.
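How `sort` and `dropna` act on group keys can be sketched in plain Python. This is an illustration under stated assumptions, not the pandas code path: `split_into_groups` is a hypothetical helper, `None` stands in for the NaN group label so it can be looked up, and the NA group is always placed last, which is a simplification.

```python
import math


def split_into_groups(values, keys, sort=True, dropna=True):
    """Hypothetical helper showing how sort and dropna treat group keys."""

    def is_na(key):
        return key is None or (isinstance(key, float) and math.isnan(key))

    groups = {}     # dicts keep insertion order, i.e. order of first appearance
    na_bucket = []  # values whose group key is NA
    for key, value in zip(keys, values):
        if is_na(key):
            na_bucket.append(value)
        else:
            groups.setdefault(key, []).append(value)
    if sort:
        # Only the order of the keys changes; rows inside a group stay put.
        groups = dict(sorted(groups.items()))
    if not dropna and na_bucket:
        groups[None] = na_bucket  # None stands in for the NaN label
    return groups


values = [390.0, 350.0, 30.0, 20.0]
keys = ["b", "a", "b", float("nan")]

print(split_into_groups(values, keys))
# {'a': [350.0], 'b': [390.0, 30.0]}
print(split_into_groups(values, keys, sort=False))
# {'b': [390.0, 30.0], 'a': [350.0]}
print(split_into_groups(values, keys, dropna=False))
# {'a': [350.0], 'b': [390.0, 30.0], None: [20.0]}
```

With `sort=False` the keys come out in order of first appearance, and in every case the values inside a group keep their original relative order.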

Returns:

Returns a groupby object that contains information about the groups.

Return type:

SeriesGroupBy

See also

resample()
Convenience method for frequency conversion and resampling of time series.

Notes

See the user guide for more.

Examples

>>> ser = pd.Series([390., 350., 30., 20.],
...                 index=['Falcon', 'Falcon', 'Parrot', 'Parrot'], name="Max Speed")
>>> ser
Falcon    390.0
Falcon    350.0
Parrot     30.0
Parrot     20.0
Name: Max Speed, dtype: float64
>>> ser.groupby(["a", "b", "a", "b"]).mean()
a    210.0
b    185.0
Name: Max Speed, dtype: float64
>>> ser.groupby(level=0).mean()
Falcon    370.0
Parrot     25.0
Name: Max Speed, dtype: float64
>>> ser.groupby(ser > 100).mean()
Max Speed
False     25.0
True     370.0
Name: Max Speed, dtype: float64

Grouping by Indexes

We can groupby different levels of a hierarchical index using the level parameter:

>>> arrays = [['Falcon', 'Falcon', 'Parrot', 'Parrot'],
...           ['Captive', 'Wild', 'Captive', 'Wild']]
>>> index = pd.MultiIndex.from_arrays(arrays, names=('Animal', 'Type'))
>>> ser = pd.Series([390., 350., 30., 20.], index=index, name="Max Speed")
>>> ser
Animal  Type
Falcon  Captive    390.0
        Wild       350.0
Parrot  Captive     30.0
        Wild        20.0
Name: Max Speed, dtype: float64
>>> ser.groupby(level=0).mean()
Animal
Falcon    370.0
Parrot     25.0
Name: Max Speed, dtype: float64
>>> ser.groupby(level="Type").mean()
Type
Captive    210.0
Wild       185.0
Name: Max Speed, dtype: float64

We can also choose whether to include NA in group keys by setting the dropna parameter; the default is True:

>>> ser = pd.Series([1, 2, 3, 3], index=["a", 'a', 'b', np.nan])
>>> ser.groupby(level=0).sum()
a    3
b    3
dtype: int64
>>> ser.groupby(level=0, dropna=False).sum()
a    3
b    3
NaN  3
dtype: int64
>>> arrays = ['Falcon', 'Falcon', 'Parrot', 'Parrot']
>>> ser = pd.Series([390., 350., 30., 20.], index=arrays, name="Max Speed")
>>> ser.groupby(["a", "b", "a", np.nan]).mean()
a    210.0
b    350.0
Name: Max Speed, dtype: float64
>>> ser.groupby(["a", "b", "a", np.nan], dropna=False).mean()
a    210.0
b    350.0
NaN   20.0
Name: Max Speed, dtype: float64
to_pandas() → pandas.core.series.Series[source]

Convert Lux Series to Pandas Series

Return type:

pd.Series
unique()[source]

Overridden method for pd.Series.unique with cached results. Return unique values of Series object. Uniques are returned in order of appearance. Hash table-based unique, therefore does NOT sort.

Returns:

The unique values returned as a NumPy array.

Return type:

ndarray or ExtensionArray
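The combination of caching and order-of-appearance uniqueness described above can be sketched in plain Python. The source does not say how Lux stores or invalidates its cache, so this assumes simple memoization on first call; `CachedUnique` and its `computations` counter are hypothetical names used only for illustration, and a list is returned instead of a NumPy array.

```python
class CachedUnique:
    """Hypothetical stand-in; not the LuxSeries implementation."""

    def __init__(self, values):
        self._values = list(values)
        self._unique_cache = None  # filled on the first unique() call
        self.computations = 0      # counts real computations, for illustration

    def unique(self):
        if self._unique_cache is None:
            self.computations += 1
            # dict.fromkeys is hash-based and keeps first-appearance order,
            # so the result is deduplicated but NOT sorted.
            self._unique_cache = list(dict.fromkeys(self._values))
        return self._unique_cache


col = CachedUnique(["b", "a", "b", "c", "a"])
first = col.unique()
second = col.unique()  # served from the cache, no recomputation
print(first)             # ['b', 'a', 'c']
print(col.computations)  # 1
```

A real cache would also need to be cleared whenever the underlying data changes; that step is omitted here.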

See also

https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.Series.unique.html
exported

Get selected visualizations as exported Vis List

Notes

Converts the _selectedVisIdxs dictionary into a programmable VisList. Example _selectedVisIdxs:

{'Correlation': [0, 2], 'Occurrence': [1]}

This indicates that the 0th and 2nd vis from the Correlation tab are selected, and the 1st vis from the Occurrence tab is selected.

Returns:

  • When there are no exported vis, return an empty list -> []
  • When all the exported vis are from the same tab, return a VisList of the selected visualizations -> VisList(v1, v2…)
  • When the exported vis are from different tabs, return a dictionary with the action name as key and the selected visualizations in a VisList as value -> {"Enhance": VisList(v1, v2…), "Filter": VisList(v5, v7…), ..}

Return type:

Union[Dict[str, VisList], VisList]
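The three return shapes can be sketched in plain Python. This is not Lux's code: `build_exported` and the `recs` mapping are hypothetical, plain lists stand in for VisList, and the sketch assumes that tabs with an empty selection are ignored.

```python
def build_exported(selected_idxs, recommendations):
    """Hypothetical helper; plain lists stand in for Lux's VisList."""
    # Resolve the selected indices of each tab to the vis objects they name.
    picked = {
        action: [recommendations[action][i] for i in idxs]
        for action, idxs in selected_idxs.items()
        if idxs  # assumption: tabs without a selection are skipped
    }
    if not picked:
        return []  # nothing exported
    if len(picked) == 1:
        return next(iter(picked.values()))  # single tab: just its list
    return picked  # several tabs: action name -> list


# Strings stand in for visualization objects.
recs = {"Correlation": ["c0", "c1", "c2"], "Occurrence": ["o0", "o1"]}

print(build_exported({}, recs))
# []
print(build_exported({"Correlation": [0, 2]}, recs))
# ['c0', 'c2']
print(build_exported({"Correlation": [0, 2], "Occurrence": [1]}, recs))
# {'Correlation': ['c0', 'c2'], 'Occurrence': ['o1']}
```

Because the return type varies with the selection, callers may want to check for a dictionary with `isinstance` before indexing the result.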