model_clo_insight.analysis.data module

Functions for dataframe aggregation

stratify(df, group_by_col, count_col, par_col, is_sort=True, top_n=None)

Computes a stratification of a dataframe: groups rows by a specified column and reports count, par, and percent for each category

Parameters
  • df (pd.DataFrame) – dataframe to stratify

  • group_by_col (str) – column to use for the groupby operation

  • count_col (str) – column to use for counting

  • par_col (str) – column to use for par

  • is_sort (bool) – optional; if true, sort the results by the Percent column (high to low)

  • top_n (int) – optional; limit the results to the top n items

Returns

dataframe with Count, Par, and Percent as columns and one row per category

Return type

pd.DataFrame
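
The following is a minimal sketch of how a table of this shape could be built with pandas. It is not the module's implementation. The function name stratify_sketch, the sample columns (industry, loan_id, par), and three definitions are assumptions made for illustration: Count is taken as the number of non-null count_col entries per category, Par as the sum of par_col per category, and Percent as each category's share of total par.

```python
import pandas as pd


def stratify_sketch(df, group_by_col, count_col, par_col, is_sort=True, top_n=None):
    """Hypothetical re-creation of a stratification table; not the module's code."""
    grouped = df.groupby(group_by_col)
    out = pd.DataFrame({
        # Assumption: Count is the number of non-null count_col entries per category.
        "Count": grouped[count_col].count(),
        # Assumption: Par is the summed par_col per category.
        "Par": grouped[par_col].sum(),
    })
    # Assumption: Percent is each category's share of total par, in percent.
    out["Percent"] = 100.0 * out["Par"] / out["Par"].sum()
    if is_sort:
        # Highest percentage first.
        out = out.sort_values("Percent", ascending=False)
    if top_n is not None:
        # Keep only the first top_n rows of the (possibly sorted) table.
        out = out.head(top_n)
    return out


# Hypothetical sample data.
loans = pd.DataFrame({
    "industry": ["Tech", "Retail", "Tech", "Energy"],
    "loan_id": ["A1", "B2", "C3", "D4"],
    "par": [300.0, 150.0, 500.0, 50.0],
})
table = stratify_sketch(loans, "industry", "loan_id", "par", top_n=2)
print(table)
```

With this sample data the sketch keeps the two largest categories, Tech and Retail, and drops Energy. The real function may define Count or Percent differently, so check its output before relying on these definitions.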

w_avg(df, values, weights)

Computes the weighted average of a specified dataframe column

Parameters
  • df (pd.DataFrame) – dataframe the value and weight columns are taken from

  • values (str) – column name for values

  • weights (str) – column name for weights

Returns

computed weighted average

Return type

float
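
As an illustration, the standard weighted mean, sum(values * weights) / sum(weights), can be sketched as below. This is an assumption about what the function computes, not its actual source. The name w_avg_sketch and the sample columns (spread, par) are hypothetical, and the sketch does not handle a zero total weight.

```python
import pandas as pd


def w_avg_sketch(df, values, weights):
    """Hypothetical weighted mean of df[values], weighted by df[weights]."""
    v = df[values]
    w = df[weights]
    # Standard weighted mean; a zero total weight is not handled here.
    return float((v * w).sum() / w.sum())


# Hypothetical sample data.
loans = pd.DataFrame({"spread": [4.0, 6.0], "par": [100.0, 300.0]})
result = w_avg_sketch(loans, "spread", "par")
print(result)  # (4*100 + 6*300) / 400 = 5.5
```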