tabmat package
- tabmat.from_df(df, dtype=<class 'numpy.float64'>, sparse_threshold=0.1, cat_threshold=4, object_as_cat=False, cat_position='expand', drop_first=False, categorical_format='{name}[{category}]', cat_missing_method='fail', cat_missing_name='(MISSING)')
Transform a DataFrame into an efficient SplitMatrix.
- Parameters:
df (DataFrame) – Any dataframe supported by narwhals (pandas, polars, etc.).
dtype (np.dtype, default np.float64) – dtype of all sub-matrices of the resulting SplitMatrix.
sparse_threshold (float, default 0.1) – Density threshold below which numerical columns will be stored in a sparse format.
cat_threshold (int, default 4) – Number of levels of a categorical column under which the column will be stored as sparse one-hot-encoded columns instead of CategoricalMatrix
object_as_cat (bool, default False) – If True, DataFrame columns stored as python objects will be treated as categorical columns.
cat_position (str {'end'|'expand'}, default 'expand') – Position of the categorical variables in the index. If 'end', all the categoricals (including the ones that did not satisfy cat_threshold) will be placed at the end of the index list. If 'expand', all the variables will remain in the same order.
drop_first (bool, default False) – If True, categorical variables will have their first category dropped. This allows multiple categorical variables to be included in an unregularized model. If False, all categories are included.
cat_missing_method (str {'fail'|'zero'|'convert'}, default 'fail') – How to handle missing values in categorical columns:
if ‘fail’, raise an error if there are missing values.
if ‘zero’, missing values will represent all-zero indicator columns.
if ‘convert’, missing values will be converted to the ‘(MISSING)’ category.
cat_missing_name (str, default '(MISSING)') – Name of the category to which missing values will be converted if cat_missing_method='convert'.
categorical_format (str)
- Return type:
SplitMatrix
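The sparse_threshold rule can be pictured with a small NumPy sketch. It assumes that "density" means the share of non-zero entries in a column, and the helper classify_columns is hypothetical (it is not part of tabmat). How ties at exactly the threshold are handled is not stated above, so the sketch uses a strict comparison.

```python
import numpy as np

def classify_columns(X, sparse_threshold=0.1):
    """Label each column 'sparse' or 'dense' by its share of non-zero entries."""
    density = (np.asarray(X) != 0).mean(axis=0)
    # Assumption: a column strictly below the threshold is stored sparsely.
    return ["sparse" if d < sparse_threshold else "dense" for d in density]

X = np.zeros((20, 2))
X[0, 0] = 1.0   # column 0: 1 of 20 entries is non-zero (density 0.05)
X[:, 1] = 1.0   # column 1: every entry is non-zero (density 1.0)
labels = classify_columns(X)
```

With the default threshold, the first column would be a candidate for sparse storage and the second would stay dense.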
- tabmat.from_pandas(df, dtype=<class 'numpy.float64'>, sparse_threshold=0.1, cat_threshold=4, object_as_cat=False, cat_position='expand', drop_first=False, categorical_format='{name}[{category}]', cat_missing_method='fail', cat_missing_name='(MISSING)')
Deprecated. Please use from_df instead.
Transform a pandas.DataFrame into an efficient SplitMatrix.
- Parameters:
df (pd.DataFrame) – pandas DataFrame to convert.
dtype (np.dtype, default np.float64) – dtype of all sub-matrices of the resulting SplitMatrix.
sparse_threshold (float, default 0.1) – Density threshold below which numerical columns will be stored in a sparse format.
cat_threshold (int, default 4) – Number of levels of a categorical column under which the column will be stored as sparse one-hot-encoded columns instead of CategoricalMatrix
object_as_cat (bool, default False) – If True, DataFrame columns stored as python objects will be treated as categorical columns.
cat_position (str {'end'|'expand'}, default 'expand') – Position of the categorical variables in the index. If 'end', all the categoricals (including the ones that did not satisfy cat_threshold) will be placed at the end of the index list. If 'expand', all the variables will remain in the same order.
drop_first (bool, default False) – If True, categorical variables will have their first category dropped. This allows multiple categorical variables to be included in an unregularized model. If False, all categories are included.
cat_missing_method (str {'fail'|'zero'|'convert'}, default 'fail') – How to handle missing values in categorical columns:
if ‘fail’, raise an error if there are missing values.
if ‘zero’, missing values will represent all-zero indicator columns.
if ‘convert’, missing values will be converted to the ‘(MISSING)’ category.
cat_missing_name (str, default '(MISSING)') – Name of the category to which missing values will be converted if cat_missing_method='convert'.
categorical_format (str)
- Return type:
SplitMatrix
- tabmat.from_csc(mat, threshold=0.1, column_names=None, term_names=None)
Convert a CSC-format sparse matrix into a SplitMatrix.
The threshold parameter specifies the density below which a column is treated as sparse.
- Parameters:
mat (csc_matrix)
- class tabmat.MatrixBase
Bases: ABC
Base class for all matrix classes.
MatrixBase cannot be instantiated.
- property A: ndarray
Convert self into an np.ndarray. Synonym for toarray().
- property column_names
Column names of the matrix.
- abstract get_names(type='column', missing_prefix=None, indices=None)
Get column names.
For columns that do not have a name, a default name is created using the following pattern: "{missing_prefix}{start_index + i}", where i is the index of the column.
- Parameters:
type (str {'column'|'term'}) – Whether to get column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
missing_prefix (Optional[str], default None) – Prefix to use for columns that do not have a name. If None, then no default name is created.
indices (list[int] | None) – The indices used for columns that do not have a name. If None, then the indices are list(range(self.shape[1])).
- Returns:
Column names.
- Return type:
list[Optional[str]]
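The default-name pattern can be illustrated in plain Python. The helper fill_default_names below is hypothetical and only mirrors the description above, under the assumption that the number appended to missing_prefix is taken from indices.

```python
def fill_default_names(names, missing_prefix=None, indices=None):
    """Replace missing (None) names with '{missing_prefix}{index}'."""
    if indices is None:
        indices = list(range(len(names)))
    if missing_prefix is None:
        # No prefix given: unnamed columns stay None.
        return list(names)
    return [
        name if name is not None else f"{missing_prefix}{idx}"
        for name, idx in zip(names, indices)
    ]

filled = fill_default_names(["age", None, None], missing_prefix="_col_")
shifted = fill_default_names(["age", None, None], missing_prefix="_col_", indices=[5, 6, 7])
```

Passing indices is what lets a sub-matrix label its unnamed columns by their position in a larger matrix.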
- abstract matvec(other, cols=None, out=None)
Perform: self[:, cols] @ other[cols], so result[i] = sum_j self[i, j] other[j].
The ‘cols’ parameter allows restricting to a subset of the matrix without making a copy. If provided:
result[i] = sum_{j in cols} self[i, j] other[j].
If ‘out’ is provided, we modify ‘out’ in place by adding the output of this operation to it.
- Parameters:
cols (ndarray | None)
out (ndarray | None)
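The semantics of cols and out can be checked against a dense NumPy reference. This is a sketch of the documented behaviour only; matvec_reference is a hypothetical helper and says nothing about how tabmat implements the operation.

```python
import numpy as np

def matvec_reference(X, other, cols=None, out=None):
    """Dense reference for X[:, cols] @ other[cols], accumulated into out."""
    X = np.asarray(X, dtype=float)
    other = np.asarray(other, dtype=float)
    if cols is None:
        cols = np.arange(X.shape[1])
    result = X[:, cols] @ other[cols]
    if out is None:
        return result
    out += result  # in-place accumulation, as described above
    return out

X = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
v = np.array([1.0, 10.0, 100.0])
full = matvec_reference(X, v)
subset = matvec_reference(X, v, cols=np.array([0, 2]))
acc = np.ones(2)
matvec_reference(X, v, out=acc)  # acc now holds 1 + X @ v
```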
- abstract sandwich(d, rows=None, cols=None)
Perform a sandwich product: (self[rows, cols].T * d[rows]) @ self[rows, cols].
The rows and cols parameters allow restricting to a subset of the matrix without making a copy.
- Parameters:
d (ndarray)
rows (ndarray | None)
cols (ndarray | None)
- Return type:
ndarray
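A dense NumPy reference makes the sandwich formula concrete. The helper sandwich_reference is hypothetical and materializes the sub-matrix, which the real classes are designed to avoid.

```python
import numpy as np

def sandwich_reference(X, d, rows=None, cols=None):
    """Dense reference for (X[rows, cols].T * d[rows]) @ X[rows, cols]."""
    X = np.asarray(X, dtype=float)
    d = np.asarray(d, dtype=float)
    if rows is None:
        rows = np.arange(X.shape[0])
    if cols is None:
        cols = np.arange(X.shape[1])
    sub = X[np.ix_(rows, cols)]
    # Broadcasting d[rows] over the columns of sub.T equals sub.T @ diag(d[rows]).
    return (sub.T * d[rows]) @ sub

X = np.array([[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]])
d = np.array([1.0, 0.5, 2.0])
S = sandwich_reference(X, d)
S_sub = sandwich_reference(X, d, rows=np.array([0, 2]), cols=np.array([1]))
```

The result has one row and one column per selected column, so S is 2 x 2 and S_sub is 1 x 1.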
- set_names(names, type='column')
Set column names.
- Parameters:
names (list[Optional[str]]) – Names to set.
type (str {'column'|'term'}) – Whether to set column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
- standardize(weights, center_predictors, scale_predictors)
Return a StandardizedMatrix along with the column means and column standard deviations.
It is often useful to modify a dataset so that each column has mean zero and standard deviation one. This function does this “standardization” without modifying the underlying dataset by storing shifting and scaling factors that are then used whenever an operation is performed with the new StandardizedMatrix.
Note: If center_predictors is False, col_means will be zeros.
Note: If scale_predictors is False, col_stds will be None.
- Parameters:
weights (ndarray)
center_predictors (bool)
scale_predictors (bool)
- Return type:
tuple[Any, ndarray, ndarray | None]
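The return values can be pictured with an eager NumPy sketch. It assumes weights are normalized to sum to one and that the standard deviation is always measured around the weighted mean; both are assumptions, and standardize_reference is a hypothetical helper. Unlike the method described above, it materializes the standardized data instead of storing shift and scale factors.

```python
import numpy as np

def standardize_reference(X, weights, center_predictors, scale_predictors):
    """Eagerly standardize X; return (matrix, col_means, col_stds)."""
    X = np.asarray(X, dtype=float)
    w = np.asarray(weights, dtype=float)
    w = w / w.sum()  # assumption: weights are normalized to sum to one
    mean = w @ X
    col_means = mean if center_predictors else np.zeros(X.shape[1])
    out = X - col_means
    col_stds = None
    if scale_predictors:
        # Weighted standard deviation around the weighted mean.
        col_stds = np.sqrt(w @ (X - mean) ** 2)
        out = out / col_stds
    return out, col_means, col_stds

X = np.array([[1.0, 2.0], [3.0, 6.0], [5.0, 10.0]])
w = np.ones(3)
scaled, col_means, col_stds = standardize_reference(X, w, True, True)
```

A constant column would give a zero standard deviation; the sketch does not guard against that case.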
- property term_names
Term names of the matrix.
For differences between column names and term names, see get_names.
- abstract toarray()
Convert self into an np.ndarray.
- Return type:
ndarray
- abstract transpose_matvec(vec, rows=None, cols=None, out=None)
Perform: self[rows, cols].T @ vec[rows], so result[i] = sum_j self[j, i] vec[j].
The rows and cols parameters allow restricting to a subset of the matrix without making a copy.
If ‘rows’ and ‘cols’ are provided:
result[i] = sum_{j in rows} self[j, cols[i]] vec[j].
Note that the length of the output is len(cols).
If out is provided:
out[cols[i]] += sum_{j in rows} self[j, cols[i]] vec[j]
- Parameters:
vec (ndarray | list)
rows (ndarray | None)
cols (ndarray | None)
out (ndarray | None)
- Return type:
ndarray
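The two output conventions (a result of length len(cols) versus accumulation into out at positions cols) can be seen in a dense NumPy reference. The helper transpose_matvec_reference is hypothetical and assumes cols contains no duplicates.

```python
import numpy as np

def transpose_matvec_reference(X, vec, rows=None, cols=None, out=None):
    """Dense reference for X[rows, cols].T @ vec[rows]."""
    X = np.asarray(X, dtype=float)
    vec = np.asarray(vec, dtype=float)
    if rows is None:
        rows = np.arange(X.shape[0])
    if cols is None:
        cols = np.arange(X.shape[1])
    result = X[np.ix_(rows, cols)].T @ vec[rows]  # length len(cols)
    if out is None:
        return result
    out[cols] += result  # out has one slot per column of X
    return out

X = np.array([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
vec = np.array([1.0, 10.0])
full = transpose_matvec_reference(X, vec)
restricted = transpose_matvec_reference(X, vec, rows=np.array([1]), cols=np.array([0, 2]))
acc = np.zeros(3)
transpose_matvec_reference(X, vec, cols=np.array([0, 2]), out=acc)
```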
- class tabmat.DenseMatrix(input_array, column_names=None, term_names=None)
Bases: MatrixBase
A numpy.ndarray subclass with several additional functions that allow it to share the MatrixBase API with SparseMatrix and CategoricalMatrix.
In particular, we have added:
The sandwich product
getcol to support the same interface as SparseMatrix for retrieving a single column
toarray
matvec
- property T
Returns a view of the array with axes transposed.
- astype(dtype, order='K', casting='unsafe', copy=True)
Copy of the array, cast to a specified type.
- property dtype
Data type of the array’s elements.
- get_names(type='column', missing_prefix=None, indices=None)
Get column names.
For columns that do not have a name, a default name is created using the following pattern: "{missing_prefix}{start_index + i}", where i is the index of the column.
- Parameters:
type (str {'column'|'term'}) – Whether to get column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
missing_prefix (Optional[str], default None) – Prefix to use for columns that do not have a name. If None, then no default name is created.
indices (list[int] | None) – The indices used for columns that do not have a name. If None, then the indices are list(range(self.shape[1])).
- Returns:
Column names.
- Return type:
list[Optional[str]]
- getcol(i)
Return matrix column at specified index.
- matvec(vec, cols=None, out=None)
Perform self[:, cols] @ vec[cols].
- Parameters:
vec (ndarray | list)
cols (ndarray | None)
out (ndarray | None)
- Return type:
ndarray
- multiply(other)
Element-wise multiplication.
This assumes that other is a vector of size self.shape[0].
- property ndim
Number of array dimensions.
- sandwich(d, rows=None, cols=None)
Perform a sandwich product: X.T @ diag(d) @ X.
- Parameters:
d (ndarray)
rows (ndarray | None)
cols (ndarray | None)
- Return type:
ndarray
- set_names(names, type='column')
Set column names.
- Parameters:
names (list[Optional[str]]) – Names to set.
type (str {'column'|'term'}) – Whether to set column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
- property shape
Tuple of array dimensions.
- toarray()
Return array representation of matrix.
- transpose()
Returns a view of the array with axes transposed.
- transpose_matvec(vec, rows=None, cols=None, out=None)
Perform: self[rows, cols].T @ vec[rows].
- Parameters:
vec (ndarray | list)
rows (ndarray | None)
cols (ndarray | None)
out (ndarray | None)
- Return type:
ndarray
- unpack()
Return the underlying numpy.ndarray.
- class tabmat.SparseMatrix(input_array, shape=None, dtype=None, copy=False, column_names=None, term_names=None)
Bases: MatrixBase
A scipy.sparse csc matrix subclass that allows such objects to conform to the MatrixBase interface.
SparseMatrix is instantiated in the same way as scipy.sparse.csc_matrix.
- Parameters:
shape (tuple[int, int])
dtype (dtype)
- property T
Returns a view of the array with axes transposed.
- property array_csc
Return the CSC representation of the matrix.
- property array_csr
Cache the CSR representation of the matrix.
- astype(dtype, order='K', casting='unsafe', copy=True)
Return SparseMatrix cast to new type.
- property data
Data of the matrix.
- dot(other)
Return the dot product as a scipy sparse matrix.
- property dtype
Data-type of the array’s elements.
- get_names(type='column', missing_prefix=None, indices=None)
Get column names.
For columns that do not have a name, a default name is created using the following pattern: "{missing_prefix}{start_index + i}", where i is the index of the column.
- Parameters:
type (str {'column'|'term'}) – Whether to get column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
missing_prefix (Optional[str], default None) – Prefix to use for columns that do not have a name. If None, then no default name is created.
indices (list[int] | None) – The indices used for columns that do not have a name. If None, then the indices are list(range(self.shape[1])).
- Returns:
Column names.
- Return type:
list[Optional[str]]
- getcol(i)
Return matrix column at specified index.
- property indices
Indices of the matrix.
- property indptr
Indptr of the matrix.
- matvec(vec, cols=None, out=None)
Perform self[:, cols] @ vec[cols].
- Parameters:
cols (ndarray | None)
out (ndarray | None)
- multiply(other)
Element-wise multiplication.
See scipy.sparse.csc_matrix.multiply. The method is taken almost directly from the parent class except that other is assumed to be a vector of size self.shape[0].
- property ndim
Number of array dimensions.
- sandwich(d, rows=None, cols=None)
Perform a sandwich product: X.T @ diag(d) @ X.
- Parameters:
d (ndarray)
rows (ndarray | None)
cols (ndarray | None)
- Return type:
ndarray
- sandwich_dense(B, d, rows, L_cols, R_cols)
Perform a sandwich product: self.T @ diag(d) @ B.
- Parameters:
B (ndarray)
d (ndarray)
rows (ndarray | None)
L_cols (ndarray | None)
R_cols (ndarray | None)
- Return type:
ndarray
- set_names(names, type='column')
Set column names.
- Parameters:
names (list[Optional[str]]) – Names to set.
type (str {'column'|'term'}) – Whether to set column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
- property shape
Tuple of array dimensions.
- toarray()
Return a dense ndarray representation of the matrix.
- tocsc(copy=False)
Return the matrix in CSC format.
- transpose()
Returns a view of the array with axes transposed.
- transpose_matvec(vec, rows=None, cols=None, out=None)
Perform: self[rows, cols].T @ vec[rows].
- Parameters:
vec (ndarray | list)
rows (ndarray | None)
cols (ndarray | None)
out (ndarray | None)
- Return type:
ndarray
- unpack()
Return the underlying scipy.sparse.csc_matrix.
- class tabmat.CategoricalMatrix(cat_vec, categories=None, drop_first=False, dtype=<class 'numpy.float64'>, column_name=None, term_name=None, column_name_format='{name}[{category}]', cat_missing_method='fail', cat_missing_name='(MISSING)')
Bases: MatrixBase
A faster, more memory efficient sparse matrix adapted to the specific settings of a one-hot encoded categorical variable.
- Parameters:
cat_vec – array-like vector of categorical data.
categories (np.ndarray, default None) – If provided, cat_vec is assumed to be an array-like vector of indices.
drop_first (bool) – drop the first level of the dummy encoding. This allows a CategoricalMatrix to be used in an unregularized setting.
cat_missing_method (str {'fail'|'zero'|'convert'}, default 'fail') –
if ‘fail’, raise an error if there are missing values.
if ‘zero’, missing values will represent all-zero indicator columns.
if ‘convert’, missing values will be converted to the cat_missing_name category.
cat_missing_name (str, default '(MISSING)') – Name of the category to which missing values will be converted if cat_missing_method='convert'. If this category already exists, an error will be raised.
dtype (numpy.dtype) – data type
column_name (str | None)
term_name (str | None)
column_name_format (str)
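The matrix that a CategoricalMatrix stands for can be written out densely to see what drop_first and cat_missing_method='zero' mean. The helper one_hot below is hypothetical, and it assumes missing values are encoded as code -1 (the convention pandas uses for Categorical codes).

```python
import numpy as np

def one_hot(codes, n_categories, drop_first=False):
    """Dense indicator matrix from integer codes; code -1 marks a missing value."""
    codes = np.asarray(codes)
    out = np.zeros((len(codes), n_categories))
    present = codes >= 0
    # Rows with a missing value receive no indicator and stay all-zero.
    out[np.nonzero(present)[0], codes[present]] = 1.0
    # Dropping the first level removes its indicator column.
    return out[:, 1:] if drop_first else out

codes = np.array([0, 2, -1, 1])
full = one_hot(codes, 3)
reduced = one_hot(codes, 3, drop_first=True)
```

With drop_first=True, rows in the first category and rows with a missing value both become all-zero, so the two cases can no longer be told apart from the indicators alone.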
- astype(dtype, order='K', casting='unsafe', copy=True)
Return CategoricalMatrix cast to new type.
- property cat
Return a series with same data as what was initially fed to __init__.
This property is available for backward compatibility.
- get_names(type='column', missing_prefix=None, indices=None)
Get column names.
For columns that do not have a name, a default name is created using the following pattern: "{missing_prefix}{start_index + i}", where i is the index of the column.
- Parameters:
type (str {'column'|'term'}) – Whether to get column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
missing_prefix (Optional[str], default None) – Prefix to use for columns that do not have a name. If None, then no default name is created.
indices (list[int] | None) – The indices used for columns that do not have a name. If None, then the indices are list(range(self.shape[1])).
- Returns:
Column names.
- Return type:
list[Optional[str]]
- getcol(i)
Return matrix column at specified index.
- Parameters:
i (int)
- Return type:
- matvec(other, cols=None, out=None)
Multiply self with vector ‘other’, and add vector ‘out’ if it is present.
out[i] += sum_j mat[i, j] other[j] = other[mat.indices[i]]
The cols parameter allows restricting to a subset of the matrix without making a copy.
If out is None, then a new array will be returned.
Test: test_matrices::test_matvec
- Parameters:
other (list | ndarray)
cols (ndarray | None)
out (ndarray | None)
- Return type:
ndarray
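The identity sum_j mat[i, j] other[j] = other[mat.indices[i]] can be checked with plain NumPy. The dense one-hot matrix is built here only for comparison; the variable names are illustrative.

```python
import numpy as np

indices = np.array([2, 0, 1, 2])       # category code of each row
other = np.array([10.0, 20.0, 30.0])   # one value per category

dense = np.eye(3)[indices]             # explicit one-hot matrix, for comparison only
via_matmul = dense @ other
via_lookup = other[indices]            # same result without forming the matrix
```

Because each row has exactly one non-zero entry, the product reduces to a lookup.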
- multiply(other)
Element-wise multiplication.
This assumes that other is a vector of size self.shape[0].
- Return type:
- recover_orig()
Return 1d numpy array with same data as what was initially fed to __init__.
Test: matrix/test_categorical_matrix::test_recover_orig
- Return type:
ndarray
- sandwich(d, rows=None, cols=None)
Perform a sandwich product: X.T @ diag(d) @ X.
sandwich(self, d)[i, j] = (self.T @ diag(d) @ self)[i, j]
= sum_k (self[k, i] (diag(d) @ self)[k, j])
= sum_k self[k, i] sum_m diag(d)[k, m] self[m, j]
= sum_k self[k, i] d[k] self[k, j]
= 0 if i != j
sandwich(self, d)[i, i] = sum_k self[k, i] ** 2 * d[k]
The rows and cols parameters allow restricting to a subset of the matrix without making a copy.
- Parameters:
d (ndarray | list)
rows (ndarray | None)
cols (ndarray | None)
- Return type:
dia_matrix
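The derivation above says the sandwich product of a one-hot matrix is diagonal, with the weights d summed per category on the diagonal. A minimal NumPy check, assuming no dropped level and no missing values:

```python
import numpy as np

indices = np.array([2, 0, 1, 2])      # category code of each row
d = np.array([0.5, 1.0, 2.0, 4.0])    # one weight per row

dense = np.eye(3)[indices]            # explicit one-hot matrix, for comparison only
full = dense.T @ np.diag(d) @ dense
# Sum the weights of the rows that fall in each category.
diagonal = np.bincount(indices, weights=d, minlength=3)
```

Only the diagonal needs to be stored, which is consistent with the dia_matrix return type.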
- set_names(names, type='column')
Set column names.
- Parameters:
names (list[Optional[str]]) – Names to set.
type (str {'column'|'term'}) – Whether to set column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
- to_sparse_matrix()
Return a tabmat.SparseMatrix representation.
- toarray()
Return array representation of matrix.
- Return type:
ndarray
- tocsr()
Return scipy csr representation of matrix.
- Return type:
csr_matrix
- transpose_matvec(vec, rows=None, cols=None, out=None)
Perform: self[rows, cols].T @ vec[rows].
for i in cols: out[i] += sum_{j in rows} self[j, i] vec[j]
self[j, i] = 1(indices[j] == i)
for j in rows:
    for i in cols:
        out[i] += (indices[j] == i) * vec[j]
If cols == range(self.shape[1]), then for every row j, there will be exactly one relevant column, so you can do
for j in rows: out[indices[j]] += vec[j]
The rows and cols parameters allow restricting to a subset of the matrix without making a copy.
If out is None, then a new array will be returned.
Test: tests/test_matrices::test_transpose_matvec
- Parameters:
vec (ndarray | list)
rows (ndarray | None)
cols (ndarray | None)
out (ndarray | None)
- Return type:
ndarray
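The scatter-add form of the loop above, for the case where all columns are selected, can be sketched in NumPy. np.add.at is used because several rows may share a category, so a plain fancy-indexed += would drop repeated contributions.

```python
import numpy as np

indices = np.array([2, 0, 1, 2])        # category code of each row
vec = np.array([1.0, 2.0, 3.0, 4.0])
rows = np.array([0, 1, 3])              # restrict to a subset of rows

out = np.zeros(3)
# for j in rows: out[indices[j]] += vec[j]
np.add.at(out, indices[rows], vec[rows])

dense = np.eye(3)[indices]              # explicit one-hot matrix, for comparison only
expected = dense[rows].T @ vec[rows]
```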
- unpack()
Return the underlying pandas.Categorical.
- class tabmat.SplitMatrix(matrices, indices=None)
Bases: MatrixBase
A class for matrices with sparse, dense and categorical parts.
For real-world tabular data, it’s common for the same dataset to contain a mix of columns that are naturally dense, naturally sparse and naturally categorical. Representing each of these sets of columns in the format that is most natural allows for a significant speedup in matrix multiplications compared to representations that are entirely dense or entirely sparse.
Initialize a SplitMatrix directly with a list of matrices and a list of column indices for each matrix. Most of the time, it will be best to use tabmat.from_pandas() or tabmat.from_csc() to initialize a SplitMatrix.
- Parameters:
matrices (Sequence[MatrixBase]) – The sub-matrices composing the columns of this SplitMatrix.
indices (list[ndarray] | None) – If indices is not None, then for each matrix passed in matrices, indices must contain the set of columns which that matrix covers.
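The relationship between matrices and indices can be sketched with plain NumPy blocks. This is an illustration under the assumption that each block holds the columns listed in its index array; split_matvec is a hypothetical helper, not the class's implementation.

```python
import numpy as np

X = np.arange(12, dtype=float).reshape(3, 4)
indices = [np.array([0, 3]), np.array([1, 2])]   # columns covered by each block
matrices = [X[:, idx] for idx in indices]        # stand-ins for the sub-matrices

def split_matvec(matrices, indices, v):
    """Sum the contribution of each block applied to its slice of v."""
    out = np.zeros(matrices[0].shape[0])
    for block, idx in zip(matrices, indices):
        out += block @ v[idx]
    return out

v = np.array([1.0, 2.0, 3.0, 4.0])
result = split_matvec(matrices, indices, v)
```

Each block can use whichever storage format suits its columns, while the combined result matches the product with the full matrix.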
- astype(dtype, order='K', casting='unsafe', copy=True)
Return SplitMatrix cast to new type.
- get_names(type='column', missing_prefix=None, indices=None)
Get column names.
For columns that do not have a name, a default name is created using the following pattern: "{missing_prefix}{start_index + i}", where i is the index of the column.
- Parameters:
type (str {'column'|'term'}) – Whether to get column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
missing_prefix (Optional[str], default None) – Prefix to use for columns that do not have a name. If None, then no default name is created.
indices (list[int] | None) – The indices used for columns that do not have a name. If None, then the indices are list(range(self.shape[1])).
- Returns:
Column names.
- Return type:
list[Optional[str]]
- getcol(i)
Return matrix column at specified index.
- Parameters:
i (int)
- Return type:
ndarray | csr_matrix
- matvec(v, cols=None, out=None)
Perform self[:, cols] @ v[cols].
- Parameters:
v (ndarray)
cols (ndarray | None)
out (ndarray | None)
- Return type:
ndarray
- multiply(other)
Element-wise multiplication.
This assumes that other is a vector of size self.shape[0].
- sandwich(d, rows=None, cols=None)
Perform a sandwich product: X.T @ diag(d) @ X.
- Parameters:
d (ndarray | list)
rows (ndarray | None)
cols (ndarray | None)
- Return type:
ndarray
- set_names(names, type='column')
Set column names.
- Parameters:
names (list[Optional[str]]) – Names to set.
type (str {'column'|'term'}) – Whether to set column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
- toarray()
Return array representation of matrix.
- Return type:
ndarray
- transpose_matvec(v, rows=None, cols=None, out=None)
Perform: self[rows, cols].T @ v[rows].
self.transpose_matvec(v, rows, cols) = self[rows, cols].T @ v[rows]
self.transpose_matvec(v, rows, cols)[i] = sum_{j in rows} self[j, cols[i]] v[j]
= sum_{j in rows} sum_{mat in self.matrices} 1(cols[i] in mat) self[j, cols[i]] v[j]
- Parameters:
v (ndarray | list)
rows (ndarray | None)
cols (ndarray | None)
out (ndarray | None)
- Return type:
ndarray
- class tabmat.StandardizedMatrix(mat, shift, mult=None)
Bases: object
StandardizedMatrix allows for storing a matrix standardized to have columns that have mean zero and standard deviation one without modifying underlying sparse matrices.
To be precise, for a StandardizedMatrix:
self[i, j] = (self.mult[j] * self.mat[i, j]) + self.shift[j]
This class is returned from MatrixBase.standardize.
- Parameters:
mat (MatrixBase)
shift (ndarray | list)
mult (ndarray | list | None)
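The defining formula self[i, j] = mult[j] * mat[i, j] + shift[j] means a matrix-vector product can be computed from the untouched base matrix. A minimal NumPy sketch with illustrative values:

```python
import numpy as np

mat = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]])   # base matrix, left unmodified
mult = np.array([2.0, 0.5])
shift = np.array([-1.0, 3.0])
v = np.array([1.0, 4.0])

# Explicit route: materialize the standardized matrix, then multiply.
explicit = (mat * mult + shift) @ v
# Lazy route: fold mult into v and add the constant contribution shift @ v to every row.
lazy = mat @ (mult * v) + shift @ v
```

The lazy route never forms mat * mult + shift, so a sparse base matrix stays sparse.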
- property A: ndarray
Return array representation of self.
- astype(dtype, order='K', casting='unsafe', copy=True)
Return StandardizedMatrix cast to new type.
- property column_names
Column names of the matrix.
- get_names(type='column', missing_prefix=None, indices=None)
Get column names.
For columns that do not have a name, a default name is created using the following pattern: "{missing_prefix}{start_index + i}", where i is the index of the column.
- Parameters:
type (str {'column'|'term'}) – Whether to get column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
missing_prefix (Optional[str], default None) – Prefix to use for columns that do not have a name. If None, then no default name is created.
indices (list[int] | None) – The indices used for columns that do not have a name. If None, then the indices are list(range(self.shape[1])).
- Returns:
Column names.
- Return type:
list[Optional[str]]
- getcol(i)
Return matrix column at specified index.
Returns a StandardizedMatrix.
>>> from scipy import sparse as sps
>>> from tabmat import SparseMatrix, StandardizedMatrix
>>> x = StandardizedMatrix(SparseMatrix(sps.eye(3).tocsc()), shift=[0, 1, -2])
>>> col_1 = x.getcol(1)
>>> isinstance(col_1, StandardizedMatrix)
True
>>> col_1.toarray()
array([[1.],
       [2.],
       [1.]])
- Parameters:
i (int)
- matvec(other_mat, cols=None, out=None)
Perform self[:, cols] @ other_mat[cols].
This function returns a dense output, so it is best geared for the matrix-vector case.
- Parameters:
other_mat (ndarray | list)
cols (ndarray | None)
out (ndarray | None)
- Return type:
ndarray
- multiply(other)
Element-wise multiplication.
Note that the output of this function is always a DenseMatrix and might require a lot more memory. This assumes that other is a vector of size self.shape[0].
- Return type:
- sandwich(d, rows=None, cols=None)
Perform a sandwich product: X.T @ diag(d) @ X.
- Parameters:
d (ndarray)
rows (ndarray | None)
cols (ndarray | None)
- Return type:
ndarray
- set_names(names, type='column')
Set column names.
- Parameters:
names (list[Optional[str]]) – Names to set.
type (str {'column'|'term'}) – Whether to set column names or term names. The main difference is that a categorical submatrix counts as one term, but can count as multiple columns. Furthermore, matrices created from formulas distinguish between columns and terms (cf. formulaic docs).
- property term_names
Term names of the matrix.
For differences between column names and term names, see get_names.
- toarray()
Return array representation of matrix.
- Return type:
ndarray
- transpose_matvec(other, rows=None, cols=None, out=None)
Perform: self[rows, cols].T @ other[rows].
Let self.shape = (N, K) and other.shape = (N, M). Let shift_mat = outer(ones(N), shift).
(X.T @ other)[k, i] = (X.mat.T @ other)[k, i] + (shift_mat.T @ other)[k, i]
(shift_mat.T @ other)[k, i] = (outer(shift, ones(N)) @ other)[k, i]
= sum_j outer(shift, ones(N))[k, j] other[j, i]
= sum_j shift[k] other[j, i]
= shift[k] other.sum(0)[i]
= outer(shift, other.sum(0))[k, i]
With row and col restrictions:
self.transpose_matvec(other, rows, cols)[i, j]
= self.mat.transpose_matvec(other, rows, cols)[i, j] + (outer(self.shift, ones(N))[rows, cols] @ other[cols])
= self.mat.transpose_matvec(other, rows, cols)[i, j] + shift[cols[i]] other.sum(0)[rows[j]]
- Parameters:
other (ndarray | list)
rows (ndarray | None)
cols (ndarray | None)
out (ndarray | None)
- Return type:
ndarray
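The shift term in the derivation above can be checked numerically for the vector case. The sketch below also includes mult, following the class definition self[i, j] = mult[j] * mat[i, j] + shift[j]; the derivation itself only tracks the shift, so treating mult this way is an assumption. All values are illustrative.

```python
import numpy as np

mat = np.array([[1.0, 0.0], [0.0, 2.0], [3.0, 0.0]])   # base matrix, shape (N, K)
mult = np.array([2.0, 0.5])
shift = np.array([-1.0, 3.0])
other = np.array([1.0, 2.0, 3.0])                      # vector of length N

# Explicit route: materialize the standardized matrix, then multiply.
explicit = (mat * mult + shift).T @ other
# Lazy route: scale the base product, then add shift[k] * sum(other) to entry k.
lazy = mult * (mat.T @ other) + shift * other.sum()
```

The correction term depends on other only through its sum, which is why the base matrix never needs to be shifted.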
- unstandardize()
Get unstandardized (base) matrix.
- Return type: