Model prediction in Autoencoder does not support adjusting batch size #539

Open
rrjing85 opened this issue Jan 1, 2024 · 0 comments

rrjing85 commented Jan 1, 2024

Hi,

I tried the Autoencoder model from PyOD for outlier detection. It is a great way to use a deep learning model without constructing the neural network on our own.

One issue I found is that it does not provide an option for adjusting the batch size at prediction time. This matters when the data size varies a lot, because the batch size can dramatically affect how long prediction takes.

Here is the code I am referring to:
def decision_function(self, X):
    """Predict raw anomaly score of X using the fitted detector.

    The anomaly score of an input sample is computed based on different
    detector algorithms. For consistency, outliers are assigned with
    larger anomaly scores.

    Parameters
    ----------
    X : numpy array of shape (n_samples, n_features)
        The training input samples. Sparse matrices are accepted only
        if they are supported by the base estimator.

    Returns
    -------
    anomaly_scores : numpy array of shape (n_samples,)
        The anomaly score of the input samples.
    """
    check_is_fitted(self, ['model_', 'history_'])
    X = check_array(X)

    if self.preprocessing:
        X_norm = self.scaler_.transform(X)
    else:
        X_norm = np.copy(X)

    # Predict on X and return the reconstruction errors
    pred_scores = self.model_.predict(X_norm)
    return pairwise_distances_no_broadcast(X_norm, pred_scores)

The Keras model's predict method supports batch_size as a parameter. If we could simply change the code from pred_scores = self.model_.predict(X_norm) to pred_scores = self.model_.predict(X_norm, batch_size=32), and modify other parts accordingly, it would solve this problem.

Thanks for all your great work.
