Documentation

AnalyzeClusters

AnalyzeClusters[data] ⟹ clusteringObject

partitions the data points contained in data into distinct similarity groups.

Details

  • Suitable tabular data include any rectangular matrix in the form {{_?NumericQ|_?QuantityQ..}..}.
  • Manual gating entails defining lists of 1D, 2D, or 3D filters. To be included in a partition, a given data point must pass all filters in the corresponding list. Any points not captured by any of the manual gates are excluded from the analysis.
  • Each 1D filter includes an index denoting the data column used for gating, a real-valued threshold, and an indicator denoting whether data points below or above the threshold are included.
  • Each 2D filter includes a pair of indices denoting the two data columns used for gating, a 2D polygon defining the gate, and an indicator denoting whether data points within the polygon are included or excluded.
  • Each 3D filter includes a set of indices denoting the three data columns used for gating, a 3D ellipsoid defining the gate, and an indicator denoting whether data points within the ellipsoid are included or excluded.
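
The gating rules above can be sketched in a few lines. This is an illustrative Python sketch, not ECL code: the helper names (threshold_gate, polygon_gate, ellipsoid_gate, assign_partitions) are hypothetical, column indices are 0-based, the ellipsoid is assumed to be axis-aligned, and overlapping gates resolve to the first matching partition, as the GatesOverlap message describes.

```python
from typing import Callable, List, Optional, Sequence

Point = Sequence[float]
Filter = Callable[[Point], bool]

def threshold_gate(column: int, threshold: float, keep_above: bool) -> Filter:
    """1D filter: keep points above (or below) a threshold in one column."""
    def passes(p: Point) -> bool:
        return p[column] > threshold if keep_above else p[column] < threshold
    return passes

def polygon_gate(columns, vertices, keep_inside: bool = True) -> Filter:
    """2D filter: ray-casting point-in-polygon test on two columns."""
    def passes(p: Point) -> bool:
        x, y = p[columns[0]], p[columns[1]]
        inside = False
        n = len(vertices)
        for i in range(n):
            (x1, y1), (x2, y2) = vertices[i], vertices[(i + 1) % n]
            # Toggle on every polygon edge crossed by a ray cast to the right of the point.
            if (y1 > y) != (y2 > y) and x < x1 + (y - y1) * (x2 - x1) / (y2 - y1):
                inside = not inside
        return inside == keep_inside
    return passes

def ellipsoid_gate(columns, center, radii, keep_inside: bool = True) -> Filter:
    """3D filter: axis-aligned ellipsoid test on three columns (an assumption)."""
    def passes(p: Point) -> bool:
        d = sum(((p[c] - m) / r) ** 2 for c, m, r in zip(columns, center, radii))
        return (d <= 1.0) == keep_inside
    return passes

def assign_partitions(points, gates: List[List[Filter]]) -> List[Optional[int]]:
    """Index of the first gate list whose filters all pass; None means excluded."""
    return [
        next((i for i, filters in enumerate(gates) if all(f(p) for f in filters)), None)
        for p in points
    ]

points = [(1.0, 1.0, 0.0), (5.0, 5.0, 5.0), (9.0, 1.0, 0.0)]
gates = [
    [threshold_gate(0, 3.0, keep_above=False),
     polygon_gate((0, 1), [(0, 0), (2, 0), (2, 2), (0, 2)])],
    [ellipsoid_gate((0, 1, 2), (5, 5, 5), (1, 1, 1))],
]
partitions = assign_partitions(points, gates)
```

Here the first point passes both filters of the first gate list, the second falls inside the ellipsoid, and the third is captured by no gate and is therefore excluded.
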

Messages

  • ClusterAssignmentsLengthMismatch: The specified number of cluster assignments (`1`) does not match the number of partitions generated for the input data (`2`). Please ensure that the number of assignments matches the expected number of partitions. The provided ClusterAssignments will either be truncated or padded with Null to match the number of detected partitions.
  • ClusterLabelsLengthMismatch: The specified number of cluster labels (`1`) does not match the number of partitions generated for the input data (`2`). Please ensure that the number of labels matches the expected number of partitions. The provided ClusterLabels will either be truncated or padded with sequential integers to match the number of detected partitions.
  • DimensionLabelsLengthMismatch: The specified number of dimension labels (`1`) does not match the number of dimensions in the input data (`2`). Please ensure that the number of labels matches the dimensionality of the input data. Defaulting DimensionLabels to None.
  • DimensionUnitsLengthMismatch: The specified DimensionUnits `1` do not match the number of columns in the input data (`2`). Please check that the DimensionUnits are correctly specified. Any units present in the input data will be used for all remaining analysis. If the input data do not contain units, DimensionUnits will be set to None.
  • EmptyManualGates: None of the input data lie within the specified ManualGates. Please specify one or more lists of gates defining each partition.
  • GatesOverlap: The specified ManualGates assign at least one point to multiple distinct partitions. Please ensure that the ManualGates definition assigns each point to at most one partition. Defaulting each point to the first viable partition.
  • Invalid1DGateDefinition: The specified 1D gate `1` is invalid. Please check that the specified gate dimension is valid. This gate definition will be ignored throughout the remaining analysis.
  • Invalid2DGateDefinition: The specified 2D polygonal gate `1` is invalid. Please check that all specified gate dimensions are valid, and that the specified polygon contains at least three non-collinear points lying in a 2D plane. This gate definition will be ignored throughout the remaining analysis.
  • Invalid3DGateDefinition: The specified 3D gate `1` is invalid. Please check that all specified gate dimensions are valid, and that the specified ellipsoid is defined in 3D. This gate definition will be ignored throughout the remaining analysis.
  • InvalidClusteredDimensions: The specified value of ClusteredDimensions (`1`) contains dimensions that are inconsistent with the shape of the input data. Please check that the dimensionality of ClusteredDimensions is correct. All dimensions of the data will be used throughout the remaining analysis.
  • InvalidDomainFunctionDefinition: The specified domain constraint `1` is invalid because it does not return a boolean value when given a single data point as input. Please check that the function accepts single points as input and only returns True or False. This domain constraint will be ignored throughout the remaining analysis.
  • InvalidLogTransformation: Applying the specified Scale (`1`) would result in the log-transformation of negative values. Please check that the choice of Scale is well suited to the input data. Defaulting Scale to Linear for any dimensions of the data that contain negative values.
  • NoIncludedData: The specified gates (`1`) exclude all points in the input data. Please check that the gates are correctly specified, and consider specifying less restrictive gates.
  • NumberOfClustersExceedsDataSize: The specified NumberOfClusters (`1`) exceeds the number of points in the input data (`2`). Please consider specifying a lower value for NumberOfClusters. Defaulting NumberOfClusters to Automatic.
  • NumberOfClustersNotSupported: The Method `1` does not support specifying NumberOfClusters. Please consider setting NumberOfClusters to Automatic so that the number of clusters may be inferred from the input data. If a specific number of clusters is required, you might also consider specifying an alternate Method. Defaulting NumberOfClusters to Automatic.
  • NumberOfClustersRequired: The Method `1` requires that NumberOfClusters also be specified. Please specify NumberOfClusters. If the expected number of clusters is unknown you might also consider specifying an alternate Method. The NumberOfClusters will be estimated using the Spectral clustering algorithm.
  • RedundantDomain: Method is set to Manual but one or more domain constraints were also specified. Please either set Method to Automatic or remove the specified Domain. The specified Domain will be ignored in all subsequent analyses.
  • ScaleDimensionMismatch: The specified Scale (`1`) does not match the number of columns present in the data (`2`). Please check that the Scale is correctly specified, or consider specifying a single value to be applied to all dimensions of the data. Defaulting the value of Scale to Linear.
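
The truncate-or-pad fallback described by ClusterAssignmentsLengthMismatch and ClusterLabelsLengthMismatch can be illustrated as follows. This is a hedged Python sketch with hypothetical helper names; None stands in for Null, and the assumption that padded labels continue from the position of the first missing label is not confirmed by the messages themselves.

```python
from typing import Any, List, Sequence

def fit_cluster_labels(labels: Sequence[Any], n_partitions: int) -> List[Any]:
    """Truncate extra labels, or pad with sequential integers (assumed 1-based)."""
    fitted = list(labels[:n_partitions])
    # Assumption: padding integers pick up at the position of the first missing label.
    fitted.extend(range(len(fitted) + 1, n_partitions + 1))
    return fitted

def fit_cluster_assignments(assignments: Sequence[Any], n_partitions: int) -> List[Any]:
    """Truncate extra assignments, or pad with None (standing in for Null)."""
    fitted = list(assignments[:n_partitions])
    fitted.extend([None] * (n_partitions - len(fitted)))
    return fitted

padded_labels = fit_cluster_labels(["A", "B"], 4)
truncated_labels = fit_cluster_labels(["A", "B", "C"], 2)
padded_assignments = fit_cluster_assignments(["model-1"], 3)
```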

Examples

Basic Examples  (1)

Perform unsupervised clustering of multi-dimensional point data:

Options  (34)

ClusterAssignments  (2)

Assign identity models to each partition:

If no ClusterAssignments are specified, they will default to Null:

ClusterDissimilarityFunction  (1)

Specify the function used to assess the pairwise distance between clusters when ClusteringAlgorithm is set to Agglomerate or SPADE:
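
As a general illustration of what a cluster dissimilarity function computes, the Python sketch below shows three common linkage rules. These are standard textbook definitions, not necessarily the values accepted by ClusterDissimilarityFunction, and the function names are hypothetical.

```python
import math
from itertools import product

def single_linkage(a, b, dist=math.dist):
    """Smallest pairwise distance between members of clusters a and b."""
    return min(dist(p, q) for p, q in product(a, b))

def complete_linkage(a, b, dist=math.dist):
    """Largest pairwise distance between members of clusters a and b."""
    return max(dist(p, q) for p, q in product(a, b))

def average_linkage(a, b, dist=math.dist):
    """Mean of all pairwise distances between members of clusters a and b."""
    return sum(dist(p, q) for p, q in product(a, b)) / (len(a) * len(b))

cluster_a = [(0.0, 0.0), (1.0, 0.0)]
cluster_b = [(3.0, 0.0), (5.0, 0.0)]
```

For these two clusters the pairwise distances are 3, 5, 2, and 4, so the three rules give 2, 5, and 3.5 respectively.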

ClusterDomainOutliers  (1)

Assign each point excluded by the specified Domain to the nearest identified partition:
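
One plausible reading of "nearest identified partition" is the partition whose centroid is closest to the excluded point. The Python sketch below assumes that reading and Euclidean distance; both are assumptions, and the helper names are hypothetical.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))

def assign_outliers(clusters, outliers):
    """Map each excluded point to the index of the cluster with the closest centroid."""
    centers = [centroid(c) for c in clusters]
    return [
        min(range(len(centers)), key=lambda i: math.dist(p, centers[i]))
        for p in outliers
    ]

clusters = [[(0.0, 0.0), (2.0, 0.0)], [(10.0, 10.0), (12.0, 10.0)]]
outlier_partitions = assign_outliers(clusters, [(1.0, 3.0), (9.0, 9.0)])
```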

ClusteredDimensions  (1)

Specify which dimensions are used to automatically partition the input data:
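
Conceptually, restricting the clustered dimensions amounts to selecting columns before clustering. The Python sketch below uses a hypothetical helper with 0-based indices and mirrors the fallback described by InvalidClusteredDimensions, where all dimensions are used if any requested dimension does not exist.

```python
def select_dimensions(data, dims=None):
    """Keep only the listed columns; fall back to all columns if any index is invalid."""
    n_cols = len(data[0])
    if dims is None or any(not 0 <= d < n_cols for d in dims):
        dims = range(n_cols)  # fallback: use every dimension of the data
    return [[row[d] for d in dims] for row in data]

data = [[1, 2, 3], [4, 5, 6]]
subset = select_dimensions(data, [0, 2])
fallback = select_dimensions(data, [0, 5])  # column 5 does not exist
```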

ClusteringAlgorithm  (1)

Specify which strategy is used to automatically partition the input data when Method is set to Automatic:

ClusterLabels  (1)

Specify a name for each partition:

CovarianceType  (1)

Specify the form of the covariance matrix when ClusteringAlgorithm is set to GaussianMixture:

CriterionFunction  (1)

Specify the evaluation metric used to select an appropriate clustering algorithm during automated clustering:

DensityResamplingThreshold  (1)

Specify the maximum local density at which a data point is subject to density-dependent resampling when ClusteringAlgorithm is set to SPADE:

DimensionLabels  (1)

Specify a name for each dimension of the input data:

DimensionUnits  (1)

Specify units for each dimension of the input data:

DistanceFunction  (1)

Specify the function used to evaluate the similarity of two data points during automated clustering:
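
To show how the choice of distance function changes the measured similarity of the same two points, here is a small Python sketch of three common metrics. The names are generic and hypothetical, and are not claimed to be the values DistanceFunction accepts.

```python
def euclidean_distance(p, q):
    """Straight-line distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q)) ** 0.5

def manhattan_distance(p, q):
    """Sum of absolute coordinate differences."""
    return sum(abs(a - b) for a, b in zip(p, q))

def chebyshev_distance(p, q):
    """Largest absolute coordinate difference."""
    return max(abs(a - b) for a, b in zip(p, q))

p, q = (0.0, 0.0), (3.0, 4.0)
```

For these points the three metrics give 5, 7, and 4 respectively.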

Domain  (3)

Specify which input data points are used to partition the input data by defining one or more 2D polygonal gates:

Specify which input data points are used to partition the input data by defining one or more 3D gates:

Specify which input data points are used to partition the input data by defining one or more constraint functions:
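
The behavior of constraint functions, including the InvalidDomainFunctionDefinition fallback, can be sketched in Python as follows. The helper name is hypothetical, and checking every point (rather than a single probe point) for a boolean return value is an assumption of this sketch.

```python
def apply_domain(points, constraints):
    """Keep points that satisfy every valid constraint.

    A constraint is treated as invalid, and ignored, if it returns anything
    other than True or False for some input point.
    """
    valid = [
        c for c in constraints
        if all(isinstance(c(p), bool) for p in points)
    ]
    return [p for p in points if all(c(p) for c in valid)]

points = [(1.0, 2.0), (3.0, 4.0), (5.0, -1.0)]
constraints = [
    lambda p: p[1] > 0,  # valid: returns True or False
    lambda p: p[0],      # invalid: returns a number, so it is ignored
]
included = apply_domain(points, constraints)
```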

ManualGates  (3)

Manually partition data using 1D threshold gates:

Manually partition data using 2D polygonal gates:

Manually partition data using 3D ellipsoidal gates:

MaxEdgeLength  (1)

Specify the maximum allowable distance between adjacent clusters when ClusteringAlgorithm is set to SpanningTree:

MaxIterations  (1)

Specify the maximum allowable number of expectation-maximization iterations used to fit a mixture model when ClusteringAlgorithm is set to GaussianMixture:

Method  (1)

Use an automated clustering algorithm to partition the input data:

NeighborhoodRadius  (1)

Specify the separation distance below which two data points are considered neighbors when ClusteringAlgorithm is set to DBSCAN, MeanShift, or Spectral:

NeighborsNumber  (1)

Specify the minimum number of adjacent points required for a data point to be considered a core point when ClusteringAlgorithm is set to DBSCAN:
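
The interplay between a neighborhood radius and a minimum neighbor count in DBSCAN can be sketched in Python. This brute-force helper is hypothetical; whether a point counts itself as a neighbor and how ties at exactly the radius are treated vary between implementations, so both choices here are assumptions.

```python
import math

def core_points(points, radius, min_neighbors):
    """Indices of points with at least min_neighbors other points closer than radius."""
    cores = []
    for i, p in enumerate(points):
        # Assumption: a point is not its own neighbor, and the comparison is strict.
        neighbors = sum(
            1 for j, q in enumerate(points)
            if j != i and math.dist(p, q) < radius
        )
        if neighbors >= min_neighbors:
            cores.append(i)
    return cores

points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (10.0, 10.0)]
cores = core_points(points, radius=1.5, min_neighbors=2)
```

The three points near the origin are mutually within 1.5 units and qualify as core points, while the isolated point at (10, 10) does not.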

Normalize  (1)

Transform each dimension of the input data to a 0 to 1 interval:
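
A common way to map each dimension onto the 0 to 1 interval is min-max rescaling. The Python sketch below assumes that approach; the helper name is hypothetical, and the handling of constant columns is a choice made for this sketch only.

```python
def normalize_columns(data):
    """Min-max rescale each column to [0, 1]; constant columns map to 0.0."""
    scaled_columns = []
    for column in zip(*data):
        lo, hi = min(column), max(column)
        span = hi - lo
        scaled_columns.append([(v - lo) / span if span else 0.0 for v in column])
    # Transpose back to row-major order.
    return [list(row) for row in zip(*scaled_columns)]

normalized = normalize_columns([[0, 10], [5, 20], [10, 30]])
```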

NumberOfClusters  (1)

Specify the desired number of partitions to be detected during automatic clustering:

OutlierDensityQuantile  (1)

Specify the quantile of local densities below which a given data point should be considered an outlier when ClusteringAlgorithm is set to SPADE:

Output  (2)

Return an interactive preview of the analysis result:

Return the entire set of resolved option values:

PerformanceGoal  (1)

Specify the PerformanceGoal used to select an automated clustering method:

Scale  (2)

Log-transform all of the input data:

Apply Log and Reciprocal transformations to specific dimensions of the input data:
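
Per-dimension scaling, together with the Linear fallback described by InvalidLogTransformation, can be sketched in Python. The helper name is hypothetical. The base-10 logarithm is an assumption, as is extending the fallback to columns containing zero (the message mentions only negative values, but the logarithm is undefined at zero as well).

```python
import math

def apply_scale(data, scales):
    """Apply 'Linear', 'Log', or 'Reciprocal' to each column of data."""
    transformed = []
    for column, scale in zip(zip(*data), scales):
        column = list(column)
        if scale == "Log" and min(column) <= 0:
            scale = "Linear"  # fallback for columns that cannot be log-transformed
        if scale == "Log":
            column = [math.log10(v) for v in column]  # assumption: base 10
        elif scale == "Reciprocal":
            column = [1.0 / v for v in column]
        transformed.append(column)
    # Transpose back to row-major order.
    return [list(row) for row in zip(*transformed)]

scaled = apply_scale([[1, 2, -1], [100, 4, 1]], ["Log", "Reciprocal", "Log"])
```

The third column contains a negative value, so it is left on a linear scale while the first two columns are transformed.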

TargetDensityQuantile  (1)

Specify the quantile of local densities to which the density of data points should be resampled when ClusteringAlgorithm is set to SPADE:

Upload  (1)

Upload the resultant Object[Analysis,Clusters] object to Constellation:

Messages  (18)

ClusterAssignmentsLengthMismatch  (1)

Issue a warning if the number of specified ClusterAssignments does not match the number of identified clusters:

ClusterLabelsLengthMismatch  (1)

Issue a warning if the specified ClusterLabels do not match the number of identified clusters:

DimensionLabelsLengthMismatch  (1)

Issue a warning if the number of specified DimensionLabels does not match the number of dimensions in the input data:

DimensionUnitsLengthMismatch  (1)

Issue a warning if the specified DimensionUnits do not match the number of columns in the input data:

EmptyManualGates  (1)

Issue an error if none of the input data lie within the specified ManualGates:

GatesOverlap  (1)

Issue a warning if the specified ManualGates assign any of the input data points to multiple distinct partitions:

Invalid1DGateDefinition  (1)

Issue a warning if an invalid 1D gate is specified:

Invalid2DGateDefinition  (1)

Issue a warning if an invalid 2D gate is specified:

Invalid3DGateDefinition  (1)

Issue a warning if an invalid 3D gate is specified:

InvalidClusteredDimensions  (1)

Issue a warning if the specified ClusteredDimensions are inconsistent with the dimensionality of the input data:

InvalidDomainFunctionDefinition  (1)

Issue a warning if a pure function constraint in the specified Domain does not return a boolean value when given an input data point:

InvalidLogTransformation  (1)

Issue a warning if the specified Scale entails log-transforming a negative value:

NoIncludedData  (1)

Issue an error if the specified Domain excludes all points in the input data:

NumberOfClustersExceedsDataSize  (1)

Issue a warning if NumberOfClusters exceeds the number of points to be clustered:

NumberOfClustersNotSupported  (1)

Issue a warning if specifying NumberOfClusters is not supported for the selected ClusteringAlgorithm:

NumberOfClustersRequired  (1)

Issue a warning if NumberOfClusters was not specified but the specified ClusteringAlgorithm requires specifying the desired number of clusters:

RedundantDomain  (1)

Issue a warning if Method is set to Manual but a Domain is also specified:

ScaleDimensionMismatch  (1)

Issue a warning if the specified Scale does not match the dimensionality of the input data: