Welcome to the gower-metric documentation!
Gower-metric is a Python library for calculating distances between observations described by mixed-type variables. The distance is derived as the complement of Gower’s similarity coefficient.
Main features include:
Support for mixed data types (categorical, numerical, ordinal, binary)
Support for Podani’s extension of Gower’s coefficient to ordinal variables
Efficient computation using NumPy
Numerically friendly thanks to the transform call
Easy integration with pandas DataFrames
Customizable weighting for different variable types
MIT License
Note
This project is under active development. If you would like to contribute, or if you find any issues, please visit the main [repository]() and submit a pull request or open an issue.
Installation
The easiest way to install the gower_metric package is via pip:
pip install gower-metric
Quick start
To use the class that calculates Gower’s metric, import it as follows:
from gower_metric import Gower
Next, define the dataset and the feature-type dictionary, then create a Gower instance:
data = [[1, 'a', 3.5], [2, 'b', 4.0], [3, 'a', 2.5], [4, 'c', 5.0]]
feature_types = {
    0: "ratio_scale_interval",
    1: "categorical_nominal",
    2: "ratio_scale_interval",
}
gower = Gower(feature_types=feature_types)
Finally, fit the data and calculate Gower’s distance between the first and second rows:
gower.fit(data)
distance = gower(data[0], data[1])
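As a rough cross-check, the classic equal-weight Gower distance between these two rows can be worked out by hand. The sketch below is not the library's implementation: the helper name `classic_gower` is hypothetical, and it assumes range-normalised absolute differences for the numeric columns and a simple mismatch for the categorical column. The value returned by the library may differ if it applies other weights or scaling.

```python
data = [[1, 'a', 3.5], [2, 'b', 4.0], [3, 'a', 2.5], [4, 'c', 5.0]]
numeric_columns = {0, 2}  # assumption: columns 0 and 2 are treated as numeric


def classic_gower(x, y, rows, numeric_columns):
    """Equal-weight Gower distance between two rows (illustrative only)."""
    total = 0.0
    for k in range(len(x)):
        if k in numeric_columns:
            column = [row[k] for row in rows]
            value_range = max(column) - min(column)
            # Range-normalised absolute difference; 0 if the column is constant.
            total += abs(x[k] - y[k]) / value_range if value_range else 0.0
        else:
            # Simple mismatch for categorical values.
            total += 0.0 if x[k] == y[k] else 1.0
    return total / len(x)


hand_distance = classic_gower(data[0], data[1], data, numeric_columns)
print(round(hand_distance, 4))  # 0.5111
```

The three per-column contributions are 1/3 (column 0, range 3), 1 (different categories), and 0.2 (column 2, range 2.5), and their mean is about 0.5111.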
Tip
To calculate pairwise distances for the entire dataset, you can loop over the row pairs manually or use an auxiliary function such as scipy.spatial.distance.pdist or sklearn.metrics.pairwise_distances.
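The manual route might look like the sketch below. The helper `pairwise_matrix` is a hypothetical name, not part of gower-metric, and the stand-in metric `mismatch_fraction` exists only so the example runs without the library installed.

```python
from itertools import combinations


def pairwise_matrix(rows, metric):
    """Build a symmetric distance matrix by calling metric(x, y) once per pair."""
    n = len(rows)
    matrix = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        matrix[i][j] = matrix[j][i] = metric(rows[i], rows[j])
    return matrix


def mismatch_fraction(x, y):
    """Stand-in metric (not Gower): fraction of positions that differ."""
    return sum(a != b for a, b in zip(x, y)) / len(x)


data = [[1, 'a', 3.5], [2, 'b', 4.0], [3, 'a', 2.5], [4, 'c', 5.0]]
matrix = pairwise_matrix(data, mismatch_fraction)
```

With the quick-start objects, `pairwise_matrix(data, gower)` would apply the fitted instance to every pair of rows, assuming it is callable on two rows as in the quick-start example.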
Important
Make sure the feature-type dictionary is correct. Its keys must correspond to the column indices of your dataset, and its values must accurately describe the type of data in each column. This ensures that Gower’s metric is calculated correctly for each feature.
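A small sanity check before fitting can catch mismatched keys early. The function `check_feature_types` below is a hypothetical helper written for illustration, not part of gower-metric. It only verifies that the keys cover exactly the column indices; it cannot tell whether each declared type is the right one for its column.

```python
def check_feature_types(rows, feature_types):
    """Raise ValueError if the keys do not match the column indices of rows."""
    expected = set(range(len(rows[0])))
    missing = sorted(expected - set(feature_types))
    # key=str keeps sorting safe if someone mixes integer and string keys.
    unknown = sorted(set(feature_types) - expected, key=str)
    if missing or unknown:
        raise ValueError(
            f"missing column indices: {missing}, unknown keys: {unknown}"
        )


data = [[1, 'a', 3.5], [2, 'b', 4.0], [3, 'a', 2.5], [4, 'c', 5.0]]
feature_types = {
    0: "ratio_scale_interval",
    1: "categorical_nominal",
    2: "ratio_scale_interval",
}
check_feature_types(data, feature_types)  # passes silently
```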