API Reference

KneeLocator

The primary class for knee/elbow point detection.

kneed.knee_locator.KneeLocator

Bases: object

Once instantiated, this class attempts to find the point of maximum curvature on a line. The knee is accessible via the .knee attribute.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| x | array-like | x values, must be the same length as y. | required |
| y | array-like | y values, must be the same length as x. | required |
| S | float | Sensitivity. Each local maximum of the difference curve gets a threshold equal to that maximum minus S times the mean spacing of the normalized x values; a knee is called once the curve drops below it. The original paper suggests a default of 1.0. | 1.0 |
| curve | str | If "concave", the algorithm will detect knees. If "convex", it will detect elbows. | "concave" |
| direction | str | One of {"increasing", "decreasing"}. | "increasing" |
| interp_method | str | One of {"interp1d", "polynomial"}. | "interp1d" |
| online | bool | If True, kneed will correct old knee points as it traverses the curve. If False, it returns the first knee found. | False |
| polynomial_degree | int | The degree of the fitting polynomial. Only used when interp_method="polynomial". Passed to numpy.polyfit as the deg parameter. | 7 |

Attributes:

| Name | Type | Description |
| --- | --- | --- |
| x | ndarray | x values. |
| y | ndarray | y values. |
| S | float | Sensitivity, original paper suggests default of 1.0. |
| curve | str | If "concave", the algorithm detects knees. If "convex", it detects elbows. |
| direction | str | One of {"increasing", "decreasing"}. |
| interp_method | str | One of {"interp1d", "polynomial"}. |
| online | bool | If True, corrects old knee points. If False, returns first knee. |
| polynomial_degree | int | The degree of the fitting polynomial. |
| N | int | The number of x values in the input data. |
| Ds_y | ndarray | The y values from the fitted spline. |
| x_normalized | ndarray | The normalized x values. |
| y_normalized | ndarray | The normalized y values. |
| x_difference | ndarray | The x values of the difference curve. |
| y_difference | ndarray | The y values of the difference curve. |
| maxima_indices | ndarray | The indices of each of the maxima on the difference curve. |
| x_difference_maxima | ndarray | The x values from the difference curve where the local maxima are located. |
| y_difference_maxima | ndarray | The y values from the difference curve where the local maxima are located. |
| minima_indices | ndarray | The indices of each of the minima on the difference curve. |
| x_difference_minima | ndarray | The x values from the difference curve where the local minima are located. |
| y_difference_minima | ndarray | The y values from the difference curve where the local minima are located. |
| Tmx | ndarray | The threshold values on the difference curve for determining the knee point. |
| knee | float or None | The x value of the knee point. None if no knee/elbow was detected. |
| knee_y | float or None | The y value of the knee point. None if no knee/elbow was detected. |
| norm_knee | float or None | The normalized x value of the knee point. |
| norm_knee_y | float or None | The normalized y value of the knee point. |
| all_knees | set | All the x values of the identified knee points. |
| all_norm_knees | set | All the normalized x values of the identified knee points. |
| all_knees_y | list | All the y values of the identified knee points. |
| all_norm_knees_y | list | All the normalized y values of the identified knee points. |
| elbow | float or None | Alias for knee. |
| elbow_y | float or None | Alias for knee_y. |
| norm_elbow | float or None | Alias for norm_knee. |
| norm_elbow_y | float or None | Alias for norm_knee_y. |
| all_elbows | set | Alias for all_knees. |
| all_elbows_y | list | Alias for all_knees_y. |
| all_norm_elbows | set | Alias for all_norm_knees. |
| all_norm_elbows_y | list | Alias for all_norm_knees_y. |

Source code in kneed/knee_locator.py
class KneeLocator(object):
    """Once instantiated, this class attempts to find the point of maximum
    curvature on a line. The knee is accessible via the ``.knee`` attribute.

    Parameters
    ----------
    x : array-like
        x values, must be the same length as y.
    y : array-like
        y values, must be the same length as x.
    S : float, default 1.0
        Sensitivity. Each local maximum of the difference curve gets a
        threshold equal to that maximum minus S times the mean spacing of
        the normalized x values; a knee is called once the curve drops
        below it. The original paper suggests a default of 1.0.
    curve : str, default "concave"
        If ``"concave"``, the algorithm will detect knees. If ``"convex"``,
        it will detect elbows.
    direction : str, default "increasing"
        One of ``{"increasing", "decreasing"}``.
    interp_method : str, default "interp1d"
        One of ``{"interp1d", "polynomial"}``.
    online : bool, default False
        If True, kneed will correct old knee points as it traverses the
        curve. If False, it returns the first knee found.
    polynomial_degree : int, default 7
        The degree of the fitting polynomial. Only used when
        ``interp_method="polynomial"``. Passed to ``numpy.polyfit`` as
        the ``deg`` parameter.

    Attributes
    ----------
    x : numpy.ndarray
        x values.
    y : numpy.ndarray
        y values.
    S : float
        Sensitivity, original paper suggests default of 1.0.
    curve : str
        If ``"concave"``, the algorithm detects knees. If ``"convex"``,
        it detects elbows.
    direction : str
        One of ``{"increasing", "decreasing"}``.
    interp_method : str
        One of ``{"interp1d", "polynomial"}``.
    online : bool
        If True, corrects old knee points. If False, returns first knee.
    polynomial_degree : int
        The degree of the fitting polynomial.
    N : int
        The number of ``x`` values in the input data.
    Ds_y : numpy.ndarray
        The y values from the fitted spline.
    x_normalized : numpy.ndarray
        The normalized x values.
    y_normalized : numpy.ndarray
        The normalized y values.
    x_difference : numpy.ndarray
        The x values of the difference curve.
    y_difference : numpy.ndarray
        The y values of the difference curve.
    maxima_indices : numpy.ndarray
        The indices of each of the maxima on the difference curve.
    x_difference_maxima : numpy.ndarray
        The x values from the difference curve where the local maxima
        are located.
    y_difference_maxima : numpy.ndarray
        The y values from the difference curve where the local maxima
        are located.
    minima_indices : numpy.ndarray
        The indices of each of the minima on the difference curve.
    x_difference_minima : numpy.ndarray
        The x values from the difference curve where the local minima
        are located.
    y_difference_minima : numpy.ndarray
        The y values from the difference curve where the local minima
        are located.
    Tmx : numpy.ndarray
        The threshold values on the difference curve for determining the
        knee point.
    knee : float or None
        The x value of the knee point. None if no knee/elbow was detected.
    knee_y : float or None
        The y value of the knee point. None if no knee/elbow was detected.
    norm_knee : float or None
        The normalized x value of the knee point.
    norm_knee_y : float or None
        The normalized y value of the knee point.
    all_knees : set
        All the x values of the identified knee points.
    all_norm_knees : set
        All the normalized x values of the identified knee points.
    all_knees_y : list
        All the y values of the identified knee points.
    all_norm_knees_y : list
        All the normalized y values of the identified knee points.
    elbow : float or None
        Alias for ``knee``.
    elbow_y : float or None
        Alias for ``knee_y``.
    norm_elbow : float or None
        Alias for ``norm_knee``.
    norm_elbow_y : float or None
        Alias for ``norm_knee_y``.
    all_elbows : set
        Alias for ``all_knees``.
    all_elbows_y : list
        Alias for ``all_knees_y``.
    all_norm_elbows : set
        Alias for ``all_norm_knees``.
    all_norm_elbows_y : list
        Alias for ``all_norm_knees_y``.
    """

    def __init__(
        self,
        x: Iterable[float],
        y: Iterable[float],
        S: float = 1.0,
        curve: str = "concave",
        direction: str = "increasing",
        interp_method: str = "interp1d",
        online: bool = False,
        polynomial_degree: int = 7,
    ):
        # Step 0: Raw Input
        self.x = np.array(x)
        self.y = np.array(y)
        self.curve = curve
        self.direction = direction
        self.N = len(self.x)
        self.S = S
        self.all_knees = set()
        self.all_norm_knees = set()
        self.all_knees_y = []
        self.all_norm_knees_y = []
        self.online = online
        self.polynomial_degree = polynomial_degree

        # Look Before You Leap (LBYL) validation of the curve and direction
        # arguments. LBYL is not the preferred style in Python, but without
        # it the branch in find_knee() that runs once y_difference[j] falls
        # below the threshold could be evaluated improperly when curve is
        # not "convex" or "concave", or direction is not "increasing" or
        # "decreasing".
        valid_curve = self.curve in VALID_CURVE
        valid_direction = self.direction in VALID_DIRECTION
        if not all((valid_curve, valid_direction)):
            raise ValueError(
                "Please check that the curve and direction arguments are valid."
            )

        # Step 1: fit a smooth line
        if interp_method == "interp1d":
            uspline = interpolate.interp1d(self.x, self.y)
            self.Ds_y = uspline(self.x)
        elif interp_method == "polynomial":
            p = np.poly1d(np.polyfit(x, y, self.polynomial_degree))
            self.Ds_y = p(x)
        else:
            raise ValueError(
                "{} is an invalid interp_method parameter, use either 'interp1d' or 'polynomial'".format(
                    interp_method
                )
            )

        # Step 2: normalize values
        self.x_normalized = self.__normalize(self.x)
        self.y_normalized = self.__normalize(self.Ds_y)

        # Step 3: Calculate the Difference curve
        self.y_normalized = self.transform_y(
            self.y_normalized, self.direction, self.curve
        )
        # normalized difference curve
        self.y_difference = self.y_normalized - self.x_normalized
        self.x_difference = self.x_normalized.copy()

        # Step 4: Identify local maxima/minima
        # local maxima
        self.maxima_indices = argrelextrema(self.y_difference, np.greater_equal)[0]
        self.x_difference_maxima = self.x_difference[self.maxima_indices]
        self.y_difference_maxima = self.y_difference[self.maxima_indices]

        # local minima
        self.minima_indices = argrelextrema(self.y_difference, np.less_equal)[0]
        self.x_difference_minima = self.x_difference[self.minima_indices]
        self.y_difference_minima = self.y_difference[self.minima_indices]

        # Step 5: Calculate thresholds
        self.Tmx = self.y_difference_maxima - (
            self.S * np.abs(np.diff(self.x_normalized).mean())
        )

        # Step 6: find knee
        self.knee, self.norm_knee = self.find_knee()

        # Step 7: If we have a knee, extract data about it
        self.knee_y = self.norm_knee_y = None
        if self.knee is not None:  # a knee at x == 0 is falsy but still valid
            self.knee_y = self.y[self.x == self.knee][0]
            self.norm_knee_y = self.y_normalized[self.x_normalized == self.norm_knee][0]

    @staticmethod
    def __normalize(a: Iterable[float]) -> Iterable[float]:
        """Normalize an array to [0, 1].

        Parameters
        ----------
        a : array-like
            The array to normalize.

        Returns
        -------
        numpy.ndarray
            The normalized array.
        """
        return (a - min(a)) / (max(a) - min(a))

    @staticmethod
    def transform_y(y: Iterable[float], direction: str, curve: str) -> np.ndarray:
        """Transform y to concave, increasing based on given direction and curve.

        Parameters
        ----------
        y : array-like
            The y values to transform.
        direction : str
            One of ``{"increasing", "decreasing"}``.
        curve : str
            One of ``{"concave", "convex"}``.

        Returns
        -------
        numpy.ndarray
            The transformed y values.
        """
        # convert elbows to knees
        if direction == "decreasing":
            if curve == "concave":
                y = np.flip(y)
            elif curve == "convex":
                y = y.max() - y
        elif direction == "increasing" and curve == "convex":
            y = np.flip(y.max() - y)

        return y

    def find_knee(
        self,
    ):
        """Identify the knee value and set the instance attributes.

        This method is called automatically when ``KneeLocator`` is
        instantiated.

        Returns
        -------
        tuple
            ``(knee, norm_knee)`` where each is a float or None.
        """
        if not self.maxima_indices.size:
            # No local maxima found in the difference curve
            # The line is probably not polynomial, try plotting
            # the difference curve with plt.plot(knee.x_difference, knee.y_difference)
            # Also check that you aren't mistakenly setting the curve argument
            return None, None
        # placeholder for which threshold region i is located in.
        maxima_threshold_index = 0
        minima_threshold_index = 0
        traversed_maxima = False
        # State flag to control knee detection after local minima
        detection_active = True
        # traverse the difference curve
        for i, x in enumerate(self.x_difference):
            # skip points on the curve before the first local maximum
            if i < self.maxima_indices[0]:
                continue

            j = i + 1

            # reached the end of the curve
            if i == (len(self.x_difference) - 1):
                break

            # if we're at a local max, take its threshold and advance the maxima threshold index
            if (self.maxima_indices == i).any():
                threshold = self.Tmx[maxima_threshold_index]
                threshold_index = i
                maxima_threshold_index += 1
                # Reactivate detection when we reach a local maximum
                detection_active = True
            # values in difference curve are at or after a local minimum
            if (self.minima_indices == i).any():
                threshold = 0.0
                minima_threshold_index += 1
                # Deactivate detection after a local minimum until next local maximum
                detection_active = False

            if detection_active and self.y_difference[j] < threshold:
                if self.curve == "convex":
                    if self.direction == "decreasing":
                        knee = self.x[threshold_index]
                        norm_knee = self.x_normalized[threshold_index]
                    else:
                        knee = self.x[-(threshold_index + 1)]
                        norm_knee = self.x_normalized[threshold_index]

                elif self.curve == "concave":
                    if self.direction == "decreasing":
                        knee = self.x[-(threshold_index + 1)]
                        norm_knee = self.x_normalized[threshold_index]
                    else:
                        knee = self.x[threshold_index]
                        norm_knee = self.x_normalized[threshold_index]

                # add the y value at the knee
                y_at_knee = self.y[self.x == knee][0]
                y_norm_at_knee = self.y_normalized[self.x_normalized == norm_knee][0]
                if knee not in self.all_knees:
                    self.all_knees_y.append(y_at_knee)
                    self.all_norm_knees_y.append(y_norm_at_knee)

                # now add the knee
                self.all_knees.add(knee)
                self.all_norm_knees.add(norm_knee)

                # if detecting in offline mode, return the first knee found
                if self.online is False:
                    return knee, norm_knee

        if self.all_knees == set():
            # No knee was found
            return None, None

        return knee, norm_knee

    def plot_knee_normalized(
        self,
        figsize: Optional[Tuple[int, int]] = None,
        title: str = "Normalized Knee Point",
        xlabel: Optional[str] = None,
        ylabel: Optional[str] = None,
    ):
        """Plot the normalized curve, the difference curve, and the knee.

        Parameters
        ----------
        figsize : tuple of int, optional
            The figure size of the plot, e.g. ``(12, 8)``.
        title : str, default "Normalized Knee Point"
            Title of the visualization.
        xlabel : str, optional
            X-axis label.
        ylabel : str, optional
            Y-axis label.
        """
        if not _has_matplotlib:
            raise _matplotlib_not_found_err

        if figsize is None:
            figsize = (6, 6)

        plt.figure(figsize=figsize)
        plt.title(title)
        if xlabel:
            plt.xlabel(xlabel)
        if ylabel:
            plt.ylabel(ylabel)
        plt.plot(self.x_normalized, self.y_normalized, "b", label="normalized curve")
        plt.plot(self.x_difference, self.y_difference, "r", label="difference curve")
        plt.xticks(
            np.arange(self.x_normalized.min(), self.x_normalized.max() + 0.1, 0.1)
        )
        plt.yticks(
            np.arange(self.y_difference.min(), self.y_normalized.max() + 0.1, 0.1)
        )

        plt.vlines(
            self.norm_knee,
            plt.ylim()[0],
            plt.ylim()[1],
            linestyles="--",
            label="knee/elbow",
        )
        plt.legend(loc="best")

    def plot_knee(
        self,
        figsize: Optional[Tuple[int, int]] = None,
        title: str = "Knee Point",
        xlabel: Optional[str] = None,
        ylabel: Optional[str] = None,
    ):
        """Plot the curve and the knee, if it exists.

        Parameters
        ----------
        figsize : tuple of int, optional
            The figure size of the plot, e.g. ``(12, 8)``.
        title : str, default "Knee Point"
            Title of the visualization.
        xlabel : str, optional
            X-axis label.
        ylabel : str, optional
            Y-axis label.
        """
        if not _has_matplotlib:
            raise _matplotlib_not_found_err

        if figsize is None:
            figsize = (6, 6)

        plt.figure(figsize=figsize)
        plt.title(title)
        if xlabel:
            plt.xlabel(xlabel)
        if ylabel:
            plt.ylabel(ylabel)
        plt.plot(self.x, self.y, "b", label="data")
        plt.vlines(
            self.knee, plt.ylim()[0], plt.ylim()[1], linestyles="--", label="knee/elbow"
        )
        plt.legend(loc="best")

    # Niceties for users working with elbows rather than knees
    @property
    def elbow(self):
        return self.knee

    @property
    def norm_elbow(self):
        return self.norm_knee

    @property
    def elbow_y(self):
        return self.knee_y

    @property
    def norm_elbow_y(self):
        return self.norm_knee_y

    @property
    def all_elbows(self):
        return self.all_knees

    @property
    def all_norm_elbows(self):
        return self.all_norm_knees

    @property
    def all_elbows_y(self):
        return self.all_knees_y

    @property
    def all_norm_elbows_y(self):
        return self.all_norm_knees_y

__normalize(a) staticmethod

Normalize an array to [0, 1].

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| a | array-like | The array to normalize. | required |

Returns:

| Type | Description |
| --- | --- |
| ndarray | The normalized array. |
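
A standalone sketch of min-max normalization, assuming NumPy. The function name `normalize` and the guard against constant input are illustrative additions; the library method shown in the listing applies the formula without such a check, so a constant array would divide by zero there.

```python
import numpy as np


def normalize(a):
    """Min-max scale `a` to [0, 1]; reject constant input (hypothetical guard)."""
    a = np.asarray(a, dtype=float)
    span = a.max() - a.min()
    if span == 0:
        raise ValueError("cannot normalize a constant array")
    return (a - a.min()) / span


scaled = normalize([2, 4, 6, 10])
print(scaled)
```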

Source code in kneed/knee_locator.py
@staticmethod
def __normalize(a: Iterable[float]) -> Iterable[float]:
    """Normalize an array to [0, 1].

    Parameters
    ----------
    a : array-like
        The array to normalize.

    Returns
    -------
    numpy.ndarray
        The normalized array.
    """
    return (a - min(a)) / (max(a) - min(a))

transform_y(y, direction, curve) staticmethod

Transform y to concave, increasing based on given direction and curve.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| y | array-like | The y values to transform. | required |
| direction | str | One of {"increasing", "decreasing"}. | required |
| curve | str | One of {"concave", "convex"}. | required |

Returns:

| Type | Description |
| --- | --- |
| ndarray | The transformed y values. |
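
The sketch below copies the branching from the source listing into a standalone function and feeds it three small, already normalized curves. The sample arrays are assumptions for illustration. It suggests that each non-default combination of `direction` and `curve` is mapped onto the same concave, increasing shape.

```python
import numpy as np


def transform_y(y, direction, curve):
    """Standalone copy of the branching shown in the source listing."""
    if direction == "decreasing":
        if curve == "concave":
            y = np.flip(y)
        elif curve == "convex":
            y = y.max() - y
    elif direction == "increasing" and curve == "convex":
        y = np.flip(y.max() - y)
    return y


# The concave, increasing shape every input should end up as.
target = np.array([0.0, 0.6, 0.9, 1.0])

concave_dec = transform_y(np.array([1.0, 0.9, 0.6, 0.0]), "decreasing", "concave")
convex_dec = transform_y(np.array([1.0, 0.4, 0.1, 0.0]), "decreasing", "convex")
convex_inc = transform_y(np.array([0.0, 0.1, 0.4, 1.0]), "increasing", "convex")

for name, result in [
    ("concave/decreasing", concave_dec),
    ("convex/decreasing", convex_dec),
    ("convex/increasing", convex_inc),
]:
    print(name, np.allclose(result, target))
```

Because two of the branches flip the array, an index found on the transformed curve has to be mapped back from the end of `x`, which is what the `-(threshold_index + 1)` indexing in `find_knee()` does.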

Source code in kneed/knee_locator.py
@staticmethod
def transform_y(y: Iterable[float], direction: str, curve: str) -> np.ndarray:
    """Transform y to concave, increasing based on given direction and curve.

    Parameters
    ----------
    y : array-like
        The y values to transform.
    direction : str
        One of ``{"increasing", "decreasing"}``.
    curve : str
        One of ``{"concave", "convex"}``.

    Returns
    -------
    numpy.ndarray
        The transformed y values.
    """
    # convert elbows to knees
    if direction == "decreasing":
        if curve == "concave":
            y = np.flip(y)
        elif curve == "convex":
            y = y.max() - y
    elif direction == "increasing" and curve == "convex":
        y = np.flip(y.max() - y)

    return y

find_knee()

Identify the knee value and set the instance attributes.

This method is called automatically when KneeLocator is instantiated.

Returns:

| Type | Description |
| --- | --- |
| tuple | (knee, norm_knee) where each is a float or None. |
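
A simplified sketch of the offline scan, assuming NumPy and a difference curve that has already been computed. The function name `first_knee_index` and the sample values are hypothetical. The sketch leaves out the local-minimum handling and the online mode that the real method implements, so treat it as an illustration of the threshold test only.

```python
import numpy as np


def first_knee_index(y_difference, maxima_indices, thresholds):
    """Return the index of the first local maximum confirmed as a knee.

    Simplified offline scan: no local-minimum handling, no online mode.
    """
    if len(maxima_indices) == 0:
        return None
    # Map each local-maximum index to the position of its threshold.
    position = {int(m): k for k, m in enumerate(maxima_indices)}
    candidate = threshold = None
    for i in range(int(maxima_indices[0]), len(y_difference) - 1):
        if i in position:
            # New candidate knee and the threshold that confirms it.
            candidate = i
            threshold = thresholds[position[i]]
        if y_difference[i + 1] < threshold:
            # The next point dropped below the candidate's threshold.
            return candidate
    return None


y_difference = np.array([0.0, 0.49, 0.59, 0.53, 0.46, 0.40, 0.30, 0.20, 0.10, 0.0])
maxima_indices = np.array([2])
thresholds = y_difference[maxima_indices] - 1.0 / 9.0  # S = 1, nine equal steps

knee_index = first_knee_index(y_difference, maxima_indices, thresholds)
print(knee_index)
```

The candidate is the local maximum itself, not the point where the curve crosses the threshold; the crossing only confirms it.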

Source code in kneed/knee_locator.py
def find_knee(
    self,
):
    """Identify the knee value and set the instance attributes.

    This method is called automatically when ``KneeLocator`` is
    instantiated.

    Returns
    -------
    tuple
        ``(knee, norm_knee)`` where each is a float or None.
    """
    if not self.maxima_indices.size:
        # No local maxima found in the difference curve
        # The line is probably not polynomial, try plotting
        # the difference curve with plt.plot(knee.x_difference, knee.y_difference)
        # Also check that you aren't mistakenly setting the curve argument
        return None, None
    # placeholder for which threshold region i is located in.
    maxima_threshold_index = 0
    minima_threshold_index = 0
    traversed_maxima = False
    # State flag to control knee detection after local minima
    detection_active = True
    # traverse the difference curve
    for i, x in enumerate(self.x_difference):
        # skip points on the curve before the first local maximum
        if i < self.maxima_indices[0]:
            continue

        j = i + 1

        # reached the end of the curve
        if i == (len(self.x_difference) - 1):
            break

        # if we're at a local max, take its threshold and advance the maxima threshold index
        if (self.maxima_indices == i).any():
            threshold = self.Tmx[maxima_threshold_index]
            threshold_index = i
            maxima_threshold_index += 1
            # Reactivate detection when we reach a local maximum
            detection_active = True
        # values in difference curve are at or after a local minimum
        if (self.minima_indices == i).any():
            threshold = 0.0
            minima_threshold_index += 1
            # Deactivate detection after a local minimum until next local maximum
            detection_active = False

        if detection_active and self.y_difference[j] < threshold:
            if self.curve == "convex":
                if self.direction == "decreasing":
                    knee = self.x[threshold_index]
                    norm_knee = self.x_normalized[threshold_index]
                else:
                    knee = self.x[-(threshold_index + 1)]
                    norm_knee = self.x_normalized[threshold_index]

            elif self.curve == "concave":
                if self.direction == "decreasing":
                    knee = self.x[-(threshold_index + 1)]
                    norm_knee = self.x_normalized[threshold_index]
                else:
                    knee = self.x[threshold_index]
                    norm_knee = self.x_normalized[threshold_index]

            # add the y value at the knee
            y_at_knee = self.y[self.x == knee][0]
            y_norm_at_knee = self.y_normalized[self.x_normalized == norm_knee][0]
            if knee not in self.all_knees:
                self.all_knees_y.append(y_at_knee)
                self.all_norm_knees_y.append(y_norm_at_knee)

            # now add the knee
            self.all_knees.add(knee)
            self.all_norm_knees.add(norm_knee)

            # if detecting in offline mode, return the first knee found
            if self.online is False:
                return knee, norm_knee

    if self.all_knees == set():
        # No knee was found
        return None, None

    return knee, norm_knee

plot_knee_normalized(figsize=None, title='Normalized Knee Point', xlabel=None, ylabel=None)

Plot the normalized curve, the difference curve, and the knee.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| figsize | tuple of int | The figure size of the plot, e.g. (12, 8). | None |
| title | str | Title of the visualization. | "Normalized Knee Point" |
| xlabel | str | X-axis label. | None |
| ylabel | str | Y-axis label. | None |

Source code in kneed/knee_locator.py
def plot_knee_normalized(
    self,
    figsize: Optional[Tuple[int, int]] = None,
    title: str = "Normalized Knee Point",
    xlabel: Optional[str] = None,
    ylabel: Optional[str] = None,
):
    """Plot the normalized curve, the difference curve, and the knee.

    Parameters
    ----------
    figsize : tuple of int, optional
        The figure size of the plot, e.g. ``(12, 8)``.
    title : str, default "Normalized Knee Point"
        Title of the visualization.
    xlabel : str, optional
        X-axis label.
    ylabel : str, optional
        Y-axis label.
    """
    if not _has_matplotlib:
        raise _matplotlib_not_found_err

    if figsize is None:
        figsize = (6, 6)

    plt.figure(figsize=figsize)
    plt.title(title)
    if xlabel:
        plt.xlabel(xlabel)
    if ylabel:
        plt.ylabel(ylabel)
    plt.plot(self.x_normalized, self.y_normalized, "b", label="normalized curve")
    plt.plot(self.x_difference, self.y_difference, "r", label="difference curve")
    plt.xticks(
        np.arange(self.x_normalized.min(), self.x_normalized.max() + 0.1, 0.1)
    )
    plt.yticks(
        np.arange(self.y_difference.min(), self.y_normalized.max() + 0.1, 0.1)
    )

    plt.vlines(
        self.norm_knee,
        plt.ylim()[0],
        plt.ylim()[1],
        linestyles="--",
        label="knee/elbow",
    )
    plt.legend(loc="best")

plot_knee(figsize=None, title='Knee Point', xlabel=None, ylabel=None)

Plot the curve and the knee, if it exists.

Parameters:

| Name | Type | Description | Default |
| --- | --- | --- | --- |
| figsize | tuple of int | The figure size of the plot, e.g. (12, 8). | None |
| title | str | Title of the visualization. | "Knee Point" |
| xlabel | str | X-axis label. | None |
| ylabel | str | Y-axis label. | None |

Source code in kneed/knee_locator.py
def plot_knee(
    self,
    figsize: Optional[Tuple[int, int]] = None,
    title: str = "Knee Point",
    xlabel: Optional[str] = None,
    ylabel: Optional[str] = None,
):
    """Plot the curve and the knee, if it exists.

    Parameters
    ----------
    figsize : tuple of int, optional
        The figure size of the plot, e.g. ``(12, 8)``.
    title : str, default "Knee Point"
        Title of the visualization.
    xlabel : str, optional
        X-axis label.
    ylabel : str, optional
        Y-axis label.
    """
    if not _has_matplotlib:
        raise _matplotlib_not_found_err

    if figsize is None:
        figsize = (6, 6)

    plt.figure(figsize=figsize)
    plt.title(title)
    if xlabel:
        plt.xlabel(xlabel)
    if ylabel:
        plt.ylabel(ylabel)
    plt.plot(self.x, self.y, "b", label="data")
    plt.vlines(
        self.knee, plt.ylim()[0], plt.ylim()[1], linestyles="--", label="knee/elbow"
    )
    plt.legend(loc="best")

DataGenerator

Utility class for generating synthetic test data.

kneed.data_generator.DataGenerator

Bases: object

Generate synthetic data to work with kneed.

Source code in kneed/data_generator.py
class DataGenerator(object):
    """Generate synthetic data to work with kneed."""

    @staticmethod
    def noisy_gaussian(
        mu: float = 50, sigma: float = 10, N: int = 100, seed: int = 42
    ) -> Tuple[Iterable[float], Iterable[float]]:
        """Recreate NoisyGaussian from the original Kneedle paper.

        Parameters
        ----------
        mu : float, default 50
            The mean value to build a normal distribution around.
        sigma : float, default 10
            The standard deviation of the distribution.
        N : int, default 100
            The number of samples to draw from the normal distribution.
        seed : int, default 42
            An integer to set the random seed.

        Returns
        -------
        tuple of numpy.ndarray
            ``(x, y)`` arrays.
        """
        np.random.seed(seed)
        z = np.random.normal(loc=mu, scale=sigma, size=N)
        x = np.sort(z)
        y = np.array(range(N)) / float(N)
        return x, y

    @staticmethod
    def figure2() -> Tuple[Iterable[float], Iterable[float]]:
        """Recreate the values in figure 2 from the original Kneedle paper.

        Returns
        -------
        tuple of numpy.ndarray
            ``(x, y)`` arrays.
        """
        with np.errstate(divide="ignore"):
            x = np.linspace(0.0, 1, 10)
            return x, np.true_divide(-1, x + 0.1) + 5

    @staticmethod
    def convex_increasing() -> Tuple[Iterable[float], Iterable[float]]:
        """Generate a sample increasing convex function.

        Returns
        -------
        tuple of numpy.ndarray
            ``(x, y)`` arrays.
        """
        x = np.arange(0, 10)
        y_convex_inc = np.array([1, 2, 3, 4, 5, 10, 15, 20, 40, 100])
        return x, y_convex_inc

    @staticmethod
    def convex_decreasing() -> Tuple[Iterable[float], Iterable[float]]:
        """Generate a sample decreasing convex function.

        Returns
        -------
        tuple of numpy.ndarray
            ``(x, y)`` arrays.
        """
        x = np.arange(0, 10)
        y_convex_dec = np.array([100, 40, 20, 15, 10, 5, 4, 3, 2, 1])
        return x, y_convex_dec

    @staticmethod
    def concave_decreasing() -> Tuple[Iterable[float], Iterable[float]]:
        """Generate a sample decreasing concave function.

        Returns
        -------
        tuple of numpy.ndarray
            ``(x, y)`` arrays.
        """
        x = np.arange(0, 10)
        y_concave_dec = np.array([99, 98, 97, 96, 95, 90, 85, 80, 60, 0])
        return x, y_concave_dec

    @staticmethod
    def concave_increasing() -> Tuple[Iterable[float], Iterable[float]]:
        """Generate a sample increasing concave function.

        Returns
        -------
        tuple of numpy.ndarray
            ``(x, y)`` arrays.
        """
        x = np.arange(0, 10)
        y_concave_inc = np.array([0, 60, 80, 85, 90, 95, 96, 97, 98, 99])
        return x, y_concave_inc

    @staticmethod
    def bumpy() -> Tuple[Iterable[float], Iterable[float]]:
        """Generate a sample function with local minima/maxima.

        Returns
        -------
        tuple
            ``(x, y)`` where x is a list and y is a list.
        """
        x_bumpy = list(range(90))
        y_bumpy = [
            7305.0,
            6979.0,
            6666.6,
            6463.2,
            6326.5,
            6048.8,
            6032.8,
            5762.0,
            5742.8,
            5398.2,
            5256.8,
            5227.0,
            5001.7,
            4942.0,
            4854.2,
            4734.6,
            4558.7,
            4491.1,
            4411.6,
            4333.0,
            4234.6,
            4139.1,
            4056.8,
            4022.5,
            3868.0,
            3808.3,
            3745.3,
            3692.3,
            3645.6,
            3618.3,
            3574.3,
            3504.3,
            3452.4,
            3401.2,
            3382.4,
            3340.7,
            3301.1,
            3247.6,
            3190.3,
            3180.0,
            3154.2,
            3089.5,
            3045.6,
            2989.0,
            2993.6,
            2941.3,
            2875.6,
            2866.3,
            2834.1,
            2785.1,
            2759.7,
            2763.2,
            2720.1,
            2660.1,
            2690.2,
            2635.7,
            2632.9,
            2574.6,
            2556.0,
            2545.7,
            2513.4,
            2491.6,
            2496.0,
            2466.5,
            2442.7,
            2420.5,
            2381.5,
            2388.1,
            2340.6,
            2335.0,
            2318.9,
            2319.0,
            2308.2,
            2262.2,
            2235.8,
            2259.3,
            2221.0,
            2202.7,
            2184.3,
            2170.1,
            2160.0,
            2127.7,
            2134.7,
            2102.0,
            2101.4,
            2066.4,
            2074.3,
            2063.7,
            2048.1,
            2031.9,
        ]
        return x_bumpy, y_bumpy

noisy_gaussian(mu=50, sigma=10, N=100, seed=42) staticmethod

Recreate NoisyGaussian from the original Kneedle paper.

Parameters:

Name Type Description Default
mu float

The mean value to build a normal distribution around.

50
sigma float

The standard deviation of the distribution.

10
N int

The number of samples to draw from the normal distribution.

100
seed int

An integer to set the random seed.

42

Returns:

Type Description
tuple of numpy.ndarray

(x, y) arrays.
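
The construction is an empirical cumulative distribution: the sorted samples become x, and y is each sample's rank divided by N. A minimal standard-library sketch of the same idea follows. `empirical_cdf` is a hypothetical helper, not part of kneed, and because it uses `random.gauss` rather than `numpy.random.normal` it will not reproduce the exact values that `noisy_gaussian` returns.

```python
import random


def empirical_cdf(mu=50.0, sigma=10.0, n=100, seed=42):
    """Sorted Gaussian samples paired with rank / n (hypothetical helper)."""
    rng = random.Random(seed)  # local generator, so global state is untouched
    x = sorted(rng.gauss(mu, sigma) for _ in range(n))
    y = [i / n for i in range(n)]  # 0/n, 1/n, ..., (n-1)/n
    return x, y


x, y = empirical_cdf()
print(len(x), y[0], y[-1])
```

The resulting curve is increasing and S-shaped, so which bend is reported as the knee depends on the `curve` setting passed to `KneeLocator`.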

figure2() staticmethod

Recreate the values in figure 2 from the original Kneedle paper.

Returns:

Type Description
tuple of numpy.ndarray

(x, y) arrays.
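
Going by the source listing, the curve is y = -1 / (x + 0.1) + 5 evaluated at 10 evenly spaced x values in [0, 1]. The sketch below reproduces that formula with the standard library only. `figure2_points` is a hypothetical name, and the values may differ from the NumPy version in the last floating-point digits.

```python
def figure2_points(n=10):
    """Evaluate y = -1 / (x + 0.1) + 5 at n evenly spaced x in [0, 1]."""
    x = [i / (n - 1) for i in range(n)]
    y = [-1.0 / (xi + 0.1) + 5.0 for xi in x]
    return x, y


x, y = figure2_points()
print(x[0], y[0])    # first point: x = 0, y = -1 / 0.1 + 5
print(x[-1], y[-1])  # last point:  x = 1, y = -1 / 1.1 + 5
```

The curve rises steeply and then flattens, which is the increasing, concave case that the `KneeLocator` defaults (`curve="concave"`, `direction="increasing"`) are set up for.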

convex_increasing() staticmethod

Generate a sample increasing convex function.

Returns:

Type Description
tuple of numpy.ndarray

(x, y) arrays.

convex_decreasing() staticmethod

Generate a sample decreasing convex function.

Returns:

Type Description
tuple of numpy.ndarray

(x, y) arrays.

concave_decreasing() staticmethod

Generate a sample decreasing concave function.

Returns:

Type Description
tuple of numpy.ndarray

(x, y) arrays.

concave_increasing() staticmethod

Generate a sample increasing concave function.

Returns:

Type Description
tuple of numpy.ndarray

(x, y) arrays.
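
The names of the four fixed sample curves (`convex_increasing`, `convex_decreasing`, `concave_decreasing`, `concave_increasing`) can be checked against their data with first and second differences: the sign of the first differences gives the direction, and the sign of the second differences gives the curvature. The sketch below copies the y values from the source listing. `describe` is a hypothetical helper, not part of kneed.

```python
def describe(y):
    """Classify a sequence by the signs of its first and second differences."""
    d1 = [b - a for a, b in zip(y, y[1:])]    # first differences
    d2 = [b - a for a, b in zip(d1, d1[1:])]  # second differences
    if all(d > 0 for d in d1):
        direction = "increasing"
    elif all(d < 0 for d in d1):
        direction = "decreasing"
    else:
        direction = "mixed"
    if all(d >= 0 for d in d2):
        curve = "convex"
    elif all(d <= 0 for d in d2):
        curve = "concave"
    else:
        curve = "mixed"
    return direction, curve


samples = {
    "convex_increasing": [1, 2, 3, 4, 5, 10, 15, 20, 40, 100],
    "convex_decreasing": [100, 40, 20, 15, 10, 5, 4, 3, 2, 1],
    "concave_decreasing": [99, 98, 97, 96, 95, 90, 85, 80, 60, 0],
    "concave_increasing": [0, 60, 80, 85, 90, 95, 96, 97, 98, 99],
}
for name, y in samples.items():
    print(name, describe(y))
```

Each pair printed here corresponds to the `direction` and `curve` arguments one would pass to `KneeLocator` for that sample.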

bumpy() staticmethod

Generate a sample function with local minima/maxima.

Returns:

Type Description
tuple

(x, y) where x is a list and y is a list.
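
"Local minima/maxima" here means that the overall downward trend is interrupted by small upticks. The sketch below shows one such uptick using five consecutive values copied from the `bumpy` data (indices 42 to 46 of the y list). `local_extrema` is a hypothetical helper, not part of kneed.

```python
def local_extrema(y):
    """Return (minima, maxima) as index lists of strict interior extrema."""
    minima, maxima = [], []
    for i in range(1, len(y) - 1):
        if y[i] < y[i - 1] and y[i] < y[i + 1]:
            minima.append(i)
        elif y[i] > y[i - 1] and y[i] > y[i + 1]:
            maxima.append(i)
    return minima, maxima


# Five consecutive y values from the bumpy data set.
segment = [3045.6, 2989.0, 2993.6, 2941.3, 2875.6]
print(local_extrema(segment))  # ([1], [2])
```

Bumps like this can produce more than one candidate knee, which is the situation the `online` parameter of `KneeLocator` addresses.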

find_shape

Utility function to auto-detect the curve direction and type.

kneed.shape_detector.find_shape(x, y)

Detect the direction and curve type of the data.

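Fits a first-degree polynomial (a straight line) to the data. The sign of the slope gives the direction: "increasing" if positive, otherwise "decreasing". The curve is "concave" if the points in the middle of the data (from 20% to 80% of the way through) lie above the fitted line on average, and "convex" otherwise.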

Parameters:

Name Type Description Default
x array-like

x values.

required
y array-like

y values.

required

Returns:

Type Description
tuple of str

(direction, curve) where direction is "increasing" or "decreasing" and curve is "concave" or "convex".
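
The decision rule can be sketched without NumPy. The version below swaps `numpy.polyfit` for the closed-form least-squares line and otherwise follows the same steps: fit a line, then compare the data with the line over the middle section. `find_shape_sketch` is a hypothetical name and this is an illustration of the rule, not the library implementation.

```python
def find_shape_sketch(x, y):
    """Pure-Python sketch of the decision rule (not the library code)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Closed-form least-squares fit of y = slope * x + intercept.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Middle section: from 20% to 80% of the way through the points.
    lo, hi = int(n * 0.2), int(n * 0.8)
    mid_y = y[lo:hi]
    mid_fit = [slope * xi + intercept for xi in x[lo:hi]]
    # q > 0 means the data sit above the line, on average, in the middle.
    q = sum(mid_y) / len(mid_y) - sum(mid_fit) / len(mid_fit)
    direction = "increasing" if slope > 0 else "decreasing"
    curve = "concave" if q > 0 else "convex"
    return direction, curve


x = list(range(10))
print(find_shape_sketch(x, [0, 60, 80, 85, 90, 95, 96, 97, 98, 99]))
# ('increasing', 'concave')
print(find_shape_sketch(x, [100, 40, 20, 15, 10, 5, 4, 3, 2, 1]))
# ('decreasing', 'convex')
```

The returned pair lines up with the `direction` and `curve` parameters of `KneeLocator`, so the result of `find_shape` can be passed on when the shape of the data is not known in advance.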

Source code in kneed/shape_detector.py
def find_shape(x, y):
    """Detect the direction and curve type of the data.

    Fits a first-degree polynomial to the data. The sign of the slope gives
    the direction; whether the middle section of the data lies, on average,
    above or below the fitted line gives the curve type.

    Parameters
    ----------
    x : array-like
        x values.
    y : array-like
        y values.

    Returns
    -------
    tuple of str
        ``(direction, curve)`` where direction is ``"increasing"`` or
        ``"decreasing"`` and curve is ``"concave"`` or ``"convex"``.
    """

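    # Least-squares line through the data: p[0] is the slope, p[1] the intercept.
    p = np.polyfit(x, y, deg=1)
    # Index bounds of the middle section (20% to 80% of the points).
    x1, x2 = int(len(x) * 0.2), int(len(x) * 0.8)
    # Mean offset of the data from the fitted line over the middle section.
    q = np.mean(y[x1:x2]) - np.mean(x[x1:x2] * p[0] + p[1])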
    if p[0] > 0 and q > 0:
        return "increasing", "concave"
    if p[0] > 0 and q <= 0:
        return "increasing", "convex"
    if p[0] <= 0 and q > 0:
        return "decreasing", "concave"
    return "decreasing", "convex"