Coverage for mindsdb/api/mysql/mysql_proxy/utilities/lightwood_dtype.py: 100%
20 statements
coverage.py v7.13.1, created at 2026-01-21 00:36 +0000
class dtype:
    """
    Definitions of all data types currently supported. Dtypes currently supported include:

    - **Numerical**: Data that should be represented in the form of a number. Currently ``integer``, ``float``, and ``quantity`` are supported.
    - **Categorical**: Data that represents a class or label and is discrete. Currently ``binary``, ``categorical``, and ``tags`` are supported.
    - **Date/Time**: Time-series data that is temporal/sequential. Currently ``date`` and ``datetime`` are supported.
    - **Text**: Data that can be considered as language information. Currently ``short_text`` and ``rich_text`` are supported. Short text has a small vocabulary (~100 words) and is generally limited to a small number of characters. Rich text is anything with greater complexity.
    - **Complex**: Data types that require custom techniques. Currently ``audio``, ``video``, and ``image`` are available, but highly experimental.
    - **Array**: Data in the form of a sequence where order must be preserved. Currently ``num_array`` and ``cat_array`` are supported, along with ``num_tsarray`` and ``cat_tsarray``. The ``tsarray`` dtypes are for "normal" columns that will be transformed to arrays at a row-level because they will be treated as time series.
    - **Miscellaneous**: Miscellaneous data descriptors include ``empty``, an explicitly unknown value, and ``invalid``, a data type not currently supported.

    Custom data types may be implemented here as a flag for subsequent treatment and processing. You are welcome to include your own definitions, so long as they do not override the existing type names (alternatively, if you do, please edit subsequent parts of the preprocessing pipeline to correctly indicate how you want to deal with these data types).
    """  # noqa

    # Numerical type data
    integer = "integer"
    float = "float"
    quantity = "quantity"

    # Categorical type data
    binary = "binary"
    categorical = "categorical"
    tags = "tags"

    # Dates and Times (time-series)
    date = "date"
    datetime = "datetime"

    # Text
    short_text = "short_text"
    rich_text = "rich_text"

    # Complex Data types
    image = "image"
    audio = "audio"
    video = "video"

    # Series/Sequences
    num_array = "num_array"
    cat_array = "cat_array"
    num_tsarray = "num_tsarray"
    cat_tsarray = "cat_tsarray"

    # Misc (Unk/NaNs)
    empty = "empty"
    invalid = "invalid"
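
Because the class is a plain namespace of string constants, the set of supported names can be recovered by introspection. The sketch below is one possible way to do that and is not part of the module: `known_dtypes` and `normalize_dtype` are hypothetical helper names introduced for illustration, and the class is repeated as a stand-in copy only so the example runs on its own.

```python
class dtype:
    # Stand-in copy of the constants above so this sketch is self-contained.
    integer = "integer"
    float = "float"
    quantity = "quantity"
    binary = "binary"
    categorical = "categorical"
    tags = "tags"
    date = "date"
    datetime = "datetime"
    short_text = "short_text"
    rich_text = "rich_text"
    image = "image"
    audio = "audio"
    video = "video"
    num_array = "num_array"
    cat_array = "cat_array"
    num_tsarray = "num_tsarray"
    cat_tsarray = "cat_tsarray"
    empty = "empty"
    invalid = "invalid"


def known_dtypes(cls=dtype):
    """Hypothetical helper: collect the public string constants of ``cls``."""
    return {
        value
        for name, value in vars(cls).items()
        if not name.startswith("_") and isinstance(value, str)
    }


def normalize_dtype(name, cls=dtype):
    """Hypothetical helper: return ``name`` if declared, else ``invalid``."""
    return name if name in known_dtypes(cls) else cls.invalid
```

A caller could then write `normalize_dtype("datetime")` to keep a declared name, while an undeclared name such as `"timestamp"` falls back to `dtype.invalid`. Mapping unknown names to `invalid` is an assumption made for this sketch; the module itself does not say how unrecognized names should be handled.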