JSON-AI Config
- api.json_ai.add_implicit_values(json_ai)
Auto-generates the unspecified or missing details required in the ML pipeline, so that a JSON-AI config can be written briefly.
- Parameters:
json_ai (JsonAI) – Object that describes the ML pipeline; it may not have every detail fully specified.
- Return type:
JsonAI
- Returns:
JsonAI object with all necessary parameters that were previously left unmentioned filled in.
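The idea of filling in unspecified details can be sketched with plain dictionaries. This is a minimal illustration of the general technique (recursively merging defaults without overwriting user choices), not the library's implementation; the `DEFAULTS` keys and values and the `fill_missing` helper are hypothetical.

```python
from copy import deepcopy

# Hypothetical defaults, for illustration only; not the library's actual values.
DEFAULTS = {
    "splitter": {"module": "splitter", "args": {"pct_train": 0.8}},
    "timeout": 60,
}


def fill_missing(config: dict, defaults: dict) -> dict:
    """Return a copy of `config` where keys absent from it are taken from `defaults`."""
    filled = deepcopy(config)
    for key, default in defaults.items():
        if key not in filled:
            # The user said nothing about this key: use the default.
            filled[key] = deepcopy(default)
        elif isinstance(filled[key], dict) and isinstance(default, dict):
            # The user specified part of a nested section: fill the rest.
            filled[key] = fill_missing(filled[key], default)
        # Otherwise the user's explicit value is kept untouched.
    return filled


# The user only overrides one nested value; everything else is filled in.
user_config = {"splitter": {"args": {"pct_train": 0.9}}}
full_config = fill_missing(user_config, DEFAULTS)
```

Here `full_config` keeps the user's `pct_train` of 0.9 and gains the `module` and `timeout` entries from the defaults, while `user_config` itself is left unmodified.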
- api.json_ai.generate_json_ai(type_information, statistical_analysis, problem_definition)
Given type_infer.TypeInformation, dataprep_ml.StatisticalAnalysis, and the ProblemDefinition, generate a JSON config file with the necessary elements of the ML pipeline populated.
- Parameters:
type_information (TypeInformation) – Specifies the data type of each column within the dataset. Generated by mindsdb/type_infer.
statistical_analysis (StatisticalAnalysis) –
problem_definition (ProblemDefinition) – Specifies details of the model training/building procedure, as defined by ProblemDefinition
- Return type:
JsonAI
- Returns:
JSON-AI object with fully populated details of the ML pipeline
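The way three inputs of this kind can be combined into one config can be sketched as follows. The `Toy*` classes, their field names, and the layout of the returned dictionary are all assumptions made for illustration; the real `TypeInformation`, `StatisticalAnalysis`, `ProblemDefinition`, and `JsonAI` objects carry more information and may be structured differently.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for the three inputs; field names are assumptions.
@dataclass
class ToyTypeInformation:
    dtypes: dict  # column name -> data type label


@dataclass
class ToyStatisticalAnalysis:
    missing_ratio: dict  # column name -> fraction of missing values


@dataclass
class ToyProblemDefinition:
    target: str  # name of the column to predict


def sketch_config(type_information, statistical_analysis, problem_definition) -> dict:
    """Build a plain-dict config with one entry per column."""
    target = problem_definition.target
    features = {}
    for column, dtype in type_information.dtypes.items():
        if column == target:
            continue  # the target is described under "outputs", not "features"
        features[column] = {
            "data_dtype": dtype,
            "missing_ratio": statistical_analysis.missing_ratio.get(column, 0.0),
        }
    return {
        "features": features,
        "outputs": {target: {"data_dtype": type_information.dtypes[target]}},
    }


config = sketch_config(
    ToyTypeInformation({"age": "integer", "city": "categorical", "price": "float"}),
    ToyStatisticalAnalysis({"city": 0.1}),
    ToyProblemDefinition(target="price"),
)
```

In this sketch the problem definition decides which column becomes the output, the type information supplies each column's data type, and the statistical analysis contributes per-column statistics.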
- api.json_ai.lookup_encoder(col_dtype, col_name, is_target, problem_defintion, is_target_predicting_encoder, statistical_analysis)
Assign a default encoder for a given column based on its data type and whether it is a target. Encoders take in raw (but cleaned) data and return a feature representation. This function assigns, per data type, what the featurizer should be, and it runs on each column of the dataset available for model building.
Users may override this choice with a custom encoder to enable their own featurization process. However, this code runs automatically in order to generate the template JSON-AI. Users may then edit the generated syntax and use custom approaches while model building.
For each encoder, “args” may be passed. These args depend on what an encoder requires during its preparation call.
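A per-data-type default assignment of this kind can be sketched as a lookup table. This is a simplified illustration, not the library's code: the encoder names are placeholders that are not claimed to match the actual defaults, the `{"module": ..., "args": ...}` layout is an assumption, and `sketch_lookup_encoder` takes only a subset of the real function's parameters.

```python
# Placeholder encoder names per data type; assumptions for illustration only.
FEATURE_ENCODERS = {
    "integer": "NumericEncoder",
    "float": "NumericEncoder",
    "categorical": "OneHotEncoder",
    "text": "TextEncoder",
}

# Targets are restricted to a smaller set of feature representations.
TARGET_ENCODERS = {
    "integer": "NumericEncoder",
    "float": "NumericEncoder",
    "categorical": "OneHotEncoder",
}


def sketch_lookup_encoder(col_dtype: str, col_name: str, is_target: bool) -> dict:
    """Pick a default encoder entry for one column."""
    table = TARGET_ENCODERS if is_target else FEATURE_ENCODERS
    if col_dtype not in table:
        role = "target" if is_target else "feature"
        raise ValueError(
            f"No default encoder for {role} column {col_name!r} of type {col_dtype!r}"
        )
    # "args" holds whatever the chosen encoder needs when it is prepared.
    return {"module": table[col_dtype], "args": {"is_target": is_target}}


feature_entry = sketch_lookup_encoder("categorical", "city", is_target=False)
target_entry = sketch_lookup_encoder("float", "price", is_target=True)
```

Keeping the defaults in a table makes an override straightforward: a user can replace the generated entry for a single column with their own encoder name and args without touching the others.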
- Parameters:
col_dtype (str) – The data type of the column
col_name (str) – The name of the column
is_target (bool) – Whether the column is the target for prediction. If true, only certain possible feature representations are allowed, particularly for complex data types.
problem_definition – The ProblemDefinition criteria; this populates specifics on how models and encoders may be trained.
is_target_predicting_encoder (bool) –