SELECT statement¶
Description¶
The SELECT statement fetches predictions from the model table. The data is returned on the fly and is not saved.
There are two ways to save prediction data. You can save your predictions as a view using the CREATE VIEW statement. Please note that a view is a saved query and does not store data like a table. Alternatively, you can insert your predictions into a table using the INSERT INTO statement.
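As a rough sketch of the two options above, the queries below wrap a prediction query in CREATE VIEW and in INSERT INTO. The model and source table names are taken from the examples on this page. The view name home_rentals_predictions and the target table my_integration.rental_predictions are hypothetical. The exact syntax should be checked against the CREATE VIEW and INSERT INTO reference pages.

```sql
-- Option 1: save the prediction query as a view (stores the query, not the data).
-- home_rentals_predictions is a hypothetical view name.
CREATE VIEW mindsdb.home_rentals_predictions AS (
    SELECT t.sqft, t.location, m.rental_price AS predicted_rental_price
    FROM example_db.demo_data.home_rentals AS t
    JOIN mindsdb.home_rentals_model AS m
);

-- Option 2: write the predictions into a table (stores the data).
-- my_integration.rental_predictions is a hypothetical, writable target table.
INSERT INTO my_integration.rental_predictions (
    SELECT t.sqft, t.location, m.rental_price AS predicted_rental_price
    FROM example_db.demo_data.home_rentals AS t
    JOIN mindsdb.home_rentals_model AS m
);
```

The view re-runs the prediction query each time it is read, while the inserted rows are a snapshot taken when the INSERT INTO statement ran.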
Syntax¶
Single Prediction¶
Here is the syntax for fetching a single prediction from the model table:
SELECT [target_variable], [target_variable]_explain
FROM mindsdb.[predictor_name]
WHERE [column]=[value]
AND [column]=[value];
Grammar Matters
Here are some points to keep in mind while writing queries in MindsDB:
1. The [column]=[value] pairs may be joined by AND or OR keywords.
2. Do not put quotes around numerical values.
3. Use single quotes for strings.
On execution, we get:
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| [target_variable] | [target_variable]_explain |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| [predicted_value] | {"predicted_value": 4394, "confidence": 0.99, "anomaly": null, "truth": null, "confidence_lower_bound": 4313, "confidence_upper_bound": 4475} |
+-------------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
Where:
Name | Description |
---|---|
[target_variable] | Name of the column to be predicted. |
[target_variable]_explain | Object of the JSON type that contains the predicted_value and additional information such as confidence, anomaly, truth, confidence_lower_bound, and confidence_upper_bound. |
[predictor_name] | Name of the model used to make the prediction. |
WHERE [column]=[value] AND ... | WHERE clause used to pass the data values for which the prediction is made. |
Bulk Predictions¶
Here is the syntax for making predictions in bulk by joining the data source table with the model table:
SELECT m.[target_variable], t.[column1], t.[column2]
FROM [integration_name].[table_name] AS t
JOIN mindsdb.[predictor_name] AS m;
On execution, we get:
+----------------------+-------------+-------------+
| [target_variable] | [column1] | [column2] |
+----------------------+-------------+-------------+
| [predicted_value_1] | value1.1 | value2.1 |
| [predicted_value_2] | value1.2 | value2.2 |
| [predicted_value_3] | value1.3 | value2.3 |
| [predicted_value_4] | value1.4 | value2.4 |
| [predicted_value_5] | value1.5 | value2.5 |
+----------------------+-------------+-------------+
Where:
Name | Description |
---|---|
m.[target_variable] | Name of the column to be predicted. The m. in front indicates that this column comes from the mindsdb.[predictor_name] table. |
t.[column1], t.[column2] | Columns from the data source table ([integration_name].[table_name]) that you want to see in the output. |
[integration_name].[table_name] | Data source table that is joined with the model table (mindsdb.[predictor_name]). |
[predictor_name] | Name of the model used to make predictions. |
Please note that in the case of bulk predictions, we do not pass the data values for which the prediction is made. This is because bulk predictions use all data available in the data source table.
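Bulk predictions take their input values from the source rows, but the set of rows can still be narrowed. The sketch below assumes that a WHERE clause on the data source alias t filters the joined rows, in the same way the UNION example on this page filters on t.time and t.group. The choice of filter column and value is illustrative only.

```sql
-- Sketch: bulk predictions for a subset of the source rows.
-- Assumption: the WHERE clause on alias t selects which source rows are
-- passed to the model; it does not supply prediction input values itself.
SELECT t.sqft, t.neighborhood, m.rental_price AS predicted_rental_price
FROM example_db.demo_data.home_rentals AS t
JOIN mindsdb.home_rentals_model AS m
WHERE t.neighborhood = 'downtown'
LIMIT 5;
```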
Example¶
Single Prediction¶
Let's predict the rental_price value using the home_rentals_model model for the property having sqft=823, location='good', neighborhood='downtown', and days_on_market=10.
SELECT sqft, location, neighborhood, days_on_market, rental_price, rental_price_explain
FROM mindsdb.home_rentals_model
WHERE sqft=823
AND location='good'
AND neighborhood='downtown'
AND days_on_market=10;
On execution, we get:
+-------+----------+--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| sqft | location | neighborhood | days_on_market | rental_price | rental_price_explain |
+-------+----------+--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
| 823 | good | downtown | 10 | 4394 | {"predicted_value": 4394, "confidence": 0.99, "anomaly": null, "truth": null, "confidence_lower_bound": 4313, "confidence_upper_bound": 4475} |
+-------+----------+--------------+----------------+--------------+-----------------------------------------------------------------------------------------------------------------------------------------------+
Bulk Predictions¶
Now let's make bulk predictions of the rental_price value using the home_rentals_model model joined with the data source table.
SELECT t.sqft, t.location, t.neighborhood, t.days_on_market, t.rental_price AS real_price,
m.rental_price AS predicted_rental_price
FROM example_db.demo_data.home_rentals AS t
JOIN mindsdb.home_rentals_model AS m
LIMIT 5;
On execution, we get:
+-------+----------+-----------------+----------------+--------------+-----------------------------+
| sqft | location | neighborhood | days_on_market | real_price | predicted_rental_price |
+-------+----------+-----------------+----------------+--------------+-----------------------------+
| 917 | great | berkeley_hills | 13 | 3901 | 3886 |
| 194 | great | berkeley_hills | 10 | 2042 | 2007 |
| 543 | poor | westbrae | 18 | 1871 | 1865 |
| 503 | good | downtown | 10 | 3026 | 3020 |
| 1066 | good | thowsand_oaks | 13 | 4774 | 4748 |
+-------+----------+-----------------+----------------+--------------+-----------------------------+
Select from integration¶
Simple select¶
In this example, the query contains only tables from one integration, so it is sent to the integration database as is (the integration name is stripped from the table name).
SELECT location, max(sqft)
FROM example_db.demo_data.home_rentals
GROUP BY location
LIMIT 5;
Raw select from integration¶
It is also possible to send a raw query to an integration. This can be useful when the query to the integration cannot be parsed as SQL.
Syntax:
SELECT ... FROM <integration_name> ( <raw query> )
Example of a select from a mongo integration using a mongo query:
SELECT * FROM mongo (
db.house_sales2.find().limit(1)
)
Complex queries¶
- Subselect on data from an integration.
This can be useful when the integration engine does not support some functions, for example grouping. In that case, all data from the raw select is passed to MindsDB, and the subselect is then performed on it inside MindsDB.
SELECT type, max(bedrooms), last(MA)
FROM mongo (
db.house_sales2.find().limit(300)
) GROUP BY 1
- Unions
It is possible to use the UNION / UNION ALL operators.
In this case, every subselect of the union is fetched and merged into one result set on the MindsDB side.
SELECT
data.time as date, data.target
FROM datasource.table_name as data
UNION ALL
SELECT
model.time as date, model.target as target
FROM mindsdb.model as model
JOIN datasource.table_name as t
WHERE t.time > LATEST AND t.group = 'value';