Model Serving

The presto-query-predictor package implemented a Flask web application for model serving. The service is encapsulated in the predictor_app variable, which can be easily used by running

from query_predictor.predictor.predictor_app import predictor_app

predictor_app.run()

There are two API endpoints:

/v1/cpu

This API endpoint receives an HTTP request with the query statement carried in the query field as a JSON message in the request body. It returns a response with the expected CPU time range wrapped in the response body. An example of the response body is shown below.

{
    "cpu_pred_label": 0,
    "cpu_pred_str": "< 30s"
}

/v1/memory

This API endpoint receives an HTTP request with the query statement carried in the query field as a JSON message in the request body. It returns a response with the expected peak memory bytes range wrapped in the response body. An example of the response body is shown below.

{
    "memory_pred_label": 0,
    "memory_pred_str": "< 1MB"
}

The web service requires four models trained beforehand:

CPU vectorization model
Memory vectorization model
CPU classification model
Memory classification model

The parameters about these models can be provisioned through a serving configuration YAML file. An example is shown below.

models:
  cpu_model:
    label: cpu_time_label
    feature: query
    type: XGBoost
    path: models/model-cpu.bin
    name: XGBoost-CPU
    description: An XGBoost model to predict cpu time of each SQL query
    version: 0.1.0
  memory_model:
    label: peak_memory_label
    feature: query
    type: XGBoost
    path: models/model-memory.bin
    name: XGBoost-Memory
    description: An XGBoost model to predict peak memory bytes of each SQL query
    version: 0.1.0
vectorizers:
  cpu_vectorizer:
    feature: query
    type: tfidf
    path: models/vec-cpu.bin
    name: tfidf-cpu
    description: A TF-IDF vectorizer for SQL queries
    version: 0.1.0
  memory_vectorizer:
    feature: query
    type: tfidf
    path: models/vec-memory.bin
    name: tfidf-memory
    description: A TF-IDF vectorizer for SQL queries
    version: 0.1.0

Info

The predictor_app provides a simple interface to serve the models in the production environment. You can also use other web frameworks to serve these models.