02. AI-Based Topic Classification (LLM & Gemini)
1. Overview and Scenario
Situation: Thousands of tickets are classified as “Others”. There are too many for agents to read and classify individually.
“Can’t AI read and classify these for us?”
In the past, building a classifier meant collecting tens of thousands of labeled training examples; now you can simply tell an LLM (Large Language Model) to “read this and classify it”. Let’s call Google Gemini directly from within BigQuery.
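Before bringing in AI, it is worth confirming the scale of the problem. Here is a minimal sanity check, assuming the cs_tickets_dummy table (and its issue_type column) used in the exercises below:

-- How many tickets currently sit in each category?
SELECT
  issue_type,
  COUNT(*) AS ticket_count
FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
GROUP BY issue_type
ORDER BY ticket_count DESC;

If “Others” dominates this list, automated classification is worth the effort.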
2. BigQuery ML + Gemini Integration
In BigQuery, you can call the Gemini model (gemini-pro) through a remote model using the ML.GENERATE_TEXT function.
(Note: Separate Google Cloud Vertex AI connection setup may be required.)
❓ Problem 1: Create Gemini Model (DDL)
Q. Write a query in BigQuery to create a remote model (REMOTE MODEL) based on gemini-pro.
(In the practice environment, we’ll assume it’s already created and proceed.)
BigQuery (SQL)
Hint: Use a CREATE MODEL statement with REMOTE WITH CONNECTION and set the endpoint option.
View Solution
-- Model creation example (requires admin permissions)
CREATE OR REPLACE MODEL `your-project-id.retail_analytics_us.gemini_pro_model`
REMOTE WITH CONNECTION `projects/your-project-id/locations/us/connections/vertex-ai`
OPTIONS(endpoint = 'gemini-pro');
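Once the model exists, a quick smoke test confirms that the Vertex AI connection and permissions work. A minimal sketch, assuming the model created above:

-- Smoke test: send one trivial prompt through the remote model
SELECT *
FROM
  ML.GENERATE_TEXT(
    MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
    (SELECT 'Reply with the single word OK.' AS prompt),
    STRUCT(0.0 AS temperature, 5 AS max_output_tokens)
  );

If this returns a row rather than a permission error, the setup is complete.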
3. Prompt Engineering
To get good results from AI, you need to ask good questions (prompts).
❓ Problem 2: Write Ticket Classification Prompt
Q. Write a prompt to classify tickets into one of the following categories and execute with ML.GENERATE_TEXT.
- Categories: Product Quality, Shipping, Payment, General Inquiry
BigQuery (SQL)
Hint: In BigQuery, assemble the prompt string using CONCAT.
View Solution
SELECT
ticket_id,
issue_type AS original_label,
JSON_VALUE(ml_generate_text_result, '$.candidates[0].content.parts[0].text') AS ai_classification -- extract the generated text as a STRING
FROM
ML.GENERATE_TEXT(
MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
(
SELECT
ticket_id,
issue_type,
CONCAT(
'You are a customer service manager. Classify the following ticket into one of these categories: ',
'[Product Quality, Shipping, Payment, General Inquiry]. ',
'Do not explain, just return the category name. ',
'Ticket comments: ', comment
) AS prompt
FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
WHERE status = 'open'
LIMIT 5
),
STRUCT(
0.0 AS temperature, -- Set to 0 for consistent answers
10 AS max_output_tokens
)
);
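Indexing into the raw JSON works, but ML.GENERATE_TEXT also accepts a flatten_json_output option that returns the generated text as a plain STRING column named ml_generate_text_llm_result. A sketch of the same classification query with flattened output (same assumed model and table):

SELECT
  ticket_id,
  issue_type AS original_label,
  ml_generate_text_llm_result AS ai_classification  -- plain STRING, no JSON digging needed
FROM
  ML.GENERATE_TEXT(
    MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
    (
      SELECT
        ticket_id,
        issue_type,
        CONCAT(
          'You are a customer service manager. Classify the following ticket into one of these categories: ',
          '[Product Quality, Shipping, Payment, General Inquiry]. ',
          'Do not explain, just return the category name. ',
          'Ticket comments: ', comment
        ) AS prompt
      FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
      WHERE status = 'open'
      LIMIT 5
    ),
    STRUCT(0.0 AS temperature, 10 AS max_output_tokens, TRUE AS flatten_json_output)
  );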
4. Result Verification and Summary Generation
Beyond simple classification, you can also generate “3-line summaries” or “draft responses”.
❓ Problem 3: Ticket Summary and Response Recommendation
Q. Execute a prompt that says “Summarize the ticket content in 3 lines and write a draft response for the agent to send”.
BigQuery (SQL)
View Solution
SELECT
ticket_id,
JSON_VALUE(ml_generate_text_result, '$.candidates[0].content.parts[0].text') AS ai_response -- extract the generated text as a STRING
FROM
ML.GENERATE_TEXT(
MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
(
SELECT
ticket_id,
CONCAT(
'Summarize this ticket in 3 lines and write a polite response draft the agent can send. ',
'Ticket: ', comment
) AS prompt
FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
LIMIT 3
),
STRUCT(0.5 AS temperature, 200 AS max_output_tokens)
);
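For the verification half of this section, the quickest check is to cross-tabulate the AI label against the original label on a small sample. A sketch building on the Problem 2 query (same assumed model and table; flatten_json_output is used to get a plain-text label):

WITH labeled AS (
  SELECT
    issue_type AS original_label,
    ml_generate_text_llm_result AS ai_classification
  FROM
    ML.GENERATE_TEXT(
      MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
      (
        SELECT
          issue_type,
          CONCAT(
            'Classify the following ticket into one of these categories: ',
            '[Product Quality, Shipping, Payment, General Inquiry]. ',
            'Do not explain, just return the category name. ',
            'Ticket comments: ', comment
          ) AS prompt
        FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
        WHERE status = 'open'
        LIMIT 50
      ),
      STRUCT(0.0 AS temperature, 10 AS max_output_tokens, TRUE AS flatten_json_output)
    )
)
SELECT
  original_label,
  ai_classification,
  COUNT(*) AS tickets
FROM labeled
GROUP BY original_label, ai_classification
ORDER BY tickets DESC;

Large off-diagonal buckets (for example, tickets filed as “Others” that the model calls Shipping) are exactly the ones worth a human spot check.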
💡 Summary
- BigQuery ML: Apply the latest LLMs to large-scale data using SQL alone.
- Prompt Engineering: Giving the AI a role, constraints, and few-shot examples improves performance.
- Applications: Auto-classification, summarization, sentiment analysis, translation, and countless more (see the sentiment sketch below).
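As a taste of the sentiment-analysis application, the same ML.GENERATE_TEXT pattern needs nothing but a different prompt. A minimal sketch, assuming the same model and dummy table used above:

SELECT
  ticket_id,
  ml_generate_text_llm_result AS sentiment
FROM
  ML.GENERATE_TEXT(
    MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
    (
      SELECT
        ticket_id,
        CONCAT(
          'Classify the sentiment of this customer ticket as Positive, Neutral, or Negative. ',
          'Return only one word. Ticket: ', comment
        ) AS prompt
      FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
      LIMIT 5
    ),
    -- flatten_json_output = TRUE returns the generated text as a plain STRING column
    STRUCT(0.0 AS temperature, 5 AS max_output_tokens, TRUE AS flatten_json_output)
  );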
In the next chapter, we’ll use traditional Machine Learning (Classification) to predict satisfaction faster and cheaper than LLMs.