
02. AI-Based Topic Classification (LLM & Gemini)

Expert · 2 hours

1. Overview and Scenario

Situation: Thousands of tickets are classified as “Others”. There are too many for agents to read and classify individually.

“Can’t AI read and classify these for us?”

In the past, building a text classifier required tens of thousands of labeled training examples. Now you can simply tell an LLM (Large Language Model) to “read this and classify it”. Let’s call Google Gemini directly from within BigQuery.


2. BigQuery ML + Gemini Integration

In BigQuery, you can call the Gemini model (gemini-pro) using the ML.GENERATE_TEXT function. (Note: Separate Google Cloud Vertex AI connection setup may be required.)
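Before the DDL, here is a minimal sketch of the call shape (using the same placeholder model name as the rest of this chapter): ML.GENERATE_TEXT takes the remote model, a subquery that must produce a column named prompt, and a STRUCT of generation options.

-- Minimal call shape (sketch): model + subquery with a `prompt` column + options
SELECT
  ml_generate_text_result
FROM ML.GENERATE_TEXT(
  MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
  (SELECT 'Reply with the single word OK.' AS prompt),
  STRUCT(0.0 AS temperature, 5 AS max_output_tokens)
);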

❓ Problem 1: Create Gemini Model (DDL)

Q. Write a DDL query to create a remote model in BigQuery based on gemini-pro. (In the practice environment, we’ll assume it’s already created and proceed.)

💡

Hint: Use a CREATE MODEL statement with REMOTE WITH CONNECTION, and specify the Gemini endpoint in OPTIONS.

View Solution

-- Model creation example (requires admin permissions)
CREATE OR REPLACE MODEL `your-project-id.retail_analytics_us.gemini_pro_model`
  REMOTE WITH CONNECTION `projects/your-project-id/locations/us/connections/vertex-ai`
  OPTIONS (endpoint = 'gemini-pro');

3. Prompt Engineering

To get good results from AI, you need to ask good questions (Prompts).

❓ Problem 2: Write Ticket Classification Prompt

Q. Write a prompt to classify tickets into one of the following categories and execute with ML.GENERATE_TEXT.

  • Categories: Product Quality, Shipping, Payment, General Inquiry

In BigQuery, assemble the prompt using CONCAT.
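Before calling the model, it can help to preview the assembled prompt string on its own. A minimal sketch (the instruction text is a placeholder; the table and comment column come from this chapter’s dummy dataset):

-- Preview the prompt column only: no model call, so no Vertex AI usage
SELECT
  ticket_id,
  CONCAT('<your instruction here> ', 'Ticket comments: ', comment) AS prompt
FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
LIMIT 3;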

View Solution

SELECT
  ticket_id,
  issue_type AS original_label,
  ml_generate_text_result['candidates'][0]['content']['parts'][0]['text'] AS ai_classification
FROM ML.GENERATE_TEXT(
  MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
  (
    SELECT
      ticket_id,
      issue_type,
      CONCAT(
        'You are a customer service manager. Classify the following ticket into one of these categories: ',
        '[Product Quality, Shipping, Payment, General Inquiry]. ',
        'Do not explain, just return the category name. ',
        'Ticket comments: ', comment
      ) AS prompt
    FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
    WHERE status = 'open'
    LIMIT 5
  ),
  STRUCT(
    0.0 AS temperature,      -- Set to 0 for consistent answers
    10 AS max_output_tokens
  )
);
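The bracket path in the SELECT digs the answer out of the raw JSON response. As an alternative sketch (assuming your BigQuery ML version supports the flatten_json_output option), the model’s text can come back directly as a string column named ml_generate_text_llm_result:

-- Same idea with flatten_json_output: the answer arrives as a plain STRING column
SELECT
  ticket_id,
  ml_generate_text_llm_result AS ai_classification
FROM ML.GENERATE_TEXT(
  MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
  (
    SELECT
      ticket_id,
      CONCAT(
        'Classify this ticket into one of [Product Quality, Shipping, Payment, General Inquiry]. ',
        'Return only the category name. Ticket: ', comment
      ) AS prompt
    FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
    WHERE status = 'open'
    LIMIT 5
  ),
  STRUCT(0.0 AS temperature, 10 AS max_output_tokens, TRUE AS flatten_json_output)
);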

4. Result Verification and Summary Generation

Beyond simple classification, you can also generate “3-line summaries” or “draft responses”.

❓ Problem 3: Ticket Summary and Response Recommendation

Q. Execute a prompt that says “Summarize the ticket content in 3 lines and write a draft response for the agent to send”.

View Solution

SELECT
  ticket_id,
  ml_generate_text_result['candidates'][0]['content']['parts'][0]['text'] AS ai_response
FROM ML.GENERATE_TEXT(
  MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
  (
    SELECT
      ticket_id,
      CONCAT(
        'Summarize this ticket in 3 lines and suggest a polite response draft. ',
        'Ticket: ', comment
      ) AS prompt
    FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
    LIMIT 3
  ),
  STRUCT(0.5 AS temperature, 200 AS max_output_tokens)  -- Raise max_output_tokens if drafts get cut off
);
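In practice you would usually persist these drafts so agents can review them, instead of re-calling the model each time. A minimal sketch (cs_ticket_ai_drafts is a hypothetical destination table name):

-- Materialize AI drafts into a review table (cs_ticket_ai_drafts is a hypothetical name)
CREATE OR REPLACE TABLE `your-project-id.retail_analytics_us.cs_ticket_ai_drafts` AS
SELECT
  ticket_id,
  ml_generate_text_result['candidates'][0]['content']['parts'][0]['text'] AS ai_response,
  CURRENT_TIMESTAMP() AS generated_at
FROM ML.GENERATE_TEXT(
  MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
  (
    SELECT
      ticket_id,
      CONCAT(
        'Summarize this ticket in 3 lines and suggest a polite response draft. ',
        'Ticket: ', comment
      ) AS prompt
    FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
    LIMIT 3
  ),
  STRUCT(0.5 AS temperature, 200 AS max_output_tokens)
);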

💡 Summary

  • BigQuery ML: Apply the latest LLM to large-scale data using SQL alone.
  • Prompt Engineering: Giving the AI a role (Role), constraints (Constraints), and examples (Few-shot) improves performance (see the sketch after this list).
  • Applications: Auto-classification, summarization, sentiment analysis, translation, and countless more.
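As an illustration of the prompt-engineering bullet above, the Problem 2 classification prompt can be extended with a role, an explicit constraint, and a couple of few-shot examples. A sketch (the example tickets inside the prompt are invented for illustration):

-- Role + constraint + few-shot examples in one prompt (example tickets are made up)
SELECT
  ticket_id,
  ml_generate_text_result['candidates'][0]['content']['parts'][0]['text'] AS ai_classification
FROM ML.GENERATE_TEXT(
  MODEL `your-project-id.retail_analytics_us.gemini_pro_model`,
  (
    SELECT
      ticket_id,
      CONCAT(
        'You are a customer service manager. ',                         -- Role
        'Classify the ticket into exactly one of: ',
        '[Product Quality, Shipping, Payment, General Inquiry]. ',
        'Return only the category name. ',                              -- Constraint
        'Example: "The parcel arrived two weeks late" -> Shipping. ',   -- Few-shot example
        'Example: "I was charged twice for one order" -> Payment. ',    -- Few-shot example
        'Ticket: ', comment
      ) AS prompt
    FROM `your-project-id.retail_analytics_us.cs_tickets_dummy`
    WHERE status = 'open'
    LIMIT 5
  ),
  STRUCT(0.0 AS temperature, 10 AS max_output_tokens)
);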

In the next chapter, we’ll use traditional Machine Learning (classification) to predict satisfaction faster and more cheaply than an LLM can.

