sql-server

7 Пост

oracle

7 Пост

postgresql

10 Пост

my-sql

2 Пост

common-sql

1 Пост

News

5 Новости

Oracle Text Search Index Types

The Oracle Text component provides specialized indexing mechanisms for searching textual data inside Oracle databases.
Unlike standard B-tree indexes, Oracle Text indexes create inverted token structures optimized for text retrieval.

Oracle Text provides three primary index types, each designed for a different search problem:

Index Type Purpose
Oracle Text CONTEXT Index Full-text search in large documents
Oracle Text CTXCAT Index Short text search combined with structured filtering
Oracle Text CTXRULE Index Matching incoming text against stored rules

These index types solve different information-retrieval scenarios inside Oracle.


1. CONTEXT Index

Purpose

The CONTEXT index is the primary Oracle Text index used for full-text document search.

It builds an inverted index of tokens extracted from text columns and stores them in internal DR$ tables.
This structure allows Oracle to quickly locate documents containing specific terms.

Typical use cases include:

  • document repositories

  • knowledge bases

  • articles and reports

  • email archives

  • large descriptive text fields

Search operations use the CONTAINS operator.

Example search pattern:

 
CONTAINS(document_text,'oracle AND database') > 0
 

Wildcard Searching

CONTEXT indexes support wildcard searches such as:

 
BANK*
SARG%
A_T
 

Wildcards available in Oracle Text:

Symbol Meaning
  • | prefix match |
    % | multi-character wildcard |
    _ | single-character wildcard |

Efficient wildcard searching requires enabling prefix indexing through a WORDLIST preference.
This configuration instructs Oracle Text to store additional prefix tokens during indexing.

Key parameters include:

Parameter Meaning
PREFIX_INDEX enables prefix token indexing
PREFIX_MIN_LENGTH minimum prefix stored
PREFIX_MAX_LENGTH maximum prefix stored

Without these preferences, wildcard queries may require token scanning, which reduces performance.


2. CTXCAT Index

Purpose

The CTXCAT index is optimized for catalog-style searches, where text searching must be combined with structured column filtering.

Unlike CONTEXT indexes, CTXCAT indexes are designed for:

  • short text fields

  • transactional systems

  • frequent updates

Typical use cases include:

  • product catalogs

  • item descriptions

  • searchable transaction labels

  • attribute filtering with numeric or date conditions

Search operations use the CATSEARCH operator.

Example search pattern:

 
CATSEARCH(product_name,'laptop','price < 1000')
 

In this query:

  • 'laptop' represents the text search expression

  • 'price < 1000' represents the structured filter

CTXCAT indexes update automatically during DML operations, so they do not require synchronization.


3. CTXRULE Index

Purpose

The CTXRULE index supports rule-based text classification.

Instead of searching documents, CTXRULE indexes rules expressed as text queries.
Incoming text is then evaluated against these rules to determine which rules match.

This reverses the traditional search direction:

 
query → documents
 

becomes

 
document → matching rules
 

Typical use cases include:

  • email routing systems

  • news topic filtering

  • fraud detection

  • alert systems

  • automatic document classification

Matching operations use the MATCHES operator.

Example pattern:

 
MATCHES(rule_expression,'incoming document text') > 0
 

The result identifies which rules apply to the document.


Comparison of Oracle Text Index Types

Feature CONTEXT CTXCAT CTXRULE
Primary purpose document search catalog search rule classification
Typical text size large documents short text fields rule expressions
Structured filtering limited strong support not applicable
DML updates requires synchronization automatic automatic
Search operator CONTAINS CATSEARCH MATCHES

Conceptual Roles

The three index types address different search scenarios:

 
CONTEXT → search documents
CTXCAT → search catalog records
CTXRULE → classify documents using rules