Full-text Search

Full-text search allows users to search based on file content. Once enabled, Cloudreve extracts file content via Apache Tika when files are uploaded or updated, and sends it to Meilisearch for indexing. Users can then match text content within documents during searches to quickly locate target files.

Prerequisites

Full-text search depends on the following two external services. Please deploy them before enabling:

| Service | Purpose | Default Port |
| --- | --- | --- |
| Apache Tika | Extract text content from files | 9998 |
| Meilisearch | Index the extracted text content | 7700 |

Deploy Apache Tika Server

Apache Tika is an open-source content extraction tool that supports extracting text from common document formats for indexing.

Start Apache Tika Server via Docker:

```bash
docker run -d --name tika -p 9998:9998 apache/tika:latest
```

Full Image

If you need the full Tika processing capabilities (including GDAL and Tesseract OCR), use the full image:

```bash
docker run -d --name tika -p 9998:9998 apache/tika:latest-full
```

The full image ships with OCR support for English, Italian, French, Spanish, and German. To add other languages, build a custom image based on its Dockerfile.

After starting, visit http://localhost:9998 to verify the service is running properly.
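
The same check can be done from the command line. The sketch below assumes Tika Server's standard REST endpoints (`/version` and `/tika`) and the default address used above; the function names and `TIKA_URL` variable are illustrative, not part of Cloudreve or Tika.

```bash
# Assumed endpoint; override TIKA_URL if Tika runs elsewhere.
TIKA_URL="${TIKA_URL:-http://localhost:9998}"

# Print the version string reported by the Tika server.
tika_version() {
  curl -fsS "$TIKA_URL/version"
}

# Upload a file with PUT and print the plain text Tika extracts from it.
tika_extract_text() {
  local file="$1"
  curl -fsS -T "$file" -H "Accept: text/plain" "$TIKA_URL/tika"
}

# Example usage (sample.pdf is a hypothetical local file):
#   tika_version
#   tika_extract_text ./sample.pdf | head
```

If the second call prints readable text from the document, Tika can extract that file type.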

Deploy Meilisearch

Meilisearch is an open-source search engine used to index file content and provide efficient search capabilities.

Generate API Key

First, generate a random string as the Meilisearch Master Key:

```bash
openssl rand -hex 32
```

Save this key securely, as it will be needed when configuring Cloudreve.

Start Meilisearch

Start Meilisearch via Docker:

```bash
docker run -d \
  --name meilisearch \
  -p 7700:7700 \
  -e MEILI_MASTER_KEY='<Your Master Key>' \
  -v "$(pwd)/meili_data:/meili_data" \
  getmeili/meilisearch:latest
```

After starting, visit http://localhost:7700 to verify the service is running properly.
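
A command-line check is also possible. This is a minimal sketch assuming Meilisearch's `/health` and `/version` endpoints and the default address above; the function names and `MEILI_URL` variable are illustrative.

```bash
# Assumed endpoint; override MEILI_URL if Meilisearch runs elsewhere.
MEILI_URL="${MEILI_URL:-http://localhost:7700}"

# The health endpoint needs no key; a healthy instance answers {"status":"available"}.
meili_health() {
  curl -fsS "$MEILI_URL/health"
}

# /version requires a valid key once a master key is set, so this call
# fails (non-zero exit from curl -f) when the key passed as $1 is wrong.
meili_check_key() {
  curl -fsS -H "Authorization: Bearer $1" "$MEILI_URL/version"
}

# Example usage:
#   meili_health
#   meili_check_key '<Your Master Key>'
```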

Configure Cloudreve

In the Cloudreve admin panel, navigate to Filesystem -> Full-text Search, enable full-text search, and fill in the following settings:

Indexer (Meilisearch)

| Parameter | Description |
| --- | --- |
| Meilisearch Endpoint | Meilisearch service address, defaults to http://localhost:7700. If Cloudreve runs in a Docker container, use the address within the container network. |
| API Key | The Master Key generated in the previous step. |
| AI Semantic Search | Optional feature, see AI Semantic Search for setup instructions. |

Content Extractor (Apache Tika)

| Parameter | Description |
| --- | --- |
| Apache Tika Endpoint | Tika Server address, defaults to http://localhost:9998. If Cloudreve runs in a Docker container, use the address within the container network. |
| Supported Extensions | Specify the list of file extensions to index; files not in the list will be skipped. For supported file types, see Apache Tika Supported Formats. |
| Max File Size | Files exceeding this size will not be indexed. |
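
When Cloudreve itself runs in a container, one way to obtain an "address within the container network" is to attach all three containers to a user-defined Docker network, where containers can reach each other by name. The sketch below assumes the `tika` and `meilisearch` container names used earlier; the network name `cloudreve-net`, the Cloudreve container name `cloudreve`, and the `join_network` helper are hypothetical.

```bash
# Attach the given containers to a user-defined network, creating it if needed.
join_network() {
  local network="$1"
  shift
  # Create the network only if it does not exist yet.
  docker network inspect "$network" >/dev/null 2>&1 || docker network create "$network" || return 1
  local name
  for name in "$@"; do
    docker network connect "$network" "$name" || return 1
  done
}

# Example usage (container name "cloudreve" is an assumption):
#   join_network cloudreve-net tika meilisearch cloudreve
```

With this setup the endpoints would be http://meilisearch:7700 and http://tika:9998 rather than the localhost defaults.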

Chunking

| Parameter | Description |
| --- | --- |
| Chunk Size | File content will be chunked at this size before indexing to improve search accuracy. Meilisearch recommends a chunk size of approximately 1 KB. |
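
To get a feel for what a chunk size means for index volume, the sketch below estimates how many chunks a piece of extracted text would produce using ceiling division. It assumes fixed-size chunks measured in bytes; how Cloudreve actually places chunk boundaries is not described here, so treat the result as a rough estimate. `chunk_count` is an illustrative helper.

```bash
# Estimate the number of chunks for SIZE bytes of text at CHUNK bytes per chunk.
chunk_count() {
  local size="$1" chunk="$2"
  # Ceiling division: a partial final chunk still counts as one chunk.
  echo $(( (size + chunk - 1) / chunk ))
}

# Example usage (extracted.txt is a hypothetical file of extracted text):
#   chunk_count "$(wc -c < extracted.txt)" 1024
```

For instance, 2500 bytes of text at a 1024-byte chunk size gives 3 chunks.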

Verify Configuration

After saving the settings, refresh the page and upload a new document file, then try searching for content within the file to verify that indexing is working properly.
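
If the search returns nothing, it can help to check whether documents reached Meilisearch at all. This sketch assumes Meilisearch's `/stats` endpoint, which reports a `numberOfDocuments` value for each index; the name of the index Cloudreve creates is not stated here, so look at whichever indexes appear. `meili_stats` and `MEILI_URL` are illustrative names.

```bash
# Assumed endpoint; override MEILI_URL if Meilisearch runs elsewhere.
MEILI_URL="${MEILI_URL:-http://localhost:7700}"

# Print instance statistics, including per-index document counts.
# $1 is the Master Key configured for Meilisearch.
meili_stats() {
  curl -fsS -H "Authorization: Bearer $1" "$MEILI_URL/stats"
}

# Example usage:
#   meili_stats '<Your Master Key>'
```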

Reindex Existing Files

Enabling full-text search does not automatically index files that already exist. To index them, go to Filesystem -> Full-text Search -> Indexer (Meilisearch) -> Index Operations in the admin panel and click Rebuild Index.

WARNING

Rebuilding the index will re-extract content and rebuild the index for all eligible files. This may take a long time if there are many files, so it is recommended to perform this during off-peak hours.
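
Meilisearch processes document writes as asynchronous tasks, so its task queue gives a rough view of indexing work still outstanding on the Meilisearch side. The sketch below assumes the `/tasks` endpoint with its `statuses` filter; whether this reflects the overall progress of a Cloudreve rebuild is an assumption, since extraction happens before documents reach Meilisearch. `meili_pending_tasks` and `MEILI_URL` are illustrative names.

```bash
# Assumed endpoint; override MEILI_URL if Meilisearch runs elsewhere.
MEILI_URL="${MEILI_URL:-http://localhost:7700}"

# List up to five tasks that are still enqueued or being processed.
# $1 is the Master Key configured for Meilisearch.
meili_pending_tasks() {
  curl -fsS -H "Authorization: Bearer $1" \
    "$MEILI_URL/tasks?statuses=enqueued,processing&limit=5"
}

# Example usage:
#   meili_pending_tasks '<Your Master Key>'
```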

Next Steps

To enable AI semantic search for a smarter search experience, see AI Semantic Search.