Frequently Asked Questions about Qsirch On-Premises LLM

Last modified date: 2025-10-01

Applicable Products

Qsirch 6.0.0

Applied Firmware:

QTS 5.1.0 or above
QuTS hero 5.1.0 or above
QuTScloud 5.1.0 or above

Overview

Qsirch is a powerful search engine designed exclusively for QNAP NAS, enabling users to quickly locate files and information.

Qsirch 6.0 introduces support for on-premises large language models (LLMs) with retrieval-augmented generation (RAG) search and RAG multi-turn conversations, delivering smarter, context-aware search while keeping your data secure and private.

FAQs

Q1: Does Qsirch on-prem LLM send data outside?

No. All data processing, analysis, and LLM inference are performed locally on the NAS. No content is uploaded or transmitted off the device, which keeps your data private and secure.

Q2: Can I use on-prem LLM if my NAS does not have a GPU?

No. On-prem LLM inference requires GPU computation. Without a GPU, you can connect to a cloud LLM via API for RAG search and still use AI-powered search.

Q3: Can on-prem LLM and cloud LLM be used together?

Yes. If your NAS meets the hardware requirements for on-prem LLM and the model is downloaded, and you also connect to a cloud LLM via API, you can freely switch between model sources during RAG search.
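
The conditions in Q2 and Q3 can be sketched as a small availability check. This is only an illustration of the rules stated above, not Qsirch code: the function name, the three boolean parameters, and the source labels are all hypothetical.

```python
def available_model_sources(meets_hardware_requirements,
                            model_downloaded,
                            cloud_api_configured):
    """Return the model sources a user could switch between.

    All names here are hypothetical; they only mirror the FAQ rules.
    """
    sources = []
    # On-prem needs both suitable hardware (including a GPU) and a downloaded model.
    if meets_hardware_requirements and model_downloaded:
        sources.append("on-prem")
    # A cloud LLM only needs a configured API connection.
    if cloud_api_configured:
        sources.append("cloud")
    return sources


print(available_model_sources(True, True, True))    # both sources available
print(available_model_sources(False, False, True))  # cloud only, e.g. a NAS without a GPU
```

When the returned list contains both entries, either source can be chosen for a given RAG search.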

Q4: How much NAS storage space is required for model deployment?

It depends on the model size. LLM model files typically range from a few gigabytes to tens of gigabytes. We recommend storing models on SSDs to reduce loading and inference latency.

Q5: What is RAG multi-turn conversation?

RAG multi-turn conversation allows the AI to retain the current conversation context and provide follow-up analysis and responses based on previous search results, without requiring users to re-enter the full query each time.

Q6: How long is multi-turn conversation history retained?

The system retains a limited number of multi-turn conversation histories for future reference and searches. When the storage limit is exceeded, the oldest records, as determined by their last modified date, are deleted automatically to maintain performance and a smooth user experience.
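
The retention rule described in this answer can be illustrated with a minimal sketch. The actual limit and data model used by Qsirch are not documented here, so the `Conversation` class, the `prune_history` function, and the `max_records` value are assumptions made for the example.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class Conversation:
    # Hypothetical record; Qsirch's real schema is not documented in this FAQ.
    conversation_id: str
    last_modified: datetime


def prune_history(conversations, max_records):
    """Keep only the max_records most recently modified conversations."""
    if max_records < 0:
        raise ValueError("max_records must be non-negative")
    # Sort newest first so the slice drops the oldest records.
    newest_first = sorted(conversations,
                          key=lambda c: c.last_modified,
                          reverse=True)
    return newest_first[:max_records]


history = [
    Conversation("a", datetime(2025, 9, 1)),
    Conversation("b", datetime(2025, 9, 20)),
    Conversation("c", datetime(2025, 9, 10)),
]
kept = prune_history(history, max_records=2)
print([c.conversation_id for c in kept])  # ['b', 'c']
```

Because pruning is based on the last modified date, continuing an old conversation would move it back to the front of the list in this sketch.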

Q7: Does multi-turn conversation significantly impact performance?

Under typical usage, the impact is minimal. However, more conversations and longer contexts require more system resources, and response times may increase when the system handles a large number of files or multiple simultaneous searches. For the best experience, we recommend using multi-turn conversation when sufficient hardware resources (especially GPU/VRAM) are available.

Q8: Can I switch the search scope in RAG search?

Yes. The default search scope is "Global Search." If you select "Specified Folder Search," you must choose between 1 and 50 folders.
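
The folder limits above amount to a simple validation rule. The sketch below is a hypothetical illustration: the scope labels `"global"` and `"specified"`, the function name, and the decision to ignore duplicate folders are assumptions, not part of the Qsirch interface.

```python
MIN_FOLDERS = 1   # lower bound stated in the FAQ
MAX_FOLDERS = 50  # upper bound stated in the FAQ


def validate_scope(scope, folders=()):
    """Return the folders to search; an empty list means no folder restriction."""
    if scope == "global":
        return []
    if scope == "specified":
        # Assumption: duplicate selections count once; order is preserved.
        unique = list(dict.fromkeys(folders))
        if not MIN_FOLDERS <= len(unique) <= MAX_FOLDERS:
            raise ValueError(
                f"select between {MIN_FOLDERS} and {MAX_FOLDERS} folders, "
                f"got {len(unique)}"
            )
        return unique
    raise ValueError(f"unknown scope: {scope!r}")


print(validate_scope("global"))
print(validate_scope("specified", ["/Public", "/Docs"]))
```

Calling `validate_scope("specified", [])` raises a `ValueError`, which mirrors the requirement to choose at least one folder.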

Q9: Can on-prem LLM be shared across multiple NAS devices?

No. Each NAS must independently deploy and download the model. Model files cannot be directly shared between NAS devices.

Q10: Are model updates performed automatically?

Cloud models are updated with each Qsirch version update, and on-prem models are updated periodically together with LLM Core. To avoid performance or compatibility issues caused by version changes, download the new version of Qsirch or LLM Core when it becomes available.

Q11: Can encrypted folders be included as data sources for RAG search?

Yes, but encrypted folders must be unlocked before searching; otherwise, the system cannot access their contents. Any folder accessible within Qsirch can be used as a source for RAG search.

Q12: Does connecting to a cloud LLM via API incur additional costs?

Yes. Cloud LLM services (such as OpenAI and Google Gemini) are billed according to the provider's API pricing policy. QNAP does not charge additional fees for this integration.

Q13: Why can't the on-prem LLM be started? Is it related to GPU VRAM?

Yes. If the size of the on-prem model exceeds the available GPU memory (VRAM), the system will not be able to load the model successfully, which prevents the feature from starting. To ensure proper operation, check whether your GPU VRAM is sufficient for the selected model. If resources are insufficient, consider switching to a smaller model or upgrading your GPU hardware.
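
The sizing rule in this answer can be expressed as a short check that picks the largest model fitting in the available VRAM. This is a sketch under stated assumptions: the model names and sizes are invented for illustration, and the comparison uses model size alone, exactly as the answer describes, without modeling any additional memory that inference may need.

```python
def largest_model_that_fits(model_sizes_gib, available_vram_gib):
    """Return the name of the largest model whose size fits in VRAM, or None."""
    fitting = {
        name: size
        for name, size in model_sizes_gib.items()
        if size <= available_vram_gib
    }
    if not fitting:
        return None  # no model fits: use a cloud LLM or upgrade the GPU
    return max(fitting, key=fitting.get)


# Hypothetical model catalog; these names and sizes are not Qsirch models.
models = {"small-model": 4.5, "medium-model": 9.0, "large-model": 20.0}

print(largest_model_that_fits(models, available_vram_gib=12.0))  # medium-model
print(largest_model_that_fits(models, available_vram_gib=2.0))   # None
```

A result of `None` corresponds to the case where the feature cannot start because no available model fits in GPU memory.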

