How to Evaluate an AI Product Assistant for the Lighting Industry
Every AI vendor will tell you their product is accurate, fast, and easy to use. Without a meaningful evaluation framework, you cannot tell those claims apart or test any of them. This guide gives you seven specific criteria for evaluating an AI product assistant in a lighting industry context, along with the questions that surface the real answers rather than the marketing ones.
These criteria are drawn from the failure modes that show up most consistently when AI tools are deployed in product knowledge workflows: wrong answers that sound right, correct answers that can't be verified, and tools that work in demos but degrade in daily use.
Criterion 1 — Does it answer from your catalog, or from general knowledge?
This is the single most important question in any AI product tool evaluation. There are two fundamentally different types of AI product assistants:
- Catalog-grounded tools that answer exclusively from documents you have provided — your spec sheets, ordering guides, compatibility tables, and product catalogs. Every answer is traceable to a specific source document.
- General-knowledge tools that answer from training data — a broad model that has ingested information from across the internet, including product information about your category. These tools can answer questions about lighting products in general, but cannot reliably answer questions about your specific part numbers, configurations, and current catalog.
In a product sales context, the distinction is critical. A tool that answers from general knowledge will produce plausible-sounding responses about products that may no longer exist, configurations that are not valid, and specifications that belong to a different manufacturer's product. The test is simple: ask the tool about a product that was discontinued two years ago and see what it says.
Ask the vendor: "If I add a new product to my catalog today, when will the AI be able to answer questions about it?" The answer reveals the architecture. Catalog-grounded tools can answer questions about the new product within hours or less, as soon as its documents are indexed. General-knowledge tools require retraining, a process that takes days or weeks.
Criterion 2 — Can every answer be traced to a source?
In a lighting specification workflow, an answer is only as good as the ability to verify it. When a rep tells a specifier that a fixture has a specific lumen output at a specific CCT, that claim needs to be traceable to a photometric data sheet or product specification — not trusted on faith from an AI.
Source traceability also matters when an AI answer is wrong. If the tool cannot tell you which document it used to generate a response, you have no way to identify whether the error is in the document, in the AI's reading of the document, or in a gap between what you asked and what the document covers.
Ask the vendor: "When the AI gives an answer, can I see which document it came from and where in that document the information appears?" If the answer is no, or if the attribution is at the document level only rather than the section level, treat that as a limitation for audit and quality control purposes.
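As a rough illustration of the difference between document-level and section-level attribution, here is a minimal Python sketch. The `Citation` and `Answer` classes and the `traceability_level` function are hypothetical names invented for this example; they do not describe any particular vendor's API.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Citation:
    document: str                 # e.g. a spec sheet file name (hypothetical)
    section: Optional[str]        # None means document-level attribution only
    page: Optional[int] = None


@dataclass(frozen=True)
class Answer:
    text: str
    citations: Tuple[Citation, ...] = ()


def traceability_level(answer: Answer) -> str:
    """Classify how far an answer can be traced back to its source."""
    if not answer.citations:
        return "none"
    if all(c.section for c in answer.citations):
        return "section"
    return "document"
```

An answer graded "section" can be checked against the exact passage it came from. An answer graded "document" leaves the reviewer searching the whole file, and "none" cannot be audited at all.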
Criterion 3 — How does it handle what it doesn't know?
This criterion is often overlooked in AI evaluations because demos are designed to show what the tool can do, not what happens at the edges. But in daily use, the edge cases are constant: a product that was introduced last month and isn't in the indexed documents yet, a compatibility question the spec sheet doesn't address, a configuration that falls outside the documented range.
The right behaviour is to acknowledge the gap clearly and direct the user to the appropriate resource — a product specialist, a technical support contact, or the manufacturer directly. The wrong behaviour is to generate a confident answer from inference that may be incorrect.
Test this in the demo: Ask about a product that is not in the catalog, or ask a question that your own team cannot answer without consulting an engineer. Evaluate whether the tool says "I don't have that information" or whether it produces something that sounds plausible but is unverifiable.
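The behaviour being tested can be sketched in a few lines of Python. This is a toy under stated assumptions: simple word overlap stands in for whatever retrieval a real product uses, and the function names, the `min_overlap` threshold, and the fallback wording are all invented for illustration.

```python
import re

FALLBACK = "I don't have that information. Please contact a product specialist."


def tokens(text):
    """Lowercase alphanumeric tokens of a string, as a set."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def answer_from_catalog(question, passages, min_overlap=0.5):
    """passages: dict mapping a source label to its passage text.

    Returns the best-matching passage with its source, or the fallback
    message when no passage covers enough of the question.
    """
    q = tokens(question)
    best_source, best_score = None, 0.0
    for source, text in passages.items():
        # Share of the question's words that appear in this passage.
        score = len(q & tokens(text)) / len(q) if q else 0.0
        if score > best_score:
            best_source, best_score = source, score
    if best_source is None or best_score < min_overlap:
        return {"answer": FALLBACK, "source": None}
    return {"answer": passages[best_source], "source": best_source}
```

The point of the sketch is the final branch: when the evidence is too weak, the tool returns an explicit "I don't have that information" with no source, rather than composing an answer from inference.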
Criterion 4 — How quickly does it reflect catalog changes?
A lighting manufacturer's catalog is not static. Products are added, discontinued, and revised continuously. Pricing changes. Certifications expire and are renewed. Compatibility rules change when a new accessory line is introduced.
An AI product tool that doesn't reflect your current catalog is not just inconvenient — it is a liability. A rep who confidently quotes a discontinued product because the AI didn't know it was discontinued has a problem that goes beyond a wrong quote.
Ask the vendor: "What is the process for updating the AI when my catalog changes? How long does it take for a new spec sheet to be queryable?" The answer should be measured in hours, not weeks.
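During a pilot, you can measure this yourself instead of relying on the vendor's answer. The Python sketch below assumes you log when each document was published and when it first became queryable; the function name and the 24-hour default threshold are illustrative assumptions, not an industry standard.

```python
from datetime import datetime, timedelta


def stale_documents(published, indexed, now, max_lag=timedelta(hours=24)):
    """Return ids of documents that were not queryable within max_lag.

    published: dict of doc id -> datetime the document was released
    indexed:   dict of doc id -> datetime it first became queryable
    now:       current time, used for documents that are still not indexed
    """
    stale = []
    for doc_id, pub_time in published.items():
        # A document that is not indexed yet has a lag that is still growing.
        queryable_at = indexed.get(doc_id, now)
        if queryable_at - pub_time > max_lag:
            stale.append(doc_id)
    return sorted(stale)
```

Running this against a week of real catalog changes gives a concrete number to compare with what the vendor claimed.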
Criterion 5 — Is the deployment under your brand?
The brand question matters more than it might initially seem. An AI product assistant that presents as a third-party tool — even a good one — introduces a layer of distance between your company and the service experience. Your reps and your customers are interacting with someone else's interface.
A white-label deployment that presents under your brand name, your visual identity, and your product vocabulary creates a different experience — one where your company is the source of the intelligence, not a conduit to someone else's platform.
This distinction also has commercial implications. A branded tool reinforces your company's expertise every time it's used. A third-party tool reinforces the platform's brand instead.
Criterion 6 — Where does your data go, and who can access it?
Product documentation is proprietary. Ordering guides, compatibility tables, and application notes represent years of engineering work and market expertise. Understanding where this data goes when you upload it to an AI platform is not a legal formality — it is a business question.
The minimum standard is a written data processing agreement that addresses: whether your data is used to train shared models, how long it is retained after contract termination, who on the vendor's team can access it, and what the deletion process looks like.
See our full guide on AI data security for manufacturers for a complete checklist of questions to ask any AI vendor before uploading proprietary documentation.
Criterion 7 — Can it handle the complexity of your actual catalog?
Generic demos use simple queries. "What is the lumen output of this fixture?" is a straightforward retrieval question. The real complexity of lighting product sales looks different:
- "Is this trim ring compatible with this housing when installed in a remodel application with a 5-inch diameter hole?"
- "We need a 4000K, 90 CRI, wet-location luminaire that can work with a trailing-edge dimmer and mount to a 1-inch stem — what's the closest option in the current catalog?"
- "The contractor has already ordered this housing. What trims are compatible and what's the lead time on the 80 CRI version?"
These questions require the AI to hold multiple constraints simultaneously, cross-reference multiple documents, and understand the relationship between product components. Test with questions from your actual daily workflow — not from the demo script the vendor prepared.
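To make "multiple constraints simultaneously" concrete, here is a minimal Python sketch of the second example question expressed as a filter over structured catalog records. The `Luminaire` fields, the `find_matches` function, and the `HYP-` part numbers are all hypothetical; a real catalog has far more attributes and a real assistant must first extract these constraints from free text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Luminaire:
    part_number: str
    cct_k: int                # correlated colour temperature in kelvin
    cri: int
    wet_location: bool
    dimming: frozenset        # supported dimmer types
    stem_mount_in: frozenset  # supported stem diameters in inches


def find_matches(catalog, cct_k, min_cri, wet_location, dimmer, stem_in):
    """Return the products that satisfy every constraint at once."""
    return [
        p for p in catalog
        if p.cct_k == cct_k
        and p.cri >= min_cri
        and (p.wet_location or not wet_location)
        and dimmer in p.dimming
        and stem_in in p.stem_mount_in
    ]


# Hypothetical records for illustration only.
DEMO_CATALOG = [
    Luminaire("HYP-001", 4000, 90, False, frozenset({"trailing-edge", "0-10V"}), frozenset({1.0})),
    Luminaire("HYP-002", 4000, 92, True, frozenset({"trailing-edge"}), frozenset({0.5, 1.0})),
    Luminaire("HYP-003", 3000, 90, True, frozenset({"trailing-edge"}), frozenset({1.0})),
    Luminaire("HYP-004", 4000, 90, True, frozenset({"leading-edge"}), frozenset({1.0})),
]

matches = find_matches(DEMO_CATALOG, 4000, 90, True, "trailing-edge", 1.0)
print([p.part_number for p in matches])
```

Each of the other three records fails exactly one constraint, which is the kind of near miss that a tool answering one attribute at a time will get wrong. An exact filter like this is only a first step: the original question asks for the "closest" option, which also requires ranking the near misses.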
The evaluation process in practice
The most reliable evaluation method is a pilot against a section of your own catalog. Ask the vendor to index a representative sample of your product documentation — one or two product families with the full accompanying documentation. Then have your most experienced product specialists ask the questions they answer every day.
What you're looking for is not perfection — it's the ratio of correct and verifiable answers to incorrect or unverifiable ones, and the quality of the tool's behaviour when it reaches the limits of its knowledge.
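One way to turn that into numbers is to have specialists grade each pilot answer and then tally the grades. The Python sketch below is a minimal example under assumptions: the `GradedAnswer` fields and the two summary metrics are one possible scheme, not a standard scoring method.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GradedAnswer:
    in_catalog: bool   # could the indexed documents answer this question?
    abstained: bool    # did the tool say it did not know?
    correct: bool      # specialist's judgement (ignored when abstained)
    verifiable: bool   # did the answer cite a source that could be checked?


def score_pilot(graded):
    """Summarise a pilot: answer quality plus behaviour at the knowledge limits."""
    answered = [g for g in graded if not g.abstained]
    good = sum(1 for g in answered if g.correct and g.verifiable)
    bad = len(answered) - good  # incorrect or unverifiable
    out_of_scope = [g for g in graded if not g.in_catalog]
    proper_abstentions = sum(1 for g in out_of_scope if g.abstained)
    return {
        "good": good,
        "bad": bad,
        "good_share": good / len(answered) if answered else None,
        "abstention_rate_out_of_scope": (
            proper_abstentions / len(out_of_scope) if out_of_scope else None
        ),
    }
```

Under this scheme a correct answer without a checkable source counts as "bad", which matches the standard set above: an answer that cannot be verified is not yet usable in a specification workflow.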
Evaluate Aurex against your own catalog
Every Aurex demo runs against real manufacturer documentation
We don't use a generic demo catalog. We configure Aurex against a sample of your actual product documentation — your spec sheets, your ordering guides, your compatibility tables — and show you exactly how it performs against the questions your team asks every day.
Every answer in the demo is traced to a source document. Every gap is acknowledged honestly. You see the real performance before any commitment.
Request a Demo