The Relevance Evaluation Specialist is responsible for assessing how effectively an AI-powered search engine surfaces and generates content in response to user queries. This role focuses on evaluating the alignment between user intent and system outputs, including documents, images, videos, and AI-generated answers. The evaluator applies structured judgment to determine whether retrieved or generated content satisfies the user's information need, using predefined relevance criteria and standardized scoring guidelines.
Core responsibilities include reviewing query–artifact pairs and assigning relevance scores on a five-point scale, ranging from not relevant to fully satisfying the user's intent. The specialist must analyze and infer user intent by examining query language, system-provided context such as timing and requester metadata, and any available interaction history. Accurate evaluation requires using the AI search engine's retrieval and assistant capabilities to reproduce queries, compare expected versus actual results, and determine the appropriateness of surfaced content.
In addition to document relevance, the role involves evaluating the quality of AI-generated responses. This includes assessing factual correctness, completeness, clarity, and whether the response appropriately addresses the inferred user intent. The evaluator must also determine whether the system behaves correctly when information is incomplete or unavailable, such as providing partial answers or explicitly acknowledging uncertainty. To support accurate assessments, the specialist is expected to independently research unfamiliar terminology, concepts, or domain-specific knowledge referenced in queries or responses.
The position requires the ability to formulate effective search queries and conversational prompts, including constructing and executing KQL queries against Elasticsearch indices when necessary to validate retrieval behavior. Evaluation decisions must be applied consistently across a high volume of tasks and documented clearly in accordance with established quality, audit, and compliance standards. Accuracy, consistency, and adherence to evaluation guidelines are prioritized over speed.
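As an illustration, a query of the kind this role might construct could resemble the following KQL (presumably Kibana Query Language) filter; the field names (`title`, `file_type`, `modified_date`) are hypothetical and would depend on the actual index mapping:

```
title:"quarterly report" and file_type:pdf and modified_date >= "2024-01-01"
```

A filter like this would let an evaluator confirm whether a document the system surfaced (or failed to surface) actually exists in the index and matches the inferred intent of the user's query.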
Qualified candidates must demonstrate strong English reading comprehension and written communication skills, the ability to reason through ambiguous or underspecified queries, and disciplined attention to detail. Familiarity with modern search behavior—including keyword-based, natural-language, and conversational querying—is required, along with comfort using web-based tools and internal evaluation platforms.
Preferred qualifications include prior experience in relevance labeling, search quality evaluation, content review, or data annotation; exposure to AI-driven search or information retrieval systems; and the ability to write basic Python scripts for task automation or data handling. Familiarity with Elasticsearch concepts or query languages, as well as experience evaluating technical, business, or operational content across varied domains, is considered beneficial.
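To give a sense of the "basic Python scripts for task automation or data handling" mentioned above, a minimal sketch is shown below. The export format and field names (`query_id`, `artifact_id`, `score`) are assumptions for illustration, not a real schema:

```python
# Hypothetical sketch: summarize relevance scores from an evaluation export.
# The CSV columns (query_id, artifact_id, score) are assumed, not a real schema.
import csv
import io
from collections import Counter

SAMPLE_EXPORT = """query_id,artifact_id,score
q1,d1,5
q1,d2,2
q2,d3,4
q2,d4,1
q2,d5,5
"""

def score_distribution(csv_text: str) -> Counter:
    """Count how often each relevance score on the five-point scale was assigned."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return Counter(int(row["score"]) for row in reader)

dist = score_distribution(SAMPLE_EXPORT)
print(dict(sorted(dist.items())))
```

A script along these lines could help an evaluator spot drift in their own labeling (for example, an unusually high share of extreme scores) before submitting a batch.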
This role operates within a rule-driven evaluation environment that emphasizes independent judgment within defined guidelines. Performance is measured by consistency, correctness, and alignment with relevance standards, with direct impact on the accuracy, usefulness, and reliability of the AI-powered search engine.
Salary Range: $23.49–$37.10 USD (Hourly)
Astreya offers comprehensive benefits to all Regular, Full-Time Employees, including: