NHS National Record Locator (NRL) API Integration: Error Handling, Response Codes, and Edge Cases

Written by Technical Team · Last updated 23.01.2026


Integrating with the NHS National Record Locator (NRL) is less about “making a request and getting a DocumentReference back” and more about building a client that stays safe, predictable and diagnosable under real-world pressure. In production, your integration will be exercised by partial patient context, mixed producer behaviours, evolving pointer lifecycles, strict security boundaries, and the occasional infrastructure wobble. The difference between a fragile integration and an operationally mature one is almost always in the details: how you interpret response codes, how you parse and categorise FHIR OperationOutcome, and how you design for the awkward edge cases that only appear once you have volume.

NRL’s error model is also deceptively layered. Some failures are generated by the NRL service itself (for example, duplicate pointers or business-rule validation), while others come from shared Spine components (for example, missing headers or unsupported media types), and still others can be produced by the Spine Secure Proxy (SSP) in front of the service (for example, downstream timeouts). Your application can’t treat all 4xx and 5xx responses as identical “errors”; it needs a policy-driven approach that decides what to show to users, what to retry, what to log, and what to escalate.

This article focuses on the parts that typically cause the most integration pain: error handling patterns, response code interpretation, and the edge cases that appear once you move beyond happy-path demos. The aim is to help you design an NRL client that is resilient, supportable, and aligned with how the NRL ecosystem behaves in practice.

NRL error handling fundamentals: HTTP status codes plus FHIR OperationOutcome

NRL error handling works best when you treat the HTTP status code as the envelope and OperationOutcome as the payload that tells you what actually happened. The status code tells you the broad class of failure (client error vs server error vs transient gateway problem), while OperationOutcome.issue provides the structured reason, including severity, an issue type, and a Spine-aligned error code that can be used for consistent categorisation across national APIs.
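
To make that concrete, the sketch below (Python, and assuming the body has already been decoded from JSON into a dict) pulls out the fields that typically drive categorisation: issue severity, the FHIR issue type, the details coding, and diagnostics. The field paths follow the standard FHIR OperationOutcome structure; the specific codings NRL returns should always be checked against the current specification rather than inferred from this illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ParsedIssue:
    """The parts of an OperationOutcome.issue we use for categorisation."""
    severity: Optional[str]        # e.g. "error", "warning"
    code: Optional[str]            # FHIR issue type, e.g. "invalid", "not-found"
    detail_code: Optional[str]     # Spine-aligned error code from details.coding
    detail_display: Optional[str]
    diagnostics: Optional[str]


def parse_operation_outcome(outcome: dict) -> list[ParsedIssue]:
    """Extract the structured reason(s) from a decoded OperationOutcome dict."""
    issues = []
    for issue in outcome.get("issue", []):
        coding = (issue.get("details") or {}).get("coding") or [{}]
        issues.append(ParsedIssue(
            severity=issue.get("severity"),
            code=issue.get("code"),
            detail_code=coding[0].get("code"),
            detail_display=coding[0].get("display"),
            diagnostics=issue.get("diagnostics"),
        ))
    return issues
```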

A common early mistake is to build logic such as “400 means the request is wrong; 500 means retry”. That’s not sufficient for NRL. A 400 could be a missing or invalid mandatory header, a malformed payload, an invalid NHS number, an invalid search parameter, or a business-rule violation in a DocumentReference that looks syntactically valid but fails NRL pointer rules. Each of these requires a different product response: some can be corrected by the user, some require configuration changes, and some should be treated as integration defects.

Conversely, not all errors that feel like “your fault” arrive as 4xx. If you send a request through SSP and the downstream service is unavailable, you may see a 502 “bad gateway” style response. If the downstream times out, a 504 can result. Those aren’t problems your users can fix, but your client still needs to provide a clear message, capture diagnostic context, and decide whether to retry.

Another subtlety is that the response format is not always what you requested. In the wider Spine ecosystem, some low-level infrastructure and security errors can be generated by components that do not honour content negotiation in the way your application expects. That means you should be prepared to parse OperationOutcome in more than one representation, or at least fail gracefully and surface the raw body in secure logs for troubleshooting. If your error-handling pipeline assumes JSON only, you can end up with an error about your inability to parse an error—which is the sort of operational irony that makes on-call shifts longer than they need to be.
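
One defensive pattern, sketched below on the assumption that you are using Python with the `requests` library, is to attempt a JSON parse first, fall back to a generic XML parse, and otherwise preserve the raw body for secure logging rather than letting a secondary parsing exception mask the original failure.

```python
import json
import xml.etree.ElementTree as ET


def decode_error_body(response) -> dict:
    """Best-effort decode of an error body that may be JSON, XML, or neither.

    Always returns a dict, so the caller has *something* to log securely.
    """
    body = response.text or ""

    # Try JSON first: the representation we normally request.
    try:
        parsed = json.loads(body)
        if parsed.get("resourceType") == "OperationOutcome":
            return {"format": "json", "operation_outcome": parsed}
        return {"format": "json", "other_resource": parsed}
    except (json.JSONDecodeError, AttributeError):
        pass

    # Fall back to XML, which some infrastructure layers may return.
    try:
        root = ET.fromstring(body)
        return {"format": "xml", "root_tag": root.tag, "raw_body": body}
    except ET.ParseError:
        pass

    # Neither: keep the raw body for secure diagnostic logging.
    return {"format": "unknown", "raw_body": body}
```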

Finally, treat OperationOutcome.id (where present) as an important correlation key. While you should never expose sensitive internal identifiers to end users in a way that leaks information, capturing that ID in logs and support tooling can materially reduce time-to-resolution when working with platform support teams. Pair it with your own correlation IDs and a request/response audit trail (with appropriate redaction) and you have the basis for supportable, evidence-driven troubleshooting.

NRL response codes you must handle in consumer and producer workflows

NRL integrations usually fall into two broad modes: consumer behaviours (searching and reading pointers) and producer behaviours (creating, updating/superseding, and deleting pointers). Each mode has distinct “normal” responses and distinct failure patterns, so your client should treat them as separate flows with their own success and error policies.

In consumer flows, a successful search typically returns a FHIR Bundle containing zero or more DocumentReference resources that match the criteria (such as NHS number and type filters). A successful read returns a single DocumentReference. In producer flows, successful create/supersede interactions return confirmation that the pointer was persisted (and may return the created resource or a response wrapper depending on the interaction pattern), while delete interactions confirm removal or inactivation according to the API’s lifecycle rules.
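
For orientation, a consumer search might look something like the sketch below. The base URL, identifier system and search parameter shape are illustrative assumptions rather than definitive values; the fromASID/toASID and Authorization headers are the ones discussed in this article, and the exact set of mandatory headers and parameters must come from the current NRL specification and your Spine onboarding configuration.

```python
import requests

# Placeholder values: replace with your environment's real endpoint and identifiers.
NRL_BASE_URL = "https://nrl.example.nhs.uk/fhir"          # illustrative only
NHS_NUMBER_SYSTEM = "https://fhir.nhs.uk/Id/nhs-number"   # assumed identifier system


def search_pointers(session: requests.Session, nhs_number: str,
                    from_asid: str, to_asid: str, bearer_token: str) -> requests.Response:
    """Consumer flow sketch: search for DocumentReference pointers for a patient."""
    headers = {
        "Accept": "application/fhir+json",
        "Authorization": f"Bearer {bearer_token}",
        "fromASID": from_asid,   # header names as referenced in this article;
        "toASID": to_asid,       # confirm exact names and casing against the spec
    }
    params = {
        # Parameter shape is an assumption; some NRL versions express the
        # patient differently, so confirm against the version you integrate with.
        "subject:identifier": f"{NHS_NUMBER_SYSTEM}|{nhs_number}",
    }
    return session.get(f"{NRL_BASE_URL}/DocumentReference",
                       headers=headers, params=params, timeout=10)
```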

The important part is not merely recognising success, but recognising which kind of success you got. For example, “search succeeded and returned zero results” is not the same operationally as “search failed because the NHS number is invalid” or “search failed because mandatory headers were missing”. If you collapse these into a single “no records” behaviour, you will mask genuine defects and create unsafe user experiences.

The list below highlights the response codes that most commonly matter in NRL integration, along with the practical meaning you should assign them. Exact payload content varies by scenario, but this gives you a robust mapping that you can use as the backbone of your error-handling policy; a minimal policy-table sketch follows the list.

  • 200 OK: The request succeeded. For consumer search, expect a Bundle. For consumer read, expect a DocumentReference. For some update/delete patterns, you may also see 200 with a confirmation payload. Your application should still validate that the returned resource(s) meet the profile expectations you rely on (for example, required fields you later use to build a “view document” link).
  • 201 Created: The pointer was created successfully. Treat this as a positive outcome for producer create flows and record the created resource identifiers and master identifiers for future reconciliation.
  • 400 Bad Request: This is a broad bucket and is where many integration issues surface. Common subtypes include missing/invalid headers, invalid parameters, invalid NHS number, payload syntax errors, and payload business-rule violations. Your client should not treat all 400s equally; it should parse OperationOutcome and categorise by the Spine error code and diagnostics.
  • 401 Unauthorised / credential errors: Typically indicates authentication or token problems. These should be treated as configuration or operational issues rather than user-correctable input problems. Ensure you do not automatically retry in a tight loop; fix the underlying credential path.
  • 403 Forbidden: Usually indicates authorisation failures or consent/access restrictions. Depending on context, it can mean the client is not authorised for the interaction (for example, ASID not authorised), access is denied, or the request doesn’t align with access controls. Users should see a clear “you don’t have permission” style message; support teams should see the detailed categorisation.
  • 404 Not Found: In NRL, this can mean “no record found” for a referenced resource (such as a DocumentReference) or for a patient lookup context depending on the operation. It’s critical to distinguish “pointer not found” from “patient not found” or “organisation not found” patterns by reading OperationOutcome.details.
  • 405 Method Not Allowed / Not Supported: Often indicates an invalid verb for an endpoint or a restriction that prevents certain methods. Treat as an integration defect and fix your client, not your data.
  • 415 Unsupported Media Type: Indicates issues with Accept, Content-Type, or _format usage. This is usually a client configuration problem rather than a transient outage.
  • 422 Unprocessable Entity: Commonly used where the request is well-formed but semantically invalid against FHIR profile constraints or business rules. This often overlaps conceptually with “invalid resource” and “invalid parameter” categories and should prompt payload validation improvements.
  • 500 Internal Server Error: An unexpected server-side issue. Your client should log rich context, present a service-unavailable message to users, and consider retry behaviour only if it is safe and bounded.
  • 501 Not Implemented: The server does not support the requested resource or operation. This is almost always a client compatibility/versioning issue rather than a transient fault.
  • 502 Bad Gateway / 504 Gateway Timeout: Typically generated by gateway/proxy layers when downstream services are unavailable or slow. These are the classic candidates for controlled retries with backoff, because they are often transient.
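
A minimal sketch of that backbone is a static mapping from status code to a default category and retry stance, which the OperationOutcome-driven interpreter described next can then refine. The category names are our own illustrative labels, not values defined by NRL.

```python
# Default policy per HTTP status code; OperationOutcome parsing refines these.
# Category names are illustrative, not taken from any NRL specification.
STATUS_POLICY = {
    200: {"category": "success",             "retryable": False},
    201: {"category": "created",             "retryable": False},
    400: {"category": "client_request",      "retryable": False},
    401: {"category": "authentication",      "retryable": False},
    403: {"category": "authorisation",       "retryable": False},
    404: {"category": "not_found",           "retryable": False},
    405: {"category": "integration_defect",  "retryable": False},
    415: {"category": "content_negotiation", "retryable": False},
    422: {"category": "payload_semantics",   "retryable": False},
    500: {"category": "service_error",       "retryable": False},  # retry only if bounded and safe
    501: {"category": "compatibility",       "retryable": False},
    502: {"category": "service_transient",   "retryable": True},
    504: {"category": "service_transient",   "retryable": True},
}
```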

One of the most helpful design decisions you can make is to build a small internal “NRL response interpreter” that produces a normalised result object for the rest of your application. Instead of sprinkling status checks across your codebase, you centralise the policy: parse HTTP status, parse OperationOutcome when present, extract a category, extract a user-facing message template, extract a support-facing diagnostic package, and return a single, structured decision.

That interpreter should also handle the reality that errors can be generated by different layers. For example, a missing fromASID or toASID header is not the same thing as an invalid DocumentReference business rule, even though both may be 400-series outcomes. Your interpreter should explicitly recognise “header validation” outcomes and route them to configuration remediation, not to “user input correction”.
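
Pulling those pieces together, a sketch of the interpreter might look like the following. It reuses the parsing helpers and status policy table from the earlier sketches, and the detail-code check is deliberately simplistic: in a real client you would map the specific Spine error codes you actually encounter explicitly rather than matching on substrings.

```python
from dataclasses import dataclass, field


@dataclass
class NrlResult:
    """Normalised outcome consumed by the rest of the application."""
    ok: bool
    category: str                  # e.g. "success", "configuration_headers", "service_transient"
    retryable: bool
    user_message_key: str          # key into user-facing message templates
    diagnostics: dict = field(default_factory=dict)  # support-facing, redacted before logging


def interpret_response(response) -> NrlResult:
    """Turn an HTTP response into a single, policy-driven decision."""
    policy = STATUS_POLICY.get(response.status_code,
                               {"category": "unexpected", "retryable": False})
    if response.status_code in (200, 201):
        return NrlResult(True, policy["category"], False, "ok", {})

    decoded = decode_error_body(response)
    issues = parse_operation_outcome(decoded.get("operation_outcome", {}))
    category = policy["category"]

    # Refine broad 400s using the Spine-aligned detail code where one is present.
    # The substring check below is illustrative; map your real codes explicitly.
    detail = (issues[0].detail_code or "").upper() if issues else ""
    if response.status_code == 400 and "HEADER" in detail:
        category = "configuration_headers"   # route to config remediation, not user correction

    return NrlResult(
        ok=False,
        category=category,
        retryable=policy["retryable"],
        user_message_key=category,
        diagnostics={
            "status": response.status_code,
            "issues": [i.__dict__ for i in issues],
            "body_format": decoded.get("format"),
        },
    )
```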

Building a resilient NRL client: validation, retries, idempotency and observability

A robust NRL integration starts before you ever make a call. Pre-validation is not about duplicating server logic; it is about preventing avoidable failures that generate noise, latency, and operational cost. Validate NHS numbers locally (including checksum), validate that you have the required headers configured, validate that you can construct required identifiers and parameters, and validate content negotiation settings. The aim is to catch the obvious problems before they become runtime errors in a live clinical workflow.
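
NHS number validation is a good example of cheap pre-validation: the check digit uses the standard Modulus 11 scheme, so a local check such as the sketch below catches mis-keyed values before they ever become 400 responses in a live workflow.

```python
def is_valid_nhs_number(value: str) -> bool:
    """Validate a 10-digit NHS number using the standard Modulus 11 check digit."""
    cleaned = value.replace(" ", "")
    if len(cleaned) != 10 or not cleaned.isdigit():
        return False
    digits = [int(c) for c in cleaned]

    # Weight the first nine digits 10 down to 2 and sum.
    total = sum(d * w for d, w in zip(digits[:9], range(10, 1, -1)))
    check = 11 - (total % 11)
    if check == 11:
        check = 0
    if check == 10:   # a derived check digit of 10 means the number is invalid
        return False
    return check == digits[9]
```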

For producer operations, pay particular attention to FHIR profile alignment and NRL pointer business rules. Many “mysterious” integration failures are actually predictable outcomes of a DocumentReference that is syntactically valid JSON but semantically wrong for NRL. The difference matters: “payload syntax” failures (malformed JSON) should never occur in production if you control serialisation properly, while “payload business rules” failures are a sign you need stricter modelling, better validation, or clearer mapping from your domain model into DocumentReference.
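
A lightweight pre-flight check over your mapped DocumentReference can catch many of these problems before the request leaves your system. The field list below is an assumption for illustration; the authoritative set of mandatory elements and cardinalities comes from the NRL profile you are onboarded against.

```python
# Illustrative pre-flight check for a DocumentReference pointer payload.
# The field list is an assumption to adapt to the NRL profile you target.
REQUIRED_POINTER_FIELDS = ("status", "type", "subject", "custodian", "content")


def preflight_pointer(doc_ref: dict) -> list[str]:
    """Return a list of problems found before the payload is ever sent."""
    problems = []
    if doc_ref.get("resourceType") != "DocumentReference":
        problems.append("resourceType must be DocumentReference")
    for fld in REQUIRED_POINTER_FIELDS:
        if not doc_ref.get(fld):
            problems.append(f"missing or empty field: {fld}")
    for i, content in enumerate(doc_ref.get("content", [])):
        attachment = content.get("attachment", {})
        if not attachment.get("url"):
            problems.append(f"content[{i}].attachment.url is missing")
    return problems
```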

Retries are where resilience can turn into harm if you are not careful. Not every failure is retryable, and not every retry is safe. For NRL, a good retry strategy is based on three questions: is the error transient, is the operation idempotent, and can you bound the impact of retrying? Gateway timeouts and downstream offline responses are generally retry candidates; missing headers and invalid payloads are not. For create/supersede flows, idempotency is crucial: if your client times out after sending a create request, you must avoid creating duplicates when you retry.
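
A deliberately conservative retry wrapper might look like the sketch below. The transient status codes, attempt limit and backoff values are illustrative policy choices rather than NRL mandates, and network-level timeouts on non-idempotent creates should flow into the reconciliation approach described next rather than being retried blindly.

```python
import random
import time

TRANSIENT_STATUSES = {502, 504}   # illustrative: gateway/timeout style responses


def call_with_bounded_retry(send_request, *, max_attempts: int = 3,
                            base_delay_s: float = 0.5, idempotent: bool = True):
    """Retry only transient failures, only for idempotent operations, with backoff."""
    last_response = None
    for attempt in range(1, max_attempts + 1):
        last_response = send_request()
        if last_response.status_code not in TRANSIENT_STATUSES:
            return last_response
        if not idempotent or attempt == max_attempts:
            return last_response   # never blind-retry non-idempotent creates
        # Exponential backoff with jitter keeps retries bounded and polite.
        time.sleep(base_delay_s * (2 ** (attempt - 1)) + random.uniform(0, 0.2))
    return last_response
```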

Master identifiers are often part of the solution here. If you design your producer flow to use a consistent, stable masterIdentifier for a pointer, you can handle “unknown outcome” scenarios more safely. If a request times out, you can reconcile by searching for an existing pointer with that master identifier before you attempt another create. This turns a potentially unsafe retry into a controlled, evidence-based recovery workflow. It also reduces the chances of hitting duplicate rejection outcomes, which are operationally noisy and can confuse end users.
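
A sketch of that recovery workflow, assuming the `requests` library and a search-by-identifier parameter whose exact name and shape you should confirm against the specification, might look like this:

```python
import requests


def create_pointer_safely(session: requests.Session, doc_ref: dict,
                          master_identifier: dict, headers: dict, base_url: str):
    """Create a pointer; if the outcome is unknown, reconcile before retrying.

    `master_identifier` is a dict with "system" and "value". The search
    parameter below is an assumption to confirm against the current spec.
    """
    try:
        return session.post(f"{base_url}/DocumentReference",
                            json=doc_ref, headers=headers, timeout=10)
    except requests.Timeout:
        # Unknown outcome: did the create land? Look for an existing pointer
        # with our stable masterIdentifier before attempting another create.
        params = {"identifier":
                  f"{master_identifier['system']}|{master_identifier['value']}"}
        existing = session.get(f"{base_url}/DocumentReference",
                               params=params, headers=headers, timeout=10)
        bundle = existing.json() if existing.ok else {}
        if bundle.get("entry"):
            return existing          # pointer already exists; do not create again
        # Nothing found: a single, deliberate re-attempt is now much safer.
        return session.post(f"{base_url}/DocumentReference",
                            json=doc_ref, headers=headers, timeout=10)
```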

Observability is not optional in NRL integration; it is part of what makes the integration safe. At a minimum, every request should carry correlation identifiers that flow into logs, and every response—especially OperationOutcome—should be captured in a structured way. Do not just log “status 400”. Log the OperationOutcome.issue.code, the details coding code/display, and diagnostics (with appropriate redaction). Over time, you’ll build a dataset that reveals where your integration is weak: recurring missing header faults, recurring invalid parameter faults, recurring organisation reference errors, and so on.
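
In practice that means a structured logging helper rather than ad hoc log lines. The sketch below reuses the NrlResult shape from the interpreter sketch above and applies a simple, illustrative redaction policy; adapt both to your own logging pipeline and information governance rules.

```python
import logging

logger = logging.getLogger("nrl.client")

# Headers we never write to logs verbatim; illustrative redaction policy.
REDACT_HEADERS = {"authorization"}


def log_nrl_failure(correlation_id: str, request_headers: dict,
                    result: "NrlResult", outcome_id: str | None) -> None:
    """Emit a structured, redacted record of a failed NRL interaction."""
    safe_headers = {k: ("<redacted>" if k.lower() in REDACT_HEADERS else v)
                    for k, v in request_headers.items()}
    logger.error(
        "nrl_call_failed",
        extra={
            "correlation_id": correlation_id,      # our own end-to-end identifier
            "operation_outcome_id": outcome_id,    # platform-side correlation key, if present
            "category": result.category,
            "retryable": result.retryable,
            "status": result.diagnostics.get("status"),
            "issues": result.diagnostics.get("issues"),
            "headers": safe_headers,
        },
    )
```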

A practical operational pattern is to define a small set of “NRL incident classes” and map your normalised response interpreter into them. For example: Configuration/headers, Authentication, Authorisation, Patient context, Parameter misuse, Payload mapping/business rules, Duplicate/idempotency, Service transient, Service hard failure. Once you have that taxonomy, you can drive consistent alerting thresholds, dashboards, and runbooks. Your on-call team will thank you, because they will not need to reverse engineer every failure from raw HTTP logs in the middle of an incident.
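
One way to express that taxonomy in code is a small enum that the interpreter's categories map onto, so alerting thresholds and runbooks key off stable names rather than raw status codes. The names below simply mirror the examples in this paragraph; the mapping is illustrative.

```python
from enum import Enum


class NrlIncidentClass(Enum):
    CONFIGURATION_HEADERS = "configuration_headers"
    AUTHENTICATION = "authentication"
    AUTHORISATION = "authorisation"
    PATIENT_CONTEXT = "patient_context"
    PARAMETER_MISUSE = "parameter_misuse"
    PAYLOAD_BUSINESS_RULES = "payload_business_rules"
    DUPLICATE_IDEMPOTENCY = "duplicate_idempotency"
    SERVICE_TRANSIENT = "service_transient"
    SERVICE_HARD_FAILURE = "service_hard_failure"


# Illustrative mapping from interpreter categories to incident classes,
# which alerting thresholds, dashboards and runbooks can then key off.
CATEGORY_TO_INCIDENT = {
    "configuration_headers": NrlIncidentClass.CONFIGURATION_HEADERS,
    "authentication": NrlIncidentClass.AUTHENTICATION,
    "authorisation": NrlIncidentClass.AUTHORISATION,
    "service_transient": NrlIncidentClass.SERVICE_TRANSIENT,
    "service_error": NrlIncidentClass.SERVICE_HARD_FAILURE,
}
```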

NRL edge cases that break naive integrations in production

Edge cases in NRL are rarely “exotic”; they are usually normal clinical reality colliding with assumptions baked into a simplistic client. The most common category is mismatch between what you think a pointer lifecycle should look like and what it actually looks like once updates, supersedes, and deletes occur over time.

A classic example is the “inactive DocumentReference” scenario. Many clients assume that if they can find a pointer’s identifier, they can read or modify it. In practice, pointer status and lifecycle rules matter. If a pointer is no longer current, the service may reject retrieval or modification attempts, and your application needs to interpret that outcome correctly. Operationally, this often happens when a record was superseded, replaced, or withdrawn, and your system is attempting to act on a stale identifier it cached earlier.

Organisation and custodian alignment is another common pitfall. NRL uses organisational identity in a way that can be surprising if you have not modelled it carefully. For producer-side operations like update or delete, you may need to ensure that the organisation identity in the pointer (for example, custodian) aligns with the identity associated with your sending configuration. If you attempt to update or delete a pointer that your system does not “own” according to those rules, the service can reject it. Your integration should treat this as a business-rule/authorisation-adjacent error, not as a transient failure.

Search edge cases can also cause confusion. Developers often interpret “404” as “no pointers exist”, but consumer searches may instead return an empty result set in a successful response. Meanwhile, a 404 response can indicate “no record found” for certain resolution behaviours, depending on the interaction. If your user interface shows “no documents available” for both cases, clinicians may not realise that the system is actually failing to search correctly due to a parameter or header error.

Another subtle edge case is content negotiation and format handling. If your client uses _format or sets Accept incorrectly, you can receive unsupported media type outcomes. This is straightforward to fix, but can be difficult to diagnose if your client only logs the HTTP status and not the OperationOutcome details. It can also cause secondary failures if your application assumes JSON but receives something else. The safest approach is to make your parsing robust and your logging explicit, so you can see format-related problems immediately.

Finally, consider the “duplicate pointer” and “unknown outcome” scenarios as first-class edge cases, not rare anomalies. Timeouts happen. Network partitions happen. If you do not design for idempotency and reconciliation, you will either create duplicates (which is bad) or you will refuse to retry (which can be worse in time-critical workflows). Your integration should have a deliberate answer to the question: the request timed out, so did it create the pointer or not? That answer is usually a combination of stable identifiers, follow-up searches, and safe retry limits.

The checklist below covers the edge cases that most often surface after go-live. If you explicitly design for these, you avoid the bulk of painful surprises.

  • Invalid or missing mandatory headers (for example, fromASID, toASID, Authorization) causing immediate 400-series failures that look like “NRL is down” if you don’t parse OperationOutcome.
  • Invalid NHS number in a consumer search or producer create request, particularly when upstream systems store temporary numbers, partial identifiers, or mis-keyed values.
  • Invalid _format, Accept, or Content-Type leading to unsupported media type outcomes and parsing failures.
  • Parameter shape issues such as missing system or value components in identifier-style parameters, or invalid type filters that don’t match expected coding systems.
  • Payload business-rule failures where DocumentReference is structurally valid but violates NRL pointer rules (missing mandatory fields, wrong cardinality, invalid references).
  • Duplicate rejections when creating a pointer with a masterIdentifier that already exists for the same patient.
  • Inactive pointer lifecycle where a DocumentReference is no longer current, causing read/modify attempts to fail unexpectedly.
  • Custodian/organisation mismatch in update/delete flows when the pointer’s custodian does not align with the organisation identity tied to the sending configuration.
  • SSP-generated transient errors (bad gateway, gateway timeout) that should trigger bounded retries and graceful degradation rather than hard failure.
  • Mixed-layer error formats where infrastructure errors return different representations than your client expects, causing secondary parsing errors.

When you encounter these edge cases, the goal is not just to “handle” them; it is to handle them in a way that maintains clinical trust. That means predictable messages, consistent behaviour, and avoidance of false reassurance. A clinician would rather see “Record locator temporarily unavailable—please try again” than “No documents found” if the system never actually performed a valid search.

Testing and operational readiness for NRL API error scenarios

Testing an NRL integration purely with happy-path requests is a fast route to a brittle production release. A better approach is to treat error scenarios as core functionality. Your goal is to prove that your system behaves correctly when it receives each major class of failure: it should show an appropriate message, record appropriate diagnostics, avoid unsafe retries, and recover when conditions return to normal.

Start with contract-style testing around your response interpreter. Feed it representative responses: 400 with missing/invalid header outcomes, 400 with invalid parameter outcomes, 400/422 with invalid resource outcomes, 404 “no record found” outcomes, 403 forbidden outcomes, and 502/504 transient gateway outcomes. Validate that each one produces the correct normalised category and correct application behaviour. This testing is far cheaper than diagnosing failures in production, and it gives you confidence that later refactors won’t accidentally degrade error handling.
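
These tests do not need a live endpoint: a minimal fake response object is enough to exercise the interpreter sketched earlier. The Spine detail code used below is illustrative; substitute the real codes you observe or that the specification documents.

```python
# pytest-style contract tests; assumes interpret_response and its helper
# sketches are importable, e.g. from a hypothetical nrl_client module.
import json
from types import SimpleNamespace


def fake_response(status: int, outcome: dict | None = None):
    """Minimal stand-in for an HTTP response, exposing status_code and text."""
    return SimpleNamespace(status_code=status,
                           text=json.dumps(outcome) if outcome else "")


def test_missing_header_400_maps_to_configuration():
    outcome = {
        "resourceType": "OperationOutcome",
        "issue": [{
            "severity": "error",
            "code": "invalid",
            "details": {"coding": [{"code": "MISSING_OR_INVALID_HEADER",   # illustrative code
                                    "display": "Missing or invalid header"}]},
            "diagnostics": "fromASID header missing",
        }],
    }
    result = interpret_response(fake_response(400, outcome))
    assert result.category == "configuration_headers"
    assert result.retryable is False


def test_gateway_timeout_is_transient_and_retryable():
    result = interpret_response(fake_response(504))
    assert result.category == "service_transient"
    assert result.retryable is True
```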

In parallel, invest in payload validation testing for producer flows. This is where many NRL integrations struggle: the mapping from your domain model into a profile-conformant DocumentReference is complex, and small mistakes can produce confusing outcomes. Unit test your mapping layer with both valid and intentionally invalid payloads so you can see how the service responds and how your application reports that response. Make sure you include tests for duplicates, missing mandatory fields, and lifecycle transitions such as supersede behaviour.

Operational readiness is as much about tooling as it is about code. Your support and engineering teams need to be able to answer questions quickly: What request was sent? With what identifiers? What headers were included (safely redacted)? What did the service return? What was the OperationOutcome code and diagnostics? Was this error transient and retried? Did it succeed later? Without that visibility, teams tend to fall back on guesswork and repeated “try again” advice, which is frustrating and can be unsafe.

Monitoring should also reflect the layered nature of NRL failures. A spike in 400 missing-header outcomes suggests a deployment or configuration regression. A spike in 403 forbidden outcomes may indicate credential or authorisation changes. A spike in 502/504 outcomes points towards service or network instability and should trigger incident workflows rather than bug tickets against the client. If you only monitor “error rate”, you’ll end up treating fundamentally different problems as identical, which slows recovery and blurs accountability.

Finally, bake graceful degradation into your user journeys. NRL is often used in time-sensitive contexts, and if the locator is temporarily unavailable, users need an alternative path: guidance to retry, guidance to use local records, or instructions to contact the appropriate service desk depending on your organisational policy. The goal is to keep the workflow safe and transparent. A well-designed integration does not just fail; it fails usefully, with the minimum possible cognitive load on clinicians.

When all of these pieces are in place—policy-driven response interpretation, robust parsing, careful retry and idempotency design, explicit edge-case handling, and operational tooling—NRL integration becomes far more predictable. You spend less time chasing intermittent faults, you reduce clinical frustration, and you build an integration that can evolve with the wider NHS ecosystem rather than breaking whenever something changes upstream.
