Customer Story
In Conversation With
Disha Grover, AI Engineering Lead
RightShip is a global maritime risk management and environmental assessment organization, working with shipowners, charterers, and operators to improve safety standards and reduce risk across the shipping industry. A key part of this mission involves analyzing inspection findings and close-out reports from ship managers, which include detailed explanations of root cause, corrective action, and preventive action for issues identified on vessels.
These reports contain valuable operational insight, but the data is difficult to use in its raw form. The documents are text-heavy, inconsistent in format, and often challenging to process reliably with standard OCR workflows. As a result, much of this information remains underutilized despite its potential to inform risk models and improve decision-making.
To make this information usable, RightShip’s AI team built a workflow that extracts structured text from inspection PDFs, routes that content into human review, and uses subject matter experts to create high-quality labeled data for downstream systems. As the team moved from early experiments into repeatable workflows, they needed a better way to manage annotation, support non-technical reviewers, and maintain data quality over time.
With Label Studio Enterprise, RightShip created a more structured and scalable process for turning complex document data into reliable training and evaluation data.
Before adopting Label Studio, RightShip’s team was managing labeling work in spreadsheets. That process created significant back-and-forth with subject matter experts and did not scale as experiments expanded. Each project required its own spreadsheet, and every new batch meant re-sending files and walking reviewers through what to label and how to label it.
In practice, that overhead added up quickly. The team held a recurring one-hour coordination meeting each week with five people (two members of the AI team and three SMEs) just to align on labeling expectations and project setup. As document volume increased, spreadsheets also became a poor fit for long, text-heavy fields: scrolling through large blocks of text slowed reviewers down, and in some cases the files grew so large they became difficult to manage or crashed outright.
The team also faced a more fundamental challenge: the source documents themselves were not standardized. Inspection reports arrived in different formats, and the sections that mattered most were not always easy to extract reliably. Rather than relying on OCR alone, RightShip developed a workflow that converted PDFs into images and used multimodal models to extract the relevant content. That extracted text was then mapped into structured fields such as root cause, corrective action, and preventive action.
Once the data was prepared, the next step was making it usable for subject matter experts. That meant giving reviewers a simple, consistent way to classify text without asking them to work inside technical tools or manage large taxonomies manually.
RightShip needed a labeling environment that could support structured classification tasks while remaining easy for non-technical subject matter experts to use.
Label Studio Enterprise made it easier to operationalize work with SMEs. Instead of sending spreadsheets and scheduling a walkthrough for each new project, RightShip could publish a project and notify SMEs that the data was ready. The team also used the project description to clarify instructions directly in the workflow, which reduced setup time and made it easier for SMEs to start annotating independently.
Label Studio Enterprise also gave the team a more intuitive interface for annotation, including structured choices that reduced the need for reviewers to type labels manually. This was especially important in a workflow with a large number of possible categories. Instead of relying on spreadsheet conventions and repeated clarification, the team could give reviewers a cleaner experience with clearly defined options and a more controlled labeling process.
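As a simplified illustration of what such a setup can look like, the sketch below is a minimal Label Studio labeling configuration with a single-choice severity field. The field names and category values here are assumptions for illustration, not RightShip's actual taxonomy.

```xml
<View>
  <Header value="Extracted finding"/>
  <!-- "finding" is a hypothetical field name from the task data -->
  <Text name="finding" value="$finding"/>
  <!-- Structured choices replace free-typed labels -->
  <Choices name="severity" toName="finding" choice="single" showInLine="true">
    <Choice value="High"/>
    <Choice value="Medium"/>
    <Choice value="Low"/>
  </Choices>
</View>
```

Because the options are predefined in the config, reviewers pick from a controlled list rather than typing labels, which is what keeps large taxonomies consistent across annotators.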
Label Studio also fit naturally into a broader workflow that combines model-based extraction, human review, and ongoing dataset updates. Rather than building and maintaining an internal labeling layer from scratch, RightShip was able to use Label Studio as the central system for human annotation while connecting it to the rest of its pipeline through exports and API-based workflows.
Spreadsheets didn’t scale for us.
With Label Studio, we reduced average annotation time on heavy tasks from ~15 minutes to ~5 minutes per task, and structured dropdowns made labeling faster and more consistent.
Disha Grover
AI Engineering Lead
RightShip’s process begins with inspection PDFs and close-out reports submitted by ship managers. Because these documents vary widely in structure, the team first converts them into images and uses multimodal AI to extract the relevant text. An additional model then organizes that extracted content into specific sections, such as root cause, corrective action, and preventive action.
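RightShip uses a model for this organizing step; as a rough, rule-based stand-in, the sketch below splits extracted report text into the three named sections by matching headings. The heading patterns and sample text are illustrative assumptions, not RightShip's actual implementation.

```python
import re

# Simplified, rule-based stand-in for the section-organizing step.
# RightShip uses a model here; these heading patterns are assumptions.
SECTION_HEADINGS = {
    "root_cause": r"root\s+cause",
    "corrective_action": r"corrective\s+action",
    "preventive_action": r"preventive\s+action",
}

def split_sections(text: str) -> dict:
    """Map extracted report text into named sections by heading."""
    # Build one pattern that matches any known heading.
    combined = "|".join(f"(?P<{k}>{v})" for k, v in SECTION_HEADINGS.items())
    matches = list(re.finditer(combined, text, flags=re.IGNORECASE))
    sections = {}
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        # Text between this heading and the next becomes the section body.
        sections[m.lastgroup] = text[m.end():end].strip(" :\n")
    return sections

report = (
    "Root cause: valve seal degraded over time.\n"
    "Corrective action: seal replaced on board.\n"
    "Preventive action: added seal checks to the maintenance schedule."
)
print(split_sections(report)["root_cause"])  # -> valve seal degraded over time.
```

Each section then becomes a separate field in the annotation task, so reviewers see root cause, corrective action, and preventive action as distinct pieces of text rather than one undifferentiated block.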
That structured output becomes the basis for annotation tasks in Label Studio.
Inside Label Studio, subject matter experts review the extracted text and apply classification labels, including categories such as risk severity. This gives RightShip a more consistent way to capture expert judgment and turn document-based operational knowledge into structured training data.
The labeled outputs then feed back into the team’s broader AI workflow, including periodic refreshes to the vector store and updates to downstream systems that rely on this data.
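As a sketch of what that hand-off can look like, the snippet below flattens a Label Studio JSON export into simple text/label rows suitable for a downstream training or refresh job. The field names (`finding`, `severity`) are assumptions for illustration.

```python
# Sketch: flatten a Label Studio JSON export into (text, label) rows.
# Field names ("finding", "severity") are illustrative assumptions.
def export_to_rows(tasks: list) -> list:
    rows = []
    for task in tasks:
        text = task["data"].get("finding", "")
        for ann in task.get("annotations", []):
            for result in ann.get("result", []):
                # Choice-type results carry the selected classification labels.
                if result.get("type") == "choices":
                    for label in result["value"]["choices"]:
                        rows.append({"text": text, "label": label})
    return rows

sample_export = [
    {
        "data": {"finding": "Valve seal degraded over time."},
        "annotations": [
            {"result": [{"type": "choices",
                         "from_name": "severity",
                         "value": {"choices": ["High"]}}]}
        ],
    }
]
print(export_to_rows(sample_export))
```

In practice this step would run against exports pulled via the Label Studio API on a schedule, feeding the vector-store refresh described above.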
This matters because RightShip is not treating labeling as a one-time exercise. The AI team continues to own the labeling and data refresh process over time, which makes repeatability and workflow quality especially important.
As RightShip expanded its work, maintaining label quality across multiple reviewers became a core part of the project.
The team built a high-confidence review process in which multiple annotators review the same items, helping ensure that important classifications are not based on a single opinion. For projects with multiple labels per task, this provides a stronger foundation for consistency and more confidence in the final dataset.
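A minimal version of that kind of consensus check might look like the sketch below, which keeps a label only when enough overlapping annotators agree and flags ties for adjudication. The agreement threshold and label values are illustrative assumptions.

```python
from collections import Counter

# Sketch of a majority-vote consensus step for overlapping annotations.
# Ties and under-supported labels fall back to None for manual adjudication.
def consensus(labels: list, min_agreement: int = 2):
    if not labels:
        return None
    (top, count), *rest = Counter(labels).most_common()
    # Reject ties and labels without enough supporting votes.
    if count < min_agreement or (rest and rest[0][1] == count):
        return None
    return top

print(consensus(["High", "High", "Medium"]))  # agreement reached -> High
print(consensus(["High", "Medium"]))          # tie -> None, needs adjudication
```

Items that return `None` are exactly the ones worth routing into a review or adjudication workflow, which is where Label Studio's built-in review features come in.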
Label Studio Enterprise supports this kind of structured review workflow by giving the team a centralized environment for annotation, annotator overlap, progress tracking, and export. RightShip has also explored review and adjudication workflows that can help streamline how final decisions are made when multiple annotators are involved.
For a team running repeated experiments and ongoing refresh cycles, that quality layer is critical. It helps ensure that the data used to support downstream AI systems remains reliable as projects scale.
Before Label Studio, the team regularly held a one-hour weekly alignment session with both the AI team and SMEs to keep projects moving and ensure instructions were clear. With Label Studio, much of that project guidance moved into the workflow itself, which made SME participation more self-serve.
RightShip initially adopted Label Studio during an early evaluation phase, then moved to a full Enterprise deployment as the value of the workflow became clearer. As internal adoption grew, the project expanded beyond the initial setup into a broader rollout across teams and experiments.
By early 2026, RightShip’s AI team had expanded beyond a single experiment into multiple parallel annotation efforts. With a workflow that SMEs could pick up quickly and a faster per-task annotation cadence (from ~15 minutes down to ~5 minutes on average), the team could publish new projects more predictably and keep labeling moving without constant coordination. That shift reflects a move from one-off experimentation to a repeatable operational process for producing labeled data.
While the exact metrics are still being finalized, the project already points to several clear outcomes: faster per-task annotation, more self-serve SME participation, and a repeatable process for producing labeled data.
As RightShip continues to expand its AI work, Label Studio Enterprise gives the team a practical foundation for turning complex maritime inspection documents into usable, high-quality data.