ServiceNow

Enterprise

company

Verified

https://www.servicenow.com/

Recent Activity

Hari-sub authored a paper 14 days ago

EVA-Bench: A New End-to-end Framework for Evaluating Voice Agents

onifemibam authored a paper 17 days ago

Developing Safe and Responsible Large Language Models -- A Comprehensive Framework

gabegma authored a paper 17 days ago

Azimuth: Systematic Error Analysis for Text Classification

Papers

Apriel-Reasoner: RL Post-Training for General-Purpose and Efficient Reasoning

CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents

Articles

PipelineRL

Apr 25, 2025

Organization Card

Welcome to ServiceNow's page on Hugging Face!

ServiceNow® is the AI platform for business transformation. We bring intelligence to every corner of your business by offering a single, cloud-based platform that combines AI, data, and workflows to help enterprises automate and manage critical processes across IT, HR, security, and more. For more information on our company and its products, visit our corporate website, ServiceNow - Put AI to Work (https://www.servicenow.com/).

On this page, you will find open-source publications, including work from our fundamental AI research team. You can also find more open-source publications on our GitHub organization.

Below are a few of the projects we're especially proud to showcase.

Benchmarks

BigDocsBench is a benchmark designed to evaluate VLM document understanding at scale.

BrowserGym Leaderboard was created to evaluate LLMs, VLMs, and agents on web navigation tasks.

UI-Vision is a benchmark for GUI visual grounding.

Models

BigCode is an open scientific collaboration focused on the responsible development of LLMs for code. It addresses the lack of transparency in LLM development by promoting open governance, open datasets, and collaborative research.

StarCoder is a state-of-the-art, 15B-parameter open-source language model for code, trained on 1 trillion tokens extracted from GitHub repositories spanning over 80 programming languages. It achieves top performance on benchmarks like HumanEval, surpassing both open and closed-source alternatives, while offering an extensive 8K+ context window and enhanced safety features like PII redaction and attribution tracing.

Apriel-Nemotron-15b-Thinker is a 15B-parameter reasoning model in ServiceNow's Apriel SLM series. It delivers state-of-the-art performance on both enterprise and academic benchmarks while using only half the memory of larger models.

StarVector is a code-driven image generation framework.

AlignVLM is a VLM that adapts visual features for large language models.

Datasets

The Stack v2 is the largest open-access pretraining dataset for code-focused LLMs. It features 67.5 TB (≈900 billion tokens) of meticulously curated, deduplicated, and cleaned source code, enabling next-gen models like StarCoder2 to train effectively at scale.

Repliqa is a human-curated evaluation dataset designed to test how well LLMs use contextual information from provided documents. It contains context-question-answer triplets based on realistic but fictional documents about invented people, places, and events, removing the chance for models to rely on memorized facts.
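To make the context-question-answer idea concrete, here is a minimal Python sketch of how one such triplet could be represented, turned into a prompt, and scored. The `Triplet` class, its field names, the prompt wording, and the exact-match metric are all illustrative assumptions. They are not Repliqa's actual schema or evaluation protocol, and the example document is invented for this sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Triplet:
    """One context-question-answer example (hypothetical field names)."""
    context: str
    question: str
    answer: str


def build_prompt(example: Triplet) -> str:
    """Ask the model to answer from the supplied document only."""
    return (
        "Answer using only the document below.\n\n"
        f"Document:\n{example.context}\n\n"
        f"Question: {example.question}\nAnswer:"
    )


def _normalize(text: str) -> str:
    """Lowercase and collapse whitespace before comparing strings."""
    return " ".join(text.lower().split())


def exact_match(prediction: str, example: Triplet) -> bool:
    """A deliberately simple metric: normalized string equality."""
    return _normalize(prediction) == _normalize(example.answer)


# Invented example: the fact appears only in the context, so a model
# cannot answer it from memorized knowledge.
example = Triplet(
    context="Mara Velen founded the Quillport Ferry Company in 1987.",
    question="Who founded the Quillport Ferry Company?",
    answer="Mara Velen",
)
prompt = build_prompt(example)
```

Because the document is fictional, a correct answer here suggests the model read the context rather than recalled a memorized fact, which is the property the dataset is designed to test.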