finnvoorhees
/

ModernBERT-CoreML

Model card Files Files and versions Community

ModernBERT-CoreML / README.md

finnvoorhees's picture

Auto-generate attention mask

19e58f9 23 days ago

|

1.44 kB

	---
	license: apache-2.0
	language:
	- en
	base_model:
	- answerdotai/ModernBERT-base
	- answerdotai/ModernBERT-large
	base_model_relation: quantized
	tags:
	- fill-mask
	- masked-lm
	- long-context
	- modernbert
	---

	# ModernBERT-CoreML

	This repo contains [ModernBERT-base](https://huggingface.co/answerdotai/ModernBERT-base) and [ModernBERT-large](https://huggingface.co/answerdotai/ModernBERT-large) converted to CoreML.

	### Example Usage

	```swift
	import CoreML
	import Tokenizers

	let text = "The capital of Ireland is [MASK]."

	print("Loading…")
	let model = try await ModernBERT_base.load()
	let tokenizer = try await AutoTokenizer.from(pretrained: "answerdotai/ModernBERT-base")

	print("Tokenizing…")
	let tokens = tokenizer(text)
	let inputIDs = MLShapedArray(scalars: tokens.map(Int32.init), shape: [1, tokens.count])
	let input = ModernBERT_baseInput(input_ids: inputIDs)

	print("Predicting…")
	let output = try await model.prediction(input: input)
	let logits = output.logitsShapedArray

	print("Decoding…")
	let maskPosition = tokens.firstIndex(of: tokenizer.convertTokenToId("[MASK]")!)!
	let predictedTokenID = await MLTensor(logits[0, maskPosition]).argmax().shapedArray(of: Int32.self).scalar!
	let predictedTokenText = tokenizer.decode(tokens: [Int(predictedTokenID)])

	print("Result:")
	print(text.replacingOccurrences(of: "[MASK]", with: predictedTokenText.trimmingCharacters(in: .whitespaces)))
	// The capital of Ireland is Dublin.
	```